ed3c680b |
2023-03-30 11:16:30 |
Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) |
slaren |
9cbc404b |
2023-03-29 23:44:39 |
ci : re-enable AVX512 testing (Windows-MSVC) (#584) |
anzz1 |
b51c717d |
2023-03-29 22:15:34 |
ggml : init time on first ggml_init() call |
Georgi Gerganov |
0ba76c1e |
2023-03-29 22:13:12 |
llama : fix compile warnings when reading the vocab |
Georgi Gerganov |
cea1c859 |
2023-03-29 22:10:01 |
ggml : add ARM_NEON dequantize_row_q4_1() |
Georgi Gerganov |
f202ada1 |
2023-03-29 22:03:02 |
ggml : add ARM_NEON quantize_row_q4_1() |
Georgi Gerganov |
3b44d30d |
2023-03-29 21:47:33 |
ggml : add ARM_NEON ggml_vec_dot_q4_1() |
Georgi Gerganov |
61cbfff5 |
2023-03-29 20:09:25 |
rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) |
Pavol Rusnak |
d9ad1044 |
2023-03-29 19:21:09 |
Create chat-13B.bat (#592) |
Thérence |
b467702b |
2023-03-29 19:38:31 |
readme : fix typos |
Georgi Gerganov |
516d88e7 |
2023-03-29 19:37:20 |
readme : add GPT4All instructions (close #588) |
Georgi Gerganov |
53635c08 |
2023-03-29 19:29:26 |
py : add GPT4All conversion script |
Georgi Gerganov |
41318d70 |
2023-03-29 18:10:07 |
llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) |
Maël Kerbiriou |
a6956b25 |
2023-03-29 17:10:24 |
add example of re-act pattern (#583) |
Tobias Lütke |
83df5639 |
2023-03-29 16:20:07 |
Fix GCC warning about binary literal (#595) |
anzz1 |
a5c42c4b |
2023-03-29 16:19:29 |
Fix typo in llama.h (#593) |
anzz1 |
5a5f8b15 |
2023-03-28 22:44:29 |
Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) |
anzz1 |
f1217055 |
2023-03-28 22:43:25 |
CI: fix subdirectory path globbing (#546) |
anzz1 |
7f4c5c66 |
2023-03-28 21:23:09 |
llama : fix linkage with mingw (#551) |
anzz1 |
2a98bc18 |
2023-03-28 20:06:03 |
ggml : add AVX2 implementation of quantize_row_q4_1 (#515) |
slaren |
d0aaff57 |
2023-03-28 19:55:42 |
py : add temporary script to convert old ggml files to newer version (#539) |
thement |
d0330fd7 |
2023-03-28 13:51:29 |
py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) |
Tai Duc Nguyen |
99c5b276 |
2023-03-28 17:13:01 |
ggml : refactor quantized processing functions (#509) |
Stephan Walter |
692ce316 |
2023-03-29 02:02:34 |
py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) |
DooWoong Lee (David) |
96f9c050 |
2023-03-28 20:01:09 |
ci : make ctest verbose, hopefully we see what is wrong with the sanitizer |
Georgi Gerganov |
d502bc7c |
2023-03-28 19:51:55 |
tests : free llama context at the end of the test |
Georgi Gerganov |
436e5619 |
2023-03-28 16:48:20 |
all : be more strict about converting float to double (#458) |
Stephan Walter |
20e1e848 |
2023-03-28 11:39:01 |
deploy : add a Package.swift for SwiftPM support (#393) |
Jed Fox |
c1f88506 |
2023-03-28 15:56:03 |
ggml : introduce structs for the q4 data blocks (#356) |
Stephan Walter |
e0670260 |
2023-03-28 18:34:35 |
gitignore : add "embedding" |
Georgi Gerganov |
28ba975a |
2023-03-28 23:06:28 |
Check the existence of f16_model_path_base in quantize.py (#574) |
dotpy314 |
a6bdc47c |
2023-03-28 16:26:55 |
Fix usage of F16C intrinsics in AVX code (#563) |
slaren |
7b8dbcb7 |
2023-03-28 17:09:55 |
main.cpp fixes, refactoring (#571) |
anzz1 |
4b8efff0 |
2023-03-28 08:11:09 |
Add embedding example to Makefile (#540) |
RJ Adriaansen |
7e539557 |
2023-03-27 06:55:26 |
Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) |
Marco Matthies |
34c1072e |
2023-03-26 17:48:40 |
ci: add debug build to sanitizer build matrix (#527) |
Erik Scholz |
939ad2d3 |
2023-03-26 15:34:02 |
Fix undefined variables in debug build, remove unused variables (#531) |
Stephan Walter |
8c2ec5e2 |
2023-03-26 10:48:42 |
Add support for linux/arm64 platform during Docker Builds (#514) |
Juan Calderon-Perez |
b391579d |
2023-03-26 13:14:01 |
Update README and comments for standalone perplexity tool (#525) |
Stephan Walter |
7a87d31f |
2023-03-26 16:06:10 |
[main] fix infinite generation (-n == -1) (#523) |
anzz1 |
348d6926 |
2023-03-26 10:20:49 |
Add logo to README.md |
Georgi Gerganov |
33e35b8f |
2023-03-26 07:25:46 |
Exit from interactive mode if input stream is bad (#491) |
Harald Fernengel |
19726169 |
2023-03-26 00:13:28 |
CI: Run other sanitizer builds even if one fails (#511) |
anzz1 |
f732695c |
2023-03-25 14:53:55 |
Clarify console output in convert-pth-to-ggml.py (#512) |
jp-x-g |
2f7bf7dd |
2023-03-25 23:38:11 |
CMake / CI additions (#497) |
anzz1 |
34ab5268 |
2023-03-25 22:29:22 |
(Windows) Set console to UTF-8 on init (#420) |
anzz1 |
c2b25b69 |
2023-03-25 21:53:39 |
Fix colors enabling on WIN32 |
Georgi Gerganov |
79b2b266 |
2023-03-25 21:51:41 |
If n_predict == -1, generate forever |
Georgi Gerganov |
e2d490da |
2023-03-25 21:36:22 |
Infinite generation via context swapping (#71) |
Georgi Gerganov |
03f7e335 |
2023-03-25 20:51:14 |
Cleanup STL headers + fix embedding examples + minor stuff |
Georgi Gerganov |
55ad42af |
2023-03-25 20:36:52 |
Move chat scripts into "./examples" |
Georgi Gerganov |
459e93cc |
2023-03-25 19:31:48 |
Add AVX2 implementation of dequantize_row_q4_1 (#505) |
slaren |
a316a425 |
2023-03-25 20:26:40 |
Overhaul the examples structure |
Georgi Gerganov |
ecbe466a |
2023-03-25 19:47:21 |
Retire the ggml_mul_mat() branch for transposed src0 (#500) |
Georgi Gerganov |
502a4001 |
2023-03-25 17:16:50 |
Disable prompt verbosity by default and add option to enable (#480) |
Georgi Gerganov |
09aecbf6 |
2023-03-25 16:06:49 |
Add AVX2 implementation of dequantize_row_q4_0 (#467) |
slaren |
4640eff2 |
2023-03-25 17:03:10 |
Don't interfere with BLAS for large prompts by running only 1 thread |
Georgi Gerganov |
ab77d763 |
2023-03-25 16:47:59 |
Add longer DAN prompt for testing big batch numbers |
Georgi Gerganov |
29b7baab |
2023-03-25 15:34:23 |
Add timings for the prompt evaluation (#478) |
slaren |
4a7129ac |
2023-03-25 16:30:32 |
Remove obsolete information from README |
Georgi Gerganov |
6b6dbc89 |
2023-03-25 16:22:05 |
Remove obsolete assert and fix compiler warning |
Georgi Gerganov |
2a2e63ce |
2023-03-25 16:09:54 |
Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS |
Georgi Gerganov |
e899bf54 |
2023-03-25 14:42:09 |
bounds checking for input prefix (#492) |
anzz1 |
fbd4d38c |
2023-03-25 14:03:19 |
feat: '--in-prefix STRING' option (#426) |
anzz1 |
58e6c9f3 |
2023-03-25 01:26:28 |
Add support for file load progress reporting callbacks (#434) |
Jed Fox |
36d07532 |
2023-03-25 01:21:24 |
Add missing struct annotation (#483) |
Doomsdayrs |
6f1ee4b6 |
2023-03-24 23:38:14 |
Fix crash for 65B model with pre-allocated memory (#485) |
Chris Kuehl |
8520fc31 |
2023-03-24 23:47:06 |
Disable BLAS altogether - the bug is not just for quantized mat mul |
Georgi Gerganov |
b3f460e9 |
2023-03-24 23:39:17 |
Disable BLAS branch in mul_mat - seems there is a bug |
Georgi Gerganov |
04c6f5ed |
2023-03-24 23:17:58 |
Immediately start processing the prompt before user input has been provided (#476) |
Georgi Gerganov |
7a9b6c3a |
2023-03-24 23:17:37 |
Reduce memory usage and allocate enough memory for largest context (#473) |
Georgi Gerganov |
31572d96 |
2023-03-24 18:23:56 |
Temporarily bump the memory buffer size - hopefully fix issues from 483bab2e |
Georgi Gerganov |
f4f5362e |
2023-03-24 15:23:09 |
Update README.md (#444) |
Gary Mulder |
863f65e2 |
2023-03-24 10:22:39 |
fix instruct mode (#445) |
rabidcopy |
afd220d9 |
2023-03-24 17:21:01 |
Properly free llama_context on failure |
Georgi Gerganov |
481044d5 |
2023-03-24 08:19:26 |
additional optimizations for POWER9 (#454) |
Cameron Kaiser |
563cdc39 |
2023-03-24 08:19:05 |
Support calling mlock() on loaded model data on Linux and macOS (#453) |
comex |
8d4a855c |
2023-03-24 08:05:13 |
Add embedding mode with arg flag. Currently working (#282) |
Luciano |
b6b268d4 |
2023-03-24 09:13:35 |
Add link to Roadmap discussion |
Georgi Gerganov |
3cd8dde0 |
2023-03-24 06:22:28 |
Revert "Fix memory allocation issues and seg faults" |
Georgi Gerganov |
4870e455 |
2023-03-24 00:11:53 |
Fix memory allocation issues and seg faults |
Georgi Gerganov |
483bab2e |
2023-03-23 23:22:01 |
Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) |
Georgi Gerganov |
404e1da3 |
2023-03-23 16:42:52 |
Fix quantize script not finding models in parent directory (#428) |
Jed Fox |
4cc053b6 |
2023-03-23 22:39:44 |
Remove obsolete command from Docker script |
Georgi Gerganov |
0ba5a3a9 |
2023-03-23 22:32:02 |
Obsolete |
Georgi Gerganov |
2e17dfd8 |
2023-03-23 15:22:47 |
Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) |
rabidcopy |
20a1a4e0 |
2023-03-23 10:18:13 |
Fix GPTQ converter (#423) |
Timmy Knight |
ad072fc5 |
2023-03-24 05:16:48 |
Generate library with CMake (#430) |
nusu-github |
ea10d3de |
2023-03-23 19:54:28 |
Command line args bounds checking (#424) |
anzz1 |
a18c1925 |
2023-03-22 00:37:02 |
Fix Nix build |
Ben Siraphob |
a50e39c6 |
2023-03-23 14:15:48 |
Revert "Delete SHA256SUMS for now" (#429) |
Stephan Walter |
a140219e |
2023-03-23 05:41:32 |
Fix Makefile echo escape codes (by removing them). (#418) |
Kerfuffle |
8a3e5ef8 |
2023-03-23 11:30:40 |
Move model section from issue template to README.md (#421) |
Gary Mulder |
8eea5ae0 |
2023-03-23 12:26:19 |
Delete SHA256SUMS for now (#416) |
anzz1 |
93208cfb |
2023-03-23 10:46:58 |
Adjust repetition penalty .. |
Georgi Gerganov |
03ace14c |
2023-03-23 09:48:51 |
Add link to recent podcast about whisper.cpp and llama.cpp |
Georgi Gerganov |
e4412b45 |
2023-03-23 04:20:34 |
CI: CMake: Separate build and test steps (#376) |
anzz1 |
f7dc43bc |
2023-03-23 01:30:23 |
Fix instruct mode broken by PR #354 (#409) |
tjohnman |
ee8a7887 |
2023-03-22 19:06:18 |
Update issue template so people will use it (#404) |
Gary Mulder |
69c92298 |
2023-03-22 17:29:06 |
Deduplicate q4 quantization functions (#383) |
Stephan Walter |
97940520 |
2023-03-22 18:20:25 |
fix: add POSIX functionality for Linux compilation (#51) |
Valentyn Bezshapkin |
305ba6f0 |
2023-03-22 18:16:35 |
Don't force immediate interactive without `-i` (#354) |
tjohnman |
4122dfff |
2023-03-22 17:37:10 |
cmake: make llama an actual library (#392) |
Erik Scholz |
56e659a0 |
2023-03-22 17:09:38 |
fix perplexity after c-api refactor (#390) |
Erik Scholz |
40ea807a |
2023-03-22 08:53:54 |
Add details on perplexity to README.md (#395) |
Gary Linscott |
d5850c53 |
2023-03-22 11:55:45 |
Add missing header for memcpy (#386) |
Yusuf Kağan Hanoğlu |
ae44e23e |
2023-03-22 07:47:15 |
When seed <= 0 - use the clock to generate one |
Georgi Gerganov |
928480ef |
2023-03-22 07:45:00 |
Init llama_context_params properly from CLI (#370) |
Georgi Gerganov |
56817b1f |
2023-03-22 07:34:02 |
Remove temporary notice and update hot topics |
Georgi Gerganov |
f5a77a62 |
2023-03-22 07:32:36 |
Introduce C-style API (#370) |
Georgi Gerganov |
da0e9fe9 |
2023-03-20 20:14:06 |
Add SHA256SUMS file and instructions to README how to obtain and verify the downloads |
Gary Mulder |
e6c9e098 |
2023-03-21 23:49:24 |
Fix bin dir for win ci |
anzz1 |
01a297b0 |
2023-03-21 22:34:25 |
specify build type for ctest on windows (#371) |
Erik Scholz |
3366853e |
2023-03-21 22:57:35 |
Add notice about pending change |
Georgi Gerganov |
3f9c6135 |
2023-03-21 16:52:27 |
fix typo in chatLLaMa (#368) |
Mathieu Nayrolles |
0f613527 |
2023-03-21 19:47:27 |
Update issue templates |
Georgi Gerganov |
353ec251 |
2023-03-21 14:21:50 |
We could use std::unordered_map over std::map (#305) |
Fabio R. Sluzala |
89d5d90f |
2023-03-21 18:11:01 |
Fix color codes emitting mid-UTF8 code. (#312) |
Matvey Soloviev |
16ffc013 |
2023-03-21 09:42:25 |
Importer for GPTQ quantized LLaMA models (#301) |
comex |
486ae645 |
2023-03-21 09:27:42 |
Compute perplexity over prompt (#270) |
Gary Linscott |
3ab3e658 |
2023-03-21 18:23:15 |
Add chatLLaMa script (#198) |
Jean-Christophe Hoelt |
f157088c |
2023-03-21 11:21:06 |
makefile: Fix CPU feature detection on Haiku (#218) |
Alex von Gluck IV |
c86ba036 |
2023-03-21 18:14:46 |
Enable ANSI colors on Windows 10+ (#311) |
anzz1 |
1daf4dd7 |
2023-03-21 18:10:32 |
Minor style changes |
Georgi Gerganov |
dc6a845b |
2023-03-21 18:09:37 |
Add chat.sh script |
Georgi Gerganov |
6a612959 |
2023-03-21 17:05:06 |
Check for reverse prompt by characters instead of tokens (#292) (#330) |
tjohnman |
d5f56a5e |
2023-03-21 17:04:43 |
Check for reverse prompt by characters instead of tokens (#292) (#330) |
tjohnman |
3bfa3b43 |
2023-03-21 17:59:16 |
Fix convert script, warnings alpaca instructions, default params |
Georgi Gerganov |
715d292e |
2023-03-21 09:50:09 |
Add OpenBSD support (#314) |
Kevin Lo |
c98ae026 |
2023-03-21 08:49:43 |
fix typo in comment (#318) |
Mack Straight |
c3b2306b |
2023-03-21 23:44:11 |
Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335) |
Qingyou Meng |
975d2ceb |
2023-03-21 17:42:43 |
cmdline option for custom amount of model parts (--n_parts N) (#348) |
anzz1 |
e0ffc861 |
2023-03-21 08:34:49 |
Update IPFS links to quantized alpaca with new tokenizer format (#352) |
Kevin Kwok |
8f644a0a |
2023-03-21 17:32:14 |
Change default repeat_penalty to 1.0 |
Georgi Gerganov |
eb34620a |
2023-03-21 17:29:41 |
Add tokenizer test + revert to C++11 (#355) |
Georgi Gerganov |
2e664f1f |
2023-03-21 07:35:42 |
Add initial AVX512 support for dot product on Linux (#320) |
Casey Primozic |
8cf9f34e |
2023-03-21 09:37:16 |
Adding missing features of CMakeLists.txt & Refactoring (#131) |
nusu-github |
bd4b46d6 |
2023-03-20 16:44:30 |
Nix flake: set meta.mainProgram to llama |
Ben Siraphob |
6b6d5b50 |
2023-03-21 03:33:10 |
Fixed tokenizer.model not found error when model dir is symlink (#325) |
Qingyou Meng |
a791a68b |
2023-03-20 12:26:01 |
move file magic/version to header, print expected version (#319) |
Mack Straight |
0f1b21cb |
2023-03-20 18:05:20 |
Docker - Fix publish docker image in GitHub Registry (#235) |
Bernat Vadell |
074bea2e |
2023-03-20 03:17:23 |
sentencepiece bpe compatible tokenizer (#252) |
Mack Straight |
5cb63e24 |
2023-03-20 08:24:11 |
Add tqdm to Python requirements (#293) |
Stephan Walter |
da5303c1 |
2023-03-19 17:44:20 |
bugfix: default should not be interactive (#304) |
cocktailpeanut |
4545539d |
2023-03-19 21:58:51 |
Rename script |
Georgi Gerganov |
edeba283 |
2023-03-19 21:57:28 |
Add temporary helper script for Alpaca chat |
Georgi Gerganov |
5c19c70b |
2023-03-19 13:44:30 |
fix coloring of last `n_batch` of prompt, and refactor line input (#221) |
Rickey Bowers Jr |
24568371 |
2023-03-19 20:33:06 |
Support for multiple reverse prompts. (#299) |
tjohnman |
7392f1cd |
2023-03-19 12:38:44 |
Improved quantize script (#222) |
Suaj Carrot |
ad5fd5b6 |
2023-03-19 19:36:19 |
Make prompt randomization optional. (#300) |
tjohnman |
368d0c8a |
2023-03-19 19:31:17 |
Respect the maximum number of tokens in interactive. (#298) |
tjohnman |
50fae10d |
2023-03-19 19:22:48 |
Add --ignore-eos parameter (#181) |
slaren |
084e2f0e |
2023-03-20 02:10:00 |
interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283) |
Qingyou Meng |
0b366e73 |
2023-03-19 18:57:00 |
Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294) |
Erik Scholz |
160bfb21 |
2023-03-19 19:51:55 |
Update hot topics to mention Alpaca support |
Georgi Gerganov |
c494ed5b |
2023-03-19 19:46:32 |
Fix off-by-one bug (#115) |
Georgi Gerganov |
c1c7026b |
2023-03-19 19:33:18 |
Fix python stuff (#109) |
Georgi Gerganov |
467b1497 |
2023-03-19 20:17:39 |
Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109) |
qunash |
70f01cb8 |
2023-03-19 19:04:44 |
Drop trailing new line from file prompts (#80) |
Georgi Gerganov |
a4e63b73 |
2023-03-19 18:49:50 |
Add instruction for using Alpaca (#240) |
Georgi Gerganov |
9e170721 |
2023-03-19 18:37:02 |
Add "--instruct" argument for usage with Alpaca (#240) |
Georgi Gerganov |
22213a17 |
2023-03-19 17:30:00 |
Change RMSNorm eps to 1e-6 (#173) |
Georgi Gerganov |
d7def1a7 |
2023-03-18 17:10:47 |
Warn user if a context size greater than 2048 tokens is specified (#274) |
Ronsor |
6f61c18e |
2023-03-18 22:39:46 |
Fix typo in readme |
Pavol Rusnak |
1e5a6d08 |
2023-03-18 22:20:04 |
Add note about Python 3.11 to readme |
Pavol Rusnak |
554b5415 |
2023-03-18 21:58:46 |
Add memory/disk requirements to readme |
Pavol Rusnak |
d3f202d5 |
2023-03-18 20:51:49 |
Remove unused code since n_vocab is model.hparams.n_vocab (#262) |
Alex Nguyen |
e03e3597 |
2023-03-18 07:44:09 |
fixed warning with std::ignore about unused function result (#151) |
Justin Suess |
a81d0c2a |
2023-03-18 04:17:19 |
Fix n^2 loop in tokenization (#254) |
Gary Linscott |
b2de7f18 |
2023-03-18 09:27:12 |
CI Improvements (#230) |
anzz1 |
a2927478 |
2023-03-17 23:03:48 |
Nix flake (#40) |
Niklas Korz |
c9f670a1 |
2023-03-17 21:05:58 |
Implement non-greedy tokenizer that tries to maximize token lengths (#242) |
thement |
4f546091 |
2023-03-17 21:46:46 |
Default to 4 threads (#243) |
Georgi Gerganov |
e81b9c81 |
2023-03-17 20:30:04 |
Update Contributing section |
Georgi Gerganov |
367946c6 |
2023-03-17 17:47:35 |
Don't tell users to use a bad number of threads (#243) |
Stephan Walter |
6b0df5cc |
2023-03-18 00:38:24 |
add pthread link to fix cmake build under linux (#114) |
mmyjona |
2af23d30 |
2023-03-17 10:47:06 |
🚀 Dockerize llamacpp (#132) |
Bernat Vadell |
904d2a8d |
2023-03-17 05:48:39 |
Q4_1 quantization (#193) |
Matvey Soloviev |
72131107 |
2023-03-16 15:00:09 |
Update README.md |
Georgi Gerganov |
ac15de78 |
2023-03-16 08:55:13 |
Expand "Contributing" section |
Georgi Gerganov |
273abc47 |
2023-03-16 07:12:12 |
Update hot topics - RMSnorm |
Georgi Gerganov |
9b4a15b1 |
2023-03-15 19:29:25 |
Fix RMS norm in GGML (#191) |
Nebula |
6eac39ba |
2023-03-15 18:41:38 |
Add RMS norm and use it (#187) |
hoangmit |
27944c42 |
2023-03-15 21:35:25 |
fixed typo (#178) |
moritzbrantner |
2d15d6c9 |
2023-03-15 13:56:24 |
add SIGINT support for _WIN32 environments (#120) |
Rickey Bowers Jr |
2d64715a |
2023-03-15 15:42:40 |
added ctx_size parameter (#148) |
Justin Suess |
16b2c61a |
2023-03-15 15:39:38 |
fixed color reset on exit (#149) |
Justin Suess |
977295c7 |
2023-03-15 22:39:06 |
Fix potential licensing issue (#126) |
Musab Gultekin |
956dfda8 |
2023-03-15 12:37:50 |
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142) |
Ronsor |
113e685d |
2023-03-15 15:05:14 |
inline -> static inline for "bytesFromNibbles" (#161) |
hoangmit |
47857e56 |
2023-03-14 12:34:37 |
Don't use vdotq_s32 if it's not available (#139) |
Ronsor |
60f819a2 |
2023-03-14 15:30:08 |
Add section to README on how to run the project on Android (#130) |
Radoslav Gerganov |
97ab2b25 |
2023-03-14 09:43:52 |
Add Misc section + update hot topics + minor fixes |
Georgi Gerganov |
2f700a27 |
2023-03-13 17:29:10 |
Add windows to the CI (#98) |
Sebastián A |
c09a9cfb |
2023-03-13 21:22:15 |
CMake build in Release by default (#75) |
Georgi Gerganov |
7ec903d3 |
2023-03-13 19:21:51 |
Update contribution section, hot topics, limitations, etc. |
Georgi Gerganov |
4497ad81 |
2023-03-13 19:15:08 |
Print system information |
Georgi Gerganov |
ed6849cc |
2023-03-13 14:12:33 |
Initial support for CMake (#75) |
Sebastián A |
41be0a3b |
2023-03-13 17:40:54 |
Add NetBSD support. (#90) |
Thomas Klausner |
671d5cac |
2023-03-13 17:39:56 |
Use fprintf for diagnostic output (#48) |
Pavol Rusnak |
84d9015c |
2023-03-13 18:36:44 |
Use vdotq_s32 to improve performance (#67) |
Georgi Gerganov |
63fd76fb |
2023-03-14 01:33:43 |
Reduce model loading time (#43) |
uint256_t |
2a20f48e |
2023-03-13 12:24:18 |
Fix UTF-8 handling (including colors) (#79) |
Val Kharitonov |
d1f22471 |
2023-03-13 17:15:20 |
Add quantize script for batch quantization (#92) |
Pavol Rusnak |
1808ee05 |
2023-03-13 09:42:26 |
Add initial contribution guidelines |
Georgi Gerganov |
a169bb88 |
2023-03-13 04:08:01 |
Gate signal support on being on a unixoid system. (#74) |
Matvey Soloviev |
460c4825 |
2023-03-13 00:35:51 |
Fix token count accounting |
Matvey Soloviev |
c80e2a8f |
2023-03-13 01:28:08 |
Revert "10% performance boost on ARM" |
Georgi Gerganov |
54a0e66e |
2023-03-13 01:21:03 |
Check for vdotq_s32 availability |
Georgi Gerganov |
543c57e9 |
2023-03-13 01:05:24 |
Amend to previous commit - forgot to update non-QRDMX branch |
Georgi Gerganov |
113a9e83 |
2023-03-13 00:56:10 |
10% performance boost on ARM |
Georgi Gerganov |
404fac0d |
2023-03-12 23:07:34 |
Fix color getting reset before prompt output done (#65) |
Matvey Soloviev |
1a0a7430 |
2023-03-12 23:39:01 |
Update README.md |
Georgi Gerganov |
96ea727f |
2023-03-12 22:13:28 |
Add interactive mode (#61) |
Matvey Soloviev |
96619548 |
2023-03-13 03:30:08 |
Fix typo in README (#45) |
Marc Köhlbrugge |
f385f8de |
2023-03-12 13:28:36 |
Allow using prompt files (#59) |
Ben Garney |
02f0c6fe |
2023-03-12 16:23:15 |
Add back top_k (#56) |
beiller |
eb062bb0 |
2023-03-12 17:15:00 |
Windows fixes (#31) |
Sebastián A |
7027a978 |
2023-03-12 22:09:26 |
Update README.md |
Georgi Gerganov |
2d555e5b |
2023-03-12 22:08:24 |
Add CI (#60) |
Georgi Gerganov |
7c9e54e5 |
2023-03-12 20:59:01 |
Revert "weights_only" arg - this is causing more trouble than help |
Georgi Gerganov |
b9bd1d01 |
2023-03-12 14:16:33 |
python/pytorch compat notes (#44) |
Oleksandr Nikitin |
129c7d1e |
2023-03-12 05:27:42 |
Add repetition penalty (#20) |
beiller |
702fddf5 |
2023-03-12 09:03:25 |
Clarify meaning of hacking |
Georgi Gerganov |
7d86e25b |
2023-03-12 08:41:54 |
README: add "Supported platforms" + update hot topics |
Georgi Gerganov |
a9312023 |
2023-03-11 22:36:35 |
use weights_only in conversion script (#32) |
deepdiffuser |
6a9a67f0 |
2023-03-12 07:36:03 |
Add LICENSE (#21) |
Pavol Rusnak |
da1a4ff0 |
2023-03-12 01:26:32 |
Update README.md |
Georgi Gerganov |
6b2cb630 |
2023-03-11 18:32:20 |
Fix a typo in model name (#16) |
Juraj Bednar |
4235e3d5 |
2023-03-11 18:10:18 |
Update README.md |
Georgi Gerganov |
f1eaff47 |
2023-03-11 17:58:18 |
Add AVX2 support for x86 architectures thanks to @Const-me ! |
Georgi Gerganov |
a9e58529 |
2023-03-11 17:40:14 |
Fix un-initialized FP16 tables on x86 (#15, #2) |
Georgi Gerganov |
7d9ed7b2 |
2023-03-11 12:44:21 |
Bump memory buffer |
Georgi Gerganov |
0c680332 |
2023-03-11 12:31:21 |
Update README.md |
Georgi Gerganov |
f60fa9e5 |
2023-03-11 12:26:46 |
.gitignore models/ |
Georgi Gerganov |
7211862c |
2023-03-11 12:26:16 |
Update Makefile var + add comment |
Georgi Gerganov |
a5c5ae2f |
2023-03-11 11:34:25 |
Update README.md |
Georgi Gerganov |
ea977e85 |
2023-03-11 11:34:11 |
Update README.md |
Georgi Gerganov |
007a8f6f |
2023-03-11 10:47:09 |
Support all LLaMA models + change Q4_0 quantization storage |
Georgi Gerganov |
5f2f970d |
2023-03-10 21:47:26 |
Include Python dependencies in README (#6) |
Simon Willison |
73c6ed5e |
2023-03-11 01:30:47 |
Update README.md |
Georgi Gerganov |
01eeed8f |
2023-03-11 01:22:58 |
Update README.md |
Georgi Gerganov |
6da2df34 |
2023-03-11 01:18:10 |
Update README.md |
Georgi Gerganov |
9dcf4dba |
2023-03-10 18:04:06 |
Add missing headers for memcpy and assert (#3) |
Jean-Michaël Celerier |
920a7fe2 |
2023-03-11 00:55:22 |
Update README.md |
Georgi Gerganov |
3a57ee59 |
2023-03-11 00:51:46 |
Update README.md |
Georgi Gerganov |
b8502852 |
2023-03-11 00:09:19 |
Update README.md |
Georgi Gerganov |
8a01f565 |
2023-03-10 23:53:11 |
Update README.md |
Georgi Gerganov |
70bc0b8b |
2023-03-10 23:46:39 |
Fix a bug in the rope calculation |
Georgi Gerganov |
18ebda34 |
2023-03-10 21:52:27 |
Update README.md |
Georgi Gerganov |
319cdb3e |
2023-03-10 21:50:46 |
Final touches |
Georgi Gerganov |
77532806 |
2023-03-10 21:47:46 |
Create README.md |
Georgi Gerganov |
26c08466 |
2023-03-10 20:40:58 |
Initial release |
Georgi Gerganov |