Liu Song’s Projects


~/Projects/llama.cpp

git clone https://code.lsong.org/llama.cpp

History

ref: master
Hash Date Commit message Author
ed3c680b 2023-03-30 11:16:30 Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) slaren
9cbc404b 2023-03-29 23:44:39 ci : re-enable AVX512 testing (Windows-MSVC) (#584) anzz1
b51c717d 2023-03-29 22:15:34 ggml : init time on first ggml_init() call Georgi Gerganov
0ba76c1e 2023-03-29 22:13:12 llama : fix compile warnings when reading the vocab Georgi Gerganov
cea1c859 2023-03-29 22:10:01 ggml : add ARM_NEON dequantize_row_q4_1() Georgi Gerganov
f202ada1 2023-03-29 22:03:02 ggml : add ARM_NEON quantize_row_q4_1() Georgi Gerganov
3b44d30d 2023-03-29 21:47:33 ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov
61cbfff5 2023-03-29 20:09:25 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak
d9ad1044 2023-03-29 19:21:09 Create chat-13B.bat (#592) Thérence
b467702b 2023-03-29 19:38:31 readme : fix typos Georgi Gerganov
516d88e7 2023-03-29 19:37:20 readme : add GPT4All instructions (close #588) Georgi Gerganov
53635c08 2023-03-29 19:29:26 py : add GPT4All conversion script Georgi Gerganov
41318d70 2023-03-29 18:10:07 llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou
a6956b25 2023-03-29 17:10:24 add example of re-act pattern (#583) Tobias Lütke
83df5639 2023-03-29 16:20:07 Fix GCC warning about binary literal (#595) anzz1
a5c42c4b 2023-03-29 16:19:29 Fix typo in llama.h (#593) anzz1
5a5f8b15 2023-03-28 22:44:29 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) anzz1
f1217055 2023-03-28 22:43:25 CI: fix subdirectory path globbing (#546) anzz1
7f4c5c66 2023-03-28 21:23:09 llama : fix linkage with mingw (#551) anzz1
2a98bc18 2023-03-28 20:06:03 ggml : add AVX2 implementation of quantize_row_q4_1 (#515) slaren
d0aaff57 2023-03-28 19:55:42 py : add temporary script to convert old ggml files to newer version (#539) thement
d0330fd7 2023-03-28 13:51:29 py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen
99c5b276 2023-03-28 17:13:01 ggml : refactor quantized processing functions (#509) Stephan Walter
692ce316 2023-03-29 02:02:34 py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David)
96f9c050 2023-03-28 20:01:09 ci : make ctest verbose, hopefully we see what is wrong with the sanitizer Georgi Gerganov
d502bc7c 2023-03-28 19:51:55 tests : free llama context at the end of the test Georgi Gerganov
436e5619 2023-03-28 16:48:20 all : be more strict about converting float to double (#458) Stephan Walter
20e1e848 2023-03-28 11:39:01 deploy : add a Package.swift for SwiftPM support (#393) Jed Fox
c1f88506 2023-03-28 15:56:03 ggml : introduce structs for the q4 data blocks (#356) Stephan Walter
e0670260 2023-03-28 18:34:35 gitignore : add "embedding" Georgi Gerganov
28ba975a 2023-03-28 23:06:28 Check the existence of f16_model_path_base in quantize.py (#574) dotpy314
a6bdc47c 2023-03-28 16:26:55 Fix usage of F16C intrinsics in AVX code (#563) slaren
7b8dbcb7 2023-03-28 17:09:55 main.cpp fixes, refactoring (#571) anzz1
4b8efff0 2023-03-28 08:11:09 Add embedding example to Makefile (#540) RJ Adriaansen
7e539557 2023-03-27 06:55:26 Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) Marco Matthies
34c1072e 2023-03-26 17:48:40 ci: add debug build to sanitizer build matrix (#527) Erik Scholz
939ad2d3 2023-03-26 15:34:02 Fix undefined variables in debug build, remove unused variables (#531) Stephan Walter
8c2ec5e2 2023-03-26 10:48:42 Add support for linux/arm64 platform during Docker Builds (#514) Juan Calderon-Perez
b391579d 2023-03-26 13:14:01 Update README and comments for standalone perplexity tool (#525) Stephan Walter
7a87d31f 2023-03-26 16:06:10 [main] fix infinite generation (-n == -1) (#523) anzz1
348d6926 2023-03-26 10:20:49 Add logo to README.md Georgi Gerganov
33e35b8f 2023-03-26 07:25:46 Exit from interactive mode if input stream is bad (#491) Harald Fernengel
19726169 2023-03-26 00:13:28 CI: Run other sanitizer builds even if one fails (#511) anzz1
f732695c 2023-03-25 14:53:55 Clarify console output in convert-pth-to-ggml.py (#512) jp-x-g
2f7bf7dd 2023-03-25 23:38:11 CMake / CI additions (#497) anzz1
34ab5268 2023-03-25 22:29:22 (Windows) Set console to UTF-8 on init (#420) anzz1
c2b25b69 2023-03-25 21:53:39 Fix colors enabling on WIN32 Georgi Gerganov
79b2b266 2023-03-25 21:51:41 If n_predict == -1, generate forever Georgi Gerganov
e2d490da 2023-03-25 21:36:22 Inifinite generation via context swapping (#71) Georgi Gerganov
03f7e335 2023-03-25 20:51:14 Cleanup STL headers + fix embedding examples + minor stuff Georgi Gerganov
55ad42af 2023-03-25 20:36:52 Move chat scripts into "./examples" Georgi Gerganov
459e93cc 2023-03-25 19:31:48 Add AVX2 implementation of dequantize_row_q4_1 (#505) slaren
a316a425 2023-03-25 20:26:40 Overhaul the examples structure Georgi Gerganov
ecbe466a 2023-03-25 19:47:21 Retire the ggml_mul_mat() branch for transposed src0 (#500) Georgi Gerganov
502a4001 2023-03-25 17:16:50 Disable prompt verbosity by default and add option to enable (#480) Georgi Gerganov
09aecbf6 2023-03-25 16:06:49 Add AVX2 implementation of dequantize_row_q4_0 (#467) slaren
4640eff2 2023-03-25 17:03:10 Don't interefe with BLAS for large prompts by running only 1 thread Georgi Gerganov
ab77d763 2023-03-25 16:47:59 Add longer DAN prompt for testing big batch numbers Georgi Gerganov
29b7baab 2023-03-25 15:34:23 Add timings for the prompt evaluation (#478) slaren
4a7129ac 2023-03-25 16:30:32 Remove obsolete information from README Georgi Gerganov
6b6dbc89 2023-03-25 16:22:05 Remove obsolete assert and fix compiler warning Georgi Gerganov
2a2e63ce 2023-03-25 16:09:54 Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS Georgi Gerganov
e899bf54 2023-03-25 14:42:09 bounds checking for input prefix (#492) anzz1
fbd4d38c 2023-03-25 14:03:19 feat: '--in-prefix STRING' option (#426) anzz1
58e6c9f3 2023-03-25 01:26:28 Add support for file load progress reporting callbacks (#434) Jed Fox
36d07532 2023-03-25 01:21:24 Add missing struct annotation (#483) Doomsdayrs
6f1ee4b6 2023-03-24 23:38:14 Fix crash for 65B model with pre-allocated memory (#485) Chris Kuehl
8520fc31 2023-03-24 23:47:06 Disable BLAS altogether - the bug is not just for qunatized mat mul Georgi Gerganov
b3f460e9 2023-03-24 23:39:17 Disable BLAS branch in mul_mat - seems there is a bug Georgi Gerganov
04c6f5ed 2023-03-24 23:17:58 Immediately start processing the prompt before user input has been provided (#476) Georgi Gerganov
7a9b6c3a 2023-03-24 23:17:37 Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov
31572d96 2023-03-24 18:23:56 Temporary bump the memory buffer size - hopefully fix issues from 483bab2e Georgi Gerganov
f4f5362e 2023-03-24 15:23:09 Update README.md (#444) Gary Mulder
863f65e2 2023-03-24 10:22:39 fix instruct mode (#445) rabidcopy
afd220d9 2023-03-24 17:21:01 Properly free llama_context on failure Georgi Gerganov
481044d5 2023-03-24 08:19:26 additional optimizations for POWER9 (#454) Cameron Kaiser
563cdc39 2023-03-24 08:19:05 Support calling mlock() on loaded model data on Linux and macOS (#453) comex
8d4a855c 2023-03-24 08:05:13 Add embedding mode with arg flag. Currently working (#282) Luciano
b6b268d4 2023-03-24 09:13:35 Add link to Roadmap discussion Georgi Gerganov
3cd8dde0 2023-03-24 06:22:28 Revert "Fix memory allocation issues and seg faults" Georgi Gerganov
4870e455 2023-03-24 00:11:53 Fix memory allocation issues and seg faults Georgi Gerganov
483bab2e 2023-03-23 23:22:01 Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) Georgi Gerganov
404e1da3 2023-03-23 16:42:52 Fix quantize script not finding models in parent directory (#428) Jed Fox
4cc053b6 2023-03-23 22:39:44 Remove oboslete command from Docker script Georgi Gerganov
0ba5a3a9 2023-03-23 22:32:02 Obsolete Georgi Gerganov
2e17dfd8 2023-03-23 15:22:47 Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) rabidcopy
20a1a4e0 2023-03-23 10:18:13 Fix GPTQ converter (#423) Timmy Knight
ad072fc5 2023-03-24 05:16:48 Generate library with CMake (#430) nusu-github
ea10d3de 2023-03-23 19:54:28 Command line args bounds checking (#424) anzz1
a18c1925 2023-03-22 00:37:02 Fix Nix build Ben Siraphob
a50e39c6 2023-03-23 14:15:48 Revert "Delete SHA256SUMS for now" (#429) Stephan Walter
a140219e 2023-03-23 05:41:32 Fix Makefile echo escape codes (by removing them). (#418) Kerfuffle
8a3e5ef8 2023-03-23 11:30:40 Move model section from issue template to README.md (#421) Gary Mulder
8eea5ae0 2023-03-23 12:26:19 Delete SHA256SUMS for now (#416) anzz1
93208cfb 2023-03-23 10:46:58 Adjust repetition penalty .. Georgi Gerganov
03ace14c 2023-03-23 09:48:51 Add link to recent podcast about whisper.cpp and llama.cpp Georgi Gerganov
e4412b45 2023-03-23 04:20:34 CI: CMake: Separate build and test steps (#376) anzz1
f7dc43bc 2023-03-23 01:30:23 Fix instruct mode broken by PR #354 (#409) tjohnman
ee8a7887 2023-03-22 19:06:18 Update issue template so people will use it (#404) Gary Mulder
69c92298 2023-03-22 17:29:06 Deduplicate q4 quantization functions (#383) Stephan Walter
97940520 2023-03-22 18:20:25 fix: add POSIX functionality for Linux compilation (#51) Valentyn Bezshapkin
305ba6f0 2023-03-22 18:16:35 Don't force immediate interactive without `-i` (#354) tjohnman
4122dfff 2023-03-22 17:37:10 cmake: make llama an actual library (#392) Erik Scholz
56e659a0 2023-03-22 17:09:38 fix perplexity after c-api refactor (#390) Erik Scholz
40ea807a 2023-03-22 08:53:54 Add details on perplexity to README.md (#395) Gary Linscott
d5850c53 2023-03-22 11:55:45 Add missing header for memcpy (#386) Yusuf Kağan Hanoğlu
ae44e23e 2023-03-22 07:47:15 When seed <= 0 - use the clock to generate one Georgi Gerganov
928480ef 2023-03-22 07:45:00 Init llama_context_params properly from CLI (#370) Georgi Gerganov
56817b1f 2023-03-22 07:34:02 Remove temporary notice and update hot topics Georgi Gerganov
f5a77a62 2023-03-22 07:32:36 Introduce C-style API (#370) Georgi Gerganov
da0e9fe9 2023-03-20 20:14:06 Add SHA256SUMS file and instructions to README how to obtain and verify the downloads Gary Mulder
e6c9e098 2023-03-21 23:49:24 Fix bin dir for win ci anzz1
01a297b0 2023-03-21 22:34:25 specify build type for ctest on windows (#371) Erik Scholz
3366853e 2023-03-21 22:57:35 Add notice about pending change Georgi Gerganov
3f9c6135 2023-03-21 16:52:27 fix typo in chatLLaMa (#368) Mathieu Nayrolles
0f613527 2023-03-21 19:47:27 Update issue templates Georgi Gerganov
353ec251 2023-03-21 14:21:50 We could use std::unordered_map over std::map (#305) Fabio R. Sluzala
89d5d90f 2023-03-21 18:11:01 Fix color codes emitting mid-UTF8 code. (#312) Matvey Soloviev
16ffc013 2023-03-21 09:42:25 Importer for GPTQ quantized LLaMA models (#301) comex
486ae645 2023-03-21 09:27:42 Compute perplexity over prompt (#270) Gary Linscott
3ab3e658 2023-03-21 18:23:15 Add chatLLaMa script (#198) Jean-Christophe Hoelt
f157088c 2023-03-21 11:21:06 makefile: Fix CPU feature detection on Haiku (#218) Alex von Gluck IV
c86ba036 2023-03-21 18:14:46 Enable ANSI colors on Windows 10+ (#311) anzz1
1daf4dd7 2023-03-21 18:10:32 Minor style changes Georgi Gerganov
dc6a845b 2023-03-21 18:09:37 Add chat.sh script Georgi Gerganov
6a612959 2023-03-21 17:05:06 Check for reverse prompt by characters instead of tokens (#292) (#330) tjohnman
d5f56a5e 2023-03-21 17:04:43 Check for reverse prompt by characters instead of tokens (#292) (#330) tjohnman
3bfa3b43 2023-03-21 17:59:16 Fix convert script, warnings alpaca instructions, default params Georgi Gerganov
715d292e 2023-03-21 09:50:09 Add OpenBSD support (#314) Kevin Lo
c98ae026 2023-03-21 08:49:43 fix typo in comment (#318) Mack Straight
c3b2306b 2023-03-21 23:44:11 Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335) Qingyou Meng
975d2ceb 2023-03-21 17:42:43 cmdline option for custom amount of model parts (--n_parts N) (#348) anzz1
e0ffc861 2023-03-21 08:34:49 Update IPFS links to quantized alpaca with new tokenizer format (#352) Kevin Kwok
8f644a0a 2023-03-21 17:32:14 Change default repeat_penalty to 1.0 Georgi Gerganov
eb34620a 2023-03-21 17:29:41 Add tokenizer test + revert to C++11 (#355) Georgi Gerganov
2e664f1f 2023-03-21 07:35:42 Add initial AVX512 support for dot product on Linux (#320) Casey Primozic
8cf9f34e 2023-03-21 09:37:16 Adding missing features of CMakeLists.txt & Refactoring (#131) nusu-github
bd4b46d6 2023-03-20 16:44:30 Nix flake: set meta.mainProgram to llama Ben Siraphob
6b6d5b50 2023-03-21 03:33:10 Fixed tokenizer.model not found error when model dir is symlink (#325) Qingyou Meng
a791a68b 2023-03-20 12:26:01 move file magic/version to header, print expected version (#319) Mack Straight
0f1b21cb 2023-03-20 18:05:20 Docker - Fix publish docker image in GitHub Registry (#235) Bernat Vadell
074bea2e 2023-03-20 03:17:23 sentencepiece bpe compatible tokenizer (#252) Mack Straight
5cb63e24 2023-03-20 08:24:11 Add tqdm to Python requirements (#293) Stephan Walter
da5303c1 2023-03-19 17:44:20 bugfix: default should not be interactive (#304) cocktailpeanut
4545539d 2023-03-19 21:58:51 Rename script Georgi Gerganov
edeba283 2023-03-19 21:57:28 Add temporary helper script for Alpaca chat Georgi Gerganov
5c19c70b 2023-03-19 13:44:30 fix coloring of last `n_batch` of prompt, and refactor line input (#221) Rickey Bowers Jr
24568371 2023-03-19 20:33:06 Support for multiple reverse prompts. (#299) tjohnman
7392f1cd 2023-03-19 12:38:44 Improved quantize script (#222) Suaj Carrot
ad5fd5b6 2023-03-19 19:36:19 Make prompt randomization optional. (#300) tjohnman
368d0c8a 2023-03-19 19:31:17 Respect the maximum number of tokens in interactive. (#298) tjohnman
50fae10d 2023-03-19 19:22:48 Add --ignore-eos parameter (#181) slaren
084e2f0e 2023-03-20 02:10:00 interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283) Qingyou Meng
0b366e73 2023-03-19 18:57:00 Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294) Erik Scholz
160bfb21 2023-03-19 19:51:55 Update hot topics to mention Alpaca support Georgi Gerganov
c494ed5b 2023-03-19 19:46:32 Fix off-by-one bug (#115) Georgi Gerganov
c1c7026b 2023-03-19 19:33:18 Fix python stuff (#109) Georgi Gerganov
467b1497 2023-03-19 20:17:39 Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109) qunash
70f01cb8 2023-03-19 19:04:44 Drop trailing new line from file prompts (#80) Georgi Gerganov
a4e63b73 2023-03-19 18:49:50 Add instruction for using Alpaca (#240) Georgi Gerganov
9e170721 2023-03-19 18:37:02 Add "--instruct" argument for usage with Alpaca (#240) Georgi Gerganov
22213a17 2023-03-19 17:30:00 Change RMSNorm eps to 1e-6 (#173) Georgi Gerganov
d7def1a7 2023-03-18 17:10:47 Warn user if a context size greater than 2048 tokens is specified (#274) Ronsor
6f61c18e 2023-03-18 22:39:46 Fix typo in readme Pavol Rusnak
1e5a6d08 2023-03-18 22:20:04 Add note about Python 3.11 to readme Pavol Rusnak
554b5415 2023-03-18 21:58:46 Add memory/disk requirements to readme Pavol Rusnak
d3f202d5 2023-03-18 20:51:49 Remove unused code since n_vocab is model.hparams.n_vocab (#262) Alex Nguyen
e03e3597 2023-03-18 07:44:09 fixed warning with std::ignore about unused function result (#151) Justin Suess
a81d0c2a 2023-03-18 04:17:19 Fix n^2 loop in tokenization (#254) Gary Linscott
b2de7f18 2023-03-18 09:27:12 CI Improvements (#230) anzz1
a2927478 2023-03-17 23:03:48 Nix flake (#40) Niklas Korz
c9f670a1 2023-03-17 21:05:58 Implement non-greedy tokenizer that tries to maximize token lengths (#242) thement
4f546091 2023-03-17 21:46:46 Default to 4 threads (#243) Georgi Gerganov
e81b9c81 2023-03-17 20:30:04 Update Contributing section Georgi Gerganov
367946c6 2023-03-17 17:47:35 Don't tell users to use a bad number of threads (#243) Stephan Walter
6b0df5cc 2023-03-18 00:38:24 add ptread link to fix cmake build under linux (#114) mmyjona
2af23d30 2023-03-17 10:47:06 🚀 Dockerize llamacpp (#132) Bernat Vadell
904d2a8d 2023-03-17 05:48:39 Q4_1 quantization (#193) Matvey Soloviev
72131107 2023-03-16 15:00:09 Update README.md Georgi Gerganov
ac15de78 2023-03-16 08:55:13 Expand "Contributing" section Georgi Gerganov
273abc47 2023-03-16 07:12:12 Update hot topics - RMSnorm Georgi Gerganov
9b4a15b1 2023-03-15 19:29:25 Fix RMS norm in GGML (#191) Nebula
6eac39ba 2023-03-15 18:41:38 Add RMS norm and use it (#187) hoangmit
27944c42 2023-03-15 21:35:25 fixed typo (#178) moritzbrantner
2d15d6c9 2023-03-15 13:56:24 add SIGINT support for _WIN32 environments (#120) Rickey Bowers Jr
2d64715a 2023-03-15 15:42:40 added ctx_size parameter (#148) Justin Suess
16b2c61a 2023-03-15 15:39:38 fixed color reset on exit (#149) Justin Suess
977295c7 2023-03-15 22:39:06 Fix potential licensing issue (#126) Musab Gultekin
956dfda8 2023-03-15 12:37:50 Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142) Ronsor
113e685d 2023-03-15 15:05:14 inline -> static inline for "bytesFromNibbles" (#161) hoangmit
47857e56 2023-03-14 12:34:37 Don't use vdotq_s32 if it's not available (#139) Ronsor
60f819a2 2023-03-14 15:30:08 Add section to README on how to run the project on Android (#130) Radoslav Gerganov
97ab2b25 2023-03-14 09:43:52 Add Misc section + update hot topics + minor fixes Georgi Gerganov
2f700a27 2023-03-13 17:29:10 Add windows to the CI (#98) Sebastián A
c09a9cfb 2023-03-13 21:22:15 CMake build in Release by default (#75) Georgi Gerganov
7ec903d3 2023-03-13 19:21:51 Update contribution section, hot topics, limitations, etc. Georgi Gerganov
4497ad81 2023-03-13 19:15:08 Print system information Georgi Gerganov
ed6849cc 2023-03-13 14:12:33 Initial support for CMake (#75) Sebastián A
41be0a3b 2023-03-13 17:40:54 Add NetBSD support. (#90) Thomas Klausner
671d5cac 2023-03-13 17:39:56 Use fprintf for diagnostic output (#48) Pavol Rusnak
84d9015c 2023-03-13 18:36:44 Use vdotq_s32 to improve performance (#67) Georgi Gerganov
63fd76fb 2023-03-14 01:33:43 Reduce model loading time (#43) uint256_t
2a20f48e 2023-03-13 12:24:18 Fix UTF-8 handling (including colors) (#79) Val Kharitonov
d1f22471 2023-03-13 17:15:20 Add quantize script for batch quantization (#92) Pavol Rusnak
1808ee05 2023-03-13 09:42:26 Add initial contribution guidelines Georgi Gerganov
a169bb88 2023-03-13 04:08:01 Gate signal support on being on a unixoid system. (#74) Matvey Soloviev
460c4825 2023-03-13 00:35:51 Fix token count accounting Matvey Soloviev
c80e2a8f 2023-03-13 01:28:08 Revert "10% performance boost on ARM" Georgi Gerganov
54a0e66e 2023-03-13 01:21:03 Check for vdotq_s32 availability Georgi Gerganov
543c57e9 2023-03-13 01:05:24 Ammend to previous commit - forgot to update non-QRDMX branch Georgi Gerganov
113a9e83 2023-03-13 00:56:10 10% performance boost on ARM Georgi Gerganov
404fac0d 2023-03-12 23:07:34 Fix color getting reset before prompt output done (#65) Matvey Soloviev
1a0a7430 2023-03-12 23:39:01 Update README.md Georgi Gerganov
96ea727f 2023-03-12 22:13:28 Add interactive mode (#61) Matvey Soloviev
96619548 2023-03-13 03:30:08 Fix typo in README (#45) Marc Köhlbrugge
f385f8de 2023-03-12 13:28:36 Allow using prompt files (#59) Ben Garney
02f0c6fe 2023-03-12 16:23:15 Add back top_k (#56) beiller
eb062bb0 2023-03-12 17:15:00 Windows fixes (#31) Sebastián A
7027a978 2023-03-12 22:09:26 Update README.md Georgi Gerganov
2d555e5b 2023-03-12 22:08:24 Add CI (#60) Georgi Gerganov
7c9e54e5 2023-03-12 20:59:01 Revert "weights_only" arg - this causing more trouble than help Georgi Gerganov
b9bd1d01 2023-03-12 14:16:33 python/pytorch compat notes (#44) Oleksandr Nikitin
129c7d1e 2023-03-12 05:27:42 Add repetition penalty (#20) beiller
702fddf5 2023-03-12 09:03:25 Clarify meaning of hacking Georgi Gerganov
7d86e25b 2023-03-12 08:41:54 README: add "Supported platforms" + update hot topics Georgi Gerganov
a9312023 2023-03-11 22:36:35 use weights_only in conversion script (#32) deepdiffuser
6a9a67f0 2023-03-12 07:36:03 Add LICENSE (#21) Pavol Rusnak
da1a4ff0 2023-03-12 01:26:32 Update README.md Georgi Gerganov
6b2cb630 2023-03-11 18:32:20 Fix a typo in model name (#16) Juraj Bednar
4235e3d5 2023-03-11 18:10:18 Update README.md Georgi Gerganov
f1eaff47 2023-03-11 17:58:18 Add AVX2 support for x86 architectures thanks to @Const-me ! Georgi Gerganov
a9e58529 2023-03-11 17:40:14 Fix un-initialized FP16 tables on x86 (#15, #2) Georgi Gerganov
7d9ed7b2 2023-03-11 12:44:21 Bump memory buffer Georgi Gerganov
0c680332 2023-03-11 12:31:21 Update README.md Georgi Gerganov
f60fa9e5 2023-03-11 12:26:46 .gitignore models/ Georgi Gerganov
7211862c 2023-03-11 12:26:16 Update Makefile var + add comment Georgi Gerganov
a5c5ae2f 2023-03-11 11:34:25 Update README.md Georgi Gerganov
ea977e85 2023-03-11 11:34:11 Update README.md Georgi Gerganov
007a8f6f 2023-03-11 10:47:09 Support all LLaMA models + change Q4_0 quantization storage Georgi Gerganov
5f2f970d 2023-03-10 21:47:26 Include Python dependencies in README (#6) Simon Willison
73c6ed5e 2023-03-11 01:30:47 Update README.md Georgi Gerganov
01eeed8f 2023-03-11 01:22:58 Update README.md Georgi Gerganov
6da2df34 2023-03-11 01:18:10 Update README.md Georgi Gerganov
9dcf4dba 2023-03-10 18:04:06 Add missing headers for memcpy and assert (#3) Jean-Michaël Celerier
920a7fe2 2023-03-11 00:55:22 Update README.md Georgi Gerganov
3a57ee59 2023-03-11 00:51:46 Update README.md Georgi Gerganov
b8502852 2023-03-11 00:09:19 Update README.md Georgi Gerganov
8a01f565 2023-03-10 23:53:11 Update README.md Georgi Gerganov
70bc0b8b 2023-03-10 23:46:39 Fix a bug in the rope calculation Georgi Gerganov
18ebda34 2023-03-10 21:52:27 Update README.md Georgi Gerganov
319cdb3e 2023-03-10 21:50:46 Final touches Georgi Gerganov
77532806 2023-03-10 21:47:46 Create README.md Georgi Gerganov
26c08466 2023-03-10 20:40:58 Initial release Georgi Gerganov