Liu Song’s Projects

Hash	Date	Commit message	Author
ed3c680b	2023-03-30 11:16:30	Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)	slaren
9cbc404b	2023-03-29 23:44:39	ci : re-enable AVX512 testing (Windows-MSVC) (#584)	anzz1
b51c717d	2023-03-29 22:15:34	ggml : init time on first ggml_init() call	Georgi Gerganov
0ba76c1e	2023-03-29 22:13:12	llama : fix compile warnings when reading the vocab	Georgi Gerganov
cea1c859	2023-03-29 22:10:01	ggml : add ARM_NEON dequantize_row_q4_1()	Georgi Gerganov
f202ada1	2023-03-29 22:03:02	ggml : add ARM_NEON quantize_row_q4_1()	Georgi Gerganov
3b44d30d	2023-03-29 21:47:33	ggml : add ARM_NEON ggml_vec_dot_q4_1()	Georgi Gerganov
61cbfff5	2023-03-29 20:09:25	rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)	Pavol Rusnak
d9ad1044	2023-03-29 19:21:09	Create chat-13B.bat (#592)	Thérence
b467702b	2023-03-29 19:38:31	readme : fix typos	Georgi Gerganov
516d88e7	2023-03-29 19:37:20	readme : add GPT4All instructions (close #588)	Georgi Gerganov
53635c08	2023-03-29 19:29:26	py : add GPT4All conversion script	Georgi Gerganov
41318d70	2023-03-29 18:10:07	llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)	Maël Kerbiriou
a6956b25	2023-03-29 17:10:24	add example of re-act pattern (#583)	Tobias Lütke
83df5639	2023-03-29 16:20:07	Fix GCC warning about binary literal (#595)	anzz1
a5c42c4b	2023-03-29 16:19:29	Fix typo in llama.h (#593)	anzz1
5a5f8b15	2023-03-28 22:44:29	Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)	anzz1
f1217055	2023-03-28 22:43:25	CI: fix subdirectory path globbing (#546)	anzz1
7f4c5c66	2023-03-28 21:23:09	llama : fix linkage with mingw (#551)	anzz1
2a98bc18	2023-03-28 20:06:03	ggml : add AVX2 implementation of quantize_row_q4_1 (#515)	slaren
d0aaff57	2023-03-28 19:55:42	py : add temporary script to convert old ggml files to newer version (#539)	thement
d0330fd7	2023-03-28 13:51:29	py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403)	Tai Duc Nguyen
99c5b276	2023-03-28 17:13:01	ggml : refactor quantized processing functions (#509)	Stephan Walter
692ce316	2023-03-29 02:02:34	py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)	DooWoong Lee (David)
96f9c050	2023-03-28 20:01:09	ci : make ctest verbose, hopefully we see what is wrong with the sanitizer	Georgi Gerganov
d502bc7c	2023-03-28 19:51:55	tests : free llama context at the end of the test	Georgi Gerganov
436e5619	2023-03-28 16:48:20	all : be more strict about converting float to double (#458)	Stephan Walter
20e1e848	2023-03-28 11:39:01	deploy : add a Package.swift for SwiftPM support (#393)	Jed Fox
c1f88506	2023-03-28 15:56:03	ggml : introduce structs for the q4 data blocks (#356)	Stephan Walter
e0670260	2023-03-28 18:34:35	gitignore : add "embedding"	Georgi Gerganov
28ba975a	2023-03-28 23:06:28	Check the existence of f16_model_path_base in quantize.py (#574)	dotpy314
a6bdc47c	2023-03-28 16:26:55	Fix usage of F16C intrinsics in AVX code (#563)	slaren
7b8dbcb7	2023-03-28 17:09:55	main.cpp fixes, refactoring (#571)	anzz1
4b8efff0	2023-03-28 08:11:09	Add embedding example to Makefile (#540)	RJ Adriaansen
7e539557	2023-03-27 06:55:26	Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)	Marco Matthies
34c1072e	2023-03-26 17:48:40	ci: add debug build to sanitizer build matrix (#527)	Erik Scholz
939ad2d3	2023-03-26 15:34:02	Fix undefined variables in debug build, remove unused variables (#531)	Stephan Walter
8c2ec5e2	2023-03-26 10:48:42	Add support for linux/arm64 platform during Docker Builds (#514)	Juan Calderon-Perez
b391579d	2023-03-26 13:14:01	Update README and comments for standalone perplexity tool (#525)	Stephan Walter
7a87d31f	2023-03-26 16:06:10	[main] fix infinite generation (-n == -1) (#523)	anzz1
348d6926	2023-03-26 10:20:49	Add logo to README.md	Georgi Gerganov
33e35b8f	2023-03-26 07:25:46	Exit from interactive mode if input stream is bad (#491)	Harald Fernengel
19726169	2023-03-26 00:13:28	CI: Run other sanitizer builds even if one fails (#511)	anzz1
f732695c	2023-03-25 14:53:55	Clarify console output in convert-pth-to-ggml.py (#512)	jp-x-g
2f7bf7dd	2023-03-25 23:38:11	CMake / CI additions (#497)	anzz1
34ab5268	2023-03-25 22:29:22	(Windows) Set console to UTF-8 on init (#420)	anzz1
c2b25b69	2023-03-25 21:53:39	Fix colors enabling on WIN32	Georgi Gerganov
79b2b266	2023-03-25 21:51:41	If n_predict == -1, generate forever	Georgi Gerganov
e2d490da	2023-03-25 21:36:22	Inifinite generation via context swapping (#71)	Georgi Gerganov
03f7e335	2023-03-25 20:51:14	Cleanup STL headers + fix embedding examples + minor stuff	Georgi Gerganov
55ad42af	2023-03-25 20:36:52	Move chat scripts into "./examples"	Georgi Gerganov
459e93cc	2023-03-25 19:31:48	Add AVX2 implementation of dequantize_row_q4_1 (#505)	slaren
a316a425	2023-03-25 20:26:40	Overhaul the examples structure	Georgi Gerganov
ecbe466a	2023-03-25 19:47:21	Retire the ggml_mul_mat() branch for transposed src0 (#500)	Georgi Gerganov
502a4001	2023-03-25 17:16:50	Disable prompt verbosity by default and add option to enable (#480)	Georgi Gerganov
09aecbf6	2023-03-25 16:06:49	Add AVX2 implementation of dequantize_row_q4_0 (#467)	slaren
4640eff2	2023-03-25 17:03:10	Don't interefe with BLAS for large prompts by running only 1 thread	Georgi Gerganov
ab77d763	2023-03-25 16:47:59	Add longer DAN prompt for testing big batch numbers	Georgi Gerganov
29b7baab	2023-03-25 15:34:23	Add timings for the prompt evaluation (#478)	slaren
4a7129ac	2023-03-25 16:30:32	Remove obsolete information from README	Georgi Gerganov
6b6dbc89	2023-03-25 16:22:05	Remove obsolete assert and fix compiler warning	Georgi Gerganov
2a2e63ce	2023-03-25 16:09:54	Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS	Georgi Gerganov
e899bf54	2023-03-25 14:42:09	bounds checking for input prefix (#492)	anzz1
fbd4d38c	2023-03-25 14:03:19	feat: '--in-prefix STRING' option (#426)	anzz1
58e6c9f3	2023-03-25 01:26:28	Add support for file load progress reporting callbacks (#434)	Jed Fox
36d07532	2023-03-25 01:21:24	Add missing struct annotation (#483)	Doomsdayrs
6f1ee4b6	2023-03-24 23:38:14	Fix crash for 65B model with pre-allocated memory (#485)	Chris Kuehl
8520fc31	2023-03-24 23:47:06	Disable BLAS altogether - the bug is not just for qunatized mat mul	Georgi Gerganov
b3f460e9	2023-03-24 23:39:17	Disable BLAS branch in mul_mat - seems there is a bug	Georgi Gerganov
04c6f5ed	2023-03-24 23:17:58	Immediately start processing the prompt before user input has been provided (#476)	Georgi Gerganov
7a9b6c3a	2023-03-24 23:17:37	Reduce memory usage and allocate enough memory for largest context (#473)	Georgi Gerganov
31572d96	2023-03-24 18:23:56	Temporary bump the memory buffer size - hopefully fix issues from 483bab2e	Georgi Gerganov
f4f5362e	2023-03-24 15:23:09	Update README.md (#444)	Gary Mulder
863f65e2	2023-03-24 10:22:39	fix instruct mode (#445)	rabidcopy
afd220d9	2023-03-24 17:21:01	Properly free llama_context on failure	Georgi Gerganov
481044d5	2023-03-24 08:19:26	additional optimizations for POWER9 (#454)	Cameron Kaiser
563cdc39	2023-03-24 08:19:05	Support calling mlock() on loaded model data on Linux and macOS (#453)	comex
8d4a855c	2023-03-24 08:05:13	Add embedding mode with arg flag. Currently working (#282)	Luciano
b6b268d4	2023-03-24 09:13:35	Add link to Roadmap discussion	Georgi Gerganov
3cd8dde0	2023-03-24 06:22:28	Revert "Fix memory allocation issues and seg faults"	Georgi Gerganov
4870e455	2023-03-24 00:11:53	Fix memory allocation issues and seg faults	Georgi Gerganov
483bab2e	2023-03-23 23:22:01	Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439)	Georgi Gerganov
404e1da3	2023-03-23 16:42:52	Fix quantize script not finding models in parent directory (#428)	Jed Fox
4cc053b6	2023-03-23 22:39:44	Remove oboslete command from Docker script	Georgi Gerganov
0ba5a3a9	2023-03-23 22:32:02	Obsolete	Georgi Gerganov
2e17dfd8	2023-03-23 15:22:47	Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333)	rabidcopy
20a1a4e0	2023-03-23 10:18:13	Fix GPTQ converter (#423)	Timmy Knight
ad072fc5	2023-03-24 05:16:48	Generate library with CMake (#430)	nusu-github
ea10d3de	2023-03-23 19:54:28	Command line args bounds checking (#424)	anzz1
a18c1925	2023-03-22 00:37:02	Fix Nix build	Ben Siraphob
a50e39c6	2023-03-23 14:15:48	Revert "Delete SHA256SUMS for now" (#429)	Stephan Walter
a140219e	2023-03-23 05:41:32	Fix Makefile echo escape codes (by removing them). (#418)	Kerfuffle
8a3e5ef8	2023-03-23 11:30:40	Move model section from issue template to README.md (#421)	Gary Mulder
8eea5ae0	2023-03-23 12:26:19	Delete SHA256SUMS for now (#416)	anzz1
93208cfb	2023-03-23 10:46:58	Adjust repetition penalty ..	Georgi Gerganov
03ace14c	2023-03-23 09:48:51	Add link to recent podcast about whisper.cpp and llama.cpp	Georgi Gerganov
e4412b45	2023-03-23 04:20:34	CI: CMake: Separate build and test steps (#376)	anzz1
f7dc43bc	2023-03-23 01:30:23	Fix instruct mode broken by PR #354 (#409)	tjohnman
ee8a7887	2023-03-22 19:06:18	Update issue template so people will use it (#404)	Gary Mulder
69c92298	2023-03-22 17:29:06	Deduplicate q4 quantization functions (#383)	Stephan Walter
97940520	2023-03-22 18:20:25	fix: add POSIX functionality for Linux compilation (#51)	Valentyn Bezshapkin
305ba6f0	2023-03-22 18:16:35	Don't force immediate interactive without `-i` (#354)	tjohnman
4122dfff	2023-03-22 17:37:10	cmake: make llama an actual library (#392)	Erik Scholz
56e659a0	2023-03-22 17:09:38	fix perplexity after c-api refactor (#390)	Erik Scholz
40ea807a	2023-03-22 08:53:54	Add details on perplexity to README.md (#395)	Gary Linscott
d5850c53	2023-03-22 11:55:45	Add missing header for memcpy (#386)	Yusuf Kağan Hanoğlu
ae44e23e	2023-03-22 07:47:15	When seed <= 0 - use the clock to generate one	Georgi Gerganov
928480ef	2023-03-22 07:45:00	Init llama_context_params properly from CLI (#370)	Georgi Gerganov
56817b1f	2023-03-22 07:34:02	Remove temporary notice and update hot topics	Georgi Gerganov
f5a77a62	2023-03-22 07:32:36	Introduce C-style API (#370)	Georgi Gerganov
da0e9fe9	2023-03-20 20:14:06	Add SHA256SUMS file and instructions to README how to obtain and verify the downloads	Gary Mulder
e6c9e098	2023-03-21 23:49:24	Fix bin dir for win ci	anzz1
01a297b0	2023-03-21 22:34:25	specify build type for ctest on windows (#371)	Erik Scholz
3366853e	2023-03-21 22:57:35	Add notice about pending change	Georgi Gerganov
3f9c6135	2023-03-21 16:52:27	fix typo in chatLLaMa (#368)	Mathieu Nayrolles
0f613527	2023-03-21 19:47:27	Update issue templates	Georgi Gerganov
353ec251	2023-03-21 14:21:50	We could use std::unordered_map over std::map (#305)	Fabio R. Sluzala
89d5d90f	2023-03-21 18:11:01	Fix color codes emitting mid-UTF8 code. (#312)	Matvey Soloviev
16ffc013	2023-03-21 09:42:25	Importer for GPTQ quantized LLaMA models (#301)	comex
486ae645	2023-03-21 09:27:42	Compute perplexity over prompt (#270)	Gary Linscott
3ab3e658	2023-03-21 18:23:15	Add chatLLaMa script (#198)	Jean-Christophe Hoelt
f157088c	2023-03-21 11:21:06	makefile: Fix CPU feature detection on Haiku (#218)	Alex von Gluck IV
c86ba036	2023-03-21 18:14:46	Enable ANSI colors on Windows 10+ (#311)	anzz1
1daf4dd7	2023-03-21 18:10:32	Minor style changes	Georgi Gerganov
dc6a845b	2023-03-21 18:09:37	Add chat.sh script	Georgi Gerganov
6a612959	2023-03-21 17:05:06	Check for reverse prompt by characters instead of tokens (#292) (#330)	tjohnman
d5f56a5e	2023-03-21 17:04:43	Check for reverse prompt by characters instead of tokens (#292) (#330)	tjohnman
3bfa3b43	2023-03-21 17:59:16	Fix convert script, warnings alpaca instructions, default params	Georgi Gerganov
715d292e	2023-03-21 09:50:09	Add OpenBSD support (#314)	Kevin Lo
c98ae026	2023-03-21 08:49:43	fix typo in comment (#318)	Mack Straight
c3b2306b	2023-03-21 23:44:11	Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335)	Qingyou Meng
975d2ceb	2023-03-21 17:42:43	cmdline option for custom amount of model parts (--n_parts N) (#348)	anzz1
e0ffc861	2023-03-21 08:34:49	Update IPFS links to quantized alpaca with new tokenizer format (#352)	Kevin Kwok
8f644a0a	2023-03-21 17:32:14	Change default repeat_penalty to 1.0	Georgi Gerganov
eb34620a	2023-03-21 17:29:41	Add tokenizer test + revert to C++11 (#355)	Georgi Gerganov
2e664f1f	2023-03-21 07:35:42	Add initial AVX512 support for dot product on Linux (#320)	Casey Primozic
8cf9f34e	2023-03-21 09:37:16	Adding missing features of CMakeLists.txt & Refactoring (#131)	nusu-github
bd4b46d6	2023-03-20 16:44:30	Nix flake: set meta.mainProgram to llama	Ben Siraphob
6b6d5b50	2023-03-21 03:33:10	Fixed tokenizer.model not found error when model dir is symlink (#325)	Qingyou Meng
a791a68b	2023-03-20 12:26:01	move file magic/version to header, print expected version (#319)	Mack Straight
0f1b21cb	2023-03-20 18:05:20	Docker - Fix publish docker image in GitHub Registry (#235)	Bernat Vadell
074bea2e	2023-03-20 03:17:23	sentencepiece bpe compatible tokenizer (#252)	Mack Straight
5cb63e24	2023-03-20 08:24:11	Add tqdm to Python requirements (#293)	Stephan Walter
da5303c1	2023-03-19 17:44:20	bugfix: default should not be interactive (#304)	cocktailpeanut
4545539d	2023-03-19 21:58:51	Rename script	Georgi Gerganov
edeba283	2023-03-19 21:57:28	Add temporary helper script for Alpaca chat	Georgi Gerganov
5c19c70b	2023-03-19 13:44:30	fix coloring of last `n_batch` of prompt, and refactor line input (#221)	Rickey Bowers Jr
24568371	2023-03-19 20:33:06	Support for multiple reverse prompts. (#299)	tjohnman
7392f1cd	2023-03-19 12:38:44	Improved quantize script (#222)	Suaj Carrot
ad5fd5b6	2023-03-19 19:36:19	Make prompt randomization optional. (#300)	tjohnman
368d0c8a	2023-03-19 19:31:17	Respect the maximum number of tokens in interactive. (#298)	tjohnman
50fae10d	2023-03-19 19:22:48	Add --ignore-eos parameter (#181)	slaren
084e2f0e	2023-03-20 02:10:00	interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283)	Qingyou Meng
0b366e73	2023-03-19 18:57:00	Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)	Erik Scholz
160bfb21	2023-03-19 19:51:55	Update hot topics to mention Alpaca support	Georgi Gerganov
c494ed5b	2023-03-19 19:46:32	Fix off-by-one bug (#115)	Georgi Gerganov
c1c7026b	2023-03-19 19:33:18	Fix python stuff (#109)	Georgi Gerganov
467b1497	2023-03-19 20:17:39	Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)	qunash
70f01cb8	2023-03-19 19:04:44	Drop trailing new line from file prompts (#80)	Georgi Gerganov
a4e63b73	2023-03-19 18:49:50	Add instruction for using Alpaca (#240)	Georgi Gerganov
9e170721	2023-03-19 18:37:02	Add "--instruct" argument for usage with Alpaca (#240)	Georgi Gerganov
22213a17	2023-03-19 17:30:00	Change RMSNorm eps to 1e-6 (#173)	Georgi Gerganov
d7def1a7	2023-03-18 17:10:47	Warn user if a context size greater than 2048 tokens is specified (#274)	Ronsor
6f61c18e	2023-03-18 22:39:46	Fix typo in readme	Pavol Rusnak
1e5a6d08	2023-03-18 22:20:04	Add note about Python 3.11 to readme	Pavol Rusnak
554b5415	2023-03-18 21:58:46	Add memory/disk requirements to readme	Pavol Rusnak
d3f202d5	2023-03-18 20:51:49	Remove unused code since n_vocab is model.hparams.n_vocab (#262)	Alex Nguyen
e03e3597	2023-03-18 07:44:09	fixed warning with std::ignore about unused function result (#151)	Justin Suess
a81d0c2a	2023-03-18 04:17:19	Fix n^2 loop in tokenization (#254)	Gary Linscott
b2de7f18	2023-03-18 09:27:12	CI Improvements (#230)	anzz1
a2927478	2023-03-17 23:03:48	Nix flake (#40)	Niklas Korz
c9f670a1	2023-03-17 21:05:58	Implement non-greedy tokenizer that tries to maximize token lengths (#242)	thement
4f546091	2023-03-17 21:46:46	Default to 4 threads (#243)	Georgi Gerganov
e81b9c81	2023-03-17 20:30:04	Update Contributing section	Georgi Gerganov
367946c6	2023-03-17 17:47:35	Don't tell users to use a bad number of threads (#243)	Stephan Walter
6b0df5cc	2023-03-18 00:38:24	add ptread link to fix cmake build under linux (#114)	mmyjona
2af23d30	2023-03-17 10:47:06	🚀 Dockerize llamacpp (#132)	Bernat Vadell
904d2a8d	2023-03-17 05:48:39	Q4_1 quantization (#193)	Matvey Soloviev
72131107	2023-03-16 15:00:09	Update README.md	Georgi Gerganov
ac15de78	2023-03-16 08:55:13	Expand "Contributing" section	Georgi Gerganov
273abc47	2023-03-16 07:12:12	Update hot topics - RMSnorm	Georgi Gerganov
9b4a15b1	2023-03-15 19:29:25	Fix RMS norm in GGML (#191)	Nebula
6eac39ba	2023-03-15 18:41:38	Add RMS norm and use it (#187)	hoangmit
27944c42	2023-03-15 21:35:25	fixed typo (#178)	moritzbrantner
2d15d6c9	2023-03-15 13:56:24	add SIGINT support for _WIN32 environments (#120)	Rickey Bowers Jr
2d64715a	2023-03-15 15:42:40	added ctx_size parameter (#148)	Justin Suess
16b2c61a	2023-03-15 15:39:38	fixed color reset on exit (#149)	Justin Suess
977295c7	2023-03-15 22:39:06	Fix potential licensing issue (#126)	Musab Gultekin
956dfda8	2023-03-15 12:37:50	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)	Ronsor
113e685d	2023-03-15 15:05:14	inline -> static inline for "bytesFromNibbles" (#161)	hoangmit
47857e56	2023-03-14 12:34:37	Don't use vdotq_s32 if it's not available (#139)	Ronsor
60f819a2	2023-03-14 15:30:08	Add section to README on how to run the project on Android (#130)	Radoslav Gerganov
97ab2b25	2023-03-14 09:43:52	Add Misc section + update hot topics + minor fixes	Georgi Gerganov
2f700a27	2023-03-13 17:29:10	Add windows to the CI (#98)	Sebastián A
c09a9cfb	2023-03-13 21:22:15	CMake build in Release by default (#75)	Georgi Gerganov
7ec903d3	2023-03-13 19:21:51	Update contribution section, hot topics, limitations, etc.	Georgi Gerganov
4497ad81	2023-03-13 19:15:08	Print system information	Georgi Gerganov
ed6849cc	2023-03-13 14:12:33	Initial support for CMake (#75)	Sebastián A
41be0a3b	2023-03-13 17:40:54	Add NetBSD support. (#90)	Thomas Klausner
671d5cac	2023-03-13 17:39:56	Use fprintf for diagnostic output (#48)	Pavol Rusnak
84d9015c	2023-03-13 18:36:44	Use vdotq_s32 to improve performance (#67)	Georgi Gerganov
63fd76fb	2023-03-14 01:33:43	Reduce model loading time (#43)	uint256_t
2a20f48e	2023-03-13 12:24:18	Fix UTF-8 handling (including colors) (#79)	Val Kharitonov
d1f22471	2023-03-13 17:15:20	Add quantize script for batch quantization (#92)	Pavol Rusnak
1808ee05	2023-03-13 09:42:26	Add initial contribution guidelines	Georgi Gerganov
a169bb88	2023-03-13 04:08:01	Gate signal support on being on a unixoid system. (#74)	Matvey Soloviev
460c4825	2023-03-13 00:35:51	Fix token count accounting	Matvey Soloviev
c80e2a8f	2023-03-13 01:28:08	Revert "10% performance boost on ARM"	Georgi Gerganov
54a0e66e	2023-03-13 01:21:03	Check for vdotq_s32 availability	Georgi Gerganov
543c57e9	2023-03-13 01:05:24	Ammend to previous commit - forgot to update non-QRDMX branch	Georgi Gerganov
113a9e83	2023-03-13 00:56:10	10% performance boost on ARM	Georgi Gerganov
404fac0d	2023-03-12 23:07:34	Fix color getting reset before prompt output done (#65)	Matvey Soloviev
1a0a7430	2023-03-12 23:39:01	Update README.md	Georgi Gerganov
96ea727f	2023-03-12 22:13:28	Add interactive mode (#61)	Matvey Soloviev
96619548	2023-03-13 03:30:08	Fix typo in README (#45)	Marc Köhlbrugge
f385f8de	2023-03-12 13:28:36	Allow using prompt files (#59)	Ben Garney
02f0c6fe	2023-03-12 16:23:15	Add back top_k (#56)	beiller
eb062bb0	2023-03-12 17:15:00	Windows fixes (#31)	Sebastián A
7027a978	2023-03-12 22:09:26	Update README.md	Georgi Gerganov
2d555e5b	2023-03-12 22:08:24	Add CI (#60)	Georgi Gerganov
7c9e54e5	2023-03-12 20:59:01	Revert "weights_only" arg - this causing more trouble than help	Georgi Gerganov
b9bd1d01	2023-03-12 14:16:33	python/pytorch compat notes (#44)	Oleksandr Nikitin
129c7d1e	2023-03-12 05:27:42	Add repetition penalty (#20)	beiller
702fddf5	2023-03-12 09:03:25	Clarify meaning of hacking	Georgi Gerganov
7d86e25b	2023-03-12 08:41:54	README: add "Supported platforms" + update hot topics	Georgi Gerganov
a9312023	2023-03-11 22:36:35	use weights_only in conversion script (#32)	deepdiffuser
6a9a67f0	2023-03-12 07:36:03	Add LICENSE (#21)	Pavol Rusnak
da1a4ff0	2023-03-12 01:26:32	Update README.md	Georgi Gerganov
6b2cb630	2023-03-11 18:32:20	Fix a typo in model name (#16)	Juraj Bednar
4235e3d5	2023-03-11 18:10:18	Update README.md	Georgi Gerganov
f1eaff47	2023-03-11 17:58:18	Add AVX2 support for x86 architectures thanks to @Const-me !	Georgi Gerganov
a9e58529	2023-03-11 17:40:14	Fix un-initialized FP16 tables on x86 (#15, #2)	Georgi Gerganov
7d9ed7b2	2023-03-11 12:44:21	Bump memory buffer	Georgi Gerganov
0c680332	2023-03-11 12:31:21	Update README.md	Georgi Gerganov
f60fa9e5	2023-03-11 12:26:46	.gitignore models/	Georgi Gerganov
7211862c	2023-03-11 12:26:16	Update Makefile var + add comment	Georgi Gerganov
a5c5ae2f	2023-03-11 11:34:25	Update README.md	Georgi Gerganov
ea977e85	2023-03-11 11:34:11	Update README.md	Georgi Gerganov
007a8f6f	2023-03-11 10:47:09	Support all LLaMA models + change Q4_0 quantization storage	Georgi Gerganov
5f2f970d	2023-03-10 21:47:26	Include Python dependencies in README (#6)	Simon Willison
73c6ed5e	2023-03-11 01:30:47	Update README.md	Georgi Gerganov
01eeed8f	2023-03-11 01:22:58	Update README.md	Georgi Gerganov
6da2df34	2023-03-11 01:18:10	Update README.md	Georgi Gerganov
9dcf4dba	2023-03-10 18:04:06	Add missing headers for memcpy and assert (#3)	Jean-Michaël Celerier
920a7fe2	2023-03-11 00:55:22	Update README.md	Georgi Gerganov
3a57ee59	2023-03-11 00:51:46	Update README.md	Georgi Gerganov
b8502852	2023-03-11 00:09:19	Update README.md	Georgi Gerganov
8a01f565	2023-03-10 23:53:11	Update README.md	Georgi Gerganov
70bc0b8b	2023-03-10 23:46:39	Fix a bug in the rope calculation	Georgi Gerganov
18ebda34	2023-03-10 21:52:27	Update README.md	Georgi Gerganov
319cdb3e	2023-03-10 21:50:46	Final touches	Georgi Gerganov
77532806	2023-03-10 21:47:46	Create README.md	Georgi Gerganov
26c08466	2023-03-10 20:40:58	Initial release	Georgi Gerganov

ed3c680b

2023-03-30 11:16:30

Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)

slaren

9cbc404b

2023-03-29 23:44:39

ci : re-enable AVX512 testing (Windows-MSVC) (#584)

anzz1

b51c717d

2023-03-29 22:15:34

ggml : init time on first ggml_init() call

Georgi Gerganov

0ba76c1e

2023-03-29 22:13:12

llama : fix compile warnings when reading the vocab

Georgi Gerganov

cea1c859

2023-03-29 22:10:01

ggml : add ARM_NEON dequantize_row_q4_1()

Georgi Gerganov

f202ada1

2023-03-29 22:03:02

ggml : add ARM_NEON quantize_row_q4_1()

Georgi Gerganov

3b44d30d

2023-03-29 21:47:33

ggml : add ARM_NEON ggml_vec_dot_q4_1()

Georgi Gerganov

61cbfff5

2023-03-29 20:09:25

rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)

Pavol Rusnak

d9ad1044

2023-03-29 19:21:09

Create chat-13B.bat (#592)

Thérence

b467702b

2023-03-29 19:38:31

readme : fix typos

Georgi Gerganov

516d88e7

2023-03-29 19:37:20

readme : add GPT4All instructions (close #588)

Georgi Gerganov

53635c08

2023-03-29 19:29:26

py : add GPT4All conversion script

Georgi Gerganov

41318d70

2023-03-29 18:10:07

llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)

Maël Kerbiriou

a6956b25

2023-03-29 17:10:24

add example of re-act pattern (#583)

Tobias Lütke

83df5639

2023-03-29 16:20:07

Fix GCC warning about binary literal (#595)

anzz1

a5c42c4b

2023-03-29 16:19:29

Fix typo in llama.h (#593)

anzz1

5a5f8b15

2023-03-28 22:44:29

Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)

anzz1

f1217055

2023-03-28 22:43:25

CI: fix subdirectory path globbing (#546)

anzz1

7f4c5c66

2023-03-28 21:23:09

llama : fix linkage with mingw (#551)

anzz1

2a98bc18

2023-03-28 20:06:03

ggml : add AVX2 implementation of quantize_row_q4_1 (#515)

slaren

d0aaff57

2023-03-28 19:55:42

py : add temporary script to convert old ggml files to newer version (#539)

thement

d0330fd7

2023-03-28 13:51:29

py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403)

Tai Duc Nguyen

99c5b276

2023-03-28 17:13:01

ggml : refactor quantized processing functions (#509)

Stephan Walter

692ce316

2023-03-29 02:02:34

py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)

DooWoong Lee (David)

96f9c050

2023-03-28 20:01:09

ci : make ctest verbose, hopefully we see what is wrong with the sanitizer

Georgi Gerganov

d502bc7c

2023-03-28 19:51:55

tests : free llama context at the end of the test

Georgi Gerganov

436e5619

2023-03-28 16:48:20

all : be more strict about converting float to double (#458)

Stephan Walter

20e1e848

2023-03-28 11:39:01

deploy : add a Package.swift for SwiftPM support (#393)

Jed Fox

c1f88506

2023-03-28 15:56:03

ggml : introduce structs for the q4 data blocks (#356)

Stephan Walter

e0670260

2023-03-28 18:34:35

gitignore : add "embedding"

Georgi Gerganov

28ba975a

2023-03-28 23:06:28

Check the existence of f16_model_path_base in quantize.py (#574)

dotpy314

a6bdc47c

2023-03-28 16:26:55

Fix usage of F16C intrinsics in AVX code (#563)

slaren

7b8dbcb7

2023-03-28 17:09:55

main.cpp fixes, refactoring (#571)

anzz1

4b8efff0

2023-03-28 08:11:09

Add embedding example to Makefile (#540)

RJ Adriaansen

7e539557

2023-03-27 06:55:26

Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)

Marco Matthies

34c1072e

2023-03-26 17:48:40

ci: add debug build to sanitizer build matrix (#527)

Erik Scholz

939ad2d3

2023-03-26 15:34:02

Fix undefined variables in debug build, remove unused variables (#531)

Stephan Walter

8c2ec5e2

2023-03-26 10:48:42

Add support for linux/arm64 platform during Docker Builds (#514)

Juan Calderon-Perez

b391579d

2023-03-26 13:14:01

Update README and comments for standalone perplexity tool (#525)

Stephan Walter

7a87d31f

2023-03-26 16:06:10

[main] fix infinite generation (-n == -1) (#523)

anzz1

348d6926

2023-03-26 10:20:49

Add logo to README.md

Georgi Gerganov

33e35b8f

2023-03-26 07:25:46

Exit from interactive mode if input stream is bad (#491)

Harald Fernengel

19726169

2023-03-26 00:13:28

CI: Run other sanitizer builds even if one fails (#511)

anzz1

f732695c

2023-03-25 14:53:55

Clarify console output in convert-pth-to-ggml.py (#512)

jp-x-g

2f7bf7dd

2023-03-25 23:38:11

CMake / CI additions (#497)

anzz1

34ab5268

2023-03-25 22:29:22

(Windows) Set console to UTF-8 on init (#420)

anzz1

c2b25b69

2023-03-25 21:53:39

Fix colors enabling on WIN32

Georgi Gerganov

79b2b266

2023-03-25 21:51:41

If n_predict == -1, generate forever

Georgi Gerganov

e2d490da

2023-03-25 21:36:22

Inifinite generation via context swapping (#71)

Georgi Gerganov

03f7e335

2023-03-25 20:51:14

Cleanup STL headers + fix embedding examples + minor stuff

Georgi Gerganov

55ad42af

2023-03-25 20:36:52

Move chat scripts into "./examples"

Georgi Gerganov

459e93cc

2023-03-25 19:31:48

Add AVX2 implementation of dequantize_row_q4_1 (#505)

slaren

a316a425

2023-03-25 20:26:40

Overhaul the examples structure

Georgi Gerganov

ecbe466a

2023-03-25 19:47:21

Retire the ggml_mul_mat() branch for transposed src0 (#500)

Georgi Gerganov

502a4001

2023-03-25 17:16:50

Disable prompt verbosity by default and add option to enable (#480)

Georgi Gerganov

09aecbf6

2023-03-25 16:06:49

Add AVX2 implementation of dequantize_row_q4_0 (#467)

slaren

4640eff2

2023-03-25 17:03:10

Don't interefe with BLAS for large prompts by running only 1 thread

Georgi Gerganov

ab77d763

2023-03-25 16:47:59

Add longer DAN prompt for testing big batch numbers

Georgi Gerganov

29b7baab

2023-03-25 15:34:23

Add timings for the prompt evaluation (#478)

slaren

4a7129ac

2023-03-25 16:30:32

Remove obsolete information from README

Georgi Gerganov

6b6dbc89

2023-03-25 16:22:05

Remove obsolete assert and fix compiler warning

Georgi Gerganov

2a2e63ce

2023-03-25 16:09:54

Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS

Georgi Gerganov

e899bf54

2023-03-25 14:42:09

bounds checking for input prefix (#492)

anzz1

fbd4d38c

2023-03-25 14:03:19

feat: '--in-prefix STRING' option (#426)

anzz1

58e6c9f3

2023-03-25 01:26:28

Add support for file load progress reporting callbacks (#434)

Jed Fox

36d07532

2023-03-25 01:21:24

Add missing struct annotation (#483)

Doomsdayrs

6f1ee4b6

2023-03-24 23:38:14

Fix crash for 65B model with pre-allocated memory (#485)

Chris Kuehl

8520fc31

2023-03-24 23:47:06

Disable BLAS altogether - the bug is not just for qunatized mat mul

Georgi Gerganov

b3f460e9

2023-03-24 23:39:17

Disable BLAS branch in mul_mat - seems there is a bug

Georgi Gerganov

04c6f5ed

2023-03-24 23:17:58

Immediately start processing the prompt before user input has been provided (#476)

Georgi Gerganov

7a9b6c3a

2023-03-24 23:17:37

Reduce memory usage and allocate enough memory for largest context (#473)

Georgi Gerganov

31572d96

2023-03-24 18:23:56

Temporary bump the memory buffer size - hopefully fix issues from 483bab2e

Georgi Gerganov

f4f5362e

2023-03-24 15:23:09

Update README.md (#444)

Gary Mulder

863f65e2

2023-03-24 10:22:39

fix instruct mode (#445)

rabidcopy

afd220d9

2023-03-24 17:21:01

Properly free llama_context on failure

Georgi Gerganov

481044d5

2023-03-24 08:19:26

additional optimizations for POWER9 (#454)

Cameron Kaiser

563cdc39

2023-03-24 08:19:05

Support calling mlock() on loaded model data on Linux and macOS (#453)

comex

8d4a855c

2023-03-24 08:05:13

Add embedding mode with arg flag. Currently working (#282)

Luciano

b6b268d4

2023-03-24 09:13:35

Add link to Roadmap discussion

Georgi Gerganov

3cd8dde0

2023-03-24 06:22:28

Revert "Fix memory allocation issues and seg faults"

Georgi Gerganov

4870e455

2023-03-24 00:11:53

Fix memory allocation issues and seg faults

Georgi Gerganov

483bab2e

2023-03-23 23:22:01

Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439)

Georgi Gerganov

404e1da3

2023-03-23 16:42:52

Fix quantize script not finding models in parent directory (#428)

Jed Fox

4cc053b6

2023-03-23 22:39:44

Remove oboslete command from Docker script

Georgi Gerganov

0ba5a3a9

2023-03-23 22:32:02

Obsolete

Georgi Gerganov

2e17dfd8

2023-03-23 15:22:47

Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333)

rabidcopy

20a1a4e0

2023-03-23 10:18:13

Fix GPTQ converter (#423)

Timmy Knight

ad072fc5

2023-03-24 05:16:48

Generate library with CMake (#430)

nusu-github

ea10d3de

2023-03-23 19:54:28

Command line args bounds checking (#424)

anzz1

a18c1925

2023-03-22 00:37:02

Fix Nix build

Ben Siraphob

a50e39c6

2023-03-23 14:15:48

Revert "Delete SHA256SUMS for now" (#429)

Stephan Walter

a140219e

2023-03-23 05:41:32

Fix Makefile echo escape codes (by removing them). (#418)

Kerfuffle

8a3e5ef8

2023-03-23 11:30:40

Move model section from issue template to README.md (#421)

Gary Mulder

8eea5ae0

2023-03-23 12:26:19

Delete SHA256SUMS for now (#416)

anzz1

93208cfb

2023-03-23 10:46:58

Adjust repetition penalty ..

Georgi Gerganov

03ace14c

2023-03-23 09:48:51

Add link to recent podcast about whisper.cpp and llama.cpp

Georgi Gerganov

e4412b45

2023-03-23 04:20:34

CI: CMake: Separate build and test steps (#376)

anzz1

f7dc43bc

2023-03-23 01:30:23

Fix instruct mode broken by PR #354 (#409)

tjohnman

ee8a7887

2023-03-22 19:06:18

Update issue template so people will use it (#404)

Gary Mulder

69c92298

2023-03-22 17:29:06

Deduplicate q4 quantization functions (#383)

Stephan Walter

97940520	2023-03-22 18:20:25	fix: add POSIX functionality for Linux compilation (#51)	Valentyn Bezshapkin
305ba6f0	2023-03-22 18:16:35	Don't force immediate interactive without `-i` (#354)	tjohnman
4122dfff	2023-03-22 17:37:10	cmake: make llama an actual library (#392)	Erik Scholz
56e659a0	2023-03-22 17:09:38	fix perplexity after c-api refactor (#390)	Erik Scholz
40ea807a	2023-03-22 08:53:54	Add details on perplexity to README.md (#395)	Gary Linscott
d5850c53	2023-03-22 11:55:45	Add missing header for memcpy (#386)	Yusuf Kağan Hanoğlu
ae44e23e	2023-03-22 07:47:15	When seed <= 0 - use the clock to generate one	Georgi Gerganov
928480ef	2023-03-22 07:45:00	Init llama_context_params properly from CLI (#370)	Georgi Gerganov
56817b1f	2023-03-22 07:34:02	Remove temporary notice and update hot topics	Georgi Gerganov
f5a77a62	2023-03-22 07:32:36	Introduce C-style API (#370)	Georgi Gerganov
da0e9fe9	2023-03-20 20:14:06	Add SHA256SUMS file and instructions to README how to obtain and verify the downloads	Gary Mulder
e6c9e098	2023-03-21 23:49:24	Fix bin dir for win ci	anzz1
01a297b0	2023-03-21 22:34:25	specify build type for ctest on windows (#371)	Erik Scholz
3366853e	2023-03-21 22:57:35	Add notice about pending change	Georgi Gerganov
3f9c6135	2023-03-21 16:52:27	fix typo in chatLLaMa (#368)	Mathieu Nayrolles
0f613527	2023-03-21 19:47:27	Update issue templates	Georgi Gerganov
353ec251	2023-03-21 14:21:50	We could use std::unordered_map over std::map (#305)	Fabio R. Sluzala
89d5d90f	2023-03-21 18:11:01	Fix color codes emitting mid-UTF8 code. (#312)	Matvey Soloviev
16ffc013	2023-03-21 09:42:25	Importer for GPTQ quantized LLaMA models (#301)	comex
486ae645	2023-03-21 09:27:42	Compute perplexity over prompt (#270)	Gary Linscott
3ab3e658	2023-03-21 18:23:15	Add chatLLaMa script (#198)	Jean-Christophe Hoelt
f157088c	2023-03-21 11:21:06	makefile: Fix CPU feature detection on Haiku (#218)	Alex von Gluck IV
c86ba036	2023-03-21 18:14:46	Enable ANSI colors on Windows 10+ (#311)	anzz1
1daf4dd7	2023-03-21 18:10:32	Minor style changes	Georgi Gerganov
dc6a845b	2023-03-21 18:09:37	Add chat.sh script	Georgi Gerganov
6a612959	2023-03-21 17:05:06	Check for reverse prompt by characters instead of tokens (#292) (#330)	tjohnman
d5f56a5e	2023-03-21 17:04:43	Check for reverse prompt by characters instead of tokens (#292) (#330)	tjohnman
3bfa3b43	2023-03-21 17:59:16	Fix convert script, warnings alpaca instructions, default params	Georgi Gerganov
715d292e	2023-03-21 09:50:09	Add OpenBSD support (#314)	Kevin Lo
c98ae026	2023-03-21 08:49:43	fix typo in comment (#318)	Mack Straight
c3b2306b	2023-03-21 23:44:11	Makefile: slightly cleanup for Mac Intel; echo instead of run ./main -h (#335)	Qingyou Meng
975d2ceb	2023-03-21 17:42:43	cmdline option for custom amount of model parts (--n_parts N) (#348)	anzz1
e0ffc861	2023-03-21 08:34:49	Update IPFS links to quantized alpaca with new tokenizer format (#352)	Kevin Kwok
8f644a0a	2023-03-21 17:32:14	Change default repeat_penalty to 1.0	Georgi Gerganov
eb34620a	2023-03-21 17:29:41	Add tokenizer test + revert to C++11 (#355)	Georgi Gerganov
2e664f1f	2023-03-21 07:35:42	Add initial AVX512 support for dot product on Linux (#320)	Casey Primozic
8cf9f34e	2023-03-21 09:37:16	Adding missing features of CMakeLists.txt & Refactoring (#131)	nusu-github
bd4b46d6	2023-03-20 16:44:30	Nix flake: set meta.mainProgram to llama	Ben Siraphob
6b6d5b50	2023-03-21 03:33:10	Fixed tokenizer.model not found error when model dir is symlink (#325)	Qingyou Meng
a791a68b	2023-03-20 12:26:01	move file magic/version to header, print expected version (#319)	Mack Straight
0f1b21cb	2023-03-20 18:05:20	Docker - Fix publish docker image in GitHub Registry (#235)	Bernat Vadell
074bea2e	2023-03-20 03:17:23	sentencepiece bpe compatible tokenizer (#252)	Mack Straight
5cb63e24	2023-03-20 08:24:11	Add tqdm to Python requirements (#293)	Stephan Walter
da5303c1	2023-03-19 17:44:20	bugfix: default should not be interactive (#304)	cocktailpeanut
4545539d	2023-03-19 21:58:51	Rename script	Georgi Gerganov
edeba283	2023-03-19 21:57:28	Add temporary helper script for Alpaca chat	Georgi Gerganov
5c19c70b	2023-03-19 13:44:30	fix coloring of last `n_batch` of prompt, and refactor line input (#221)	Rickey Bowers Jr
24568371	2023-03-19 20:33:06	Support for multiple reverse prompts. (#299)	tjohnman
7392f1cd	2023-03-19 12:38:44	Improved quantize script (#222)	Suaj Carrot
ad5fd5b6	2023-03-19 19:36:19	Make prompt randomization optional. (#300)	tjohnman
368d0c8a	2023-03-19 19:31:17	Respect the maximum number of tokens in interactive. (#298)	tjohnman
50fae10d	2023-03-19 19:22:48	Add --ignore-eos parameter (#181)	slaren
084e2f0e	2023-03-20 02:10:00	interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283)	Qingyou Meng
0b366e73	2023-03-19 18:57:00	Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)	Erik Scholz
160bfb21	2023-03-19 19:51:55	Update hot topics to mention Alpaca support	Georgi Gerganov
c494ed5b	2023-03-19 19:46:32	Fix off-by-one bug (#115)	Georgi Gerganov
c1c7026b	2023-03-19 19:33:18	Fix python stuff (#109)	Georgi Gerganov
467b1497	2023-03-19 20:17:39	Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)	qunash
70f01cb8	2023-03-19 19:04:44	Drop trailing new line from file prompts (#80)	Georgi Gerganov
a4e63b73	2023-03-19 18:49:50	Add instruction for using Alpaca (#240)	Georgi Gerganov
9e170721	2023-03-19 18:37:02	Add "--instruct" argument for usage with Alpaca (#240)	Georgi Gerganov
22213a17	2023-03-19 17:30:00	Change RMSNorm eps to 1e-6 (#173)	Georgi Gerganov
d7def1a7	2023-03-18 17:10:47	Warn user if a context size greater than 2048 tokens is specified (#274)	Ronsor
6f61c18e	2023-03-18 22:39:46	Fix typo in readme	Pavol Rusnak
1e5a6d08	2023-03-18 22:20:04	Add note about Python 3.11 to readme	Pavol Rusnak
554b5415	2023-03-18 21:58:46	Add memory/disk requirements to readme	Pavol Rusnak
d3f202d5	2023-03-18 20:51:49	Remove unused code since n_vocab is model.hparams.n_vocab (#262)	Alex Nguyen
e03e3597	2023-03-18 07:44:09	fixed warning with std::ignore about unused function result (#151)	Justin Suess
a81d0c2a	2023-03-18 04:17:19	Fix n^2 loop in tokenization (#254)	Gary Linscott
b2de7f18	2023-03-18 09:27:12	CI Improvements (#230)	anzz1
a2927478	2023-03-17 23:03:48	Nix flake (#40)	Niklas Korz
c9f670a1	2023-03-17 21:05:58	Implement non-greedy tokenizer that tries to maximize token lengths (#242)	thement
4f546091	2023-03-17 21:46:46	Default to 4 threads (#243)	Georgi Gerganov
e81b9c81	2023-03-17 20:30:04	Update Contributing section	Georgi Gerganov
367946c6	2023-03-17 17:47:35	Don't tell users to use a bad number of threads (#243)	Stephan Walter
6b0df5cc	2023-03-18 00:38:24	add ptread link to fix cmake build under linux (#114)	mmyjona
2af23d30	2023-03-17 10:47:06	🚀 Dockerize llamacpp (#132)	Bernat Vadell
904d2a8d	2023-03-17 05:48:39	Q4_1 quantization (#193)	Matvey Soloviev
72131107	2023-03-16 15:00:09	Update README.md	Georgi Gerganov
ac15de78	2023-03-16 08:55:13	Expand "Contributing" section	Georgi Gerganov
273abc47	2023-03-16 07:12:12	Update hot topics - RMSnorm	Georgi Gerganov
9b4a15b1	2023-03-15 19:29:25	Fix RMS norm in GGML (#191)	Nebula
6eac39ba	2023-03-15 18:41:38	Add RMS norm and use it (#187)	hoangmit
27944c42	2023-03-15 21:35:25	fixed typo (#178)	moritzbrantner
2d15d6c9	2023-03-15 13:56:24	add SIGINT support for _WIN32 environments (#120)	Rickey Bowers Jr
2d64715a	2023-03-15 15:42:40	added ctx_size parameter (#148)	Justin Suess
16b2c61a	2023-03-15 15:39:38	fixed color reset on exit (#149)	Justin Suess
977295c7	2023-03-15 22:39:06	Fix potential licensing issue (#126)	Musab Gultekin
956dfda8	2023-03-15 12:37:50	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)	Ronsor
113e685d	2023-03-15 15:05:14	inline -> static inline for "bytesFromNibbles" (#161)	hoangmit
47857e56	2023-03-14 12:34:37	Don't use vdotq_s32 if it's not available (#139)	Ronsor
60f819a2	2023-03-14 15:30:08	Add section to README on how to run the project on Android (#130)	Radoslav Gerganov
97ab2b25	2023-03-14 09:43:52	Add Misc section + update hot topics + minor fixes	Georgi Gerganov
2f700a27	2023-03-13 17:29:10	Add windows to the CI (#98)	Sebastián A
c09a9cfb	2023-03-13 21:22:15	CMake build in Release by default (#75)	Georgi Gerganov
7ec903d3	2023-03-13 19:21:51	Update contribution section, hot topics, limitations, etc.	Georgi Gerganov
4497ad81	2023-03-13 19:15:08	Print system information	Georgi Gerganov
ed6849cc	2023-03-13 14:12:33	Initial support for CMake (#75)	Sebastián A
41be0a3b	2023-03-13 17:40:54	Add NetBSD support. (#90)	Thomas Klausner
671d5cac	2023-03-13 17:39:56	Use fprintf for diagnostic output (#48)	Pavol Rusnak
84d9015c	2023-03-13 18:36:44	Use vdotq_s32 to improve performance (#67)	Georgi Gerganov
63fd76fb	2023-03-14 01:33:43	Reduce model loading time (#43)	uint256_t
2a20f48e	2023-03-13 12:24:18	Fix UTF-8 handling (including colors) (#79)	Val Kharitonov
d1f22471	2023-03-13 17:15:20	Add quantize script for batch quantization (#92)	Pavol Rusnak
1808ee05	2023-03-13 09:42:26	Add initial contribution guidelines	Georgi Gerganov
a169bb88	2023-03-13 04:08:01	Gate signal support on being on a unixoid system. (#74)	Matvey Soloviev
460c4825	2023-03-13 00:35:51	Fix token count accounting	Matvey Soloviev
c80e2a8f	2023-03-13 01:28:08	Revert "10% performance boost on ARM"	Georgi Gerganov
54a0e66e	2023-03-13 01:21:03	Check for vdotq_s32 availability	Georgi Gerganov
543c57e9	2023-03-13 01:05:24	Ammend to previous commit - forgot to update non-QRDMX branch	Georgi Gerganov
113a9e83	2023-03-13 00:56:10	10% performance boost on ARM	Georgi Gerganov
404fac0d	2023-03-12 23:07:34	Fix color getting reset before prompt output done (#65)	Matvey Soloviev
1a0a7430	2023-03-12 23:39:01	Update README.md	Georgi Gerganov
96ea727f	2023-03-12 22:13:28	Add interactive mode (#61)	Matvey Soloviev
96619548	2023-03-13 03:30:08	Fix typo in README (#45)	Marc Köhlbrugge
f385f8de	2023-03-12 13:28:36	Allow using prompt files (#59)	Ben Garney
02f0c6fe	2023-03-12 16:23:15	Add back top_k (#56)	beiller
eb062bb0	2023-03-12 17:15:00	Windows fixes (#31)	Sebastián A
7027a978	2023-03-12 22:09:26	Update README.md	Georgi Gerganov
2d555e5b	2023-03-12 22:08:24	Add CI (#60)	Georgi Gerganov
7c9e54e5	2023-03-12 20:59:01	Revert "weights_only" arg - this causing more trouble than help	Georgi Gerganov
b9bd1d01	2023-03-12 14:16:33	python/pytorch compat notes (#44)	Oleksandr Nikitin
129c7d1e	2023-03-12 05:27:42	Add repetition penalty (#20)	beiller
702fddf5	2023-03-12 09:03:25	Clarify meaning of hacking	Georgi Gerganov
7d86e25b	2023-03-12 08:41:54	README: add "Supported platforms" + update hot topics	Georgi Gerganov
a9312023	2023-03-11 22:36:35	use weights_only in conversion script (#32)	deepdiffuser
6a9a67f0	2023-03-12 07:36:03	Add LICENSE (#21)	Pavol Rusnak
da1a4ff0	2023-03-12 01:26:32	Update README.md	Georgi Gerganov
6b2cb630	2023-03-11 18:32:20	Fix a typo in model name (#16)	Juraj Bednar
4235e3d5	2023-03-11 18:10:18	Update README.md	Georgi Gerganov
f1eaff47	2023-03-11 17:58:18	Add AVX2 support for x86 architectures thanks to @Const-me !	Georgi Gerganov
a9e58529	2023-03-11 17:40:14	Fix un-initialized FP16 tables on x86 (#15, #2)	Georgi Gerganov
7d9ed7b2	2023-03-11 12:44:21	Bump memory buffer	Georgi Gerganov
0c680332	2023-03-11 12:31:21	Update README.md	Georgi Gerganov
f60fa9e5	2023-03-11 12:26:46	.gitignore models/	Georgi Gerganov
7211862c	2023-03-11 12:26:16	Update Makefile var + add comment	Georgi Gerganov
a5c5ae2f	2023-03-11 11:34:25	Update README.md	Georgi Gerganov
ea977e85	2023-03-11 11:34:11	Update README.md	Georgi Gerganov
007a8f6f	2023-03-11 10:47:09	Support all LLaMA models + change Q4_0 quantization storage	Georgi Gerganov
5f2f970d	2023-03-10 21:47:26	Include Python dependencies in README (#6)	Simon Willison
73c6ed5e	2023-03-11 01:30:47	Update README.md	Georgi Gerganov
01eeed8f	2023-03-11 01:22:58	Update README.md	Georgi Gerganov
6da2df34	2023-03-11 01:18:10	Update README.md	Georgi Gerganov
9dcf4dba	2023-03-10 18:04:06	Add missing headers for memcpy and assert (#3)	Jean-Michaël Celerier
920a7fe2	2023-03-11 00:55:22	Update README.md	Georgi Gerganov
3a57ee59	2023-03-11 00:51:46	Update README.md	Georgi Gerganov
b8502852	2023-03-11 00:09:19	Update README.md	Georgi Gerganov
8a01f565	2023-03-10 23:53:11	Update README.md	Georgi Gerganov
70bc0b8b	2023-03-10 23:46:39	Fix a bug in the rope calculation	Georgi Gerganov
18ebda34	2023-03-10 21:52:27	Update README.md	Georgi Gerganov
319cdb3e	2023-03-10 21:50:46	Final touches	Georgi Gerganov
77532806	2023-03-10 21:47:46	Create README.md	Georgi Gerganov
26c08466	2023-03-10 20:40:58	Initial release	Georgi Gerganov
