Commit History
gguf : fix comparison (ggml/715) 80cfca4 unverified
gguf : add input validation, prevent integer overflows (ggml/709) 5bf1614 unverified
ggml : add Vulkan backend (llama/2059) 5a97aba unverified
ggml : minor type fix (int64_t -> size_t) 1bbb1a9 unverified
Add OpenCL add kernel (llama/5151) f833987 unverified
ggml : update softmax n_task calculation (llama/5126) 3a3eb8e unverified
snadampal committed on
minor : clean-up some warnings and style (llama/5094) 7df090b unverified
ggml : parallelize FP32 conversion when using BLAS (llama/5045) 7bf2c87 unverified
llava : MobileVLM support (llama/4954) dc8f956 unverified
ggml : check ggml_add src1 type (ggml/708) aa5d6ed unverified
Judd Judd committed on
ggml : add IQ2 to test-backend-ops + refactoring (llama/4990) 227f2ae unverified
imatrix : offload to GPU support (llama/4957) 6490f98 unverified
ggml : importance matrix support for legacy quants (llama/4969) d8bb9d8 unverified
ggml : introduce GGML_CALL function annotation (llama/4850) 7815f68 unverified
Add ability to use importance matrix for all k-quants (llama/4930) 7032309 unverified
2-bit quantizations (llama/4897) 8a399ab unverified
ggml: cache sin/cos for RoPE (llama/4908) c315fbf unverified
gguf : fix potential infinite for-loop (llama/4600) 0e93179 unverified
texmex76 Bernhard Gstrein committed on
llama : ggml-backend integration (llama/4766) 362430b unverified
ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856) 5e827d5 unverified
ggml : remove ggml_cpy_inplace and ggml_cont_inplace (ggml/693) 6469bfe unverified
Timothy Cronin committed on
Fix execlp call (ggml/689) abda16e unverified
Halalaluyafail3 committed on
SOTA 2-bit quants (llama/4773) 75de5bf unverified
ggml : do not sched_yield when calling BLAS (llama/4761) 5d1dffc unverified
ggml : extend ggml_get_rows, ggml_repeat, ggml_concat (ggml/639) f17d170
sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691) 919a447 unverified
sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677) aa86ade unverified
sync : ggml (Metal fixes, new ops, tests) (#1633) a0d4b48 unverified
sync : ggml (new ops, new backend, etc) (#1602) 895e87a unverified
ggml : re-enable blas for src0 != F32 (#1583) 87987de unverified
sync : ggml (ggml-alloc + linker + gguf fixes) (#1501) 58507b9 unverified
whisper : add full CUDA and Metal offloading (#1472) da4acca unverified
ggml : fix MIN / MAX macro re-definition 1344fc4 unverified
sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422) 7006035 unverified
sync : ggml (const correctness) 4ce2d25 unverified
metal : add F32 support + update bench output 02d7878 unverified
whisper : Metal and ggml-alloc support (#1270) 714ee6b unverified
whisper : fix bench regression + fix performance when using CPU BLAS (#1275) abbf5f2 unverified
sync : ggml (HBM + Metal + style) (#1264) 88deeba unverified
build : do not use _GNU_SOURCE gratuitously (#1129) beefa34 unverified
Przemysław Pawełczyk committed on
ggml : posixify pagesize (#1251) 4902c26 unverified
Przemysław Pawełczyk committed on
ggml : sync latest llama.cpp (view_src + alloc improvements) (#1247) 8bb66c1 unverified
ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220) d41ba35 unverified
ggml : fix compilation errors incurred by -Werror (#1227) 45ef7b5 unverified
ChangSeok Oh committed on
ggml : fix compiling when SSE3 is available but not SSSE3 (#1210) b7995b7 unverified
Przemysław Pawełczyk committed on
ggml : detect SSSE3 (#1211) 82a619c unverified
Przemysław Pawełczyk committed on