Commit History
ggml : remove obsolete alibi code (skipme) (#0) d25c1e3
ggml : full ALiBi support (llama/7192) 192bda4
CUDA: generalize FP16 fattn vec kernel (llama/7061) ca79691
Introduction of CUDA Graphs to LLama.cpp (llama/6766) 08fc76d
agray3 slaren committed on
CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (llama/7019) 4cf786d
ggml : add Flash Attention (llama/5021) 34d3b03
Fix more int overflow during quant (PPL/CUDA). (llama/6563) 531387f
ggml : group all experts in a single ggml_mul_mat_id (llama/6505) f0b5c67
feat: implemented sigmoid function (ggml/806) cd0c122
Justina Cho committed on