Commit History
common : fix wav buffer detection (#1819) bc84057 unverified
server : add fields to `verbose_json` response (#1802) 763d09d unverified
make : update MSYS_NT (#1813) 587152f unverified
talk-llama : sync llama.cpp 1453539 unverified
sync : ggml 278a9b3 unverified
ggml : add Vulkan backend (llama/2059) 5a97aba unverified
ggml : minor type fix (int64_t -> size_t) 1bbb1a9 unverified
common : fix input buffer check (#1812) 6c38a7f unverified
talk-llama : sync llama.cpp 92cfd93 unverified
sync : ggml 5a9540e unverified
Add OpenCL add kernel (llama/5151) f833987 unverified
cuda : fix tensor size calculation for non-split buffer (llama/5145) 8f3eb65 unverified
slaren committed on
ggml-alloc : add 10% margin to the buffer sizes (llama/5149) c55bdf8 unverified
slaren committed on
ggml : update softmax n_task calculation (llama/5126) 3a3eb8e unverified
snadampal committed on
metal : remove unused `n_buffers` and `buffers` (llama/5129) a3e87d3 unverified
Paul Tsochantaris committed on
metal : show compile log messages ae08f31 unverified
cuda : fix 2-bit quants on amd hip (llama/5105) aadbd67 unverified
Engininja2 committed on
llama : pre-allocate input tensors in a separate buffer (llama/5100) 20a4ca1 unverified
slaren committed on
metal : disable support for MUL_MAT F32 x F16 7fbc01f unverified
CUDA: more info when no device code (llama/5088) e96ba7d unverified
minor : clean-up some warnings and style (llama/5094) 7df090b unverified
ggml : parallelize FP32 conversion when using BLAS (llama/5045) 7bf2c87 unverified
llava : MobileVLM support (llama/4954) dc8f956 unverified
llama : run all KQV ops on the CPU with no KV offload (llama/5049) 97ce95c unverified
slaren committed on
cuda : fix compile error in jetson platform (llama/4975) 0935414 unverified
Kylin committed on
ggml : check ggml_add src1 type (ggml/708) aa5d6ed unverified
Judd Judd committed on
docs : make model options / model install methods clearer (#1806) a2bec1d unverified
cmake : make libwhisper.so position independent (#1792) 1cf1553 unverified
trixirt committed on
cmake : temporary remove VLA check (#1795) 1a32e6f unverified
whisper.android : return output from benchmarks (#1785) 5cff61b unverified
server : implement "verbose_json" format with token details (#1781) d6e13b6 unverified
ggml : sync ggml-metal.m b4085c3 unverified
sync : llama.cpp 5de718a unverified
sync : ggml 34bdd70 unverified
ggml : add IQ2 to test-backend-ops + refactoring (llama/4990) 227f2ae unverified
imatrix : offload to GPU support (llama/4957) 6490f98 unverified
backend : add eval callback (llama/4935) 3cc64d6 unverified
metal : create autorelease pool during library build (llama/4970) 9027276 unverified
ggml : importance matrix support for legacy quants (llama/4969) d8bb9d8 unverified
metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936) e2cc0e5 unverified
ggml : introduce GGML_CALL function annotation (llama/4850) 7815f68 unverified
cuda : fix dequantize kernel names (llama/4938) 95f6502 unverified
CUDA: faster dequantize kernels for Q4_0 and Q4_1 (llama/4938) 73c6598 unverified
Add ability to use importance matrix for all k-quants (llama/4930) 7032309 unverified
talk-llama : optional wake-up command and audio confirmation (#1765) 542e8da unverified
server : fix building and simplify lib deps on Windows (#1772) f928f33 unverified
Przemysław Pawełczyk committed on