Commit History

cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (llama/12000)
6cb8158

Garf committed on

cuda: add q8_0->f32 cpy operation (llama/9571)
6201c74

Nekotekina committed on

cuda : fix defrag with quantized KV (llama/9319)
061ca37

slaren committed on

ggml : reduce hash table reset cost (llama/8698)
9808fbf

slaren committed on

Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (llama/8258)
cc49462

HanClinto committed on

whisper : reorganize source code + improve CMake (#2256)
f75c2e3

ggerganov HF Staff committed on