HIP/CUDA: set the parameter value in maintain_cuda_graph instead of replacing it. (llama/12209) 18afa4b uvos committed on Mar 6, 2025
ggml : upgrade init_tensor API to return a ggml_status (llama/11854) d6b6852 William Tambellini slaren committed on Feb 28, 2025
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129) f959b90 cmdr2 committed on Feb 28, 2025
cuda/cpu: Increase support for fp16 unary operations (ggml/1125) 67e8c32 cmdr2 committed on Feb 28, 2025
CUDA: add option to compile without FlashAttention (llama/12025) fbc5f16 JohannesGaessler committed on Feb 22, 2025
cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (llama/12000) 6cb8158 Garf committed on Feb 22, 2025
MUSA: support ARM64 and enable dp4a etc. (llama/11843) ab96dac Bodhi Hu committed on Feb 21, 2025
HIP: Switch to std::vector in rocBLAS version check (llama/11820) e144c94 uvos committed on Feb 12, 2025
CUDA: use arch list for compatibility check (llama/11775) b88e163 JohannesGaessler Diego Devesa committed on Feb 10, 2025
CUDA: support for mat. mul. with ne03 != ne13 (llama/11656) 78e36a2 JohannesGaessler committed on Feb 5, 2025
CUDA: non-contiguous (RMS) norm support (llama/11659) 4c2e171 JohannesGaessler ggerganov committed on Feb 4, 2025
HIP: add GGML_CUDA_CC_IS_* for AMD families, as increasing cc architectures for AMD GPUs are not supersets of each other (llama/11601) 4850c24 uvos committed on Feb 2, 2025
HIP: Only call rocblas_initialize on rocBLAS versions with the multiple instantiation bug (llama/11080) 82bb7f3 Nikita Sarychev committed on Jan 28, 2025
AMD: parse the architecture as supplied by gcnArchName (llama/11244) 04b01d8 Haus1 committed on Jan 27, 2025
HIP: disable VMM on HIP as it seems that it doesn't work in some configurations (llama/11420) 2cc4df4 uvos committed on Jan 25, 2025
rocBLAS: Avoid fp32->fp16->fp32 conversion on CDNA (llama/11356) 6f5687a uvos committed on Jan 24, 2025
CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380) 855a9fe JohannesGaessler committed on Jan 24, 2025
CUDA: backwards pass for misc. ops, add tests (llama/11257) 2fbcec1 JohannesGaessler committed on Jan 16, 2025
RoPE: fix back, CUDA support for back + noncont. (llama/11240) 131a21e JohannesGaessler committed on Jan 15, 2025
cuda : CUDA Graph Compute Function Refactor (precursor for performance improvements) (llama/11042) 25882f6 Andreas Kieslinger slaren committed on Jan 13, 2025
llama: add support for QRWKV6 model architecture (llama/11001) 4a6b7e0 mollysama ggerganov compilade committed on Jan 10, 2025
CUDA: rename macros to avoid conflicts with WinAPI (llama/10736) 8544072 Andreas Kieslinger committed on Dec 10, 2024
ggml : refactor online repacking (llama/10446) 163128e Djip007 ggerganov committed on Dec 7, 2024
ggml : add support for dynamic loading of backends (llama/10469) b73266f Diego Devesa ggerganov committed on Nov 25, 2024
CUDA: fix MMV kernel being used for FP16 src1 (llama/10357) af4dff1 JohannesGaessler committed on Nov 17, 2024
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318) e446f60 JohannesGaessler committed on Nov 17, 2024
ggml : build backends as libraries (llama/10256) 3dc93f3 Diego Devesa ggerganov R0CKSTAR committed on Nov 14, 2024