ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (llama/12154) 05466a9 Rémy O committed on Mar 6, 2025
HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (llama/12032) a027c1d David Huang committed on Mar 3, 2025
CUDA: compress mode option and default to size (llama/12029) 4ec988a Green-Sky committed on Mar 1, 2025
cmake: Fix ggml backend dependencies and installation (llama/11818) c6c2a2c Vladimir Vuksanovic committed on Feb 27, 2025
Told cmake to install ggml-cpp.h as a public header file. (ggml/1126) 3d4f29c Petter Reinholdtsen committed on Feb 26, 2025
ggml-cpu: Support s390x SIMD Instruction Set (llama/12019) 4aa54ec Aaron Teo, Jinyang He, junchao-zhao committed on Feb 22, 2025
CUDA: add option to compile without FlashAttention (llama/12025) fbc5f16 JohannesGaessler committed on Feb 22, 2025
ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390) 9de6d81 Charles Xu committed on Feb 20, 2025
cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096) 729db34 Christian Kastner committed on Feb 3, 2025
cmake: add ggml find package (llama/11369) ca6577f bandoti, ggerganov committed on Jan 26, 2025
HIP: disable VMM on HIP as it seems that it doesn't work in some configurations (llama/11420) 2cc4df4 uvos committed on Jan 25, 2025
cmake : avoid -march=native when reproducible build is wanted (llama/11366) 3cae2d9 Bernhard M. Wiedemann committed on Jan 24, 2025
GGUF: C++ refactor, backend support, misc fixes (llama/11030) 21c5b64 JohannesGaessler committed on Jan 7, 2025
ggml : do not install metal source when embed library (ggml/1054) 9615cf2 ggerganov committed on Jan 3, 2025
Introducing experimental OpenCL backend with support for Qualcomm Adreno GPUs (llama/10693) 83a0899 lhez, Skyler Szot, Shangqing Gu, Alexander Angus, Hongqiang Wang, Max Krasnyansky committed on Dec 13, 2024
ggml : add predefined list of CPU backend variants to build (llama/10626) 1794b43 Diego Devesa committed on Dec 4, 2024
ggml : add support for dynamic loading of backends (llama/10469) b73266f Diego Devesa, ggerganov committed on Nov 25, 2024
Add required ggml-base and backend libs to cmake pkg (llama/10407) 8fdd994 bandoti committed on Nov 19, 2024
sycl : Add option to set the SYCL architecture for all targets (llama/10266) 0d836df Romain Biessy committed on Nov 19, 2024
CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318) e446f60 JohannesGaessler committed on Nov 17, 2024
backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921) 3541ee8 Charles Xu, Diego Devesa committed on Nov 15, 2024
ggml : build backends as libraries (llama/10256) 3dc93f3 Diego Devesa, ggerganov, R0CKSTAR committed on Nov 14, 2024
metal : opt-in compile flag for BF16 (llama/10218) 5f667d1 ggerganov committed on Nov 8, 2024
ggml : add ggml-cpu.h to the public headers (llama/10204) 936a35f Diego Devesa committed on Nov 7, 2024
cmake : do not hide GGML options + rename option (llama/9465) 8c32d36 ggerganov committed on Sep 16, 2024
cmake : remove unused option GGML_CURL (llama/9011) 12634fc ggerganov committed on Aug 14, 2024
ggml : move sgemm sources to llamafile subfolder (llama/8394) 1554348 ggerganov committed on Jul 10, 2024
cmake : only enable GGML_NATIVE and x86 flags if not crosscompiling (ggml/885) 0456299 stanimirovb committed on Jul 12, 2024
ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (llama/8140) e83fdad slaren committed on Jun 26, 2024
whisper : reorganize source code + improve CMake (#2256) f75c2e3 ggerganov committed on Jun 26, 2024