Commit History

vulkan: implement initial support for IQ2 and IQ3 quantizations (llama/11360)
bd93c1b

Rémy Oudompheng jeffbolznv committed on

vulkan: Catch pipeline creation failure and print an error message (llama/11436)
d4f6b2c

jeffbolznv committed on

HIP: Suppress transformation warning in softmax.cu
72c6f1d

uvos committed on

HIP: Only call rocblas_initialize on rocblas versions with the multiple instantiation bug (llama/11080)
82bb7f3

Nikita Sarychev committed on

cmake : don't fail on `GGML_CPU=OFF` (llama/11457)
6406a6e

someone13574 committed on

SYCL : SOFTMAX F16 mask support and other fixes (llama/11261)
8aaf0c8

qnixsynapse committed on

AMD: parse the architecture as supplied by gcnArchName (llama/11244)
04b01d8

Haus1 committed on

metal: Handle null returned from MTLCreateSystemDefaultDevice() (llama/11441)
4e38ed4

Ihar Hrachyshka committed on

metal : use residency sets (llama/11427)
9da4d68

ggerganov HF Staff committed on

cmake: add ggml find package (llama/11369)
ca6577f

bandoti ggerganov HF Staff committed on

vulkan: compile shaders on-demand (llama/11406)
5c008f7

jeffbolznv committed on

Hip: disable VMM on hip as it seems that it doesn't work in some configurations (llama/11420)
2cc4df4

uvos committed on

hip : Add hipGraph and VMM support to ROCM (llama/11362)
089afa0

uvos committed on

CUDA: fix FP16 cuBLAS GEMM (llama/11396)
7b7c5d3

JohannesGaessler committed on

rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (llama/11356)
6f5687a

uvos committed on

CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
855a9fe

JohannesGaessler committed on

Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
133a580

amd-dwang committed on

vulkan: sort shaders for more deterministic binary (llama/11315)
d7c0046

jeffbolznv committed on

vulkan: fix diag_mask_inf (llama/11323)
f76204e

jeffbolznv committed on

rpc : better caching of the base buffer pointer (llama/11331)
81a6cae

rgerganov committed on

metal : fix out-of-bounds write (llama/11314)
1101050

ggerganov HF Staff committed on

vulkan: fix coopmat2 validation failures (llama/11284)
f2cc7e9

jeffbolznv committed on

SYCL: Introducing memory host pool (llama/11251)
aedb0b3

Nicolò Scipione committed on

vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)
e0e73fa

jeffbolznv committed on

rpc : early register backend devices (llama/11262)
4134077

rgerganov committed on

vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166)
3bb9e77

jeffbolznv committed on

vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)
ee122d3

jeffbolznv committed on

vulkan: optimize coopmat2 q2_k dequant function (llama/11130)
d49a569

jeffbolznv committed on

CUDA: backwards pass for misc. ops, add tests (llama/11257)
2fbcec1

JohannesGaessler committed on

ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/11227)
bf3dc93

fj-y-saito ggerganov HF Staff committed on

vulkan: scale caching for k quants + misc fixes (llama/11081)
03ab36f

Eve committed on

fix: ggml: fix vulkan-shaders-gen build (llama/10448)
ad8f031

Sparkleholic committed on

RoPE: fix back, CUDA support for back + noncont. (llama/11240)
131a21e

JohannesGaessler committed on

SYCL: Add gated linear attention kernel (llama/11175)
fdb1fe5

qnixsynapse committed on

ggml : add option to not print stack on abort (ggml/1081)
9b2706e

William Tambellini Diego Devesa committed on

ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. (ggml/1065)
8e57313

issixx issi committed on

GGUF: C++ refactor, backend support, misc fixes (skip) (llama/11030)
92311a3

JohannesGaessler committed on

ggml : add opencl backend (skip) (llama/10693)
226358f

lhez Skyler Szot Shangqing Gu Alexander Angus Hongqiang Wang Max Krasnyansky committed on

cuda : CUDA Graph Compute Function Refactor (precursor for performance improvements) (llama/11042)
25882f6

Andreas Kieslinger slaren committed on

ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (llama/11211)
79f750d

rgerganov committed on

Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (llama/11161)
5ad3f1d

OccamRazor committed on

SYCL: Refactor ggml_sycl_compute_forward (llama/11121)
fa23a38

qnixsynapse committed on

fix: add missing msg in static_assert (llama/11143)
8c60d6a

hydaitw committed on

llamafile : ppc64le MMA INT8 implementation (llama/10912)
6f18eed

amritahs-ibm committed on

Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (llama/11117)
623b74d

mbaudier committed on

fix: Vulkan shader gen binary path when Cross-compiling (llama/11096)
966a7bb

ag2s20150909 committed on

GGUF: C++ refactor, backend support, misc fixes (llama/11030)
21c5b64

JohannesGaessler committed on

ggml-backend : only offload from host buffers (fix) (llama/11124)
9ac3c7e

Diego Devesa committed on