Commit History

ggml: backward pass for split swiglu (llama/14483)
45c8df6

JohannesGaessler committed on

Fix conditional enabling following arch checks for ggml-sycl (llama/14504)
1f15602

Nicolò Scipione committed on

kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8

ggerganov HF Staff committed on

ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81

ggerganov HF Staff committed on

CUDA: add dynamic shared mem to softmax, refactor general usage (llama/14497)
8e1f56c

am17an committed on

llama : initial Mamba-2 support (llama/9126)
1b4087e

compilade committed on

CUDA: add softmax broadcast (llama/14475)
05351ac

am17an committed on

CUDA: broadcasting for FlashAttention mask (llama/14500)
47e02a8

JohannesGaessler committed on

vulkan: support softmax/FA batch and broadcast (llama/14449)
f6b0b76

jeffbolznv committed on

ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e

ggerganov HF Staff committed on

opencl : fix possible buffer overflow in dump_tensor (llama/14490)
deb934d

jeffzhou2000 committed on

opencl : skip empty nodes on cgraph compute (llama/14491)
5c36e7c

Eric Zhang committed on

opencl : update upscale to support align corners (llama/14488)
2b95b05

lhez committed on

ggml : Callback before abort (llama/14481)
ccee17d

Bytealyzer Diego Devesa committed on

ci : disable fast-math for Metal GHA CI (llama/14478)
ec4b1b3

ggerganov HF Staff committed on

CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (llama/14411)
d8d5b0b

Chenguang Li committed on

vulkan: Split large mul_mat_id to fit in shared memory (llama/14451)
bf678f0

jeffbolznv committed on

add GELU_ERF (llama/14455)
235ebf7

Sigbjørn Skjæret committed on

vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291)
666e65b

Acly committed on

vulkan : implement ggml_roll (ggml/1290)
968f9e8

Acly committed on

ggml : add version function to get lib version (ggml/1286)
880f633

danbev ggerganov HF Staff committed on

server : add dtw.params for v3-large-turbo (#3307)
1250fd1

accessiblepixel committed on

feat: support vad for addon.node (#3301)
f795870

Lin Xiaodong linxiaodong committed on

sync : ggml
4966aed

ggerganov HF Staff committed on

talk-llama : sync llama.cpp
0ec1374

ggerganov HF Staff committed on

ggml : remove trailing whitespace (llama/0)
e37767f

ggerganov HF Staff committed on

opencl : add GEGLU, REGLU, SWIGLU (llama/14456)
d70ff9f

lhez committed on

Add Conv2d for CPU (llama/14388)
68eb27a

am17an committed on

metal : disable fast-math for some cpy kernels (llama/14460)
9d1185a

ggerganov HF Staff committed on

ggml-cpu: sycl: Re-enable exp f16 (llama/14462)
c0fcd7a

Romain Biessy committed on

cmake : Remove redundant include path in CMakeLists.txt (llama/14452)
6b59b68

xiaobing318 committed on

scripts : make the shell scripts cross-platform (llama/14341)
9de52c8

Vedran Miletić committed on

SYCL: disable faulty fp16 exp kernel (llama/14395)
b4969ff

qnixsynapse committed on

ggml : fix unmerged GGML_FPxx_TO_FPxx refactoring (llama/14443)
f7995cb

Sigbjørn Skjæret committed on

vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)
737f12d

jeffbolznv slaren committed on

CUDA: add bf16 and f32 support to cublas_mul_mat_batched (llama/14361)
c7936d3

am17an committed on

vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378)
1c3b94c

jeffbolznv committed on

vulkan: lock accesses of pinned_memory vector (llama/14333)
59dca4f

jeffbolznv committed on

fix async_mode bug (llama/14432)
122b29a

dou112 committed on

vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (llama/14427)
a06c8ca

jeffbolznv committed on

ggml : add ggml_set_rows (llama/14274)
ac46a22

rgerganov ggerganov HF Staff committed on

cmake: regen vulkan shaders when shaders-gen sources change (llama/14398)
7988638

bandoti committed on

metal : add special-case mat-vec mul for ne00 == 4 (llama/14385)
724622d

ggerganov HF Staff committed on

metal : batch rows copy in a single threadgroup (llama/14384)
b4ff704

ggerganov HF Staff committed on

ggml-cpu: enable IBM NNPA Vector Intrinsics (llama/14317)
fea8f94

taronaeo slaren committed on

ggml : do not output unprintable characters on GGUF load failure (llama/14381)
e7b2e19

Sigbjørn Skjæret committed on

sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (llama/13973)
b25d3bf

Anton Mitkov committed on

opencl: ref count `ggml_backend_opencl_context` and refactor profiling (llama/14254)
ae0c7b8

lhez committed on