Commit History

metal : fuse add, mul + add tests (llama/14596)
66ae493

ggerganov HF Staff committed on

cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (llama/14741)
bb523fb

Oliver Simons committed on

CUDA: set_rows + cpy.cu refactor (llama/14712)
536128f

am17an committed on

use max work group size for device to replace the magic number (llama/14732)
e5e9b79

Neo Zhang Jianyu committed on

ggml: Add initial WebGPU backend (llama/14521)
0dd208f

Reese Levine committed on

ggml : add asserts (llama/14720)
7073590

ggerganov HF Staff, Diego Devesa committed on

vulkan: fix noncontig check for mat_mul_id splitting (llama/14683)
4d0d8b8

jeffbolznv committed on

vulkan: add RTE variants for glu/add/sub/mul/div (llama/14653)
bac21a7

jeffbolznv committed on

cuda: fix build warnings in set-rows.cu (unused variable) (llama/14687)
1e145c7

yeahdongcn committed on

sycl: Hotfix for non dnnl codepath (llama/14677)
75496c9

Anton Mitkov committed on

ggml : refactor llamafile_sgemm PPC code (llama/14673)
3e2a209

shalinib committed on

SYCL: use 1D kernel for set_rows (llama/14618)
b305121

qnixsynapse committed on

sycl: Batched mulmat rework for oneDNN dispatch (llama/14617)
2722bea

Anton Mitkov committed on

cuda : add set rows for bf16 (llama/14664)
1f97ff4

Sigbjørn Skjæret committed on

cuda : add ELU support (llama/14657)
cbe8006

Yavor Ivanov committed on

ggml : add build-time message to remind about ggml_set_rows (llama/14661)
0f5d4ba

ggerganov HF Staff committed on

metal : Add missing unary ops Metal support (llama/14660)
2ed022e

Yavor Ivanov committed on

CUDA: add set rows for f32 and f16 (llama/14551)
e51f2d4

am17an committed on

whisper: validate get_rows support for cpu extra buffer (#3323)
3c6ba32
unverified

Charles Xu committed on

examples : update links in wasm examples (#3318)
e03b5a6
unverified

facehugger11 committed on

sync : resolve conflicts (#0)
5ec49ef

ggerganov HF Staff committed on

talk-llama : sync llama.cpp
bc53087

ggerganov HF Staff committed on

sync : ggml
116dcaa

ggerganov HF Staff committed on

sync : resolve conflicts (ggml/0)
497add0

ggerganov HF Staff committed on

vulkan: support SET_ROWS (llama/14587)
9821f43

jeffbolznv committed on

vulkan: optimizations for deepseek prompt processing (llama/14555)
04b631e

jeffbolznv committed on

model : support LiquidAI LFM2 hybrid family (llama/14620)
07ff90a

Tarek Dakhran committed on

HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (llama/14634)
4354560

Slobodan Josic committed on

opencl: add tiled mul_mat_f16_f32 (llama/14535)
398dc49

mrfatso committed on

opencl: add `set_rows` for `f16` and `f32` (llama/14547)
5e203ec

lhez committed on

SYCL: Initial set_rows kernel implementation (llama/14562)
e62ef85

qnixsynapse committed on

cuda : support Falcon-H1 state size for SSM_SCAN (llama/14602)
92b2d32

compilade committed on

ggml : add ggml_scale_bias (llama/14417)
573d50a

ngxson HF Staff committed on

ggml : prevent integer overflow in gguf tensor size calculation (llama/14595)
31f34e7

yuuoniy committed on

vulkan: optimize flash attention split_k_reduce (llama/14554)
45fbb42

jeffbolznv committed on

vulkan : fix rope with partial rotation and non-cont src (llama/14582)
367fa85

jeffbolznv committed on

cuda : fix rope with partial rotation and non-cont src (llama/14580)
aaf2d96

ggerganov HF Staff committed on

CUDA: add bilinear interpolation for upscale (llama/14563)
68ded09

am17an committed on

musa: fix build warnings (unused variable) (llama/14561)
891b1d1

yeahdongcn committed on

CUDA: add bf16 and i32 to getrows (llama/14529)
014494c

am17an committed on

vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (llama/14485)
effd61f

Eve Rémy Oudompheng committed on

vulkan: fix rms_norm+mul fusion (llama/14545)
0791e65

jeffbolznv committed on

vulkan: Handle updated FA dim2/3 definition (llama/14518)
d1e619e

jeffbolznv committed on

opencl: add GELU_ERF (llama/14476)
b19d736

Sigbjørn Skjæret committed on

metal : disable fast math in all quantize kernels (llama/14528)
df9d510

ggerganov HF Staff committed on

CANN: Replace aclrtMemsetSync with aclnnInplaceZero operator (llama/14002)
b9b5859

luyhcsu luyuhong committed on

ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922

Sigbjørn Skjæret committed on

opencl : broadcast for soft_max (llama/14510)
4434043

lhez committed on

vulkan: support mixed/deepseekR1 FA head sizes (llama/14509)
90cefa0

jeffbolznv committed on