Commit History
cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (llama/14741) bb523fb
Oliver Simons committed on
CUDA: set_rows + cpy.cu refactor (llama/14712) 536128f
use max work group size for device to replace the magic number (llama/14732) e5e9b79
Neo Zhang Jianyu committed on
ggml: Add initial WebGPU backend (llama/14521) 0dd208f
Reese Levine committed on
llama : add high-throughput mode (llama/14363) b2d73a2
ggml : add asserts (llama/14720) 7073590
vulkan: fix noncontig check for mat_mul_id splitting (llama/14683) 4d0d8b8
vulkan: add RTE variants for glu/add/sub/mul/div (llama/14653) bac21a7
cuda: fix build warnings in set-rows.cu (unused variable) (llama/14687) 1e145c7
sycl: Hotfix for non dnnl codepath (llama/14677) 75496c9
Anton Mitkov committed on
ggml : refactor llamafile_sgemm PPC code (llama/14673) 3e2a209
SYCL: use 1D kernel for set_rows (llama/14618) b305121
sycl: Batched mulmat rework for oneDNN dispatch (llama/14617) 2722bea
Anton Mitkov committed on
cuda : add set rows for bf16 (llama/14664) 1f97ff4
Sigbjørn Skjæret committed on
cuda : add ELU support (llama/14657) cbe8006
Yavor Ivanov committed on
ggml : add build-time message to remind about ggml_set_rows (llama/14661) 0f5d4ba
metal : Add missing unary ops Metal support (llama/14660) 2ed022e
Yavor Ivanov committed on
CUDA: add set rows for f32 and f16 (llama/14551) e51f2d4
whisper: validate get_rows support for cpu extra buffer (#3323) 3c6ba32 unverified
Charles Xu committed on
examples : update links in wasm examples (#3318) e03b5a6 unverified
sync : resolve conflicts (#0) 5ec49ef
talk-llama : sync llama.cpp bc53087
sync : ggml 116dcaa
sync : resolve conflicts (ggml/0) 497add0
vulkan: support SET_ROWS (llama/14587) 9821f43
vulkan: optimizations for deepseek prompt processing (llama/14555) 04b631e
model : support LiquidAI LFM2 hybrid family (llama/14620) 07ff90a
Tarek Dakhran committed on
HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (llama/14634) 4354560
Slobodan Josic committed on
opencl: add tiled mul_mat_f16_f32 (llama/14535) 398dc49
opencl: add `set_rows` for `f16` and `f32` (llama/14547) 5e203ec
lhez committed on
SYCL: Initial set_rows kernel implementation (llama/14562) e62ef85
cuda : support Falcon-H1 state size for SSM_SCAN (llama/14602) 92b2d32
ggml : add ggml_scale_bias (llama/14417) 573d50a
ggml : prevent integer overflow in gguf tensor size calculation (llama/14595) 31f34e7
vulkan: optimize flash attention split_k_reduce (llama/14554) 45fbb42
vulkan : fix rope with partial rotation and non-cont src (llama/14582) 367fa85
cuda : fix rope with partial rotation and non-cont src (llama/14580) aaf2d96
CUDA: add bilinear interpolation for upscale (llama/14563) 68ded09
musa: fix build warnings (unused variable) (llama/14561) 891b1d1
CUDA: add bf16 and i32 to getrows (llama/14529) 014494c
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (llama/14485) effd61f
Eve Rémy Oudompheng committed on
vulkan: fix rms_norm+mul fusion (llama/14545) 0791e65
vulkan: Handle updated FA dim2/3 definition (llama/14518) d1e619e
opencl: add GELU_ERF (llama/14476) b19d736
Sigbjørn Skjæret committed on
metal : disable fast math in all quantize kernels (llama/14528) df9d510
CANN: Replace aclrtMemsetSync with aclnnInplaceZero operator (llama/14002) b9b5859
luyhcsu luyuhong committed on
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445) f798922
Sigbjørn Skjæret committed on
opencl : broadcast for soft_max (llama/14510) 4434043
lhez committed on