whisper.cpp

Running

App Files Files Community

whisper.cpp

Commit History

ggml: fix compilation error s390x (llama/12848)

2458d68

Aaron Teo Aleksei Nikiforov commited on Apr 11, 2025

cpu: fix cpu backend's supports-op for GET_ROWS_BACK. fixes a fatal when running test-backend-ops with only the CPU backend (ggml/1190)

ee7706c

cmdr2 commited on Apr 11, 2025

CANN: Support more ops (llama/12841)

6aecea5

Chenguang Li commited on Apr 10, 2025

Fixes #12823 (llama/12830)

8a74c6b

prajwal-ibm commited on Apr 9, 2025

ggml-cpu-impl.h: do not redefine bool on POWER9 (llama/12856)

bb47d22

Piotr Kubaj commited on Apr 9, 2025

ggml-impl.h: fix build on POWER9 (llama/12855)

3a1d5ca

Piotr Kubaj commited on Apr 9, 2025

CANN: Support Opt CONV_TRANSPOSE_1D and ELU (llama/12786)

3b46fdc

Chenguang Li commited on Apr 9, 2025

vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (llama/12833)

4b7a407

jeffbolznv commited on Apr 9, 2025

vulkan: Use fp16 for the flash attention P*V multiplication (llama/12783)

4e46f41

jeffbolznv commited on Apr 9, 2025

cuda : add f32 to bf16 copy op (llama/12806)

9dcb047

Sigbjørn Skjæret commited on Apr 8, 2025

llama : fix FA when KV cache is not used (i.e. embeddings) (llama/12825)

e7cb2dc

ggerganov HF Staff commited on Apr 8, 2025

ggml: don't include arm_neon.h when using CUDA 12 with ARM Neon (ggml/1187)

87f1ea3

cmdr2 commited on Apr 10, 2025

ggml : add bilinear upscale support (ggml/1185)

4c5e449

Diego Devesa commited on Apr 9, 2025

ggml : add more generic custom op, remove deprecated custom ops (ggml/1183)

ba7a5f8

Diego Devesa commited on Apr 9, 2025

Revert "sycl:remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor" (llama/12812)

3d4b079

Neo Zhang Jianyu commited on Apr 8, 2025

opencl: better identify Adreno GPU (llama/12760)

5560cd6

lhez commited on Apr 7, 2025

cuda : fix HIP and MUSA BF16 (llama/0)

6dc5583

ggerganov HF Staff commited on Apr 7, 2025

sycl: remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor (llama/12734)

7d3e668

jeffzhou2000 commited on Apr 7, 2025

CANN: fix typo in ggml-cann (llama/12733)

65ced74

jeffzhou2000 commited on Apr 7, 2025

CANN: Refactor to reduce duplicate code (llama/12731)

44ac81c

hipudding commited on Apr 7, 2025

musa: fix compilation warnings in mp_22/31 (llama/12780)

090ad80

R0CKSTAR commited on Apr 6, 2025

vulkan: fix NaN issue in flash attention shader (llama/12776)

77d7613

jeffbolznv commited on Apr 6, 2025

vulkan: Use unclamped loads for flash attention mask (llama/12720)

a76ef69

jeffbolznv commited on Apr 6, 2025

Vulkan: Tune Vulkan mmq int dot shader for performance (llama/12767)

b3bf710

OccamRazor commited on Apr 5, 2025

sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution (llama/12625)

27cbcc9

Nicolò Scipione commited on Apr 4, 2025

cmake: fix ggml-shaders-gen compiler paths containing spaces (llama/12747)

1c89b7d

Ronny Brendel commited on Apr 4, 2025

vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (llama/12630)

ee422be

jeffbolznv commited on Apr 4, 2025

vulkan: set cmake minimum and project name in vulkan-shaders (llama/12744)

2459781

jeffbolznv commited on Apr 4, 2025

CUDA: Prefer vector flash decoding kernel for Gemma models (llama/12738)

5d7a13f

Gaurav Garg

JohannesGaessler commited on Apr 3, 2025

vulkan: Fix missing cmake logic for dot product extension (llama/12721)

7a1e8f8

jeffbolznv commited on Apr 3, 2025

fix MUSA compiler warning (llama/12704)

8d43aa6

a3sh commited on Apr 3, 2025

CANN: Support operator SIN COS ARGMAX (llama/12709)

904aaf5

Chenguang Li

noemotiovon commited on Apr 3, 2025

Simplify and improve CUDA graphs through use of indirect copy pointers (llama/9017)

a2fdbe6

Alan Gray slaren commited on Apr 3, 2025

CANN: Fix failed test cases (llama/12708)

7d5f3d4

hipudding commited on Apr 3, 2025

opencl: use `max_alloc_size` in backend ctx instead of querying again (llama/12705)

3847456

lhez commited on Apr 3, 2025

vulkan: Implement split_k for coopmat2 flash attention. (llama/12627)

5ab06d6

jeffbolznv commited on Apr 2, 2025

cmake: remove caching from vulkan coopmat checks (llama/12719)

fac18c1

bandoti commited on Apr 2, 2025

vulkan: Implement grouped query attention in the coopmat2 FA shader (llama/12559)

e7bebe6

jeffbolznv commited on Apr 2, 2025

Vulkan: Fix mmq int dot float cache size (llama/12722)

1cecf5d

OccamRazor commited on Apr 2, 2025

llama : add option to override model tensor buffers (llama/11397)

3d000b6

Diego Devesa commited on Apr 2, 2025

ggml : simplify Arm fp16 CPU logic (ggml/1177)

fb13b88

ggerganov HF Staff commited on Apr 7, 2025

CUDA: don't convert BF16 weights to FP32 (ggml/1174)

332bcaf

Sigbjørn Skjæret commited on Apr 4, 2025

coreml : set convert_to="mlprogram" in convert

d41b883
unverified

danbev commited on Apr 23, 2025

ci : disable freeBSD job in build.yml (#3064)

9374466
unverified

danbev commited on Apr 22, 2025

examples : add HEAPU8 to exported runtime methods (#3062)

2339555
unverified

danbev commited on Apr 20, 2025

ruby : make Ruby bindings installed with build options (#3056)

8d0a50d
unverified

KitaitiMakoto commited on Apr 17, 2025

whisper : add no_context parameter to whisper_params (#3045)

0e991f8
unverified

sachaarbonel commited on Apr 16, 2025

examples : add FFmpeg v7.0 support to ffmpeg-transcode.cpp (#3038)

880d905
unverified

fujimotos commited on Apr 15, 2025

ruby: use CMake in build process (#3043)

470918e
unverified

KitaitiMakoto commited on Apr 14, 2025

docs : update README.md to note newer nvidia gpus (#3031)

9401dde
unverified

Jeff Klassen commited on Apr 11, 2025

Commit History

ggml: fix compilation error s390x (llama/12848) 2458d68

cpu: fix cpu backend's supports-op for GET_ROWS_BACK. fixes a fatal when running test-backend-ops with only the CPU backend (ggml/1190) ee7706c

CANN: Support more ops (llama/12841) 6aecea5

Fixes #12823 (llama/12830) 8a74c6b

ggml-cpu-impl.h: do not redefine bool on POWER9 (llama/12856) bb47d22

ggml-impl.h: fix build on POWER9 (llama/12855) 3a1d5ca

CANN: Support Opt CONV_TRANSPOSE_1D and ELU (llama/12786) 3b46fdc

vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (llama/12833) 4b7a407

vulkan: Use fp16 for the flash attention P*V multiplication (llama/12783) 4e46f41

cuda : add f32 to bf16 copy op (llama/12806) 9dcb047

llama : fix FA when KV cache is not used (i.e. embeddings) (llama/12825) e7cb2dc

ggml: don't include arm_neon.h when using CUDA 12 with ARM Neon (ggml/1187) 87f1ea3

ggml : add bilinear upscale support (ggml/1185) 4c5e449

ggml : add more generic custom op, remove deprecated custom ops (ggml/1183) ba7a5f8

Revert "sycl:remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor" (llama/12812) 3d4b079

opencl: better identify Adreno GPU (llama/12760) 5560cd6

cuda : fix HIP and MUSA BF16 (llama/0) 6dc5583

sycl: remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor (llama/12734) 7d3e668

CANN: fix typo in ggml-cann (llama/12733) 65ced74

CANN: Refactor to reduce duplicate code (llama/12731) 44ac81c

musa: fix compilation warnings in mp_22/31 (llama/12780) 090ad80

vulkan: fix NaN issue in flash attention shader (llama/12776) 77d7613

vulkan: Use unclamped loads for flash attention mask (llama/12720) a76ef69

Vulkan: Tune Vulkan mmq int dot shader for performance (llama/12767) b3bf710

sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution (llama/12625) 27cbcc9

cmake: fix ggml-shaders-gen compiler paths containing spaces (llama/12747) 1c89b7d

vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (llama/12630) ee422be

vulkan: set cmake minimum and project name in vulkan-shaders (llama/12744) 2459781

CUDA: Prefer vector flash decoding kernel for Gemma models (llama/12738) 5d7a13f

vulkan: Fix missing cmake logic for dot product extension (llama/12721) 7a1e8f8

fix MUSA compiler warning (llama/12704) 8d43aa6

CANN: Support operator SIN COS ARGMAX (llama/12709) 904aaf5

Simplify and improve CUDA graphs through use of indirect copy pointers (llama/9017) a2fdbe6

CANN: Fix failed test cases (llama/12708) 7d5f3d4

opencl: use `max_alloc_size` in backend ctx instead of querying again (llama/12705) 3847456

vulkan: Implement split_k for coopmat2 flash attention. (llama/12627) 5ab06d6

cmake: remove caching from vulkan coopmat checks (llama/12719) fac18c1

vulkan: Implement grouped query attention in the coopmat2 FA shader (llama/12559) e7bebe6

Vulkan: Fix mmq int dot float cache size (llama/12722) 1cecf5d

llama : add option to override model tensor buffers (llama/11397) 3d000b6

ggml : simplify Arm fp16 CPU logic (ggml/1177) fb13b88

CUDA: don't convert BF16 weights to FP32 (ggml/1174) 332bcaf

coreml : set convert_to="mlprogram" in convert d41b883 unverified

ci : disable freeBSD job in build.yml (#3064) 9374466 unverified

examples : add HEAPU8 to exported runtime methods (#3062) 2339555 unverified

ruby : make Ruby bindings installed with build options (#3056) 8d0a50d unverified

whisper : add no_context parameter to whisper_params (#3045) 0e991f8 unverified

examples : add FFmpeg v7.0 support to ffmpeg-transcode.cpp (#3038) 880d905 unverified

ruby: use CMake in build process (#3043) 470918e unverified

docs : update README.md to note newer nvidia gpus (#3031) 9401dde unverified