Commit History
ggml : fix and optimize ppc64le (ggml/849) e3d09d2
Hong Bo PENG committed on
ggml : remove duplicate include of ggml-common.h (ggml/853) 8c3ae74
remove global variables (llama/7710) 4cb73ba
CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (llama/7921) 5931562
metal : utilize max shared memory for mul_mat_id (llama/7935) d4b3604
rpc : fix ggml_backend_rpc_supports_buft() (llama/7918) 56e6751
move BLAS to a separate backend (llama/6210) c773aa9
CUDA: fix broken oob check for FA vec f32 kernel (llama/7904) efbb7be
tests : add non-cont unary tests (llama/7857) 6dc2887
ggml : improve ggml_is_contiguous logic (llama/7856) ea3aa71
vulkan: select only one device for single gpu with multiple drivers (llama/7582) ee56a37
Update Vulkan RoPE implementation (llama/7818) 71850e7
CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (llama/7860) 154bf2b
CUDA: use tensor cores for MMQ (llama/7676) 78a5b67
use the correct SYCL context for host USM allocations (llama/7777) 9f87c2f
CUDA: revise q8_1 data layout for mul_mat_q (llama/7824) fcfd59e
vulkan : reuse parent extra for views (llama/7806) b9b60de
fix softmax r2r result wrong issue (llama/7811) c3a7159
CUDA: refactor mmq, dmmv, mmvq (llama/7716) 849ff52
ggml : refactor rope norm/neox (llama/7634) ded0c68
Allow number of nodes in CUDA graph to change (llama/7738) 6124287
agray3 committed on
ggml : remove OpenCL (llama/7735) 4ff3b72
ggml : prevent builds with -ffinite-math-only (llama/7726) 154f0f8
llama : offload to RPC in addition to other backends (llama/7640) eab8082
ggml : use OpenMP as a thread pool (llama/7606) 7e5d850
Vulkan Mixture of Experts (MoE) support (llama/7628) ad9ee26
kompute : implement op_getrows_f32 (llama/6403) fa0872f
woachk committed on
fix bug introduced in using calloc (llama/7701) f22c7e4
Dave Airlie committed on
Fix FlashAttention debug test, FP32 assert (llama/7684) 1bed92f
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (llama/7681) d4c0faf
CUDA: quantized KV support for FA vec (llama/7527) 315df8c
ggml : fix loongson compile warnings (llama/7537) c1442f3
faster avx512 exp implementation (llama/7551) 6dbbbab
ggml : fix loongarch build (O2 issue) (llama/7636) 133ffbf
junchao-loongson committed on
metal : remove invalid asserts (llama/7617) 562afce
metal : add missing asserts (llama/7617) be552ab
ggml : fix YARN + add tests + add asserts (llama/7617) 15da5f7
cuda : non-cont concat support (llama/7610) 64d3007
llama-bench : add support for the RPC backend (llama/7435) d460266
ggml : use atomic_flag for critical section (llama/7598) 68c6582
slaren committed on
examples : adapt to new ggml_concat (ggml/0) 36af6c5
ggml : fix typo in ggml.c (llama/7603) f06f1cb
Align GEMM dispatch (llama/7566) 2171dc6
sycl : fix assert (llama/7563) b4fb287
vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE (llama/7552) da90a1e
rpc : resource management rework (llama/7562) 7571b13
fix ggml_sycl_mul_mat_id() to match the change of api (llama/7436) f0ee71c
Neo Zhang committed on