whisper.cpp / ggml-metal.metal

Commit History

metal : support FA without mask + add asserts (llama/7278)
98ce302
unverified

ggerganov committed on

ggml : resolve merge (ggml/0)
d692b06

ggerganov committed on

ggml : full ALiBi support (llama/7192)
192bda4

ggerganov committed on

metal : fix unused warning
24e883a

ggerganov committed on

ggml : group all experts in a single ggml_mul_mat_id (llama/6505)
f0b5c67

slaren, ggerganov committed on

llama : add qwen2moe (llama/6074)
daae175

Shijie, ggerganov committed on

Added support for GGML_OP_CLAMP in Metal (llama/6662)
a06cbc7

Dave dave-fl committed on

metal : unify mul_mv_id kernels (llama/6556)
e9910b5

slaren committed on

feat: implemented sigmoid function (ggml/806)
cd0c122

Justina Cho committed on

ggml : mul_mat_id use the same tensor for all the experts (llama/6387)
26fdc9f
unverified

slaren, ggerganov committed on

sync : ggml (#2001)
cbbfa9e
unverified

ggerganov committed on

metal : build metallib + fix embed path (llama/6015)
27311ef
unverified

ggerganov committed on

ggml : reuse quantum structs across backends (llama/5943)
bb0625f
unverified

ggerganov committed on

1.5 bit: we can do even better (llama/5999)
36cc71e
unverified

Kawrakow ikawrakow committed on

Better 1.5 bit quantization (llama/5971)
f3a62cc
unverified

Kawrakow ikawrakow committed on

metal : move mm_id indices to shared mem (llama/5982)
1350705
unverified

ggerganov committed on

ggml : add ggml-common.h to deduplicate shared code (llama/5940)
0a37735
unverified

ggerganov committed on

ggml : IQ3_S improvements (llama/5829)
06a8e30
unverified

Kawrakow ikawrakow committed on

add some new ops, fix some operators and add batch operations to certain operators. (ggml/747)
dd8e3f9
unverified

leejet, ggerganov, slaren committed on

ggml : make i-quants work with super-blocks of 64 (CPU,Metal) (llama/5760)
9a07f42
unverified

Kawrakow ikawrakow committed on

IQ4_XS: a 4.25 bpw quantization (llama/5747)
0ee1bfb
unverified

Kawrakow ikawrakow committed on

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)
2b9bb9e
unverified

Kawrakow ikawrakow, ggerganov committed on

IQ3_S: a much better alternative to Q3_K (llama/5676)
32589c9
unverified

Kawrakow ikawrakow committed on

sync : llama.cpp (ggml/0)
f8e8d34
unverified

ggerganov committed on

cuda, metal : fix nans in soft_max (llama/5574)
44164ac
unverified

slaren, ggerganov committed on

metal : fix unused warnings (llama/0)
d12cda5
unverified

ggerganov committed on

1.5 bit quantization (llama/5453)
9c3aa6a
unverified

Kawrakow ikawrakow committed on

ggml : add ALiBi support for ggml_soft_max_ext (llama/5488)
26c019a
unverified

ggerganov committed on

metal : add im2col F32 dst support (llama/5132)
26aec77
unverified

ggerganov committed on

ggml : fix IQ3_XXS on Metal (llama/5219)
f066321
unverified

Kawrakow ikawrakow committed on

SOTA 3-bit quants (llama/5196)
4649943
unverified

Kawrakow ikawrakow committed on

ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)
5e827d5
unverified

Kawrakow ikawrakow committed on

metal : improve dequantize precision to match CPU (llama/4836)
f2da2a4
unverified

ggerganov committed on

SOTA 2-bit quants (llama/4773)
75de5bf
unverified

Kawrakow ikawrakow committed on

metal : add kernel_get_rows_i32
459dd87

ggerganov committed on

metal : optimize ggml_mul_mat_id (faster Mixtral PP) (llama/4725)
8bc6274

ggerganov committed on

metal : enable shader debugging (cmake option) (llama/4705)
7dd37dc

ggerganov committed on

sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677)
aa86ade
unverified

ggerganov committed on

sync : ggml (Metal fixes, new ops, tests) (#1633)
a0d4b48
unverified

ggerganov committed on

sync : ggml (new ops, new backend, etc) (#1602)
895e87a
unverified

ggerganov committed on

whisper : add full CUDA and Metal offloading (#1472)
da4acca
unverified

ggerganov committed on

sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)
7006035
unverified

ggerganov, Chris Raethke committed on

metal : add F32 support + update bench output
02d7878
unverified

ggerganov committed on

whisper : Metal and ggml-alloc support (#1270)
714ee6b
unverified

ggerganov committed on

sync : ggml (HBM + Metal + style) (#1264)
88deeba
unverified

ggerganov committed on

ggml : sync latest llama.cpp (view_src + alloc improvements) (#1247)
8bb66c1
unverified

ggerganov committed on

ggml : sync (ggml-alloc, GPU, eps, etc.) (#1220)
d41ba35
unverified

ggerganov committed on

ggml : sync latest repo (mostly refactoring changes)
d97fd69
unverified

ggerganov committed on

metal : sync ggml-metal (ref #1047)
799974c
unverified

ggerganov committed on