whisper.cpp / ggml-metal.m

Commit History

metal : support FA without mask + add asserts (llama/7278)
98ce302
unverified

ggerganov HF Staff committed on

metal : tune soft_max number of threads (#0)
99d668a

ggerganov HF Staff committed on

metal : fix indent (ggml/0)
d4f82d5

ggerganov HF Staff committed on

ggml : full ALiBi support (llama/7192)
192bda4

ggerganov HF Staff committed on

metal : fix flash attention kernel requirements (llama/7169)
6cb3028

ggerganov HF Staff committed on

metal : use `vm_allocate` instead of `posix_memalign` on macOS (llama/7078)
eb910b1

Gilad S committed on

ggml : introduce bfloat16 support (llama/6412)
81ec961

Justine Tunney committed on

switch to using localizedDescription (llama/7010)
fd25ba6

bakkot committed on

metal : remove deprecated error code (llama/7008)
42a84fb

ggerganov HF Staff committed on

metal : log more info on error (llama/6987)
d4dcef9

bakkot committed on

ggml : group all experts in a single ggml_mul_mat_id (llama/6505)
f0b5c67

slaren ggerganov HF Staff committed on

llama : add qwen2moe (llama/6074)
daae175

Shijie ggerganov HF Staff committed on

Added support for GGML_OP_CLAMP in Metal (llama/6662)
a06cbc7

Dave dave-fl committed on

metal : unify mul_mv_id kernels (llama/6556)
e9910b5

slaren committed on

feat: implemented sigmoid function (ggml/806)
cd0c122

Justina Cho committed on

ggml : mul_mat_id use the same tensor for all the experts (llama/6387)
26fdc9f
unverified

slaren ggerganov HF Staff committed on

sync : ggml (#2001)
cbbfa9e
unverified

ggerganov HF Staff committed on

metal : build metallib + fix embed path (llama/6015)
27311ef
unverified

ggerganov HF Staff committed on

llama : add pipeline parallelism support (llama/6017)
b5bb3f3
unverified

slaren compilade ggerganov HF Staff committed on

ggml : reuse quantum structs across backends (llama/5943)
bb0625f
unverified

ggerganov HF Staff committed on

metal : move mm_id indices to shared mem (llama/5982)
1350705
unverified

ggerganov HF Staff committed on

ggml : introduce ggml_status (ggml/750)
151c676
unverified

Michael Podvitskiy slaren ggerganov HF Staff committed on

add some new ops, fix some operators and add batch operations to certain operators. (ggml/747)
dd8e3f9
unverified

leejet ggerganov HF Staff slaren committed on

IQ4_XS: a 4.25 bpw quantization (llama/5747)
0ee1bfb
unverified

Kawrakow ikawrakow committed on

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)
2b9bb9e
unverified

Kawrakow ikawrakow ggerganov HF Staff committed on

code : normalize enum names (llama/5697)
93e0830
unverified

ggerganov HF Staff committed on

IQ3_S: a much better alternative to Q3_K (llama/5676)
32589c9
unverified

Kawrakow ikawrakow committed on

Introduce backend GUIDs (ggml/743)
a7eb9f6
unverified

UEXTM.com slaren committed on

sync : llama.cpp (ggml/0)
f8e8d34
unverified

ggerganov HF Staff committed on

1.5 bit quantization (llama/5453)
9c3aa6a
unverified

Kawrakow ikawrakow committed on

ggml : add ALiBi support for ggml_soft_max_ext (llama/5488)
26c019a
unverified

ggerganov HF Staff committed on

ci : add an option to fail on compile warning (llama/3952)
b5903fc
unverified

abastola ggerganov HF Staff committed on

metal : use autoreleasepool to avoid memory leaks (llama/5437)
c276f12
unverified

irbull committed on

metal : option to embed MSL source into compiled binary (#1842)
a46b62a
unverified

Didzis Gosko committed on

metal : add im2col F32 dst support (llama/5132)
26aec77
unverified

ggerganov HF Staff committed on

SOTA 3-bit quants (llama/5196)
4649943
unverified

Kawrakow ikawrakow committed on

ggml : add max buffer sizes to opencl and metal backends (llama/5181)
3d354d0
unverified

slaren committed on

metal : free metal objects (llama/5161)
ea7167a
unverified

Paul Tsochantaris committed on

ci : fix yolo URLs + fix metal capture (ggml/712)
588f789
unverified

ggerganov HF Staff committed on

metal : add debug capture backend function (ggml/694)
ece88c3
unverified

Jack Mousseau ggerganov HF Staff committed on

ggml : add Vulkan backend (llama/2059)
5a97aba
unverified

OccamRazor SlyEcho Concedo slaren ggerganov HF Staff committed on

metal : remove unused `n_buffers` and `buffers` (llama/5129)
a3e87d3
unverified

Paul Tsochantaris committed on

metal : show compile log messages
ae08f31
unverified

ggerganov HF Staff committed on

metal : disable support for MUL_MAT F32 x F16
7fbc01f
unverified

ggerganov HF Staff committed on

ggml : sync ggml-metal.m
b4085c3
unverified

ggerganov HF Staff committed on

metal : create autorelease pool during library build (llama/4970)
9027276
unverified

ggerganov HF Staff committed on

metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936)
e2cc0e5
unverified

azarovalex ggerganov HF Staff committed on

ggml : introduce GGML_CALL function annotation (llama/4850)
7815f68
unverified

jartine committed on

metal : correctly set SIMD support flags on iOS (llama/4923)
1cf2fa9
unverified

azarovalex committed on