whisper.cpp / ggml-metal.m

Commit History

ggml : mul_mat_id use the same tensor for all the experts (llama/6387)
26fdc9f

slaren, ggerganov committed on

sync : ggml (#2001)
cbbfa9e

ggerganov committed on

metal : build metallib + fix embed path (llama/6015)
27311ef

ggerganov committed on

llama : add pipeline parallelism support (llama/6017)
b5bb3f3

slaren, compilade, ggerganov committed on

ggml : reuse quantum structs across backends (llama/5943)
bb0625f

ggerganov committed on

metal : move mm_id indices to shared mem (llama/5982)
1350705

ggerganov committed on

ggml : introduce ggml_status (ggml/750)
151c676

Michael Podvitskiy, slaren, ggerganov committed on

add some new ops, fix some operators and add batch operations to certain operators. (ggml/747)
dd8e3f9

leejet, ggerganov, slaren committed on

IQ4_XS: a 4.25 bpw quantization (llama/5747)
0ee1bfb

Kawrakow, ikawrakow committed on

Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (llama/5721)
2b9bb9e

Kawrakow, ikawrakow, ggerganov committed on

code : normalize enum names (llama/5697)
93e0830

ggerganov committed on

IQ3_S: a much better alternative to Q3_K (llama/5676)
32589c9

Kawrakow, ikawrakow committed on

Introduce backend GUIDs (ggml/743)
a7eb9f6

UEXTM.com, slaren committed on

sync : llama.cpp (ggml/0)
f8e8d34

ggerganov committed on

1.5 bit quantization (llama/5453)
9c3aa6a

Kawrakow, ikawrakow committed on

ggml : add ALiBi support for ggml_soft_max_ext (llama/5488)
26c019a

ggerganov committed on

ci : add an option to fail on compile warning (llama/3952)
b5903fc

abastola, ggerganov committed on

metal : use autoreleasepool to avoid memory leaks (llama/5437)
c276f12

irbull committed on

metal : option to embed MSL source into compiled binary (#1842)
a46b62a

Didzis Gosko committed on

metal : add im2col F32 dst support (llama/5132)
26aec77

ggerganov committed on

SOTA 3-bit quants (llama/5196)
4649943

Kawrakow, ikawrakow committed on

ggml : add max buffer sizes to opencl and metal backends (llama/5181)
3d354d0

slaren committed on

metal : free metal objects (llama/5161)
ea7167a

Paul Tsochantaris committed on

ci : fix yolo URLs + fix metal capture (ggml/712)
588f789

ggerganov committed on

metal : add debug capture backend function (ggml/694)
ece88c3

Jack Mousseau, ggerganov committed on

ggml : add Vulkan backend (llama/2059)
5a97aba

OccamRazor, SlyEcho, Concedo, slaren, ggerganov committed on

metal : remove unused `n_buffers` and `buffers` (llama/5129)
a3e87d3

Paul Tsochantaris committed on

metal : show compile log messages
ae08f31

ggerganov committed on

metal : disable support for MUL_MAT F32 x F16
7fbc01f

ggerganov committed on

ggml : sync ggml-metal.m
b4085c3

ggerganov committed on

metal : create autorelease pool during library build (llama/4970)
9027276

ggerganov committed on

metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (llama/4936)
e2cc0e5

azarovalex, ggerganov committed on

ggml : introduce GGML_CALL function annotation (llama/4850)
7815f68

jartine committed on

metal : correctly set SIMD support flags on iOS (llama/4923)
1cf2fa9

azarovalex committed on

metal : remove old API (llama/4919)
d6abb6a

ggerganov committed on

metal : disable log for loaded kernels (llama/4794)
2305485

ggerganov committed on

metal : refactor kernel loading code (llama/4794)
53e6bf8

ggerganov committed on

llama : ggml-backend integration (llama/4766)
362430b

slaren, ggerganov, JohannesGaessler committed on

ggml : SOTA 2-bit quants (add IQ2_XS) (llama/4856)
5e827d5

Kawrakow, ikawrakow committed on

metal : put encoder debug group behind a define (llama/4873)
6e822b8

Paul Tsochantaris committed on

metal : fix deprecation warning (ggml/690)
b1e29bc

ggerganov committed on

metal : wrap each operation in debug group (ggml/690)
b5e360f

Jack Mousseau committed on

SOTA 2-bit quants (llama/4773)
75de5bf

Kawrakow, ikawrakow committed on

metal : switch back to default.metallib (ggml/681)
b945a8f

ggerganov committed on

ggml : add error handling to graph_compute (#1714)
92f24ee

finnvoorhees committed on

metal : add kernel_get_rows_i32
459dd87

ggerganov committed on

metal : optimize ggml_mul_mat_id (faster Mixtral PP) (llama/4725)
8bc6274

ggerganov committed on

metal : enable shader debugging (cmake option) (llama/4705)
7dd37dc

ggerganov committed on

sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677)
aa86ade

ggerganov committed on

sync : ggml (Metal fixes, new ops, tests) (#1633)
a0d4b48

ggerganov committed on