Commit History

ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541)
bf73242

shupeif committed

kompute : improve backend to pass test_backend_ops (llama/10542)
c8008b8

slpnix committed

CANN: Fix SOC_TYPE compile bug (llama/10519)
7f24ebb

leo-pony committed

CANN: ROPE operator optimization (llama/10540)
63ee002

Chenguang Li (noemotiovon) committed

Add some minimal optimizations for CDNA (llama/10498)
bf49bbe

uvos committed

metal : fix group_norm support condition (llama/0)
20ee62d

ggerganov committed

vulkan: define all quant data structures in types.comp (llama/10440)
cea89af

jeffbolznv committed

vulkan: Handle GPUs with less shared memory (llama/10468)
18a0ad1

jeffbolznv committed

vulkan: further optimize q5_k mul_mat_vec (llama/10479)
cb018d4

jeffbolznv committed

vulkan: skip integer div/mod in get_offsets for batch_idx==0 (llama/10506)
c6d15e0

jeffbolznv committed

vulkan: optimize Q2_K and Q3_K mul_mat_vec (llama/10459)
c032c06

jeffbolznv committed

mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (llama/10516)
f2a87fc

R0CKSTAR committed

vulkan: fix group_norm (llama/10496)
8f5eeb8

jeffbolznv committed

cmake : enable warnings in llama (llama/10474)
26a670b

ggerganov committed

ggml-cpu: cmake add arm64 cpu feature check for macos (llama/10487)
6d586a0

Charles Xu committed

CANN: Improve the Inference Performance for Ascend NPU Device (llama/10454)
f9fd6d6

Shanshan Shen and Frank Mai committed

CANN: RoPE and CONCAT operator optimization (llama/10488)
b357ea7

Chenguang Li (noemotiovon) committed

vulkan: Fix a vulkan-shaders-gen argument parsing error (llama/10484)
6a4b6ae

Sparkleholic committed

metal : enable mat-vec kernels for bs <= 4 (llama/10491)
6d07dee

ggerganov committed

llama : accept a list of devices to use to offload a model (llama/10497)
6d7599e

Diego Devesa committed

ggml : add support for dynamic loading of backends (llama/10469)
b73266f

Diego Devesa and ggerganov committed

metal : minor code formatting
385a521

ggerganov committed

ggml : do not use ARM features not included in the build (llama/10457)
0001327

Diego Devesa committed

CANN: Support Ascend310P to accelerate F32 and F16 Model (llama/10216)
c9e03e6

leo-pony committed

cuda : optimize argmax (llama/10441)
69ae50d

Diego Devesa and JohannesGaessler committed

vulkan: predicate max operation in soft_max shaders/soft_max (llama/10437)
0a14325

jeffbolznv committed

vulkan: copy iq4_nl LUT into shared memory (llama/10409)
c31abdb

jeffbolznv committed

vulkan: further optimize mul_mat_vec using larger loads (llama/10387)
50a2978

jeffbolznv committed

add cmake rvv support (llama/10411)
e0bf47c

haopeng committed

CUDA: remove unnecessary warp reduce in FA (ggml/1032)
9a8c238

mahorozte committed

feat: add `GGML_UNARY_OP_ARGMAX` Metal kernel (ggml/1019)
c7e59ef

PABannier and Diego Devesa committed

metal : add `GGML_OP_CONV_TRANSPOSE_1D` kernels (ggml/1026)
9c845f4

PABannier committed

Do not include arm_neon.h when compiling CUDA code (ggml/1028)
80663f4

Frankie Robertson committed

ggml-opt: fix data corruption (ggml/1022)
a916e92

JohannesGaessler committed

ruby : Add low-level methods to transcribe (#2585)
4bf69ed

KitaitiMakoto committed

models : add `q8_0` models to `download-ggml-model.sh` (#2589)
7feeb43

mikey-rrr committed

ruby : Follow source tree change (#2580)
7895d75

KitaitiMakoto committed

whisper : use backend registry (#0)
b9f5e40

ggerganov committed

ggml/sched : do not skip views in pre-assignments
b1eba61

slaren committed

whisper : adapt to new ggml (wip)
ec6f374

ggerganov committed

talk-llama : sync llama.cpp
1568fc8

ggerganov committed

sync : ggml
e3c317a

ggerganov committed

ggml : sync resolve (skip) (#0)
d4d67dc

ggerganov committed

Add required ggml-base and backend libs to cmake pkg (llama/10407)
8fdd994

bandoti committed

cuda : fix CUDA_FLAGS not being applied (llama/10403)
22e1593

Diego Devesa committed

sycl : Add option to set the SYCL architecture for all targets (llama/10266)
0d836df

Romain Biessy committed

vulkan: Optimize soft_max (llama/10301)
5cb851d

jeffbolznv committed

sycl: Revert MUL_MAT_OP support changes (llama/10385)
6df9941

Alberto Cabrera Pérez committed

cuda : only use native when supported by cmake (llama/10389)
24d2e82

Diego Devesa committed

vulkan: remove use of null initializer (llama/10372)
dacdc69

jeffbolznv committed