Commit History

sync : ggml
305dc4e

ggerganov HF Staff committed on

ggml : fix and optimize ppc64le (ggml/849)
e3d09d2

Hong Bo PENG committed on

ggml : remove duplicate include of ggml-common.h (ggml/853)
8c3ae74

danbev committed on

remove global variables (llama/7710)
4cb73ba

hengyu committed on

CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (llama/7921)
5931562

JohannesGaessler committed on

metal : utilize max shared memory for mul_mat_id (llama/7935)
d4b3604

ggerganov HF Staff committed on

rpc : fix ggml_backend_rpc_supports_buft() (llama/7918)
56e6751

rgerganov committed on

move BLAS to a separate backend (llama/6210)
c773aa9

slaren ggerganov HF Staff committed on

CUDA: fix broken oob check for FA vec f32 kernel (llama/7904)
efbb7be

JohannesGaessler committed on

tests : add non-cont unary tests (llama/7857)
6dc2887

ggerganov HF Staff committed on

ggml : improve ggml_is_contiguous logic (llama/7856)
ea3aa71

ggerganov HF Staff committed on

vulkan: select only one device for single gpu with multiple drivers (llama/7582)
ee56a37

Adriankhl committed on

Update Vulkan RoPE implementation (llama/7818)
71850e7

OccamRazor slaren committed on

CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (llama/7860)
154bf2b

JohannesGaessler committed on

CUDA: use tensor cores for MMQ (llama/7676)
78a5b67

JohannesGaessler committed on

use the correct SYCL context for host USM allocations (llama/7777)
9f87c2f

bashbaug committed on

CUDA: revise q8_1 data layout for mul_mat_q (llama/7824)
fcfd59e

JohannesGaessler committed on

vulkan : reuse parent extra for views (llama/7806)
b9b60de

slaren OccamRazor committed on

fix softmax r2r result wrong issue (llama/7811)
c3a7159

PPxin committed on

CUDA: refactor mmq, dmmv, mmvq (llama/7716)
849ff52

JohannesGaessler committed on

ggml : refactor rope norm/neox (llama/7634)
ded0c68

ggerganov HF Staff committed on

Allow number of nodes in CUDA graph to change (llama/7738)
6124287

agray3 committed on

ggml : remove OpenCL (llama/7735)
4ff3b72

ggerganov HF Staff committed on

ggml : prevent builds with -ffinite-math-only (llama/7726)
154f0f8

ggerganov HF Staff committed on

llama : offload to RPC in addition to other backends (llama/7640)
eab8082

rgerganov slaren committed on

ggml : use OpenMP as a thread pool (llama/7606)
7e5d850

Masaya, Kato slaren ggerganov HF Staff committed on

Vulkan Mixture of Experts (MoE) support (llama/7628)
ad9ee26

OccamRazor committed on

kompute : implement op_getrows_f32 (llama/6403)
fa0872f

woachk committed on

fix bug introduced in using calloc (llama/7701)
f22c7e4

Dave Airlie committed on

Fix FlashAttention debug test, FP32 assert (llama/7684)
1bed92f

JohannesGaessler committed on

CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (llama/7681)
d4c0faf

JohannesGaessler committed on

CUDA: quantized KV support for FA vec (llama/7527)
315df8c

JohannesGaessler committed on

ggml : fix loongson compile warnings (llama/7537)
c1442f3

ggerganov HF Staff junchao-loongson committed on

faster avx512 exp implementation (llama/7551)
6dbbbab

chriselrod committed on

ggml : fix loongarch build (O2 issue) (llama/7636)
133ffbf

junchao-loongson committed on

metal : remove invalid asserts (llama/7617)
562afce

ggerganov HF Staff committed on

metal : add missing asserts (llama/7617)
be552ab

ggerganov HF Staff committed on

ggml : fix YARN + add tests + add asserts (llama/7617)
15da5f7

ggerganov HF Staff committed on

cuda : non-cont concat support (llama/7610)
64d3007

ggerganov HF Staff committed on

llama-bench : add support for the RPC backend (llama/7435)
d460266

rgerganov committed on

ggml : use atomic_flag for critical section (llama/7598)
68c6582

slaren committed on

examples : adapt to new ggml_concat (ggml/0)
36af6c5

ggerganov HF Staff committed on

ggml : fix typo in ggml.c (llama/7603)
f06f1cb

jeffzhou2000 committed on

Align GEMM dispatch (llama/7566)
2171dc6

hengyu committed on

sycl : fix assert (llama/7563)
b4fb287

ggerganov HF Staff committed on

vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE (llama/7552)
da90a1e

Adriankhl committed on

rpc : resource management rework (llama/7562)
7571b13

rgerganov committed on

fix ggml_sycl_mul_mat_id() to match the change of api (llama/7436)
f0ee71c

Neo Zhang committed on

ggml : generalize GGML_OP_CONCAT (llama/7563)
8d359ad

ggerganov HF Staff committed on

update HIP_UMA #7399 (llama/7414)
7097123

Djip007 slaren committed on