Commit History
Fix conditional enabling following arch checks for ggml-sycl (llama/14504) 1f15602
Nicolò Scipione committed on
kv-cache : use ggml_set_rows (llama/14285) 7d6d9e8
ggml : fix FA mask dim 2 and 3 (llama/14505) a89dc81
CUDA: add dynamic shared mem to softmax, refactor general usage (llama/14497) 8e1f56c
llama : initial Mamba-2 support (llama/9126) 1b4087e
CUDA: add softmax broadcast (llama/14475) 05351ac
CUDA: broadcasting for FlashAttention mask (llama/14500) 47e02a8
vulkan: support softmax/FA batch and broadcast (llama/14449) f6b0b76
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435) ebacb3e
opencl : fix possible buffer overflow in dump_tensor (llama/14490) deb934d
opencl : skip empty nodes on cgraph compute (llama/14491) 5c36e7c
Eric Zhang committed on
opencl : update upscale to support align corners (llama/14488) 2b95b05
lhez committed on
ggml : Callback before abort (llama/14481) ccee17d
ci : disable fast-math for Metal GHA CI (llama/14478) ec4b1b3
CANN: update aclnnGroupedMatmulV2 to aclnnGroupedMatmulV3 (llama/14411) d8d5b0b
Chenguang Li committed on
vulkan: Split large mul_mat_id to fit in shared memory (llama/14451) bf678f0
add GELU_ERF (llama/14455) 235ebf7
Sigbjørn Skjæret committed on
vulkan : implement bilinear interpolation for ggml_upscale/ggml_interpolate (ggml/1291) 666e65b
vulkan : implement ggml_roll (ggml/1290) 968f9e8
server : add dtw.params for v3-large-turbo (#3307) 1250fd1 unverified
accessiblepixel committed on
feat: support vad for addon.node (#3301) f795870 unverified
Lin Xiaodong (linxiaodong) committed on
sync : ggml 4966aed
talk-llama : sync llama.cpp 0ec1374
ggml : remove trailing whitespace (llama/0) e37767f
opencl : add GEGLU, REGLU, SWIGLU (llama/14456) d70ff9f
lhez committed on
Add Conv2d for CPU (llama/14388) 68eb27a
metal : disable fast-math for some cpy kernels (llama/14460) 9d1185a
ggml-cpu: sycl: Re-enable exp f16 (llama/14462) c0fcd7a
Romain Biessy committed on
cmake : Remove redundant include path in CMakeLists.txt (llama/14452) 6b59b68
xiaobing318 committed on
scripts : make the shell scripts cross-platform (llama/14341) 9de52c8
Vedran Miletić committed on
SYCL: disable faulty fp16 exp kernel (llama/14395) b4969ff
ggml : fix unmerged GGML_FPxx_TO_FPxx refactoring (llama/14443) f7995cb
Sigbjørn Skjæret committed on
ggml : implement REGLU/GEGLU/SWIGLU ops (llama/14158) add5c0f
vulkan: Add fusion support for RMS_NORM+MUL (llama/14366) 737f12d
CUDA: add bf16 and f32 support to cublas_mul_mat_batched (llama/14361) c7936d3
vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378) 1c3b94c
vulkan: lock accesses of pinned_memory vector (llama/14333) 59dca4f
fix async_mode bug (llama/14432) 122b29a
vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (llama/14427) a06c8ca
cmake: regen vulkan shaders when shaders-gen sources change (llama/14398) 7988638
bandoti committed on
metal : add special-case mat-vec mul for ne00 == 4 (llama/14385) 724622d
metal : batch rows copy in a single threadgroup (llama/14384) b4ff704
musa: enable fp16 mma (all) and cublas on qy2 (llama/13842) e35329b
ggml-cpu: enable IBM NNPA Vector Intrinsics (llama/14317) fea8f94
ggml : do not output unprintable characters on GGUF load failure (llama/14381) e7b2e19
Sigbjørn Skjæret committed on
sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (llama/13973) b25d3bf
Anton Mitkov committed on
opencl: ref count `ggml_backend_opencl_context` and refactor profiling (llama/14254) ae0c7b8
lhez committed on