Commit History
examples : add HEAPU8 to all of the exported runtime methods (#3134) a4fc5fb unverified
Enes Grahovac committed on
wasm : add note about worker.js file generation [no ci] (#3133) c6a619d unverified
whisper : deprecate WHISPER_CCACHE CMake option (#3131) c4aa3ee unverified
stream.wasm : add HEAPU8 to exported runtime methods (#3130) df2c5e7 unverified
sync : ggml 87f0773
cuda : remove nrows_x in mul_mat_q_process_tile (llama/13325) 0fd6120
R0CKSTAR committed on
CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (llama/13135) 9fb68a1
SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (llama/13254) 53abb97
CUDA: fix bad asserts for partial offload (llama/13337) 23e676b
CUDA: fix --split-mode row for MMQ (llama/13323) 1136116
CUDA: fix logic for clearing padding with -ngl 0 (llama/13320) c3e51a2
SYCL: Disable mul_mat kernels for noncontiguous tensor b (llama/13308) 3628417
rpc : use backend registry, support dl backends (llama/13304) 0286805
Diego Devesa committed on
ggml : activate s390x simd for Q3_K (llama/13301) 1bfe279
CUDA: fix race condition in MMQ stream-k fixup (llama/13299) 160742f
CUDA: fix race condition in MMQ ids_dst (llama/13294) d249810
vulkan: Additional type support for unary, binary, and copy (llama/13266) b9cb11e
ci : add bindings-java jar artifact to release (#3126) 03b0716 unverified
cli : avoid std::exchange ba2be5c
sync : ggml 27f99b0
vulkan : fix lint (llama/0) 49be727
ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) 7da5bcc
rpc : avoid uninitialized memory in serialize_tensor (llama/13210) 31cad24
Justin Santa Barbara committed on
ggml: Don't assert fail when tensor data changes (llama/13222) af16d74
Jesse Gross committed on
build : fix build info on windows (llama/13239) 415b9fc
Diego Devesa committed on
vulkan: Add bfloat16 support (llama/12554) b21f8a1
vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) 710fdcf
vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) 43d9f3e
ci : zip windows artifacts for release uploading (#3124) 3dbef6c unverified
ci : add zip extension to xcframework artifact name (#3120) a8a2519 unverified
whisper: remove MSVC warnings pragmas (#3090) e0d130c unverified
server: update abort mechanism to handle HTTP connection closure (#3112) 02b25fa unverified
cli : support "-" for stdout like stdin (#3050) 7e3c27c unverified
Daniel Tang committed on
docs : Update cli documentation (#3102) 8566207 unverified
cmake : removed stdc++fs (#3097) e715962 unverified
server : update httplib.h to version 0.20.0 (#3101) 238f652 unverified
ruby : refine HTTP cache feature (#3109) f1d4a23 unverified
talk-llama : sync llama.cpp 05fda4a
sync : ggml 6d29e32
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (llama/13199) a867083
vulkan: use uint array index to avoid glslang bug (llama/13193) fd2d86d
ggml : fix ppc64le build (llama/13176) 07ec79f
feat(ggml-cpu): enable z17 compile (llama/13182) 10f7d18
Aaron Teo committed on
CUDA: fix non-cont. inputs for batched mat mul (llama/13155) d13b876
fix(rpc): Improve input validation and error handling (llama/13069) 9e9f2fe
Ville Vesilehto committed on
SYCL: Add all missing unary kernels (llama/13074) d2ce872
musa: fix typo in cc control (llama/13144) 5fb7320
R0CKSTAR committed on
CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137) e9c9d4b
musa: fix build warning (llama/13129) 3436ba4
R0CKSTAR committed on