[pinned] Apr 11: Updated with Google chat template fixes + more
🤗❤️ 14 · 7 replies · #24 opened 1 day ago by danielhanchen
[pinned] Gemma 4 Tool Calling is amazing in Unsloth Studio!
🔥 4 · 4 replies · #4 opened 10 days ago by danielhanchen
Apr 11 chat template causes ~7.5s template rendering overhead per request in llama.cpp
#27 opened about 7 hours ago by btdeviant
llama.cpp flags / visual token budget
#26 opened about 11 hours ago by 234r89r23u89023rui90
`D:\a\llama.cpp\llama.cpp\ggml\src\ggml-cuda\ggml-cuda.cu:911: GGML_ASSERT(tensor->view_src == nullptr) failed`
#25 opened 1 day ago by osabc
Do NOT use CUDA 13.2
❤️ 8 · #22 opened 4 days ago by danielhanchen
Gemma 4 seems to work best with high temperature for coding
👍 1 · 8 replies · #21 opened 4 days ago by Reverger
Apr 8 - New GGUF Updates
👍❤️ 14 · 10 replies · #20 opened 4 days ago by danielhanchen
GGUF updates
👍 5 · 1 reply · #17 opened 6 days ago by tstello
Ollama Error
👍 1 · 3 replies · #16 opened 6 days ago by edm-research
Inference speed on 12GB VRAM
6 replies · #15 opened 6 days ago by drakexp
Fails to run on vLLM
1 reply · #14 opened 7 days ago by Skodra
Only 2nd <13GB model to one-shot the Heptagon-Tumbler
❤️🔥 3 · 1 reply · #12 opened 8 days ago by BingoBird
New uploads add llama.cpp fixes
👍 6 · 16 replies · #11 opened 9 days ago by danielhanchen
Commit description
👍 4 · 1 reply · #10 opened 9 days ago by Kelheor
Q4_0 and Q4_1?
👍 1 · #9 opened 9 days ago by elpirater312
How to enable thinking
❤️👍 7 · 5 replies · #6 opened 10 days ago by watchingyousleep
Tool call with dates fails
2 replies · #5 opened 10 days ago by EmilPi
Model produces `<|channel><unused49><unused49><unused49>`
👍 5 · 30 replies · #2 opened 10 days ago by kyuz0