Mar 5 Update: New iMatrix + Variants
pinned, 6 reactions, 4 replies
#20 opened about 1 month ago by danielhanchen
Tool calling is broken when thinking is enabled, at least with UD-Q4_K_XL quants
1 reaction
#27 opened 18 days ago by Juodumas
Comparison to steampunque hybrid quant
#26 opened 25 days ago by Throghar
27B GGUF quants benchmark?
1 reaction, 1 reply
#25 opened about 1 month ago by rtzurtz
Instruction following and infinite loop
1 reply
#24 opened about 1 month ago by saipubw
Could you create a Qwen3.5-27B-UD-Q4_K_L or UD-Q4_K_M version that fits within 16GB VRAM?
#23 opened about 1 month ago by YuunaPhos
When will the new updated version be available?
3 reactions, 1 reply
#22 opened about 1 month ago by gopi87
[SOLVED] llama_model_load: unknown model architecture: 'qwen35'
3 replies
#21 opened about 1 month ago by drraug
Qwen3.5 27B BF16 question
1 reply
#19 opened about 1 month ago by OldMan12345
Hmm. Super slow performance on newest llama.cpp
2 replies
#15 opened about 1 month ago by jeffwadsworth
Update status for Qwen3.5 122B and 27B GGUFs?
23 reactions, 6 replies
#13 opened about 1 month ago by Chlheng
Why 'qwen35' as the architecture name?
3 replies
#12 opened about 1 month ago by kouji
Not supported on llama.cpp
2 replies
#11 opened about 1 month ago by RealBiggly
Memory error when loading 27B model on 20 GB VRAM GPU
1 reply
#10 opened about 2 months ago by selmee
LM Studio error
1 reaction, 1 reply
#9 opened about 2 months ago by rakhilml
Could you create a UD-Q2_K_XL?
5 reactions, 2 replies
#6 opened about 2 months ago by drmcbride
Ollama version
3 reactions, 11 replies
#5 opened about 2 months ago by d4rksou1
Qwen3.5-27B still thinking even with enable_thinking false; here is a method, for now, to actually disable thinking mode on Qwen3.5-27B (llama.cpp b8148)
5 reactions, 6 replies
#4 opened about 2 months ago by gannima
Q3_K_XL?
2 reactions, 6 replies
#2 opened about 2 months ago by floory