Resources

Mar 5 Update: New iMatrix + Variants

pinned

❤️ 6

#20 opened about 1 month ago by

danielhanchen

Tool calling is broken when thinking is enabled, at least with UD-Q4_K_XL quants

👀 1

#27 opened 18 days ago by

Juodumas

Comparison to steampunque hybrid quant

#26 opened 25 days ago by

Throghar

27B GGUF quants benchmark?

👀 1

#25 opened about 1 month ago by

rtzurtz

Instruction following and infinite loop

#24 opened about 1 month ago by

saipubw

Could you create a Qwen3.5-27B-UD-Q4_K_L or UD-Q4_K_M version that fits within 16GB VRAM?

#23 opened about 1 month ago by

YuunaPhos

When will the new updated version be released?

👀👍 3

#22 opened about 1 month ago by

gopi87

[SOLVED] llama_model_load: unknown model architecture: 'qwen35'

#21 opened about 1 month ago by

drraug

Qwen3.5 27B BF16 question

#19 opened about 1 month ago by

OldMan12345

Hmm. Super slow performance on newest llama.cpp

#15 opened about 1 month ago by

jeffwadsworth

Update status for Qwen3.5 122B and 27B GGUFs?

👍👀 23

#13 opened about 1 month ago by

Chlheng

WHY qwen35 as the architecture name?

#12 opened about 1 month ago by

kouji

Not supported on llama.cpp

#11 opened about 1 month ago by

RealBiggly

Memory error when loading 27B model on 20GB VRAM GPU

#10 opened about 2 months ago by

selmee

LM Studio error

👀 1

#9 opened about 2 months ago by

rakhilml

Could you create a UD-Q2_K_XL?

🔥👀 5

#6 opened about 2 months ago by

drmcbride

OLLAMA version

👍 3

#5 opened about 2 months ago by

d4rksou1

Qwen3.5-27B still thinking even with enable_thinking false, here is the method for now to actually disable Thinking Mode on Qwen3.5-27B (llama.cpp b8148)

👀👍 5

#4 opened about 2 months ago by

gannima

Q3_K_XL?

👍 2

#2 opened about 2 months ago by

floory