This is an experimental 4-bit quantization of the dense Qwen3.5-27B, built using the unsloth imatrix data, with the following per-tensor override rules applied:
IQ4_NL script:

```sh
QUANT="IQ4_NL"
llama-quantize \
  --output-tensor-type q8_0 \
  --token-embedding-type q8_0 \
  --tensor-type attn_qkv=q8_0 \
  --tensor-type attn_k=bf16 \
  --tensor-type attn_v=bf16 \
  --tensor-type attn_q=q8_0 \
  --tensor-type attn_output=q8_0 \
  --tensor-type attn_gate=q8_0 \
  --tensor-type ssm_ba=bf16 \
  --tensor-type ssm_beta=bf16 \
  --tensor-type ssm_alpha=bf16 \
  --tensor-type ssm_out=q8_0 \
  --imatrix Qwen3.5-27B-imatrix.gguf_file \
  Qwen3.5-27B-BF16-00001-of-00002.gguf \
  Qwen3.5-27B.${QUANT}.gguf \
  ${QUANT}
```
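The scripts below differ from this one only in the base quant and a few per-tensor overrides, so the shared flags can be factored into a small wrapper. A minimal sketch (the `quantize_cmd` helper and its dry-run `echo` are my own illustration, not part of the original scripts):

```sh
#!/bin/sh
# Hypothetical wrapper: assembles a llama-quantize command line from a base
# quant type plus a list of per-tensor overrides, and echoes it instead of
# running it, so the invocation can be reviewed before a long quantization.
quantize_cmd() {
  quant="$1"; shift
  cmd="llama-quantize --output-tensor-type q8_0 --token-embedding-type q8_0"
  for override in "$@"; do
    cmd="$cmd --tensor-type $override"
  done
  cmd="$cmd --imatrix Qwen3.5-27B-imatrix.gguf_file"
  cmd="$cmd Qwen3.5-27B-BF16-00001-of-00002.gguf Qwen3.5-27B.$quant.gguf $quant"
  echo "$cmd"
}

# Prints the full command line for review without touching any files.
quantize_cmd IQ4_NL attn_k=bf16 attn_v=bf16
```

Piping the echoed command to `sh` (or dropping the `echo`) would run it for real.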
IQ4_XS script:

```sh
QUANT="IQ4_XS"
llama-quantize \
  --output-tensor-type Q6_K \
  --token-embedding-type Q6_K \
  --tensor-type attn_qkv=q8_0 \
  --tensor-type attn_k=bf16 \
  --tensor-type attn_v=bf16 \
  --tensor-type attn_q=Q6_K \
  --tensor-type attn_output=q8_0 \
  --tensor-type attn_gate=q8_0 \
  --tensor-type ssm_ba=bf16 \
  --tensor-type ssm_beta=bf16 \
  --tensor-type ssm_alpha=bf16 \
  --tensor-type ssm_out=q8_0 \
  --tensor-type ffn_down=Q5_K \
  --imatrix Qwen3.5-27B-imatrix.gguf_file \
  BF16/Qwen3.5-27B-BF16-00001-of-00002.gguf \
  Qwen3.5-27B.${QUANT}.gguf \
  ${QUANT}
```
BONUS TRACK: For users of ik_llama.cpp, I've added an iq4_k version as well:

```sh
QUANT="iq4_k"
llama-quantize \
  --output-tensor-type iq6_k \
  --token-embedding-type iq6_k \
  --custom-q attn_qkv=iq6_k \
  --custom-q attn_k=bf16 \
  --custom-q attn_v=bf16 \
  --custom-q attn_q=iq6_k \
  --custom-q attn_output=iq6_k \
  --custom-q attn_gate=iq6_k \
  --custom-q ssm_ba=bf16 \
  --custom-q ssm_beta=bf16 \
  --custom-q ssm_alpha=bf16 \
  --custom-q ssm_out=q8_0 \
  --custom-q ffn_down=iq5_k \
  --imatrix Qwen3.5-27B-imatrix.dat \
  BF16/Qwen3.5-27B-BF16-00001-of-00002.gguf \
  Qwen3.5-27B.${QUANT}.ik.gguf \
  ${QUANT}
```
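Since per-tensor overrides like these select tensors by name pattern, it can help to preview which tensor names a pattern will actually catch before committing to a long quantization run. A minimal sketch with hypothetical tensor names (real names come from dumping the GGUF):

```sh
# Hypothetical tensor names for illustration; a real list would come from
# inspecting the source GGUF. Count how many an "attn_k" pattern matches.
printf '%s\n' \
  blk.0.attn_k.weight \
  blk.0.attn_v.weight \
  blk.0.ffn_down.weight |
grep -c 'attn_k'
```

Here only `blk.0.attn_k.weight` matches, so `grep -c` prints 1.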
Model tree for dinerburger/Qwen3.5-27B-GGUF — base model: Qwen/Qwen3.5-27B