# AxionML Qwen3.5-27B-NVFP4 GGUF
This repository contains a GGUF conversion of AxionML/Qwen3.5-27B-NVFP4 for llama.cpp-compatible runtimes such as LM Studio.
## Files
- `AxionML-Qwen3.5-27B-NVFP4.gguf`: main model
- `mmproj-BF16.gguf`: multimodal projector for image support
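For image input, the main model and the projector are loaded together. A minimal sketch, assuming a recent llama.cpp build that ships the `llama-mtmd-cli` multimodal tool; the file paths and prompt are illustrative:

```bash
# Run the main model together with the multimodal projector for image input.
# Assumes a recent llama.cpp build that includes llama-mtmd-cli.
llama-mtmd-cli \
  -m AxionML-Qwen3.5-27B-NVFP4.gguf \
  --mmproj mmproj-BF16.gguf \
  --image ./example.png \
  -p "Describe this image."
```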
## Notes
- The main GGUF was converted from the Hugging Face NVFP4 checkpoint using `convert_hf_to_gguf.py` (see the conversion sketch after this list)
- Conversion was performed with `--outtype bf16`, producing a mixed-format GGUF that preserves supported tensor types and stores the required auxiliary tensors in floating point
- `mmproj-BF16.gguf` is used for image support
- The `mmproj-BF16.gguf` file was sourced from unsloth/Qwen3.5-27B-GGUF and verified working in local testing
- Original source model: AxionML/Qwen3.5-27B-NVFP4
- Base model family: Qwen/Qwen3.5-27B
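A minimal sketch of that conversion step, assuming a local llama.cpp checkout; the input and output paths are illustrative:

```bash
# Convert the Hugging Face checkpoint to GGUF with llama.cpp's converter.
# --outtype bf16 writes floating-point tensors as bf16; paths are illustrative.
python convert_hf_to_gguf.py ./AxionML-Qwen3.5-27B-NVFP4 \
  --outtype bf16 \
  --outfile AxionML-Qwen3.5-27B-NVFP4.gguf
```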
## License
Please refer to the upstream model license and attribution requirements.
## Qwen 3.5 vs Qwen 3 Benchmark Overview
Higher is better.
This repository is based on Qwen3.5-27B, one of the strongest balanced models in the Qwen family.
| Model | Knowledge & STEM | Instruction Following | Long Context | Math | Coding | General Agent | Multilingualism |
|---|---|---|---|---|---|---|---|
| Qwen3-235B-A22B | 83 | 63 | 57 | 87 | 54 | 56 | 75 |
| Qwen3.5-122B-A10B | 85 | 76 | 63 | 91 | 59 | 75 | 79 |
| Qwen3-Next-80B-A3B-Thinking | 80 | 67 | 50 | 77 | 49 | 53 | 71 |
| Qwen3.5-35B-A3B | 84 | 74 | 58 | 89 | 55 | 74 | 77 |
| Qwen3-30B-A3B-Thinking-2507 | 78 | 62 | 47 | 68 | 46 | 42 | 69 |
| Qwen3.5-27B | 84 | 77 | 63 | 91 | 60 | 74 | 79 |
| Qwen3.5-9B | 80 | 70 | 59 | 83 | 47 | 73 | 73 |
| Qwen3.5-4B | 76 | 66 | 53 | 75 | 40 | 64 | 68 |
| Qwen3-4B-2507 | 72 | 59 | 37 | 63 | N/A | 41 | 61 |
| Qwen3.5-2B | 64 | 51 | 32 | 21 | N/A | 46 | 52 |
| Qwen3-1.7B | 57 | 42 | 17 | 9 | N/A | 18 | 47 |
| Qwen3.5-0.8B | 43 | 28 | 16 | N/A | N/A | N/A | 37 |
Benchmark note: the comparison table above is a community summary; the visual overview is based on this Reddit post: Visualizing all Qwen 3.5 vs Qwen 3 benchmarks
For official benchmark details from the Qwen team, see the benchmark section of: Qwen/Qwen3.5-27B
This repository is a GGUF conversion of AxionML/Qwen3.5-27B-NVFP4, stored as a mixed-format GGUF with native NVFP4 weights plus floating-point auxiliary tensors where required by the conversion/runtime.
Quantized checkpoints may show small quality differences relative to their base models on these benchmarks.
## Why Qwen3.5-27B stands out
Qwen3.5-27B delivers one of the strongest overall quality-to-size tradeoffs in the entire Qwen family.
Key highlights:
- Best reported coding score in this comparison with 60, ahead of every other listed model with a published coding result
- Top-tier math performance with 91, matching the strongest model in the table
- Excellent instruction following with 77
- Strong multilingual capability with 79
- Very strong long-context performance with 63, while remaining practical to run locally
In short, Qwen3.5-27B is a particularly compelling choice for users who want strong coding ability, excellent reasoning, large-context usability, and multilingual performance without stepping up to the largest flagship models.
## Practical local performance
In local testing on an NVIDIA GeForce RTX 5090 32GB, this GGUF build sustains 50+ tok/s across 80K–96K context windows.
That makes it especially attractive for:
- long-document analysis
- large codebase work
- multi-file reasoning
- extended chat sessions
- retrieval-heavy workflows
Performance note: this is a local test result on RTX 5090 32GB hardware. Actual throughput will vary depending on runtime version, context length, batch settings, prompt shape, and sampling configuration.
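For reference, a long-context run like the one above can be launched with a llama.cpp server along these lines. This is a sketch; the context size, GPU offload, and port values are illustrative assumptions rather than tested settings:

```bash
# Serve the model with a ~96K context window, offloading all layers to GPU.
# Values are illustrative; tune -c and -ngl for your hardware and runtime.
llama-server \
  -m AxionML-Qwen3.5-27B-NVFP4.gguf \
  --mmproj mmproj-BF16.gguf \
  -c 98304 \
  -ngl 99 \
  --port 8080
```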
## At a glance
Qwen3.5-27B combines:
- Best reported coding score in this comparison
- Top-tier math performance
- Strong long-context capability
- Excellent multilingual and instruction-following performance
- 50+ tok/s at 96K context in local RTX 5090 32GB testing
That makes this GGUF release a particularly strong option for users who want a model that is both high quality and practical to run locally.