---
language:
- en
- zh
- ko
license: apache-2.0
base_model: Jackrong/Qwopus3.5-27B-v3
tags:
- unsloth
- qwen
- qwen3.5
- reasoning
- chain-of-thought
- lora
- competitive-programming
- mlx
pipeline_tag: image-text-to-text
library_name: mlx
---
# MLX-Qwopus3.5-27B-v3-vision-4bit

A **4-bit MLX** quantization of [Jackrong/Qwopus3.5-27B-v3](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3), with a few tweaks to restore its multimodal (vision) capabilities.

Supports the `enable_thinking` Jinja variable in the chat template (e.g. `{%- set enable_thinking = false %}` to disable the thinking block).
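As a rough illustration of how such a `set` line at the top of a chat template gates a thinking block, here is a minimal toy sketch using `jinja2` (requires `pip install jinja2`). This is **not** this repo's actual chat template, which is far more elaborate; it only shows the mechanism.

```python
# Toy sketch of an `enable_thinking` toggle in a Jinja chat template.
# NOT the real template shipped with this model; illustration only.
from jinja2 import Template

def render(set_line: str) -> str:
    """Render a tiny template whose behavior is gated by a leading set line."""
    tmpl = Template(
        set_line
        + "{%- if enable_thinking %}<think>{%- endif %}{{ prompt }}"
    )
    return tmpl.render(prompt="Hello")

print(render("{%- set enable_thinking = false %}"))  # -> Hello
print(render("{%- set enable_thinking = true %}"))   # -> <think>Hello
```

In practice, inference frontends flip this variable (or pass it as a template kwarg) rather than editing the template file by hand.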

> Update [2026-04-12]: Refined the chat template to further improve stability for long-running tasks and tool use; mitigated an issue where incorrect think tag formatting could leak from a distillation dataset.

---
## Quantization Details

| Property | Value |
|----------|-------|
| Method | 4-bit (4.695 bits per weight) |
| Tool | `mlx-vlm 0.4.2` via `mlx_vlm.convert` |
| Size | ~16.1GB |
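The size follows roughly from the bits-per-weight figure. A back-of-the-envelope check, assuming a parameter count of about 27e9 (inferred from the "27B" in the model name; the exact count and any layers kept at higher precision are not listed here):

```python
# Rough sanity check: on-disk size implied by bits-per-weight.
# Assumes ~27e9 parameters (from the model name); vision-tower weights
# and any unquantized layers account for the small gap vs. ~16.1 GB.

def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

approx = quant_size_gb(27e9, 4.695)
print(f"~{approx:.2f} GB")  # -> ~15.85 GB
```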

---
## Other Available Quants

| Model | Size | Quantization | Bits per weight | Multimodal |
|--------|--------|--------|--------|--------|
| [Jackrong/MLX-Qwopus3.5-27B-v3-4bit](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-4bit) | 15.15 GB | 4-bit | 4.501 | ✗ |
| [(This model)](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-4bit) | 16.08 GB | 4-bit | 4.695 | ✓ (Vision) |
| [matt-here/MLX-Qwopus3.5-27B-v3-5bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-5bit) | 18.56 GB | 5-bit | 5.501 | ✗ |
| [matt-here/MLX-Qwopus3.5-27B-v3-vision-5bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-5bit) | 19.46 GB | 5-bit | 5.678 | ✓ (Vision) |
| [Jackrong/MLX-Qwopus3.5-27B-v3-6bit](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-6bit) | 21.88 GB | 6-bit | 6.501 | ✗ |
| [matt-here/MLX-Qwopus3.5-27B-v3-vision-6bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-6bit) | 22.85 GB | 6-bit | 6.661 | ✓ (Vision) |
| [Jackrong/MLX-Qwopus3.5-27B-v3-bf16](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-bf16) | 53.81 GB | bf16 | 16 | ✗ |

> GGUF quants: [Jackrong/Qwopus3.5-27B-v3-GGUF](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3-GGUF)

---
## Credits

- [**Alibaba Qwen Team**](https://huggingface.co/Qwen) - [Qwen 3.5 27B](https://huggingface.co/Qwen/Qwen3.5-27B) dense base model
- [**Jackrong**](https://huggingface.co/Jackrong) - Claude 4.6 Opus v3 distillation work
- [**Unsloth**](https://unsloth.ai/) - Training framework
- **Apple MLX Team** - High-speed local inference on Apple Silicon