---
language:
- en
- zh
- ko
license: apache-2.0
base_model: Jackrong/Qwopus3.5-27B-v3
tags:
- unsloth
- qwen
- qwen3.5
- reasoning
- chain-of-thought
- lora
- competitive-programming
- mlx
pipeline_tag: image-text-to-text
library_name: mlx
---
# MLX-Qwopus3.5-27B-v3-vision-4bit

A **4-bit MLX** quantization of [Jackrong/Qwopus3.5-27B-v3](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3), with a few tweaks to restore the model's multimodal (vision) capabilities.

The chat template supports the `enable_thinking` Jinja variable (e.g. `{%- set enable_thinking = false %}` disables the reasoning trace).

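For a quick local test, a minimal invocation of the `mlx-vlm` CLI might look like the sketch below. The flag names follow `mlx-vlm`'s generate entry point as commonly documented, but should be verified against your installed version with `--help`; the image path is a placeholder.

```shell
pip install mlx-vlm

# Downloads the ~16 GB quant on first run; requires Apple Silicon.
# Verify flag names with: python -m mlx_vlm.generate --help
python -m mlx_vlm.generate \
  --model matt-here/MLX-Qwopus3.5-27B-v3-vision-4bit \
  --max-tokens 512 \
  --prompt "Describe this image." \
  --image ./example.png
```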
> Update [2026-04-12]: Refined the chat template to further improve stability for long-running tasks and tool use; mitigated an issue where incorrect think-tag formatting could leak in from a distillation dataset.

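When thinking is left enabled, reasoning models in this family typically emit their chain of thought inside `<think>...</think>` tags before the final answer (the exact tag name is an assumption based on the think-tag note above; check the chat template). A small stdlib-only helper to separate the two:

```python
import re

# Matches one <think>...</think> block (tag name assumed; adjust if the
# chat template uses a different reasoning delimiter).
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a model completion.

    If no think block is present (e.g. enable_thinking is false),
    reasoning is empty and the whole text is treated as the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

print(split_reasoning("<think>2+2=4</think>The answer is 4."))
# → ('2+2=4', 'The answer is 4.')
```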
---
## Quantization Details

| Property | Value |
|----------|-------|
| Method | 4-bit (4.695 bits per weight) |
| Tool | `mlx-vlm 0.4.2` via `mlx-vlm.convert` |
| Size | ~16.1 GB |

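The quantization above could be reproduced in outline with `mlx-vlm`'s convert entry point. This is a hedged sketch: the flag names follow the convention of `mlx-lm`'s converter and should be confirmed against `python -m mlx_vlm.convert --help` for version 0.4.2.

```shell
# Quantize the base model to 4-bit MLX weights (verify flags with --help)
python -m mlx_vlm.convert \
  --hf-path Jackrong/Qwopus3.5-27B-v3 \
  --mlx-path ./MLX-Qwopus3.5-27B-v3-vision-4bit \
  -q \
  --q-bits 4
```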
---
## Other Available Quants

| Model | Size | Quantization | Bits per weight | Multimodal |
|-------|------|--------------|-----------------|------------|
| [Jackrong/MLX-Qwopus3.5-27B-v3-4bit](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-4bit) | 15.15 GB | 4-bit | 4.501 | ❌ |
| [(This model)](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-4bit) | 16.08 GB | 4-bit | 4.695 | ✅ (Vision) |
| [matt-here/MLX-Qwopus3.5-27B-v3-5bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-5bit) | 18.56 GB | 5-bit | 5.501 | ❌ |
| [matt-here/MLX-Qwopus3.5-27B-v3-vision-5bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-5bit) | 19.46 GB | 5-bit | 5.678 | ✅ (Vision) |
| [Jackrong/MLX-Qwopus3.5-27B-v3-6bit](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-6bit) | 21.88 GB | 6-bit | 6.501 | ❌ |
| [matt-here/MLX-Qwopus3.5-27B-v3-vision-6bit](https://huggingface.co/matt-here/MLX-Qwopus3.5-27B-v3-vision-6bit) | 22.85 GB | 6-bit | 6.661 | ✅ (Vision) |
| [Jackrong/MLX-Qwopus3.5-27B-v3-bf16](https://huggingface.co/Jackrong/MLX-Qwopus3.5-27B-v3-bf16) | 53.81 GB | bf16 | 16 | ❌ |

> GGUF quants: [Jackrong/Qwopus3.5-27B-v3-GGUF](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3-GGUF)

---
## Credits

- [**Alibaba Qwen Team**](https://huggingface.co/Qwen) - [Qwen 3.5 27B](https://huggingface.co/Qwen/Qwen3.5-27B) dense model
- [**Jackrong**](https://huggingface.co/Jackrong) - Claude 4.6 Opus v3 distillation work
- [**Unsloth**](https://unsloth.ai/) - Training framework
- **Apple MLX Team** - High-speed local inference on Apple Silicon