How to use from
Docker Model Runner
docker model run hf.co/ijwfly/Qwen3.6-35B-A3B-uncensored-heretic-mlx-8Bit
Quick Links

ijwfly/Qwen3.6-35B-A3B-uncensored-heretic-mlx-8Bit

The Model ijwfly/Qwen3.6-35B-A3B-uncensored-heretic-mlx-8Bit was converted to MLX format from llmfan46/Qwen3.6-35B-A3B-uncensored-heretic using mlx-vlm version 0.4.4.

Use with mlx-openai-server

pip install mlx-openai-server

You have to choose trade-off between model types (model-type parameter):

  • lm: prefix caching works, but vision is not supported
  • multimodal: vision works, but prefix caching is unavailable
mlx-openai-server launch \
  --model-path ijwfly/Qwen3.6-35B-A3B-uncensored-heretic-mlx-8Bit \
  --reasoning-parser qwen3_5 \
  --model-type lm \
  --tool-call-parser qwen3_coder \
  --context-length 65535 \
  --port 8432 \
  --temperature 0.6 \
  --top-p=0.95 \
  --top-k=20 \
  --min-p=0.0 \
  --presence-penalty=0.0 \
  --repetition-penalty=1.0
Downloads last month
342
Safetensors
Model size
10B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ijwfly/Qwen3.6-35B-A3B-uncensored-heretic-mlx-8Bit

Quantized
(18)
this model