MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX

This repository contains an oMLX oQ4 mixed-precision MLX quantization of llmfan46/MiniMax-M2.7-BF16-ultra-uncensored-heretic.

This build follows the artifact lineage from llmfan46/MiniMax-M2.7-ultra-uncensored-heretic-GGUF. oMLX oQ quantization operates on MLX/safetensors checkpoints rather than GGUF files, so this build uses the corresponding BF16 safetensors checkpoint and excludes the existing GGUF quantizations.

Quantization

Field Value
Method oMLX oQ mixed-precision MLX
Quantization oQ4
Model type minimax_m2
Group size 64
Quantization mode affine
Effective plan 4.57 bpw
Layer policy entries 250
Output shards 24 safetensors
Output size 121.7 GiB

Usage

huggingface-cli download dawncr0w/MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX \
  --local-dir MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX

Then load it with an MLX-LM/oMLX runtime that supports minimax_m2:

python -m mlx_lm.generate \
  --model MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX \
  --prompt "Write a short greeting." \
  --max-tokens 64

Validation

Local validation completed with the bundled oMLX runtime on macOS:

model discovery: passed
model type: minimax_m2
quantization: bits=4, group_size=64, mode=affine
shards: 24

Source

Upstream model card license tag: other.

Downloads last month
569
Safetensors
Model size
36B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for dawncr0w/MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX

Collection including dawncr0w/MiniMax-M2.7-ultra-uncensored-heretic-oQ4-MLX