Configuration Parsing Warning:Config file tokenizer_config.json cannot be fetched (too big)

Mistral Small 3.2 24B Instruct — 8-bit MLX

8-bit MLX quantization of mistralai/Mistral-Small-3.2-24B-Instruct-2506, for Apple Silicon (~23 GB). Text-only build (vision tower not included) — run it with mlx-lm, not mlx-vlm. Ships a LlamaTokenizerFast tokenizer (vocab 131072) and chat template.

Usage

pip install -U mlx-lm

mlx_lm.generate \
  --model TyKaoz/Mistral-Small-3.2-24B-Instruct-2506-8bit \
  --prompt "Explique la quantization en une phrase." \
  --max-tokens 200

Base	Tool	Precision	Size
`mistralai/Mistral-Small-3.2-24B-Instruct-2506`	`mlx-lm`	8-bit · group 64	~23 GB

By TyKaoz — privacy-first native macOS LLM chat client. Apache 2.0, inherited from the base model.

Downloads last month: 188

Safetensors

Model size

24B params

Tensor type

BF16

U32

MLX

Hardware compatibility

8-bit

Model tree for TyKaoz/Mistral-Small-3.2-24B-Instruct-2506-8bit

Base model

mistralai/Mistral-Small-3.1-24B-Base-2503

Finetuned

mistralai/Mistral-Small-3.2-24B-Instruct-2506

Quantized

(62)

this model