AIZYBRAIN-NANO-4B (GGUF)

AIZYBRAIN-NANO-4B is a compact, high-performance language model distributed in GGUF format and optimized for local inference. It is based on the Qwen3-4B architecture and is designed to balance reasoning depth with computational efficiency.

πŸš€ Model Overview

  • Developer: steef68
  • Architecture: Qwen3 (Causal Language Model)
  • Base Model: Qwen3-4B-Instruct
  • Parameters: 4 Billion
  • Format: GGUF
  • Context Window: 32,768 tokens
  • License: Apache 2.0

✨ Key Features

  • Reasoning Capabilities: Native support for "Chain-of-Thought" processing using <think> blocks, allowing the model to work through complex logical problems before producing its final answer.
  • Optimized for Edge Devices: Specifically tuned to run smoothly on consumer-grade hardware, including laptops with limited VRAM and CPU-only setups.
  • Multilingual Expertise: Exceptional performance in French and English, with robust understanding across 20+ additional languages.
  • High Efficiency: Utilizes Grouped-Query Attention (GQA) for faster inference and lower memory consumption.
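Because the model can emit its reasoning inside a <think>…</think> block before the final answer, client code typically separates the two before displaying output. A minimal sketch (the `split_reasoning` helper is illustrative, not part of any official API, and assumes at most one reasoning block per response):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        # No reasoning block emitted; the whole output is the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

raw = "<think>2 apples + 3 apples = 5 apples.</think>There are 5 apples."
reasoning, answer = split_reasoning(raw)
print(answer)  # There are 5 apples.
```

In a chat UI, the reasoning portion is usually hidden or shown in a collapsible panel while only the answer is rendered.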

πŸ›  Installation and Usage

For LM Studio / AnythingLLM

  1. Search for steef68/AIZYBRAIN-NANO-4B in the app.
  2. Download the AIZYBRAIN-V2-4B.gguf file.
  3. Select the ChatML or Qwen prompt template.

For Ollama

You can use this model by creating a Modelfile:

```
FROM ./AIZYBRAIN-V2-4B.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
```