AIZYBRAIN-NANO-4B (GGUF)

AIZYBRAIN-NANO-4B is a compact, high-performance language model distributed in GGUF format and optimized for local inference. It is based on the Qwen3-4B architecture and is designed to balance reasoning depth with computational efficiency.

πŸš€ Model Overview

  • Developer: steef68
  • Architecture: Qwen3 (Causal Language Model)
  • Base Model: Qwen3-4B-Instruct
  • Parameters: 4 Billion
  • Format: GGUF
  • Context Window: 32,768 tokens
  • License: Apache 2.0

✨ Key Features

  • Reasoning Capabilities: Native support for "Chain-of-Thought" processing using <think> blocks, allowing the model to work through complex logical problems before producing its final answer.
  • Optimized for Edge Devices: Specifically tuned to run smoothly on consumer-grade hardware, including laptops with limited VRAM and CPU-only setups.
  • Multilingual Expertise: Exceptional performance in French and English, with robust understanding across 20+ additional languages.
  • High Efficiency: Utilizes Grouped-Query Attention (GQA) for faster inference and lower memory consumption.
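Because the model can emit its reasoning inside a <think>…</think> block before the final answer, client code typically separates the two before displaying output. A minimal sketch (the `split_reasoning` helper is illustrative, not part of any official API, and assumes at most one reasoning block per response):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match is None:
        # No reasoning block emitted; the whole output is the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

raw = "<think>2 apples + 3 apples = 5 apples.</think>There are 5 apples."
reasoning, answer = split_reasoning(raw)
print(answer)  # There are 5 apples.
```

In a chat UI, the reasoning portion is usually hidden or shown in a collapsible panel while only the answer is rendered.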

πŸ›  Installation and Usage

For LM Studio / AnythingLLM

  1. Search for steef68/AIZYBRAIN-NANO-4B in the app.
  2. Download the AIZYBRAIN-V2-4B.gguf file.
  3. Select the ChatML or Qwen prompt template.

For Ollama

You can use this model by creating a Modelfile:

```
FROM ./AIZYBRAIN-V2-4B.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
```