AIZYBRAIN-NANO-4B (GGUF)
AIZYBRAIN-NANO-4B is a compact language model distributed in GGUF format and optimized for local inference. Based on the Qwen3-4B architecture, it aims to balance reasoning depth with computational efficiency.
Model Overview
- Developer: steef68
- Architecture: Qwen3 (Causal Language Model)
- Base Model: Qwen3-4B-Instruct
- Parameters: 4 Billion
- Format: GGUF
- Context Window: 32,768 tokens
- License: Apache 2.0
Key Features
- Reasoning Capabilities: Native support for "Chain-of-Thought" processing using `<think>` blocks, allowing the model to work through complex logical problems before answering.
- Optimized for Edge Devices: Specifically tuned to run smoothly on consumer-grade hardware, including laptops with limited VRAM and CPU-only setups.
- Multilingual Expertise: Exceptional performance in French and English, with robust understanding across 20+ additional languages.
- High Efficiency: Utilizes Grouped-Query Attention (GQA) for faster inference and lower memory consumption.
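A minimal sketch of working with the `<think>` reasoning blocks in Python. The tag name follows the Qwen3 convention described above; the helper function itself is illustrative and not part of any shipped API:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning block from the final answer.

    Returns (reasoning, answer); reasoning is empty if no block is present.
    """
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

raw = "<think>2 + 2 = 4, so the answer is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(answer)  # The answer is 4.
```

This lets an application log or display the model's intermediate reasoning separately from the answer shown to the user.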
Installation and Usage
For LM Studio / AnythingLLM
- Search for `steef68/AIZYBRAIN-NANO-4B` in the app.
- Download the `AIZYBRAIN-V2-4B.gguf` file.
- Select the ChatML or Qwen prompt template.
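The ChatML template mentioned above wraps each message in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt by hand, useful when calling the GGUF model through a raw-completion API (the function name is illustrative):

```python
def format_chatml(system: str, user: str) -> str:
    """Build a ChatML prompt: a system message, a user turn,
    and an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = format_chatml("You are a helpful assistant.", "Bonjour !")
print(prompt)
```

Apps like LM Studio apply this template automatically when you select ChatML; manual formatting is only needed for raw completion endpoints.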
For Ollama
You can use this model by creating a Modelfile:
```
FROM ./AIZYBRAIN-V2-4B.gguf
PARAMETER temperature 0.7
TEMPLATE "{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"
```