# Qapricorn-4B
"Nusquam est qui ubique est" – He who is everywhere is nowhere.
Qapricorn is where it needs to be: reasoning, not an encyclopedia.
## Model Description
Qapricorn-4B is a fine-tuned version of Qwen3-4B with enhanced reasoning capabilities, Polish language support, and coding skills. The model was trained using QLoRA with a carefully curated multilingual dataset combining mathematical reasoning, coding tasks, and Polish language data.
The name Qapricorn comes from:
- Qa – Qapla' (Klingon for "Success")
- pricorn – Capricorn (perseverance, patience)

A model built with persistence, on a free GPU, step by step.
## Key Features
- Native `<think>` reasoning – inherited from Qwen3-4B and strengthened through training
- Polish language support – thinks and responds in Polish when prompted in Polish
- Coding capability – Python, algorithms, data structures
- Mathematical reasoning – step-by-step problem solving
- Multilingual – seamlessly switches between Polish and English based on prompt language
## Reasoning Example
Prompt: Stała Kaprekara to liczba 6174. Zastosuj algorytm dla liczby 1234. (English: "The Kaprekar constant is 6174. Apply the algorithm to the number 1234.")
```text
<think>
Algorytm Kaprekara działa tak:
- Iteracja 1: 4321 - 1234 = 3087
- Iteracja 2: 8730 - 0378 = 8352
- Iteracja 3: 8532 - 2358 = 6174 ✓
Warto też wspomnieć, że dla liczb trzycyfrowych analogiczna stała to 495...
</think>

Algorytm Kaprekara dla 1234:
- Iteracja 1: 4321 - 1234 = 3087
- Iteracja 2: 8730 - 0378 = 8352
- Iteracja 3: 8532 - 2358 = 6174 ✓
```
The model correctly solved the Kaprekar constant problem AND independently noted the analogous constant (495) for 3-digit numbers, demonstrating genuine reasoning rather than pattern matching.
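For reference, the routine the model walks through can be reproduced in a few lines of Python. This is a minimal sketch for 4-digit inputs with at least two distinct digits (repdigits like 1111 collapse to 0 and never reach the constant):

```python
def kaprekar_step(n: int) -> int:
    """One step of Kaprekar's routine for 4-digit numbers:
    sort the digits descending and ascending, then subtract."""
    digits = f"{n:04d}"  # keep leading zeros, e.g. 378 -> "0378"
    hi = int("".join(sorted(digits, reverse=True)))
    lo = int("".join(sorted(digits)))
    return hi - lo

def kaprekar_iterations(n: int) -> list[int]:
    """Iterate until the constant 6174 is reached."""
    steps = []
    while n != 6174:
        n = kaprekar_step(n)
        steps.append(n)
    return steps

print(kaprekar_iterations(1234))  # [3087, 8352, 6174]
```

The three intermediate values match the model's trace above.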
## Training Details

### Base Model
- Model: `unsloth/Qwen3-4B-unsloth-bnb-4bit`
- Method: QLoRA (4-bit quantization)
- LoRA rank: r=16, alpha=32
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable parameters: 33M / 4B (0.81%)
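The 33M figure can be sanity-checked from the LoRA configuration. The sketch below uses assumed Qwen3-4B shapes (36 layers, hidden size 2560, 32 query heads and 8 KV heads of dim 128, FFN 9728); these dimensions are illustrative assumptions, not read from the model config:

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA adds two small matrices per targeted linear layer:
    # A is (r x d_in) and B is (d_out x r).
    return r * (d_in + d_out)

# Assumed Qwen3-4B shapes (illustrative, not from the config).
r, layers = 16, 36
hidden, q_out, kv_out, ffn = 2560, 4096, 1024, 9728

per_layer = (
    lora_params(hidden, q_out, r)         # q_proj
    + 2 * lora_params(hidden, kv_out, r)  # k_proj, v_proj
    + lora_params(q_out, hidden, r)       # o_proj
    + 2 * lora_params(hidden, ffn, r)     # gate_proj, up_proj
    + lora_params(ffn, hidden, r)         # down_proj
)
total = per_layer * layers
print(f"{total:,}")  # 33,030,144 under these assumed shapes
```

Under these assumptions the count lands at roughly 33M, consistent with the reported figure.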
### Training Infrastructure
- Platform: Kaggle (free tier) – 1x Tesla T4 16GB
- Framework: Unsloth 2026.3.3 + TRL
- Total steps: 1500
- Dataset coverage: ~33% of training data
### Training Phases
| Phase | Steps | LR | Scheduler | Notes |
|---|---|---|---|---|
| 1 | 0โ100 | 2e-4 | linear | Initial learning |
| 2 | 100โ300 | 5e-5 | cosine | Main training |
| 3 | 300โ600 | 5e-5 | cosine | Stabilization |
| 4 | 600โ1100 | 1e-5 | cosine | Fine-tuning |
| 5 | 1100โ1500 | 5e-6 | cosine | Final polish |
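Each phase above restarts a cosine cycle at its own peak learning rate. A minimal sketch of per-phase cosine decay (the warmup behavior and minimum LR are assumptions, not taken from the training logs):

```python
import math

def cosine_lr(step: int, start: int, end: int,
              lr_max: float, lr_min: float = 0.0) -> float:
    """Cosine decay from lr_max to lr_min over steps [start, end],
    restarted at each phase boundary."""
    t = (step - start) / (end - start)  # progress within the phase, 0..1
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

print(cosine_lr(100, 100, 300, 5e-5))  # 5e-05 at the start of phase 2
```

At a phase boundary the schedule jumps back up to the new peak, which is why the loss log below shows a bump at the start of phase 3.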
### Loss Progression

```text
Phase 1: ~1.8 → 1.1 (rapid learning)
Phase 2: avg 0.755 (best convergence)
Phase 3: avg 1.114 (new cosine cycle)
Phase 4: avg 1.076 (gradual improvement)
Phase 5: avg 1.046 (final polish, min=0.600)
```
## Dataset

Mixed multilingual dataset (~48k records, 269 MB):
| Source | Records | Type |
|---|---|---|
| Math Reasoning PL | 25,357 | Math + `<think>` CoT |
| Claude Opus 4.5 | 250 | Coding + `<think>` |
| Polish Language CSV | 7,799 | Polish language |
| Synthia-Coder v1.5 | 14,982 | Coding instructions |
| **Total** | **48,388** | |
All datasets were converted to ChatML format with `<think>` blocks preserved.
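One plausible shape of that conversion (a sketch; the actual training script and field names are assumptions, though the `<|im_start|>`/`<|im_end|>` token layout is standard ChatML):

```python
def to_chatml(system: str, user: str, assistant: str) -> str:
    """Render one record as a ChatML string. The <think> block
    stays inside the assistant turn, so the model learns to emit
    its reasoning before the final answer."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

sample = to_chatml(
    "You are a helpful assistant.",
    "What is 2 + 2?",
    "<think>\n2 + 2 = 4\n</think>\n4",
)
print(sample)
```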
## Available Formats
| File | Size | Recommended for |
|---|---|---|
| `qapricorn_f16.gguf` | ~7.5 GB | Further quantization |
| `qapricorn_Q8_0.gguf` | ~4.3 GB | Best quality, 6GB+ VRAM |
| `qapricorn_Q6_K.gguf` | ~3.3 GB | Great quality, 4GB+ VRAM |
| `qapricorn_Q4_K_M.gguf` | ~2.5 GB | Best speed/quality ratio |
## Usage
### llama.cpp

```shell
./llama-cli \
  -m qapricorn_Q4_K_M.gguf \
  -p "Czym jest ciąg geometryczny? Wyjaśnij krok po kroku." \
  --temp 0.6 \
  --top-p 0.95 \
  -n 512
```

(The prompt asks: "What is a geometric sequence? Explain step by step.")
### llama-server

```shell
./llama-server \
  -m qapricorn_Q4_K_M.gguf \
  --port 8080 \
  --ctx-size 4096 \
  --temp 0.6
```
### Python (transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("HattoriHanzo1/Qapricorn-4B-merged-bf16")
tokenizer = AutoTokenizer.from_pretrained("HattoriHanzo1/Qapricorn-4B-merged-bf16")

messages = [
    {"role": "system", "content": "Jesteś pomocnym asystentem AI."},
    {"role": "user", "content": "Wyjaśnij algorytm Kaprekara."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Thinking Mode

Qapricorn inherits Qwen3's thinking toggle:

- `/think` – enable thinking (default)
- `/no_think` – disable thinking for faster responses
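Following the Qwen3 convention, the soft switch is simply appended to the user message; a minimal sketch (the `with_thinking` helper is hypothetical, added here for illustration):

```python
def with_thinking(content: str, think: bool) -> str:
    # Hypothetical helper: append Qwen3's soft switch to the user turn.
    return f"{content} {'/think' if think else '/no_think'}"

messages = [
    {"role": "user",
     "content": with_thinking("Wyjaśnij algorytm Kaprekara.", think=False)},
]
print(messages[0]["content"])  # ends with "/no_think"
```

Qwen3's chat template also exposes an `enable_thinking` argument to `apply_chat_template` for toggling the mode programmatically.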
## Benchmarks (Qualitative)
| Task | Result |
|---|---|
| Kaprekar constant (4B model) | ✓ Solved + found 3-digit analogy |
| Polish mathematical reasoning | ✓ Step-by-step with verification |
| Linked list reversal (Python) | ✓ With edge cases |
| Lateral thinking puzzles | ✓ Level 3/4 reasoning |
| Sentiment analysis (Polish) | ✓ With justification |
| Multilingual switching | ✓ Auto-detects prompt language |
## Limitations

- Biology and natural-science responses may contain occasional hallucinations (a dataset gap)
- 4B parameter scale – complex multi-step reasoning may be less reliable than in larger models
- Training covered ~33% of the available dataset – further training is possible
## Related Repositories

- Qapricorn-4B-merged-bf16 – full merged model
- Qapricorn-4B-adapter-1500 – LoRA adapter (1500 steps)
- Qapricorn-4B-adapter-1100 – LoRA adapter (1100 steps)
## License

Apache 2.0 – same as the base model, Qwen3-4B.

Built on a free GPU, on a Sunday, step by step.

"Heghlu'meH QaQ jajvam" – Today is a good day to train.