# Qapricorn-4B
"Nusquam est qui ubique est" – He who is everywhere is nowhere.
Qapricorn is where it needs to be: reasoning, not an encyclopedia.
## Model Description
Qapricorn-4B is a fine-tuned version of Qwen3-4B with enhanced reasoning capabilities, Polish language support, and coding skills. The model was trained using QLoRA with a carefully curated multilingual dataset combining mathematical reasoning, coding tasks, and Polish language data.
The name Qapricorn comes from:
- Qa – Qapla' (Klingon for "Success")
- pricorn – Capricorn (perseverance, patience)

A model built with persistence, on a free GPU, step by step.
## Key Features
- Native `<think>` reasoning – inherited from Qwen3-4B and strengthened through training
- Polish language support – thinks and responds in Polish when prompted in Polish
- Coding capability – Python, algorithms, data structures
- Mathematical reasoning – step-by-step problem solving
- Multilingual – seamlessly switches between Polish and English based on prompt language
## Reasoning Example
Prompt: Stała Kaprekara to liczba 6174. Zastosuj algorytm dla liczby 1234. (English: "The Kaprekar constant is 6174. Apply the algorithm to the number 1234.")
```text
<think>
Algorytm Kaprekara działa tak:
- Iteracja 1: 4321 - 1234 = 3087
- Iteracja 2: 8730 - 0378 = 8352
- Iteracja 3: 8532 - 2358 = 6174 ✓
Warto też wspomnieć, że dla liczb trzycyfrowych analogiczna stała to 495...
</think>

Algorytm Kaprekara dla 1234:
- Iteracja 1: 4321 - 1234 = 3087
- Iteracja 2: 8730 - 0378 = 8352
- Iteracja 3: 8532 - 2358 = 6174 ✓
```
The model correctly solved the Kaprekar constant problem AND independently noted the analogous constant (495) for 3-digit numbers, demonstrating genuine reasoning rather than pattern matching.
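For reference, the routine the model walks through can be reproduced in a few lines of Python. This is a minimal sketch for 4-digit inputs with at least two distinct digits (repdigits like 1111 collapse to 0 and never reach the constant):

```python
def kaprekar_step(n: int) -> int:
    """One step of Kaprekar's routine for 4-digit numbers:
    sort the digits descending and ascending, then subtract."""
    digits = f"{n:04d}"  # keep leading zeros, e.g. 378 -> "0378"
    hi = int("".join(sorted(digits, reverse=True)))
    lo = int("".join(sorted(digits)))
    return hi - lo

def kaprekar_iterations(n: int) -> list[int]:
    """Iterate until the constant 6174 is reached."""
    steps = []
    while n != 6174:
        n = kaprekar_step(n)
        steps.append(n)
    return steps

print(kaprekar_iterations(1234))  # [3087, 8352, 6174]
```

The three intermediate values match the model's trace above.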
## Training Details

### Base Model
- Model: `unsloth/Qwen3-4B-unsloth-bnb-4bit`
- Method: QLoRA (4-bit quantization)
- LoRA rank: r=16, alpha=32
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable parameters: 33M / 4B (0.81%)
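The 33M figure can be sanity-checked from the LoRA configuration. The sketch below uses assumed Qwen3-4B shapes (36 layers, hidden size 2560, 32 query heads and 8 KV heads of dim 128, FFN 9728); these dimensions are illustrative assumptions, not read from the model config:

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA adds two small matrices per targeted linear layer:
    # A is (r x d_in) and B is (d_out x r).
    return r * (d_in + d_out)

# Assumed Qwen3-4B shapes (illustrative, not from the config).
r, layers = 16, 36
hidden, q_out, kv_out, ffn = 2560, 4096, 1024, 9728

per_layer = (
    lora_params(hidden, q_out, r)         # q_proj
    + 2 * lora_params(hidden, kv_out, r)  # k_proj, v_proj
    + lora_params(q_out, hidden, r)       # o_proj
    + 2 * lora_params(hidden, ffn, r)     # gate_proj, up_proj
    + lora_params(ffn, hidden, r)         # down_proj
)
total = per_layer * layers
print(f"{total:,}")  # 33,030,144 under these assumed shapes
```

Under these assumptions the count lands at roughly 33M, consistent with the reported figure.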
### Training Infrastructure
- Platform: Kaggle (free tier) – 1x Tesla T4 16GB
- Framework: Unsloth 2026.3.3 + TRL
- Total steps: 1500
- Dataset coverage: ~33% of training data
### Training Phases
| Phase | Steps | LR | Scheduler | Notes |
|---|---|---|---|---|
| 1 | 0โ100 | 2e-4 | linear | Initial learning |
| 2 | 100โ300 | 5e-5 | cosine | Main training |
| 3 | 300โ600 | 5e-5 | cosine | Stabilization |
| 4 | 600โ1100 | 1e-5 | cosine | Fine-tuning |
| 5 | 1100โ1500 | 5e-6 | cosine | Final polish |
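Each phase above restarts a cosine cycle at its own peak learning rate. A minimal sketch of per-phase cosine decay (the warmup behavior and minimum LR are assumptions, not taken from the training logs):

```python
import math

def cosine_lr(step: int, start: int, end: int,
              lr_max: float, lr_min: float = 0.0) -> float:
    """Cosine decay from lr_max to lr_min over steps [start, end],
    restarted at each phase boundary."""
    t = (step - start) / (end - start)  # progress within the phase, 0..1
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

print(cosine_lr(100, 100, 300, 5e-5))  # 5e-05 at the start of phase 2
```

At a phase boundary the schedule jumps back up to the new peak, which is why the loss log below shows a bump at the start of phase 3.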
### Loss Progression

```text
Phase 1: ~1.8 → 1.1 (rapid learning)
Phase 2: avg 0.755 (best convergence)
Phase 3: avg 1.114 (new cosine cycle)
Phase 4: avg 1.076 (gradual improvement)
Phase 5: avg 1.046 (final polish, min=0.600)
```
## Dataset

Mixed multilingual dataset (~48k records, 269 MB):
| Source | Records | Type |
|---|---|---|
| Math Reasoning PL | 25,357 | Math + `<think>` CoT |
| Claude Opus 4.5 | 250 | Coding + `<think>` |
| Polish Language CSV | 7,799 | Polish language |
| Synthia-Coder v1.5 | 14,982 | Coding instructions |
| **Total** | **48,388** | |
All datasets were converted to ChatML format with `<think>` blocks preserved.
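One plausible shape of that conversion (a sketch; the actual training script and field names are assumptions, though the `<|im_start|>`/`<|im_end|>` token layout is standard ChatML):

```python
def to_chatml(system: str, user: str, assistant: str) -> str:
    """Render one record as a ChatML string. The <think> block
    stays inside the assistant turn, so the model learns to emit
    its reasoning before the final answer."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
    )

sample = to_chatml(
    "You are a helpful assistant.",
    "What is 2 + 2?",
    "<think>\n2 + 2 = 4\n</think>\n4",
)
print(sample)
```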
## Available Formats
| File | Size | Recommended for |
|---|---|---|
| `qapricorn_f16.gguf` | ~7.5 GB | Further quantization |
| `qapricorn_Q8_0.gguf` | ~4.3 GB | Best quality, 6GB+ VRAM |
| `qapricorn_Q6_K.gguf` | ~3.3 GB | Great quality, 4GB+ VRAM |
| `qapricorn_Q4_K_M.gguf` | ~2.5 GB | Best speed/quality ratio |
## Usage
### llama.cpp

```shell
./llama-cli \
  -m qapricorn_Q4_K_M.gguf \
  -p "Czym jest ciąg geometryczny? Wyjaśnij krok po kroku." \
  --temp 0.6 \
  --top-p 0.95 \
  -n 512
```

(The prompt asks: "What is a geometric sequence? Explain step by step.")
### llama-server

```shell
./llama-server \
  -m qapricorn_Q4_K_M.gguf \
  --port 8080 \
  --ctx-size 4096 \
  --temp 0.6
```
### Python (transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("HattoriHanzo1/Qapricorn-4B-merged-bf16")
tokenizer = AutoTokenizer.from_pretrained("HattoriHanzo1/Qapricorn-4B-merged-bf16")

messages = [
    {"role": "system", "content": "Jesteś pomocnym asystentem AI."},
    {"role": "user", "content": "Wyjaśnij algorytm Kaprekara."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.6, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Thinking Mode

Qapricorn inherits Qwen3's thinking toggle:

- `/think` – enable thinking (default)
- `/no_think` – disable thinking for faster responses
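Following the Qwen3 convention, the soft switch is simply appended to the user message; a minimal sketch (the `with_thinking` helper is hypothetical, added here for illustration):

```python
def with_thinking(content: str, think: bool) -> str:
    # Hypothetical helper: append Qwen3's soft switch to the user turn.
    return f"{content} {'/think' if think else '/no_think'}"

messages = [
    {"role": "user",
     "content": with_thinking("Wyjaśnij algorytm Kaprekara.", think=False)},
]
print(messages[0]["content"])  # ends with "/no_think"
```

Qwen3's chat template also exposes an `enable_thinking` argument to `apply_chat_template` for toggling the mode programmatically.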
## Benchmarks (Qualitative)
| Task | Result |
|---|---|
| Kaprekar constant (4B model) | ✓ Solved + found 3-digit analogy |
| Polish mathematical reasoning | ✓ Step-by-step with verification |
| Linked list reversal (Python) | ✓ With edge cases |
| Lateral thinking puzzles | ✓ Level 3/4 reasoning |
| Sentiment analysis (Polish) | ✓ With justification |
| Multilingual switching | ✓ Auto-detects prompt language |
## Limitations

- Biology and natural-science responses may contain occasional hallucinations (a dataset gap)
- 4B parameter scale – complex multi-step reasoning may be less reliable than in larger models
- Training covered ~33% of the available dataset – further training is possible
## Related Repositories

- Qapricorn-4B-merged-bf16 – full merged model
- Qapricorn-4B-adapter-1500 – LoRA adapter (1500 steps)
- Qapricorn-4B-adapter-1100 – LoRA adapter (1100 steps)
## License

Apache 2.0 – same as the base model, Qwen3-4B.

Built on a free GPU, on a Sunday, step by step.

"Heghlu'meH QaQ jajvam" – Today is a good day to train.