YOLO-1.5B-Qwen-Coder

A fine-tuned version of Qwen2.5-Coder-1.5B-Instruct specialized in diagnosing CLI errors and generating a single, precise bash fix command.

The lightweight sibling of YOLO-7B-Qwen-Coder. Runs on any machine with ~2GB of free RAM. Responds in under a second on Apple Silicon.

Part of the yolo-coder project, an automated CLI repair tool that wraps any command, catches failures, and fixes them locally with an on-device LLM.


What it does

Given a CLI error message and surrounding code context, the model outputs exactly one bare bash command to fix the problem. No explanation. No markdown. No backticks. Just the fix.

Input:  ModuleNotFoundError: No module named 'requests'
Output: pip install requests

Input:  PermissionError: [Errno 13] Permission denied: '/usr/local/bin/tool'
Output: sudo chmod +x /usr/local/bin/tool
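Because the model's contract is one bare command, a thin post-processing step can defensively strip any stray formatting before the command is executed. A minimal sketch; the `extract_command` helper is hypothetical, not part of yolo-coder:

```python
import re

def extract_command(raw: str) -> str:
    """Reduce model output to one bare bash command.

    Strips markdown code fences and backticks, then keeps only the
    first non-empty line, matching the single-command output contract.
    """
    # Drop fenced-code markers like ```bash ... ```
    text = re.sub(r"```[a-zA-Z]*", "", raw)
    text = text.replace("`", "")
    for line in text.splitlines():
        line = line.strip()
        if line:
            return line
    return ""

print(extract_command("```bash\npip install requests\n```"))
# → pip install requests
```

A well-behaved fine-tune should make this a no-op, but it keeps a misfire from executing markdown as shell.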

Model Details

| Property | Value |
| --- | --- |
| Base model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Fine-tune method | LoRA (MLX on Apple Silicon) |
| LoRA rank | 8 |
| LoRA scale | 20.0 |
| Layers trained | 16 |
| Training iterations | 500 |
| Learning rate | 1e-5 |
| Batch size | 2 (no grad accumulation) |
| Max sequence length | 2048 |
| Training hardware | Apple Silicon M-series |
| Model size (GGUF) | ~941MB |
| RAM required | ~2GB |

Training Data

Trained on an earlier, smaller dataset of CLI error/fix pairs, a subset of the data later used for the 7B model. The 7B was trained on 2,250 examples with broader coverage; this model predates that expansion.

Coverage includes: Python runtime errors, pip errors, common file/permission errors, and basic Node.js/npm errors.

Format: ChatML with a system prompt enforcing single-command output.


Files in this repo

| File | Description |
| --- | --- |
| YOLO-1.5B-Qwen.gguf | Q4 quantized GGUF (~941MB); works with Ollama |
| safetensors/ | fp16 safetensors; for further fine-tuning |

Usage with Ollama

# Download the Modelfile
curl -O https://raw.githubusercontent.com/erdemozkan/yolo-coder/main/YOLO-MODEL-FILES/Modelfile

# Register
ollama create yolo-coder -f Modelfile

# Test
ollama run yolo-coder "ModuleNotFoundError: No module named 'flask'"
# → pip install flask
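Once registered, the model can also be queried programmatically through Ollama's local REST API (`POST http://localhost:11434/api/generate`). A hedged sketch, assuming the model was registered under the name `yolo-coder` as above; the system prompt is baked into the Modelfile, so only the error message is sent:

```python
import json
import urllib.request

def build_request(error_message: str) -> dict:
    """Payload for Ollama's /api/generate endpoint."""
    return {
        "model": "yolo-coder",    # name given to `ollama create`
        "prompt": error_message,  # system prompt comes from the Modelfile
        "stream": False,          # return one complete response
    }

def ask_yolo(error_message: str, host: str = "http://localhost:11434") -> str:
    """Send the error to a locally running Ollama server; return the fix."""
    payload = json.dumps(build_request(error_message)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()
```

`ask_yolo("ModuleNotFoundError: No module named 'flask'")` should return something like `pip install flask` when the server is up.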

Usage with yolo-coder

git clone https://github.com/erdemozkan/yolo-coder
cd yolo-coder
pip install -e .

# Uses yolo-coder (1.5B) by default
yoco python3 myapp.py

# Explicitly
yoco --model yolo-coder python3 myapp.py
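Conceptually, a wrapper like yoco runs the command and only consults the model when it fails. A simplified sketch of that loop; the real yolo-coder implementation may differ, and `suggest_fix` stands in for the model call:

```python
import subprocess
import sys
from typing import Callable, Optional

def run_with_repair(cmd: list[str],
                    suggest_fix: Callable[[str], str]) -> Optional[str]:
    """Run cmd; on failure, return a model-suggested fix command.

    Returns None when the command succeeds, otherwise the suggested
    one-line bash fix for the captured stderr.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return None                    # nothing to repair
    return suggest_fix(result.stderr)  # e.g. "pip install requests"

# A succeeding command needs no fix
fix = run_with_repair([sys.executable, "-c", "print('ok')"],
                      suggest_fix=lambda err: "unused")
print(fix)  # → None
```

In the real tool the suggested command would then be shown to the user (or executed) before re-running the original command.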

Prompt Format (ChatML)

<|im_start|>system
You are a CLI repair tool. Output ONLY a single bare bash command to fix the error. No explanation. No markdown. No backticks.<|im_end|>
<|im_start|>user
{error message}
<|im_end|>
<|im_start|>assistant
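When calling the raw GGUF outside Ollama (e.g. with llama.cpp), the ChatML prompt above has to be assembled by hand. A small sketch of that template:

```python
SYSTEM = ("You are a CLI repair tool. Output ONLY a single bare bash "
          "command to fix the error. No explanation. No markdown. "
          "No backticks.")

def chatml_prompt(error_message: str) -> str:
    """Wrap an error message in the ChatML template the model was trained on."""
    return (
        f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
        f"<|im_start|>user\n{error_message}\n<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("ModuleNotFoundError: No module named 'flask'"))
```

Generation should stop at `<|im_end|>`, leaving the bare command as the completion.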

When to use 1.5B vs 7B

| | 1.5B | 7B |
| --- | --- | --- |
| RAM needed | ~2GB | ~5GB |
| Speed | <1s on Apple Silicon | 1–3s on Apple Silicon |
| Common errors | ✅ Excellent | ✅ Excellent |
| Complex/rare errors | ⚠️ May miss | ✅ Better coverage |
| Best for | Daily driver, fast machines, CI | Hard errors, better accuracy |

Recommendation: use the 1.5B as the default, and switch to the 7B with --model yolo-7b when it struggles.


โš ๏ธ Experimental

This model is experimental. It can underperform on complex, rare, or multi-layered errors, especially those outside its training distribution. Output quality is not guaranteed.

That said, it is extremely efficient on resources:

  • Runs on any machine with 2GB free RAM
  • Under 1 second response time on Apple Silicon
  • Only ~941MB on disk
  • Works on CPU; no GPU required

If it misses, switch to the 7B: yoco --model yolo-7b python3 myapp.py


Limitations

  • Single-command output only; not suitable for multi-step fixes without a wrapper
  • Smaller capacity than 7B; complex or novel errors may produce suboptimal fixes
  • Not a general coding assistant

License

Apache 2.0, the same as the base model.
