YOLO-1.5B-Qwen-Coder

A fine-tuned version of Qwen2.5-Coder-1.5B-Instruct specialized in diagnosing CLI errors and generating a single, precise bash fix command.

The lightweight sibling of YOLO-7B-Qwen-Coder. Runs on any machine with ~2GB of free RAM. Responds in under a second on Apple Silicon.

Part of the yolo-coder project, an automated CLI repair tool that wraps any command, catches failures, and fixes them locally with an on-device LLM.


What it does

Given a CLI error message and surrounding code context, the model outputs exactly one bare bash command to fix the problem. No explanation. No markdown. No backticks. Just the fix.

Input:  ModuleNotFoundError: No module named 'requests'
Output: pip install requests

Input:  PermissionError: [Errno 13] Permission denied: '/usr/local/bin/tool'
Output: sudo chmod +x /usr/local/bin/tool
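Because the model's contract is one bare command, a thin post-processing step can defensively strip any stray formatting before the command is executed. A minimal sketch; the `extract_command` helper is hypothetical, not part of yolo-coder:

```python
import re

def extract_command(raw: str) -> str:
    """Reduce model output to one bare bash command.

    Strips markdown code fences and backticks, then keeps only the
    first non-empty line, matching the single-command output contract.
    """
    # Drop fenced-code markers like ```bash ... ```
    text = re.sub(r"```[a-zA-Z]*", "", raw)
    text = text.replace("`", "")
    for line in text.splitlines():
        line = line.strip()
        if line:
            return line
    return ""

print(extract_command("```bash\npip install requests\n```"))
# → pip install requests
```

A well-behaved fine-tune should make this a no-op, but it keeps a misfire from executing markdown as shell.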

Model Details

| Property | Value |
| --- | --- |
| Base model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Fine-tune method | LoRA (MLX on Apple Silicon) |
| LoRA rank | 8 |
| LoRA scale | 20.0 |
| Layers trained | 16 |
| Training iterations | 500 |
| Learning rate | 1e-5 |
| Batch size | 2 (no grad accumulation) |
| Max sequence length | 2048 |
| Training hardware | Apple Silicon M-series |
| Model size (GGUF) | ~941MB |
| RAM required | ~2GB |

Training Data

Trained on an earlier, smaller dataset of CLI error/fix pairs, a subset of the data later used for the 7B model. The 7B was trained on 2,250 examples with broader coverage; this model predates that expansion.

Coverage includes: Python runtime errors, pip errors, common file/permission errors, and basic Node.js/npm errors.

Format: ChatML with a system prompt enforcing single-command output.


Files in this repo

| File | Description |
| --- | --- |
| YOLO-1.5B-Qwen.gguf | Q4 quantized GGUF (~941MB); works with Ollama |
| safetensors/ | fp16 safetensors; for further fine-tuning |

Usage with Ollama

# Download the Modelfile
curl -O https://raw.githubusercontent.com/erdemozkan/yolo-coder/main/YOLO-MODEL-FILES/Modelfile

# Register
ollama create yolo-coder -f Modelfile

# Test
ollama run yolo-coder "ModuleNotFoundError: No module named 'flask'"
# → pip install flask
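Once registered, the model can also be queried programmatically through Ollama's local REST API (`POST http://localhost:11434/api/generate`). A hedged sketch, assuming the model was registered under the name `yolo-coder` as above; the system prompt is baked into the Modelfile, so only the error message is sent:

```python
import json
import urllib.request

def build_request(error_message: str) -> dict:
    """Payload for Ollama's /api/generate endpoint."""
    return {
        "model": "yolo-coder",    # name given to `ollama create`
        "prompt": error_message,  # system prompt comes from the Modelfile
        "stream": False,          # return one complete response
    }

def ask_yolo(error_message: str, host: str = "http://localhost:11434") -> str:
    """Send the error to a locally running Ollama server; return the fix."""
    payload = json.dumps(build_request(error_message)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()
```

`ask_yolo("ModuleNotFoundError: No module named 'flask'")` should return something like `pip install flask` when the server is up.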

Usage with yolo-coder

git clone https://github.com/erdemozkan/yolo-coder
cd yolo-coder
pip install -e .

# Uses yolo-coder (1.5B) by default
yoco python3 myapp.py

# Explicitly
yoco --model yolo-coder python3 myapp.py
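Conceptually, a wrapper like yoco runs the command and only consults the model when it fails. A simplified sketch of that loop; the real yolo-coder implementation may differ, and `suggest_fix` stands in for the model call:

```python
import subprocess
import sys
from typing import Callable, Optional

def run_with_repair(cmd: list[str],
                    suggest_fix: Callable[[str], str]) -> Optional[str]:
    """Run cmd; on failure, return a model-suggested fix command.

    Returns None when the command succeeds, otherwise the suggested
    one-line bash fix for the captured stderr.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return None                    # nothing to repair
    return suggest_fix(result.stderr)  # e.g. "pip install requests"

# A succeeding command needs no fix
fix = run_with_repair([sys.executable, "-c", "print('ok')"],
                      suggest_fix=lambda err: "unused")
print(fix)  # → None
```

In the real tool the suggested command would then be shown to the user (or executed) before re-running the original command.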

Prompt Format (ChatML)

<|im_start|>system
You are a CLI repair tool. Output ONLY a single bare bash command to fix the error. No explanation. No markdown. No backticks.<|im_end|>
<|im_start|>user
{error message}
<|im_end|>
<|im_start|>assistant
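When calling the raw GGUF outside Ollama (e.g. with llama.cpp), the ChatML prompt above has to be assembled by hand. A small sketch of that template:

```python
SYSTEM = ("You are a CLI repair tool. Output ONLY a single bare bash "
          "command to fix the error. No explanation. No markdown. "
          "No backticks.")

def chatml_prompt(error_message: str) -> str:
    """Wrap an error message in the ChatML template the model was trained on."""
    return (
        f"<|im_start|>system\n{SYSTEM}<|im_end|>\n"
        f"<|im_start|>user\n{error_message}\n<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("ModuleNotFoundError: No module named 'flask'"))
```

Generation should stop at `<|im_end|>`, leaving the bare command as the completion.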

When to use 1.5B vs 7B

| | 1.5B | 7B |
| --- | --- | --- |
| RAM needed | ~2GB | ~5GB |
| Speed | <1s on Apple Silicon | 1–3s on Apple Silicon |
| Common errors | ✅ Excellent | ✅ Excellent |
| Complex/rare errors | ⚠️ May miss | ✅ Better coverage |
| Best for | Daily driver, fast machines, CI | Hard errors, better accuracy |

Recommendation: use the 1.5B as the default, and switch to the 7B with --model yolo-7b when it struggles.


โš ๏ธ Experimental

This model is experimental. It can underperform on complex, rare, or multi-layered errors, especially those outside its training distribution. Output quality is not guaranteed.

That said, it is extremely efficient on resources:

  • Runs on any machine with 2GB free RAM
  • Under 1 second response time on Apple Silicon
  • Only ~941MB on disk
  • Works on CPU; no GPU required

If it misses, switch to the 7B: yoco --model yolo-7b python3 myapp.py


Limitations

  • Single-command output only; not suitable for multi-step fixes without a wrapper
  • Smaller capacity than 7B; complex or novel errors may produce suboptimal fixes
  • Not a general coding assistant

License

Apache 2.0, the same as the base model.
