PCBSchemaGen: Constraint-Guided Schematic Design via LLM for Printed Circuit Boards (PCB)
Paper: 2602.00510
Fine-tuned Qwen3-4B that generates valid KiCad s-expression netlists from natural-language circuit descriptions.
Given a text description like "Design an RP2040-based flight controller with IMU, barometer, USB-C, and SWD debug", the model produces a complete netlist in KiCad's s-expression format:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

MODEL_ID = "Qwen/Qwen3-4B"
ADAPTER_ID = "AbijahKaj/qwen3-4b-kicad-netlist"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert electronics engineer and KiCad schematic designer. "
            "When given a description of an electronic circuit or system, you "
            "generate a complete, valid KiCad netlist in s-expression format."
        ),
    },
    {
        "role": "user",
        "content": (
            "Design an RP2040-based flight controller with ICM-42688-P IMU on SPI, "
            "BMP388 barometer on I2C, 4 PWM motor outputs, USB-C, QSPI flash, "
            "and SWD debug header."
        ),
    },
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs, max_new_tokens=8192, temperature=0.3, top_p=0.9, do_sample=True
    )
netlist = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(netlist)
```
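Before handing the generated text to KiCad or the validator, it can help to strip any markdown fences the model may emit around the s-expression. This is a hypothetical helper (`extract_netlist` is not part of the repo), shown as a sketch:

```python
import re

def extract_netlist(text: str) -> str:
    """Strip an optional markdown code fence and return the bare s-expression.

    Hypothetical helper: the model usually emits the netlist directly, but may
    wrap it in ``` fences depending on sampling.
    """
    match = re.search(r"```(?:\w+)?\n(.*?)```", text, re.DOTALL)
    body = match.group(1) if match else text
    return body.strip()

# Save for KiCad import or ERC validation:
# with open("flight_controller.kicad_net", "w") as f:
#     f.write(extract_netlist(netlist))
```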
| Circuit Type | Key Components |
|---|---|
| Flight Controller | RP2040, ICM-42688-P IMU, BMP388 baro, PWM outputs |
| IoT Sensor Node | STM32F411, BME280, SX1276 LoRa, OLED |
| WiFi/CAN Gateway | ESP32-S3, MCP2515, TJA1050, SD card, ADS1115 |
| Arduino Clone | ATmega328P, CH340G USB-UART, 16MHz crystal |
| Battery Sensor | ATtiny85, nRF24L01+, BME280, MCP73831 charger |
| USB Power Meter | ATmega328P, INA219, OLED display |
| Motor Driver | RP2040, DRV8833, reverse polarity protection |
| GPS Tracker | ESP32-S3, u-blox MAX-M8, SD card, LiPo charger |
| Thermocouple Reader | ATmega328P, MAX31855, OLED |
| Data Logger | RP2040, ADS1115 ADC, SD card, OLED |
| CM4 Carrier Board | Dual Hirose DF40HC, USB 2.0, HDMI, SD, GPIO |
| CM5 Carrier Board | Dual Hirose DF40HC, PCIe NVMe, USB 3.0, Ethernet, Fan |
| CAN Bus Node | STM32F103, MCP2515, TJA1050, 120Ω termination |
| Simple circuits | LED drivers, voltage dividers, current sensors, buck converters |
The v2 dataset contains:

- `(nets ...)` sections converted from real `.kicad_sch` schematics from ~6,000 GitHub repos, plus synthetic circuits
- `search_component` and `get_datasheet_info` tool calls

Dataset generation scripts are in the dataset repo.
The included erc_validator.py (v2) validates generated netlists:
| Check | What It Validates |
|---|---|
| Syntax | Balanced parentheses, valid s-expression structure |
| Structure | Has (export), (design), (components), (nets) sections |
| Components | Unique refs, values, footprints, libsource |
| Nets | Multi-node connectivity, no floating pins, pin-type compatibility |
| Net Quality | Multi-node ratio, average nodes/net, named net ratio |
| Power | GND net present, power supply net present |
| Decoupling | ICs on power nets have bypass capacitors |
| Connectivity | All components connected to at least one net |
```bash
python erc_validator.py your_netlist.kicad_net
python erc_validator.py your_dataset.jsonl
```
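The Syntax check in the table above (balanced parentheses, ignoring parens inside quoted strings) can be sketched as a small standalone function. This is an illustrative sketch, not the actual `erc_validator.py` logic:

```python
def check_balanced(netlist: str) -> bool:
    """Minimal s-expression syntax check: parentheses must balance,
    and parens inside double-quoted strings are ignored.
    (Escaped quotes are not handled -- this is only a sketch.)"""
    depth = 0
    in_string = False
    for ch in netlist:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:  # closing paren with no matching open
                    return False
    return depth == 0 and not in_string

# check_balanced('(export (design))')  -> True
# check_balanced('(export (design)')   -> False
```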
Trained on the v2 dataset with completion-only loss masking and sequence-length filtering, on an RTX 5090.
| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4, double quant) + SFT |
| LoRA rank | r=64, α=32, all-linear targets |
| Effective batch | 8 (BS=1 × grad_accum=8) |
| Max seq length | 8,192 tokens |
| Learning rate | 2e-4 (cosine decay) |
| Epochs | 2 |
| Train loss | 0.1442 (avg), 0.112 (final) |
| Eval loss | 0.1251 |
| Token accuracy | 96.78% |
| ERC checks | Every 500 steps on 3 validation prompts |
| Best ERC | 0.613 (step 4000) |
| Final ERC | 0.433 |
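Completion-only loss masking, mentioned above, means the loss is computed only on the netlist tokens, not the prompt. A minimal sketch with plain Python lists, using -100 as the ignore index that PyTorch's cross-entropy loss skips:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss

def mask_prompt_tokens(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions
    so the loss is computed only on the completion (the netlist)."""
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    return labels

# Example: 3 prompt tokens, 2 completion tokens
# mask_prompt_tokens([11, 22, 33, 44, 55], 3) -> [-100, -100, -100, 44, 55]
```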
Trained on the v2 dataset (16,738 examples, filtered to a maximum of 8,192 tokens) on an RTX 5090.
| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4, double quant) + SFT |
| LoRA rank | r=64, α=32, all-linear targets |
| Effective batch | 8 (BS=1 × grad_accum=8) |
| Max seq length | 8,192 tokens |
| Epochs | 1 |
| Train loss | 0.125 |
| Eval loss | 0.141 |
| Token accuracy | 96.3% |
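The LoRA settings in the tables above map onto a `peft` `LoraConfig` roughly as follows. This is a sketch; fields not reported in the tables (dropout, bias) are assumptions, and `train.py` in this repo holds the authoritative values:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                      # LoRA rank, as reported above
    lora_alpha=32,             # alpha, as reported above
    target_modules="all-linear",  # adapt every linear layer
    lora_dropout=0.05,         # assumed; not reported in the tables
    task_type="CAUSAL_LM",
)
```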
| File | Description |
|---|---|
| `adapter_model.safetensors` | LoRA adapter weights (504 MB) |
| `adapter_config.json` | PEFT adapter configuration |
| `train.py` | Training script with inline ERC v2 |
| `erc_validator.py` | ERC v2 — stricter net quality checks |
| `evaluate_model.py` | Evaluation suite: 7 prompts, ERC scoring, A/B comparison |
| `chat_template.jinja` | Qwen3 chat template |
- Use `erc_validator.py` to check generated netlists
- Set `enable_thinking=False` for direct netlist output

License: Apache 2.0 (same as the base model, Qwen3-4B)