PCBSchemaGen: Constraint-Guided Schematic Design via LLM for Printed Circuit Boards (PCB)
Paper: 2602.00510
Fine-tuned Qwen3-4B that generates valid KiCad s-expression netlists from natural-language circuit descriptions.
Given a text description like "Design an RP2040-based flight controller with IMU, barometer, USB-C, and SWD debug", the model produces a complete netlist in KiCad's s-expression format:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

MODEL_ID = "Qwen/Qwen3-4B"
ADAPTER_ID = "AbijahKaj/qwen3-4b-kicad-netlist"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert electronics engineer and KiCad schematic designer. "
            "When given a description of an electronic circuit or system, you "
            "generate a complete, valid KiCad netlist in s-expression format."
        ),
    },
    {
        "role": "user",
        "content": (
            "Design an RP2040-based flight controller with ICM-42688-P IMU on SPI, "
            "BMP388 barometer on I2C, 4 PWM motor outputs, USB-C, QSPI flash, "
            "and SWD debug header."
        ),
    },
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs, max_new_tokens=8192, temperature=0.3, top_p=0.9, do_sample=True
    )
netlist = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(netlist)
```
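Before handing the generated text to KiCad or the validator, it can help to strip any markdown fences the model may emit around the s-expression. This is a hypothetical helper (`extract_netlist` is not part of the repo), shown as a sketch:

```python
import re

def extract_netlist(text: str) -> str:
    """Strip an optional markdown code fence and return the bare s-expression.

    Hypothetical helper: the model usually emits the netlist directly, but may
    wrap it in ``` fences depending on sampling.
    """
    match = re.search(r"```(?:\w+)?\n(.*?)```", text, re.DOTALL)
    body = match.group(1) if match else text
    return body.strip()

# Save for KiCad import or ERC validation:
# with open("flight_controller.kicad_net", "w") as f:
#     f.write(extract_netlist(netlist))
```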
| Circuit Type | Key Components |
|---|---|
| Flight Controller | RP2040, ICM-42688-P IMU, BMP388 baro, PWM outputs |
| IoT Sensor Node | STM32F411, BME280, SX1276 LoRa, OLED |
| WiFi/CAN Gateway | ESP32-S3, MCP2515, TJA1050, SD card, ADS1115 |
| Arduino Clone | ATmega328P, CH340G USB-UART, 16MHz crystal |
| Battery Sensor | ATtiny85, nRF24L01+, BME280, MCP73831 charger |
| USB Power Meter | ATmega328P, INA219, OLED display |
| Motor Driver | RP2040, DRV8833, reverse polarity protection |
| GPS Tracker | ESP32-S3, u-blox MAX-M8, SD card, LiPo charger |
| Thermocouple Reader | ATmega328P, MAX31855, OLED |
| Data Logger | RP2040, ADS1115 ADC, SD card, OLED |
| CM4 Carrier Board | Dual Hirose DF40HC, USB 2.0, HDMI, SD, GPIO |
| CM5 Carrier Board | Dual Hirose DF40HC, PCIe NVMe, USB 3.0, Ethernet, Fan |
| CAN Bus Node | STM32F103, MCP2515, TJA1050, 120Ω termination |
| Simple circuits | LED drivers, voltage dividers, current sensors, buck converters |
The v2 dataset contains:

- `(nets ...)` sections converted from real `.kicad_sch` schematics from ~6,000 GitHub repos, plus synthetic circuits
- `search_component` and `get_datasheet_info` tool calls

Dataset generation scripts are in the dataset repo.
The included erc_validator.py (v2) validates generated netlists:
| Check | What It Validates |
|---|---|
| Syntax | Balanced parentheses, valid s-expression structure |
| Structure | Has (export), (design), (components), (nets) sections |
| Components | Unique refs, values, footprints, libsource |
| Nets | Multi-node connectivity, no floating pins, pin-type compatibility |
| Net Quality | Multi-node ratio, average nodes/net, named net ratio |
| Power | GND net present, power supply net present |
| Decoupling | ICs on power nets have bypass capacitors |
| Connectivity | All components connected to at least one net |
```bash
python erc_validator.py your_netlist.kicad_net
python erc_validator.py your_dataset.jsonl
```
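The Syntax check in the table above (balanced parentheses, ignoring parens inside quoted strings) can be sketched as a small standalone function. This is an illustrative sketch, not the actual `erc_validator.py` logic:

```python
def check_balanced(netlist: str) -> bool:
    """Minimal s-expression syntax check: parentheses must balance,
    and parens inside double-quoted strings are ignored.
    (Escaped quotes are not handled -- this is only a sketch.)"""
    depth = 0
    in_string = False
    for ch in netlist:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:  # closing paren with no matching open
                    return False
    return depth == 0 and not in_string

# check_balanced('(export (design))')  -> True
# check_balanced('(export (design)')   -> False
```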
Trained on the v2 dataset with completion-only loss masking and sequence-length filtering, on an RTX 5090.
| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4, double quant) + SFT |
| LoRA rank | r=64, α=32, all-linear targets |
| Effective batch | 8 (BS=1 × grad_accum=8) |
| Max seq length | 8,192 tokens |
| Learning rate | 2e-4 (cosine decay) |
| Epochs | 2 |
| Train loss | 0.1442 (avg), 0.112 (final) |
| Eval loss | 0.1251 |
| Token accuracy | 96.78% |
| ERC checks | Every 500 steps on 3 validation prompts |
| Best ERC | 0.613 (step 4000) |
| Final ERC | 0.433 |
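Completion-only loss masking, mentioned above, means the loss is computed only on the netlist tokens, not the prompt. A minimal sketch with plain Python lists, using -100 as the ignore index that PyTorch's cross-entropy loss skips:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss

def mask_prompt_tokens(input_ids, prompt_len):
    """Copy input_ids into labels, masking the first prompt_len positions
    so the loss is computed only on the completion (the netlist)."""
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    return labels

# Example: 3 prompt tokens, 2 completion tokens
# mask_prompt_tokens([11, 22, 33, 44, 55], 3) -> [-100, -100, -100, 44, 55]
```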
Trained on the v2 dataset (16,738 examples, filtered to a maximum of 8,192 tokens) on an RTX 5090.
| Parameter | Value |
|---|---|
| Method | QLoRA (4-bit NF4, double quant) + SFT |
| LoRA rank | r=64, α=32, all-linear targets |
| Effective batch | 8 (BS=1 × grad_accum=8) |
| Max seq length | 8,192 tokens |
| Epochs | 1 |
| Train loss | 0.125 |
| Eval loss | 0.141 |
| Token accuracy | 96.3% |
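The LoRA settings in the tables above map onto a `peft` `LoraConfig` roughly as follows. This is a sketch; fields not reported in the tables (dropout, bias) are assumptions, and `train.py` in this repo holds the authoritative values:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                      # LoRA rank, as reported above
    lora_alpha=32,             # alpha, as reported above
    target_modules="all-linear",  # adapt every linear layer
    lora_dropout=0.05,         # assumed; not reported in the tables
    task_type="CAUSAL_LM",
)
```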
| File | Description |
|---|---|
| `adapter_model.safetensors` | LoRA adapter weights (504 MB) |
| `adapter_config.json` | PEFT adapter configuration |
| `train.py` | Training script with inline ERC v2 |
| `erc_validator.py` | ERC v2 — stricter net quality checks |
| `evaluate_model.py` | Evaluation suite: 7 prompts, ERC scoring, A/B comparison |
| `chat_template.jinja` | Qwen3 chat template |
- Use `erc_validator.py` to check generated netlists
- Set `enable_thinking=False` for direct netlist output

License: Apache 2.0 (same as the base model, Qwen3-4B)