# Pi0.5 fine-tuned on SO-101 edu-robot scenarios (sc1-4)
A fine-tune of `lerobot/pi05_base` on four SO-101 pick-and-place scenarios collected
in the IntelligentDecisionLab `edu_robot` project. Trained as a baseline for a
voice-driven educational robot (SO-101 + Jetson Orin NX).
> ⚠️ **Status:** offline training complete; real-robot success rate on SO-101 has not yet been measured. Treat this as a research checkpoint, not a validated policy.
## Datasets
Trained on a merged dataset composed of four in-domain SO-101 scenarios
collected by Jason (`jedeka30`):
| Scenario | Repo | Description |
|---|---|---|
| 1 — Adjective | `jedeka30/edurobot-scenario1` | Color/adjective-conditioned pick |
| 2 — Size | `jedeka30/edurobot-scenario2` | Size-conditioned pick |
| 3 — Spatial | `jedeka30/edurobot-scenario3` | Spatial-referent pick |
| 4 — Action Clarification | `jedeka30/edurobot-scenario4` | Multi-step pick + place |

Merged as: `kunhsiang/edurobot-sc1234-merged`
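Merging the four scenarios amounts to concatenating their episode lists under a single global episode index. The sketch below is illustrative only (the function name and tuple layout are made up here); it is not the actual tooling used to build `kunhsiang/edurobot-sc1234-merged`:

```python
def merge_scenarios(scenarios):
    """Concatenate per-scenario episode lists, re-indexing episodes globally.

    `scenarios` is a list of (repo_name, n_episodes) pairs; returns a flat
    list of (repo_name, global_episode_index) tuples. Illustrative sketch only.
    """
    merged, offset = [], 0
    for repo, n_episodes in scenarios:
        merged += [(repo, offset + ep) for ep in range(n_episodes)]
        offset += n_episodes  # shift the next scenario's indices past this one
    return merged

# Toy example with made-up episode counts:
episodes = merge_scenarios([("edurobot-scenario1", 2), ("edurobot-scenario2", 3)])
```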
## Training
- Base: `lerobot/pi05_base` (PaliGemma `gemma_2b` + action expert `gemma_300m`)
- Framework: LeRobot
- Steps: 50,000 (~125 epochs over ~200 episodes)
- Batch size: 2
- Optimizer: AdamW (β=[0.9, 0.95], wd=0.01, grad clip 1.0)
- LR schedule: warmup 1,000 → 2.5e-5 → cosine decay to 2.5e-6 over 30,000 steps
- Precision: bfloat16, `gradient_checkpointing=True`
- Image resolution: 224 × 224
- Chunk size / action horizon: 50
- Inference steps: 10 (flow matching)
- Normalization: quantiles (STATE, ACTION), identity (VISUAL)
- Train expert only: `True` (vision encoder unfrozen)
- Hardware: 1× NVIDIA RTX 5090 (sm_120, CUDA 12.8)
- Wall-clock: ~4h 20m (≈3.37 step/s)
- Final loss: ~0.04–0.06
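The LR schedule above (warmup to 2.5e-5, cosine decay to 2.5e-6 over 30,000 steps) can be sketched in a few lines. This is a sketch, not LeRobot's scheduler code; in particular, the linear-from-zero warmup and counting the 30,000 decay steps from the end of warmup are assumptions:

```python
import math

def lr_at(step, warmup=1_000, peak=2.5e-5, floor=2.5e-6, decay_steps=30_000):
    """Learning rate at a given step: linear warmup, then cosine decay.

    Sketch of the schedule described in the model card; the exact warmup
    shape and step accounting in the real run may differ.
    """
    if step < warmup:
        return peak * step / warmup            # linear warmup from 0 to peak
    t = min(step - warmup, decay_steps) / decay_steps  # decay progress in [0, 1]
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * t))
```

After step 31,000 the rate stays pinned at the 2.5e-6 floor for the remainder of the 50,000-step run.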
## Checkpoints
Intermediate checkpoints are saved every 2,500 steps under `checkpoints/`:

```
checkpoints/
├── 002500/ … 050000/   # 20 intermediate checkpoints
└── last -> 050000      # final checkpoint (recommended)
```
Use `checkpoints/last/` for inference unless you have a reason to pick an
earlier step.
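Step directories are zero-padded to six digits, which matters when building paths to compare intermediate checkpoints. A small helper (illustrative, not part of the LeRobot API) might look like:

```python
from pathlib import Path

def checkpoint_dir(step=None, root="checkpoints"):
    """Path to a checkpoint directory.

    Steps are zero-padded to six digits (2500 -> "002500"), matching the
    layout above; `None` selects the `last` symlink. Illustrative helper.
    """
    return Path(root) / ("last" if step is None else f"{step:06d}")
```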
## Usage
```python
from lerobot.policies.pi0.modeling_pi05 import PI05Policy

policy = PI05Policy.from_pretrained(
    "kunhsiang/pi05_so101_edurobot_sc1234_50k",
    revision="main",
)
```
Or, with `lerobot-eval` / real-robot control scripts, point `--policy.path` at
this repo.
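With a chunk size / action horizon of 50, each policy query returns a 50-step action chunk that the controller drains before querying again. A minimal open-loop execution sketch, with a dummy stand-in for `PI05Policy` (real deployments typically re-query earlier and blend chunks):

```python
from collections import deque

CHUNK_SIZE = 50  # matches the action horizon in the Training section

def dummy_policy(observation):
    """Stand-in for PI05Policy: returns a chunk of CHUNK_SIZE actions.

    Here each "action" is just the observation repeated; a real policy
    returns a (CHUNK_SIZE, action_dim) tensor.
    """
    return [observation] * CHUNK_SIZE

def run_episode(n_steps):
    """Drain one action per control tick, re-querying when the chunk empties."""
    queue, executed, queries = deque(), [], 0
    for t in range(n_steps):
        if not queue:                      # chunk exhausted: query the policy
            queue.extend(dummy_policy(t))  # observation is just the tick index here
            queries += 1
        executed.append(queue.popleft())
    return executed, queries

actions, n_queries = run_episode(120)  # 120 ticks => 3 policy queries
```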
## Known caveats
- Dataset is small (~200 episodes × 125 epochs). Late checkpoints are likely overfit; compare 25K / 40K / 50K on the real robot before committing.
- Not yet validated on hardware. Compare against `lerobot/pi05_base` zero-shot before claiming improvement.
- SO-101 specific. Action / state dims follow SO-101 (6-DOF arm + gripper); the policy will not transfer to Franka or other embodiments.
- Do not use `lerobot/pi05_libero_finetuned` as a base for SO-101: it hard-codes Franka 7-DOF shapes.
## Citation & attribution
- Dataset collection: Jason Dekarnegie (`jedeka30`)
- Training / tooling: Kunhsiang (`kunhsiang`)
- Project supervision: Prof. Lien, Intelligent Decision Lab
- Base model: Physical Intelligence `pi0.5`, via Hugging Face LeRobot