# Pi0.5 fine-tuned on SO-101 edu-robot scenarios (sc1-4)
A fine-tune of `lerobot/pi05_base` on four SO-101 pick-and-place scenarios collected
in the IntelligentDecisionLab `edu_robot` project. Trained as a baseline for a
voice-driven educational robot (SO-101 + Jetson Orin NX).
> ⚠️ **Status:** offline training complete; real-robot success rate on SO-101 has not yet been measured. Treat this as a research checkpoint, not a validated policy.
## Datasets
Trained on a merged dataset composed of four in-domain SO-101 scenarios
collected by Jason (`jedeka30`):
| Scenario | Repo | Description |
|---|---|---|
| 1 — Adjective | `jedeka30/edurobot-scenario1` | Color/adjective-conditioned pick |
| 2 — Size | `jedeka30/edurobot-scenario2` | Size-conditioned pick |
| 3 — Spatial | `jedeka30/edurobot-scenario3` | Spatial-referent pick |
| 4 — Action Clarification | `jedeka30/edurobot-scenario4` | Multi-step pick + place |

Merged as: `kunhsiang/edurobot-sc1234-merged`
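Merging the four scenarios amounts to concatenating their episode lists under a single global episode index. The sketch below is illustrative only (the function name and tuple layout are made up here); it is not the actual tooling used to build `kunhsiang/edurobot-sc1234-merged`:

```python
def merge_scenarios(scenarios):
    """Concatenate per-scenario episode lists, re-indexing episodes globally.

    `scenarios` is a list of (repo_name, n_episodes) pairs; returns a flat
    list of (repo_name, global_episode_index) tuples. Illustrative sketch only.
    """
    merged, offset = [], 0
    for repo, n_episodes in scenarios:
        merged += [(repo, offset + ep) for ep in range(n_episodes)]
        offset += n_episodes  # shift the next scenario's indices past this one
    return merged

# Toy example with made-up episode counts:
episodes = merge_scenarios([("edurobot-scenario1", 2), ("edurobot-scenario2", 3)])
```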
## Training
- Base: `lerobot/pi05_base` (PaliGemma `gemma_2b` + action expert `gemma_300m`)
- Framework: LeRobot
- Steps: 50,000 (~125 epochs over ~200 episodes)
- Batch size: 2
- Optimizer: AdamW (β=[0.9, 0.95], wd=0.01, grad clip 1.0)
- LR schedule: warmup 1,000 → 2.5e-5 → cosine decay to 2.5e-6 over 30,000 steps
- Precision: bfloat16, `gradient_checkpointing=True`
- Image resolution: 224 × 224
- Chunk size / action horizon: 50
- Inference steps: 10 (flow matching)
- Normalization: quantiles (STATE, ACTION), identity (VISUAL)
- Train expert only: `True` (vision encoder unfrozen)
- Hardware: 1× NVIDIA RTX 5090 (sm_120, CUDA 12.8)
- Wall-clock: ~4h 20m (≈3.37 step/s)
- Final loss: ~0.04–0.06
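The LR schedule above (warmup to 2.5e-5, cosine decay to 2.5e-6 over 30,000 steps) can be sketched in a few lines. This is a sketch, not LeRobot's scheduler code; in particular, the linear-from-zero warmup and counting the 30,000 decay steps from the end of warmup are assumptions:

```python
import math

def lr_at(step, warmup=1_000, peak=2.5e-5, floor=2.5e-6, decay_steps=30_000):
    """Learning rate at a given step: linear warmup, then cosine decay.

    Sketch of the schedule described in the model card; the exact warmup
    shape and step accounting in the real run may differ.
    """
    if step < warmup:
        return peak * step / warmup            # linear warmup from 0 to peak
    t = min(step - warmup, decay_steps) / decay_steps  # decay progress in [0, 1]
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * t))
```

After step 31,000 the rate stays pinned at the 2.5e-6 floor for the remainder of the 50,000-step run.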
## Checkpoints
Intermediate checkpoints are saved every 2,500 steps under `checkpoints/`:

```
checkpoints/
├── 002500/ … 050000/   # 20 intermediate checkpoints
└── last -> 050000      # final checkpoint (recommended)
```
Use `checkpoints/last/` for inference unless you have a reason to pick an
earlier step.
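Step directories are zero-padded to six digits, which matters when building paths to compare intermediate checkpoints. A small helper (illustrative, not part of the LeRobot API) might look like:

```python
from pathlib import Path

def checkpoint_dir(step=None, root="checkpoints"):
    """Path to a checkpoint directory.

    Steps are zero-padded to six digits (2500 -> "002500"), matching the
    layout above; `None` selects the `last` symlink. Illustrative helper.
    """
    return Path(root) / ("last" if step is None else f"{step:06d}")
```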
## Usage
```python
from lerobot.policies.pi0.modeling_pi05 import PI05Policy

policy = PI05Policy.from_pretrained(
    "kunhsiang/pi05_so101_edurobot_sc1234_50k",
    revision="main",
)
```
Or, with `lerobot-eval` / real-robot control scripts, point `--policy.path` at
this repo.
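With a chunk size / action horizon of 50, each policy query returns a 50-step action chunk that the controller drains before querying again. A minimal open-loop execution sketch, with a dummy stand-in for `PI05Policy` (real deployments typically re-query earlier and blend chunks):

```python
from collections import deque

CHUNK_SIZE = 50  # matches the action horizon in the Training section

def dummy_policy(observation):
    """Stand-in for PI05Policy: returns a chunk of CHUNK_SIZE actions.

    Here each "action" is just the observation repeated; a real policy
    returns a (CHUNK_SIZE, action_dim) tensor.
    """
    return [observation] * CHUNK_SIZE

def run_episode(n_steps):
    """Drain one action per control tick, re-querying when the chunk empties."""
    queue, executed, queries = deque(), [], 0
    for t in range(n_steps):
        if not queue:                      # chunk exhausted: query the policy
            queue.extend(dummy_policy(t))  # observation is just the tick index here
            queries += 1
        executed.append(queue.popleft())
    return executed, queries

actions, n_queries = run_episode(120)  # 120 ticks => 3 policy queries
```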
## Known caveats
- Dataset is small (~200 episodes × 125 epochs). Late checkpoints are likely overfit; compare 25K / 40K / 50K on the real robot before committing.
- Not yet validated on hardware. Compare against `lerobot/pi05_base` zero-shot before claiming improvement.
- SO-101 specific. Action / state dims follow SO-101 (6-DOF arm + gripper); the policy will not transfer to Franka or other embodiments.
- Do not use `lerobot/pi05_libero_finetuned` as a base for SO-101: it hard-codes Franka 7-DOF shapes.
## Citation & attribution
- Dataset collection: Jason Dekarnegie (`jedeka30`)
- Training / tooling: Kunhsiang (`kunhsiang`)
- Project supervision: Prof. Lien, Intelligent Decision Lab
- Base model: Physical Intelligence `pi0.5`, via Hugging Face LeRobot