Pi0.5 fine-tuned on SO-101 edu-robot scenarios (sc1-4)

Fine-tuned lerobot/pi05_base on four SO-101 pick-and-place scenarios collected in the IntelligentDecisionLab edu_robot project. Trained as a baseline for a voice-driven educational robot (SO-101 + Jetson Orin NX).

⚠️ Status: offline training is complete, but the real-robot success rate on SO-101 has not yet been measured. Treat this as a research checkpoint, not a validated policy.

Datasets

Trained on a merged dataset composed of four in-domain SO-101 scenarios collected by Jason (jedeka30):

Scenario                 | Repo                        | Description
-------------------------|-----------------------------|----------------------------------
1 - Adjective            | jedeka30/edurobot-scenario1 | Color/adjective-conditioned pick
2 - Size                 | jedeka30/edurobot-scenario2 | Size-conditioned pick
3 - Spatial              | jedeka30/edurobot-scenario3 | Spatial-referent pick
4 - Action Clarification | jedeka30/edurobot-scenario4 | Multi-step pick + place

Merged as: kunhsiang/edurobot-sc1234-merged

Training

  • Base: lerobot/pi05_base (PaliGemma gemma_2b + action-expert gemma_300m)
  • Framework: LeRobot
  • Steps: 50,000 (~125 epochs over ~200 episodes)
  • Batch size: 2
  • Optimizer: AdamW (β=[0.9, 0.95], wd=0.01, grad clip 1.0)
  • LR schedule: 1,000 warmup steps up to a peak of 2.5e-5, then cosine decay to 2.5e-6 over 30,000 steps
  • Precision: bfloat16, gradient_checkpointing=True
  • Image resolution: 224 × 224
  • Chunk size / action horizon: 50
  • Inference steps: 10 (flow matching)
  • Normalization: quantiles (STATE, ACTION), identity (VISUAL)
  • Train expert only: True (vision encoder unfrozen)
  • Hardware: 1× NVIDIA RTX 5090 (sm_120, CUDA 12.8)
  • Wall-clock: ~4h 20m (≈3.37 step/s)
  • Final loss: ~0.04–0.06
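
The LR schedule listed above can be sketched in a few lines. This is an illustration only, not the training code: it assumes the warmup is linear from zero, that the cosine phase is indexed from step 0, and that the rate stays at the floor once the 30,000 decay steps are used up. The function name lr_at_step and its parameter names are hypothetical.

```python
import math


def lr_at_step(step, peak_lr=2.5e-5, final_lr=2.5e-6,
               warmup_steps=1_000, decay_steps=30_000):
    """Approximate learning rate at a given optimizer step.

    Assumptions (not confirmed by the model card):
      - warmup is linear from 0 to peak_lr,
      - the cosine phase is indexed from step 0,
      - after decay_steps the rate is held at final_lr.
    """
    if step < warmup_steps:
        # Linear ramp toward the peak learning rate.
        return peak_lr * step / warmup_steps
    # Clamp so that steps past the decay window stay at the floor.
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine
```

Under these assumptions, lr_at_step(30_000) and lr_at_step(50_000) both return the 2.5e-6 floor, which would mean the last 20,000 of the 50,000 training steps ran at the minimum rate.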

Checkpoints

Intermediate checkpoints are saved every 2,500 steps under checkpoints/:

checkpoints/
├── 002500/ … 050000/   # 20 intermediate checkpoints
└── last -> 050000       # final checkpoint (recommended)

Use checkpoints/last/ for inference unless you have a reason to pick an earlier step.
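
If you do want to compare earlier steps, the directory names follow from the save interval. A minimal sketch, assuming the zero-padded six-digit naming shown in the tree above; checkpoint_names is a hypothetical helper and not part of LeRobot.

```python
def checkpoint_names(total_steps=50_000, interval=2_500, width=6):
    """Return the expected checkpoint directory names, oldest first.

    Assumes one directory per save interval, named by the zero-padded
    step count (e.g. "002500"), as in the tree shown above.
    """
    return [f"{step:0{width}d}"
            for step in range(interval, total_steps + 1, interval)]


names = checkpoint_names()
# Candidate steps for a real-robot comparison (25K, 40K, 50K).
candidates = [n for n in names if int(n) in (25_000, 40_000, 50_000)]
```

With the defaults this yields 20 names, matching the count in the tree.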

Usage

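# Requires a LeRobot release that ships the pi0.5 policy package.
from lerobot.policies.pi05.modeling_pi05 import PI05Policy

# Downloads the weights from the Hugging Face Hub on first use.
policy = PI05Policy.from_pretrained(
    "kunhsiang/pi05_so101_edurobot_sc1234_50k",
    revision="main",
)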

Alternatively, with lerobot-eval or the real-robot control scripts, point --policy.path at this repo.

Known caveats

  1. The dataset is small (~200 episodes, revisited for ~125 epochs). Late checkpoints are likely overfit; compare the 25K, 40K, and 50K checkpoints on the real robot before committing.
  2. Not yet validated on hardware. Compare against lerobot/pi05_base zero-shot before claiming improvement.
  3. SO-101 specific. Action and state dimensions follow the SO-101 (5-DOF arm + gripper, 6 actuated joints in total). The policy will not transfer to Franka or other embodiments.
  4. Do not use lerobot/pi05_libero_finetuned as a base for SO-101 — it hard-codes Franka 7-DOF shapes.
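
When comparing checkpoints or the zero-shot base on the real robot (caveats 1 and 2), a small number of trials gives noisy success rates. The sketch below computes a Wilson score interval for a success count so that two checkpoints are not ranked on a difference that falls inside the noise. It is a generic statistics helper, not part of LeRobot, and the name wilson_interval is hypothetical; no measured results are implied.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate.

    z=1.96 corresponds to roughly 95% coverage. Returns (low, high),
    both clamped to [0, 1].
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1.0 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z**2 / (4 * trials**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)
```

Call it once per checkpoint with the observed counts, for example wilson_interval(successes=k, trials=n), and treat overlapping intervals as "no clear winner yet".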

Citation & attribution

  • Dataset collection: Jason Dekarnegie (jedeka30)
  • Training / tooling: Kunhsiang (kunhsiang)
  • Project supervision: Prof. Lien, Intelligent Decision Lab
  • Base model: Physical Intelligence pi0.5 via Hugging Face LeRobot