arxiv:2602.22190
Qianhui WU
qianhuiwu
AI & ML interests
None yet
Recent Activity
upvoted a paper about 3 hours ago
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts
updated a dataset 8 days ago
OpenWebRL/OpenWebRL-RL-Tasks
updated a dataset 9 days ago
OpenWebRL/OpenWebRL-SFT-Trajectories