ezetimibe

company

Recent Activity

zhouxiangxin authored a paper 2 days ago

Rethinking the Divergence Regularization in LLM RL

zhouxiangxin authored a paper 2 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

zhouxiangxin authored a paper 2 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

authored 3 papers 2 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 6 days ago • 32

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 5 days ago • 40

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 5 days ago • 41

submitted 2 papers to Daily Papers 3 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 5 days ago • 41

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 6 days ago • 32

authored a paper 4 months ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published Feb 4 • 37

authored a paper 7 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 32

authored a paper 8 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 92

authored a paper 9 months ago

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26, 2025 • 69

authored a paper about 1 year ago

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 27

authored a paper almost 2 years ago

ProteinBench: A Holistic Evaluation of Protein Foundation Models

Paper • 2409.06744 • Published Sep 10, 2024 • 8