CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models Paper • 2407.17467 • Published Jul 24, 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding Paper • 2408.14764 • Published Aug 27, 2024
Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model Paper • 2404.10306 • Published Apr 16, 2024
Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning Paper • 2507.20335 • Published Jul 27, 2025
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy Paper • 2508.05592 • Published Aug 7, 2025
ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning Paper • 2508.19996 • Published Aug 27, 2025