Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models Paper • 2601.18734 • Published Jan 26, 2026 • 5
Towards Active Synthetic Data Generation for Finetuning Language Models Paper • 2512.00884 • Published Nov 30, 2025 • 1
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities Paper • 2501.12147 • Published Jan 21, 2025 • 1
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation Paper • 2402.18191 • Published Feb 28, 2024 • 1
SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking Paper • 2406.10882 • Published Jun 16, 2024 • 2
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning Paper • 2505.07437 • Published May 12, 2025 • 1
The Best Instruction-Tuning Data are Those That Fit Paper • 2502.04194 • Published Feb 6, 2025 • 2
BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation Paper • 2502.01697 • Published Feb 3, 2025 • 1
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20, 2024 • 51
Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning Paper • 2506.11300 • Published Jun 12, 2025 • 2
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining Paper • 2305.10429 • Published May 17, 2023 • 5
Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness Paper • 2604.12373 • Published 3 days ago • 7
Accelerating Speculative Decoding with Block Diffusion Draft Trees Paper • 2604.12989 • Published 3 days ago • 5
BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation Paper • 2604.09497 • Published 7 days ago • 26
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 7 days ago • 27
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 3 days ago • 92
Can Small Training Runs Reliably Guide Data Curation? Rethinking Proxy-Model Practice Paper • 2512.24503 • Published 5 days ago • 1
Predicting LLM Reasoning Performance with Small Proxy Model Paper • 2509.21013 • Published Sep 25, 2025 • 6