DS' Daily paper
Instruction Pre-Training: Language Models are Supervised Multitask
Learners
Paper
• 2406.14491
• Published • 96
Transformers are SSMs: Generalized Models and Efficient Algorithms
Through Structured State Space Duality
Paper
• 2405.21060
• Published • 68
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small
Reference Models
Paper
• 2405.20541
• Published • 24
MMLU-Pro: A More Robust and Challenging Multi-Task Language
Understanding Benchmark
Paper
• 2406.01574
• Published • 54
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
• 2406.00888
• Published • 33
Artificial Generational Intelligence: Cultural Accumulation in
Reinforcement Learning
Paper
• 2406.00392
• Published • 14
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
Navigation via Multi-Agent Collaboration
Paper
• 2406.01014
• Published • 33
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper
• 2406.02657
• Published • 41
Parrot: Multilingual Visual Instruction Tuning
Paper
• 2406.02539
• Published • 36
Mixture-of-Agents Enhances Large Language Model Capabilities
Paper
• 2406.04692
• Published • 59
Large Language Model Confidence Estimation via Black-Box Access
Paper
• 2406.04370
• Published • 22
CRAG -- Comprehensive RAG Benchmark
Paper
• 2406.04744
• Published • 46
PowerInfer-2: Fast Large Language Model Inference on a Smartphone
Paper
• 2406.06282
• Published • 39
Paper
• 2406.04127
• Published • 39
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context
Language Modeling
Paper
• 2406.07522
• Published • 40
Transformers meet Neural Algorithmic Reasoners
Paper
• 2406.09308
• Published • 44
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code
Intelligence
Paper
• 2406.11931
• Published • 69
Bootstrapping Language Models with DPO Implicit Rewards
Paper
• 2406.09760
• Published • 41
TroL: Traversal of Layers for Large Language and Vision Models
Paper
• 2406.12246
• Published • 36
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Paper
• 2406.12275
• Published • 31
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Paper
• 2406.15319
• Published • 64
Judging the Judges: Evaluating Alignment and Vulnerabilities in
LLMs-as-Judges
Paper
• 2406.12624
• Published • 37
The FineWeb Datasets: Decanting the Web for the Finest Text Data at
Scale
Paper
• 2406.17557
• Published • 102
Direct Preference Optimization: Your Language Model is Secretly a Reward
Model
Paper
• 2305.18290
• Published • 64
Scaling Relationship on Learning Mathematical Reasoning with Large
Language Models
Paper
• 2308.01825
• Published • 23
SLiC-HF: Sequence Likelihood Calibration with Human Feedback
Paper
• 2305.10425
• Published • 7
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning
Paper
• 2410.01044
• Published • 35
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation
Generation
Paper
• 2410.23090
• Published • 55
LLaMo: Large Language Model-based Molecular Graph Assistant
Paper
• 2411.00871
• Published • 22
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Framework
Paper
• 2308.08155
• Published • 11
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper
• 2303.11366
• Published • 7
Scaling RL to Long Videos
Paper
• 2507.07966
• Published • 162
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable
Reinforcement Learning
Paper
• 2507.01006
• Published • 253
∇NABLA: Neighborhood Adaptive Block-Level Attention
Paper
• 2507.13546
• Published • 126