DS' Daily paper - a dkimds Collection

dkimds 's Collections

DS' Daily paper

DS' Daily paper

updated Jul 29, 2025

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 96
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31, 2024 • 68
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published May 30, 2024 • 24
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3, 2024 • 54
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2, 2024 • 33
Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning

Paper • 2406.00392 • Published Jun 1, 2024 • 14
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

Paper • 2406.01014 • Published Jun 3, 2024 • 33
Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4, 2024 • 41
Parrot: Multilingual Visual Instruction Tuning

Paper • 2406.02539 • Published Jun 4, 2024 • 36
Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 59
Large Language Model Confidence Estimation via Black-Box Access

Paper • 2406.04370 • Published Jun 1, 2024 • 22
CRAG -- Comprehensive RAG Benchmark

Paper • 2406.04744 • Published Jun 7, 2024 • 46
PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Paper • 2406.06282 • Published Jun 10, 2024 • 39
Are We Done with MMLU?

Paper • 2406.04127 • Published Jun 6, 2024 • 39
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Paper • 2406.07522 • Published Jun 11, 2024 • 40
Transformers meet Neural Algorithmic Reasoners

Paper • 2406.09308 • Published Jun 13, 2024 • 44
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 69
Bootstrapping Language Models with DPO Implicit Rewards

Paper • 2406.09760 • Published Jun 14, 2024 • 41
TroL: Traversal of Layers for Large Language and Vision Models

Paper • 2406.12246 • Published Jun 18, 2024 • 36
VoCo-LLaMA: Towards Vision Compression with Large Language Models

Paper • 2406.12275 • Published Jun 18, 2024 • 31
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 64
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

Paper • 2406.12624 • Published Jun 18, 2024 • 37
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 102
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Paper • 2308.01825 • Published Aug 3, 2023 • 23
SLiC-HF: Sequence Likelihood Calibration with Human Feedback

Paper • 2305.10425 • Published May 17, 2023 • 7
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Paper • 2410.01044 • Published Oct 1, 2024 • 35
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published Oct 30, 2024 • 55
LLaMo: Large Language Model-based Molecular Graph Assistant

Paper • 2411.00871 • Published Oct 31, 2024 • 22
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 11
Reflexion: Language Agents with Verbal Reinforcement Learning

Paper • 2303.11366 • Published Mar 20, 2023 • 7
Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 162
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 253
nablaNABLA: Neighborhood Adaptive Block-Level Attention

Paper • 2507.13546 • Published Jul 17, 2025 • 126