- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 17.9k • 1.43k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 77 • 17
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 63
Collections
Collections including paper arxiv:2504.16084
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 513
- When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
  Paper • 2510.07499 • Published • 49
- Improving Context Fidelity via Native Retrieval-Augmented Reasoning
  Paper • 2509.13683 • Published • 8
- Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
  Paper • 2509.00798 • Published • 1

- Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
  Paper • 2504.01990 • Published • 305
- InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
  Paper • 2504.10479 • Published • 308
- What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
  Paper • 2503.24235 • Published • 55
- Seedream 3.0 Technical Report
  Paper • 2504.11346 • Published • 70

- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 107
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 80
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 45

- SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
  Paper • 2504.08600 • Published • 33
- Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval
  Paper • 2509.21710 • Published • 19
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 122
- Agent Lightning: Train ANY AI Agents with Reinforcement Learning
  Paper • 2508.03680 • Published • 140

- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
  Paper • 2504.13837 • Published • 141
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 122
- What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
  Paper • 2503.24235 • Published • 55
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 190