ajinkyakolhe's Collections: Language Models - Essential Research Papers
Attention Is All You Need
Paper
• 1706.03762
• Published • 120
Language Models are Few-Shot Learners
Paper
• 2005.14165
• Published • 20
LLaMA: Open and Efficient Foundation Language Models
Paper
• 2302.13971
• Published • 23
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper
• 2307.09288
• Published • 251
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Paper
• 2505.09343
• Published • 76
Textbooks Are All You Need
Paper
• 2306.11644
• Published • 154
Textbooks Are All You Need II: phi-1.5 technical report
Paper
• 2309.05463
• Published • 90
Qwen2.5 Technical Report
Paper
• 2412.15115
• Published • 377
Qwen2.5-1M Technical Report
Paper
• 2501.15383
• Published • 72
Qwen2.5-VL Technical Report
Paper
• 2502.13923
• Published • 217
Qwen2.5-Omni Technical Report
Paper
• 2503.20215
• Published • 172
Qwen2.5-Coder Technical Report
Paper
• 2409.12186
• Published • 154
mHC: Manifold-Constrained Hyper-Connections
Paper
• 2512.24880
• Published • 322
Training language models to follow instructions with human feedback
Paper
• 2203.02155
• Published • 24
LoRA: Low-Rank Adaptation of Large Language Models
Paper
• 2106.09685
• Published • 60
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
Paper
• 2306.02707
• Published • 51
Knowledge Distillation of Large Language Models
Paper
• 2306.08543
• Published • 23