fangtongen's Collections
- The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only (arXiv:2306.01116)
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (arXiv:2205.14135)
- RoFormer: Enhanced Transformer with Rotary Position Embedding (arXiv:2104.09864)
- Language Models are Few-Shot Learners (arXiv:2005.14165)
- The Pile: An 800GB Dataset of Diverse Text for Language Modeling (arXiv:2101.00027)
- Fast Transformer Decoding: One Write-Head is All You Need (arXiv:1911.02150)
- Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288)
- LLaMA: Open and Efficient Foundation Language Models (arXiv:2302.13971)
- Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (arXiv:2306.02707)
- BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation (arXiv:2402.03216)
- arXiv:2310.06825