Collections

Discover the best community collections!

Collections including paper arxiv:2605.01428

meta-llama/Llama-3.2-1B

Text Generation • 1B • Updated Oct 24, 2024 • 1.65M • 2.45k
Hallucinations Undermine Trust; Metacognition is a Way Forward

Paper • 2605.01428 • Published May 2 • 24

Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

Paper • 2510.03259 • Published Sep 26, 2025 • 57
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published Oct 8, 2025 • 30
First Try Matters: Revisiting the Role of Reflection in Reasoning Models

Paper • 2510.08308 • Published Oct 9, 2025 • 24
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 76

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 31
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 15
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 45
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 24

Multi-agent cooperation through in-context co-player inference

Paper • 2602.16301 • Published Feb 18 • 24
TIDE: Every Layer Knows the Token Beneath the Context

Paper • 2605.06216 • Published May 7 • 11
Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 81
Hallucinations Undermine Trust; Metacognition is a Way Forward

Paper • 2605.01428 • Published May 2 • 24

Theory and Representation learning

I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23, 2025 • 31
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

Paper • 2508.05305 • Published Aug 7, 2025 • 48
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17
Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3, 2024 • 33
