AmanPriyanshu's Collections

Stratified-Datasets (100K-1M) [Pre-Training, IF, Reasoning]

updated Oct 8, 2025

Diverse datasets on pre-training, instruction-following, and reasoning

  • AmanPriyanshu/stratified-kmeans-diverse-instruction-following-100K-1M

    Viewer • Updated Oct 4, 2025 • 1.9M • 18

  • AmanPriyanshu/stratified-kmeans-diverse-pretraining-100K-1M

    Viewer • Updated Oct 4, 2025 • 2.22M • 64 • 1

  • AmanPriyanshu/stratified-kmeans-diverse-reasoning-100K-1M

    Viewer • Updated Oct 4, 2025 • 1.9M • 37

  • AmanPriyanshu/rlvr-guru-raw-data-extended

    Viewer • Updated Oct 20, 2025 • 226k • 113