Reward Modeling Datasets
PKU-Alignment/PKU-SafeRLHF
164k rows • 11.9k downloads • 187 likes
openai/webgpt_comparisons
19.6k rows • 536 downloads • 241 likes
openai/summarize_from_feedback
194k rows • 1.64k downloads • 220 likes
HuggingFaceH4/ultrafeedback_binarized
187k rows • 14k downloads • 340 likes
HuggingFaceH4/stack-exchange-preferences
10.8M rows • 15.9k downloads • 134 likes
HuggingFaceH4/hhh_alignment
221 rows • 1.03k downloads • 24 likes
Birchlabs/openai-prm800k-stepwise-critic
1.09M rows • 253 downloads • 46 likes
prometheus-eval/Feedback-Collection
100k rows • 798 downloads • 120 likes
argilla/OpenHermesPreferences
989k rows • 709 downloads • 214 likes
Magpie-Align/Magpie-Pro-DPO-200K
207k rows • 7 downloads • 7 likes
argilla/magpie-ultra-v0.1
50k rows • 718 downloads • 222 likes