Models (289)

Active filters: reward-model

mradermacher/ThinkPRM-14B-GGUF

15B • Updated Jul 31, 2025 • 64

mradermacher/ThinkPRM-14B-i1-GGUF

15B • Updated Jul 11, 2025 • 227

mradermacher/ThinkPRM-1.5B-GGUF

2B • Updated Jul 11, 2025 • 116

ilgee/Binary-Think-RM-8B

8B • Updated Nov 2, 2025 • 4

ilgee/Multiclass-Think-RM-8B

8B • Updated Nov 2, 2025 • 3

launch/ThinkPRM-7B

Text Generation • 8B • Updated May 17, 2025 • 48 • 1

mradermacher/ThinkPRM-7B-GGUF

8B • Updated Jul 11, 2025 • 138

mradermacher/ThinkPRM-7B-i1-GGUF

8B • Updated Jul 11, 2025 • 338

Huanghz/align2llava-7b-lora-question

Updated May 21, 2025 • 2

Huanghz/align2llava-7b-lora-answer

Updated May 21, 2025 • 3

nvidia/Qwen-2.5-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26, 2025 • 17 • 3

nvidia/Qwen-3-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26, 2025 • 86 • 20

zhuohaoyu/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5, 2025 • 10 • 4

mradermacher/RewardAnything-8B-v1-GGUF

8B • Updated Jul 11, 2025 • 42

WisdomShell/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5, 2025 • 1.57k • 23

Skywork/Skywork-Reward-V2-Qwen3-8B

Text Classification • 8B • Updated Jul 6, 2025 • 12.4k • 24

ContextualAI/ctx-bird-reward-250121

Text Generation • 33B • Updated Dec 2, 2025 • 90 • 7

Bifrost-AI/Qwen-3-Nemotron-32B-Reward-F16

Text Classification • 32B • Updated Jul 11, 2025 • 3

tensorblock/WisdomShell_RewardAnything-8B-v1-GGUF

Text Generation • 8B • Updated Jan 27 • 1

ulab-ai/sotopia-rl-qwen2.5-7B-rm

Feature Extraction • Updated Aug 7, 2025 • 1

ilgee/Binary-Think-RM-3B

3B • Updated Nov 2, 2025 • 16 • 1

gandhiraketla277/demo-lora-reward-model

Text Generation • Updated Aug 10, 2025

Schrieffer/Llama-SARM-4B

Reinforcement Learning • 5B • Updated Dec 11, 2025 • 46 • 2

phuongntc/Multi_EvalSumViet2

Summarization • 0.2B • Updated Apr 26 • 5

ykorkmaz/rfm_no_failure

4B • Updated Aug 30, 2025 • 8

abraranwar/spur_metaworld

4B • Updated Aug 31, 2025 • 5

ykorkmaz/rfm_progress_only

4B • Updated Sep 1, 2025 • 6

kewu93/skywork-medarena-lora-v1

Updated Sep 18, 2025 • 2

kewu93/skywork-medarena-lora-v2

Text Classification • Updated Sep 18, 2025 • 3

nabeelshan/rlhf-gpt2-pipeline

Text Generation • Updated Sep 24, 2025