Multi-Domain Reward Model FsfairX Llama-3-8B-Instruct

This is a multi-domain reward model built from sfairXC/FsfairX-LLaMA3-RM-v0.1. It predicts 23 fine-grained regression objectives spanning coherence, commonsense, empathy, and multicultural response quality, and uses a prompt-conditioned gating network to combine them into a single preference score.
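To make the gating idea concrete, here is a minimal pure-Python sketch of how per-objective scores might be mixed into one scalar. It assumes softmax-normalized gating weights, as is common in ArmoRM-style models; the function names and toy numbers are illustrative and are not taken from RewardModelWithGating.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gated_score(objective_scores, gate_logits):
    """Mix per-objective scores into one scalar using prompt-dependent weights."""
    if len(objective_scores) != len(gate_logits):
        raise ValueError("one gate logit is required per objective")
    weights = softmax(gate_logits)  # assumption: non-negative weights that sum to 1
    return sum(w * s for w, s in zip(weights, objective_scores))

# Toy example with 3 objectives instead of 23; all numbers are made up.
objective_scores = [0.2, 0.9, 0.5]
uniform = gated_score(objective_scores, [0.0, 0.0, 0.0])  # equal weights: plain average
peaked = gated_score(objective_scores, [0.0, 5.0, 0.0])   # gate favors the second objective
print(round(uniform, 4), round(peaked, 4))
```

In the real model the gate logits would come from a network conditioned on the prompt, so the same response can be weighted differently depending on what the prompt asks for.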

The checkpoint was packaged with the custom RewardModelWithGating architecture used in the Multi-Domain Reward Model project.

Project repository: Mario-RC/multi-domain-reward-model.

Intended Use

Use this model to score and compare assistant responses when the evaluation should account for multiple quality dimensions rather than a single generic helpfulness score. The primary use case is reward modeling or offline response ranking for chat-style data.
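Offline ranking can be sketched as scoring each candidate and sorting by score. The helper below is a hypothetical illustration: `rank_responses` and `toy_score` are not part of the project, and `toy_score` is a stand-in so the sketch runs without loading the 8B checkpoint. In practice `score_fn` would wrap a call to the reward model.

```python
from typing import Callable, List, Tuple

def rank_responses(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Return (score, response) pairs sorted from highest to lowest score."""
    scored = [(score_fn(prompt, response), response) for response in candidates]
    # Sort on the score only, so tied responses keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical stand-in scorer: counts words in the response.
def toy_score(prompt: str, response: str) -> float:
    return float(len(response.split()))

ranked = rank_responses(
    "I failed an important exam and feel awful.",
    ["Too bad.", "I'm sorry. That is a hard setback."],
    toy_score,
)
best_score, best_response = ranked[0]
print(best_response)
```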

Training Data

The model uses multi-objective scoring and preference data from:

Evaluation

Preference accuracy by domain:

Domain          Accuracy (%)
Coherence       85.8261
Commonsense     97.2511
Empathy         94.3606
Multicultural   76.2950
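Preference accuracy is usually computed as the share of (chosen, rejected) pairs in which the chosen response receives the higher score. The sketch below follows that definition; counting ties as errors is an assumption, since the card does not say how ties are handled, and the toy scores are made up rather than model outputs.

```python
from typing import Iterable, Tuple

def preference_accuracy(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Percentage of (chosen, rejected) pairs where the chosen response scores higher."""
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("at least one pair is required")
    # Assumption: a tie counts as a miss (strict greater-than).
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return 100.0 * correct / len(pairs)

# Made-up scores for illustration only; these are not results from the model.
toy_pairs = [(1.3, 0.2), (0.4, 0.9), (2.1, 1.7), (0.5, 0.5)]
accuracy = preference_accuracy(toy_pairs)
print(accuracy)
```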

Hugging Face Models

The packaged multi-domain reward models are available on Hugging Face under the mario-rc namespace:

Usage Example

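This checkpoint uses the project's custom RewardModelWithGating class, which is not part of transformers. Run the example from an environment where multidomain_model/modeling_custom.py is importable; the import below expects the multidomain_model directory itself to be on the Python path.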
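One way to make the module importable is to add its directory to `sys.path` before importing. This is a sketch under assumptions: the clone location `multi-domain-reward-model` is hypothetical, and the layout (a `multidomain_model` directory at the repository root) is inferred from the path mentioned above.

```python
import sys
from pathlib import Path

# Hypothetical location of a local clone of the project repository.
repo_root = Path("multi-domain-reward-model")
# Assumed layout: modeling_custom.py lives in <repo_root>/multidomain_model/.
module_dir = repo_root / "multidomain_model"

# Put the directory that contains modeling_custom.py on the import path.
if str(module_dir) not in sys.path:
    sys.path.insert(0, str(module_dir))
```

Installing the project as a package or setting `PYTHONPATH` would work as well; the `sys.path` approach is simply the least invasive for a one-off script.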

import torch
from transformers import AutoTokenizer
from modeling_custom import RewardModelWithGating

model_id = "mario-rc/multi-domain-rm-fsfairx-llama-3-8b-it"
# Use bfloat16 on the first GPU when CUDA is available; otherwise load in float32 on CPU.
dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
device_map = {"": 0} if torch.cuda.is_available() else None

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = RewardModelWithGating.from_pretrained(
    model_id,
    device_map=device_map,
    dtype=dtype,
).eval()
# Inputs must be moved to the same device as the model weights.
device = next(model.parameters()).device

messages = [
    {"role": "user", "content": "I failed an important exam and feel awful."},
    {"role": "assistant", "content": "I'm sorry. That is a hard setback, but it does not define your ability. Take a little time to recover, then we can make a concrete study plan for the next attempt."},
]

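# Render the conversation with the model's chat template and tokenize it.
encoded = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=4096,
)
# apply_chat_template may return a bare tensor of input IDs or a dict-like encoding; handle both.
inputs = {"input_ids": encoded.to(device)} if isinstance(encoded, torch.Tensor) else {
    key: value.to(device) for key, value in encoded.items()
}

# Forward pass without gradient tracking; .score holds the single gated preference score.
with torch.no_grad():
    score = model(**inputs).score.float().item()

print(score)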

Limitations

This is a reward model, not a standalone chat assistant. Scores are intended for relative comparison and should be calibrated for each downstream use case. The model inherits limitations from its base model and from the annotation coverage of the multi-domain datasets, especially for cultural contexts not represented in the evaluation data.
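Calibration can take many forms. One simple option is to standardize raw scores against a reference set drawn from the target use case, so that thresholds are expressed in standard deviations rather than raw reward units. The sketch below illustrates that option only; it is not a procedure described by the project, `fit_standardizer` is a hypothetical helper, and the reference scores are made up.

```python
import statistics
from typing import Callable, Sequence

def fit_standardizer(reference_scores: Sequence[float]) -> Callable[[float], float]:
    """Return a function that maps a raw score to its z-score on the reference set."""
    mean = statistics.fmean(reference_scores)
    std = statistics.pstdev(reference_scores)
    if std == 0:
        raise ValueError("reference scores must not all be identical")
    return lambda raw: (raw - mean) / std

# Made-up raw scores standing in for model outputs on a reference sample.
reference = [-1.0, 0.0, 1.0, 2.0, 3.0]
to_z = fit_standardizer(reference)
z_scores = [to_z(s) for s in reference]
print([round(z, 3) for z in z_scores])
```

Standardization preserves the ranking of responses, so it changes how scores are read and thresholded without altering pairwise comparisons.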

Credits

This model is based on the ArmoRM/RLHFlow reward-modeling approach and adapts it to custom multi-domain attributes for coherence, commonsense, empathy, and multicultural response quality.

Model size: 8B params (Safetensors). Tensor type: BF16.