---
license: apache-2.0
language:
- en
- es
- fr
- de
- zh
- ja
- ko
- ar
- pt
- ru
- hi
- multilingual
library_name: transformers
tags:
- text-classification
- feedback-detection
- user-satisfaction
- mmbert
- modernbert
- multilingual
- vllm-semantic-router
datasets:
- llm-semantic-router/feedback-detector-dataset
metrics:
- accuracy
- f1
base_model: jhu-clsp/mmBERT-base
pipeline_tag: text-classification
model-index:
- name: mmbert-feedback-detector-merged
  results:
  - task:
      type: text-classification
      name: User Feedback Classification
    dataset:
      name: feedback-detector-dataset
      type: llm-semantic-router/feedback-detector-dataset
    metrics:
    - type: accuracy
      value: 0.9689
      name: Accuracy
    - type: f1
      value: 0.9688
      name: F1 Macro
---

# mmBERT Feedback Detector (Merged)

A **multilingual** 4-class user feedback classifier built on [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base). The model classifies user responses into satisfaction categories to help understand user intent in conversational AI systems.

## Model Description

This is the **merged model** (LoRA weights merged into the base model) for direct inference without PEFT. For the LoRA adapter version, see [llm-semantic-router/mmbert-feedback-detector-lora](https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-lora).
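Merging folds each low-rank LoRA update back into its dense base weight, `W' = W + (alpha / r) * B @ A`, which is why the merged checkpoint needs no adapter machinery at inference time. A toy sketch of that arithmetic (shapes and names here are illustrative assumptions, not this model's actual modules):

```python
import torch

# Toy LoRA merge: fold a rank-r update into a dense weight.
# Dimensions, rank, and alpha are illustrative, not taken from this model.
d_out, d_in, r, alpha = 8, 8, 2, 4
W = torch.randn(d_out, d_in)   # frozen base weight
A = torch.randn(r, d_in)       # LoRA down-projection
B = torch.randn(d_out, r)      # LoRA up-projection

# The merge itself: W' = W + (alpha / r) * B A
W_merged = W + (alpha / r) * (B @ A)

# The merged weight reproduces the base-plus-adapter forward pass exactly.
x = torch.randn(d_in)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))
y_merged = W_merged @ x
assert torch.allclose(y_adapter, y_merged, atol=1e-5)
```

In practice this is what PEFT's `merge_and_unload()` performs per adapted module; the result loads like any plain `AutoModelForSequenceClassification`, as in the Usage examples below.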
### Labels

| Label | ID | Description |
|-------|-----|-------------|
| `SAT` | 0 | User is satisfied with the response |
| `NEED_CLARIFICATION` | 1 | User needs more explanation or clarification |
| `WRONG_ANSWER` | 2 | User indicates the response is incorrect |
| `WANT_DIFFERENT` | 3 | User wants alternative options or a different response |

## Performance

| Metric | Score |
|--------|-------|
| **Accuracy** | 96.89% |
| **F1 Macro** | 96.88% |
| **F1 Weighted** | 96.88% |

### Per-Class Performance

| Class | F1 Score |
|-------|----------|
| SAT | 100.0% |
| NEED_CLARIFICATION | 99.7% |
| WRONG_ANSWER | 94.0% |
| WANT_DIFFERENT | 93.8% |

## Multilingual Support

Thanks to mmBERT's multilingual pretraining (256k vocabulary, 100+ languages), this model achieves excellent cross-lingual transfer:

| Language | Accuracy |
|----------|----------|
| 🇺🇸 English | 100% |
| 🇪🇸 Spanish | 100% |
| 🇫🇷 French | 100% |
| 🇩🇪 German | 100% |
| 🇨🇳 Chinese | 100% |
| 🇯🇵 Japanese | 100% |
| 🇰🇷 Korean | 100% |
| 🇸🇦 Arabic | 100% |
| 🇵🇹 Portuguese | 100% |
| 🇷🇺 Russian | 100% |

## Usage

### With Transformers

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "llm-semantic-router/mmbert-feedback-detector-merged"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example: classify user feedback
text = "Thanks, that's exactly what I needed!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred = probs.argmax().item()

labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]
print(f"Prediction: {labels[pred]} ({probs[0][pred]:.1%})")
# Output: Prediction: SAT (100.0%)
```

### With Pipeline

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert-feedback-detector-merged"
)

# English
result = classifier("Thanks, that's helpful!")
print(result)  # [{'label': 'SAT', 'score': 0.999...}]

# Spanish (cross-lingual transfer)
result = classifier("¡Gracias, eso es muy útil!")
print(result)  # [{'label': 'SAT', 'score': 0.999...}]

# Chinese
result = classifier("谢谢,这很有帮助!")
print(result)  # [{'label': 'SAT', 'score': 0.98...}]
```

## Training Details

- **Base Model**: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base)
- **Method**: LoRA fine-tuning + merge
- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Epochs**: 5
- **Max Length**: 512
- **Dataset**: [llm-semantic-router/feedback-detector-dataset](https://huggingface.co/datasets/llm-semantic-router/feedback-detector-dataset)

## Use Cases

- **Conversational AI**: Understand whether users are satisfied with chatbot responses
- **Customer Support**: Route dissatisfied users to human agents
- **Quality Monitoring**: Track response quality across languages
- **Feedback Analysis**: Categorize user feedback automatically

## Related Models

- [llm-semantic-router/mmbert-feedback-detector-lora](https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-lora) - LoRA adapter version
- [llm-semantic-router/mmbert-intent-classifier-merged](https://huggingface.co/llm-semantic-router/mmbert-intent-classifier-merged) - Intent classification
- [llm-semantic-router/mmbert-fact-check-merged](https://huggingface.co/llm-semantic-router/mmbert-fact-check-merged) - Fact checking
- [llm-semantic-router/mmbert-jailbreak-detector-merged](https://huggingface.co/llm-semantic-router/mmbert-jailbreak-detector-merged) - Security

## Citation

```bibtex
@misc{mmbert-feedback-detector,
  title={mmBERT Feedback Detector},
  author={vLLM Semantic Router Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-merged}
}
```

## License

Apache 2.0
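The customer-support routing use case above amounts to a small post-processing step on the classifier's output. A minimal sketch of that gating logic, taking pipeline-style output dicts; the confidence threshold and route names are illustrative assumptions, not part of this model:

```python
# Minimal routing sketch for the "route dissatisfied users" use case.
# The 0.8 threshold and the route names are illustrative assumptions.
NEGATIVE = {"WRONG_ANSWER", "WANT_DIFFERENT"}

def route(prediction: dict, threshold: float = 0.8) -> str:
    """Map one pipeline output dict to a hypothetical routing decision."""
    label, score = prediction["label"], prediction["score"]
    if label in NEGATIVE and score >= threshold:
        return "human_agent"   # confidently dissatisfied -> escalate
    if label == "NEED_CLARIFICATION":
        return "clarify"       # ask the assistant to elaborate
    return "continue"          # satisfied, or low-confidence negative

# Example with a pipeline-style output dict:
print(route({"label": "WRONG_ANSWER", "score": 0.95}))  # -> human_agent
```

Feeding `classifier(...)[0]` from the pipeline example straight into `route` gives a per-turn escalation decision.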