---
license: apache-2.0
language:
- en
- es
- fr
- de
- zh
- ja
- ko
- ar
- pt
- ru
- hi
- multilingual
library_name: transformers
tags:
- text-classification
- feedback-detection
- user-satisfaction
- mmbert
- modernbert
- multilingual
- vllm-semantic-router
datasets:
- llm-semantic-router/feedback-detector-dataset
metrics:
- accuracy
- f1
base_model: jhu-clsp/mmBERT-base
pipeline_tag: text-classification
model-index:
- name: mmbert-feedback-detector-merged
  results:
  - task:
      type: text-classification
      name: User Feedback Classification
    dataset:
      name: feedback-detector-dataset
      type: llm-semantic-router/feedback-detector-dataset
    metrics:
    - type: accuracy
      value: 0.9689
      name: Accuracy
    - type: f1
      value: 0.9688
      name: F1 Macro
---

# mmBERT Feedback Detector (Merged)

A **multilingual** 4-class user feedback classifier built on [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base). This model classifies user responses into satisfaction categories to help understand user intent in conversational AI systems.

## Model Description

This is the **merged model** (LoRA weights merged into base model) for direct inference without PEFT. For the LoRA adapter version, see [llm-semantic-router/mmbert-feedback-detector-lora](https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-lora).

### Labels

| Label | ID | Description |
|-------|-----|-------------|
| `SAT` | 0 | User is satisfied with the response |
| `NEED_CLARIFICATION` | 1 | User needs more explanation or clarification |
| `WRONG_ANSWER` | 2 | User indicates the response is incorrect |
| `WANT_DIFFERENT` | 3 | User wants alternative options or a different response |
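The label/ID pairs above correspond to the `id2label` / `label2id` mappings shipped in the model config. As a plain-Python sketch (the dicts below are written out for illustration):

```python
# Label mapping matching the table above (same ids the model config uses).
ID2LABEL = {
    0: "SAT",
    1: "NEED_CLARIFICATION",
    2: "WRONG_ANSWER",
    3: "WANT_DIFFERENT",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(ID2LABEL[2])      # WRONG_ANSWER
print(LABEL2ID["SAT"])  # 0
```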

## Performance

| Metric | Score |
|--------|-------|
| **Accuracy** | 96.89% |
| **F1 Macro** | 96.88% |
| **F1 Weighted** | 96.88% |

### Per-Class Performance

| Class | F1 Score |
|-------|----------|
| SAT | 100.0% |
| NEED_CLARIFICATION | 99.7% |
| WRONG_ANSWER | 94.0% |
| WANT_DIFFERENT | 93.8% |

## Multilingual Support

Thanks to mmBERT's multilingual pretraining (256k vocabulary, 100+ languages), this model achieves excellent cross-lingual transfer:

| Language | Accuracy |
|----------|----------|
| 🇺🇸 English | 100% |
| 🇪🇸 Spanish | 100% |
| 🇫🇷 French | 100% |
| 🇩🇪 German | 100% |
| 🇨🇳 Chinese | 100% |
| 🇯🇵 Japanese | 100% |
| 🇰🇷 Korean | 100% |
| 🇸🇦 Arabic | 100% |
| 🇵🇹 Portuguese | 100% |
| 🇷🇺 Russian | 100% |

## Usage

### With Transformers

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "llm-semantic-router/mmbert-feedback-detector-merged"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example: Classify user feedback
text = "Thanks, that's exactly what I needed!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = probs.argmax().item()

labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]
print(f"Prediction: {labels[pred]} ({probs[0][pred]:.1%})")
# Output: Prediction: SAT (100.0%)
```

### With Pipeline

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert-feedback-detector-merged"
)

# English
result = classifier("Thanks, that's helpful!")
print(result)  # [{'label': 'SAT', 'score': 0.999...}]

# Spanish (cross-lingual transfer)
result = classifier("¡Gracias, eso es muy útil!")
print(result)  # [{'label': 'SAT', 'score': 0.999...}]

# Chinese
result = classifier("谢谢,这很有帮助!")
print(result)  # [{'label': 'SAT', 'score': 0.98...}]
```

## Training Details

- **Base Model**: [jhu-clsp/mmBERT-base](https://huggingface.co/jhu-clsp/mmBERT-base)
- **Method**: LoRA fine-tuning + merge
- **LoRA Rank**: 16
- **LoRA Alpha**: 32
- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Epochs**: 5
- **Max Length**: 512
- **Dataset**: [llm-semantic-router/feedback-detector-dataset](https://huggingface.co/datasets/llm-semantic-router/feedback-detector-dataset)
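For reference, a minimal sketch of how the LoRA hyperparameters above could be expressed with the `peft` library. This is illustrative only: `lora_dropout` and `target_modules` are assumptions (neither is stated in this card), and the actual training script may differ.

```python
from peft import LoraConfig, TaskType

# Sketch matching the hyperparameters listed above (rank 16, alpha 32).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,   # sequence classification head
    r=16,                         # LoRA rank (from the table above)
    lora_alpha=32,                # LoRA alpha (from the table above)
    lora_dropout=0.1,             # assumed; not stated in this card
    target_modules=["query", "value"],  # assumed attention projections
)
```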

## Use Cases

- **Conversational AI**: Understand if users are satisfied with chatbot responses
- **Customer Support**: Route dissatisfied users to human agents
- **Quality Monitoring**: Track response quality across languages
- **Feedback Analysis**: Categorize user feedback automatically
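As a toy illustration of the "route dissatisfied users" use case, here is a hypothetical routing policy built on the classifier's `(label, score)` output. The function, threshold, and action names are all assumptions for the sketch, not part of the model:

```python
def route_feedback(label: str, score: float, threshold: float = 0.8) -> str:
    """Toy routing policy: map a feedback prediction to a next action."""
    if score < threshold:
        return "ask_followup"        # low confidence: ask the user to clarify
    if label == "SAT":
        return "close_ticket"        # satisfied: nothing more to do
    if label == "NEED_CLARIFICATION":
        return "expand_answer"       # elaborate on the previous response
    if label == "WRONG_ANSWER":
        return "escalate_to_human"   # dissatisfied: hand off to an agent
    return "offer_alternatives"      # WANT_DIFFERENT

print(route_feedback("WRONG_ANSWER", 0.95))  # escalate_to_human
print(route_feedback("SAT", 0.55))           # ask_followup
```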

## Related Models

- [llm-semantic-router/mmbert-feedback-detector-lora](https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-lora) - LoRA adapter version
- [llm-semantic-router/mmbert-intent-classifier-merged](https://huggingface.co/llm-semantic-router/mmbert-intent-classifier-merged) - Intent classification
- [llm-semantic-router/mmbert-fact-check-merged](https://huggingface.co/llm-semantic-router/mmbert-fact-check-merged) - Fact checking
- [llm-semantic-router/mmbert-jailbreak-detector-merged](https://huggingface.co/llm-semantic-router/mmbert-jailbreak-detector-merged) - Security

## Citation

```bibtex
@misc{mmbert-feedback-detector,
  title={mmBERT Feedback Detector},
  author={vLLM Semantic Router Team},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert-feedback-detector-merged}
}
```

## License

Apache 2.0