license: cc-by-nc-4.0
language:
  - en
base_model:
  - microsoft/deberta-v3-base
model-index:
  - name: AI-Response-Comparer
    results:
      - task:
          type: text-classification
          name: Multi-class Preference Classification
        dataset:
          name: LLM Classification Finetuning (Kaggle)
          type: kaggle-llm-finetune
        metrics:
          - name: Multi-class Log Loss
            type: log_loss
            value: 1.0346
          - name: Accuracy
            type: accuracy
            value: 0.4893642026090022

Model Description

This model is a fine-tuned version of microsoft/deberta-v3-base for preference classification (reward modeling). Rather than assigning a label to a single text, it compares two AI-generated responses to the same prompt and predicts which one human raters preferred, or whether they judged the two a tie.
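As a rough illustration of the three-way output, the sketch below turns three raw scores (logits) into a probability triple and picks the most likely outcome. It uses only the standard library. The logit order (A, B, Tie), the `LABELS` tuple, and the example scores are assumptions made for illustration, not details confirmed by this card.

```python
import math

# Assumed class order for the three-way head. The card describes the output
# as [P(A), P(B), P(Tie)], but the index order of the logits is an assumption.
LABELS = ("Response A", "Response B", "Tie")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_preference(logits):
    """Return the probability triple and the label with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return probs, LABELS[best]


# Hypothetical logits for one (prompt, response A, response B) example.
probs, label = predict_preference([0.2, 1.1, -0.4])
print([round(p, 3) for p in probs], label)
```

In a real pipeline the logits would come from the fine-tuned classifier's forward pass; how the prompt and the two responses are packed into a single input is not specified here.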

License

Training scripts and source code are licensed under Apache-2.0.
Model weights are released under CC BY-NC 4.0 due to dataset licensing restrictions.

Dataset

Source: LLM Classification Finetuning (Kaggle)
Context: The dataset consists of "Chatbot Arena" style prompts and paired completions, labeled by human preference.
License: CC BY-NC 4.0 (Non-commercial use only).

Metrics

The model is evaluated using the following criteria, comparing the predicted probability distribution [P(A), P(B), P(Tie)] against the ground truth:

Multi-class Log Loss (Primary):
- Definition: The average cross-entropy between the ground-truth labels and the predicted probability distribution. $L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{i,j} \log(p_{i,j})$
- Variables: $N$ is the number of samples and $M = 3$ is the number of classes (Response A, Response B, and Tie); $y_{i,j}$ is 1 if sample $i$ has true class $j$ and 0 otherwise, and $p_{i,j}$ is the predicted probability that sample $i$ belongs to class $j$.
- Why: It rewards the model for assigning high probability to the correct outcome and heavily penalizes confident incorrect predictions.

Accuracy (Secondary):
- Definition: The fraction of samples where the class with the highest predicted probability matches the ground-truth label.
- Calculation: Correct Predictions / Total Samples.
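Both metrics can be computed in a few lines. The following is a minimal sketch, assuming labels are encoded as class indices (0 = A, 1 = B, 2 = Tie) and using a clipping constant `eps` to avoid `log(0)`; the encoding, the clipping value, and the toy data are illustrative choices, not the competition's exact implementation.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of class indices (assumed encoding: 0 = A, 1 = B, 2 = Tie).
    probs:  list of [P(A), P(B), P(Tie)] rows.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # With one-hot labels, the inner sum over classes reduces to the
        # log-probability of the true class. Clip to avoid log(0).
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(y_true)


def accuracy(y_true, probs):
    """Fraction of samples whose highest-probability class is the true class."""
    correct = sum(
        1
        for label, row in zip(y_true, probs)
        if max(range(len(row)), key=row.__getitem__) == label
    )
    return correct / len(y_true)


# Toy data: four samples, the third one is misclassified (argmax is B, truth is Tie).
y_true = [0, 1, 2, 0]
probs = [
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.30, 0.40, 0.30],
    [0.90, 0.05, 0.05],
]

loss = multiclass_log_loss(y_true, probs)
acc = accuracy(y_true, probs)
print(f"log loss: {loss:.4f}, accuracy: {acc:.2f}")
```

Note how the misclassified third sample contributes the largest single term to the loss, which is the penalty behaviour described in the list above.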

Evaluation Results

The following results were achieved during final evaluation. Note that Accuracy was calculated using a local train/test split, while Log Loss follows the competition's evaluation framework.

| Metric | Value | Source/Split |
| --- | --- | --- |
| Multi-class Log Loss | 1.0346 | Kaggle Competition Metric |
| Accuracy | 48.94% | Local Train/Test Split |

Note on Performance:

Log Loss: This score reflects how closely the predicted probabilities for the three classes (A, B, and Tie) match the human labels, as measured by the Kaggle competition metric.

Accuracy: This was monitored locally to ensure the model was successfully learning the preference patterns beyond a random baseline (33.33%).
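To put both numbers in context, the sketch below computes the baselines for an uninformed three-class predictor. A model that always outputs 1/3 per class has a log loss of ln 3 regardless of the label distribution, and uniform random guessing is correct one third of the time in expectation. The reported values are copied from the results above; keep in mind that they come from different splits, so the two comparisons are independent of each other.

```python
import math

NUM_CLASSES = 3  # Response A, Response B, Tie

# Always predicting 1/3 per class assigns probability 1/3 to the true label,
# so the log loss is -ln(1/3) = ln(3) for any label distribution.
uniform_log_loss = math.log(NUM_CLASSES)

# Expected accuracy of uniform random guessing.
random_accuracy = 1 / NUM_CLASSES

reported_log_loss = 1.0346  # Kaggle competition metric, from the results above
reported_accuracy = 0.4894  # local train/test split, from the results above

print(f"uniform log loss:          {uniform_log_loss:.4f}")  # 1.0986
print(f"log loss below uniform by: {uniform_log_loss - reported_log_loss:.4f}")
print(f"accuracy above random by:  {reported_accuracy - random_accuracy:.4f}")
```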

Acknowledgments & Attribution

Base Model: This work utilizes DeBERTa-v3-base, developed by Microsoft.
Dataset: Training data was provided by the LMSYS LLM Classification Finetuning competition on Kaggle.
License Notice: The model weights are subject to the CC BY-NC 4.0 license because of the underlying dataset. They are intended for non-commercial research and educational use only.