---
model-index:
- name: poltextlab/media2-25-26-v1-1001
  results:
  - task:
      type: text-classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 71%
    - name: F1-Score
      type: f1
      value: 70%
tags:
- text-classification
- transformers
- roberta
metrics:
- accuracy
- f1_score
language:
- en
base_model:
- xlm-roberta-large
pipeline_tag: text-classification
library_name: transformers
license: cc-by-4.0
extra_gated_prompt: Our models are intended for academic projects and academic research
  only. If you are not affiliated with an academic institution, please reach out to
  us at huggingface [at] poltextlab [dot] com for further inquiry. If we cannot clearly
  determine your academic affiliation and use case based on your form data, your request
  may be rejected. Please allow us a few business days to manually review subscriptions.
extra_gated_fields:
  Country: country
  Institution: text
  Institution Email: text
  Full Name: text
  Please specify your academic project/use case you want to use the models for: text
---

# media2-25-26-v1-1001

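This is an `xlm-roberta-large` model fine-tuned for text classification of English text. Its labels follow the poltextLAB Media2 codebook, which is built on top of the Comparative Agendas Project (CAP) codebook.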


# How to use the model

```python
from transformers import AutoTokenizer, pipeline

# Load the tokenizer of the base model.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")

# The model repository is gated, so a Hugging Face access token is required.
pipe = pipeline(
    model="poltextlab/media2-25-26-v1-1001",
    task="text-classification",
    tokenizer=tokenizer,
    use_fast=False,
    token="<your_hf_read_only_token>"
)

# Classify a single text.
text = "<text_to_classify>"
pipe(text)
```
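
A text-classification pipeline returns one dictionary with a `label` and a `score` per input text. The sketch below shows one way to pair texts with their predictions and flag low-confidence ones. It runs on mock output, so the label strings, the scores, the `pair_predictions` helper, and the 0.5 threshold are illustrative assumptions, not values produced by this model.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def pair_predictions(
    texts: List[str],
    predictions: List[Prediction],
    min_score: float = 0.5,  # hypothetical threshold, tune for your use case
) -> List[Dict[str, object]]:
    """Pair each text with its predicted label and flag low-confidence results."""
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                "low_confidence": pred["score"] < min_score,
            }
        )
    return rows


# Mock output in the shape returned by pipe(texts); labels and scores are made up.
texts = ["first text", "second text"]
mock_predictions = [
    {"label": "LABEL_A", "score": 0.91},
    {"label": "LABEL_B", "score": 0.34},
]

rows = pair_predictions(texts, mock_predictions)
for row in rows:
    print(row)
```

With the real model, `mock_predictions` would be replaced by the result of `pipe(texts)`.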

# Classification Report

## Overall Performance

Evaluated on a test set of 1601 English samples.

* **Accuracy:** 71%
* **Macro Avg:** Precision: 0.67, Recall: 0.62, F1-score: 0.62
* **Weighted Avg:** Precision: 0.74, Recall: 0.71, F1-score: 0.70

## Per-Class Metrics

|   Label |   Precision |   Recall |   F1-score |   Support |
|--------:|------------:|---------:|-----------:|----------:|
|       1 |        0.77 |     0.80 |       0.78 |        50 |
|       2 |        0.74 |     0.78 |       0.76 |        50 |
|       3 |        0.74 |     0.74 |       0.74 |        50 |
|       4 |        0.70 |     0.86 |       0.77 |        50 |
|       5 |        0.86 |     0.76 |       0.81 |        50 |
|       6 |        0.83 |     0.98 |       0.90 |        50 |
|       7 |        0.85 |     0.88 |       0.86 |        50 |
|       8 |        0.87 |     0.94 |       0.90 |        50 |
|       9 |        0.87 |     0.82 |       0.85 |        50 |
|      10 |        0.77 |     0.94 |       0.85 |        50 |
|      12 |        0.56 |     0.88 |       0.69 |        50 |
|      13 |        0.88 |     0.86 |       0.87 |        50 |
|      14 |        0.73 |     0.76 |       0.75 |        50 |
|      15 |        0.51 |     0.86 |       0.64 |        50 |
|      16 |        0.75 |     0.86 |       0.80 |        50 |
|      17 |        0.63 |     0.76 |       0.69 |        50 |
|      18 |        0.91 |     0.82 |       0.86 |        50 |
|      19 |        0.51 |     0.82 |       0.63 |        50 |
|      20 |        0.62 |     0.92 |       0.74 |        50 |
|      21 |        0.75 |     0.80 |       0.78 |        50 |
|      23 |        0.52 |     0.78 |       0.62 |        50 |
|      24 |        0.71 |     0.57 |       0.63 |        42 |
|      25 |        0.92 |     0.48 |       0.63 |        23 |
|      26 |        0.92 |     0.56 |       0.70 |        43 |
|      27 |        0.00 |     0.00 |       0.00 |        18 |
|      28 |        0.00 |     0.00 |       0.00 |         9 |
|      29 |        0.43 |     0.27 |       0.33 |        33 |
|      30 |        0.72 |     0.28 |       0.41 |        46 |
|      31 |        0.89 |     0.44 |       0.59 |        36 |
|      32 |        0.00 |     0.00 |       0.00 |        20 |
|      33 |        0.12 |     0.08 |       0.10 |        12 |
|      34 |        0.07 |     0.14 |       0.10 |         7 |
|      35 |        0.93 |     0.71 |       0.81 |        35 |
|      36 |        0.00 |     0.00 |       0.00 |         3 |
|      37 |        1.00 |     0.82 |       0.90 |        44 |
|      38 |        0.81 |     0.81 |       0.81 |        42 |
|      39 |        1.00 |     0.39 |       0.57 |        33 |
|      40 |        0.88 |     0.21 |       0.34 |        33 |
|      41 |        1.00 |     0.78 |       0.88 |        32 |
|     998 |        0.92 |     0.55 |       0.69 |        40 |
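
The macro average weights every class equally, while the weighted average weights each class by its support. That is why the four classes with an F1-score of 0 (27, 28, 32 and 36), which all have small support, pull the macro F1-score well below the weighted one. As a minimal sketch, the two averages can be recomputed from the rounded per-class values in the table; the helper names `macro_f1` and `weighted_f1` are illustrative.

```python
# (label, F1-score, support) copied from the per-class table above.
PER_CLASS = [
    (1, 0.78, 50), (2, 0.76, 50), (3, 0.74, 50), (4, 0.77, 50),
    (5, 0.81, 50), (6, 0.90, 50), (7, 0.86, 50), (8, 0.90, 50),
    (9, 0.85, 50), (10, 0.85, 50), (12, 0.69, 50), (13, 0.87, 50),
    (14, 0.75, 50), (15, 0.64, 50), (16, 0.80, 50), (17, 0.69, 50),
    (18, 0.86, 50), (19, 0.63, 50), (20, 0.74, 50), (21, 0.78, 50),
    (23, 0.62, 50), (24, 0.63, 42), (25, 0.63, 23), (26, 0.70, 43),
    (27, 0.00, 18), (28, 0.00, 9), (29, 0.33, 33), (30, 0.41, 46),
    (31, 0.59, 36), (32, 0.00, 20), (33, 0.10, 12), (34, 0.10, 7),
    (35, 0.81, 35), (36, 0.00, 3), (37, 0.90, 44), (38, 0.81, 42),
    (39, 0.57, 33), (40, 0.34, 33), (41, 0.88, 32), (998, 0.69, 40),
]


def macro_f1(rows):
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1-scores, weighted by class support."""
    total = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total


total_support = sum(support for _, _, support in PER_CLASS)
macro = macro_f1(PER_CLASS)
weighted = weighted_f1(PER_CLASS)

print(f"classes: {len(PER_CLASS)}, samples: {total_support}")
print(f"macro F1: {macro:.2f}, weighted F1: {weighted:.2f}")
```

Up to rounding of the per-class values, this reproduces the reported macro and weighted F1-scores, and the supports sum to the 1601 test samples.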

# Inference platform
This model is used by the [CAP Babel Machine](https://babel.poltextlab.com), a free, open-source natural language processing tool designed to simplify and speed up comparative research projects.

# Cooperation
Model performance can be significantly improved by extending our training sets. We welcome submissions of CAP-coded corpora (of any domain and language), either by email at poltextlab{at}poltextlab{dot}com or through the [CAP Babel Machine](https://babel.poltextlab.com).

## Debugging and issues
This architecture uses the `sentencepiece` tokenizer. To run the model with a `transformers` version earlier than 4.27, you need to install `sentencepiece` manually.
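
A quick way to check for the dependency before loading the model is to ask Python whether the package can be found. This is a minimal sketch using only the standard library; the `missing_packages` helper is a hypothetical name, not part of `transformers`.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["sentencepiece"])
if missing:
    # Print an install hint instead of failing later, when the tokenizer loads.
    print("Install before loading the model: pip install " + " ".join(missing))
else:
    print("sentencepiece is available.")
```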