indonlp/NusaX-MT
How to use nahiar/xlm-roberta-indonesian-languages with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="nahiar/xlm-roberta-indonesian-languages")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("nahiar/xlm-roberta-indonesian-languages")
model = AutoModelForSequenceClassification.from_pretrained("nahiar/xlm-roberta-indonesian-languages")

Fine-tuned XLM-RoBERTa model for identifying 11 Indonesian regional languages + English.
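When the tokenizer and model are loaded directly rather than through a pipeline, the model returns raw logits that still have to be turned into labels. The following is a minimal sketch of that step, not the author's own inference code. The helper names `softmax`, `rank_labels`, and `classify` are hypothetical, and the sketch assumes PyTorch is installed and that the checkpoint's config carries an `id2label` mapping, as sequence-classification checkpoints normally do.

```python
import math

MODEL_ID = "nahiar/xlm-roberta-indonesian-languages"


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label):
    """Pair each class probability with its label name, most likely first."""
    probs = softmax(logits)
    pairs = [(id2label[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def classify(text):
    """Run one text through the model and return ranked (label, probability) pairs.

    Hypothetical helper: the imports are deferred so the pure-Python
    helpers above can be used without torch or transformers installed.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return rank_labels(logits, model.config.id2label)
```

Calling `classify("Sugeng enjing, piye kabare?")` downloads the checkpoint on first use and returns every label with its probability, which is useful when the runner-up languages matter and not only the top prediction.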
from transformers import pipeline
# Load model
classifier = pipeline("text-classification", model="nahiar/xlm-roberta-indonesian-languages")
# Single prediction
result = classifier("Sugeng enjing, piye kabare?")
print(result)
# Output: [{'label': 'javanese', 'score': 0.9876}]
# Batch prediction
texts = [
    "Selamat pagi, apa kabar?",
    "Wilujeng enjing, kumaha damang?",
    "Good morning, how are you?"
]
results = classifier(texts)
for text, result in zip(texts, results):
    print(f"{text} -> {result['label']} ({result['score']:.4f})")
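Short or mixed-language inputs can produce low-confidence predictions, so it may help to apply a score threshold to the pipeline output before trusting a label. The sketch below is illustrative only: `label_or_unknown` is a hypothetical helper, the 0.80 threshold is an arbitrary assumption that should be tuned on held-out data, and the sample predictions are hand-written in the pipeline's output shape, not real model output (the `sundanese` label name in particular is assumed).

```python
def label_or_unknown(prediction, threshold=0.80):
    """Return the predicted label, or "unknown" if the score is below threshold.

    `prediction` is one dict from the pipeline output, e.g.
    {'label': 'javanese', 'score': 0.9876}. The default threshold is an
    arbitrary assumption, not a value recommended by the model author.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    return "unknown"


# Hand-written examples in the pipeline's output shape (not real model output).
sample_results = [
    {"label": "javanese", "score": 0.9876},
    {"label": "sundanese", "score": 0.5512},  # assumed label name
]

decisions = [label_or_unknown(r) for r in sample_results]
print(decisions)
# Output: ['javanese', 'unknown']
```

With a real pipeline, the same helper can be applied to each item of `classifier(texts)`.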
If you use this model, please cite:
@misc{indonesian-language-id,
author = {Raihan Hidayatullah Djunaedi},
title = {Indonesian Regional Languages Identifier},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/nahiar/xlm-roberta-indonesian-languages}
}
Base model: FacebookAI/xlm-roberta-base