WhisperX Large-v3 — PRE-ACORDO

This model is a fine-tuned variant of `openai/whisper-large-v3` for European Portuguese (EP) automatic speech recognition (ASR), trained on around 425 hours of EP speech. It was developed as part of the CAMÕES work.

## 🧠 Model Description

- Base model: `openai/whisper-large-v3`
- Architecture: Transformer encoder–decoder
- Training: fine-tuned on around 800 hours of Portuguese speech
- Task: transcription (`task="transcribe"`)
- Compute type: `float16` (recommended)
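
The `float16` recommendation is most useful on a GPU. As a minimal sketch, device and compute type can be chosen together. The helper name `pick_runtime` is hypothetical (not part of WhisperX), and the `int8` fallback on CPU is an assumption based on CTranslate2-style backends generally not running `float16` efficiently on CPU.

```python
from typing import Tuple


def pick_runtime(cuda_available: bool) -> Tuple[str, str]:
    """Return a (device, compute_type) pair.

    Hypothetical helper: float16 on GPU as the model card recommends,
    int8 on CPU as an assumed fallback.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "int8"


if __name__ == "__main__":
    # torch is optional here; without it we simply assume no GPU.
    try:
        import torch
        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False
    device, compute_type = pick_runtime(has_cuda)
    print(device, compute_type)
```

The returned pair can be passed as `device` and `compute_type` when loading the model.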

## 🧩 Usage

import whisperx

device = "cuda"  # or "cpu"
compute_type = "float16"  # on CPU, use "int8" or "float32" instead

model = whisperx.load_model(
    "inesc-id/WhisperLv3-EP-X",
    device=device,
    compute_type=compute_type,
    language="pt",
    task="transcribe"
)
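
The snippet above only loads the model. A minimal sketch of going from an audio file to a plain transcript could look like the following. The function names `segments_to_text` and `transcribe_file`, the audio path, and `batch_size=16` are illustrative assumptions. The sketch also assumes the result is a dictionary with a `segments` list whose items carry a `text` field, as WhisperX returns.

```python
def segments_to_text(result: dict) -> str:
    """Join the per-segment texts of a WhisperX-style result into one string."""
    return " ".join(seg["text"].strip() for seg in result.get("segments", []))


def transcribe_file(audio_path: str,
                    device: str = "cuda",
                    compute_type: str = "float16") -> str:
    """Load the model, transcribe one file, and return the joined text."""
    # Imported lazily so segments_to_text works without WhisperX installed.
    import whisperx

    model = whisperx.load_model(
        "inesc-id/WhisperLv3-EP-X",
        device=device,
        compute_type=compute_type,
        language="pt",
        task="transcribe",
    )
    audio = whisperx.load_audio(audio_path)          # decoded to 16 kHz mono
    result = model.transcribe(audio, batch_size=16)  # batch size is an assumed value
    return segments_to_text(result)
```

Calling `transcribe_file("sample.wav")` (a placeholder path) would return the transcript as a single string. Lowering `batch_size` reduces GPU memory use.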

## Citation


**BibTeX:**

@inproceedings{camoes,
  title={{CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese}},
  author={Carlos Carvalho and Francisco Teixeira and Catarina Botelho and Anna Pompili and Rubén Solera-Ureña and Sérgio Paulo and Mariana Julião and Thomas Rolland and John Mendonça and Diogo Pereira and Isabel Trancoso and Alberto Abad},
  booktitle={Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  year={2025},
}

Paper: CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese (2508.19721, published Aug 27, 2025)