Whisper Large V3 TR

whisper-large-v3-tr is a Turkish ASR model derived from openai/whisper-large-v3 and fine-tuned on public Turkish speech data.

This repository contains the full CTranslate2 float16 export for fast inference with faster-whisper and WhisperX. It is not a LoRA adapter; consumers can load this repository directly as a standalone inference model.

Intended Use

  • General-purpose Turkish speech recognition.
  • Turkish transcription with faster-whisper or WhisperX.
  • GPU inference with CTranslate2 float16.
  • Batch transcription pipelines that need a ready-to-use Turkish Whisper large-v3 model.

Model Details

| Item | Value |
| --- | --- |
| Base model | openai/whisper-large-v3 |
| Language | Turkish (tr) |
| Training method | LoRA fine-tuning, merged into the full model before CT2 export |
| Release format | CTranslate2 float16 |
| Recommended runtime | faster-whisper / WhisperX |
| Training rows | 241,546 |
| Training audio | 272.42 hours |
| Validation rows | 8,000 |
| Validation audio | 8.84 hours |
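
For a rough sense of clip length, the row counts and audio hours listed above can be combined. This is only arithmetic on the published figures; the variable names are illustrative and nothing here comes from the training code.

```python
# Figures copied from the Model Details table.
SECONDS_PER_HOUR = 3600
train_rows, train_hours = 241_546, 272.42
val_rows, val_hours = 8_000, 8.84

# Mean clip duration in seconds for each split.
train_avg_s = train_hours * SECONDS_PER_HOUR / train_rows  # about 4.06 s
val_avg_s = val_hours * SECONDS_PER_HOUR / val_rows        # about 3.98 s

# Share of all rows that were held out for validation.
val_share = val_rows / (train_rows + val_rows)             # about 3.2 %

print(f"train: {train_avg_s:.2f} s/clip")
print(f"validation: {val_avg_s:.2f} s/clip")
print(f"validation share of rows: {val_share:.1%}")
```

Both splits average roughly four seconds per clip, so the reported metrics mostly reflect short utterances rather than long-form audio.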

Training Data

The model was fine-tuned on public Turkish ASR data:

| Source | Role | Notes |
| --- | --- | --- |
| Mozilla Common Voice / Common Voice Scripted Speech 25.0 Turkish | General Turkish speech | Public Common Voice Turkish corpus |
| issai/Turkish_Speech_Corpus | General Turkish speech | Public Turkish speech corpus |

This general Turkish release was trained from the public sources listed above.

Evaluation

Held-out validation from the same public-source training pipeline:

| Metric | Value |
| --- | --- |
| WER | 0.0880 (8.80%) |
| CER | 0.0252 (2.52%) |

These numbers are from the project validation split and should not be treated as a universal benchmark. For production use, evaluate on your own audio distribution.
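
To evaluate on your own audio, WER and CER can be computed from reference/hypothesis text pairs. The sketch below is a minimal pure-Python version based on Levenshtein distance. It is not the project's evaluation script: the function names are illustrative, no text normalization is applied (the normalization behind the reported numbers is not stated here), and spaces count as characters for CER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(pairs, unit="word"):
    """Corpus-level error rate: total edits divided by total reference length.

    pairs: iterable of (reference, hypothesis) strings.
    unit:  "word" for WER, "char" for CER.
    """
    split = str.split if unit == "word" else list
    edits = 0
    ref_len = 0
    for reference, hypothesis in pairs:
        ref_units = split(reference)
        edits += edit_distance(ref_units, split(hypothesis))
        ref_len += len(ref_units)
    if ref_len == 0:
        raise ValueError("reference text is empty")
    return edits / ref_len


# One dropped word out of four reference words.
pairs = [("bugün hava çok güzel", "bugün hava güzel")]
print(error_rate(pairs, unit="word"))  # 0.25
```

Summing edits over the whole set before dividing weights long utterances more than short ones, which is the usual convention for corpus-level WER.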

faster-whisper Usage

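from faster_whisper import WhisperModel

# Load the CTranslate2 weights from this repository onto the GPU in float16.
model = WhisperModel(
    "oguzhangokboru/whisper-large-v3-tr",
    device="cuda",
    compute_type="float16",
)

# Fix the language to Turkish so language detection is skipped.
segments, info = model.transcribe(
    "audio.wav",
    language="tr",
    beam_size=5,
)

# segments is a generator: decoding happens as the loop consumes it.
for segment in segments:
    print(segment.start, segment.end, segment.text)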
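
For batch transcription pipelines, the same call can be wrapped in a loop over a folder. The helper below is a sketch, not part of this repository: `transcribe_folder`, its parameters, and the `*.wav` pattern are assumptions. It accepts any object with a faster-whisper style `transcribe` method.

```python
from pathlib import Path


def transcribe_folder(model, folder, pattern="*.wav", language="tr", beam_size=5):
    """Transcribe every matching file in `folder`; return {file name: text}.

    `model` is expected to behave like faster_whisper.WhisperModel:
    transcribe() returns (segments, info), where segments is an iterable
    of objects with a .text attribute.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        segments, _info = model.transcribe(
            str(path), language=language, beam_size=beam_size
        )
        # Consuming the generator is what actually runs the decoding.
        results[path.name] = " ".join(seg.text.strip() for seg in segments)
    return results


# Example (requires faster-whisper and a CUDA GPU):
# from faster_whisper import WhisperModel
# model = WhisperModel("oguzhangokboru/whisper-large-v3-tr",
#                      device="cuda", compute_type="float16")
# print(transcribe_folder(model, "recordings"))
```

Loading the model once and reusing it across files avoids paying the model load cost for every recording.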

WhisperX Usage

Use this repository as the faster-whisper / CTranslate2 model path in your WhisperX transcription pipeline.
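
A minimal sketch of what that can look like, assuming the standard WhisperX Python API (`whisperx.load_model`, `whisperx.load_audio`, and the pipeline's `transcribe`). The wrapper function name and the `batch_size=16` default are illustrative choices, not settings taken from this repository.

```python
def transcribe_with_whisperx(
    audio_path,
    model_id="oguzhangokboru/whisper-large-v3-tr",
    device="cuda",
    compute_type="float16",
    batch_size=16,  # assumed value; lower it if GPU memory is tight
):
    """Return WhisperX segments (dicts with start, end, text) for one file."""
    import whisperx  # imported lazily so the module loads without WhisperX installed

    # WhisperX passes the model id through to faster-whisper, so the
    # repository name can be used in place of a stock model size.
    model = whisperx.load_model(
        model_id, device, compute_type=compute_type, language="tr"
    )
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=batch_size, language="tr")
    return result["segments"]


# Example (requires WhisperX and a CUDA GPU):
# for seg in transcribe_with_whisperx("audio.wav"):
#     print(seg["start"], seg["end"], seg["text"])
```

Word-level alignment and diarization are separate WhisperX stages and are not covered by this sketch.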

Files

  • model.bin: CTranslate2 float16 model weights.
  • config.json: CTranslate2 model configuration.
  • tokenizer.json, vocabulary.json, tokenizer_config.json: tokenizer assets.
  • preprocessor_config.json: Whisper large-v3 feature extractor configuration.
  • processor_config.json: processor metadata.
  • training_summary.json: public training and export summary.
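
If you keep a local copy of the repository, a quick completeness check against the file list above can catch partial downloads. This is a sketch using only the standard library; `missing_files` and `EXPECTED_FILES` are hypothetical names, and the directory path in the example is a placeholder.

```python
from pathlib import Path

# File names taken from the Files list in this model card.
EXPECTED_FILES = (
    "model.bin",
    "config.json",
    "tokenizer.json",
    "vocabulary.json",
    "tokenizer_config.json",
    "preprocessor_config.json",
    "processor_config.json",
    "training_summary.json",
)


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Example with a placeholder path:
# missing = missing_files("/path/to/whisper-large-v3-tr")
# if missing:
#     print("Incomplete download, missing:", ", ".join(missing))
```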

Notes

  • This is a general Turkish ASR model, not a domain-specialized subtitle model.
  • Very noisy recordings, overlapping speech, music-heavy audio, and far-field speech can still require review.
  • The model inherits the broad behavior of Whisper large-v3 and should be evaluated on target production audio before deployment.