FastConformer Arabic ASR - Quran Minshawi (with Tashkeel)

This model is a fine-tuned version of nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0. It is specialized for Automatic Speech Recognition (ASR) of Quranic recitation by Sheikh Mohamed Siddiq El-Minshawi.

The model outputs Arabic text with full Tashkeel (diacritics), which makes it useful for Islamic AI applications, pronunciation assessment, and linguistic research.

🎙️ Model Features

  • Architecture: FastConformer Hybrid Large (approx. 114M parameters)
  • Task: Automatic Speech Recognition (ASR)
  • Language: Arabic (ar-EG / Classical Quranic Arabic)
  • Specialization: High-fidelity transcription of Quranic audio with exact diacritization.
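Because the output carries full Tashkeel, downstream code may sometimes need an undiacritized copy of the text, for example to compare against a plain-text reference. The following is a minimal sketch and not part of the model: `strip_tashkeel` and `TASHKEEL` are hypothetical names, and the sketch assumes that only the core marks U+064B through U+0652 should be removed. Quran-specific annotation marks and the superscript alef (U+0670) are deliberately left untouched.

```python
# Hypothetical helper, not part of the model or of NeMo.
# Core tashkeel marks: fathatan (U+064B) through sukun (U+0652).
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_tashkeel(text):
    """Return text with the core Arabic diacritics removed."""
    return "".join(ch for ch in text if ch not in TASHKEEL)


# Example input: beh + kasra, seen + sukun, meem + kasra
diacritized = "\u0628\u0650\u0633\u0652\u0645\u0650"
print(strip_tashkeel(diacritized))
```

Keeping the set of removed marks explicit makes it easy to widen or narrow, depending on which marks a given reference text uses.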

How to Use (NVIDIA NeMo)

You can load this model and transcribe audio with NVIDIA NeMo (the nemo_toolkit package).

Installation

pip install "nemo_toolkit[asr]"

Python Inference Script

import nemo.collections.asr as nemo_asr

# 1. Load the model directly from Hugging Face
model = nemo_asr.models.ASRModel.from_pretrained("NightPrince/stt-ar-fastconformer-quran-minshawi")

# 2. Define the path to your audio file (Must be 16kHz, Mono, .wav)
audio_files = ["path_to_your_minshawi_audio.wav"]

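# 3. Transcribe (the list is passed positionally because the keyword name differs between NeMo releases)
transcription = model.transcribe(audio_files)

# Depending on the NeMo version, each result is a plain string or an object with a .text attribute
result = transcription[0]
print(f"Transcription: {getattr(result, 'text', result)}")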
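The script above expects 16 kHz mono WAV input. A quick pre-flight check with Python's standard `wave` module can catch a mismatched file before it reaches the model. This is a minimal sketch: `check_wav`, `EXPECTED_RATE`, and `EXPECTED_CHANNELS` are hypothetical names, and the `wave` module reads only uncompressed PCM WAV files.

```python
import wave

EXPECTED_RATE = 16000   # Hz, per the input requirement stated above
EXPECTED_CHANNELS = 1   # mono


def check_wav(path):
    """Return (sample_rate, channels); raise ValueError if not 16 kHz mono."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_RATE or channels != EXPECTED_CHANNELS:
        raise ValueError(
            f"{path}: expected {EXPECTED_RATE} Hz mono, "
            f"got {rate} Hz with {channels} channel(s)"
        )
    return rate, channels
```

Calling `check_wav("path_to_your_minshawi_audio.wav")` before `model.transcribe(...)` fails fast with a readable message. Files that do not pass need to be resampled or downmixed with an audio tool of your choice first.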

📊 Training Details

This model was fine-tuned to map acoustic features specifically to fully diacritized Arabic text.

  • Framework: PyTorch Lightning & NVIDIA NeMo
  • Base Model: FastConformer Hybrid Large PCD
  • Target Domain: Quranic Recitation (Minshawi)