FastConformer Arabic ASR - Quran Minshawi (with Tashkeel)
This model is a fine-tuned version of nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0, specialized for Automatic Speech Recognition (ASR) of Quranic recitation in the voice of Sheikh Mohamed Siddiq El-Minshawi.
The model outputs Arabic text with full Tashkeel (diacritics), which makes it useful for Islamic AI applications, pronunciation assessment, and linguistic research.
Model Features
- Architecture: FastConformer Hybrid Large (approx. 114M parameters)
- Task: Automatic Speech Recognition (ASR)
- Language: Arabic (ar-EG / Classical Quranic Arabic)
- Specialization: High-fidelity transcription of Quranic audio with exact diacritization.
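Because the output carries full Tashkeel, downstream evaluation may need to compare transcripts both with and without diacritics. The following is a minimal sketch, not part of NeMo or of this model: `strip_tashkeel` and `same_skeleton` are hypothetical helpers, and the diacritic set (U+064B to U+0652 plus superscript alef U+0670) is an assumption that may need extending for Uthmani-script annotation marks.

```python
import re

# Assumed diacritic set: fathatan through sukun (U+064B-U+0652) and
# superscript alef (U+0670). Uthmani-script text may contain further
# Quranic annotation marks that this pattern does not cover.
_TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def strip_tashkeel(text: str) -> str:
    """Return text with the assumed diacritic marks removed (hypothetical helper)."""
    return _TASHKEEL.sub("", text)


def same_skeleton(hypothesis: str, reference: str) -> bool:
    """True when the two strings match once diacritics are ignored."""
    return strip_tashkeel(hypothesis) == strip_tashkeel(reference)
```

Comparing the raw strings first and the stripped strings second helps separate diacritization errors from errors in the underlying letters.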
How to Use (NVIDIA NeMo)
Load the model and transcribe audio with NVIDIA NeMo (the nemo_toolkit package).
Installation
pip install "nemo_toolkit[asr]"
Python Inference Script
import nemo.collections.asr as nemo_asr
# 1. Load the model directly from Hugging Face
model = nemo_asr.models.ASRModel.from_pretrained("NightPrince/stt-ar-fastconformer-quran-minshawi")
# 2. Define the path to your audio file (Must be 16kHz, Mono, .wav)
audio_files = ["path_to_your_minshawi_audio.wav"]
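# 3. Transcribe. The list is passed positionally because newer NeMo releases
#    renamed the paths2audio_files keyword to audio.
transcription = model.transcribe(audio_files)
# Depending on the NeMo version, each result may be a plain string or a
# hypothesis object, so inspect the printed value before post-processing it.
print(f"Transcription: {transcription[0]}")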
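The script above expects 16 kHz mono .wav input. As a minimal sketch, the standard-library `wave` module can confirm this before transcription. `check_wav_format` is a hypothetical helper, not a NeMo function, and it only inspects the header of an uncompressed PCM WAV file; it does not resample or convert anything.

```python
import wave


def check_wav_format(path: str, expected_rate: int = 16000) -> list[str]:
    """Return a list of format problems; an empty list means the header looks usable."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        if rate != expected_rate:
            problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
        if channels != 1:
            problems.append(f"{channels} channels found, expected mono")
    return problems
```

Files that fail the check would need to be resampled or downmixed with an audio tool of your choice before being passed to the model.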
Training Details
This model was fine-tuned to map acoustic features specifically to fully diacritized Arabic text.
- Framework: PyTorch Lightning & NVIDIA NeMo
- Base Model: FastConformer Hybrid Large PCD
- Target Domain: Quranic Recitation (Minshawi)