FastConformer Arabic ASR - Quran Minshawi (with Tashkeel)

This model is a fine-tuned version of nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0, specialized for Automatic Speech Recognition (ASR) of Quranic recitation by Sheikh Mohamed Siddiq El-Minshawi.

The model outputs Arabic text with full Tashkeel (diacritics), which makes it useful for Islamic AI applications, pronunciation assessment, and linguistic research.
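Downstream tools sometimes need the same text without diacritics, for example to compare the output against an undiacritized reference. The following is a minimal sketch of that step, not part of the model itself. It assumes "Tashkeel" means the Unicode marks U+064B to U+0652 plus the superscript alef U+0670; the helper name strip_tashkeel is hypothetical, and Quranic annotation signs are left untouched.

```python
import re

# Assumption: Tashkeel = fathatan..sukun (U+064B-U+0652) plus superscript alef (U+0670).
# Other Quranic annotation marks are deliberately not removed by this sketch.
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def strip_tashkeel(text: str) -> str:
    """Return the text with Arabic diacritics removed and base letters kept."""
    return TASHKEEL.sub("", text)


# Usage: print a diacritized phrase next to its bare-letter form.
diacritized = "بِسْمِ اللَّهِ"
print(diacritized)
print(strip_tashkeel(diacritized))
```

Using an explicit character range rather than a generic "remove all combining marks" rule keeps the behaviour predictable, because letters that carry a combining hamza or maddah are not altered.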

🎙️ Model Features

Architecture: FastConformer Hybrid Large (approx. 114M parameters)
Task: Automatic Speech Recognition (ASR)
Language: Arabic (ar-EG / Classical Quranic Arabic)
Specialization: Transcription of Quranic audio with full diacritization.

How to Use (NVIDIA NeMo)

Load the model and transcribe audio with the nemo_toolkit package.

Installation

pip install "nemo_toolkit[asr]"

Python Inference Script

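import nemo.collections.asr as nemo_asr

# 1. Load the model directly from Hugging Face
model = nemo_asr.models.ASRModel.from_pretrained("NightPrince/stt-ar-fastconformer-quran-minshawi")

# 2. Define the path to your audio file (must be a 16 kHz, mono .wav file)
audio_files = ["path_to_your_minshawi_audio.wav"]

# 3. Transcribe. The list is passed positionally because the keyword name
#    differs across NeMo releases (paths2audio_files in older ones, audio in newer ones).
transcriptions = model.transcribe(audio_files)

# Some older NeMo releases return a (best, all) tuple for hybrid models; unwrap it if so.
if isinstance(transcriptions, tuple):
    transcriptions = transcriptions[0]

# Newer releases return Hypothesis objects with a .text attribute; older ones return plain strings.
first = transcriptions[0]
print(f"Transcription: {getattr(first, 'text', first)}")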
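Since the script expects 16 kHz mono .wav input, it can help to check a file before transcribing it. The sketch below uses only Python's standard wave module; the function name check_wav_format is hypothetical, and it only reports mismatches rather than resampling.

```python
import wave


def check_wav_format(path: str, expected_rate: int = 16000) -> list:
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        if rate != expected_rate:
            problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
        if channels != 1:
            problems.append(f"file has {channels} channels, expected mono")
    return problems
```

A file that fails either check would need to be resampled or downmixed with an audio tool of your choice before being passed to model.transcribe.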

📊 Training Details

This model was fine-tuned to map acoustic features specifically to fully diacritized Arabic text.

Framework: PyTorch Lightning & NVIDIA NeMo
Base Model: FastConformer Hybrid Large PCD
Target Domain: Quranic Recitation (Minshawi)
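The data preparation for this fine-tune is not described here. As an illustration only, NeMo ASR training usually reads a JSON-lines manifest in which each line gives an audio path, a duration in seconds, and the target text. The sketch below shows how a diacritized transcript could be written in that format; the helper name manifest_line, the file name, and the duration are hypothetical.

```python
import json


def manifest_line(audio_filepath: str, duration: float, text: str) -> str:
    """Build one JSON line in the usual NeMo ASR manifest layout."""
    entry = {"audio_filepath": audio_filepath, "duration": duration, "text": text}
    # ensure_ascii=False keeps Arabic letters and diacritics as-is instead of \uXXXX escapes.
    return json.dumps(entry, ensure_ascii=False)


# Hypothetical example entry: the path and duration are placeholders.
print(manifest_line("clips/example_0001.wav", 4.2, "بِسْمِ اللَّهِ"))
```

Writing the text with ensure_ascii=False makes the manifest easier to inspect by eye, which matters when checking that diacritics survived preprocessing.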

