This repository hosts a CTranslate2 conversion of the Hugging Face model
MAdel121/whisper-small-egyptian-arabic for use with faster-whisper.
Source model: MAdel121/whisper-small-egyptian-arabic
Original checkpoint: openai/whisper-small
Dataset: MAdel121/arabic-egy-cleaned
GPU usage:
from faster_whisper import WhisperModel
model = WhisperModel(
    "faster-whisper-small-egyptian-arabic",
    device="cuda",
    compute_type="float16",
)
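Once the model is loaded, transcription goes through `WhisperModel.transcribe`, which returns a lazy generator of segments together with an info object. The following is a minimal sketch rather than part of the original card: the helper names `format_segment` and `transcribe_file`, the audio path, and the `language="ar"` and `beam_size=5` settings are all illustrative assumptions.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as '[start -> end] text', times in seconds."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_file(
    audio_path: str,
    model_dir: str = "faster-whisper-small-egyptian-arabic",
    device: str = "cuda",
    compute_type: str = "float16",
) -> list[str]:
    """Transcribe one audio file and return formatted segment lines."""
    # Imported lazily so format_segment works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir, device=device, compute_type=compute_type)
    # language="ar" and beam_size=5 are assumed settings, not taken from the card.
    segments, _info = model.transcribe(audio_path, language="ar", beam_size=5)
    # `segments` is a generator: decoding runs while it is iterated.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Calling `transcribe_file("audio.wav")` (a hypothetical file name) would return one formatted line per segment. Because the segments are produced lazily, nothing is decoded until the list comprehension consumes the generator.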
CPU usage:
from faster_whisper import WhisperModel
model = WhisperModel(
    "faster-whisper-small-egyptian-arabic",
    device="cpu",
    compute_type="int8",
)
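When the same script has to run on machines with and without a GPU, the two configurations above can be chosen at runtime. This is a sketch under assumptions: `pick_settings` and `load_model` are hypothetical helper names, and GPU detection relies on `ctranslate2.get_cuda_device_count()`, which is one possible way to check for CUDA.

```python
def pick_settings(cuda_device_count: int) -> tuple[str, str]:
    """Return (device, compute_type): float16 on GPU, int8 on CPU."""
    if cuda_device_count > 0:
        return "cuda", "float16"
    return "cpu", "int8"


def load_model(model_dir: str = "faster-whisper-small-egyptian-arabic"):
    """Load the converted model with settings matching the available hardware."""
    # Lazy imports so pick_settings can be used without these packages installed.
    import ctranslate2
    from faster_whisper import WhisperModel

    device, compute_type = pick_settings(ctranslate2.get_cuda_device_count())
    return WhisperModel(model_dir, device=device, compute_type=compute_type)
```

Keeping the decision in a small pure function makes the fallback easy to test without any GPU or model files present.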
Converted with:
ct2-transformers-converter \
    --model whisper-small-egyptian-arabic \
    --output_dir faster-whisper-small-egyptian-arabic \
    --quantization float16 \
    --copy_files tokenizer.json preprocessor_config.json
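After conversion it can be worth confirming that the output directory holds the files the command was meant to produce. The check below is a sketch, not part of the conversion tooling: `missing_files` and `EXPECTED_FILES` are hypothetical names, and the list covers only `model.bin` (the weights file CTranslate2 writes) plus the two files named in `--copy_files`. The converter writes other files as well, which are not checked here.

```python
from pathlib import Path

# Assumed minimal set: converted weights plus the files passed to --copy_files.
EXPECTED_FILES = ("model.bin", "tokenizer.json", "preprocessor_config.json")


def missing_files(output_dir, expected=EXPECTED_FILES) -> list[str]:
    """Return the expected file names that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `missing_files("faster-whisper-small-egyptian-arabic")` suggests the conversion and file copy both completed; any names returned point to what needs to be regenerated or copied by hand.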
Please cite the original Whisper paper and dataset:
@article{radford2023robust,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  journal={arXiv preprint arXiv:2212.04356},
  year={2023}
}
@misc{adel_mohamed_2024_12860997,
  author = {Adel Mohamed},
  title = {MAdel121/arabic-egy-cleaned},
  month = jun,
  year = 2024,
  publisher = {Zenodo},
  doi = {10.5281/zenodo.12860997},
  url = {https://doi.org/10.5281/zenodo.12860997}
}
@misc{speechbrain,
  title={{SpeechBrain}: A General-Purpose Speech Toolkit},
  author={Ravanelli, Mirco and Parcollet, Titouan and Plantinga, Peter and Rouhe, Aku and Cornell, Samuele and Lugosch, Loren and Subakan, Cem and Dawalatabad, Nauman and Heba, Abdelwahab and Zhong, Jianyuan and Chou, Ju-Chieh and Yeh, Sung-Lin and Fu, Szu-Wei and Liao, Chien-Feng and Rastorgueva, Elena and Grondin, Francois and Aris, William and Na, Hwidong and Gao, Yan and De Mori, Renato and Bengio, Yoshua},
  year={2021},
  eprint={2106.04624},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}