Update README.md
README.md CHANGED
@@ -47,14 +47,6 @@ This is a fine-tuned version of [openai/whisper-small](https://huggingface.co/op
 - **Training Data:** [ai4bharat/Kathbath](https://huggingface.co/datasets/ai4bharat/Kathbath)
 - **Fine-tuning Framework:** Transformers + Custom DALI Pipeline
 
-## Training Details
-
-The model was fine-tuned on the Kathbath Hindi dataset with the following configuration:
-- **Epochs:** 3
-- **Batch Size:** 4 (effective: 16 with gradient accumulation)
-- **Learning Rate:** 1e-5
-- **Mixed Precision:** FP16
-- **Gradient Checkpointing:** Enabled
 
 ## Evaluation Results
 
@@ -109,22 +101,6 @@ print(result["text"])
 - Best performance on clear audio with minimal background noise
 - May struggle with very fast speech or heavy code-mixing
 
-## Citation
-
-If you use this model, please cite:
 
-```bibtex
-@misc{vanshnawander_whisper_small_hindi_asr},
-author = {AI4Bharat},
-title = {vanshnawander/whisper-small-hindi-asr},
-year = {2024},
-publisher = {HuggingFace},
-url = {https://huggingface.co/vanshnawander/whisper-small-hindi-asr}
-}
 ```
 
-## Acknowledgments
-
-- [OpenAI Whisper](https://github.com/openai/whisper) for the base model
-- [AI4Bharat](https://ai4bharat.iitm.ac.in/) for the Kathbath and LAHAJA datasets
-- [Hugging Face](https://huggingface.co/) for the transformers library