Bangla VITS TTS
A Bengali (Bangla) text-to-speech model trained with Coqui TTS using the VITS architecture. This repository is updated automatically every 2 epochs.
Training progress
| Checkpoint | Loss | Step | Epoch |
|---|---|---|---|
| 🏆 Best model | 16.9650 | 26,910 | 630 |
| 📦 Current model | 17.5934 | 28,980 | 922 |
Last updated: 2026-03-30 05:22 UTC
Repository structure
| Path | Contents |
|---|---|
| best_model/ | Checkpoint with the lowest validation loss seen so far |
| current_model/ | Most recent checkpoint (updated every 2 epochs) |
| training_state.json | Latest step, epoch, and both losses; used to resume training |
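
As a rough illustration of how training_state.json could be read when resuming, here is a minimal sketch. The key names (`step`, `epoch`, `best_loss`, `current_loss`) and the helper `load_training_state` are assumptions made for this example; the actual file may use different names, so inspect it before relying on them.

```python
import json
from pathlib import Path


def load_training_state(path: str = "training_state.json") -> dict:
    """Load the saved training state, or return defaults for a fresh run.

    The key names below are hypothetical; adjust them to match the real file.
    """
    state_file = Path(path)
    if not state_file.exists():
        # No saved state: start from scratch.
        return {"step": 0, "epoch": 0, "best_loss": None, "current_loss": None}
    with state_file.open(encoding="utf-8") as f:
        return json.load(f)
```

Calling `load_training_state()` from the repository root would then return a dictionary that a training script could use to pick up at the recorded step and epoch.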
Quick start
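from TTS.api import TTS
# Best (lowest-loss) checkpoint; replace <checkpoint> with the actual file name
tts = TTS(
    model_path="best_model/<checkpoint>.pth",
    config_path="best_model/config.json",
)
# Synthesize the sentence "I can speak Bangla." and write it to a WAV file
tts.tts_to_file(
    text="আমি বাংলায় কথা বলতে পারি।",
    file_path="output.wav",
)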
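from TTS.api import TTS
# Current (most recent) checkpoint; replace <checkpoint> with the actual file name
tts = TTS(
    model_path="current_model/<checkpoint>.pth",
    config_path="current_model/config.json",
)
# Synthesize the sentence "I can speak Bangla." and write it to a WAV file
tts.tts_to_file(
    text="আমি বাংলায় কথা বলতে পারি।",
    file_path="output.wav",
)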
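
Because the checkpoint file name is a placeholder in the snippets above, it can help to look it up at run time. The following is a minimal sketch that assumes each model directory contains exactly one `.pth` file; `find_checkpoint` is a hypothetical helper written for this example, not part of Coqui TTS.

```python
from pathlib import Path


def find_checkpoint(model_dir: str) -> Path:
    """Return the single .pth file in model_dir.

    Assumes exactly one checkpoint per directory; raises otherwise so that
    an ambiguous or empty directory is not silently accepted.
    """
    candidates = sorted(Path(model_dir).glob("*.pth"))
    if len(candidates) != 1:
        raise FileNotFoundError(
            f"expected exactly one .pth file in {model_dir!r}, found {len(candidates)}"
        )
    return candidates[0]
```

The result can then be passed to the loader, for example `TTS(model_path=str(find_checkpoint("best_model")), config_path="best_model/config.json")`.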
Training details
| Field | Value |
|---|---|
| Language | Bengali (bn) |
| Architecture | VITS |
| Framework | Coqui TTS |
| Sample rate | 22,050 Hz |
Checkpoint policy
current_model/ is replaced on every push (every 2 epochs). best_model/ is replaced only when the current loss is strictly lower than that of every previous checkpoint, so it always holds the lowest-loss checkpoint seen so far and is the safer choice for inference.
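
The "strictly lower" rule can be sketched in a few lines. This is an illustration of the policy as described, not the repository's actual upload script; `update_best` is a hypothetical name, and the loss sequence mixes the two values from the training progress table with made-up ones.

```python
from typing import Optional, Tuple


def update_best(current_loss: float, best_loss: Optional[float]) -> Tuple[float, bool]:
    """Apply the 'strictly lower' rule.

    Returns the best loss after this push and whether best_model/ would be
    replaced. A tie does not trigger a replacement.
    """
    if best_loss is None or current_loss < best_loss:
        return current_loss, True
    return best_loss, False


# Replay a hypothetical sequence of losses, one per push.
best = None
for loss in [18.2000, 16.9650, 17.5934, 16.9650]:
    best, replaced = update_best(loss, best)
    print(f"loss={loss:.4f} best={best:.4f} replaced={replaced}")
```

In this replay, only the first two pushes replace best_model/; the later, higher loss and the tie leave it untouched, while current_model/ would be overwritten on every push regardless.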
License
Apache 2.0 — see LICENSE for details.