Bangla VITS TTS

A Bengali (Bangla) text-to-speech model trained with Coqui TTS using the VITS architecture. This repository is updated automatically every 2 epochs.

Training progress

| Checkpoint | Loss | Step | Epoch |
| --- | --- | --- | --- |
| 🏆 Best model | `16.9650` | `26,910` | `630` |
| 📦 Current model | `17.5934` | `28,980` | `922` |

Last updated: 2026-03-30 05:22 UTC

Repository structure

| Path | Contents |
| --- | --- |
| `best_model/` | Checkpoint with the lowest validation loss seen so far |
| `current_model/` | Most recent checkpoint (updated every 2 epochs) |
| `training_state.json` | Latest step, epoch, and both losses; used to resume training |
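
To check where training stands without loading a model, `training_state.json` can be read directly. The sketch below is a minimal illustration that assumes only that the file is a JSON object; its key names are not documented here, so the code lists whatever top-level fields it finds instead of guessing them. The helper names `load_training_state` and `describe_state` are hypothetical.

```python
import json
from pathlib import Path


def load_training_state(path="training_state.json"):
    """Return the parsed training state as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def describe_state(state):
    """Format every top-level field as 'key: value', one field per line."""
    # Field names are not assumed; we print whatever the file contains.
    return "\n".join(f"{key}: {value}" for key, value in state.items())
```

Calling `print(describe_state(load_training_state()))` from the repository root prints the stored fields.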

Quick start

from TTS.api import TTS

# Best (lowest-loss) checkpoint.
# Replace <checkpoint> with the name of the .pth file in best_model/.
tts = TTS(
    model_path="best_model/<checkpoint>.pth",
    config_path="best_model/config.json",
)
tts.tts_to_file(
    text="আমি বাংলায় কথা বলতে পারি।",
    file_path="output.wav",
)

from TTS.api import TTS

# Current (most recent) checkpoint
tts = TTS(
    model_path="current_model/<checkpoint>.pth",
    config_path="current_model/config.json",
)
tts.tts_to_file(
    text="আমি বাংলায় কথা বলতে পারি।",
    file_path="output.wav",
)
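
Because the checkpoint file name is left as a placeholder in the snippets above, it can help to locate the `.pth` file programmatically. This is a minimal sketch, assuming each model directory holds one or more `.pth` files and that the most recently modified one is the one you want; `find_checkpoint` is a hypothetical helper, not part of Coqui TTS.

```python
from pathlib import Path


def find_checkpoint(model_dir):
    """Return the most recently modified .pth file in model_dir."""
    # Sort by modification time so the newest file comes last.
    candidates = sorted(
        Path(model_dir).glob("*.pth"),
        key=lambda p: p.stat().st_mtime,
    )
    if not candidates:
        raise FileNotFoundError(f"no .pth checkpoint found in {model_dir}")
    return candidates[-1]
```

The result can then be passed as `model_path=str(find_checkpoint("best_model"))` when constructing `TTS`.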

Training details

| Field | Value |
| --- | --- |
| Language | Bengali (`bn`) |
| Architecture | VITS |
| Framework | Coqui TTS |
| Sample rate | 22,050 Hz |

Checkpoint policy

`current_model/` is replaced on every push (every 2 epochs).
`best_model/` is replaced only when the current loss is strictly lower than the loss of every previous checkpoint, so it always holds the lowest-loss checkpoint seen so far. Prefer it for inference.
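
The policy above can be sketched as a small decision function. This is an illustration of the rule as described, not the actual training script; `directories_to_replace` is a hypothetical name, and treating a missing best loss (`None`) as "no checkpoint saved yet" is an assumption.

```python
def directories_to_replace(current_loss, best_loss):
    """Return the repository directories a push should overwrite.

    best_loss is None when no best checkpoint exists yet (assumption).
    """
    # current_model/ is overwritten on every push.
    targets = ["current_model/"]
    # best_model/ changes only on a strict improvement; a tie keeps the old one.
    if best_loss is None or current_loss < best_loss:
        targets.append("best_model/")
    return targets


# With the losses from the training progress table, only current_model/ changes.
print(directories_to_replace(current_loss=17.5934, best_loss=16.9650))
# → ['current_model/']
```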

License

Apache 2.0 — see LICENSE for details.
