patrickcmd/qwen3-tts-salt-lug-0001
Single-speaker Luganda (lug) finetune of Qwen/Qwen3-TTS-12Hz-1.7B-Base on the salt_lug_0001 voice
from Sunbird/tts.
Training summary
- Base model: Qwen/Qwen3-TTS-12Hz-1.7B-Base
- Dataset: Sunbird/tts (config: lug), filtered to speaker_id == salt_lug_0001
- Splits used: train (n=2395), dev (n=50)
- Best dev loss: 6.1009 @ step 575
- Hardware: 1× RTX A6000 48GB
- MLflow run: https://mlflow.sunbird.ai/#/experiments/0/runs/8725586dacb741a0af33d80f24d90b92
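The speaker filter described in the summary can be sketched as follows. This is a minimal illustration, not the training script: the helper name `filter_speaker` and the toy rows are hypothetical, and only the `speaker_id` column name and the `salt_lug_0001` value come from this card. With the Hugging Face `datasets` library, the same predicate could presumably be passed to `Dataset.filter` after `load_dataset("Sunbird/tts", "lug")`, but that path is not exercised here.

```python
from typing import Dict, Iterable, Iterator

TARGET_SPEAKER = "salt_lug_0001"  # speaker id named in this card


def filter_speaker(
    rows: Iterable[Dict[str, str]],
    speaker_id: str = TARGET_SPEAKER,
) -> Iterator[Dict[str, str]]:
    """Yield only the rows whose speaker_id matches (hypothetical helper)."""
    for row in rows:
        if row.get("speaker_id") == speaker_id:
            yield row


# Toy rows invented for illustration; they are not taken from Sunbird/tts.
toy_rows = [
    {"speaker_id": "salt_lug_0001", "text": "Oli otya?"},
    {"speaker_id": "some_other_speaker", "text": "placeholder"},
    {"speaker_id": "salt_lug_0001", "text": "placeholder"},
]

kept = list(filter_speaker(toy_rows))
print(f"kept {len(kept)} of {len(toy_rows)} rows")
```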
Usage
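import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

# Load the finetuned checkpoint onto the first GPU in bfloat16.
# attn_implementation="flash_attention_2" requires the flash-attn package.
tts = Qwen3TTSModel.from_pretrained(
    "patrickcmd/qwen3-tts-salt-lug-0001",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# Synthesize one Luganda sentence with the single trained voice.
wavs, sr = tts.generate_custom_voice(
    text="Oli otya?",
    speaker="salt_lug_0001",
)

# Write the first generated waveform to disk at the returned sample rate.
sf.write("out.wav", wavs[0], sr)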
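The usage snippet hard-codes `attn_implementation="flash_attention_2"`, which fails to load when the `flash-attn` package is not installed. A possible workaround, sketched below, is to pick the value at runtime. The helper name `pick_attn_implementation` is hypothetical, and the `"sdpa"` fallback is a standard value in Hugging Face Transformers; that `Qwen3TTSModel.from_pretrained` accepts it is an assumption not confirmed by this card.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if flash-attn is importable, else "sdpa".

    Hypothetical helper; the "sdpa" fallback is assumed to be accepted
    by the model loader.
    """
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


attn_impl = pick_attn_implementation()
print(f"attn_implementation={attn_impl}")
```

The result could then be passed as `attn_implementation=attn_impl` in the `from_pretrained` call.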
Limitations
- Single speaker only (salt_lug_0001); voice cloning to other speakers is not the goal of this finetune.
- Trained for 1 epoch on a small subset; expect quality to vary on out-of-distribution lug text.