I updated the demo for the new version of the W2V-BERT model for Ukrainian speech recognition.
This is a classic automatic speech recognition (ASR), or speech-to-text, task.
What's new in version three:
• more data: 1200 hours
• new SentencePiece tokenizer with 512 tokens
• feature extraction is done via a Rust extension
Facts:
• Training was started from the previous model to speed up the learning process.
• Training runs on two RTX 3090 GPUs with 24 GB of memory each.
• The model is well suited for fine-tuning because the training data is very diverse and mostly noisy.
You can try it here:
Yehor/w2v-bert-uk-v3
Download weights here:
speech-uk/w2v-bert-v3
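A minimal sketch of fetching the weights from a script, assuming `speech-uk/w2v-bert-v3` is a regular Hugging Face Hub model repository and the `huggingface_hub` package is installed. The helper name `download_weights` is hypothetical, and the sketch only downloads the files; it does not show how to run inference.

```python
from typing import Optional

# Repository name as given in this post; assumed to be a Hub model repo.
REPO_ID = "speech-uk/w2v-bert-v3"


def download_weights(repo_id: str = REPO_ID, local_dir: Optional[str] = None) -> str:
    """Download all files of the repository and return the local folder path.

    Hypothetical helper for illustration; not part of the model's own tooling.
    """
    # Imported lazily so this module loads even if the package is missing.
    from huggingface_hub import snapshot_download

    # With local_dir=None the files go to the default Hub cache directory.
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example call (needs network access):
# path = download_weights()
```

Passing `local_dir` places the files in a folder of your choice instead of the default cache.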
If you wish to support the speech-uk initiative with a donation, here is the link to Monobank:
https://send.monobank.ua/jar/3Saxixsdua