Poor word-level audio analysis performance

#2
by empeza - opened

The model doesn't seem to understand what "word-level" means and keeps returning sentence-level timestamps instead. I guess it wasn't trained on finer-grained audio analysis.
