Smaller model planned?
#8
by downtown1629 - opened
Thank you for releasing such a great model! Are there any plans for a smaller model based on the new cache-aware architecture? I'd appreciate a smaller model that keeps the punctuation and capitalization functionality.
Thank you for the feedback, @downtown1629! We don’t have plans for a smaller Nemotron-Speech-Streaming model at the moment, but this is very helpful input and definitely something we’re interested in exploring. One possible direction is a ~120M variant with a ~115M FastConformer cache-aware encoder and a single RNN-T layer, along the lines of parakeet_realtime_eou_120m-v1, while preserving punctuation and capitalization. We’ll keep this in mind as we consider future updates.
kunaldhawan changed discussion status to closed