Can we expect an ONNX quant?
#6
by SuperPauly - opened
A 4-bit or 8-bit quant would be awesome, and I'm not sure I'm smart enough to do a good job myself!
Any plans to release ONNX, GGUF, or other quants for this model?
Thanks. 🫶🏼
I'd also like to support this request; it would be super helpful.
Thank you for the feedback, @SuperPauly and @tmssmt. We don't have quantized versions to share at the moment, but this is great input and definitely something we're interested in exploring for future updates.
If anyone is interested: I exported it as ONNX two weeks ago.
Model files: https://huggingface.co/altunenes/parakeet-rs/tree/main/nemotron-speech-streaming-en-0.6b
Usage example: https://github.com/altunenes/parakeet-rs/blob/master/examples/streaming.rs