---
tags:
  - onnx
  - openvino
  - speech-recognition
  - npu
  - parakeet
  - nvidia
  - nemo
language: en
license: apache-2.0
base_model: nvidia/parakeet-tdt-0.6b-v3
---

Parakeet TDT 0.6B v3 — ONNX (NPU-ready)

ONNX export of nvidia/parakeet-tdt-0.6b-v3 for use with OpenVINO on Intel NPU.

Includes the bundled NeMo mel spectrogram preprocessor for a self-contained pipeline.

Files

| File | Size | Description |
|---|---|---|
| … + … | ~2.5 GB | Conformer encoder (runs on NPU) |
| … | 73 MB | TDT joint decoder (runs on CPU) |
| … | 141 KB | Mel spectrogram preprocessor (onnxruntime CPU) |
| … | 94 KB | 8193-token vocabulary |
| … | 97 B | Model metadata |

Pipeline
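
The three stages listed under Files (mel preprocessor on CPU, Conformer encoder on NPU, TDT joint decoder on CPU) could be loaded roughly as sketched below. This is a minimal sketch, not the loader used by any particular application: `load_stages` and its path parameters are hypothetical, the file paths must be supplied by the caller, and running the decoder through OpenVINO (rather than onnxruntime) is an assumption.

```python
def load_stages(preprocessor_path, encoder_path, decoder_path, encoder_device="NPU"):
    """Load the three pipeline stages (hypothetical helper).

    All paths are caller-supplied; this sketch does not assume any file names.
    """
    # Imported lazily so this module can be imported without the runtimes installed.
    import onnxruntime as ort
    import openvino as ov

    # Mel spectrogram preprocessor: onnxruntime on CPU, as listed under Files.
    preprocessor = ort.InferenceSession(
        str(preprocessor_path), providers=["CPUExecutionProvider"]
    )

    core = ov.Core()
    # Conformer encoder: compiled for the Intel NPU by default.
    encoder = core.compile_model(core.read_model(str(encoder_path)), encoder_device)
    # TDT joint decoder: runs on CPU; using OpenVINO for it is an assumption.
    decoder = core.compile_model(core.read_model(str(decoder_path)), "CPU")
    return preprocessor, encoder, decoder
```

Passing `encoder_device="CPU"` is one way to try the same code on a machine without an NPU.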

Performance (Intel Core Ultra / Meteor Lake NPU)

| Metric | Value |
|---|---|
| Load time (cached) | 3.6 s |
| Transcribe 3 s audio | 0.29 s (RTF 0.095) |
| WER (LibriSpeech test-clean) | 3.7% |
| Max audio length | ~16 s (MEL_FRAMES=1600) |
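
The figures above can be cross-checked with a little arithmetic. The sketch below assumes a 10 ms mel hop, which is inferred from 1600 frames covering roughly 16 s and is not stated in this card; the function names are illustrative only.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time divided by audio duration (lower is faster)."""
    return processing_seconds / audio_seconds


def max_audio_seconds(mel_frames, hop_seconds=0.01):
    """Longest clip that fits a fixed frame budget.

    hop_seconds=0.01 (10 ms) is an assumption inferred from the table.
    """
    return mel_frames * hop_seconds


def fits_in_one_pass(audio_seconds, mel_frames=1600, hop_seconds=0.01):
    """True if a clip of this length fits within the fixed MEL_FRAMES budget."""
    return audio_seconds <= max_audio_seconds(mel_frames, hop_seconds)


rtf = real_time_factor(0.29, 3.0)   # rounded timings from the table
limit = max_audio_seconds(1600)     # MEL_FRAMES=1600
```

With the rounded timings from the table, 0.29 / 3 comes out at about 0.097, which is consistent with the reported 0.095 if the measured time was slightly under 0.29 s.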

Usage

Used by the npu-whisper dictation engine.

Credits