# NB-Whisper Medium — OpenVINO INT8
An OpenVINO IR + INT8 weight-quantized build of [NbAiLab/nb-whisper-medium](https://huggingface.co/NbAiLab/nb-whisper-medium), optimized for low-latency local inference with `openvino-genai` on Intel CPUs, integrated GPUs, and NPUs.

Intended for local Norwegian speech-to-text (e.g. push-to-talk dictation) on consumer hardware where the fp16/fp32 checkpoint would be too heavy.
## Relation to the upstream model
This is a derivative work. All training data, architecture, tokenizer, and
language coverage are inherited from NbAiLab/nb-whisper-medium. Only the
numeric representation has been altered.
Modifications:
- Exported to OpenVINO Intermediate Representation (IR) using Optimum.
- Post-training INT8 weight quantization via NNCF.
- Removed PyTorch / Flax / safetensors / ONNX / ggml files — only the OpenVINO runtime artifacts are retained.
No fine-tuning or retraining was performed. Accuracy is expected to closely track the upstream model at substantially lower memory and compute cost.
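The conversion described above can be reproduced with `optimum-cli` along these lines (the output directory name is illustrative, and exact flags may vary between Optimum Intel releases):

```shell
# Requires: pip install "optimum[openvino]"  (pulls in optimum-intel and nncf)
# Exports the upstream checkpoint to OpenVINO IR with INT8 weight quantization.
optimum-cli export openvino \
  --model NbAiLab/nb-whisper-medium \
  --weight-format int8 \
  nb-whisper-medium-ov-int8
```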
## Usage
Input must be 16 kHz mono float32 audio. The `device` argument may be `"CPU"`, `"GPU"`, or `"NPU"` depending on your OpenVINO runtime and hardware.
```python
import librosa
import openvino_genai as ov_genai

# Whisper expects 16 kHz mono float32 input.
audio, _ = librosa.load("clip.wav", sr=16000, mono=True)

pipe = ov_genai.WhisperPipeline("nb-whisper-medium-ov-int8", device="GPU")

cfg = pipe.get_generation_config()
cfg.language = "<|no|>"        # Bokmål. Also "<|nn|>" (Nynorsk), "<|en|>", etc.
cfg.task = "transcribe"        # or "translate"
cfg.return_timestamps = False

print(str(pipe.generate(audio, cfg)).strip())
```
Download the model directory (for example with `huggingface-cli download <user>/nb-whisper-medium-ov-int8 --local-dir ./nb-whisper-medium-ov-int8`) and point `WhisperPipeline` at the resulting path.
## License
Apache License 2.0 — same as the upstream model. See `LICENSE` and `NOTICE` in this repository.
## Citation & Contributors
The NB-Whisper Medium model is a product of the NoSTram project led by Per Egil Kummervold (@pere) at the National Library of Norway. Key contributors include Javier de la Rosa (@versae), Freddy Wetjen (@freddyw), and Rolv-Arild Braaten (@Rolv-Arild). NB AI-Lab, under the direction of Svein Arne Brygfjeld (@Brygfjeld), supported the project's successful completion. A detailed paper on their process and findings is forthcoming — contact ailab@nb.no for the latest citation information.
This OpenVINO / INT8 derivative was prepared and uploaded by the repository owner, who did not contribute to training the underlying model and claims no authorship over it. All credit for the science and engineering that produced NB-Whisper goes to NB AI-Lab and their collaborators.
## Bias, Risks, and Limitations
Carried over from the upstream model card:

> The models, especially the smaller ones, may exhibit occasional hallucinations and may drop parts of the transcript. They are designed to convert spoken language into grammatically correct written sentences, which might not always be word-for-word translations.
>
> Using these models without adequate risk assessment and mitigation could be considered irresponsible. They may contain biases or other undesirable distortions. Users who deploy these models or integrate them into systems or services are responsible for mitigating risks and complying with applicable AI regulations. The National Library of Norway, as the model owner, disclaims liability for any outcomes resulting from third-party use of these models.
The INT8 conversion may introduce additional small accuracy regressions
relative to the fp16/fp32 upstream release. Users who need maximum accuracy
should prefer the original NbAiLab/nb-whisper-medium.
## Acknowledgements
Original acknowledgements from the upstream model card:

> Our gratitude extends to Google TPU Research Cloud for training resources, Google Cloud for translation credits, and HuggingFace's Sanchit Gandhi for technical support. A special thank you to Per Erik Solberg at Språkbanken for the collaboration on the Stortinget corpus.