# NB-Whisper Medium — OpenVINO INT8
An OpenVINO IR + INT8 weight-quantized build of [NbAiLab/nb-whisper-medium](https://huggingface.co/NbAiLab/nb-whisper-medium), optimized for low-latency local inference with `openvino-genai` on Intel CPUs, integrated GPUs, and NPUs.

Intended for local Norwegian speech-to-text (e.g. push-to-talk dictation) on consumer hardware where the fp16/fp32 checkpoint would be too heavy.
## Relation to the upstream model
This is a derivative work. All training data, architecture, tokenizer, and
language coverage are inherited from NbAiLab/nb-whisper-medium. Only the
numeric representation has been altered.
Modifications:
- Exported to OpenVINO Intermediate Representation (IR) using Optimum.
- Post-training INT8 weight quantization via NNCF.
- Removed PyTorch / Flax / safetensors / ONNX / ggml files — only the OpenVINO runtime artifacts are retained.
No fine-tuning or retraining was performed. Accuracy is expected to closely track the upstream model at substantially lower memory and compute cost.
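The conversion described above can be reproduced with `optimum-cli` along these lines (the output directory name is illustrative, and exact flags may vary between Optimum Intel releases):

```shell
# Requires: pip install "optimum[openvino]"  (pulls in optimum-intel and nncf)
# Exports the upstream checkpoint to OpenVINO IR with INT8 weight quantization.
optimum-cli export openvino \
  --model NbAiLab/nb-whisper-medium \
  --weight-format int8 \
  nb-whisper-medium-ov-int8
```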
## Usage
Input must be 16 kHz mono float32 audio. The `device` argument may be `"CPU"`, `"GPU"`, or `"NPU"` depending on your OpenVINO runtime and hardware.
```python
import librosa
import openvino_genai as ov_genai

# Whisper expects 16 kHz mono float32 input.
audio, _ = librosa.load("clip.wav", sr=16000, mono=True)

pipe = ov_genai.WhisperPipeline("nb-whisper-medium-ov-int8", device="GPU")

cfg = pipe.get_generation_config()
cfg.language = "<|no|>"        # Bokmål. Also "<|nn|>" (Nynorsk), "<|en|>", etc.
cfg.task = "transcribe"        # or "translate"
cfg.return_timestamps = False

print(str(pipe.generate(audio, cfg)).strip())
```
Download the model directory (for example with `huggingface-cli download <user>/nb-whisper-medium-ov-int8 --local-dir ./nb-whisper-medium-ov-int8`) and point `WhisperPipeline` at the resulting path.
## License
Apache License 2.0 — same as the upstream model. See `LICENSE` and `NOTICE` in this repository.
## Citation & Contributors
The NB-Whisper Medium model is a product of the NoSTram project led by Per Egil Kummervold (@pere) at the National Library of Norway. Key contributors include Javier de la Rosa (@versae), Freddy Wetjen (@freddyw), and Rolv-Arild Braaten (@Rolv-Arild). NB AI-Lab, under the direction of Svein Arne Brygfjeld (@Brygfjeld), supported the project's successful completion. A detailed paper on their process and findings is forthcoming — contact ailab@nb.no for the latest citation information.
This OpenVINO / INT8 derivative was prepared and uploaded by the repository owner, who did not contribute to training the underlying model and claims no authorship over it. All credit for the science and engineering that produced NB-Whisper goes to NB AI-Lab and their collaborators.
## Bias, Risks, and Limitations
Carried over from the upstream model card:

> The models, especially the smaller ones, may exhibit occasional hallucinations and may drop parts of the transcript. They are designed to convert spoken language into grammatically correct written sentences, which might not always be word-for-word translations.
>
> Using these models without adequate risk assessment and mitigation could be considered irresponsible. They may contain biases or other undesirable distortions. Users who deploy these models or integrate them into systems or services are responsible for mitigating risks and complying with applicable AI regulations. The National Library of Norway, as the model owner, disclaims liability for any outcomes resulting from third-party use of these models.
The INT8 conversion may introduce additional small accuracy regressions
relative to the fp16/fp32 upstream release. Users who need maximum accuracy
should prefer the original NbAiLab/nb-whisper-medium.
## Acknowledgements
Original acknowledgements from the upstream model card:

> Our gratitude extends to Google TPU Research Cloud for training resources, Google Cloud for translation credits, and HuggingFace's Sanchit Gandhi for technical support. A special thank you to Per Erik Solberg at Språkbanken for the collaboration on the Stortinget corpus.