NbAiLab / nb-asr-beta-qwen06b-lunde03-verbatim
Norwegian ASR checkpoint for the NB-ASR beta program
This repository contains an NB-ASR beta checkpoint based on Qwen3-ASR-0.6B, adapted by NbAiLab for Norwegian speech recognition evaluation and deployment testing.
Internal reference: lunde03-verbatim
Uploaded: 31.03.2026
The immediate purpose of this release is to support:
- reproducible beta evaluation,
- loading and inference validation in realistic environments,
- and packaging of a reviewed checkpoint for Hugging Face distribution.
Confidential beta release: this model card and the associated weights are intended for approved evaluators and collaborators. Treat the checkpoint as beta material rather than a public production release.
Provenance
This HF repo was prepared from the local training artifact:
Qwen3-ASR-0.6B-lunde03_verbatim/checkpoint-50000
The packaging step selected the last checkpoint from that training run and copied the files required for inference and model loading into this staged Hugging Face repository.
Overview
This model is part of the NB-ASR beta group and is intended for technical evaluation, integration testing, and model-card maintenance in the Hugging Face workflow. It is suitable for:
- local transcription experiments,
- batch inference,
- serving tests,
- and end-to-end evaluation through the project's standard scripts.
Because this is a beta checkpoint, recognition behavior, formatting, and runtime characteristics may still change. Current results should be treated as provisional.
Recommended Usage
The preferred interface is the official qwen-asr package, which exposes both a standard transformers backend and a vLLM-backed serving path.
Install the base package:

```bash
pip install -U qwen-asr
```

Install the vLLM extras:

```bash
pip install -U "qwen-asr[vllm]"
```

Optional FlashAttention 2:

```bash
pip install -U flash-attn --no-build-isolation
```

For lower-memory build environments:

```bash
MAX_JOBS=4 pip install -U flash-attn --no-build-isolation
```
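Before loading the checkpoint, it can help to confirm that the relevant packages are importable. A minimal sketch (the package list mirrors the installs above; the CUDA check is guarded so the script also runs on machines without torch installed):

```python
import importlib.util


def check_environment(packages=("torch", "qwen_asr", "vllm", "flash_attn")):
    """Report which optional inference packages are importable."""
    status = {name: importlib.util.find_spec(name) is not None for name in packages}
    # CUDA availability is only meaningful if torch is actually installed.
    if status.get("torch"):
        import torch
        status["cuda"] = torch.cuda.is_available()
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'missing'}")
```

If `flash_attn` is reported missing, simply leave the `attn_implementation` argument commented out in the quick-start examples below.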
Quick Start: Transformers Backend
```python
import torch
from qwen_asr import Qwen3ASRModel

model = Qwen3ASRModel.from_pretrained(
    "NbAiLab/nb-asr-beta-qwen06b-lunde03-verbatim",
    dtype=torch.bfloat16,
    device_map="cuda:0",
    # attn_implementation="flash_attention_2",
    max_inference_batch_size=32,
    max_new_tokens=256,
)

results = model.transcribe(
    audio="audio.wav",
    language=None,
)

print(results[0].language)
print(results[0].text)
```
Notes:
- audio can usually be provided as a local path, URL, base64 payload, or waveform tuple, depending on backend support.
- This repo includes a bundled example file, audio.wav, whose spoken text is "Hun er oversatt til en rekke språk, men ikke norsk." ("She has been translated into a number of languages, but not Norwegian.")
- language=None enables automatic language detection.
- If you want forced decoding for a known language, set language="Norwegian" if that matches your environment and prompt conventions.
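The base64 variant mentioned above can be produced with the standard library alone. A sketch, assuming the backend accepts the WAV bytes as a plain base64 string (check your installed qwen-asr version for the exact input form it expects):

```python
import base64


def wav_to_base64(path: str) -> str:
    """Read a WAV file and return its bytes as a base64-encoded ASCII string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


# Hypothetical usage against the model loaded in the quick start above:
# payload = wav_to_base64("audio.wav")
# results = model.transcribe(audio=payload, language=None)  # if base64 input is supported
```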
Quick Start: vLLM Backend
```python
from qwen_asr import Qwen3ASRModel

if __name__ == "__main__":
    model = Qwen3ASRModel.LLM(
        model="NbAiLab/nb-asr-beta-qwen06b-lunde03-verbatim",
        gpu_memory_utilization=0.7,
        max_inference_batch_size=128,
        max_new_tokens=1024,
    )

    results = model.transcribe(
        audio="audio.wav",
        language=None,
    )

    print(results[0].language)
    print(results[0].text)
```
Serving
You can expose an OpenAI-compatible endpoint with:
```bash
qwen-asr-serve NbAiLab/nb-asr-beta-qwen06b-lunde03-verbatim \
  --gpu-memory-utilization 0.8 \
  --host 0.0.0.0 \
  --port 8000
```
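If the endpoint follows the OpenAI audio-transcription convention, a request can be assembled with only the standard library. A sketch, assuming a `/v1/audio/transcriptions` route and `model`/`file` form fields (verify the actual route and field names against your installed stack):

```python
import urllib.request
import uuid


def build_transcription_request(url: str, model: str, audio_path: str):
    """Build a multipart/form-data POST for an OpenAI-style transcription endpoint."""
    boundary = uuid.uuid4().hex
    with open(audio_path, "rb") as f:
        audio_bytes = f.read()
    # Two form parts: the model name and the audio file itself.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{audio_path}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    )
    body = (
        model_part.encode()
        + file_header.encode()
        + audio_bytes
        + f"\r\n--{boundary}--\r\n".encode()
    )
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Hypothetical usage against the server started above:
# req = build_transcription_request(
#     "http://localhost:8000/v1/audio/transcriptions",
#     "NbAiLab/nb-asr-beta-qwen06b-lunde03-verbatim",
#     "audio.wav",
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```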
Depending on the installed stack version, a standard vllm serve flow may also be appropriate.
Web Demo
To test the model in a local browser-based demo:
```bash
qwen-asr-demo \
  --asr-checkpoint NbAiLab/nb-asr-beta-qwen06b-lunde03-verbatim \
  --backend transformers \
  --cuda-visible-devices 0 \
  --ip 0.0.0.0 \
  --port 8000
```
Then open:
http://<your-ip>:8000
Feedback Requested
During the beta period, the most useful feedback is:
- whether the model loads successfully,
- environment and installation problems,
- CUDA or OOM issues,
- inference crashes,
- batching or serving regressions,
- and compatibility with downstream evaluation or synchronization workflows.
If possible, include:
- GPU type,
- Python version,
- relevant package versions,
- backend used,
- approximate audio duration,
- and any error trace or logs.
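Most of these details can be gathered automatically before filing a report. A small sketch using only the standard library (the package list is an assumption; adjust it to your stack, and add GPU details by hand or via nvidia-smi):

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("torch", "transformers", "vllm", "qwen-asr")):
    """Collect the environment details requested for beta feedback reports."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for pkg in packages:
        try:
            report[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```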
Included Files
This staged HF repository includes the inference-facing model assets copied from the source checkpoint:
- model.safetensors
- config.json
- generation_config.json
- preprocessor_config.json
- tokenizer.json
- tokenizer_config.json
- special_tokens_map.json
- vocab.json
- merges.txt
- added_tokens.json
- chat_template.jinja
- audio.wav
Training-state files such as optimizer state, scheduler state, RNG snapshots, and trainer metadata were intentionally left out of this HF package.
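Because only the inference-facing files are packaged, a quick completeness check on a local snapshot can catch partial downloads before they surface as confusing loading errors. A minimal sketch (the file list mirrors the one above):

```python
from pathlib import Path

# Inference-facing files expected in this staged repository.
REQUIRED_FILES = [
    "model.safetensors",
    "config.json",
    "generation_config.json",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "vocab.json",
    "merges.txt",
    "added_tokens.json",
    "chat_template.jinja",
]


def missing_files(snapshot_dir: str) -> list:
    """Return the expected inference files missing from a local snapshot directory."""
    root = Path(snapshot_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Hypothetical usage:
# missing = missing_files("/path/to/local/snapshot")
# if missing:
#     print("Incomplete snapshot, missing:", ", ".join(missing))
```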
Intended Scope
This checkpoint is meant for technical evaluation and repo maintenance during the beta phase. It should not be treated as a stable public benchmark or final production model without further validation.
Acknowledgements
This model is based on the open Qwen3-ASR framework and was adapted by the NB-ASR project at the National Library of Norway.
The following persons have contributed to the dataset creation and training:
- Freddy Wetjen
- Thea Tollersrud
- Phoebe Parsons
- Per Egil Kummervold