charleslwang/parakeet-tdt-0.6b-HD-articulation


charleslwang committed on Mar 3

Commit c471e55 · verified · 1 Parent(s): 984ab6e

Create README.md

Files changed (1)

README.md +73 -0

README.md ADDED

	@@ -0,0 +1,73 @@

+---
+language:
+- en
+tags:
+- automatic-speech-recognition
+- speech
+- pathological-speech
+- dysarthria
+- huntingtons-disease
+- nemo
+- parakeet
+- multitask-learning
+- articulation
+license: apache-2.0
+pipeline_tag: automatic-speech-recognition
+library_name: nemo
+base_model: charleslwang/parakeet-tdt-0.6b-HD
+model-index:
+- name: parakeet-tdt-0.6b-HD-articulation
+  results:
+  - task:
+      type: automatic-speech-recognition
+      name: Automatic Speech Recognition
+    dataset:
+      name: Huntington Disease clinical speech test set
+      type: private
+    metrics:
+    - type: wer
+      value: 6.44
+      name: WER (%)
+---
+# Parakeet-TDT 0.6B HD Articulation
+Official checkpoint for the paper **"Towards Robust Automatic Speech Recognition for Huntington Disease."**
+## Model description
+This model is an articulation-aware variant of **Parakeet-TDT 0.6B HD**, tuned for automatic speech recognition on speech affected by **Huntington disease (HD)**. It extends the HD-adapted base model with auxiliary supervision from articulatory biomarker labels.
+## What this model does
+The model transcribes English read / controlled clinical speech from speakers with Huntington disease and healthy controls. It is intended as a research model for studying robust ASR under hyperkinetic motor-speech disruption and for analyzing the effect of articulatory supervision on transcription behavior.
+## Training
+The model was initialized from `charleslwang/parakeet-tdt-0.6b-HD` and further adapted using **parameter-efficient encoder-side adapters** with an auxiliary objective based on articulatory biomarker labels, while keeping the pretrained backbone frozen.
+## Evaluation
+On the reported HD test set, this model achieved the following error rates (all in %; the three error types sum to the overall WER):
+- **WER:** 6.44
+- **Substitutions:** 1.94
+- **Deletions:** 3.21
+- **Insertions:** 1.29
+## Intended use
+This model is intended for:
+- research on pathological / atypical speech recognition,
+- benchmarking ASR on Huntington disease speech,
+- studying how articulatory auxiliary supervision reshapes error behavior.
+It is **not** intended for clinical diagnosis, treatment decisions, or standalone medical use.
+## Limitations
+- Trained and evaluated on a relatively small, high-fidelity clinical corpus.
+- Primarily reflects controlled / read speech rather than spontaneous conversational speech.
+- Did not outperform the plain HD-adapted model on overall WER.
+- May not generalize to severe out-of-distribution impairment, other languages, or other recording conditions.
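
The error rates in the Evaluation section of the README can be checked with a few lines of Python. This is a minimal sketch, assuming the substitution, deletion, and insertion figures are percentages of reference words (the card does not say so explicitly, but they sum to the reported WER). The helper `wer_percent` and the word counts passed to it are hypothetical and used only for illustration; they are not taken from the paper or the test set.

```python
def wer_percent(substitutions, deletions, insertions, reference_words):
    """Word error rate in percent: 100 * (S + D + I) / N."""
    if reference_words <= 0:
        raise ValueError("reference_words must be positive")
    return 100.0 * (substitutions + deletions + insertions) / reference_words


# Per-type error rates as reported in the model card.
# Assumption: each value is a percentage of reference words.
reported = {"substitutions": 1.94, "deletions": 3.21, "insertions": 1.29}

# The three components should add up to the overall WER.
total_wer = round(sum(reported.values()), 2)

# Share of the total error contributed by each error type.
shares = {name: rate / total_wer for name, rate in reported.items()}

# Hypothetical raw counts that would reproduce the same WER
# (illustrative only; the real test-set size is not given in the card).
example_wer = wer_percent(substitutions=194, deletions=321,
                          insertions=129, reference_words=10_000)

if __name__ == "__main__":
    print(f"total WER: {total_wer:.2f}%")
    for name, share in shares.items():
        print(f"{name}: {share:.1%} of all errors")
    print(f"example WER from hypothetical counts: {example_wer:.2f}%")
```

Under this reading, deletions are the largest single error type, at roughly half of all errors.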