mHuBERT-147 IPA Linear CTC FT

Fine-tuned English IPA phone-recognition model initialized from utter-project/mHuBERT-147 and trained with a compact linear CTC head.

This repository contains the full fine-tuned model:

  • mHuBERT-147 backbone
  • linear CTC head
  • audio preprocessor config
  • model size: about 94.4M backbone parameters + 35k linear-head parameters
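
As a rough sanity check on the head size quoted above, the parameter count of a single linear layer can be worked out by hand. The sketch below assumes a hidden size of 768 (the usual HuBERT base width) and a hypothetical output count of 46 including the blank; neither number is stated in this card, so read the real values from the model config and ipa_map.json.

```python
def linear_head_params(hidden_size: int, num_outputs: int, bias: bool = True) -> int:
    """Number of trainable parameters in one linear layer."""
    weights = hidden_size * num_outputs
    biases = num_outputs if bias else 0
    return weights + biases


# ASSUMPTION: 768 is the usual HuBERT base hidden size, not a value from this card.
hidden_size = 768
# ASSUMPTION: 46 outputs (phones + 1 blank) is a guess that lands near 35k.
num_outputs = 46

head_params = linear_head_params(hidden_size, num_outputs)
print(f"linear head parameters: {head_params}")
```

Under these assumed sizes the head is several thousand times smaller than the backbone, which is why the card calls it compact.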

Training setup:

  • initialized from utter-project/mHuBERT-147
  • top 4 encoder layers fine-tuned
  • trained on TIMIT train + Buckeye train
  • linear CTC head on top of frame embeddings
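
The setup above (a mostly frozen encoder, a few trainable top layers, and a linear CTC head) can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the training code used for this model: the encoder is a small stand-in built from `nn.TransformerEncoderLayer`, and the sizes, the helper name `freeze_all_but_top_k`, and the random data are all hypothetical.

```python
import torch
from torch import nn


def freeze_all_but_top_k(layers: nn.ModuleList, k: int) -> None:
    """Disable gradients for every layer except the last k."""
    n_frozen = len(layers) - k
    for i, layer in enumerate(layers):
        trainable = i >= n_frozen
        for p in layer.parameters():
            p.requires_grad = trainable


torch.manual_seed(0)

# Toy sizes (hypothetical); the real backbone is far larger.
hidden, num_phones, num_layers = 32, 10, 6
layers = nn.ModuleList(
    [
        nn.TransformerEncoderLayer(
            d_model=hidden, nhead=4, dim_feedforward=64, batch_first=True
        )
        for _ in range(num_layers)
    ]
)
freeze_all_but_top_k(layers, k=4)

# Linear CTC head: one output per phone plus a blank at the last index.
head = nn.Linear(hidden, num_phones + 1)
blank_id = num_phones

# Random stand-ins for frame embeddings and phone targets.
x = torch.randn(2, 50, hidden)  # (batch, time, hidden)
for layer in layers:
    x = layer(x)
logits = head(x)  # (batch, time, classes)

# nn.CTCLoss expects log-probabilities shaped (time, batch, classes).
log_probs = logits.log_softmax(dim=-1).transpose(0, 1)
targets = torch.randint(0, num_phones, (2, 12))
input_lengths = torch.full((2,), 50, dtype=torch.long)
target_lengths = torch.full((2,), 12, dtype=torch.long)

ctc = nn.CTCLoss(blank=blank_id, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients reach only the head and the top 4 layers
```

In a real run, only parameters with `requires_grad=True` would be passed to the optimizer.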

Validation results from the fine-tuning run (PER is the phone error rate):

  • TIMIT TEST: PER = 0.1012
  • Buckeye val: PER = 0.2082
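
For readers unfamiliar with the metric, PER is conventionally the total edit distance (substitutions, insertions, deletions) between predicted and reference phone sequences, divided by the total number of reference phones. The sketch below shows that conventional definition on made-up sequences; the exact scoring protocol behind the numbers above (for example, any phone folding) is not described in this card, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of phone symbols."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """Corpus-level PER: summed edit distance over summed reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    ref_len = sum(len(r) for r in refs)
    return errors / ref_len


# Made-up example: one substitution and one deletion over six reference phones.
refs = [["k", "æ", "t"], ["d", "ɔ", "g"]]
hyps = [["k", "æ", "t"], ["d", "ɑ"]]
per = phone_error_rate(refs, hyps)
print(f"PER = {per:.4f}")
```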

Notes:

  • The output vocabulary is the same IPA set as in istomin9192/mHuBERT-147-ipa-head, with one extra CTC blank symbol at the last output index.
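
To illustrate the index layout described in the note, here is a small sketch with a toy three-symbol vocabulary (hypothetical, not the real IPA set). With N phones the model emits N + 1 classes, and the blank sits at index N. The blank also lets greedy decoding keep genuine repeats apart.

```python
def ctc_greedy_collapse(frame_ids, blank_id):
    """Merge consecutive repeats, then drop blanks (greedy CTC decoding)."""
    out = []
    prev = blank_id
    for pid in frame_ids:
        if pid != blank_id and pid != prev:
            out.append(pid)
        prev = pid
    return out


# Toy vocabulary (hypothetical); the real one comes from ipa_map.json.
toy_phones = ["a", "b", "c"]
blank_id = len(toy_phones)  # blank is the last output index

# Per-frame argmax ids: a a _ a b b _ _ c
frame_ids = [0, 0, 3, 0, 1, 1, 3, 3, 2]
decoded = [toy_phones[i] for i in ctc_greedy_collapse(frame_ids, blank_id)]
print(decoded)
```

The first two frames collapse into one "a", while the "a" after the blank is kept as a separate phone.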

Minimal loading example:

import json

import librosa
import torch
from transformers import AutoFeatureExtractor, AutoModel

repo_id = "istomin9192/mHuBERT-147-ipa-linear-ctc-ft"
wav_file = "example.wav"  # placeholder: replace with the path to your audio file

# trust_remote_code=True allows the custom model code from the repository to run.
feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

# ipa_map.json must be available in the working directory.
with open("ipa_map.json", "r", encoding="utf-8") as f:
    id2phone = {int(k): v for k, v in json.load(f)["id2phone"].items()}

# Load the audio as 16 kHz mono, matching the sampling rate passed to the feature extractor.
wav, sr = librosa.load(wav_file, sr=16000, mono=True)
inputs = feature_extractor(wav, sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0]  # first (and only) item in the batch

# Greedy CTC decoding: best class per frame, merge repeats, drop blanks.
pred_ids = logits.argmax(dim=-1).tolist()
blank_id = model.config.architecture["blank_id"]
phones = []
prev = blank_id
for pid in pred_ids:
    if pid != blank_id and pid != prev:
        phones.append(id2phone[pid])
    prev = pid

print(phones)