Magpie TTS Multilingual 357M: Core ML

Core ML port of NVIDIA's Magpie TTS Multilingual 357M, packaged for on-device inference on Apple Silicon (iPhone, iPad, Mac).

This repository contains only the converted Core ML model artifacts. Model weights, architecture, and training are entirely NVIDIA's work; all credit for the underlying model goes to the Magpie TTS team. This port adds only the Core ML conversion, iOS/macOS runtime integration, and packaging.

What's included

| File | Role |
| --- | --- |
| TextEncoder.mlmodelc | Text → encoder hidden states |
| DecoderPrefill.mlmodelc | Batched speaker-context prefill (populates KV cache in one pass) |
| DecoderStep.mlmodelc | Single autoregressive step with explicit KV cache I/O |
| NanocodecDecoder.mlmodelc | Codec tokens → 22 kHz waveform |

All four are compiled .mlmodelc bundles, ready to load via MLModel(contentsOf:). The weights are FP16 and minimum_deployment_target is iOS 17.
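As a minimal sketch of loading the four bundles, assuming they have already been downloaded into one local directory: the helper name `loadMagpieModels(from:)` and the dictionary return type are illustrative choices, not part of this repository. Only `MLModel(contentsOf:)` comes from Core ML itself.

```swift
import CoreML
import Foundation

/// Bundle names as listed in this repository (without the .mlmodelc extension).
let bundleNames = ["TextEncoder", "DecoderPrefill", "DecoderStep", "NanocodecDecoder"]

/// Hypothetical helper: loads every compiled bundle found in `directory`.
/// Throws if any bundle is missing or fails to load.
func loadMagpieModels(from directory: URL) throws -> [String: MLModel] {
    var models: [String: MLModel] = [:]
    for name in bundleNames {
        // Compiled bundles are directories ending in .mlmodelc.
        let url = directory.appendingPathComponent("\(name).mlmodelc")
        models[name] = try MLModel(contentsOf: url)
    }
    return models
}
```

A caller would pass the directory it downloaded the files into, for example `try loadMagpieModels(from: modelsDirectory)`, where `modelsDirectory` is whatever location the app chose. This sketch uses the default configuration for every bundle.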

Languages

English, Spanish, German, Mandarin, French, Italian, Vietnamese, Hindi, Japanese (9 languages, matching the NVIDIA original).

Model details

  • Base architecture: 12-layer causal decoder, d_model = 768, 12 self-attention heads, d_head = 64
  • 8 audio codebooks, 2016 codes + 8 special tokens each
  • Local transformer: 1-layer causal, d = 256, samples codebooks autoregressively per frame
  • Max text length: 256 tokens, max decoder sequence: 512
  • Output sample rate: 22 kHz
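
The figures above can be cross-checked with a little arithmetic. The sketch below copies the listed dimensions into constants and derives the per-codebook vocabulary size and a rough self-attention KV cache footprint. The cache estimate is an assumption, not a measured value: it supposes batch size 1, one key and one value tensor per layer, FP16 cache entries, and a cache sized for the full 512-step sequence. The `MagpieDims` name is illustrative.

```swift
/// Dimensions copied from the list above; the enum is only a namespace.
enum MagpieDims {
    static let layers = 12
    static let dModel = 768
    static let heads = 12
    static let dHead = 64
    static let codebooks = 8
    static let codesPerBook = 2016
    static let specialTokensPerBook = 8
    static let maxDecoderSeq = 512
}

// Heads times head size should reproduce d_model: 12 * 64 = 768.
let headsTimesHeadDim = MagpieDims.heads * MagpieDims.dHead

// Each codebook's vocabulary is its regular codes plus its special tokens.
let vocabPerCodebook = MagpieDims.codesPerBook + MagpieDims.specialTokensPerBook

// Assumption: keys and values cached per layer, FP16 (2 bytes), full max length.
let bytesPerFP16 = 2
let kvCacheBytes = MagpieDims.layers * 2 * MagpieDims.maxDecoderSeq
    * MagpieDims.dModel * bytesPerFP16
let kvCacheMiB = Double(kvCacheBytes) / (1024.0 * 1024.0)

print("vocab per codebook: \(vocabPerCodebook)")
print("estimated KV cache: \(kvCacheMiB) MiB")
```

Under those assumptions the self-attention cache comes to 18 MiB; any cross-attention cache or FP32 storage would change the figure.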

Compute-unit guidance

Tested on iPhone 15 Pro and M-series Macs:

| Model | Recommended compute unit | Notes |
| --- | --- | --- |
| TextEncoder | .cpuAndNeuralEngine | ANE-friendly |
| DecoderPrefill | .cpuAndNeuralEngine | Batched, benefits from ANE |
| DecoderStep | .cpuOnly | Weight-bandwidth bound; CPU matches GPU on Apple Silicon unified memory and avoids per-step GPU dispatch overhead. Also background-safe (Metal is suspended in background). |
| NanocodecDecoder | .cpuOnly | Contains ops/dimensions that exceed ANE limits; CPU beats GPU here too. |
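
One way to apply the recommendations above is to keep them in a lookup table and build an `MLModelConfiguration` per bundle. This is a sketch rather than the port's actual runtime code: the names `recommendedComputeUnits` and `configuration(for:)` are hypothetical, and falling back to `.all` for an unknown bundle name is an assumption.

```swift
import CoreML

/// Compute-unit choices from the guidance above, keyed by bundle name.
let recommendedComputeUnits: [String: MLComputeUnits] = [
    "TextEncoder": .cpuAndNeuralEngine,
    "DecoderPrefill": .cpuAndNeuralEngine,
    "DecoderStep": .cpuOnly,
    "NanocodecDecoder": .cpuOnly,
]

/// Hypothetical helper: builds a configuration for one bundle.
/// Unknown names fall back to `.all` (an assumption, not a recommendation).
func configuration(for bundleName: String) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    config.computeUnits = recommendedComputeUnits[bundleName] ?? .all
    return config
}
```

The resulting configuration can be passed to `MLModel(contentsOf:configuration:)` when loading the corresponding bundle. Keeping the choices in one table makes it easy to re-tune them after measuring on other devices.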

License & Attribution

This port inherits the license of the base model from NVIDIA. See the original NVIDIA model card for terms.

The model weights, architecture, and training are NVIDIA's work. This repository provides only the Core ML packaging. Please cite and credit the NVIDIA Magpie TTS team for any use of the underlying model.
