How to use elbruno/personaplex-7b-v1-onnx with Moshi:
```bash
# pip install moshi

# Run the interactive web server
python -m moshi.server --hf-repo "elbruno/personaplex-7b-v1-onnx"

# Then open https://localhost:8998 in your browser
```

```python
# pip install moshi
import torch
from moshi.models import loaders

# Load checkpoint info from HuggingFace
checkpoint = loaders.CheckpointInfo.from_hf_repo("elbruno/personaplex-7b-v1-onnx")

# Load the Mimi audio codec
mimi = checkpoint.get_mimi(device="cuda")
mimi.set_num_codebooks(8)

# Encode audio (24kHz, mono)
wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples]
with torch.no_grad():
    codes = mimi.encode(wav.cuda())
    decoded = mimi.decode(codes)
```
PersonaPlex-7B-v1 ONNX
ONNX-exported components of NVIDIA PersonaPlex-7B-v1, for use with the ElBruno.PersonaPlex C# library.
Files
| File | Size | Description |
|---|---|---|
| mimi_encoder.onnx | 178 MB | Mimi audio encoder (24kHz audio -> discrete tokens) |
| mimi_decoder.onnx | 170 MB | Mimi audio decoder (discrete tokens -> 24kHz audio) |
Architecture
These are the Mimi audio codec components of PersonaPlex, based on the Moshi architecture:
Encoder: SEANet convolutional encoder + ProjectedTransformer + SplitResidualVectorQuantizer
- Input: [batch, 1, samples] float32 (24kHz mono audio)
- Output: [batch, 8, frames] int64 (8 codebooks, ~12.5 frames/sec)
Decoder: SplitResidualVectorQuantizer + ProjectedTransformer + SEANet convolutional decoder
- Input: [batch, 8, frames] int64
- Output: [batch, 1, samples] float32
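
The shapes above imply a fixed ratio between samples and frames: at 24 kHz and roughly 12.5 frames per second, each frame covers about 1920 samples. The following is a minimal sketch of that arithmetic. The helper names are hypothetical, and rounding partial frames up is an assumption; the exported model may pad or truncate differently.

```python
import math

SAMPLE_RATE = 24_000   # Hz, from the encoder input spec
FRAME_RATE = 12.5      # frames/sec, approximate per the encoder output spec
NUM_CODEBOOKS = 8      # second axis of the code tensor

# 24000 / 12.5 = 1920 samples of audio per codec frame
SAMPLES_PER_FRAME = int(SAMPLE_RATE / FRAME_RATE)


def expected_code_shape(batch, samples):
    """Shape of the encoder output for `samples` audio samples per item.

    Assumption: a trailing partial frame is rounded up to a full frame.
    """
    frames = math.ceil(samples / SAMPLES_PER_FRAME)
    return (batch, NUM_CODEBOOKS, frames)


def expected_audio_shape(batch, frames):
    """Shape of the decoder output for `frames` codec frames per item."""
    return (batch, 1, frames * SAMPLES_PER_FRAME)


# Ten seconds of mono audio in a batch of one
print(expected_code_shape(1, SAMPLE_RATE * 10))  # (1, 8, 125)
```

Checking real model outputs against these numbers is a quick way to confirm that audio was resampled to 24 kHz before encoding.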
Usage with C#
```csharp
using ElBruno.PersonaPlex.Pipeline;

// Models download automatically on first run
using var pipeline = await PersonaPlexPipeline.CreateAsync();
```
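
Outside of C#, the two ONNX files should in principle be loadable by any ONNX Runtime binding. The sketch below shows one way an encode/decode round trip might look in Python. It assumes `onnxruntime` and `numpy` are installed and that both files sit in the working directory. The function names are hypothetical, and because the tensor names inside the exported graphs are not documented here, they are read from the session at run time. Treating each model as having a single input and taking its first output is also an assumption.

```python
def roundtrip(encoder, decoder, wav):
    """Encode audio to codes, then decode the codes back to audio.

    `encoder` and `decoder` are ONNX Runtime sessions (or anything with
    the same get_inputs()/run() interface). Input names are looked up
    rather than hard-coded, since the exported names are not documented.
    """
    enc_input = encoder.get_inputs()[0].name
    codes = encoder.run(None, {enc_input: wav})[0]    # expected [batch, 8, frames] int64
    dec_input = decoder.get_inputs()[0].name
    audio = decoder.run(None, {dec_input: codes})[0]  # expected [batch, 1, samples] float32
    return codes, audio


def run_with_onnxruntime():
    # Imported lazily so the helper above carries no hard dependencies.
    import numpy as np
    import onnxruntime as ort

    # Assumption: both files were downloaded next to this script.
    providers = ["CPUExecutionProvider"]
    encoder = ort.InferenceSession("mimi_encoder.onnx", providers=providers)
    decoder = ort.InferenceSession("mimi_decoder.onnx", providers=providers)

    # One second of silence at 24 kHz, shaped [batch, 1, samples].
    wav = np.zeros((1, 1, 24_000), dtype=np.float32)
    codes, audio = roundtrip(encoder, decoder, wav)
    print(codes.shape, audio.shape)
```

Call `run_with_onnxruntime()` once the model files are in place; the printed shapes can be compared with the input and output shapes listed under Architecture.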
Export Details
- Exported from PyTorch using torch.onnx.export (opset 17)
- Source: nvidia/personaplex-7b-v1
- Mimi model: 79.3M parameters
- Dynamic axes enabled for batch size and sequence length
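
For readers unfamiliar with dynamic axes, the sketch below shows how an export with opset 17 and dynamic batch and time dimensions is typically set up with `torch.onnx.export`. It is not the export script used for these files. The module is a small stand-in rather than Mimi, the tensor names `audio` and `codes` are hypothetical, and treating axis 2 as the "sequence length" axis is an assumption.

```python
def encoder_dynamic_axes():
    """Dynamic-axis spec for a hypothetical encoder export.

    Axis 0 is the batch; axis 2 is time (samples in, frames out).
    The tensor names are illustrative, not taken from the real files.
    """
    return {
        "audio": {0: "batch", 2: "samples"},
        "codes": {0: "batch", 2: "frames"},
    }


def export_stand_in(path="stand_in_encoder.onnx"):
    # Requires torch and the onnx package; imported lazily.
    import torch

    # Stand-in module, NOT Mimi: a single strided convolution that maps
    # [batch, 1, samples] to [batch, 8, frames] at 1920 samples per frame.
    # Unlike the real encoder, its output is float32 rather than int64.
    model = torch.nn.Conv1d(1, 8, kernel_size=1920, stride=1920).eval()
    example = torch.zeros(1, 1, 1920 * 4)

    torch.onnx.export(
        model,
        (example,),
        path,
        input_names=["audio"],
        output_names=["codes"],
        dynamic_axes=encoder_dynamic_axes(),
        opset_version=17,
    )
```

Without the `dynamic_axes` argument, the exported graph would be fixed to the example's batch size and length, so marking those axes is what lets one file serve audio clips of different durations.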
License
MIT (same as the export tooling). The base model weights are under the NVIDIA Open Model License.