---
license: mit
language:
- en
base_model: juanquivilla/sotto-cleanup-lfm25-350m
tags:
- speech-to-text
- transcript-cleanup
- text-correction
- asr-post-processing
- LFM
- LiquidAI
- mlx
- mlx-5bit
pipeline_tag: text-generation
---
# SottoASR Transcript Cleanup — LFM2.5-350M MLX 5-bit (v51)
[sottoasr.app](https://sottoasr.app) · [Full precision (bf16)](https://huggingface.co/juanquivilla/sotto-cleanup-lfm25-350m) · [MLX 4-bit (smaller)](https://huggingface.co/juanquivilla/sotto-cleanup-lfm25-350m-mlx-4bit)
## Overview
MLX 5-bit affine quantization of [juanquivilla/sotto-cleanup-lfm25-350m](https://huggingface.co/juanquivilla/sotto-cleanup-lfm25-350m). Recommended for Apple Silicon — best size/quality trade-off.
## What's new in v51
v51 extends v45 with targeted training data for five failure modes (multi-number sentences,
year-context drift, disconnected number lists, within-input duplicates, long-form preservation),
each generated programmatically and audited with a Qwen3.6-27B judge.
| Metric | v45 | **v51** |
|---|---:|---:|
| Number accuracy | 95.9% | **95.3%** |
| Adversarial benchmark (greedy) | 76% | **86%** |
See the [bf16 model card](https://huggingface.co/juanquivilla/sotto-cleanup-lfm25-350m) for the full pipeline and benchmark numbers.
## Quantization Recipe
```bash
mlx_lm.convert \
--hf-path juanquivilla/sotto-cleanup-lfm25-350m \
--mlx-path sotto-cleanup-lfm25-350m-mlx-5bit \
-q --q-bits 5 --q-group-size 64 \
--trust-remote-code
```
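At `--q-group-size 64`, the per-group quantization parameters add about half a bit of overhead per weight. The back-of-envelope math below is a sketch, assuming MLX's affine scheme stores one fp16 scale and one fp16 bias per group (the `effective_bits` helper is illustrative, not part of `mlx_lm`):

```python
def effective_bits(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Approximate storage cost per weight: payload bits plus
    one scale and one bias (scale_bits each) shared by each group."""
    return bits + 2 * scale_bits / group_size

print(effective_bits(5, 64))  # 5.5 bits/weight for this 5-bit config
print(effective_bits(4, 64))  # 4.5 bits/weight for the 4-bit variant
```

This is why the 5-bit repo is only modestly larger than the 4-bit one relative to bf16.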
## Usage
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("juanquivilla/sotto-cleanup-lfm25-350m-mlx-5bit")
sampler = make_sampler(temp=0.0)  # greedy decoding, as used in the benchmarks

text = "talk about server three sixty"
prompt = f"### Input:\n{text}\n\n### Output:\n"
output = generate(model, tokenizer, prompt=prompt, max_tokens=512, sampler=sampler)

# The model may begin a new "### Input:" section; trim at the first marker.
if "###" in output:
    output = output[:output.index("###")].strip()
print(output)
```
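If you clean many transcripts in a loop, the stop-marker trimming above can be factored into a small helper (a hypothetical convenience function, not part of `mlx_lm`):

```python
def extract_cleanup(raw: str, stop: str = "###") -> str:
    """Trim the model's raw completion at the first stop marker, if any."""
    idx = raw.find(stop)
    return (raw[:idx] if idx != -1 else raw).strip()

print(extract_cleanup("Talk about Server 360.\n### Input:"))  # Talk about Server 360.
```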
## License
MIT