sneakyfree commited on
Commit
8d1d3fe
·
verified ·
1 Parent(s): e7132b1

Refresh README — uniform WindyWord template with WER tier + dialect notes

Browse files
Files changed (1) hide show
  1. README.md +18 -14
README.md CHANGED
@@ -4,34 +4,38 @@ tags:
4
  - automatic-speech-recognition
5
  - whisper
6
  - windyword
7
- - ctranslate2
8
  - chinese
9
  - chinese
10
- library_name: ctranslate2
11
  pipeline_tag: automatic-speech-recognition
12
  language:
13
  - zh
14
  ---
15
 
16
- # WindyWord.ai STT — Chinese (Mandarin) Lingua (CPU INT8 / CTranslate2)
17
 
18
  **Transcribes Chinese (Mandarin) speech (Sino-Tibetan > Sinitic).**
19
 
20
- CPU-optimized INT8 quantization of [WindyWord/listen-windy-lingua-chinese](https://huggingface.co/WindyWord/listen-windy-lingua-chinese). Built via `ct2-transformers-converter` on 2026-04-28. Approximately 25% the size of the GPU safetensors variant, 2-4× faster on CPU, with negligible quality loss in our Grand Rounds testing.
21
 
22
- ## Usage
 
 
 
23
 
24
- ```python
25
- import ctranslate2
26
- from huggingface_hub import snapshot_download
27
 
28
- local = snapshot_download("WindyWord/listen-windy-lingua-chinese-ct2", allow_patterns="ct2-int8/*")
29
- model = ctranslate2.models.Whisper(f"{local}/ct2-int8/")
30
- ```
31
 
32
- ## GPU sibling
33
 
34
- For GPU inference, see [WindyWord/listen-windy-lingua-chinese](https://huggingface.co/WindyWord/listen-windy-lingua-chinese).
 
 
 
 
 
 
35
 
36
  ## Commercial Use
37
 
@@ -41,6 +45,6 @@ Visit [windyword.ai](https://windyword.ai) for apps and API access.
41
 
42
  ## Provenance & License
43
 
44
- INT8 quantization of WindyWord's GPU Chinese (Mandarin) Lingua model. Apache-2.0 (inherited).
45
 
46
  *Certified by Opus 4.6 Opus-Claw (Dr. C) on Veron-1 (RTX 5090, Mt Pleasant SC).*
 
4
  - automatic-speech-recognition
5
  - whisper
6
  - windyword
 
7
  - chinese
8
  - chinese
9
+ library_name: transformers
10
  pipeline_tag: automatic-speech-recognition
11
  language:
12
  - zh
13
  ---
14
 
15
+ # WindyWord.ai STT — Chinese (Mandarin) Lingua (CPU INT8 (CTranslate2))
16
 
17
  **Transcribes Chinese (Mandarin) speech (Sino-Tibetan > Sinitic).**
18
 
19
+ ## Quality
20
 
21
+ - **FLEURS WER:** 0.0% (50-sample audit)
22
+ - **CER:** 0.0
23
+ - **Tier:** EXCELLENT ⭐⭐⭐⭐⭐
24
+ - **Source:** WindyWord Grand Rounds v2 audit (50-sample FLEURS)
25
 
26
+ ## About this variant
 
 
27
 
28
+ This is the **ct2-int8** deployment format of our Chinese (Mandarin) Lingua STT model. Load it via the `ct2-int8/` subfolder.
 
 
29
 
30
+ Part of the [WindyWord.ai](https://windyword.ai) STT fleet — covering 35+ languages that commercial speech-to-text APIs underserve, with proper dialect / script disclosures where they matter.
31
 
32
+ ## Usage
33
+
34
+ ```python
35
+ from transformers import WhisperForConditionalGeneration, WhisperProcessor
36
+ processor = WhisperProcessor.from_pretrained("WindyWord/listen-windy-lingua-chinese-ct2", subfolder="ct2-int8")
37
+ model = WhisperForConditionalGeneration.from_pretrained("WindyWord/listen-windy-lingua-chinese-ct2", subfolder="ct2-int8")
38
+ ```
39
 
40
  ## Commercial Use
41
 
 
45
 
46
  ## Provenance & License
47
 
48
+ Weights derived from upstream community Whisper fine-tunes (see individual model card for exact lineage). Redistributed under Apache-2.0 (inherited).
49
 
50
  *Certified by Opus 4.6 Opus-Claw (Dr. C) on Veron-1 (RTX 5090, Mt Pleasant SC).*