Upload 9 files
Browse files- README.md +94 -3
- adapter_config.json +44 -0
- adapter_model.safetensors +3 -0
- optimizer.pt +3 -0
- rng_state.pth +3 -0
- scaler.pt +3 -0
- scheduler.pt +3 -0
- trainer_state.json +0 -0
- training_args.bin +3 -0
README.md
CHANGED
|
@@ -1,3 +1,94 @@
|
|
| 1 |
-
---
|
| 2 |
-
|
| 3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
base_model: openai/whisper-small
|
| 3 |
+
library_name: peft
|
| 4 |
+
tags:
|
| 5 |
+
- whisper
|
| 6 |
+
- asr
|
| 7 |
+
- uyghur
|
| 8 |
+
- lora
|
| 9 |
+
- peft
|
| 10 |
+
language:
|
| 11 |
+
- ug
|
| 12 |
+
datasets:
|
| 13 |
+
- mozilla-foundation/common_voice_11_0
|
| 14 |
+
license: apache-2.0
|
| 15 |
+
metrics:
|
| 16 |
+
- wer
|
| 17 |
+
---
|
| 18 |
+
|
| 19 |
+
# Whisper Small Uyghur LoRA (Fine-tuned)
|
| 20 |
+
|
| 21 |
+
## ئۇچۇر (Description in Uyghur)
|
| 22 |
+
بۇ مودېل `openai/whisper-small` ئاساسىدا ئۇيغۇرچە نۇتۇقنى تونۇش ئۈچۈن مەخسۇس تەربىيەلەنگەن. بىز LoRA تېخنىكىسىنى ئىشلىتىپ، ئۇيغۇرچە ئاۋازلارنى يۇقىرى ئېنىقلىقتا تېكىستكە ئايلاندۇرۇش مەقسىتىگە يەتتۇق.
|
| 23 |
+
|
| 24 |
+
- **تەربىيەلەش سانلىق مەلۇماتى:** Mozilla Common Voice (Uyghur)
|
| 25 |
+
- **قاتتىق دېتال:** NVIDIA GeForce RTX 3060 (9 سائەت تەربىيەلەنگەن)
|
| 26 |
+
- **مەقسەت:** ئۇيغۇر تىلىنىڭ رەقەملىك ساھەدىكى تەرەققىياتى ۋە تىلنى قوغداشقا تۆھپە قوشۇش.
|
| 27 |
+
|
| 28 |
+
---
|
| 29 |
+
|
| 30 |
+
## Model Description (English)
|
| 31 |
+
This model is a fine-tuned version of **OpenAI Whisper Small** for Uyghur Speech Recognition (ASR). It was trained using **LoRA (Low-Rank Adaptation)**, resulting in a lightweight but highly accurate adapter (approx. 13MB).
|
| 32 |
+
|
| 33 |
+
- **Data Source:** [Mozilla Common Voice / Data Collective](https://community.mozilladatacollective.com/)
|
| 34 |
+
- **Hardware:** Trained on a single **NVIDIA RTX 3060** GPU for approximately **9 hours**.
|
| 35 |
+
- **Accuracy:** Fine-tuned to achieve high precision in recognizing spoken Uyghur.
|
| 36 |
+
|
| 37 |
+
---
|
| 38 |
+
|
| 39 |
+
## ⚙️ Training Details
|
| 40 |
+
- **Base Model:** `openai/whisper-small`
|
| 41 |
+
- **Method:** PEFT (LoRA)
|
| 42 |
+
- **Training Time:** ~9 hours
|
| 43 |
+
- **Optimizer:** AdamW
|
| 44 |
+
- **Adapter Size:** ~13.5 MB
|
| 45 |
+
|
| 46 |
+
---
|
| 47 |
+
|
| 48 |
+
## ⚠️ Disclaimer (ئاگاھلاندۇرۇش)
|
| 49 |
+
|
| 50 |
+
**English:** This model is released for research, educational, and language preservation purposes only. The developer strongly opposes the use of this technology for mass surveillance, human rights violations, or any form of discrimination.
|
| 51 |
+
|
| 52 |
+
**ئۇيغۇرچە:** بۇ مودېل پەقەت تەتقىقات، مائارىپ ۋە تىلنى قوغداش مەقسىتىدە ئېلان قىلىندى. بۇ تېخنىكىنى كۆزىتىش، كىشىلىك ھوقۇققا دەخلى-تەرۇز قىلىش ياكى كەمسىتىش خاراكتېرلىك ئىشلارغا ئىشلىتىشكە قەتئىي قارشى تۇرىمىز.
|
| 53 |
+
|
| 54 |
+
---
|
| 55 |
+
|
| 56 |
+
## How to use
|
| 57 |
+
|
| 58 |
+
You can load this model using `PEFT` and `Transformers`. Since the processor is not included in this adapter-only repo, please load the processor from the base model.
|
| 59 |
+
|
| 60 |
+
```python
|
| 61 |
+
import torch
|
| 62 |
+
import librosa
|
| 63 |
+
from transformers import WhisperForConditionalGeneration, WhisperProcessor
|
| 64 |
+
from peft import PeftModel
|
| 65 |
+
|
| 66 |
+
# 1. Setup Model IDs
|
| 67 |
+
base_model_id = "openai/whisper-small"
|
| 68 |
+
peft_model_id = "xiwol/whisper-small-uyghur"
|
| 69 |
+
|
| 70 |
+
# 2. Load Processor from the base model
|
| 71 |
+
# Note: language/task set the decoder prompt for Uyghur ASR.
# NOTE(review): Whisper's predefined language list may not include Uyghur, so
# language="uyghur" can raise a ValueError — confirm; if it does, omit the
# language argument and rely on the fine-tuned adapter.
|
| 72 |
+
processor = WhisperProcessor.from_pretrained(base_model_id, language="uyghur", task="transcribe")
|
| 73 |
+
|
| 74 |
+
# 3. Load Base Model
|
| 75 |
+
base_model = WhisperForConditionalGeneration.from_pretrained(
|
| 76 |
+
base_model_id,
|
| 77 |
+
device_map="auto",
|
| 78 |
+
torch_dtype=torch.float16
|
| 79 |
+
)
|
| 80 |
+
|
| 81 |
+
# 4. Load the LoRA Adapter from Hugging Face
|
| 82 |
+
model = PeftModel.from_pretrained(base_model, peft_model_id)
|
| 83 |
+
model.eval()
|
| 84 |
+
|
| 85 |
+
# 5. Inference Example
|
| 86 |
+
# Load your audio file (ensure 16kHz sampling rate)
|
| 87 |
+
# audio, _ = librosa.load("your_audio_file.mp3", sr=16000)
|
| 88 |
+
# input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features.to("cuda").half()
|
| 89 |
+
|
| 90 |
+
# Generate Transcription
|
| 91 |
+
# with torch.no_grad():
|
| 92 |
+
# predicted_ids = model.generate(input_features)
|
| 93 |
+
# transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
|
| 94 |
+
# print(f"Transcription: {transcription}")
|
adapter_config.json
ADDED
|
@@ -0,0 +1,44 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"alora_invocation_tokens": null,
|
| 3 |
+
"alpha_pattern": {},
|
| 4 |
+
"arrow_config": null,
|
| 5 |
+
"auto_mapping": {
|
| 6 |
+
"base_model_class": "WhisperForConditionalGeneration",
|
| 7 |
+
"parent_library": "transformers.models.whisper.modeling_whisper"
|
| 8 |
+
},
|
| 9 |
+
"base_model_name_or_path": "openai/whisper-small",
|
| 10 |
+
"bias": "none",
|
| 11 |
+
"corda_config": null,
|
| 12 |
+
"ensure_weight_tying": false,
|
| 13 |
+
"eva_config": null,
|
| 14 |
+
"exclude_modules": null,
|
| 15 |
+
"fan_in_fan_out": false,
|
| 16 |
+
"inference_mode": true,
|
| 17 |
+
"init_lora_weights": true,
|
| 18 |
+
"layer_replication": null,
|
| 19 |
+
"layers_pattern": null,
|
| 20 |
+
"layers_to_transform": null,
|
| 21 |
+
"loftq_config": {},
|
| 22 |
+
"lora_alpha": 64,
|
| 23 |
+
"lora_bias": false,
|
| 24 |
+
"lora_dropout": 0.05,
|
| 25 |
+
"megatron_config": null,
|
| 26 |
+
"megatron_core": "megatron.core",
|
| 27 |
+
"modules_to_save": null,
|
| 28 |
+
"peft_type": "LORA",
|
| 29 |
+
"peft_version": "0.18.0",
|
| 30 |
+
"qalora_group_size": 16,
|
| 31 |
+
"r": 32,
|
| 32 |
+
"rank_pattern": {},
|
| 33 |
+
"revision": null,
|
| 34 |
+
"target_modules": [
|
| 35 |
+
"v_proj",
|
| 36 |
+
"q_proj"
|
| 37 |
+
],
|
| 38 |
+
"target_parameters": null,
|
| 39 |
+
"task_type": null,
|
| 40 |
+
"trainable_token_indices": null,
|
| 41 |
+
"use_dora": false,
|
| 42 |
+
"use_qalora": false,
|
| 43 |
+
"use_rslora": false
|
| 44 |
+
}
|
adapter_model.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:e0cce8474c24115f1329c6ce93022f40c8aa84f87002d176dd9e8f003adf3037
|
| 3 |
+
size 14176064
|
optimizer.pt
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2e30db67afdf7c34b4feeb62ed33e5f592594b6fa899d7b9a84285eeb632b65c
|
| 3 |
+
size 4906682
|
rng_state.pth
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:457bf6f19646bf1743939cefce9a5b0d5b49e96a6338d575bf53183fdea502f2
|
| 3 |
+
size 14244
|
scaler.pt
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:84658d7b97b8473c1a84b0a3f3653f13be312a20606632621e9ddb7ba3dc9db7
|
| 3 |
+
size 988
|
scheduler.pt
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5d123dc2866296bea0f4743636e5cee7f4b45387f7f2e9dcf90078ff1c863039
|
| 3 |
+
size 1064
|
trainer_state.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|
training_args.bin
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6dbed90c84a6880a0fa13c58f65439bca66cc9920043d4e71a623fcb50155034
|
| 3 |
+
size 5496
|