# Ameena-9B (Full Merged Model)
Qwen3.5-9B-Base continually pre-trained on Tajik text, with the LoRA adapter merged into the base weights.

For the LoRA adapter alone, see [Tohirju/ameena-9B-lora](https://huggingface.co/Tohirju/ameena-9B-lora).
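If you prefer to attach the adapter to the base model yourself instead of downloading the merged weights, a minimal sketch with `peft` (this assumes the adapter repo is a standard PEFT checkpoint compatible with the base):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original base model, then attach the Tajik CPT adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3.5-9B-Base", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "Tohirju/ameena-9B-lora")
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen3.5-9B-Base")
```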
## Model Details
| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen3.5-9B-Base (9.4B params) |
| Method | Continual Pre-Training (CPT) with LoRA, merged |
| Training data | ~370M tokens of Tajik text |
| Dataset | Tohirju/Tajik_Pretrain_370M_Qwen35_9B |
| Training steps | 25 (of 954) |
| Precision | bf16 (16-bit merged weights) |
| GPU | NVIDIA H200 (140GB) |
| Final loss | 1.597 |
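The bf16 weights in this repo correspond to the adapter folded into the base. A sketch of that merge step with `peft` (the output directory name is illustrative):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen3.5-9B-Base", torch_dtype="bfloat16"
)
model = PeftModel.from_pretrained(base, "Tohirju/ameena-9B-lora")

# Fold the LoRA deltas into the base weights and save a standalone
# 16-bit checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("ameena-9B-merged", safe_serialization=True)
```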
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Tohirju/ameena-9B-full", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Tohirju/ameena-9B-full")

# Tajik prompt: "Tajikistan is a beautiful country"
inputs = tokenizer("Тоҷикистон кишвари зебо аст", return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
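For quick experiments, the high-level `pipeline` API works as well. Note this is a base (pre-trained) model, so prompt it for text continuation rather than chat:

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Tohirju/ameena-9B-full",
    torch_dtype="auto",
    device_map="auto",
)

# Tajik prompt: "Tajikistan is a beautiful country"
out = pipe("Тоҷикистон кишвари зебо аст", max_new_tokens=100, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```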
## License
Apache 2.0 (same as base model)