---
license: apache-2.0
base_model: unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit
library_name: peft
pipeline_tag: translation
language:
- ru
- en
tags:
- base_model:adapter:unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit
- Lyrics
- Songs
- Poetry
- Dynamic
- Creative
- Translation
- Russian
- English
- Soviet
- Verse
- Finetune
- lora
- orpo
- trl
- unsloth
- LLM
- edge
---

## LYRICAL Russian to English Machine Translation Model

**Variant 0.2a: ORPO-tuned DeepSeek-R1-0528 Rank64 Adapter**
***EPOCH 4 (2400 Steps)***

- **Developed by:** [SilverAgePoets.com](https://www.silveragepoets.com)
- **Model type:** Lyrical Machine Translation
- **Languages (NLP):** Russian ("Совпроясный"), English ("Worldish")
- **Finetuned from:** [unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit)

An experimental work-in-progress prototype adapter for **DeepSeek-R1-0528-Qwen3-8B**.
This adapter belongs to our Lyrical MT (Machine Translation) series of fine-tuned LLMs and adapters.
Our ultimate aim with the Lyrical MT project is to iteratively foster a translation model capable of adaptively localizing idiomatic, formal/poetic/rhythmic, and performance-catered features of lyrical input texts, whilst retaining adequate accuracy at the level of direct semantic translation.
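For a quick start, here is a minimal loading sketch. It assumes a CUDA-capable GPU with the `transformers`, `peft`, and `bitsandbytes` packages installed; `ADAPTER_REPO` is a placeholder to be replaced with this repository's id.

```python
# Minimal loading sketch: attach this LoRA adapter to the 4-bit base model.
# Assumes a CUDA GPU with `transformers`, `peft`, and `bitsandbytes` installed.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit"
ADAPTER_REPO = "path/to/this-adapter"  # placeholder: substitute this repo's id

tokenizer = AutoTokenizer.from_pretrained(BASE)
# The base repo ships a bitsandbytes 4-bit quantization config, so it loads
# pre-quantized; device_map="auto" places layers on the available GPU(s).
base_model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER_REPO)
model.eval()
```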
## USES

**Intended scope of effective applicability limited to:**
- Russian to English translation of song lyrics, poems, scriptures, slogans, etc.
- Translation of Russian input texts structured by literary, aesthetic, and/or vocalization-catering compositional devices into English outputs exhibiting cross-lingually rebalanced approximations of source-matched formal features.
**Depending on the relative performance, foundations, and idiosyncrasies of a given checkpoint/adapter variant in the Lyrical MT series, the above-suggested applicability scope may plausibly extend to:**

- Russian to English text-to-text translation in general.
- English to Russian translation.
The Lyrical MT models were fine-tuned primarily on single-line (fragment), double-line (couplet), quadruple-line (quatrain), and full-length bilingual textual inputs.
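Accordingly, couplet- or quatrain-sized requests are a natural fit at inference time. The sketch below (reusing `model` and `tokenizer` from the loading example above) translates a well-known Soviet couplet; the instruction phrasing is our own assumption, not a format the adapter is verified to require.

```python
# Inference sketch, reusing `model` and `tokenizer` from the loading example.
# Opening couplet of «Песня о Родине» ("Song of the Motherland").
couplet = (
    "Широка страна моя родная,\n"
    "Много в ней лесов, полей и рек!"
)
messages = [{
    "role": "user",
    "content": "Translate this Russian couplet into English, "
               f"preserving its rhythm and rhyme:\n{couplet}",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# R1-style models may emit a reasoning trace before the final translation.
output_ids = model.generate(
    input_ids, max_new_tokens=1024, do_sample=True, temperature=0.6
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```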
### Training Info

Training was conducted on a single L4 GPU (22.5 GB VRAM) via the TRL framework's ORPO Trainer, run through Unsloth over their 4-bit dynamically quantized variant of the DeepSeek-R1-0528 Qwen3-8B distilled model.

*Training metrics for the adapter variant herein (step 2400, roughly 4+ epochs into the training run):*

| Training Loss | rewards/chosen | rewards/rejected | rewards/accuracies | rewards/margins |
|:---:|:---:|:---:|:---:|:---:|
| 0.042500 | -0.000802 | -0.731392 | 1.000000 | 0.730590 |

### Training Data

Fine-tuned via Odds Ratio Preference Optimization (ORPO) on our ORPO-formatted [Russian-to-English song lyrics translation/localization dataset](https://huggingface.co/datasets/AlekseyCalvin/song_lyrics_Ru2En_PostSoviet_alt_anthems).
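For reference, a rough reconstruction of the described setup with Unsloth and TRL's `ORPOTrainer` might look as follows. This is a sketch under stated assumptions, not the exact training script: the target-module list, batch sizing, and dataset split are our guesses, while the remaining values mirror the hyperparameters listed in the next section.

```python
# Sketch of the described ORPO setup (Unsloth + TRL); not the exact script used.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import ORPOConfig, ORPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/DeepSeek-R1-0528-Qwen3-8B-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Rank-64 LoRA adapter; this target-module list is a common default, assumed here.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# ORPO expects prompt/chosen/rejected columns; the split name is assumed.
dataset = load_dataset(
    "AlekseyCalvin/song_lyrics_Ru2En_PostSoviet_alt_anthems", split="train"
)

trainer = ORPOTrainer(
    model=model,
    train_dataset=dataset,
    processing_class=tokenizer,
    args=ORPOConfig(
        learning_rate=1e-4,
        lr_scheduler_type="linear",
        warmup_steps=5,
        optim="adamw_8bit",
        beta=0.1,            # the "Beta/Decay" value listed below
        max_length=2048,
        output_dir="outputs",
    ),
)
trainer.train()
```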
### Hyperparameters

- Adapter Rank = 64
- Adapter Alpha = 64
- Learning Rate = 1e-4
- Max Sequence Length = 2048
- Optimizer = AdamW_8bit
- Learning Rate Scheduler Type = Linear
- Beta/Decay = 0.1
- Warmup Steps = 5

### Framework versions

- PEFT 0.17.1
- transformers 4.55.4

### Note:

We would appreciate feedback/reports from anyone who happens to try out this model or its other variants (to be released in the near future).