OmniTranslate 1.0 LoRA
This is the LoRA version of OmniTranslate.
If you don't want to download the full merged model, you can download this LoRA and run it with the base Qwen 3 0.6B.
Otherwise, stick to the full merged version. The model card starts below.
OmniTranslate 1.0
OmniTranslate is a massively multilingual machine translation model supporting over 500 languages. Fine-tuned from Qwen 3 0.6B (with Unsloth), this model is designed for translation tasks on any device!
Features
- 500+ Languages Supported: The broadest coverage of languages supported for a translation model that's under 1 billion parameters!
- Tiny Size: Far faster and lighter on memory than any large translation model, and small enough to run on modest hardware!
Issues
- Accuracy on Common Languages: Accuracy on common languages in the dataset (like Spanish, Chinese, and Romanian) is generally very good! Still, OmniTranslate can occasionally hiccup. Examples are roșă and ami when translating to Romanian.
- Accuracy on Rare Languages: Accuracy on rare languages in the dataset (like Toki Pona) isn't as good as on common languages!
As such, OmniTranslate 1.0 is an experimental model and shouldn't be used for tasks where accurate translations matter.
Notes
Providing the ISO code (e.g. ron_Latn) instead of the language name can improve the results a lot.
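For example, here is the prompt format with an ISO code versus a plain language name (a minimal sketch; `build_prompt` is a hypothetical helper I wrote for illustration, not part of this repo):

```python
# Build the chat-style prompt the model expects.
# `build_prompt` is a hypothetical helper, not part of the model repo.
def build_prompt(target_lang: str, text: str) -> str:
    return (
        "<|im_start|>user\n"
        f"Translate to {target_lang}: {text}<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n"
    )

# ISO code form (usually more reliable) vs. plain language name:
print(build_prompt("ron_Latn", "Hello, world!"))
print(build_prompt("Romanian", "Hello, world!"))
```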
Usage
Code is by Gemini 3 Flash:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from your Hugging Face Repo
model_id = "MihaiPopa-1/OmniTranslate-1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # Standard for CPU
    device_map="cpu",           # Forces CPU usage
)

# 2. Translate (replace ron_Latn with your language here)
prompt = "<|im_start|>user\nTranslate to ron_Latn: OmniTranslate is a massively multilingual machine translation model supporting over 500 languages!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,   # required for temperature to take effect
        temperature=0.1,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
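The decoded text still contains the prompt and the model's thinking block. A small post-processing sketch, assuming the model closes its reasoning with a `</think>` tag (the helper name is mine, not part of the repo):

```python
def extract_translation(decoded: str) -> str:
    """Return only the text after the assistant's </think> tag.

    Assumes the model emits a `</think>` marker before the translation;
    if it doesn't, the decoded text is returned unchanged.
    """
    marker = "</think>"
    if marker in decoded:
        return decoded.split(marker, 1)[1].strip()
    return decoded.strip()

# Example: keep only what follows the closing think tag.
print(extract_translation("prompt...<think>\nreasoning\n</think>\nTranslated text here"))
```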
Data Used
I used my own OmniSurgical 1.0 dataset, which is itself an extract of HF's FineTranslations.
120 sentences per language (60 per language pair).
Uploaded model
- Developed by: MihaiPopa-1
- License: apache-2.0
- Finetuned from model: unsloth/qwen3-0.6b-unsloth-bnb-4bit
This qwen3 model was trained 2x faster with Unsloth
