# nebula-8lang-7b
A fine-tune of Qwen/Qwen2.5-7B that translates Nebula, a universal code intermediate language, into 8 target programming languages: Python, JavaScript, TypeScript, Go, Swift, Kotlin, Rust, and C.

Part of the Nebula 1.0 release. Nebula is a token-efficient canonical form: on average it is 16% smaller than the equivalent source code across the 8 languages, while round-tripping cleanly back to any of them.
## Training
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B |
| Method | LoRA (SFT) |
| LoRA rank / alpha | 16 / 16 |
| LoRA dropout | 0.05 |
| LoRA modules | all-linear |
| Epochs | 3 |
| Learning rate | 1e-5 |
| Batch size | 8 |
| Training data | electrocampbell/nebula-8lang-68k (68K pairs) |
| Trained on | Together AI |
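The adapter hyperparameters above correspond roughly to the following `peft` configuration. This is a sketch reconstructed from the table, not the actual training script (which is not published here), so treat it as illustrative:

```python
from peft import LoraConfig

# Approximate reconstruction of the adapter setup from the table above.
# The real run was done on Together AI; this config is an assumption.
lora_config = LoraConfig(
    r=16,                          # LoRA rank
    lora_alpha=16,                 # scaling factor (alpha / rank = 1.0)
    lora_dropout=0.05,
    target_modules="all-linear",   # attach adapters to every linear layer
    task_type="CAUSAL_LM",
)
```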
## Evaluation

**HumanEval** (164 problems, Nebula→Python, Pass@1):
| Model | Raw | With Error Correction |
|---|---|---|
| nebula-8lang-1.5b | 45.1% | 79.3% |
| nebula-8lang-7b (this model) | 67.7% | 88.4% |
| nebula-8lang-14b | 57.9% (89.0% on H100) | 88.4% |
**MBPP** (500 problems, Nebula→Python, Pass@1): 55.4%
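Pass@1 here follows the standard unbiased pass@k estimator; with a single sample per problem it reduces to the fraction of problems whose generated Python passes all unit tests. A minimal sketch (function and variable names are illustrative, not part of this repo's eval harness):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c of which passed."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With one sample per problem (n=1, k=1), pass@1 is simply the pass rate.
results = [True, False, True, True]  # hypothetical per-problem outcomes
score = sum(pass_at_k(1, int(ok), 1) for ok in results) / len(results)
print(f"pass@1 = {score:.1%}")  # pass@1 = 75.0%
```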
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("electrocampbell/nebula-8lang-7b")
model = AutoModelForCausalLM.from_pretrained("electrocampbell/nebula-8lang-7b")

system = (
    "You are a code translator. Given code in Nebula (a universal intermediate "
    "language), produce the equivalent idiomatic Python code. "
    "Output only the Python code, no explanations."
)
nebula_code = "fn add(a, b): rt a + b"

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": nebula_code},
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```
To translate into another language, replace `Python` in the system prompt with any of: JavaScript, TypeScript, Go, Swift, Kotlin, Rust, or C.
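Since only the target-language name in the system prompt changes, the prompt construction can be factored into a small helper. The helper name and structure below are illustrative (not part of the model's API); only the system-prompt wording matches the usage example above:

```python
# The 8 target languages supported by this model.
TARGETS = {"Python", "JavaScript", "TypeScript", "Go", "Swift", "Kotlin", "Rust", "C"}

def build_messages(nebula_code: str, target: str = "Python") -> list:
    """Build chat messages for translating Nebula into `target`.

    Illustrative helper: it reproduces the system prompt from the usage
    example with the target-language name substituted in.
    """
    if target not in TARGETS:
        raise ValueError(f"unsupported target language: {target}")
    system = (
        "You are a code translator. Given code in Nebula (a universal "
        f"intermediate language), produce the equivalent idiomatic {target} "
        f"code. Output only the {target} code, no explanations."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": nebula_code},
    ]

messages = build_messages("fn add(a, b): rt a + b", target="Rust")
```

The returned list can be passed straight to `tokenizer.apply_chat_template(...)` as in the usage example.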
## Citation
If you use this model, please cite the Nebula project: https://github.com/colinc86/nebula
## License
Apache 2.0, inherited from the Qwen 2.5 base model.