Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license. The model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite arXiv:2503.01307 for the cognitive-behaviors methodology.
Lorenzob/synch-2
Base model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active)
Variant: adapter-only (LoRA) · Final checkpoint: sync2_apogeo_si_iter1
Ready for Hugging Face Dedicated Inference Endpoints, the HF Inference API, Together AI, Modal, and vLLM/TGI-compatible runtimes.
Quickstart · Dedicated Endpoint
from huggingface_hub import InferenceClient

client = InferenceClient("Lorenzob/synch-2", token="<HF_TOKEN>")
out = client.text_generation(
    "<|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>",
    max_new_tokens=512,
    temperature=0.7,
)
print(out)
Quickstart · Local (PEFT)
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer is loaded from the adapter repository.
tok = AutoTokenizer.from_pretrained("Lorenzob/synch-2", trust_remote_code=True)

# Load the base weights in bfloat16, sharded across the available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
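The local quickstart stops once the adapter is attached. A minimal sketch of a generation step follows, assuming the same `<|user|>…<|assistant|>` prompt layout and sampling settings as the Dedicated Endpoint example. The helper names `build_prompt` and `generate_reply` are hypothetical, and if the tokenizer ships its own chat template, that template should probably take precedence over the hand-built string.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the prompt layout used in the endpoint quickstart."""
    return f"<|user|>{user_message}<|assistant|>"


def generate_reply(model, tok, user_message: str,
                   max_new_tokens: int = 512, temperature: float = 0.7) -> str:
    """Sample one completion from the PEFT-wrapped model (illustrative helper)."""
    import torch  # imported lazily so build_prompt works without a GPU stack

    inputs = tok(build_prompt(user_message), return_tensors="pt").to(model.device)
    with torch.inference_mode():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,  # temperature only takes effect when sampling
            temperature=temperature,
        )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tok` loaded as above, a call such as `generate_reply(model, tok, "Compute the Ollivier-Ricci curvature of K_5.")` would mirror the endpoint example.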
Suggested Hardware
AWS · 8× H100 80GB (or an equivalent setup such as 4× A100 80GB). The bfloat16 base plus adapter takes ~245 GB of VRAM.
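The VRAM figure can be sanity-checked with simple arithmetic. The sketch below assumes 2 bytes per parameter for bfloat16 and 1 GB = 10^9 bytes. The function names are illustrative, and the result covers weights only, so activations and the KV cache need additional headroom.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_in_vram(required_gb: float, num_gpus: int, gb_per_gpu: float) -> bool:
    """True if the pooled GPU memory covers the requirement."""
    return num_gpus * gb_per_gpu >= required_gb


# All 120B parameters must be resident, even though only 12B are active per token.
base_gb = weight_memory_gb(120e9)   # bfloat16 base weights
required_gb = 245                   # figure quoted above (base + adapter)

print(f"base weights: {base_gb:.0f} GB")
print("4x 80GB fits:", fits_in_vram(required_gb, 4, 80))
print("8x 80GB fits:", fits_in_vram(required_gb, 8, 80))
```

Under these assumptions the base weights account for about 240 GB of the quoted ~245 GB, and a 4× 80GB node leaves roughly 75 GB for everything else.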
Training Pipeline
- SFT: 9-dataset supervised fine-tuning (KIMI-K2.5, Opus-4.6-Reasoning, coding-200k, tulu-3-math, multilingual, Nettoov, MLEM, tool-reasoning).
- EML-RLHF: symbolic regression reward (EML trees).
- Fractal RL: GRPO with an 8-component reward (coherence + fractal + Ollivier-Ricci curvature + EML + math + reasoning + length + DPP).
- v11 APOGEO: alignment SFT (300 steps) + self-improving reasoner (2 iterations × 80 prompts × top 30%). Best reward: +0.200.
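The Fractal RL and APOGEO stages can be illustrated with a rough sketch. The card does not say how the eight reward components are weighted or combined, so the equal-weight sum below is an assumption, as are all function names. The advantage step is the usual GRPO group normalization (reward minus group mean, divided by group standard deviation). The top-30% filter is one plausible reading of the self-improving reasoner's selection rule, not a confirmed detail of the author's pipeline.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional, Sequence

# Component names taken from the list above; the keys themselves are hypothetical.
COMPONENTS = ("coherence", "fractal", "curvature", "eml",
              "math", "reasoning", "length", "dpp")


def combine_rewards(scores: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of the 8 component scores (equal weights are an assumption)."""
    if weights is None:
        weights = {name: 1.0 / len(COMPONENTS) for name in COMPONENTS}
    return sum(weights[name] * scores[name] for name in COMPONENTS)


def group_relative_advantages(rewards: Sequence[float],
                              eps: float = 1e-8) -> List[float]:
    """GRPO-style advantages: normalize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def keep_top_percent(samples: Sequence, rewards: Sequence[float],
                     percent: int = 30) -> list:
    """Keep the highest-reward samples, e.g. the top 30% for the next iteration."""
    keep = max(1, len(samples) * percent // 100)
    ranked = sorted(zip(rewards, samples), key=lambda pair: pair[0], reverse=True)
    return [sample for _, sample in ranked[:keep]]
```

With 80 prompts and `percent=30`, `keep_top_percent` retains 24 samples per iteration under this reading.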
Attribution
- Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
- Self-improving reasoner: karpathy/nanochat
- Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.
License
Apache-2.0 (this adapter) · NVIDIA Nemotron-3 license (base weights).