metadata
license: apache-2.0
language:
- en
- it
base_model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
library_name: transformers
pipeline_tag: text-generation
tags:
- nemotron_h
- sync2
- fractal-rl
- cognitive-behaviors
- self-improving
- thesia
- alignment
- lora
- peft
- custom_code
- endpoints-ready
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    max_new_tokens: 512
    do_sample: true
widget:
- text: |
    <|system|>You are sync2, a reasoner trained with Fractal RL.
    <|user|>Spiega in 3 punti il principio di Ollivier-Ricci.
    <|assistant|>
  example_title: Reasoning · IT
- text: |
    <|system|>You are sync2.
    <|user|>Write a Python function for Ricci curvature on a graph.
    <|assistant|>
  example_title: Code · EN
extra_gated_prompt: >-
  Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license. The
  model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite
  arxiv:2503.01307 for cognitive-behaviors methodology.
extra_gated_fields:
  Affiliation: text
  Country: country
  Intended use: text
Lorenzob/synch-2
Base model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active)
Variant: adapter-only (LoRA) · Final checkpoint: sync2_apogeo_si_iter1
Ready for Hugging Face Dedicated Inference Endpoints, the HF Inference API, Together AI, Modal, and vLLM/TGI-compatible runtimes.
Quickstart · Dedicated Endpoint
from huggingface_hub import InferenceClient

# Replace <HF_TOKEN> with a token that has access to the gated repo.
client = InferenceClient("Lorenzob/synch-2", token="<HF_TOKEN>")
out = client.text_generation(
    "<|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>",
    max_new_tokens=512,
    temperature=0.7,
)
print(out)
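The prompts in this card use three role tags: `<|system|>`, `<|user|>`, and `<|assistant|>`. The helper below is a minimal sketch for assembling them consistently. It is inferred from the widget examples, not from a documented chat template, so the function name `build_prompt` and the newline separator are assumptions. If the tokenizer ships a chat template, `tok.apply_chat_template` is probably the safer route.

```python
from typing import Optional


def build_prompt(user: str, system: Optional[str] = None, sep: str = "\n") -> str:
    """Assemble a prompt from the role tags seen in this card's examples.

    Hypothetical helper: the tag layout and separator are assumptions
    based on the widget examples, not a confirmed template.
    """
    parts = []
    if system:
        parts.append(f"<|system|>{system}")
    parts.append(f"<|user|>{user}")
    # A trailing assistant tag cues the model to start its reply.
    parts.append("<|assistant|>")
    return sep.join(parts)


prompt = build_prompt(
    "Compute the Ollivier-Ricci curvature of K_5.",
    system="You are sync2.",
)
print(prompt)
```

The resulting string can be passed as the first argument to `client.text_generation` in place of a hand-written prompt.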
Quickstart · Local (PEFT)
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer from the adapter repo.
tok = AutoTokenizer.from_pretrained("Lorenzob/synch-2", trust_remote_code=True)

# Load the base weights in bfloat16, sharded across the available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
Suggested Hardware
AWS · 8x H100 80GB (or equivalent, e.g. 4x A100 80GB). The bfloat16 base plus adapter needs about 245 GB of VRAM.
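The VRAM figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes 2 bytes per bfloat16 parameter and that all 120B parameters stay resident even though only 12B are active per token. It counts weights only: KV cache and activations are not modelled, and attributing the remainder to the adapter and runtime overhead is an assumption.

```python
def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Decimal gigabytes needed to hold the weights alone."""
    return n_params * bytes_per_param / 1e9


base_gb = weights_gb(120e9)   # 120B parameters in bfloat16 (2 bytes each)
quoted_total_gb = 245         # figure quoted in this card
# Assumed to cover the LoRA adapter plus runtime overhead.
overhead_gb = quoted_total_gb - base_gb

print(f"base weights: {base_gb:.0f} GB, remainder: {overhead_gb:.0f} GB")

# Compare the quoted total with the two hardware options in the card.
for name, gpus, gb_each in [("8x H100 80GB", 8, 80), ("4x A100 80GB", 4, 80)]:
    total = gpus * gb_each
    print(f"{name}: {total} GB total, headroom {total - quoted_total_gb} GB")
```

Both configurations clear the quoted total on paper, but the 4-GPU option leaves much less room for KV cache and long contexts.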
Training Pipeline
- SFT: 9-dataset supervised fine-tuning (KIMI-K2.5, Opus-4.6-Reasoning, coding-200k, tulu-3-math, multilingual, Nettoov, MLEM, tool-reasoning).
- EML-RLHF: symbolic regression reward (EML trees).
- Fractal RL: GRPO with an 8-component reward (coherence + fractal + Ollivier-Ricci curvature + EML + math + reasoning + length + DPP).
- v11 APOGEO: alignment SFT (300 steps) + self-improving reasoner (2 iterations × 80 prompts × top 30%). Best reward: +0.200.
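Two steps in the pipeline above lend themselves to a short illustration: the group-relative advantage that GRPO computes from a batch of rewards, and the top-30% filter in the self-improving stage. The sketch below is not the author's training code. The function names are hypothetical, the rewards are dummy values, and it assumes the 30% cut is taken over the 80 prompts of one iteration, ranked by scalar reward.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style normalisation: centre and scale rewards
    within one group of samples drawn for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def select_top_fraction(scored, fraction=0.30):
    """Keep the highest-reward fraction of (sample, reward) pairs.

    Assumption: this mirrors the "top-30%" step; the real selection
    rule is not specified in the card.
    """
    keep = max(1, round(len(scored) * fraction))
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:keep]


# Dummy rewards for one group of three completions.
advs = group_relative_advantages([1.0, 2.0, 3.0])

# Dummy scores for 80 prompts, as in one self-improvement iteration.
scored = [(f"prompt-{i}", i / 100) for i in range(80)]
kept = select_top_fraction(scored)
print(len(kept), "samples kept out of", len(scored))
```

Under these assumptions, each iteration would keep 24 of its 80 samples for the next round of fine-tuning.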
Attribution
- Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
- Self-improving reasoner: karpathy/nanochat
- Fractal RL · LCTR · THESIA: Lorenzo Bernardini's publications.
License
Apache-2.0 (this adapter) · NVIDIA Nemotron-3 license (base weights).