Gated model: this repository is publicly accessible, but you must accept its conditions to access the files and content. Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license. The model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite arXiv:2503.01307 for the cognitive-behaviors methodology.

Lorenzob/synch-2

Base model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active)
Variant: adapter-only (LoRA) · Final checkpoint: sync2_apogeo_si_iter1

Ready for HuggingFace Dedicated Inference Endpoints, the HF Inference API, Together AI, Modal, and vLLM/TGI-compatible runtimes.

Quickstart · Dedicated Endpoint

from huggingface_hub import InferenceClient

# Point the client at the adapter repo; the token must have gated-repo access.
client = InferenceClient("Lorenzob/synch-2", token="<HF_TOKEN>")
out = client.text_generation(
    "<|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>",
    max_new_tokens=512,
    temperature=0.7,
)
print(out)
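The quickstart hard-codes the `<|user|>…<|assistant|>` prompt format. A small helper (hypothetical, not shipped with the repo) makes that template explicit and reusable; the tag format is taken from the quickstart above, not from an official chat template:

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user turn in the <|user|>/<|assistant|> tags from the quickstart.

    Hypothetical helper: the tag layout mirrors the example prompt above.
    """
    return f"<|user|>{user_message}<|assistant|>"


prompt = build_prompt("Compute the Ollivier-Ricci curvature of K_5.")
print(prompt)  # -> <|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>
```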

Quickstart · Local (PEFT)

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer ships with the adapter repo.
tok = AutoTokenizer.from_pretrained("Lorenzob/synch-2", trust_remote_code=True)

# Load the base model in bfloat16, sharded across available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter weights on top of the frozen base.
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
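Because the repo ships only adapter weights, what PeftModel effectively applies at inference is a low-rank update to each targeted base weight. A minimal numeric sketch of that composition, with toy shapes (the real rank and alpha live in the adapter's config, not here):

```python
def matmul(X, Y):
    """Tiny dense matrix multiply for the sketch (matrices as lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_merge(W, A, B, alpha, r):
    """Effective weight after merging a LoRA adapter: W + (alpha / r) * B @ A."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]


# Toy 2x2 example with a rank-1 adapter (hypothetical values, not the model's).
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight
A = [[1.0, 2.0]]              # down-projection, shape (r=1, d_in=2)
B = [[0.5], [0.0]]            # up-projection, shape (d_out=2, r=1)
print(lora_merge(W, A, B, alpha=2, r=1))  # -> [[2.0, 2.0], [0.0, 1.0]]
```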

Suggested Hardware

AWS · 8× H100 80GB (or equivalent 4× A100 80GB) — bfloat16 base + adapter ≈ 245 GB VRAM.
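The ~245 GB figure is consistent with back-of-envelope bfloat16 sizing (2 bytes per parameter for the 120B base, with the adapter and runtime overhead accounting for the remainder); a quick check:

```python
def bf16_gbytes(n_params: float) -> float:
    """Approximate weight memory in GB at 2 bytes per parameter (bfloat16)."""
    return n_params * 2 / 1e9


base_gb = bf16_gbytes(120e9)  # base weights alone
print(round(base_gb))  # -> 240
# Adapter weights plus KV cache / activation overhead make up the rest of ~245 GB.
```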

Training Pipeline

  1. SFT — supervised fine-tuning on 9 datasets (KIMI-K2.5, Opus-4.6-Reasoning, coding-200k, tulu-3-math, multilingual, Nettoov, MLEM, tool-reasoning).
  2. EML-RLHF — symbolic-regression reward (EML trees).
  3. Fractal RL — GRPO with an 8-component reward (coherence + fractal + Ollivier-Ricci curvature + EML + math + reasoning + length + DPP).
  4. v11 APOGEO — alignment SFT (300 steps) + self-improving reasoner (2 iterations × 80 prompts × top 30%). Best reward +0.200.
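Step 4's self-improving loop keeps only the top 30% of sampled completions per iteration for the next fine-tuning round. A schematic of that selection filter (hypothetical scoring; the actual pipeline uses the reward stack above):

```python
def top_fraction(samples, scores, frac=0.30):
    """Keep the highest-scoring fraction of samples (rounded down, minimum 1)."""
    k = max(1, int(len(samples) * frac))
    ranked = sorted(zip(scores, samples), key=lambda pair: pair[0], reverse=True)
    return [sample for _, sample in ranked[:k]]


# 10 toy completions with reward scores; top-30% keeps the best 3.
completions = [f"traj_{i}" for i in range(10)]
rewards = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.05, 0.6, 0.4, 0.5]
print(top_fraction(completions, rewards))  # -> ['traj_1', 'traj_5', 'traj_3']
```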

Attribution

  • Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
  • Self-improving reasoner: karpathy/nanochat
  • Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.

License

Apache-2.0 (this adapter) · NVIDIA Nemotron-3 license (base weights).
