Gated model: this repository is publicly accessible, but you must accept its conditions to access the files and content. Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license. The model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite arXiv:2503.01307 for the cognitive-behaviors methodology.

Lorenzob/synch-2

Base model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active)
Variant: adapter-only (LoRA) · Final checkpoint: sync2_apogeo_si_iter1

Ready for HuggingFace Dedicated Inference Endpoints, the HF Inference API, Together AI, Modal, and vLLM/TGI-compatible runtimes.

Quickstart · Dedicated Endpoint

from huggingface_hub import InferenceClient

# Point the client at the adapter repo; the token must have gated-repo access.
client = InferenceClient("Lorenzob/synch-2", token="<HF_TOKEN>")
out = client.text_generation(
    "<|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>",
    max_new_tokens=512,
    temperature=0.7,
)
print(out)
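The quickstart hard-codes the `<|user|>…<|assistant|>` prompt format. A small helper (hypothetical, not shipped with the repo) makes that template explicit and reusable; the tag format is taken from the quickstart above, not from an official chat template:

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user turn in the <|user|>/<|assistant|> tags from the quickstart.

    Hypothetical helper: the tag layout mirrors the example prompt above.
    """
    return f"<|user|>{user_message}<|assistant|>"


prompt = build_prompt("Compute the Ollivier-Ricci curvature of K_5.")
print(prompt)  # -> <|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>
```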

Quickstart · Local (PEFT)

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer ships with the adapter repo.
tok = AutoTokenizer.from_pretrained("Lorenzob/synch-2", trust_remote_code=True)

# Load the base model in bfloat16, sharded across available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter weights on top of the frozen base.
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
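Because the repo ships only adapter weights, what PeftModel effectively applies at inference is a low-rank update to each targeted base weight. A minimal numeric sketch of that composition, with toy shapes (the real rank and alpha live in the adapter's config, not here):

```python
def matmul(X, Y):
    """Tiny dense matrix multiply for the sketch (matrices as lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_merge(W, A, B, alpha, r):
    """Effective weight after merging a LoRA adapter: W + (alpha / r) * B @ A."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]


# Toy 2x2 example with a rank-1 adapter (hypothetical values, not the model's).
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight
A = [[1.0, 2.0]]              # down-projection, shape (r=1, d_in=2)
B = [[0.5], [0.0]]            # up-projection, shape (d_out=2, r=1)
print(lora_merge(W, A, B, alpha=2, r=1))  # -> [[2.0, 2.0], [0.0, 1.0]]
```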

Suggested Hardware

AWS · 8× H100 80GB (or equivalent 4× A100 80GB) — bfloat16 base + adapter ≈ 245 GB VRAM.
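The ~245 GB figure is consistent with back-of-envelope bfloat16 sizing (2 bytes per parameter for the 120B base, with the adapter and runtime overhead accounting for the remainder); a quick check:

```python
def bf16_gbytes(n_params: float) -> float:
    """Approximate weight memory in GB at 2 bytes per parameter (bfloat16)."""
    return n_params * 2 / 1e9


base_gb = bf16_gbytes(120e9)  # base weights alone
print(round(base_gb))  # -> 240
# Adapter weights plus KV cache / activation overhead make up the rest of ~245 GB.
```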

Training Pipeline

  1. SFT — supervised fine-tuning on 9 datasets (KIMI-K2.5, Opus-4.6-Reasoning, coding-200k, tulu-3-math, multilingual, Nettoov, MLEM, tool-reasoning).
  2. EML-RLHF — symbolic-regression reward (EML trees).
  3. Fractal RL — GRPO with an 8-component reward (coherence + fractal + Ollivier-Ricci curvature + EML + math + reasoning + length + DPP).
  4. v11 APOGEO — alignment SFT (300 steps) + self-improving reasoner (2 iterations × 80 prompts × top 30%). Best reward +0.200.
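Step 4's self-improving loop keeps only the top 30% of sampled completions per iteration for the next fine-tuning round. A schematic of that selection filter (hypothetical scoring; the actual pipeline uses the reward stack above):

```python
def top_fraction(samples, scores, frac=0.30):
    """Keep the highest-scoring fraction of samples (rounded down, minimum 1)."""
    k = max(1, int(len(samples) * frac))
    ranked = sorted(zip(scores, samples), key=lambda pair: pair[0], reverse=True)
    return [sample for _, sample in ranked[:k]]


# 10 toy completions with reward scores; top-30% keeps the best 3.
completions = [f"traj_{i}" for i in range(10)]
rewards = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.05, 0.6, 0.4, 0.5]
print(top_fraction(completions, rewards))  # -> ['traj_1', 'traj_5', 'traj_3']
```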

Attribution

  • Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
  • Self-improving reasoner: karpathy/nanochat
  • Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.

License

Apache-2.0 (this adapter) · NVIDIA Nemotron-3 license (base weights).
