---
license: apache-2.0
language:
- en
- it
base_model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
library_name: transformers
pipeline_tag: text-generation
tags:
- nemotron_h
- sync2
- fractal-rl
- cognitive-behaviors
- self-improving
- thesia
- alignment
- lora
- peft
- custom_code
- endpoints-ready
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    max_new_tokens: 512
    do_sample: true
widget:
- text: |
    <|system|>You are sync2, a reasoner trained with Fractal RL.
    <|user|>Spiega in 3 punti il principio di Ollivier-Ricci.
    <|assistant|>
  example_title: Reasoning · IT
- text: |
    <|system|>You are sync2.
    <|user|>Write a Python function for Ricci curvature on a graph.
    <|assistant|>
  example_title: Code · EN
extra_gated_prompt: >-
  Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license.
  The model embeds Lorenzo Bernardini's Fractal-RL / THESIA research;
  cite arxiv:2503.01307 for cognitive-behaviors methodology.
extra_gated_fields:
  Affiliation: text
  Country: country
  Intended use: text
---

# Lorenzob/synch-2

**Base model**: `nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16` (120B MoE · 12B active)

**Variant**: adapter-only (LoRA) · **Final checkpoint**: `sync2_apogeo_si_iter1`

Ready for **Hugging Face Dedicated Inference Endpoints**, the HF Inference API, Together AI, Modal, and vLLM/TGI-compatible runtimes.
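The widget examples in the metadata lay out prompts with `<|system|>`, `<|user|>`, and `<|assistant|>` tags. The helper below is a minimal sketch of assembling a prompt in that layout. The name `build_prompt` and the `sep` parameter are hypothetical, and the separator between turns is an assumption (the card does not state it). If the tokenizer ships its own chat template, that may be the better choice.

```python
from typing import Optional


def build_prompt(user: str, system: Optional[str] = None, sep: str = "\n") -> str:
    """Assemble a prompt in the tag layout shown in the widget examples.

    `sep` is an assumption: the card does not say how turns are separated.
    """
    parts = []
    if system:
        parts.append(f"<|system|>{system}")
    parts.append(f"<|user|>{user}")
    # A trailing assistant tag asks the model to start its reply.
    parts.append("<|assistant|>")
    return sep.join(parts)


if __name__ == "__main__":
    print(build_prompt("Compute the Ollivier-Ricci curvature of K_5.",
                       system="You are sync2."))
```

Passing `sep=""` yields the single-line form, with the tags directly adjacent to the text.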
## Quickstart · Dedicated Endpoint

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    "Lorenzob/synch-2",
    token="",  # set your Hugging Face access token here
)

# Send a single-turn prompt and print the generated completion.
out = client.text_generation(
    "<|user|>Compute the Ollivier-Ricci curvature of K_5.<|assistant|>",
    max_new_tokens=512,
    temperature=0.7,
)
print(out)
```

## Quickstart · Local (PEFT)

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# The tokenizer is loaded from the adapter repository.
tok = AutoTokenizer.from_pretrained("Lorenzob/synch-2", trust_remote_code=True)

# Load the base weights in bfloat16, sharded across the available GPUs.
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
```

## Suggested Hardware

AWS · 8x H100 80GB (or an equivalent setup with 4x A100 80GB). In `bfloat16`, adapter + base require ~245 GB of VRAM.

## Training Pipeline

1. **SFT**: 9-dataset supervised fine-tuning (KIMI-K2.5, Opus-4.6-Reasoning, coding-200k, tulu-3-math, multilingual, Nettoov, MLEM, tool-reasoning).
2. **EML-RLHF**: symbolic regression reward (EML trees).
3. **Fractal RL**: GRPO with an 8-component reward (coherence + fractal + Ollivier-Ricci curvature + EML + math + reasoning + length + DPP).
4. **v11 APOGEO**: alignment SFT (300 steps) + self-improving reasoner (2 iterations × 80 prompts × top 30%). Best reward: `+0.200`.

## Attribution

- Cognitive behaviors: Gandhi et al. 2025 ([arXiv:2503.01307](https://arxiv.org/abs/2503.01307))
- Self-improving reasoner: [karpathy/nanochat](https://github.com/karpathy/nanochat)
- Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.

## License

Apache-2.0 (this adapter) · NVIDIA Nemotron-3 license (base weights).
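
## Sketch · VRAM estimate

As a rough cross-check of the ~245 GB figure under Suggested Hardware, the sketch below estimates weight memory from parameter count and dtype width, then compares the card's figure with the aggregate memory of the two suggested setups. It assumes a nominal 120e9 parameters (read off the "120B" in the model name) and 2 bytes per `bfloat16` weight. The helper names are hypothetical, and KV cache and activation memory are not modeled, so real headroom will be smaller.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Decimal gigabytes needed to hold `n_params` weights."""
    return n_params * bytes_per_param / 1e9


def headroom_gb(gpu_count: int, gb_per_gpu: int, needed_gb: float) -> float:
    """Aggregate GPU memory left over after reserving `needed_gb`."""
    return gpu_count * gb_per_gpu - needed_gb


BASE_PARAMS = 120e9   # assumption: nominal count taken from the model name
CARD_TOTAL_GB = 245   # adapter + base figure quoted in this card

base_gb = weight_memory_gb(BASE_PARAMS)  # bfloat16 = 2 bytes per weight

if __name__ == "__main__":
    print(f"base weights alone: {base_gb:.0f} GB")
    for label, count, per_gpu in [("8x H100 80GB", 8, 80), ("4x A100 80GB", 4, 80)]:
        spare = headroom_gb(count, per_gpu, CARD_TOTAL_GB)
        print(f"{label}: {count * per_gpu} GB total, {spare:.0f} GB spare")
```

Under these assumptions the base weights alone account for about 240 GB. The 4x A100 setup leaves much less spare memory than the 8x H100 one, which matters for long contexts or larger batches.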