---
license: apache-2.0
language:
- en
- it
base_model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
library_name: transformers
pipeline_tag: text-generation
tags:
- nemotron_h
- sync2
- fractal-rl
- cognitive-behaviors
- self-improving
- thesia
- alignment
- merged
- nemotron-h
- conversational
- custom_code
- endpoints-ready
- tgi-compatible
- vllm-compatible
inference:
  parameters:
    temperature: 0.7
    top_p: 0.9
    max_new_tokens: 512
    do_sample: true
widget:
- messages:
  - role: system
    content: You are sync2, a reasoner trained with Fractal RL.
  - role: user
    content: Spiega il principio di Ollivier-Ricci in 3 punti.
  example_title: Reasoning · IT
- messages:
  - role: user
    content: >-
      Write a Python function that computes Ollivier-Ricci curvature on a
      graph.
  example_title: Code · EN
model-index:
- name: Lorenzob/synch-2-merged
  results:
  - task:
      type: text-generation
      name: Cognitive Reasoning · Fractal-RL Composite
    dataset:
      type: internal
      name: Synch2 Internal Eval (private)
    metrics:
    - type: fractal_rl_reward
      value: 0.2
      name: Best Fractal-RL Composite Reward
extra_gated_prompt: >-
  Accept the Apache-2.0 license and the NVIDIA Nemotron-3 base license. The
  model embeds Lorenzo Bernardini's Fractal-RL / THESIA research; cite
  arXiv:2503.01307 for the cognitive-behaviors methodology.
extra_gated_fields:
  Affiliation: text
  Country: country
  Intended use: text
---
# Lorenzob/synch-2-merged
**Full merged model:** nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active) + Lorenzob/synch-2 (v11 APOGEO LoRA · best reward +0.200).

Drop-in compatible with: HF Dedicated Endpoints, TGI, vLLM, HF Inference API, Together AI, Modal, RunPod, Replicate.
## Quickstart · Dedicated Endpoint
```python
from huggingface_hub import InferenceClient

client = InferenceClient("Lorenzob/synch-2-merged", token="<HF_TOKEN>")
out = client.chat_completion(
    messages=[
        {"role": "user",
         "content": "Compute the Ollivier-Ricci curvature of K_5."},
    ],
    max_tokens=512, temperature=0.7,
)
print(out.choices[0].message.content)
```
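For token-by-token output, `InferenceClient.chat_completion` also supports streaming; a minimal variant of the call above:

```python
# Streaming variant: chunks arrive as they are generated.
for chunk in client.chat_completion(
    messages=[{"role": "user",
               "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512, temperature=0.7, stream=True,
):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```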
## Quickstart · TGI (Text Generation Inference)
```bash
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id Lorenzob/synch-2-merged --trust-remote-code \
  --num-shard 4 --max-input-length 4096 --max-total-tokens 8192
```
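Once the container is up, recent TGI versions expose an OpenAI-style Messages API, so the same `InferenceClient` can target the local server (port 8080 as mapped above; a sketch, not part of the original quickstart):

```python
from huggingface_hub import InferenceClient

# Point the client at the local TGI server instead of the Hub.
client = InferenceClient("http://localhost:8080")
out = client.chat_completion(
    messages=[{"role": "user",
               "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512,
)
print(out.choices[0].message.content)
```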
## Quickstart · vLLM
```bash
python -m vllm.entrypoints.openai.api_server \
  --model Lorenzob/synch-2-merged --trust-remote-code \
  --dtype bfloat16 --tensor-parallel-size 4 \
  --max-model-len 8192
```
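vLLM serves an OpenAI-compatible API (port 8000 by default), so any OpenAI SDK can query it; a minimal sketch with the `openai` Python client (the `api_key` value is a placeholder, vLLM ignores it unless auth is configured):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Lorenzob/synch-2-merged",
    messages=[{"role": "user",
               "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512,
    temperature=0.7,
)
print(resp.choices[0].message.content)
```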
## Quickstart · Local (transformers full weights)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "Lorenzob/synch-2-merged", trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Lorenzob/synch-2-merged",
    torch_dtype=torch.bfloat16,  # full BF16 weights, ~245 GB total
    device_map="auto",           # shard across all visible GPUs
    trust_remote_code=True,      # Nemotron-H custom modeling code
)
```
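From there, a minimal generation sketch, assuming the custom tokenizer ships a chat template (sampling parameters mirror the defaults in the metadata above):

```python
messages = [{"role": "user",
             "content": "Write a Python function that computes "
                        "Ollivier-Ricci curvature on a graph."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(
    inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9
)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```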
## Suggested Hardware

AWS · 4× H100 80GB (recommended); 8× A100 80GB also works. Total BF16 weights are ~245 GB, split into 50 shards (`model-XXXXX-of-00050.safetensors`).
## Merge Details

- Base: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE, 12B active)
- Adapter: Lorenzob/synch-2 (v11 APOGEO, best reward +0.200)
- LoRA rank: 64 · LoRA alpha: 32
- Merge type: weight addition (peft `merge_and_unload`)
- Attention impl: SDPA (default), FlashAttention-2 supported
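For reference, a minimal sketch of how a merge of this kind is typically produced with PEFT, using the base and adapter repos listed above (output path illustrative):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
# Load the LoRA adapter on top of the base, fold the low-rank deltas
# into the base weights, and drop the PEFT wrappers.
merged = PeftModel.from_pretrained(base, "Lorenzob/synch-2").merge_and_unload()
merged.save_pretrained("synch-2-merged", safe_serialization=True)  # illustrative path
```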
## Governance

See the dedicated documents in the repo:

- `bias.md` — bias analysis
- `safety.md` — safety considerations
- `privacy.md` — privacy implications
- `explainability.md` — interpretability notes
- `accuracy_chart.png` — eval results visual
## Attribution
- Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
- Self-improving reasoner: karpathy/nanochat
- Fractal RL · LCTR · THESIA: Lorenzo Bernardini publications.
## License

Apache-2.0 for this merged model; the NVIDIA Nemotron-3 license applies to the underlying base weights.