
Lorenzob/synch-2-merged

Full merged model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active) + Lorenzob/synch-2 (v11 APOGEO LoRA · best reward +0.200).

Drop-in compatible with: HF Dedicated Endpoints, TGI, vLLM, HF Inference API, Together AI, Modal, RunPod, Replicate.

Quickstart · Dedicated Endpoint

from huggingface_hub import InferenceClient
client = InferenceClient("Lorenzob/synch-2-merged", token="<HF_TOKEN>")
out = client.chat_completion(
    messages=[
        {"role": "user",
         "content": "Compute the Ollivier-Ricci curvature of K_5."},
    ],
    max_tokens=512, temperature=0.7,
)
print(out.choices[0].message.content)
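
The client also accepts a Dedicated Endpoint URL in place of the repo id, which is the usual setup once an endpoint is deployed. A minimal variant; the URL below is a placeholder for your own deployment:

# Point the client at a Dedicated Endpoint instead of the hub repo id.
client = InferenceClient(
    "https://<your-endpoint>.endpoints.huggingface.cloud", token="<HF_TOKEN>",
)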

Quickstart · TGI (Text Generation Inference)

docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id Lorenzob/synch-2-merged --trust-remote-code \
    --num-shard 4 --max-input-length 4096 --max-total-tokens 8192
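
Once the container is up, TGI serves a REST API on the mapped port. A minimal query against the /generate route, assuming default settings and localhost:

import requests

# Plain text-generation request; parameters mirror the server limits above.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Compute the Ollivier-Ricci curvature of K_5.",
        "parameters": {"max_new_tokens": 512, "temperature": 0.7},
    },
)
print(resp.json()["generated_text"])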

Quickstart · vLLM

python -m vllm.entrypoints.openai.api_server \
    --model Lorenzob/synch-2-merged --trust-remote-code \
    --dtype bfloat16 --tensor-parallel-size 4 \
    --max-model-len 8192
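
vLLM exposes an OpenAI-compatible API (port 8000 by default). A minimal query with the openai Python client; the base_url and placeholder api_key assume a local, unauthenticated server:

from openai import OpenAI

# vLLM ignores the API key by default; "EMPTY" is a conventional placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Lorenzob/synch-2-merged",
    messages=[{"role": "user",
               "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512, temperature=0.7,
)
print(resp.choices[0].message.content)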

Quickstart · Local (transformers full weights)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "Lorenzob/synch-2-merged", trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Lorenzob/synch-2-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Minimal generation round-trip (assumes the repo ships a chat template).
messages = [{"role": "user",
             "content": "Compute the Ollivier-Ricci curvature of K_5."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))

Suggested Hardware

AWS · 4x H100 80GB (recommended); 8x A100 80GB also works. Total BF16 weights ~245 GB (roughly 2 bytes per parameter), split across 50 shards (model-XXXXX-of-00050.safetensors).

Merge Details

  • Base: NVIDIA Nemotron-3-Super-120B-A12B-BF16 (120B MoE, 12B active)
  • Adapter: Lorenzob/synch-2 (v11 APOGEO, best reward +0.200)
  • LoRA rank: 64 · LoRA alpha: 32
  • Merge type: weight addition (peft merge_and_unload; see the sketch below)
  • Attention impl: SDPA (default), FlashAttention-2 supported
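
The merge follows peft's standard merge_and_unload flow. A minimal sketch, assuming the Lorenzob/synch-2 adapter repo loads directly with PeftModel.from_pretrained; the output directory name is illustrative:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the BF16 base weights (120B MoE, 12B active).
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
# Attach the v11 APOGEO LoRA adapter (rank 64, alpha 32).
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
# Fold W + (alpha/r) * B A into the base weights and drop the adapter modules.
merged = model.merge_and_unload()
merged.save_pretrained("synch-2-merged", safe_serialization=True)  # illustrative path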

Governance

See the dedicated governance documents in this repository.

Attribution

  • Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
  • Self-improving reasoner: karpathy/nanochat
  • Fractal RL · LCTR · THESIA: publications by Lorenzo Bernardini

License

Apache-2.0 (this merged model) · the NVIDIA Nemotron-3 license applies to distribution of the underlying base weights.


Evaluation results

  • Best Fractal-RL Composite Reward on Synch2 Internal Eval (private): 0.200 (self-reported)