
Lorenzob/synch-2-merged

Full merged model: nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 (120B MoE · 12B active) + Lorenzob/synch-2 (v11 APOGEO LoRA · best reward +0.200).

Drop-in compatible with: HF Dedicated Endpoints, TGI, vLLM, HF Inference API, Together AI, Modal, RunPod, Replicate.

Quickstart · Dedicated Endpoint

from huggingface_hub import InferenceClient
client = InferenceClient("Lorenzob/synch-2-merged", token="<HF_TOKEN>")
out = client.chat_completion(
    messages=[
        {"role": "user",
         "content": "Compute the Ollivier-Ricci curvature of K_5."},
    ],
    max_tokens=512, temperature=0.7,
)
print(out.choices[0].message.content)
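
The client also accepts a Dedicated Endpoint URL in place of the repo id, which is the usual setup once an endpoint is deployed. A minimal variant; the URL below is a placeholder for your own deployment:

# Point the client at a Dedicated Endpoint instead of the hub repo id.
client = InferenceClient(
    "https://<your-endpoint>.endpoints.huggingface.cloud", token="<HF_TOKEN>",
)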

Quickstart · TGI (Text Generation Inference)

docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id Lorenzob/synch-2-merged --trust-remote-code \
    --num-shard 4 --max-input-length 4096 --max-total-tokens 8192
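
Once the container is up, TGI serves a REST API on the mapped port. A minimal query against the /generate route, assuming default settings and localhost:

import requests

# Plain text-generation request; parameters mirror the server limits above.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Compute the Ollivier-Ricci curvature of K_5.",
        "parameters": {"max_new_tokens": 512, "temperature": 0.7},
    },
)
print(resp.json()["generated_text"])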

Quickstart · vLLM

python -m vllm.entrypoints.openai.api_server \
    --model Lorenzob/synch-2-merged --trust-remote-code \
    --dtype bfloat16 --tensor-parallel-size 4 \
    --max-model-len 8192
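
vLLM exposes an OpenAI-compatible API (port 8000 by default). A minimal query with the openai Python client; the base_url and placeholder api_key assume a local, unauthenticated server:

from openai import OpenAI

# vLLM ignores the API key by default; "EMPTY" is a conventional placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="Lorenzob/synch-2-merged",
    messages=[{"role": "user",
               "content": "Compute the Ollivier-Ricci curvature of K_5."}],
    max_tokens=512, temperature=0.7,
)
print(resp.choices[0].message.content)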

Quickstart · Local (transformers full weights)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "Lorenzob/synch-2-merged", trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Lorenzob/synch-2-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Minimal generation round-trip (assumes the repo ships a chat template).
messages = [{"role": "user",
             "content": "Compute the Ollivier-Ricci curvature of K_5."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tok.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))

Suggested Hardware

AWS · 4x H100 80GB (recommended); 8x A100 80GB also works. Total BF16 weights ~245 GB (roughly 2 bytes per parameter), split across 50 shards (model-XXXXX-of-00050.safetensors).

Merge Details

  • Base: NVIDIA Nemotron-3-Super-120B-A12B-BF16 (120B MoE, 12B active)
  • Adapter: Lorenzob/synch-2 (v11 APOGEO, best reward +0.200)
  • LoRA rank: 64 · LoRA alpha: 32
  • Merge type: weight addition (peft merge_and_unload; see the sketch below)
  • Attention impl: SDPA (default), FlashAttention-2 supported
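
The merge follows peft's standard merge_and_unload flow. A minimal sketch, assuming the Lorenzob/synch-2 adapter repo loads directly with PeftModel.from_pretrained; the output directory name is illustrative:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the BF16 base weights (120B MoE, 12B active).
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
# Attach the v11 APOGEO LoRA adapter (rank 64, alpha 32).
model = PeftModel.from_pretrained(base, "Lorenzob/synch-2")
# Fold W + (alpha/r) * B A into the base weights and drop the adapter modules.
merged = model.merge_and_unload()
merged.save_pretrained("synch-2-merged", safe_serialization=True)  # illustrative path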

Governance

See the dedicated governance documents in this repository.

Attribution

  • Cognitive behaviors: Gandhi et al. 2025 (arXiv:2503.01307)
  • Self-improving reasoner: karpathy/nanochat
  • Fractal RL · LCTR · THESIA: publications by Lorenzo Bernardini

License

Apache-2.0 (this merged model) · the NVIDIA Nemotron-3 license applies to distribution of the underlying base weights.


Evaluation results

  • Best Fractal-RL Composite Reward on Synch2 Internal Eval (private): 0.200 (self-reported)