# nanochat-d32-sae-layer16-topk32

A TopK Sparse Autoencoder (SAE) trained on layer-16 residual-stream activations of karpathy/nanochat-d32 (1.88B params).

## Training Details

| Setting | Value |
|---|---|
| Base model | nanochat-d32 (1.88B params, bfloat16) |
| Layer | 16 (`blocks.16.hook_resid_post`) |
| SAE architecture | TopK (k=32) |
| Dimensions | 2048 → 8192 → 2048 |
| Activations | 50,000 from WikiText-103 |
| Epochs | 3 |
| Best train loss | 0.445701 |
| Explained variance | 57.3% |
| Alive features | 2116/8192 (26%) |
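
For orientation, here is a minimal sketch of what a TopK SAE with these dimensions computes. This is illustrative only, not the repo's actual `TopKSAE` implementation; the class name, attribute names, and the post-TopK ReLU convention are assumptions.

```python
import torch
import torch.nn as nn

class TopKSAESketch(nn.Module):
    """Illustrative TopK SAE: encode to 8192 dims, keep the k=32 largest
    pre-activations per token, zero the rest, then decode back to 2048."""

    def __init__(self, d_model: int = 2048, d_sae: int = 8192, k: int = 32):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, d_sae)
        self.decoder = nn.Linear(d_sae, d_model)

    def forward(self, x: torch.Tensor):
        pre = self.encoder(x)                          # (..., d_sae)
        topk = torch.topk(pre, self.k, dim=-1)         # k largest per token
        features = torch.zeros_like(pre)
        # ReLU on the surviving values is one common convention; the actual
        # implementation may differ.
        features.scatter_(-1, topk.indices, torch.relu(topk.values))
        return self.decoder(features), features
```

The TopK activation enforces exactly k active features per token, which is why sparsity needs no L1 penalty; "alive features" above counts the 2116 of 8192 latents that fire at least occasionally.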

## Usage

```python
import torch
from sae.config import SAEConfig
from sae.models import TopKSAE

# Load the checkpoint and rebuild the SAE from its saved config
checkpoint = torch.load("sae_final.pt", map_location="cpu")
config = SAEConfig.from_dict(checkpoint["config"])
sae = TopKSAE(config)
sae.load_state_dict(checkpoint["sae_state_dict"])

# Normalize input activations before passing them to the SAE.
# `activations` should be layer-16 residual-stream activations from the
# base model (see the extraction sketch below).
act_mean = checkpoint["act_mean"]
act_std = checkpoint["act_std"]
normalized = (activations - act_mean) / act_std
reconstruction, features, metrics = sae(normalized)
```
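
To obtain `activations`, you need the residual stream after block 16 of the base model. A hedged sketch using a standard PyTorch forward hook follows; it assumes `model` is the loaded nanochat-d32 module exposing a `blocks` ModuleList and `input_ids` is a tokenized batch, and the exact module path and output shape are assumptions to adapt to the real model object.

```python
import torch

captured = {}

def grab_resid(module, inputs, output):
    # Assumed: block output is the residual stream, shape (batch, seq, 2048).
    captured["act"] = output.detach()

# `model.blocks[16]` is a hypothetical path to the 17th transformer block.
handle = model.blocks[16].register_forward_hook(grab_resid)
with torch.no_grad():
    model(input_ids)
handle.remove()

# Flatten to (tokens, d_model) before normalizing and encoding
activations = captured["act"].reshape(-1, 2048).float()
```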

## Repository

Trained with nanochat-SAE.
