ChessQwen3Base-6p5e19

Small Qwen3 base (pretrained) models for the chess compute-allocation study, all trained at a fixed total compute of C = 6.5e19 FLOPs.

The collection is a matrix of model size × alpha (pretrain-compute fraction). To keep one tidy repo, the two axes are mapped onto Hub primitives:

  • alpha → git branch (revision), e.g. alpha0.200
  • size → folder (subfolder), e.g. 20m
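
As an illustration of this mapping, the sketch below turns a (size, alpha) pair into the revision and subfolder names and computes the implied pretraining budget alpha × C. The helper name cell is hypothetical, and the three-decimal branch format is an assumption inferred from the single example alpha0.200; check the repo's branch list for the actual names.

```python
from typing import NamedTuple

TOTAL_FLOPS = 6.5e19  # fixed total compute C for this collection


class Cell(NamedTuple):
    revision: str          # git branch name (the alpha axis)
    subfolder: str         # size folder (the size axis)
    pretrain_flops: float  # alpha * C


def cell(size: str, alpha: float) -> Cell:
    """Map one (size, alpha) pair onto Hub primitives (hypothetical helper)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is a compute fraction and must lie in [0, 1]")
    # Assumption: branches are named "alpha" followed by alpha with three decimals.
    return Cell(f"alpha{alpha:.3f}", size, alpha * TOTAL_FLOPS)


c = cell("20m", 0.2)
print(c.revision, c.subfolder, f"{c.pretrain_flops:.2e}")
```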

Loading

These models ship a custom tokenizer (tokenizer.py), so trust_remote_code=True is required. Because the custom-code loader in transformers does not honor subfolder=, snapshot the size folder locally first and load from that path:

from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

repo  = "pavelslab-nyu/ChessQwen3Base-6p5e19"
size  = "20m"          # 5m, 10m, 20m, 32m, 50m, 100m, 410m, 680m, 1000m, ...
alpha = "alpha0.200"   # pick a branch from "Available branches" below

path = snapshot_download(repo, revision=alpha, allow_patterns=f"{size}/*") + f"/{size}"
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)
tok   = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

Do not pass subfolder="20m" directly to from_pretrained. The model weights would load, but AutoTokenizer would fail with "... does not appear to have a file named tokenizer.py", because the remote-code resolver looks for tokenizer.py at the repo root and ignores subfolder. The snapshot_download recipe above sidesteps this.
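
Since not every (size, alpha) pair exists, a snapshot filtered by allow_patterns can come back without the requested folder. A minimal sketch of a guard that fails early with a clear message follows. The helper name resolve_size_dir is hypothetical, and it assumes tokenizer.py sits inside each size folder, which is inferred from the fact that loading from that folder works.

```python
import os


def resolve_size_dir(snapshot_root: str, size: str) -> str:
    """Return the local size folder, or raise if the snapshot lacks it."""
    path = os.path.join(snapshot_root, size)
    if not os.path.isdir(path):
        raise FileNotFoundError(
            f"no '{size}' folder in {snapshot_root}; "
            "this (size, alpha) pair may not exist on the chosen branch"
        )
    # Assumption: the custom tokenizer code ships alongside the weights.
    if not os.path.isfile(os.path.join(path, "tokenizer.py")):
        raise FileNotFoundError(f"tokenizer.py missing from {path}")
    return path


# Intended use with the recipe above:
#   root = snapshot_download(repo, revision=alpha, allow_patterns=f"{size}/*")
#   path = resolve_size_dir(root, size)
```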

Available branches (alphas)

Pick the branch from the repo's branch dropdown or pass it as revision. Not every (size, alpha) pair exists; browse a branch to see which sizes it holds.
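
The same browsing can be scripted. The sketch below uses list_repo_refs and list_repo_files from huggingface_hub to map each branch to the size folders it holds. The helper names are hypothetical, and filtering on the "alpha" prefix is an assumption based on the alpha0.200 example.

```python
from typing import Iterable


def size_folders(files: Iterable[str]) -> list[str]:
    """Top-level folder names found in repo-relative file paths."""
    return sorted({f.split("/", 1)[0] for f in files if "/" in f})


def available_cells(repo: str) -> dict[str, list[str]]:
    """Map each alpha branch to its size folders (requires network access)."""
    from huggingface_hub import list_repo_files, list_repo_refs

    cells = {}
    for branch in list_repo_refs(repo).branches:
        # Assumption: alpha branches share the "alpha" prefix.
        if branch.name.startswith("alpha"):
            files = list_repo_files(repo, revision=branch.name)
            cells[branch.name] = size_folders(files)
    return cells


# Example call (not executed here):
#   available_cells("pavelslab-nyu/ChessQwen3Base-6p5e19")
```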
