nanochat Turkish d20 / 32k Raw Checkpoint

This repository stores a raw nanochat checkpoint bundle, intended for restoring or evaluating the model with this repository's nanochat implementation. It has not yet been converted to the Hugging Face Transformers from_pretrained format.

Checkpoint

  • Model tag: tr_d20_bpe_32768_chinchilla20
  • Step: 17100
  • Tokenizer: bpe_32768
  • Base dir on UHeM: /ari/users/nunal/nanochat-turk-d20-bpe32k
  • Checkpoint dir on UHeM: /ari/users/nunal/nanochat-turk-d20-bpe32k/base_checkpoints/tr_d20_bpe_32768_chinchilla20
  • Tokenizer dir on UHeM: /ari/users/nunal/nanochat-turk-d20-bpe32k/tokenizers/bpe_32768
  • Training job id: 492421
  • CETVEL job id: unknown

Model Config

{
  "n_embd": 1280,
  "n_head": 10,
  "n_kv_head": 10,
  "n_layer": 20,
  "sequence_len": 2048,
  "vocab_size": 32768,
  "window_pattern": "L"
}
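
As a quick sanity check on the config above, the following sketch derives the per-head dimension and a rough parameter estimate. The parameter formula is an assumption and is not taken from this repository's code: it supposes bias-free attention (Q, K, V, and output projections), an MLP with a 4x hidden width, and untied input and output embeddings, and it ignores any extra parameters the actual nanochat implementation may add. Treat the total as an order-of-magnitude figure only.

```python
import json

# Model config copied verbatim from this card.
CONFIG = json.loads("""
{
  "n_embd": 1280,
  "n_head": 10,
  "n_kv_head": 10,
  "n_layer": 20,
  "sequence_len": 2048,
  "vocab_size": 32768,
  "window_pattern": "L"
}
""")


def head_dim(cfg):
    """Width of a single attention head."""
    if cfg["n_embd"] % cfg["n_head"] != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return cfg["n_embd"] // cfg["n_head"]


def rough_param_count(cfg):
    """Rough estimate under the assumptions stated above (hypothetical formula)."""
    d = cfg["n_embd"]
    kv_dim = cfg["n_kv_head"] * head_dim(cfg)
    attn = d * d + 2 * d * kv_dim + d * d   # Q, then K and V, then output projection
    mlp = 2 * d * (4 * d)                   # up and down projections, assumed 4x width
    embeddings = 2 * cfg["vocab_size"] * d  # assumed untied input and output embeddings
    return cfg["n_layer"] * (attn + mlp) + embeddings


print("head_dim:", head_dim(CONFIG))
print("rough parameter estimate:", rough_param_count(CONFIG))
```

Under these assumptions each head is 128 wide, which follows directly from n_embd / n_head.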

Training Config Highlights

  • Depth: 20
  • Vocab size: 32768
  • Sequence length: 2048
  • Device batch size: 4
  • Total batch size: 1048576
  • Window pattern: L
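
The batch settings above imply some simple arithmetic, sketched below. It assumes that the total batch size is measured in tokens and the device batch size in sequences; neither unit is stated on this card. The number of GPUs used for training is also not stated, so `world_size=8` in the example call is purely hypothetical.

```python
# Values copied from this card.
SEQUENCE_LEN = 2048
DEVICE_BATCH_SIZE = 4         # assumed unit: sequences per device per micro-step
TOTAL_BATCH_SIZE = 1_048_576  # assumed unit: tokens per optimizer step
STEP = 17_100


def tokens_per_micro_batch():
    """Tokens processed by one device in one forward/backward pass."""
    return DEVICE_BATCH_SIZE * SEQUENCE_LEN


def grad_accum_steps(world_size):
    """Micro-steps each device needs to reach the total batch size."""
    tokens_per_pass = tokens_per_micro_batch() * world_size
    if TOTAL_BATCH_SIZE % tokens_per_pass != 0:
        raise ValueError("total batch size is not divisible by tokens per pass")
    return TOTAL_BATCH_SIZE // tokens_per_pass


def tokens_seen(step=STEP):
    """Training tokens consumed up to the given optimizer step."""
    return step * TOTAL_BATCH_SIZE


print("tokens per micro-batch:", tokens_per_micro_batch())
print("grad accumulation steps (hypothetical 8 GPUs):", grad_accum_steps(world_size=8))
print("tokens seen at checkpoint step:", tokens_seen())
```

If the unit assumptions hold, the checkpoint at step 17100 corresponds to roughly 17.9 billion training tokens.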

Contents

The important files are:

  • checkpoint/model_017100.pt
  • checkpoint/meta_017100.json
  • checkpoint/optim_017100_rank*.pt if optimizer shards were uploaded
  • tokenizer/tokenizer.pkl
  • tokenizer/tokenizer_config.json
  • tokenizer/token_bytes.pt
  • report/ and logs/ when available
  • cetvel_out/ when CETVEL has completed
  • provenance/upload_manifest.json
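
After downloading the bundle, a small check like the one below can confirm which of the listed files are present. This is a minimal sketch using only the standard library. The split into required and optional entries is a reading of the list above (entries qualified with "if" or "when" are treated as optional), and `check_bundle` is a hypothetical helper, not part of nanochat.

```python
from pathlib import Path

STEP = 17100

# Files listed on this card without any condition attached.
REQUIRED = [
    f"checkpoint/model_{STEP:06d}.pt",
    f"checkpoint/meta_{STEP:06d}.json",
    "tokenizer/tokenizer.pkl",
    "tokenizer/tokenizer_config.json",
    "tokenizer/token_bytes.pt",
    "provenance/upload_manifest.json",
]

# Entries the card describes as present only in some uploads.
OPTIONAL_GLOBS = [
    f"checkpoint/optim_{STEP:06d}_rank*.pt",
    "report/*",
    "logs/*",
    "cetvel_out/*",
]


def check_bundle(root):
    """Return (missing required files, optional matches found) under root."""
    root = Path(root)
    missing = [rel for rel in REQUIRED if not (root / rel).is_file()]
    optional = {
        pattern: sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))
        for pattern in OPTIONAL_GLOBS
    }
    return missing, optional


if __name__ == "__main__":
    # "." is a placeholder; point this at the local download directory.
    missing, optional = check_bundle(".")
    print("missing required files:", missing or "none")
    for pattern, found in optional.items():
        print(f"{pattern}: {len(found)} match(es)")
```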

CETVEL

No CETVEL output directory was found at upload time.

Provenance

  • Git branch: unknown
  • Git commit: unknown
  • Git dirty at upload time: False
  • Uploaded at: 2026-06-08T11:46:49.296475+00:00

Caveat

This is a raw research checkpoint. Use the source code in this repository to load it, or convert it separately if you need Transformers-compatible loading.