Transformers
English
Italian
gpt2
causal-lm
bilingual
english
italian
llm-nanochat
qualitative-candidate
How to use nazdef/gpt2small-en-it-nanochat-lr2e4-bs6-wsds700-final2e5-webwiki-step7525 with Transformers:

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained(
        "nazdef/gpt2small-en-it-nanochat-lr2e4-bs6-wsds700-final2e5-webwiki-step7525",
        dtype="auto",
    )
GPT-2-small EN/IT NanoChat - WSD-S final2e5 behavior candidate (step_7525)
This repository publishes the best behavior / generation candidate checkpoint from the paper-like WSD-S continuation run:
- run: 20260608_resume-gpt2small-lr2e4-bs6-wsds700-final2e5-webwiki-step7000
- checkpoint: step_7525.pt
- role: qualitative / generation candidate
Why this repo exists
This checkpoint is not the official benchmark champion. The same run's benchmark winner remains step_7700 with val_loss_mixed = 5.1189.
This checkpoint is published because it looked cleaner for generation behavior:
- loop_rate = 0.475
- distinct_2 = 0.4510
- language_consistency_en = 1.000
- val_loss_mixed = 5.1725
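For readers unfamiliar with distinct_2, the sketch below computes the common distinct-n ratio: unique n-grams divided by total n-grams. It is an assumption that this card's distinct_2 follows that definition; the repository's own tokenization and averaging over samples may differ. loop_rate is not reproduced here because its definition is not stated. The function name and sample strings are hypothetical.

```python
def distinct_n(text: str, n: int = 2) -> float:
    """Ratio of unique n-grams to total n-grams over whitespace tokens."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens has no n-grams
    return len(set(ngrams)) / len(ngrams)


# Hypothetical samples for illustration only, not model outputs.
looping = "the model is the model is the model is"
varied = "a small language model should be tested carefully"
print(distinct_n(looping), distinct_n(varied))
```

Under this definition a lower value means more repeated bigrams, so a looping generation scores well below a varied one.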
Probe rank / probability snapshot
- The capital of Italy is -> expected Rome
  - correct_token_rank = 43
  - correct_token_probability = 0.0028533935546875
- A small language model should -> expected be
  - correct_token_rank = 1
  - correct_token_probability = 0.59375
- La capitale d'Italia è -> expected Roma
  - correct_token_rank = 275
  - correct_token_probability = 0.00037384033203125
- Un piccolo modello linguistico dovrebbe -> expected essere
  - correct_token_rank = 1
  - correct_token_probability = 0.4453125
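As a rough illustration of what a rank and probability probe measures, the sketch below takes next-token logits and returns the 1-based rank and softmax probability of the expected token. It assumes rank is counted as one plus the number of tokens with a strictly higher logit; the probe script in this repository may handle ties or multi-token targets differently. The function name and toy logits are hypothetical.

```python
import math


def rank_and_probability(logits: list[float], target_id: int) -> tuple[int, float]:
    """Return the 1-based rank and softmax probability of target_id."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    probability = exps[target_id] / sum(exps)
    # Rank 1 means no other token has a strictly higher logit.
    rank = 1 + sum(1 for x in logits if x > logits[target_id])
    return rank, probability


# Toy 4-token vocabulary; the values are made up for illustration.
toy_logits = [2.0, 0.5, 1.0, -1.0]
print(rank_and_probability(toy_logits, target_id=2))
```

In practice the logits would come from the model's output at the last prompt position, and target_id from the tokenizer's encoding of the expected continuation.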
Aggregate probe read
- correct_token_rank_mean = 80.0
- correct_token_rank_p50 = 22.0
- correct_token_probability_mean = 0.2605724334716797
- top10_contains_correct_rate = 0.5
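The aggregate figures are consistent with a plain mean and median over the four probes listed in the snapshot. The sketch below recomputes them from those per-probe values. It assumes top10_contains_correct_rate means the share of probes with rank at most 10; the variable names are illustrative.

```python
from statistics import mean, median

# (prompt, rank, probability) copied from the probe snapshot.
probes = [
    ("The capital of Italy is", 43, 0.0028533935546875),
    ("A small language model should", 1, 0.59375),
    ("La capitale d'Italia è", 275, 0.00037384033203125),
    ("Un piccolo modello linguistico dovrebbe", 1, 0.4453125),
]

ranks = [rank for _, rank, _ in probes]
probs = [p for _, _, p in probes]

rank_mean = mean(ranks)
rank_p50 = median(ranks)
prob_mean = mean(probs)
# Assumed definition: fraction of probes whose correct token is in the top 10.
top10_rate = sum(rank <= 10 for rank in ranks) / len(ranks)
print(rank_mean, rank_p50, prob_mean, top10_rate)
```

With only four probes, the mean rank is dominated by the two capital-city prompts, while the two procedural prompts account for nearly all of the mean probability.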
Files included
- original .pt checkpoint
- exported .safetensors weights plus metadata sidecar
- tokenizer files
- training config
- run telemetry (best_validation.json, metrics.jsonl, eval_metrics.jsonl, probe_generations.jsonl)
- repo-native benchmark bundle (eval_summary.json, comparison.json, benchmark_report.md, benchmark_metrics.json, benchmark_scores.json, benchmark_source_losses.json)
Caveats
- generations are still repetitive and brittle
- factual capital probes remain weak even when procedural probes are strong
- use step_7700 for benchmark-first comparison, step_7525 for behavior-side comparison