Upload README.md with huggingface_hub
--- a/README.md
+++ b/README.md
@@ -88,13 +88,12 @@ This is the **small** variant (~9.5M parameters). PAWN is designed as a frozen b
 
 ### Accuracy Ratios
 
-PAWN is trained on uniformly random chess games, so top-1 accuracy has a hard theoretical ceiling. Ratios above 100% on the unconditioned ceiling indicate the model
+PAWN is trained on uniformly random chess games, so top-1 accuracy has a hard theoretical ceiling. Ratios above 100% on the unconditioned ceiling indicate the model exploits the outcome token to make non-uniform predictions. The MC conditioned ceiling is an estimate reported as a bracket \[corrected, naive\]; see [Accuracy Ceiling Analysis](https://github.com/thomas-schweich/PAWN/blob/main/docs/ACCURACY_CEILING.md) for methodology.
 
 | Ceiling | Ratio |
 |---------|-------|
-| Unconditioned (E\[1/N_legal\] = 6.
-|
-| Bayes-optimal conditioned (MCTS, 32 rollouts = 7.92%) | 85% |
+| Unconditioned (E\[1/N_legal\] = 6.52%) | 103% |
+| Bayes-optimal conditioned (MC, 128 rollouts = \[6.67, 7.34\]%) | 92–101% |
 
 
 ## Probe Results
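Each ratio in the updated table reads as a top-1 accuracy divided by a ceiling, and the unconditioned ceiling is the mean of 1/N_legal over positions. The following is a minimal Python sketch of that arithmetic, not code from the PAWN repository. The function names, the toy legal-move counts, and `ILLUSTRATIVE_ACCURACY` are assumptions made for illustration. In particular, the accuracy value is not a reported figure; it was chosen only because it is consistent with the rounded ratios in the table.

```python
from statistics import fmean


def unconditioned_ceiling(legal_move_counts):
    """E[1/N_legal]: if each move is drawn uniformly from the N legal moves,
    any single guess is right with probability 1/N in that position."""
    return fmean([1.0 / n for n in legal_move_counts])


def accuracy_ratio(accuracy, ceiling):
    """Top-1 accuracy expressed as a fraction of a ceiling."""
    return accuracy / ceiling


def ratio_bracket(accuracy, ceiling_low, ceiling_high):
    """A bracketed ceiling gives a bracketed ratio.
    The higher ceiling produces the lower ratio, and vice versa."""
    return accuracy / ceiling_high, accuracy / ceiling_low


# Hypothetical legal-move counts for four positions (not real PAWN data).
toy_counts = [20, 20, 30, 5]
toy_ceiling = unconditioned_ceiling(toy_counts)

# Assumed accuracy, NOT a reported number: picked to match the rounded table.
ILLUSTRATIVE_ACCURACY = 0.0672

unconditioned_ratio = accuracy_ratio(ILLUSTRATIVE_ACCURACY, 0.0652)
ratio_low, ratio_high = ratio_bracket(ILLUSTRATIVE_ACCURACY, 0.0667, 0.0734)

print(f"toy ceiling: {toy_ceiling:.4f}")
print(f"unconditioned ratio: {unconditioned_ratio:.0%}")
print(f"conditioned ratio: {ratio_low:.0%} to {ratio_high:.0%}")
```

Because the conditioned ceiling is a bracket rather than a point estimate, the ratio against it is also a range, which is why the new table row reports 92–101% instead of a single percentage.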