Update README.md
README.md
CHANGED
@@ -1,3 +1,11 @@
### xiaolesu/OsmosisProofling-SFT-NT-GRPO-NT-Overlap

Experimental checkpoint from "Data Overlap as a Post-Training Hyperparameter for Autoformalization." This is the **SFT+GRPO with 100% overlap** variant (Qwen3-8B, thinking disabled) -- the control condition where GRPO reuses SFT data entirely. See the [paper repo](https://github.com/suxls/data-overlap-autoformalization) for details, results, and all artifacts.
+
+## 📄 Paper
+
+This model is part of the experiments in:
+
+**SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization**
+Xiaole Su, Kasey Zhang, Andy Lyu
+https://arxiv.org/abs/2604.13515