From sergiopaniego's post:
Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy.
And… it's already supported in TRL, built by Kashif Rasul. You can really feel the pace of development in the team!
Paper by Ruixiang Zhang, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang at Apple.
How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. No labels or verifier needed.
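The sample-then-train loop above can be sketched in a few lines of plain Python. This is a toy single-token version for intuition only, not the TRL implementation; the function names and the counting-based "logits" are made up for illustration:

```python
import math
import random


def sample_with_truncation(logits, temperature=0.8, top_k=3):
    # Apply the training-time temperature, keep only the top_k tokens,
    # renormalize, and sample -- the same recipe used to generate
    # the model's own training data in self-distillation.
    scaled = [l / temperature for l in logits]
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    return random.choices(top, weights=weights)[0]


def self_distill_step(logits, temperature=0.8, top_k=3):
    # Sample a token from the model itself, then compute plain
    # cross-entropy on that sample: loss = -log p(token).
    token = sample_with_truncation(logits, temperature, top_k)
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    loss = log_z - logits[token]
    return token, loss
```

In the real setting the loss would be backpropagated to update the model; here it is just computed, to show that no labels or external verifier enter the objective.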
You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder):
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py
or benchmark a checkpoint with the eval script:
https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py
One neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train × T_eval, so a broad band of configs works well. Even very noisy samples still help.
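For softmax sampling, the multiplicative composition is easy to check numerically: dividing logits by T_train and then by T_eval is the same as dividing once by T_train × T_eval. A minimal sketch with made-up logits (this verifies only the softmax identity, a simplified reading of the paper's claim):

```python
import math


def softmax(logits, temperature):
    # Numerically stable temperature-scaled softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]


# Made-up logits for illustration.
logits = [2.0, 0.5, -1.0]
T_train, T_eval = 0.7, 1.3

# Tempering twice (first by T_train, then by T_eval) ...
two_step = softmax([l / T_train for l in logits], T_eval)
# ... matches tempering once at T_eff = T_train * T_eval.
one_step = softmax(logits, T_train * T_eval)
```

This is why a broad band of (T_train, T_eval) pairs behaves similarly: many different pairs land on the same effective temperature.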
Want to dig deeper?
Paper: https://huggingface.co/papers/2604.01193
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer