willx7890/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
Generated from Trainer
trl
grpo
arxiv:2402.03300
main
Commit History
Training in progress, step 10
89a0f27
verified
willx7890
committed on
Jul 1, 2025
Training in progress, step 50
d2e0193
verified
willx7890
committed on
Jul 1, 2025
Training in progress, step 40
3fc2192
verified
willx7890
committed on
Jul 1, 2025
Training in progress, step 30
bde3eda
verified
willx7890
committed on
Jul 1, 2025
Training in progress, step 20
3775032
verified
willx7890
committed on
Jul 1, 2025
Training in progress, step 10
8c66084
verified
willx7890
committed on
Jul 1, 2025
initial commit
92b4dab
verified
willx7890
committed on
Jul 1, 2025