wheattoast11/qwen3-30b-atgrpo-production-k8

Reinforcement Learning

qwen3-30b-atgrpo-production-k8 / vocab.json

Upload AT-GRPO adapter (400 steps, K=8 production)

6cf4dd4 verified 5 months ago

2.78 MB
