Model Description
AgenticQwen-8B is a small agentic language model trained on Qwen3-8B, designed for multi-step reasoning and tool use. It is trained with a multi-round reinforcement learning (GRPO-style) pipeline and a dual "data flywheel" mechanism that continually increases task difficulty for both reasoning and agentic workflows.