Azure Advisor Qwen3.5-0.8B (SFT)
Supervised fine-tune of Qwen/Qwen3.5-0.8B on thegovind/azure-advisor-sft to produce structured Azure Advisor-style recommendations.
See the GRPO model for the post-GRPO checkpoint and hill-climbing verification.