# Gemma-4-31B_Opus-Reasoning_BF16
This is a fine-tuned and merged version of the Gemma-4 31B model, trained on the high-quality reasoning dataset Crownelius/Opus-4.6-Reasoning-3300x.
The primary goal of this project was to leverage Gemma-4's native <|channel> reasoning format to enforce strict, step-by-step logical reasoning before a final answer is produced. By fusing the Gemma-4 foundation with the Opus reasoning dataset, the model acts as a deeply analytical agent capable of planning complex cloud deployments and working through logical deductions.
## Reasoning Format: Gemma 4 Architecture
This model adheres strictly to the Gemma 4 multimodal and reasoning formats. It emits its internal reasoning inside a <|channel>thought block before delivering the final response.
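A consumer of this model's raw output will usually want the final answer separated from the reasoning trace. Below is a minimal sketch of one way to do that. The delimiter spellings follow the `<|channel>thought` convention described on this card; the exact final-answer marker (`<|channel>Final Answer:`) is inferred from the sample transcript and is an assumption, not an official Gemma 4 spec.

```python
import re

def split_channels(output: str):
    """Split raw model output into (thought, final_answer).

    Assumes a single reasoning block opened with '<|channel>thought',
    followed by '<|channel>Final Answer:' and the user-facing answer.
    These marker names come from this card's samples, not an official spec.
    """
    match = re.search(
        r"<\|channel>thought(.*?)<\|channel>Final Answer:(.*)",
        output,
        flags=re.DOTALL,
    )
    if match is None:
        # No channel markers found: treat the whole string as the answer.
        return "", output.strip()
    thought, final = match.groups()
    return thought.strip(), final.strip()

sample = (
    "<|channel>thought\nStep 1: compare Terraform and CDK...\n"
    "<|channel>Final Answer: Use AWS CDK."
)
thought, answer = split_channels(sample)
```

In an application you would typically log or hide `thought` and surface only `answer` to the end user.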
Below are raw, unedited samples from this merge:
User: I am building an automated AI training platform on AWS. I need to generate scalable Infrastructure-as-Code (IaC) templates to dynamically spin up GPU instances. Walk me through how I should decide between using standard Terraform versus the AWS Cloud Development Kit (CDK). Give me the pros and cons, and make a final recommendation assuming my platform logic is written in Python.
Model:
<|channel>thought
Let me work through this step by step.

Step 1: Understand the Requirements
- The goal is to create Infrastructure-as-Code for an automated AI training platform... (context truncated for brevity)
- The platform logic is written in Python, so the IaC solution should integrate smoothly with that codebase. (the model then weighs Terraform against CDK)
Final Recommendation:
Since the platform is built on Python, and there is no immediate need for multi-cloud support, AWS CDK is the best choice...
<|channel>Final Answer: For your automated AI training platform on AWS, I recommend using AWS CDK instead of Terraform. Here's why...
## Benchmarks: ARC Challenge
While standard knowledge benchmarks occasionally show a minor regression during strict reasoning SFT, the structural output improvements are substantial.
- Base (Gemma-4-31B): acc_norm: 69.88%
- Merged (Opus Reasoning): acc_norm: 69.54%
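To put the regression claim in concrete terms, the two scores above imply a drop of about a third of a percentage point, under half a percent in relative terms:

```python
# Sanity check on the ARC Challenge numbers reported above.
base_acc_norm = 69.88    # Base (Gemma-4-31B)
merged_acc_norm = 69.54  # Merged (Opus Reasoning)

# Absolute drop in percentage points, and the relative change.
delta_pp = round(base_acc_norm - merged_acc_norm, 2)
relative_drop = round(delta_pp / base_acc_norm * 100, 2)
# delta_pp == 0.34 points, relative_drop == 0.49 percent
```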
## Training Details
- Base Model: google/gemma-4-31B
- Dataset: Crownelius/Opus-4.6-Reasoning-3300x
- Training Framework: Eschaton Engine (Cloudbjorn)
- Format: Merged (Base + LoRA)
Training Precision:
- Compute Dtype: bfloat16
LoRA Parameters (Auto-Scaled for 31B via Eschaton):
- r: 16
- lora_alpha: 32
- target_modules: all-linear
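For reference, the adapter settings above expressed as keyword arguments in the shape used by `peft.LoraConfig`. Only values stated on this card are filled in; dropout, bias handling, and the task type are assumptions left to the trainer's defaults (the plain dict keeps the sketch runnable without peft installed):

```python
# LoRA settings from this card, in peft.LoraConfig keyword shape.
lora_kwargs = {
    "r": 16,                         # adapter rank
    "lora_alpha": 32,                # scaling numerator (alpha / r = 2.0)
    "target_modules": "all-linear",  # adapt every linear projection
    "task_type": "CAUSAL_LM",        # assumption: standard causal-LM SFT
}

# Effective scaling applied to the adapter output (alpha / r).
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
```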
Hyperparameters:
- Optimizer: 8-bit Paged AdamW
- Effective Batch Size: 32 (via gradient accumulation)
- Learning Rate: 2e-5
- LR Scheduler: Linear
- Epochs: 1
- Training Sequence Length: 2048
- Warmup Steps: 50
- Weight Decay: 0.01
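The hyperparameters above map onto `transformers.TrainingArguments` fields as sketched below. The per-device batch size and accumulation steps are assumptions, since the card only states an effective batch size of 32; the optimizer name `paged_adamw_8bit` is the transformers spelling of 8-bit Paged AdamW:

```python
# Hyperparameters from this card in TrainingArguments keyword shape.
per_device_train_batch_size = 4   # assumed split
gradient_accumulation_steps = 8   # assumed, so 4 * 8 = 32 effective

training_kwargs = {
    "optim": "paged_adamw_8bit",   # 8-bit Paged AdamW
    "learning_rate": 2e-5,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "warmup_steps": 50,
    "weight_decay": 0.01,
    "bf16": True,                  # compute dtype: bfloat16
    "per_device_train_batch_size": per_device_train_batch_size,
    "gradient_accumulation_steps": gradient_accumulation_steps,
}

# Effective batch size = per-device batch * accumulation steps.
effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
```

The 2048-token training sequence length is typically passed to the SFT trainer rather than `TrainingArguments`, so it is omitted here.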