---
language: en
license: apache-2.0
library_name: mlx
tags:
- hrm
- fine-tuning
- qlora
- sft
- hierarchical-reasoning
model-index:
- name: hrm-1b-sft-v6
  results:
  - task:
      type: text-generation
    metrics:
    - type: weighted-score
      value: 61.7
      name: v3 Benchmark Weighted Score
datasets:
- glaiveai/glaive-function-calling-v2
- iamtarun/code_instructions_120k_alpaca
- openai/gsm8k
- yahma/alpaca-cleaned
- HuggingFaceTB/cosmopedia-100k
---

# HRM-Text-1B SFT QLoRA Adapters (v6)

QLoRA fine-tuned adapters for `Aryagm/HRM-Text-1B-MLX-4bit`, a 1B-parameter hierarchical reasoning model (HRM) with a recurrent architecture (H=2, L=3, which gives 8 passes per token).

Trained entirely on an 8GB M2 Mac Mini. **Part of the [Sid Local LLM Benchmark v3](https://github.com/blackdeerbits/sid-local-llm-bench).**

## Results

| Metric | Base Model | Fine-Tuned (v6) | Delta |
|--------|-----------|-----------------|-------|
| Overall Weighted Score | 58.3% | **61.7%** | **+3.4pp** |
| AGENT (tool calling) | 10% | **60%** | **+50pp** |
| CODE | 70% | 60% | -10pp |
| HALL (hallucination resistance) | 62% | **75%** | **+13pp** |
| INST (instruction following) | 40% | **60%** | **+20pp** |
| CTX (context reasoning) | 75% | 75% | 0pp |
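
The Delta column is an absolute difference between the two score columns, measured in percentage points (pp), which is not the same as a relative percent change. A minimal sketch that recomputes both from the table values (the helper names `delta_pp` and `relative_change` are illustrative and not part of the benchmark code):

```python
# Scores copied from the Results table, in percent: (base, fine-tuned).
scores = {
    "Overall": (58.3, 61.7),
    "AGENT": (10.0, 60.0),
    "CODE": (70.0, 60.0),
    "HALL": (62.0, 75.0),
    "INST": (40.0, 60.0),
    "CTX": (75.0, 75.0),
}


def delta_pp(base, tuned):
    """Absolute change in percentage points."""
    return round(tuned - base, 1)


def relative_change(base, tuned):
    """Change expressed as a percent of the base score."""
    return round(100.0 * (tuned - base) / base, 1)


for name, (base, tuned) in scores.items():
    print(f"{name}: {delta_pp(base, tuned):+.1f}pp "
          f"({relative_change(base, tuned):+.1f}% relative)")
```

For the overall score, the 3.4pp gain corresponds to a relative improvement of roughly 5.8% over the base model.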

## Files

- `adapters.npz`: final v6 QLoRA adapter weights (~22 MB)
- `best_adapters.npz`: best-validation checkpoint (identical to `adapters.npz`)

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | Aryagm/HRM-Text-1B-MLX-4bit (4-bit MXFP4) |
| Method | QLoRA (rank=16, alpha=32) |
| Target layers | Attention projections only (gqkv_proj, o_proj) |
| Training samples | 2,000 |
| Iterations | 2,000 |
| Batch size | 1 (with gradient accumulation) |
| Learning rate | 2e-5 |
| Optimizer | AdamW |
| Loss | Masked response loss (answer tokens only) |
| Hardware | Apple M2 Mac Mini, 8GB unified memory |
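
The "masked response loss" row means that only answer tokens contribute to the training objective, while prompt tokens are ignored. The sketch below illustrates that idea in plain Python. It is not the MLX training code used for these adapters, and the function name, argument layout, and toy values are assumptions made for illustration.

```python
import math


def masked_response_loss(logits, targets, mask):
    """Mean cross-entropy over positions where mask == 1.

    logits:  list of per-position score lists (one score per vocab entry)
    targets: list of target token ids, one per position
    mask:    list of 0/1 flags; 1 marks a response (answer) token
    """
    total, count = 0.0, 0
    for scores, target, keep in zip(logits, targets, mask):
        if not keep:
            continue  # prompt tokens contribute nothing to the loss
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]  # negative log-likelihood
        count += 1
    return total / max(count, 1)  # avoid dividing by zero on an empty mask


# Toy example: 2 prompt positions, then 2 response positions, vocab of 3.
logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0], [1.0, 1.0, 1.0]]
targets = [1, 0, 2, 0]
mask = [0, 0, 1, 1]
print(masked_response_loss(logits, targets, mask))
```

Because the first two positions are masked out, the poor predictions on the prompt do not affect the result; only the last two positions are averaged.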

### Dataset Composition

| Source | % | Count |
|--------|---|-------|
| glaiveai/glaive-function-calling-v2 (AGENT) | 20% | 400 |
| iamtarun/code_instructions_120k_alpaca (CODE) | 30% | 600 |
| yahma/alpaca-cleaned (INST) | 25% | 500 |
| openai/gsm8k (MATH) | 15% | 300 |
| HuggingFaceTB/cosmopedia-100k (REPLAY) | 10% | 200 |
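
The counts follow directly from applying each percentage to the 2,000-sample budget. A small sketch of that arithmetic (the `counts_from_mix` helper is hypothetical and not taken from the training scripts):

```python
TOTAL_SAMPLES = 2000

# Mix percentages copied from the Dataset Composition table.
MIX_PERCENT = {
    "glaiveai/glaive-function-calling-v2": 20,
    "iamtarun/code_instructions_120k_alpaca": 30,
    "yahma/alpaca-cleaned": 25,
    "openai/gsm8k": 15,
    "HuggingFaceTB/cosmopedia-100k": 10,
}


def counts_from_mix(mix_percent, total):
    """Convert integer percentages into per-source sample counts."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix percentages must sum to 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return {name: total * pct // 100 for name, pct in mix_percent.items()}


counts = counts_from_mix(MIX_PERCENT, TOTAL_SAMPLES)
for name, n in counts.items():
    print(f"{name}: {n}")
```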

## Usage

```python
import mlx.core as mx
from mlx_hrm_text.runner import HRMTextGenerator
from mlx_hrm_text.model import set_metal_swiglu

set_metal_swiglu(True)

# Load base model
gen = HRMTextGenerator(
    model_dir="Aryagm/HRM-Text-1B-MLX-4bit",
    temperature=0.3,
)

# Freeze and apply LoRA
gen.model.freeze()

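# LoRA wrapper used to patch the attention projections
import mlx.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, linear, r=16, alpha=32):
        super().__init__()
        self.linear = linear
        self.linear.freeze()
        self.r = r
        self.scale = alpha / r  # 32 / 16 = 2.0 with the defaults
        out_f, in_f = linear.weight.shape
        # Quantized weights are packed into 32-bit words, so recover the true
        # input width (only applies if the projection is an nn.QuantizedLinear).
        if isinstance(linear, nn.QuantizedLinear):
            in_f *= 32 // linear.bits
        self.lora_a = mx.random.normal((in_f, r)) / r
        self.lora_b = mx.zeros((r, out_f))  # zero init: the adapter starts as a no-op

    def __call__(self, x):
        dtype = x.dtype
        # Base projection plus the scaled low-rank update x @ A @ B
        update = x @ self.lora_a.astype(dtype) @ self.lora_b.astype(dtype)
        return self.linear(x) + update * self.scale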

def apply_lora(module):
    for block in module.layers:
        block.attn.gqkv_proj = LoRALinear(block.attn.gqkv_proj)
        block.attn.o_proj = LoRALinear(block.attn.o_proj)

apply_lora(gen.model.model.H_module)
apply_lora(gen.model.model.L_module)

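# Load the adapter arrays (a flat dict mapping parameter names to arrays)
flat = mx.load("adapters.npz")
# `flat` still has to be copied into the LoRALinear modules; until then lora_b
# is all zeros and the model behaves like the base model.
# (Full recursive population in run_hrm_lora_bench.py on GitHub)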

result = gen.generate("Write a Python function to reverse a string.")
print(result.text)
```
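
The usage snippet stops after `mx.load` and defers the step that copies the loaded arrays into the model to the GitHub script. As a rough illustration of what "recursive population" involves, the sketch below turns a flat dictionary with dotted keys into a nested structure. The key names are hypothetical (the actual names inside `adapters.npz` are not documented here), and plain strings stand in for arrays so the example runs without MLX.

```python
def unflatten(flat):
    """Turn {"a.b.c": value} into {"a": {"b": {"c": value}}}."""
    tree = {}
    for key, value in flat.items():
        node = tree
        *parents, leaf = key.split(".")
        for part in parents:
            # Walk down the path, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical key names, for illustration only; strings stand in for arrays.
flat = {
    "model.H_module.layers.0.attn.o_proj.lora_a": "A0",
    "model.H_module.layers.0.attn.o_proj.lora_b": "B0",
    "model.L_module.layers.1.attn.gqkv_proj.lora_a": "A1",
}

tree = unflatten(flat)
print(tree["model"]["H_module"]["layers"]["0"]["attn"]["o_proj"])
```

MLX ships a similar helper, `mlx.utils.tree_unflatten`, which additionally turns numeric path components into lists, and `nn.Module.update` accepts the resulting nested structure. Whether that combination matches the key layout of these adapters is an assumption; `run_hrm_lora_bench.py` remains the reference for the real loading code.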

## Links

- **Full code + benchmark data:** [github.com/blackdeerbits/sid-local-llm-bench](https://github.com/blackdeerbits/sid-local-llm-bench)
- **HRM fine-tuning article:** [reddeerinv.com/ai/hrm-fine-tuning-journey/](https://reddeerinv.com/ai/hrm-fine-tuning-journey/)
- **Base model:** [huggingface.co/Aryagm/HRM-Text-1B-MLX-4bit](https://huggingface.co/Aryagm/HRM-Text-1B-MLX-4bit)

## Citation

```bibtex
@misc{reddeer2026hrm,
  author = {the_red_deer},
  title = {The HRM Fine-Tuning Journey},
  year = {2026},
  url = {https://reddeerinv.com/ai/hrm-fine-tuning-journey/}
}
```