# Gemma 4 E2B — SEC Financial Extraction (v2, GGUF)
A fine-tuned Gemma 4 E2B model specialized for extracting structured financial data from SEC filings. Quantized to Q4_K_M GGUF for efficient local inference.
This model is part of an actively maintained extraction pipeline with a clear goal: drive hallucination rates to negligible levels across a growing range of SEC filing types. We are continuously expanding our training data and evaluation harnesses — each iteration is benchmarked against the base model and prior fine-tunes before release. Exhibit 10 material contracts are the first vertical; DEF 14A proxy statements (executive compensation, governance) are next, with additional filing types to follow.
## What This Model Does
Given raw text from an SEC filing (employment agreements, credit facilities, merger agreements, etc.), this model extracts structured JSON containing:
- Metadata — effective dates and contracting party names
- Financial terms — dollar amounts and percentages classified into 13 categories (salary, bonus, severance, equity_grant, credit_facility, interest_rate, etc.)
- Debt covenants — financial maintenance tests classified into 7 categories (leverage_ratio, interest_coverage, debt_service, net_worth, etc.)
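Putting the fields above together, a full extraction plausibly looks like the sketch below. The `financial_values` list and its keys match the example output shown later in this card; the `covenants` key name and its fields are assumptions about how covenant extractions are represented, made for illustration only:

```python
import json

# Illustrative output shape. "financial_values" matches the example in this
# card; the "covenants" key and its fields are assumptions.
sample = {
    "financial_values": [
        {"value": "$450,000", "definition": "Annual base salary", "term_type": "salary"},
        {"value": "4.25%", "definition": "Applicable interest rate", "term_type": "interest_rate"},
    ],
    "covenants": [
        {"value": "3.50x", "definition": "Maximum total leverage ratio", "covenant_type": "leverage_ratio"},
    ],
}
print(json.dumps(sample, indent=2))
```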
## v2 Eval Results vs. Base Model
v2 was evaluated head-to-head against the unmodified Gemma 4 E2B base model on 100 held-out extraction samples. The fine-tune shows consistent improvement across quality and reliability metrics:
| Metric | Base Model | v2 Fine-Tune | Delta |
|---|---|---|---|
| Hallucination phrase rate | 12.7% | 10.7% | -2.0pp |
| Symbol compliance ($/% present) | 83.4% | 84.3% | +0.9pp |
| Bare number rate (missing symbols) | 9.6% | 8.8% | -0.8pp |
| Year-as-value errors | 1 | 0 | Eliminated |
| JSON parse rate | 100% | 100% | — |
| Valid structure rate | 100% | 100% | — |
| Canonical type compliance | 95.1% | 95.1% | — |
The hallucination reduction is the headline result: a 2.0 percentage point drop (roughly a 16% relative reduction) from a single fine-tuning iteration shows that targeted corrective training meaningfully suppresses the model's tendency to fabricate values or hedge definitions. We expect continued improvement as training data scales across filing types.
## v2 Training Approach
v2 was trained on a combined instruction + corrective dataset that teaches the model both what to extract and what not to extract:
| Training Signal | Examples | Purpose |
|---|---|---|
| Positive (corrected) | 2,632 | Correct extraction with post-validation outputs |
| Corrective (rescued) | 183 | Self-correction of symbol/type/format errors |
| Negative (hard negatives) | 245 | Output nothing when input has no real financial values |
The corrective examples target specific base model failure modes: alphabet contamination in values, missing $/% symbols, template echoing, year-as-value extraction, hallucinated definitions, and par value noise.
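Several of these failure modes can be caught with simple string checks. The validators below are an illustrative sketch of that idea, not the project's actual 10-gate reducer pipeline:

```python
import re

# Illustrative checks for the failure modes above; a sketch, not the
# project's actual reducer pipeline.

def missing_symbol(value: str) -> bool:
    """Bare number: neither a leading $ nor a trailing %."""
    return not (value.startswith("$") or value.endswith("%"))

def is_year_as_value(value: str) -> bool:
    """A four-digit year (e.g. "2023") extracted as if it were an amount."""
    return bool(re.fullmatch(r"(19|20)\d{2}", value.lstrip("$")))

def has_alpha_contamination(value: str) -> bool:
    """Letters leaked into the numeric part, e.g. "$1,2OO,000"."""
    core = value.lstrip("$").rstrip("%").replace(",", "").replace(".", "")
    return bool(re.search(r"[A-Za-z]", core))
```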
## Usage

### LM Studio (recommended)

- Download `gemma-4-E2B-it.Q4_K_M.gguf` (3.4 GB)
- Import it into LM Studio
- Set GPU Layers to max (35/35) and Context Length to 4096
- Send extraction prompts via the chat API at `http://localhost:1234/v1`
### llama.cpp

```bash
llama-cli -hf TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2 --jinja
```
### Python (via OpenAI-compatible API)

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma",
    temperature=0.1,
    messages=[
        {"role": "system", "content": "You are a financial analyst AI. Extract ALL monetary dollar amounts and financial percentages. Output strictly as JSON."},
        {"role": "user", "content": "<contract text here>"},
    ],
)
print(response.choices[0].message.content)
```
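The model is prompted to emit strict JSON, but local chat frontends sometimes wrap replies in a Markdown code fence. A small defensive parser (an assumption about failure handling, not part of the released pipeline) keeps downstream code from crashing on such output:

```python
import json
import re

def parse_extraction(raw: str) -> dict:
    """Parse model output into a dict, tolerating a ```json fence wrapper.

    Hypothetical helper: unparseable output is treated as an empty
    extraction rather than raising.
    """
    text = raw.strip()
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"financial_values": []}
```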
## Example Output

Input: "The Company shall pay Executive an annual base salary of $450,000. The Executive shall also be eligible for an annual bonus of up to 150% of base salary."

Output:

```json
{
  "financial_values": [
    {
      "value": "$450,000",
      "definition": "Annual base salary for the Executive",
      "term_type": "salary"
    },
    {
      "value": "150%",
      "definition": "Maximum annual bonus as percentage of base salary",
      "term_type": "bonus"
    }
  ]
}
```
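Because every extracted value should appear verbatim in the source contract, a cheap hallucination check is a substring test. A sketch using the example above (the `ungrounded_values` helper is hypothetical, not the eval harness's actual phrase detector):

```python
# Cheap hallucination check: every extracted value should appear verbatim
# in the source contract. The helper name is hypothetical.
def ungrounded_values(extraction: dict, source_text: str) -> list:
    """Return extracted values that do NOT occur verbatim in the source."""
    return [fv["value"]
            for fv in extraction.get("financial_values", [])
            if fv["value"] not in source_text]

source = ("The Company shall pay Executive an annual base salary of $450,000. "
          "The Executive shall also be eligible for an annual bonus of up to "
          "150% of base salary.")
extraction = {"financial_values": [
    {"value": "$450,000", "term_type": "salary"},
    {"value": "150%", "term_type": "bonus"},
]}
print(ungrounded_values(extraction, source))  # → []
```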
## Training Details
| Parameter | Value |
|---|---|
| Base model | unsloth/gemma-4-E2B-it |
| Method | QLoRA (4-bit) via Unsloth |
| LoRA rank | 8 |
| LoRA alpha | 8 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Effective batch size | 8 (batch 1 x grad accum 8) |
| Max sequence length | 2,048 tokens |
| Optimizer | AdamW 8-bit |
| Trainable parameters | 12.7M / 5.1B (0.25%) |
| Training examples | 3,957 (after truncation filtering) |
| Final training loss | 0.21 |
| Quantization | Q4_K_M |
| Hardware | Google Colab T4 (16 GB VRAM) |
| Training time | ~4 hours |
## Training Datasets
- sec-contracts-financial-extraction-instructions — 2,726 positive extraction examples from S&P 500 Exhibit 10 contracts
- sec-contracts-corrective-extraction — 3,060 corrective examples (post-reducer validated outputs + hard negatives)
## Financial Term Types (13 categories)

`salary`, `bonus`, `severance`, `retirement_benefit`, `equity_grant`, `credit_facility`, `loan_amount`, `interest_rate`, `fee`, `threshold`, `purchase_price`, `compensation`, `other`
## Covenant Types (7 categories)

`leverage_ratio`, `interest_coverage`, `debt_service`, `net_worth`, `liquidity`, `fixed_charge`, `other`
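Downstream consumers can normalize any off-vocabulary label to `other`, mirroring the canonical-type-compliance metric reported above. A minimal sketch (the label sets come from this card; the helper itself is hypothetical):

```python
# Canonical label sets from this card; the normalization helper is
# hypothetical, mirroring the "canonical type compliance" metric.
TERM_TYPES = {
    "salary", "bonus", "severance", "retirement_benefit", "equity_grant",
    "credit_facility", "loan_amount", "interest_rate", "fee", "threshold",
    "purchase_price", "compensation", "other",
}
COVENANT_TYPES = {
    "leverage_ratio", "interest_coverage", "debt_service", "net_worth",
    "liquidity", "fixed_charge", "other",
}

def canonical_term(label: str) -> str:
    """Map any off-vocabulary term_type to "other"."""
    return label if label in TERM_TYPES else "other"
```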
## Hardware Requirements
| Setup | VRAM | Notes |
|---|---|---|
| RTX 4050 / 4060 (6 GB) | 3.4 GB model + KV cache | Full GPU offload, 4096 context |
| RTX 3060 / 4070 (8+ GB) | Comfortable headroom | |
| CPU-only | ~4 GB RAM | Slower, but works |
## Limitations
- Temporal scope: Trained on S&P 500 filings from a 6-month window (not a historical backtest)
- Universe: Large-cap US equities only (S&P 500)
- Language: English only
- Label quality: Silver-standard (model-generated extractions validated through a 10-gate reducer pipeline, not human-annotated)
- Context window: Trained at 2,048 tokens; inference works at 4,096, but quality may degrade on the longest prompts
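For filings longer than the training context, one workaround is to extract per chunk and merge the results. A rough sketch using a ~4-characters-per-token heuristic (an assumption for illustration; the real Gemma tokenizer ratio varies with the text):

```python
def chunk_text(text: str, max_tokens: int = 2048, overlap: int = 128) -> list[str]:
    """Split a long filing into overlapping windows for per-chunk extraction.

    Uses a rough ~4 characters-per-token heuristic (an assumption; the
    actual tokenizer ratio varies), with overlap so values straddling a
    boundary are seen by at least one chunk.
    """
    max_chars = max_tokens * 4
    step = max_chars - overlap * 4
    return [text[i:i + max_chars] for i in range(0, len(text), step)]
```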
## Citation

```bibtex
@misc{thetokenfactory2026gemma4secv2,
  title={Gemma 4 E2B — SEC Financial Extraction v2},
  author={TheTokenFactory},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2}
}
```
## License
CC-BY-4.0. SEC filings are public domain; this model's value is in the fine-tuning for structured extraction.