# Gemma 4 E2B — SEC Financial Extraction (v2, GGUF)
A fine-tuned Gemma 4 E2B model specialized for extracting structured financial data from SEC filings. Quantized to Q4_K_M GGUF for efficient local inference.
This model is part of an actively maintained extraction pipeline with a clear goal: drive hallucination rates to negligible levels across a growing range of SEC filing types. We are continuously expanding our training data and evaluation harnesses — each iteration is benchmarked against the base model and prior fine-tunes before release. Exhibit 10 material contracts are the first vertical; DEF 14A proxy statements (executive compensation, governance) are next, with additional filing types to follow.
## What This Model Does
Given raw text from an SEC filing (employment agreements, credit facilities, merger agreements, etc.), this model extracts structured JSON containing:
- Metadata — effective dates and contracting party names
- Financial terms — dollar amounts and percentages classified into 13 categories (salary, bonus, severance, equity_grant, credit_facility, interest_rate, etc.)
- Debt covenants — financial maintenance tests classified into 7 categories (leverage_ratio, interest_coverage, debt_service, net_worth, etc.)
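Putting the fields above together, a full extraction plausibly looks like the sketch below. The `financial_values` list and its keys match the example output shown later in this card; the `covenants` key name and its fields are assumptions about how covenant extractions are represented, made for illustration only:

```python
import json

# Illustrative output shape. "financial_values" matches the example in this
# card; the "covenants" key and its fields are assumptions.
sample = {
    "financial_values": [
        {"value": "$450,000", "definition": "Annual base salary", "term_type": "salary"},
        {"value": "4.25%", "definition": "Applicable interest rate", "term_type": "interest_rate"},
    ],
    "covenants": [
        {"value": "3.50x", "definition": "Maximum total leverage ratio", "covenant_type": "leverage_ratio"},
    ],
}
print(json.dumps(sample, indent=2))
```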
## v2 Eval Results vs. Base Model
v2 was evaluated head-to-head against the unmodified Gemma 4 E2B base model on 100 held-out extraction samples. The fine-tune shows consistent improvement across quality and reliability metrics:
| Metric | Base Model | v2 Fine-Tune | Delta |
|---|---|---|---|
| Hallucination phrase rate | 12.7% | 10.7% | -2.0pp |
| Symbol compliance ($/% present) | 83.4% | 84.3% | +0.9pp |
| Bare number rate (missing symbols) | 9.6% | 8.8% | -0.8pp |
| Year-as-value errors | 1 | 0 | Eliminated |
| JSON parse rate | 100% | 100% | — |
| Valid structure rate | 100% | 100% | — |
| Canonical type compliance | 95.1% | 95.1% | — |
The hallucination reduction is the headline result: a 2.0 percentage point drop (roughly a 16% relative reduction) from a single fine-tuning iteration shows that targeted corrective training meaningfully suppresses the model's tendency to fabricate values or hedge definitions. We expect continued improvement as training data scales across filing types.
## v2 Training Approach
v2 was trained on a combined instruction + corrective dataset that teaches the model both what to extract and what not to extract:
| Training Signal | Examples | Purpose |
|---|---|---|
| Positive (corrected) | 2,632 | Correct extraction with post-validation outputs |
| Corrective (rescued) | 183 | Self-correction of symbol/type/format errors |
| Negative (hard negatives) | 245 | Output nothing when input has no real financial values |
The corrective examples target specific base model failure modes: alphabet contamination in values, missing $/% symbols, template echoing, year-as-value extraction, hallucinated definitions, and par value noise.
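Several of these failure modes can be caught with simple string checks. The validators below are an illustrative sketch of that idea, not the project's actual 10-gate reducer pipeline:

```python
import re

# Illustrative checks for the failure modes above; a sketch, not the
# project's actual reducer pipeline.

def missing_symbol(value: str) -> bool:
    """Bare number: neither a leading $ nor a trailing %."""
    return not (value.startswith("$") or value.endswith("%"))

def is_year_as_value(value: str) -> bool:
    """A four-digit year (e.g. "2023") extracted as if it were an amount."""
    return bool(re.fullmatch(r"(19|20)\d{2}", value.lstrip("$")))

def has_alpha_contamination(value: str) -> bool:
    """Letters leaked into the numeric part, e.g. "$1,2OO,000"."""
    core = value.lstrip("$").rstrip("%").replace(",", "").replace(".", "")
    return bool(re.search(r"[A-Za-z]", core))
```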
## Usage

### LM Studio (recommended)

- Download `gemma-4-E2B-it.Q4_K_M.gguf` (3.4 GB)
- Import it into LM Studio
- Set GPU Layers to max (35/35) and Context Length to 4096
- Send extraction prompts via the chat API at `http://localhost:1234/v1`
### llama.cpp

```bash
llama-cli -hf TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2 --jinja
```
### Python (via OpenAI-compatible API)

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma",
    temperature=0.1,
    messages=[
        {"role": "system", "content": "You are a financial analyst AI. Extract ALL monetary dollar amounts and financial percentages. Output strictly as JSON."},
        {"role": "user", "content": "<contract text here>"},
    ],
)
print(response.choices[0].message.content)
```
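The model is prompted to emit strict JSON, but local chat frontends sometimes wrap replies in a Markdown code fence. A small defensive parser (an assumption about failure handling, not part of the released pipeline) keeps downstream code from crashing on such output:

```python
import json
import re

def parse_extraction(raw: str) -> dict:
    """Parse model output into a dict, tolerating a ```json fence wrapper.

    Hypothetical helper: unparseable output is treated as an empty
    extraction rather than raising.
    """
    text = raw.strip()
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"financial_values": []}
```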
## Example Output

Input: "The Company shall pay Executive an annual base salary of $450,000. The Executive shall also be eligible for an annual bonus of up to 150% of base salary."

Output:

```json
{
  "financial_values": [
    {
      "value": "$450,000",
      "definition": "Annual base salary for the Executive",
      "term_type": "salary"
    },
    {
      "value": "150%",
      "definition": "Maximum annual bonus as percentage of base salary",
      "term_type": "bonus"
    }
  ]
}
```
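Because every extracted value should appear verbatim in the source contract, a cheap hallucination check is a substring test. A sketch using the example above (the `ungrounded_values` helper is hypothetical, not the eval harness's actual phrase detector):

```python
# Cheap hallucination check: every extracted value should appear verbatim
# in the source contract. The helper name is hypothetical.
def ungrounded_values(extraction: dict, source_text: str) -> list:
    """Return extracted values that do NOT occur verbatim in the source."""
    return [fv["value"]
            for fv in extraction.get("financial_values", [])
            if fv["value"] not in source_text]

source = ("The Company shall pay Executive an annual base salary of $450,000. "
          "The Executive shall also be eligible for an annual bonus of up to "
          "150% of base salary.")
extraction = {"financial_values": [
    {"value": "$450,000", "term_type": "salary"},
    {"value": "150%", "term_type": "bonus"},
]}
print(ungrounded_values(extraction, source))  # → []
```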
## Training Details
| Parameter | Value |
|---|---|
| Base model | unsloth/gemma-4-E2B-it |
| Method | QLoRA (4-bit) via Unsloth |
| LoRA rank | 8 |
| LoRA alpha | 8 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Effective batch size | 8 (batch 1 x grad accum 8) |
| Max sequence length | 2,048 tokens |
| Optimizer | AdamW 8-bit |
| Trainable parameters | 12.7M / 5.1B (0.25%) |
| Training examples | 3,957 (after truncation filtering) |
| Final training loss | 0.21 |
| Quantization | Q4_K_M |
| Hardware | Google Colab T4 (16 GB VRAM) |
| Training time | ~4 hours |
## Training Datasets
- sec-contracts-financial-extraction-instructions — 2,726 positive extraction examples from S&P 500 Exhibit 10 contracts
- sec-contracts-corrective-extraction — 3,060 corrective examples (post-reducer validated outputs + hard negatives)
## Financial Term Types (13 categories)

`salary`, `bonus`, `severance`, `retirement_benefit`, `equity_grant`, `credit_facility`, `loan_amount`, `interest_rate`, `fee`, `threshold`, `purchase_price`, `compensation`, `other`
## Covenant Types (7 categories)

`leverage_ratio`, `interest_coverage`, `debt_service`, `net_worth`, `liquidity`, `fixed_charge`, `other`
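Downstream consumers can normalize any off-vocabulary label to `other`, mirroring the canonical-type-compliance metric reported above. A minimal sketch (the label sets come from this card; the helper itself is hypothetical):

```python
# Canonical label sets from this card; the normalization helper is
# hypothetical, mirroring the "canonical type compliance" metric.
TERM_TYPES = {
    "salary", "bonus", "severance", "retirement_benefit", "equity_grant",
    "credit_facility", "loan_amount", "interest_rate", "fee", "threshold",
    "purchase_price", "compensation", "other",
}
COVENANT_TYPES = {
    "leverage_ratio", "interest_coverage", "debt_service", "net_worth",
    "liquidity", "fixed_charge", "other",
}

def canonical_term(label: str) -> str:
    """Map any off-vocabulary term_type to "other"."""
    return label if label in TERM_TYPES else "other"
```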
## Hardware Requirements
| Setup | VRAM | Notes |
|---|---|---|
| RTX 4050 / 4060 (6 GB) | 3.4 GB model + KV cache | Full GPU offload, 4096 context |
| RTX 3060 / 4070 (8+ GB) | Comfortable headroom | |
| CPU-only | ~4 GB RAM | Slower, but works |
## Limitations
- Temporal scope: Trained on S&P 500 filings from a 6-month window (not a historical backtest)
- Universe: Large-cap US equities only (S&P 500)
- Language: English only
- Label quality: Silver-standard (model-generated extractions validated through a 10-gate reducer pipeline, not human-annotated)
- Context window: Trained at 2,048 tokens; inference works at 4,096, but quality may degrade on the longest prompts
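For filings longer than the training context, one workaround is to extract per chunk and merge the results. A rough sketch using a ~4-characters-per-token heuristic (an assumption for illustration; the real Gemma tokenizer ratio varies with the text):

```python
def chunk_text(text: str, max_tokens: int = 2048, overlap: int = 128) -> list[str]:
    """Split a long filing into overlapping windows for per-chunk extraction.

    Uses a rough ~4 characters-per-token heuristic (an assumption; the
    actual tokenizer ratio varies), with overlap so values straddling a
    boundary are seen by at least one chunk.
    """
    max_chars = max_tokens * 4
    step = max_chars - overlap * 4
    return [text[i:i + max_chars] for i in range(0, len(text), step)]
```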
## Citation

```bibtex
@misc{thetokenfactory2026gemma4secv2,
  title={Gemma 4 E2B — SEC Financial Extraction v2},
  author={TheTokenFactory},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2}
}
```
## License
CC-BY-4.0. SEC filings are public domain; this model's value is in the fine-tuning for structured extraction.