Gemma 4 E2B — SEC Financial Extraction (v2, GGUF)

A fine-tuned Gemma 4 E2B model specialized for extracting structured financial data from SEC filings. Quantized to Q4_K_M GGUF for efficient local inference.

This model is part of an actively maintained extraction pipeline with a clear goal: drive hallucination rates to negligible levels across a growing range of SEC filing types. We are continuously expanding our training data and evaluation harnesses — each iteration is benchmarked against the base model and prior fine-tunes before release. Exhibit 10 material contracts are the first vertical; DEF 14A proxy statements (executive compensation, governance) are next, with additional filing types to follow.

What This Model Does

Given raw text from an SEC filing (employment agreements, credit facilities, merger agreements, etc.), this model extracts structured JSON containing:

  • Metadata — effective dates and contracting party names
  • Financial terms — dollar amounts and percentages classified into 13 categories (salary, bonus, severance, equity_grant, credit_facility, interest_rate, etc.)
  • Debt covenants — financial maintenance tests classified into 7 categories (leverage_ratio, interest_coverage, debt_service, net_worth, etc.)
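The schema above can be checked mechanically. Below is a minimal validator sketch: the category names are taken from this card's term-type and covenant-type lists, but the function itself is illustrative, not the project's actual validation code.

```python
# Canonical category sets from the model card; the validator is a sketch.
TERM_TYPES = {
    "salary", "bonus", "severance", "retirement_benefit", "equity_grant",
    "credit_facility", "loan_amount", "interest_rate", "fee", "threshold",
    "purchase_price", "compensation", "other",
}
COVENANT_TYPES = {
    "leverage_ratio", "interest_coverage", "debt_service", "net_worth",
    "liquidity", "fixed_charge", "other",
}

def is_valid_financial_value(record: dict) -> bool:
    """Check one financial_values entry: required string fields plus a canonical term_type."""
    return (
        isinstance(record.get("value"), str)
        and isinstance(record.get("definition"), str)
        and record.get("term_type") in TERM_TYPES
    )
```

A record with a misspelled or off-list `term_type` fails the check, which is how "canonical type compliance" is counted in the evals below.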

v2 Eval Results vs. Base Model

v2 was evaluated head-to-head against the unmodified Gemma 4 E2B base model on 100 held-out extraction samples. The fine-tune improves every quality and reliability metric that was not already at ceiling:

| Metric | Base Model | v2 Fine-Tune | Delta |
|---|---|---|---|
| Hallucination phrase rate | 12.7% | 10.7% | -2.0pp |
| Symbol compliance ($/% present) | 83.4% | 84.3% | +0.9pp |
| Bare number rate (missing symbols) | 9.6% | 8.8% | -0.8pp |
| Year-as-value errors | 1 | 0 | Eliminated |
| JSON parse rate | 100% | 100% | — |
| Valid structure rate | 100% | 100% | — |
| Canonical type compliance | 95.1% | 95.1% | — |

The hallucination reduction is the headline result — a 2 percentage point drop from a single fine-tuning iteration demonstrates that targeted corrective training meaningfully suppresses the model's tendency to fabricate values or hedge definitions. We expect continued improvement as training data scales across filing types.
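The symbol-compliance and bare-number metrics above can be approximated with simple string checks. This is an illustrative sketch, not the project's eval harness; the regexes are assumptions about what "compliant" and "bare" mean.

```python
import re

# A value is "symbol compliant" if it carries $ or %; it is a "bare number"
# if it is only digits plus separators. Illustrative approximations.
SYMBOL_RE = re.compile(r"[$%]")
BARE_NUMBER_RE = re.compile(r"^\d[\d,.]*$")

def symbol_compliance_rate(values: list[str]) -> float:
    """Fraction of extracted values that include a $ or % symbol."""
    return sum(bool(SYMBOL_RE.search(v)) for v in values) / len(values)

def bare_number_rate(values: list[str]) -> float:
    """Fraction of extracted values that are naked numbers with no unit symbol."""
    return sum(bool(BARE_NUMBER_RE.match(v.strip())) for v in values) / len(values)
```

On a batch like `["$450,000", "150%", "2024", "1,000,000"]` the two rates would each be 0.5, flagging the last two values for review.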

v2 Training Approach

v2 was trained on a combined instruction + corrective dataset that teaches the model both what to extract and what not to extract:

| Training Signal | Examples | Purpose |
|---|---|---|
| Positive (corrected) | 2,632 | Correct extractions with post-validation outputs |
| Corrective (rescued) | 183 | Self-correction of symbol/type/format errors |
| Negative (hard negatives) | 245 | Output nothing when the input has no real financial values |

The corrective examples target specific base model failure modes: alphabet contamination in values, missing $/% symbols, template echoing, year-as-value extraction, hallucinated definitions, and par value noise.
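Two of these failure modes are easy to flag programmatically. The heuristics below are illustrative assumptions, not the training pipeline's code:

```python
import re

# Heuristic flags for two failure modes: year-as-value extraction and
# alphabet contamination in numeric values. Patterns are assumptions.
YEAR_RE = re.compile(r"^(19|20)\d{2}$")

def looks_like_year(value: str) -> bool:
    """Flag a bare four-digit year extracted as if it were a financial amount."""
    return bool(YEAR_RE.match(value.strip()))

def has_alphabet_contamination(value: str) -> bool:
    """Flag letters leaking into a numeric value (e.g. '$450,00O' with a letter O)."""
    return bool(re.search(r"[A-Za-z]", value))
```

Corrective training examples pair inputs that trigger these flags with the repaired output, teaching the model to self-correct rather than relying on post-hoc filtering alone.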

Usage

LM Studio (recommended)

  1. Download gemma-4-E2B-it.Q4_K_M.gguf (3.4 GB)
  2. Import into LM Studio
  3. Set GPU Layers to max (35/35), Context Length to 4096
  4. Send extraction prompts via the chat API at http://localhost:1234/v1

llama.cpp

llama-cli -hf TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2 --jinja

Python (via OpenAI-compatible API)

from openai import OpenAI

# Point the client at the local LM Studio server; any non-empty api_key works.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma",
    temperature=0.1,  # low temperature keeps extractions near-deterministic
    messages=[
        {"role": "system", "content": "You are a financial analyst AI. Extract ALL monetary dollar amounts and financial percentages. Output strictly as JSON."},
        {"role": "user", "content": "<contract text here>"},
    ],
)
print(response.choices[0].message.content)
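The reply should be plain JSON, but local chat models sometimes wrap output in markdown fences. A defensive parse (a sketch, not part of this card's tooling) looks like:

```python
import json
import re

def parse_extraction(reply: str) -> dict:
    """Parse the model's JSON reply, stripping a ```json fence if one is present."""
    text = reply.strip()
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)
```

`json.loads` raising `JSONDecodeError` here corresponds to a miss on the "JSON parse rate" metric reported above.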

Example Output

Input: "The Company shall pay Executive an annual base salary of $450,000. The Executive shall also be eligible for an annual bonus of up to 150% of base salary."

Output:

{
  "financial_values": [
    {
      "value": "$450,000",
      "definition": "Annual base salary for the Executive",
      "term_type": "salary"
    },
    {
      "value": "150%",
      "definition": "Maximum annual bonus as percentage of base salary",
      "term_type": "bonus"
    }
  ]
}

Training Details

| Parameter | Value |
|---|---|
| Base model | unsloth/gemma-4-E2B-it |
| Method | QLoRA (4-bit) via Unsloth |
| LoRA rank | 8 |
| LoRA alpha | 8 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Effective batch size | 8 (batch 1 x grad accum 8) |
| Max sequence length | 2,048 tokens |
| Optimizer | AdamW 8-bit |
| Trainable parameters | 12.7M / 5.1B (0.25%) |
| Training examples | 3,957 (after truncation filtering) |
| Final training loss | 0.21 |
| Quantization | Q4_K_M |
| Hardware | Google Colab T4 (16 GB VRAM) |
| Training time | ~4 hours |

Training Datasets

  1. sec-contracts-financial-extraction-instructions — 2,726 positive extraction examples from S&P 500 Exhibit 10 contracts
  2. sec-contracts-corrective-extraction — 3,060 corrective examples (post-reducer validated outputs + hard negatives)

Financial Term Types (13 categories)

salary bonus severance retirement_benefit equity_grant credit_facility loan_amount interest_rate fee threshold purchase_price compensation other

Covenant Types (7 categories)

leverage_ratio interest_coverage debt_service net_worth liquidity fixed_charge other

Hardware Requirements

| Setup | VRAM | Notes |
|---|---|---|
| RTX 4050 / 4060 (6 GB) | 3.4 GB model + KV cache | Full GPU offload, 4096 context |
| RTX 3060 / 4070 (8+ GB) | — | Comfortable headroom |
| CPU-only | ~4 GB RAM | Slower, but works |

Limitations

  • Temporal scope: Trained on S&P 500 filings from a 6-month window (not a historical backtest)
  • Universe: Large-cap US equities only (S&P 500)
  • Language: English only
  • Label quality: Silver-standard (model-generated extractions validated through a 10-gate reducer pipeline, not human-annotated)
  • Context window: Trained at 2,048 tokens; inference works at 4,096, but extraction quality may degrade on the longest prompts
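Given the context-window limitation, long filings are best split before extraction. The sketch below uses a rough ~4-characters-per-token budget (an assumption for English legal text, not a measured figure for this tokenizer) and splits on paragraph boundaries:

```python
def chunk_text(text: str, max_tokens: int = 1800, chars_per_token: int = 4) -> list[str]:
    """Split a long filing into paragraph-aligned chunks under a rough token budget."""
    budget = max_tokens * chars_per_token
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > budget:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as a separate extraction prompt and the resulting `financial_values` lists merged, with deduplication on `(value, term_type)` left to the caller.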

Citation

@misc{thetokenfactory2026gemma4secv2,
  title={Gemma 4 E2B — SEC Financial Extraction v2},
  author={TheTokenFactory},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/TheTokenFactory/gemma-4-E2B-sec-extraction-GGUF-v2}
}

License

CC-BY-4.0. SEC filings are public domain; this model's value is in the fine-tuning for structured extraction.

