# Earnings Intelligence Copilot: Fine-tuned Phi-3.5-mini (Merged)

A QLoRA fine-tuned and merged version of Phi-3.5-mini-instruct for structured KPI extraction from SEC filings. Part of the Earnings Intelligence Copilot, a multi-agent system that ingests SEC filings, extracts KPIs, and generates citation-grounded investment memos.
## What it does

- Extracts financial KPIs (Revenue, Gross Margin, Operating Income, EPS, Free Cash Flow) from SEC filing chunks as structured JSON
- Returns `{"confidence": "UNVERIFIABLE"}` instead of hallucinating when the data is not present
- Always includes a `source_quote` field grounding every answer in the original filing text
## Model Details
| Property | Value |
|---|---|
| Base model | microsoft/Phi-3.5-mini-instruct |
| Fine-tuning method | QLoRA (4-bit NF4 quantization) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q_proj, v_proj, k_proj, o_proj |
| Training examples | 619 balanced examples (50% HIGH / 50% UNVERIFIABLE) |
| Data source | 10-K and 10-Q filings, 20 S&P 500 companies (2020-2024) |
| Training hardware | Kaggle T4 GPU (~75 minutes) |
| Model type | Full merged model (adapter + base combined) |
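The QLoRA settings in the table can be expressed as a `peft`/`bitsandbytes` configuration. This is a sketch of the setup described above, not the published training script; any hyperparameter not listed in the table (e.g. dropout, compute dtype) is an assumption:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization for the frozen base model (per the table above)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute on a T4
)

# LoRA adapter matching the card: r=16, alpha=32, attention projections only
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```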
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "ratnasekhar/earnings-copilot-phi3-merged",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "ratnasekhar/earnings-copilot-phi3-merged",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="eager",
)
model.eval()

chunk = "Net sales for Q1 FY2024 were $119.6 billion, an increase of 2% compared to Q1 FY2023."

# Phi-3.5 chat format: <|user|> ... <|end|> followed by <|assistant|>
prompt = (
    "<|user|>\n"
    "You are a financial KPI extraction model. Extract metrics as JSON. "
    "Output {\"confidence\": \"UNVERIFIABLE\"} if not found. Never invent numbers.\n\n"
    "Filing chunk:\n" + chunk + "\n\n"
    "Extract: What was total revenue and its YoY change?<|end|>\n"
    "<|assistant|>\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
        use_cache=False,
    )

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
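The decoded text is expected to be a single JSON object, but decoder output can include stray text around it. A small parsing helper (hypothetical, not part of the model) can guard against that and fall back to the model's own refusal format:

```python
import json

def parse_kpi_output(text: str) -> dict:
    """Extract the first {...} span from model output as JSON;
    fall back to an UNVERIFIABLE record if parsing fails."""
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass
    return {"confidence": "UNVERIFIABLE", "reason": "Output was not valid JSON."}
```

Treating unparseable output as UNVERIFIABLE keeps downstream agents from acting on malformed extractions.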
## Example Outputs
When data IS present:

```json
{
  "metric": "Total Revenue",
  "value": "$119.6 billion",
  "unit": "Billion",
  "period": "Q1 FY2024",
  "yoy_change": "+2%",
  "source_quote": "Net sales for Q1 FY2024 were $119.6 billion",
  "confidence": "HIGH"
}
```
When data is NOT present:

```json
{
  "confidence": "UNVERIFIABLE",
  "reason": "The chunk does not contain any specific revenue figures."
}
```
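The two shapes above imply a simple contract: a HIGH-confidence record must carry its grounding fields, while an UNVERIFIABLE record needs none. A hypothetical validator (not shipped with the model) makes that contract explicit:

```python
def validate_extraction(record: dict) -> bool:
    """Check the two output shapes shown above: HIGH records must include
    a metric and a grounding source_quote; UNVERIFIABLE records pass as-is."""
    conf = record.get("confidence")
    if conf == "UNVERIFIABLE":
        return True
    if conf == "HIGH":
        return bool(record.get("metric")) and bool(record.get("source_quote"))
    return False  # unknown confidence value
```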
## Key Design Decision: Class Balance
The raw LLM-generated training data was 93% UNVERIFIABLE examples. Training on such imbalanced data would teach the model to refuse almost every query, so we deliberately resampled to a 50/50 HIGH/UNVERIFIABLE split to teach both behaviors equally. This balancing is the core fine-tuning contribution of the project.
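The resampling step can be sketched as downsampling the majority class to the size of the minority class (the exact resampling code is not published; this is a minimal stand-in):

```python
import random

def balance_labels(examples: list, seed: int = 0) -> list:
    """Downsample the majority class so HIGH and UNVERIFIABLE are 50/50."""
    high = [e for e in examples if e["confidence"] == "HIGH"]
    unver = [e for e in examples if e["confidence"] == "UNVERIFIABLE"]
    n = min(len(high), len(unver))
    rng = random.Random(seed)
    balanced = rng.sample(high, n) + rng.sample(unver, n)
    rng.shuffle(balanced)
    return balanced
```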
## Related Models
| Model | Description |
|---|---|
| ratnasekhar/earnings-copilot-mistral-7b | Mistral-7B LoRA adapter (larger, higher-quality) |
| ratnasekhar/earnings-copilot-phi3-merged | Phi-3.5-mini merged model (this model, deployable) |
## System Architecture
```
SEC Filings (220 filings, 20 S&P 500 tickers)
        ↓
Qdrant Cloud (2,464 financial table chunks)
        ↓
Phi-3.5-mini (this model: KPI extraction)
        ↓
Verification Agent (cross-checks every number)
        ↓
Citation-grounded Investment Memo
```
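The verification agent's cross-check can be illustrated with a minimal sketch: confirm that every number in an extracted value actually appears in the source chunk. This is a hypothetical stand-in; the real agent is not published here:

```python
import re

def verify_number(record: dict, chunk: str) -> bool:
    """Cross-check an extraction: every numeric token in the extracted
    value must appear verbatim in the source filing chunk."""
    value = record.get("value", "")
    numbers = re.findall(r"\d+(?:\.\d+)?", value)
    return all(n in chunk for n in numbers)
```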