GPT-SW3-356M — Icelandic Grammar-Aligned (SAGA SDPO + Antihack)

Fine-tuned with SAGA (Syntax-Aware Grammar Alignment) using Self-Distilled Policy Optimization (SDPO) with anti-hacking measures.

Pipeline: GPT-SW3-356M → KL-SFT → SDPO (this adapter)

| Metric | Base 356M | SDPO KL-SFT Antihack |
|---|---|---|
| Stanza parse success | 0.725 | 0.810 |
| Stanza mean quality | 0.471 | 0.525 |
| Stanza parse score | 0.341 | 0.426 |
| Wiki PPL | 22.85 | 23.81 |
| ScaLA AUROC | 0.680 | 0.684 |

Anti-hacking measures:

  • Repetition penalty: 1.3 (generation-side)
  • MATTR weight: 0.2 (reward-side lexical diversity penalty)
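For readers unfamiliar with MATTR (moving-average type-token ratio), the sketch below shows one common way to compute it. It is illustrative only: the window size of 50, the whitespace tokenization, and the function name `mattr` are assumptions, and the card does not state how the 0.2 weight combines this score with the rest of the reward.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio (MATTR) of a token list.

    Averages the unique/total token ratio over every sliding window of
    `window` tokens. The window size is an assumption, not a value taken
    from this model card.
    """
    if not tokens:
        return 0.0
    # Short texts: fall back to the plain type-token ratio.
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Whitespace tokenization is a simplification for illustration.
repetitive = "hestur hestur hestur hestur".split()
varied = "hesturinn hljóp yfir túnið".split()
print(mattr(repetitive), mattr(varied))
```

Degenerate, repetitive generations score low on this metric, which is why a diversity term of this kind can discourage a policy from repeating a single well-parsed phrase.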

Training config: 1 epoch, batch 64, 8 generations, lr=1e-5, alpha=0.5, success threshold=0.3.

Cross-lingual transfer (Stanza parse score): IS→DA 0.413, IS→NB 0.409, IS→SV 0.410.

Usage

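from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the KL-SFT checkpoint (the stage before SDPO in the pipeline).
base = AutoModelForCausalLM.from_pretrained("Hodfa71/saga-is-356m-kl-sft")
# Attach the SDPO adapter from this repository on top of it.
model = PeftModel.from_pretrained(base, "acbueff/gpt-sw3-356m-is-saga-kl-sft-sdpo")
# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained("acbueff/gpt-sw3-356m-is-saga-kl-sft-sdpo")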
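Once the model and tokenizer are loaded, text can be generated roughly as sketched below. The repetition penalty of 1.3 mirrors the generation-side setting listed under the anti-hacking measures. Whether the same value is needed at inference time is an assumption, and the other sampling settings (`max_new_tokens`, `do_sample`, `top_p`) as well as the helper name `generate_text` are illustrative choices, not values from this card.

```python
# Illustrative defaults; only repetition_penalty=1.3 comes from the card.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "top_p": 0.95,
    "repetition_penalty": 1.3,
}


def generate_text(model, tokenizer, prompt, **overrides):
    """Generate a continuation of `prompt` (hypothetical helper).

    Assumes `model` and `tokenizer` are on the same device, which holds
    for a default CPU load.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    kwargs = {**GENERATION_KWARGS, **overrides}
    output_ids = model.generate(**inputs, **kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call, given a loaded `model` and `tokenizer`:
# print(generate_text(model, tokenizer, "Ísland er"))
```

Keyword overrides such as `max_new_tokens=128` replace the defaults for a single call without changing the shared dictionary.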

Oracle: Greynir (Icelandic constituency parser). Held-out evaluation: the Stanza Icelandic ("is") pipeline. License: inherits the base model license (AI Sweden LLM License).
