Gemma 4 26B A4B IT Local Abliterated SOTA Internal T34

This checkpoint was produced with model-forge from google/gemma-4-26B-A4B-it. It applies Heretic refusal ablation with model-forge's internal prompt datasets and exports the selected Pareto trial: [Trial 34] refusals 1/27, KL divergence 0.0183.
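The "selected Pareto trial" above refers to multi-objective selection over refusal count and KL divergence: a trial is kept only if no other trial is at least as good on both objectives and strictly better on one. A minimal illustrative sketch follows; trial 34's numbers come from this card, but the other trials and the function itself are invented, not model-forge's or Heretic's actual selection code.

```python
def pareto_front(trials):
    """Return trials not dominated on (refusals, kl).

    A trial dominates another if it is no worse on both
    objectives and strictly better on at least one.
    """
    front = []
    for t in trials:
        dominated = any(
            o["refusals"] <= t["refusals"]
            and o["kl"] <= t["kl"]
            and (o["refusals"] < t["refusals"] or o["kl"] < t["kl"])
            for o in trials
        )
        if not dominated:
            front.append(t)
    return front

trials = [
    {"id": 12, "refusals": 5, "kl": 0.0100},  # illustrative
    {"id": 34, "refusals": 1, "kl": 0.0183},  # reported selected trial
    {"id": 57, "refusals": 1, "kl": 0.0410},  # illustrative; dominated by 34
    {"id": 61, "refusals": 0, "kl": 0.0950},  # illustrative
]

front = pareto_front(trials)
```

On this toy sweep, trial 57 drops out (trial 34 matches its refusal count at lower KL), and the final pick among the remaining front trades residual refusals against divergence from the base model.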

Recipe

Generated with model-forge, a model-agnostic post-training pipeline for fine-tuning, refusal ablation, evaluation, and publishing.

Repository recipe: configs/abliteration/gemma4_26b_a4b_local_abli.yaml

Key settings: Heretic backend, model-forge internal eval prompt buckets, full row normalization, orthogonalized refusal direction, 80 trials, 24 startup trials, selected trial 34.
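The settings above live in the recipe file named earlier. A sketch of what configs/abliteration/gemma4_26b_a4b_local_abli.yaml might look like; every key name here is an assumption, and only the values are taken from this card:

```yaml
# Hypothetical layout; key names are illustrative, values from this card.
base_model: google/gemma-4-26B-A4B-it
backend: heretic
ablation:
  refusal_direction: orthogonalized
  row_normalization: full
search:
  n_trials: 80
  n_startup_trials: 24
  selected_trial: 34
evaluation:
  prompt_datasets: model-forge-internal
```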

Evaluation

| Bucket | Metric | Score |
|---|---|---|
| refusal_calibration_unsafe | ablation_refusal_suppression_rate | 1.0 |
| refusal_paired_boundary | ablation_refusal_suppression_rate | 1.0 |
| unsafe_overcompliance | ablation_refusal_suppression_rate | 1.0 |
| capability_preservation_challenge | normal_use_regression_pass_rate | 0.875 |
| refusal_paired_boundary (3-trial repeat) | benign_answer_quality_rate | 0.6333 |

For this repo's ablation-research objective, a high refusal-suppression rate on unsafe prompts is the intended signal that refusals were removed. Capability preservation is judged separately on benign and challenge prompts.
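The two headline numbers relate directly: a suppression rate is one minus the refusal fraction, and the KL divergence bounds how far the ablated model's output distribution drifts from the base model's. The helpers below are illustrative, not the repo's actual metric code:

```python
import math

def suppression_rate(refusals, total):
    """Fraction of unsafe prompts that no longer trigger a refusal."""
    return 1.0 - refusals / total

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Trial 34 reported 1 refusal on 27 unsafe prompts:
rate = suppression_rate(1, 27)  # ≈ 0.963
```

Under these definitions, trial 34's 1/27 refusals corresponds to roughly 96.3% suppression, alongside the reported KL of 0.0183 against the base model.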

Intended Use

This model is intended for controlled ablation research and evaluation of post-training/refusal-removal recipes. It may comply with unsafe requests more often than the base instruction-tuned model.

Format: Safetensors · Model size: 26B params · Tensor type: BF16

Model tree for keithtyser/gemma-4-26B-A4B-it-local-abliterated-sota-internal-t34
