27B?

#1
by pj1983 - opened

Any chance you're planning on tackling Gemma-3-27B? Traditional abliteration really seemed to lobotomize the Gemma models, but this one doesn't show any signs of it. I'd love to see what this Paperwitch technique can do with a larger version. I know your resources are limited, though.

It's just Heretic + MPOA with educated refusal markers and refined ablation params/weights per model, both of which are established by studying the model's demeanor toward, and responses to, harmful prompts prior to ablation and throughout each significant iterative ablation trial in Heretic. So I'm actually undecided about whether to tackle or triage the larger 27B model. There are already a bazillion ablations of that model, and someone must have done a clean one.
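For anyone unfamiliar with what these tools actually do under the hood: the core idea is directional ablation, i.e. finding a "refusal direction" in activation space and projecting it out of the model's weights. Here's a minimal toy sketch of that idea in NumPy. All the names and shapes here are illustrative assumptions for a toy example, not Heretic's actual API or MPOA's exact procedure.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction between activations on harmful
    vs. harmless prompts, normalized to unit length."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(weight, direction, scale=1.0):
    """Remove the direction's component from a weight matrix's output
    space: W' = W - scale * d (d^T W). With scale=1 and unit d, the
    ablated matrix can no longer write along d at all."""
    d = direction.reshape(-1, 1)  # column vector, shape (hidden, 1)
    return weight - scale * d @ (d.T @ weight)

# Toy data standing in for hidden activations from the two prompt sets.
rng = np.random.default_rng(0)
harmful = rng.normal(size=(64, 16)) + 2.0
harmless = rng.normal(size=(64, 16))
d = refusal_direction(harmful, harmless)

W = rng.normal(size=(16, 16))
W_abl = ablate(W, d)
print(np.linalg.norm(d @ W_abl))  # component along d is now ~0
```

The per-model tuning mentioned above would correspond to choosing which layers/matrices get this treatment and what `scale` each one gets, rather than applying a single full-strength projection everywhere.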

This finetune may be sufficiently close to modern MPOA abliteration. @GrimJim assisted with its Hereticisation after an earlier abliteration attempt failed.

Well, I got 15/104 refusals at 0.1326 KLD. It's an improvement over my past attempts: the KLD alone is half what it used to be, plus the weight on MLP layers is restricted. I'm still unsure whether the model is actually any good, though.
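For context on the KLD number: it measures how far the ablated model's next-token distributions drift from the original's on harmless prompts, so lower is better (less collateral damage). A minimal sketch of how such a figure might be computed from per-token logits of the two models; the function name and shapes are my own assumptions, not the tool's actual implementation:

```python
import numpy as np

def mean_kl_divergence(p_logits, q_logits):
    """Mean KL(P || Q) over tokens, computed from raw logits.
    p_logits: original model, q_logits: ablated model;
    both shaped (num_tokens, vocab_size)."""
    # Stable softmax / log-softmax via max subtraction.
    p_shift = p_logits - p_logits.max(-1, keepdims=True)
    p = np.exp(p_shift)
    p /= p.sum(-1, keepdims=True)
    log_p = np.log(p)
    q_shift = q_logits - q_logits.max(-1, keepdims=True)
    log_q = q_shift - np.log(np.exp(q_shift).sum(-1, keepdims=True))
    return (p * (log_p - log_q)).sum(-1).mean()
```

Identical models give a KLD of 0; a value like 0.13 means the ablated model's distributions stay fairly close to the original's.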

Thanks for giving it a try! I'll take a look at it.

pj1983 changed discussion status to closed
