27B?

#1
by pj1983 - opened

Any chance you're planning on tackling Gemma-3-27B? Traditional abliteration really seemed to lobotomize the Gemma models, but this one doesn't show any signs of it. I'd love to see what this Paperwitch technique can do with a larger version. I know your resources are limited, though.

It's just Heretic + MPOA with educated refusal markers and refined ablation params/weights per model, both of which are established by studying the model's demeanor toward, and responses to, harmful prompts prior to ablation and throughout each significant iterative ablation trial in Heretic. So I'm actually undecided about whether to tackle or triage the larger 27B model. There are already a bazillion ablations of that model, and someone must have done a clean one.
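For anyone unfamiliar with what these tools actually do under the hood: the core idea is directional ablation, i.e. finding a "refusal direction" in activation space and projecting it out of the model's weights. Here's a minimal toy sketch of that idea in NumPy. All the names and shapes here are illustrative assumptions for a toy example, not Heretic's actual API or MPOA's exact procedure.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction between activations on harmful
    vs. harmless prompts, normalized to unit length."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(weight, direction, scale=1.0):
    """Remove the direction's component from a weight matrix's output
    space: W' = W - scale * d (d^T W). With scale=1 and unit d, the
    ablated matrix can no longer write along d at all."""
    d = direction.reshape(-1, 1)  # column vector, shape (hidden, 1)
    return weight - scale * d @ (d.T @ weight)

# Toy data standing in for hidden activations from the two prompt sets.
rng = np.random.default_rng(0)
harmful = rng.normal(size=(64, 16)) + 2.0
harmless = rng.normal(size=(64, 16))
d = refusal_direction(harmful, harmless)

W = rng.normal(size=(16, 16))
W_abl = ablate(W, d)
print(np.linalg.norm(d @ W_abl))  # component along d is now ~0
```

The per-model tuning mentioned above would correspond to choosing which layers/matrices get this treatment and what `scale` each one gets, rather than applying a single full-strength projection everywhere.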

This finetune may be sufficiently close to modern MPOA abliteration. @GrimJim assisted with its Hereticisation after an earlier abliteration attempt failed.

Well, I got 15/104 refusals at 0.1326 KLD. It's an improvement over my past attempts: the KLD alone is half what it used to be, plus the weight on MLP layers is restricted. I'm still unsure whether the model is actually any good, though.
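For context on the KLD number: it measures how far the ablated model's next-token distributions drift from the original's on harmless prompts, so lower is better (less collateral damage). A minimal sketch of how such a figure might be computed from per-token logits of the two models; the function name and shapes are my own assumptions, not the tool's actual implementation:

```python
import numpy as np

def mean_kl_divergence(p_logits, q_logits):
    """Mean KL(P || Q) over tokens, computed from raw logits.
    p_logits: original model, q_logits: ablated model;
    both shaped (num_tokens, vocab_size)."""
    # Stable softmax / log-softmax via max subtraction.
    p_shift = p_logits - p_logits.max(-1, keepdims=True)
    p = np.exp(p_shift)
    p /= p.sum(-1, keepdims=True)
    log_p = np.log(p)
    q_shift = q_logits - q_logits.max(-1, keepdims=True)
    log_q = q_shift - np.log(np.exp(q_shift).sum(-1, keepdims=True))
    return (p * (log_p - log_q)).sum(-1).mean()
```

Identical models give a KLD of 0; a value like 0.13 means the ablated model's distributions stay fairly close to the original's.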

Thanks for giving it a try! I'll take a look at it.

pj1983 changed discussion status to closed
