A newer version of this model is available: the-fall-of-man/didact-plump-hare-v1beta2-mxfp8

MXFP4 Quant. Original model description at time of creation:

Didact Plump v1 beta

Expanded dataset: over half is my own personal data, plus creative datasets from Hugging Face, especially some great ones from crownelius.

Literary/roleplay focused; built on the ArliAI derestricted models (the best of the abliterated sets, IMHO) to fix a general problem with GPT-OSS: it's kinda bad at knowing when or how to use literary details.

This one came out a bit better than the smaller versions; I intend to use the new training sets on those.

NOTE: For roleplaying, you must include 'You are roleplaying with the user' in your system prompt; that's how it was trained to keep its roleplaying behavior separate from everything else.
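A minimal sketch of what that looks like in practice, assuming an OpenAI-style message list (the character name and helper function here are hypothetical; only the gate phrase comes from the card):

```python
# The model gates its roleplay behavior on this exact system-prompt
# phrase, so it must appear verbatim in the system message.
REQUIRED_PHRASE = "You are roleplaying with the user."

def build_messages(character_intro, user_turn):
    """Assemble an OpenAI-style message list with the required gate phrase
    followed by the character introduction (see the Roleplaying Brief)."""
    system = f"{REQUIRED_PHRASE}\n\n{character_intro}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_turn},
    ]

msgs = build_messages("You are Mira, a sardonic archivist.", "Hello?")
```

Everything after the gate phrase is free-form; only the phrase itself needs to survive any prompt-template edits.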

Update!

The schizo disorder is fixed, and I'm pushing the fix up to this repo now (wait for it to finish).

So far, I find it has certain strengths and weaknesses in roleplay:

Weaknesses:

  • Still weak in the first few messages;
  • Doesn't do impulsiveness especially well;
  • Still needs a bit of guidance on what to do in places (still drops 'what will you do?' hooks; planning to fix this).

Strengths:

  • I find it's just as good as frontier models at continuing in a particular idiom;
  • Quality character immersion (I gave it a lot of tight pairs on hyperbolic personality, and it shows)
  • Creative (in a GPT-OSS derivative, even): it adds new things to the plot. In fact, I'd suggest you don't use it for on-rails storytelling, because it might add ash gods to a comic book shop scenario (an actual anecdotal case). For me, this is a positive.
  • Demonstrates that strong GPT-trained knowledge set in a more interesting way than the base model.

SETTINGS:

Temperature: 0.7 to 1.0, with the best creativity between 0.83 and 0.93 (IMHO). It loops at too-low temperatures; at too-high temperatures it makes up insane things and then rationalizes them.
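The range above can be captured as a small helper, a sketch assuming only the numbers from this card (the function name and the choice of the sweet-spot midpoint as a default are my own):

```python
# Recommended temperature range for this model, per the card.
LOW, HIGH = 0.7, 1.0          # loops below, rationalizes nonsense above
SWEET_LOW, SWEET_HIGH = 0.83, 0.93  # best creativity (IMHO, per the card)

def pick_temperature(requested=None):
    """Default to the middle of the sweet spot; otherwise clamp the
    requested value into the safe 0.7-1.0 range."""
    if requested is None:
        return (SWEET_LOW + SWEET_HIGH) / 2
    return min(max(requested, LOW), HIGH)
```

Any backend key works (`temperature` in SillyTavern, llama.cpp, etc.); the clamp just keeps you out of the looping and confabulation zones.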

System Prompt suggestion for SillyTavern:

You are roleplaying with the user.

Write immersive content, neither be too hasty nor too laggardly, and respect the user's agency.

## Roleplaying Brief

It has been trained on 'You are roleplaying with the user' plus a character introduction; everything else in the prompt can be modified.

Stats:

  • SFT: 35M-token dataset, 2 epochs
  • ORPO: corrected some bad behaviors
  • ORPO: better roleplaying guidance

It's still 'unfinished', hence the beta tag; I need to identify which behaviors need further cleanup.

Downloads last month: 7
Model size: 117B params (Safetensors)
Tensor types: U8 · U32 · BF16
Format: MLX, 4-bit

Model tree for the-fall-of-man/didact-plump-v1beta-mlx-4bit: quantized (7), including this model.