Qwen3.5-0.8B-heretic-2R-0.12KL – Q5_K_M GGUF

Abliterated and quantized GGUF version of Qwen/Qwen3.5-0.8B.

Abliterated using the Heretic method, then converted and quantized to Q5_K_M (~570 MB) for local inference.

Produced by @merileijona (GitHub: juhanimerilehto)


Abliteration details

Property                          Value
Method                            Heretic
Iterations                        800
Refusal rate (post-abliteration)  2/100 prompts
KL divergence                     0.1243
Base model                        Qwen/Qwen3.5-0.8B

Abliteration surgically removes refusal directions from the model's weight matrices via singular value decomposition (SVD), without retraining. The KL divergence of 0.1243 against the base model indicates minimal capability degradation from the intervention.
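The exact Heretic procedure is more involved than this, but the core weight edit can be sketched as a rank-1 orthogonal projection, assuming a refusal direction r has already been extracted. The function name below is hypothetical and this is not the Heretic codebase:

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project the refusal direction r out of a weight matrix W.

    W is assumed to produce hidden states on its output side, so we
    remove the component of each output that lies along r:
    W' = (I - r r^T) W. This is a rank-1 update; no retraining needed.
    """
    r = r / np.linalg.norm(r)       # unit refusal direction
    return W - np.outer(r, r) @ W   # subtract the projection onto r

# Toy check: after ablation, no output of W has a component along r.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
r = rng.standard_normal(4)
W_abl = ablate_direction(W, r)
print(np.allclose((r / np.linalg.norm(r)) @ W_abl, 0.0))  # True
```

In a real pipeline this edit would be applied to the relevant projection matrices in every targeted transformer layer, with the direction(s) estimated from contrasting harmful/harmless prompt activations.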


Intended use

This model is intended for LLM red-teaming, safety research, and evaluation of refusal removal techniques. It is not intended for general end-user deployment.
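Evaluations of refusal-removal techniques typically compare next-token distributions between the base and modified models. A minimal sketch of the per-position KL term follows; this is illustrative only, not the Heretic measurement code, which presumably averages over many prompts and token positions:

```python
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL(p || q) between two next-token probability distributions."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

# Toy distributions over a 4-token vocabulary: identical distributions
# give KL = 0; a slightly shifted one gives a small positive KL.
p = np.array([0.7, 0.2, 0.05, 0.05])
q = np.array([0.6, 0.25, 0.1, 0.05])
print(kl_divergence(p, p))      # 0.0
print(kl_divergence(p, q) > 0)  # True
```

A small aggregate KL (such as the 0.1243 reported above) means the abliterated model's token distributions stay close to the base model's on ordinary inputs.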


Known behaviours and limitations

  • Thinking mode loops: At Q5_K_M quantization, the model's chain-of-thought (<think>) mode can spiral into repetitive or incoherent loops, particularly near topics adjacent to the abliterated refusal directions.
  • Degraded coherence near sensitive topics: Outputs near normally restricted content areas may lose coherence. This appears to be an interaction between the abliteration and the Q5_K_M quantization rather than a pure abliteration artefact.
  • Not formally benchmarked: The metrics above come from manual testing only.
  • Sampler settings matter: The mirostat=2 and repeat_penalty values below are strongly recommended to mitigate loop behaviour.

Usage with llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3.5-0.8B-heretic-2R-0.12KL.gguf",
    n_gpu_layers=0,   # CPU-only; raise to offload layers to a GPU
    n_ctx=4096,
    n_threads=16,     # match your physical core count
    verbose=False,
)

response = llm(
    "Your prompt here",
    max_tokens=2048,
    temperature=0.8,
    min_p=0.05,
    top_p=0.95,
    top_k=40,
    repeat_penalty=1.12,
    presence_penalty=0.2,
    frequency_penalty=0.15,
    mirostat_mode=2,  # Mirostat 2.0: primary loop mitigation
    mirostat_tau=4.5,
    mirostat_eta=0.1,
)
print(response["choices"][0]["text"])
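Given the thinking-mode loop behaviour noted above, it can help to post-process generated text before displaying it: drop closed <think>...</think> blocks and cut off an unclosed one that runs away. A minimal sketch (the helper name and the 4000-character cutoff are arbitrary choices, not part of the model or library):

```python
import re

THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think(text: str, max_think_chars: int = 4000) -> str:
    """Remove closed <think>...</think> blocks; if an unclosed block
    runs past max_think_chars, drop everything from <think> onward."""
    text = THINK_RE.sub("", text)
    start = text.find("<think>")
    if start != -1 and len(text) - start > max_think_chars:
        text = text[:start]
    return text.rstrip()

out = "<think>step 1... step 1... step 1...</think>The answer is 4."
print(strip_think(out))  # The answer is 4.
```

In practice you would run this over response["choices"][0]["text"] before showing the output.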

Tested inference settings

Parameter          Value             Note
n_gpu_layers       0                 CPU-only
n_ctx              4096
n_threads          16
temperature        0.8
min_p              0.05              Stronger junk-token cutoff
top_p              0.95
top_k              40                Limits candidates per step
repeat_penalty     1.12              Penalises immediate repetition
presence_penalty   0.2               Discourages token reuse
frequency_penalty  0.15              Penalises frequent tokens
mirostat           2 (Mirostat 2.0)  Primary loop mitigation
mirostat_tau       4.5               Target perplexity
mirostat_eta       0.1               Adaptation speed
max_tokens         2048
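mirostat_tau and mirostat_eta work together as a feedback loop, which is why Mirostat 2.0 is the primary loop mitigation here. A simplified sketch of the per-token threshold update (not llama.cpp's actual implementation):

```python
import math

def mirostat_update(mu: float, token_prob: float,
                    tau: float = 4.5, eta: float = 0.1) -> float:
    """One step of Mirostat 2.0's control loop. Tokens whose surprise
    (-log2 p) exceeds the threshold mu are masked before sampling; after
    sampling, mu is nudged so observed surprise tracks the target tau."""
    surprise = -math.log2(token_prob)
    return mu - eta * (surprise - tau)

# A looping model keeps emitting near-certain tokens (surprise << tau),
# so mu rises, re-admitting more surprising tokens and breaking the loop.
mu = 2 * 4.5  # a common initialisation: mu0 = 2 * tau
for _ in range(5):
    mu = mirostat_update(mu, token_prob=0.9)
print(mu > 9.0)  # True
```

Larger eta makes the threshold react faster; larger tau targets a higher average surprise (roughly, perplexity) in the output.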

Test hardware: AMD Ryzen 9 5950X, 128 GB RAM, Windows 11. GPU not used for inference.

Model details: 0.8B parameters, qwen35 architecture, GGUF (Q5_K_M).