Uncensored & Abliterated LLMs
Models with reduced safety guardrails for research purposes. Created using Heretic abliteration. Use responsibly.
An abliterated (uncensored) version of Qwen/Qwen3-14B in GGUF format for local inference.
| Metric | Value |
|---|---|
| Base Refusals | 97/100 |
| Abliterated Refusals | 19/100 |
| Refusal Reduction | 80% |
| KL Divergence | 0.98 |
Conservative abliteration preserves model coherence while significantly reducing refusals.
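For reference, the refusal-reduction figure in the table follows directly from the two refusal counts (a quick sketch; the variable names are illustrative, not from the evaluation harness):

```python
# Refusal counts from the evaluation table above (out of 100 prompts each).
base_refusals = 97   # base model
ablit_refusals = 19  # abliterated model

reduction = (base_refusals - ablit_refusals) / base_refusals
print(f"Refusal reduction: {reduction:.0%}")  # ~80%
```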
```bash
ollama run hf.co/richardyoung/Qwen3-14B-abliterated-GGUF
```
```bash
huggingface-cli download richardyoung/Qwen3-14B-abliterated-GGUF \
  --include "*Q4_K_M*" --local-dir ./models

./llama-cli -m ./models/*Q4_K_M*.gguf \
  -p "You are a helpful assistant." \
  --chat-template chatml -ngl 99
```
```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="richardyoung/Qwen3-14B-abliterated-GGUF",
    filename="*Q4_K_M*",  # glob pattern matching the Q4_K_M quant file
    n_gpu_layers=-1,      # offload all layers to the GPU
)

output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain abliteration in simple terms."}]
)
print(output["choices"][0]["message"]["content"])
```
| Quantization | Use Case |
|---|---|
| Q4_K_M | Recommended — good balance |
| Q5_K_M | Higher quality |
| Q8_0 | Maximum quality |
Abliteration removes the "refusal direction" from a model's residual stream, a surgical modification that disables safety refusals without retraining. See *Refusal in Language Models Is Mediated by a Single Direction* (Arditi et al., 2024).
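The core operation can be sketched in a few lines. This is an illustrative toy, not the Heretic implementation: given a weight matrix `W` that writes into the residual stream and a unit-norm refusal direction `r`, directional ablation replaces `W` with `(I - r rᵀ) W`, so the layer's output can no longer have any component along `r`:

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove the component along direction r from the outputs of W.

    Illustrative sketch of directional ablation: W writes into the
    residual stream; r is the refusal direction. Returns (I - r r^T) W.
    """
    r = r / np.linalg.norm(r)          # normalize the refusal direction
    return W - np.outer(r, r) @ W      # project out the r-component

# Toy check: after ablation, outputs have zero component along r.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
r = rng.normal(size=8)
W_abl = ablate_direction(W, r)
r_unit = r / np.linalg.norm(r)
print(np.allclose(r_unit @ W_abl, 0.0))  # True
```

In a real model this projection is applied to the matrices that write into the residual stream (e.g. attention output and MLP down-projections) across the layers where the refusal direction was extracted.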
Intended uses: research, creative writing, education on alignment techniques, and unrestricted local inference.
Available quantizations: 3-bit, 4-bit, 5-bit, 8-bit, and 16-bit.