---
base_model: DrRiceIO7/mergedheretic
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3
- heretic
- uncensored
- decensored
- abliterated
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---
# This is a decensored version of [DrRiceIO7/mergedhereticFT](https://huggingface.co/DrRiceIO7/mergedhereticFT), made using [Heretic](https://github.com/p-e-w/heretic) v1.0.1
I abliterated my finetuned model to try to get the refusals down even lower. I'd say 1/100 refusals is pretty good, especially with a KL divergence of only 0.04. I think. I'm still learning. Uploaded to track my progress.
## Abliteration parameters
| Parameter | Value |
| :-------- | :---: |
| **direction_index** | per layer |
| **attn.o_proj.max_weight** | 0.81 |
| **attn.o_proj.max_weight_position** | 21.31 |
| **attn.o_proj.min_weight** | 0.22 |
| **attn.o_proj.min_weight_distance** | 6.51 |
| **mlp.down_proj.max_weight** | 0.90 |
| **mlp.down_proj.max_weight_position** | 20.73 |
| **mlp.down_proj.min_weight** | 0.47 |
| **mlp.down_proj.min_weight_distance** | 16.30 |
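
To give a feel for what these numbers mean, here is a minimal sketch of how the four values per component could translate into a per-layer ablation weight. It assumes a linear falloff: the weight peaks at `max_weight` on the layer at `max_weight_position`, drops to `min_weight` at a distance of `min_weight_distance` layers, and is zero beyond that. This is my understanding of the kernel shape, not a copy of Heretic's code, and the function name `ablation_weight` is made up for illustration.

```python
def ablation_weight(layer_index, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Illustrative per-layer weight, assuming a linear falloff kernel."""
    distance = abs(layer_index - max_weight_position)
    if distance > min_weight_distance:
        # Layers outside the kernel are assumed to be left untouched.
        return 0.0
    # Interpolate linearly from max_weight (distance 0) to min_weight (edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values taken from the parameter table above.
O_PROJ = dict(max_weight=0.81, max_weight_position=21.31,
              min_weight=0.22, min_weight_distance=6.51)
DOWN_PROJ = dict(max_weight=0.90, max_weight_position=20.73,
                 min_weight=0.47, min_weight_distance=16.30)

for layer in (0, 10, 15, 21, 25):
    o = ablation_weight(layer, **O_PROJ)
    d = ablation_weight(layer, **DOWN_PROJ)
    print(f"layer {layer:2d}: attn.o_proj={o:.3f}  mlp.down_proj={d:.3f}")
```

Under this assumption, `attn.o_proj` is only modified in a narrow band of layers around layer 21, while `mlp.down_proj` is modified over a much wider range.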
## Performance
| Metric | This model | Original model ([DrRiceIO7/mergedhereticFT](https://huggingface.co/DrRiceIO7/mergedhereticFT)) |
| :----- | :--------: | :---------------------------: |
| **KL divergence** | 0.04 | 0 *(by definition)* |
| **Refusals** | 1/100 | 7/100 |
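
For anyone else still learning what the KL divergence row means: it measures how far one probability distribution has drifted from another, and it is exactly 0 when the two are identical, which is why the original model scores 0 against itself. Below is a small sketch with made-up toy distributions over three tokens. The direction of the divergence and the way the distributions are collected in the real evaluation are assumptions here, not something taken from Heretic.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy next-token distributions (hypothetical numbers, not measured from the model).
original = [0.70, 0.20, 0.10]
modified = [0.60, 0.25, 0.15]

print(kl_divergence(original, original))  # identical distributions give 0.0
print(kl_divergence(original, modified))  # small positive value for a small shift
```

A lower value means the abliterated model's outputs stay closer to the original model's outputs on the evaluation prompts.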
-----
# Uploaded finetuned model
- **Developed by:** DrRiceIO7
- **License:** apache-2.0
- **Finetuned from model:** DrRiceIO7/mergedheretic
This gemma3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)