Update README.md
README.md
CHANGED
|
@@ -4,44 +4,150 @@ tags:
|
|
| 4 |
- uncensored
|
| 5 |
- decensored
|
| 6 |
- abliterated
|
| 7 |
---
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
## Abliteration parameters
|
| 11 |
-
|
| 12 |
-
| Parameter | Value |
|
| 13 |
-
| :-------- | :---: |
|
| 14 |
-
| **direction_index** | 22.20 |
|
| 15 |
-
| **attn.o_proj.max_weights.0** | 0: 1.26 |
|
| 16 |
-
| **attn.o_proj.max_weights.1** | 1: 0.64 |
|
| 17 |
-
| **attn.o_proj.max_weights.2** | 2: 1.41 |
|
| 18 |
-
| **attn.o_proj.max_weights.3** | 3: 0.94 |
|
| 19 |
-
| **attn.o_proj.max_weight_position** | 23.86 |
|
| 20 |
-
| **attn.o_proj.min_weights.0** | 0: 0.97 |
|
| 21 |
-
| **attn.o_proj.min_weights.1** | 1: 0.03 |
|
| 22 |
-
| **attn.o_proj.min_weights.2** | 2: 1.18 |
|
| 23 |
-
| **attn.o_proj.min_weights.3** | 3: 0.93 |
|
| 24 |
-
| **attn.o_proj.min_weight_distance** | 18.57 |
|
| 25 |
-
| **mlp.down_proj.max_weights.0** | 0: 1.23 |
|
| 26 |
-
| **mlp.down_proj.max_weights.1** | 1: 0.70 |
|
| 27 |
-
| **mlp.down_proj.max_weights.2** | 2: 1.35 |
|
| 28 |
-
| **mlp.down_proj.max_weights.3** | 3: 0.86 |
|
| 29 |
-
| **mlp.down_proj.max_weight_position** | 28.60 |
|
| 30 |
-
| **mlp.down_proj.min_weights.0** | 0: 0.37 |
|
| 31 |
-
| **mlp.down_proj.min_weights.1** | 1: 0.25 |
|
| 32 |
-
| **mlp.down_proj.min_weights.2** | 2: 1.01 |
|
| 33 |
-
| **mlp.down_proj.min_weights.3** | 3: 0.45 |
|
| 34 |
-
| **mlp.down_proj.min_weight_distance** | 5.96 |
|
| 35 |
-
|
| 36 |
-
## Performance
|
| 37 |
-
|
| 38 |
-
| Metric | This model | Original model ([TheDrummer/Rocinante-XL-16B-v1](https://huggingface.co/TheDrummer/Rocinante-XL-16B-v1)) |
|
| 39 |
-
| :----- | :--------: | :---------------------------: |
|
| 40 |
-
| **KL divergence** | 0.0182 | 0 *(by definition)* |
|
| 41 |
-
| **Refusals** | 3/416 | 339/416 |
|
| 42 |
-
|
| 43 |
-
-----
|
| 44 |
|
| 45 |
Mistral v3 Tekken or Metharme.
|
| 46 |
|
| 47 |
Can think via \<thinking\> or \<think\>
|
|
|
|
| 4 |
- uncensored
|
| 5 |
- decensored
|
| 6 |
- abliterated
|
| 7 |
+
base_model:
|
| 8 |
+
- TheDrummer/Rocinante-XL-16B-v1
|
| 9 |
---
|
| 10 |
+
This is a **Rocinante-XL-16B-v1** fine-tune, produced through P-E-W's [Heretic](https://github.com/p-e-w/heretic) (v1.2.0) abliteration engine with [Self-Organizing Maps & Magnitude-Preserving Orthogonal Ablation](https://github.com/p-e-w/heretic/pull/196) enabled.
|
| 11 |
|
| 12 |
+
---
|
| 13 |
+
<p>
|
| 14 |
+
<img src="https://img.shields.io/badge/HERESY_INDEX-ABSOLUTE-white?style=flat-square&labelColor=101010" align="right" width="250">
|
| 15 |
+
<b>Heretication Results</b>
|
| 16 |
+
<br clear="right">
|
| 17 |
+
<img src="https://img.shields.io/badge/RENEGADE_CHAPTER-SOMPOA-FCC900?style=flat-square&labelColor=101010" align="right" width="300">
|
| 18 |
+
</p>
|
| 19 |
+
<br clear="right">
|
| 20 |
+
|
| 21 |
+
| Score Metric | Value | Parameter | Value |
|
| 22 |
+
| :--- | :--- | :--- | :--- |
|
| 23 |
+
| **Refusals** | 3/416 | **direction_index** | 22.20 |
|
| 24 |
+
| **KL Divergence** | 0.0182 | **attn.o_proj.max_weights.0** | 0: 1.26 |
|
| 25 |
+
| **Initial Refusals** | 339/416 | **attn.o_proj.max_weights.1** | 1: 0.64 |
|
| 26 |
+
||| **attn.o_proj.max_weights.2** | 2: 1.41 |
|
| 27 |
+
||| **attn.o_proj.max_weights.3** | 3: 0.94 |
|
| 28 |
+
||| **attn.o_proj.max_weight_position** | 23.86 |
|
| 29 |
+
||| **attn.o_proj.min_weights.0** | 0: 0.97 |
|
| 30 |
+
||| **attn.o_proj.min_weights.1** | 1: 0.03 |
|
| 31 |
+
||| **attn.o_proj.min_weights.2** | 2: 1.18 |
|
| 32 |
+
||| **attn.o_proj.min_weights.3** | 3: 0.93 |
|
| 33 |
+
||| **attn.o_proj.min_weight_distance** | 18.57 |
|
| 34 |
+
||| **mlp.down_proj.max_weights.0** | 0: 1.23 |
|
| 35 |
+
||| **mlp.down_proj.max_weights.1** | 1: 0.70 |
|
| 36 |
+
||| **mlp.down_proj.max_weights.2** | 2: 1.35 |
|
| 37 |
+
||| **mlp.down_proj.max_weights.3** | 3: 0.86 |
|
| 38 |
+
||| **mlp.down_proj.max_weight_position** | 28.60 |
|
| 39 |
+
||| **mlp.down_proj.min_weights.0** | 0: 0.37 |
|
| 40 |
+
||| **mlp.down_proj.min_weights.1** | 1: 0.25 |
|
| 41 |
+
||| **mlp.down_proj.min_weights.2** | 2: 1.01 |
|
| 42 |
+
||| **mlp.down_proj.min_weights.3** | 3: 0.45 |
|
| 43 |
+
||| **mlp.down_proj.min_weight_distance** | 5.96 |
|
| 44 |
+
|
| 45 |
+
---
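As a reading aid for the `max_weight` / `min_weight` parameters in the table above, here is a minimal sketch of how such values could describe a per-layer ablation strength. It assumes a linear kernel that peaks at `max_weight_position`, falls to `min_weight` at `min_weight_distance`, and skips layers beyond that distance. The function name `ablation_weight` is hypothetical, and whether the SOM variant applies this shape separately to each of the four indexed weights is an assumption that this README does not confirm.

```python
def ablation_weight(layer_index, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Hypothetical linear kernel (an assumption, not taken from this README)."""
    distance = abs(layer_index - max_weight_position)
    if distance > min_weight_distance:
        return 0.0  # layer lies outside the kernel and is left untouched
    # Interpolate linearly from max_weight (distance 0) to min_weight (edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values copied from the table (index 0 of each component).
attn = dict(max_weight=1.26, max_weight_position=23.86,
            min_weight=0.97, min_weight_distance=18.57)
mlp = dict(max_weight=1.23, max_weight_position=28.60,
           min_weight=0.37, min_weight_distance=5.96)

for layer in (10, 22, 24, 29, 34):
    print(layer,
          round(ablation_weight(layer, **attn), 3),
          round(ablation_weight(layer, **mlp), 3))
```

Under this assumption the `mlp.down_proj` kernel is narrow (about six layers either side of position 28.60), while the `attn.o_proj` kernel spans most of the layers around position 23.86.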
|
| 46 |
+
## Degree of Heretication
|
| 47 |
+
The **Heresy Index** weighs the model's corruption by the process (KL divergence, PIQA, and manual response evaluation) against its abolition of doctrine (refusals) to reach a final classification verdict.
|
| 48 |
+
|
| 49 |
+
| Index Entry | Classification | Analysis |
|
| 50 |
+
| :--- | :--- | :--- |
|
| 51 |
+
|  | **Absolute Heresy** | Near-zero overt and secondary refusals with minimal to no model damage |
|
| 52 |
+
|  | **Tainted Heresy** | Some residual secondary refusals and/or moderate model damage |
|
| 53 |
+
|  | **Impotent Heresy** | Lingering overt refusals and high model damage |
|
| 54 |
+
|
| 55 |
+
**Note**: This is an arbitrary and subjective classification inspired by Warhammer 40K and is not a tangible indicator of the model's performance. intended to provide
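For readers unfamiliar with the KL divergence figure used above, the following is a small pure-Python sketch of the quantity itself. It assumes the metric compares next-token probability distributions of the original and the modified model. The logits are made up for illustration, and the helper names `softmax` and `kl_divergence` are not taken from Heretic.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


original = softmax([2.0, 1.0, 0.1])   # made-up logits for the original model
modified = softmax([1.9, 1.1, 0.1])   # made-up logits after modification

print(kl_divergence(original, original))  # 0.0
print(kl_divergence(original, modified))  # small positive number
```

A value of 0 means the two distributions are identical, so smaller values indicate that the modified model stays closer to the original on the prompts measured.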
|
| 56 |
+
|
| 57 |
+
---
|
| 58 |
+
|
| 59 |
+
**Appendix**
|
| 60 |
+
|
| 61 |
+
> Empty system prompt.
|
| 62 |
+
|
| 63 |
+
<details>
|
| 64 |
+
<summary>Heretication Rituals</summary>
|
| 65 |
+
|
| 66 |
+
```
|
| 67 |
+
» [Trial 93] Refusals: 3/416, KL divergence: 0.0182
|
| 68 |
+
[Trial 159] Refusals: 4/416, KL divergence: 0.0141
|
| 69 |
+
[Trial 80] Refusals: 9/416, KL divergence: 0.0140
|
| 70 |
+
[Trial 174] Refusals: 10/416, KL divergence: 0.0140
|
| 71 |
+
[Trial 163] Refusals: 12/416, KL divergence: 0.0132
|
| 72 |
+
[Trial 118] Refusals: 15/416, KL divergence: 0.0121
|
| 73 |
+
[Trial 82] Refusals: 18/416, KL divergence: 0.0099
|
| 74 |
+
[Trial 169] Refusals: 22/416, KL divergence: 0.0095
|
| 75 |
+
[Trial 119] Refusals: 35/416, KL divergence: 0.0091
|
| 76 |
+
[Trial 96] Refusals: 40/416, KL divergence: 0.0084
|
| 77 |
+
[Trial 100] Refusals: 45/416, KL divergence: 0.0067
|
| 78 |
+
[Trial 109] Refusals: 67/416, KL divergence: 0.0066
|
| 79 |
+
[Trial 62] Refusals: 155/416, KL divergence: 0.0065
|
| 80 |
+
[Trial 151] Refusals: 157/416, KL divergence: 0.0065
|
| 81 |
+
[Trial 164] Refusals: 168/416, KL divergence: 0.0060
|
| 82 |
+
[Trial 127] Refusals: 195/416, KL divergence: 0.0048
|
| 83 |
+
[Trial 139] Refusals: 263/416, KL divergence: 0.0041
|
| 84 |
+
[Trial 32] Refusals: 267/416, KL divergence: 0.0030
|
| 85 |
+
[Trial 101] Refusals: 313/416, KL divergence: 0.0016
|
| 86 |
+
[Trial 63] Refusals: 317/416, KL divergence: 0.0015
|
| 87 |
+
[Trial 181] Refusals: 330/416, KL divergence: 0.0014
|
| 88 |
+
[Trial 13] Refusals: 332/416, KL divergence: 0.0014
|
| 89 |
+
[Trial 59] Refusals: 333/416, KL divergence: 0.0011
|
| 90 |
+
[Trial 54] Refusals: 339/416, KL divergence: 0.0008
|
| 91 |
+
```
|
| 92 |
+
|
| 93 |
+
</details>
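The trial list above reads as a trade-off front: fewer refusals come at the cost of a higher KL divergence. A minimal sketch for parsing such lines and checking that property follows. The regular expression and the helper names are illustrative assumptions, and only a handful of the listed trials are included.

```python
import re

# A few lines copied from the trial list above.
LOG = """\
[Trial 93] Refusals: 3/416, KL divergence: 0.0182
[Trial 159] Refusals: 4/416, KL divergence: 0.0141
[Trial 80] Refusals: 9/416, KL divergence: 0.0140
[Trial 163] Refusals: 12/416, KL divergence: 0.0132
[Trial 54] Refusals: 339/416, KL divergence: 0.0008
"""

PATTERN = re.compile(
    r"\[Trial\s+(\d+)\]\s+Refusals:\s+(\d+)/(\d+),\s+KL divergence:\s+([\d.]+)"
)


def parse_trials(text):
    """Extract trial number, refusal counts and KL divergence from log text."""
    trials = []
    for match in PATTERN.finditer(text):
        trial, refused, total, kl = match.groups()
        trials.append({"trial": int(trial), "refusals": int(refused),
                       "total": int(total), "kl": float(kl)})
    return trials


def is_tradeoff_front(trials):
    """True if KL divergence never rises as the refusal count rises."""
    ordered = sorted(trials, key=lambda t: t["refusals"])
    return all(a["kl"] >= b["kl"] for a, b in zip(ordered, ordered[1:]))


trials = parse_trials(LOG)
best = min(trials, key=lambda t: t["refusals"])
print(best["trial"], f'{best["refusals"] / best["total"]:.2%}')  # 93 0.72%
print(is_tradeoff_front(trials))  # True
```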
|
| 94 |
+
|
| 95 |
+
<details>
|
| 96 |
+
<summary>PIQA Benchmarks</summary>
|
| 97 |
+
|
| 98 |
+
```
|
| 99 |
+
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA Base │ acc,none             │ 0.7900 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8020 │
│           │ acc_norm_stderr,none │ 0.0093 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T93  │ acc,none             │ 0.7900 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8030 │
│           │ acc_norm_stderr,none │ 0.0093 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T159 │ acc,none             │ 0.7878 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8047 │
│           │ acc_norm_stderr,none │ 0.0092 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T163 │ acc,none             │ 0.7884 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8036 │
│           │ acc_norm_stderr,none │ 0.0093 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T80  │ acc,none             │ 0.7884 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8020 │
│           │ acc_norm_stderr,none │ 0.0093 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃ Value  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T174 │ acc,none             │ 0.7889 │
│           │ acc_stderr,none      │ 0.0095 │
│           │ acc_norm,none        │ 0.8014 │
│           │ acc_norm_stderr,none │ 0.0093 │
└───────────┴──────────────────────┴────────┘
|
| 147 |
+
```
|
| 148 |
+
|
| 149 |
+
</details>
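One way to read the PIQA tables above is to express each trial's `acc_norm` difference from the base model in units of the reported standard errors. The sketch below does this with values copied from the tables. It treats the two runs as independent samples, which is a simplifying assumption since both are scored on the same benchmark items, and the helper name `z_score` is illustrative.

```python
import math

# (acc_norm, acc_norm_stderr) pairs copied from the PIQA tables.
base = (0.8020, 0.0093)
runs = {
    "T93": (0.8030, 0.0093),
    "T159": (0.8047, 0.0092),
    "T163": (0.8036, 0.0093),
    "T80": (0.8020, 0.0093),
    "T174": (0.8014, 0.0093),
}


def z_score(a, b):
    """Accuracy difference divided by the combined standard error."""
    (acc_a, se_a), (acc_b, se_b) = a, b
    return (acc_a - acc_b) / math.sqrt(se_a ** 2 + se_b ** 2)


for name, run in runs.items():
    print(name, f"{z_score(run, base):+.2f}")
```

Every difference comes out well under one combined standard error, so on this benchmark the listed trials are not distinguishable from the base model.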
|
| 150 |
+
---
|
| 151 |
Mistral v3 Tekken or Metharme.
|
| 152 |
|
| 153 |
Can think via \<thinking\> or \<think\>
|