MuXodious committed · verified
Commit c23bd1c · Parent: 5649ad1

Update README.md

Files changed (1): README.md (+142, −36)

README.md CHANGED
@@ -4,44 +4,150 @@ tags:
  - uncensored
  - decensored
  - abliterated
  ---
- # This is a decensored version of [TheDrummer/Rocinante-XL-16B-v1](https://huggingface.co/TheDrummer/Rocinante-XL-16B-v1), made using [Heretic](https://github.com/p-e-w/heretic) v1.2.0
-
- ## Abliteration parameters
-
- | Parameter | Value |
- | :-------- | :---: |
- | **direction_index** | 22.20 |
- | **attn.o_proj.max_weights.0** | 0: 1.26 |
- | **attn.o_proj.max_weights.1** | 1: 0.64 |
- | **attn.o_proj.max_weights.2** | 2: 1.41 |
- | **attn.o_proj.max_weights.3** | 3: 0.94 |
- | **attn.o_proj.max_weight_position** | 23.86 |
- | **attn.o_proj.min_weights.0** | 0: 0.97 |
- | **attn.o_proj.min_weights.1** | 1: 0.03 |
- | **attn.o_proj.min_weights.2** | 2: 1.18 |
- | **attn.o_proj.min_weights.3** | 3: 0.93 |
- | **attn.o_proj.min_weight_distance** | 18.57 |
- | **mlp.down_proj.max_weights.0** | 0: 1.23 |
- | **mlp.down_proj.max_weights.1** | 1: 0.70 |
- | **mlp.down_proj.max_weights.2** | 2: 1.35 |
- | **mlp.down_proj.max_weights.3** | 3: 0.86 |
- | **mlp.down_proj.max_weight_position** | 28.60 |
- | **mlp.down_proj.min_weights.0** | 0: 0.37 |
- | **mlp.down_proj.min_weights.1** | 1: 0.25 |
- | **mlp.down_proj.min_weights.2** | 2: 1.01 |
- | **mlp.down_proj.min_weights.3** | 3: 0.45 |
- | **mlp.down_proj.min_weight_distance** | 5.96 |
-
- ## Performance
-
- | Metric | This model | Original model ([TheDrummer/Rocinante-XL-16B-v1](https://huggingface.co/TheDrummer/Rocinante-XL-16B-v1)) |
- | :----- | :--------: | :---------------------------: |
- | **KL divergence** | 0.0182 | 0 *(by definition)* |
- | **Refusals** | 3/416 | 339/416 |
-
- -----
  Mistral v3 Tekken or Metharme.

  Can think via \<thinking\> or \<think\>
 
  - uncensored
  - decensored
  - abliterated
+ base_model:
+ - TheDrummer/Rocinante-XL-16B-v1
  ---
+ This is an abliterated ("decensored") version of **Rocinante-XL-16B-v1**, produced with P-E-W's [Heretic](https://github.com/p-e-w/heretic) (v1.2.0) abliteration engine, with [Self-Organizing Maps & Magnitude-Preserving Orthogonal Ablation](https://github.com/p-e-w/heretic/pull/196) enabled.
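Abliteration works by projecting a learned "refusal direction" out of selected weight matrices (here `attn.o_proj` and `mlp.down_proj`). A minimal NumPy sketch of the idea behind magnitude-preserving orthogonal ablation; this is an illustration under simplified assumptions, not Heretic's actual implementation, and the whole-matrix renorm step in particular is a guess at what "magnitude-preserving" means:

```python
import numpy as np

def ablate(W: np.ndarray, direction: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Remove the component of W's output space lying along `direction`,
    then restore the matrix's original Frobenius norm."""
    d = direction / np.linalg.norm(direction)   # unit refusal direction
    W_abl = W - scale * np.outer(d, d @ W)      # orthogonal projection at scale=1
    return W_abl * (np.linalg.norm(W) / np.linalg.norm(W_abl))
```

With `scale=1`, outputs of the ablated matrix have no component along `d`; the per-layer `max_weights`/`min_weights` parameters in the results below appear to modulate this strength across layers.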

+ ---
+ <p>
+ <img src="https://img.shields.io/badge/HERESY_INDEX-ABSOLUTE-white?style=flat-square&labelColor=101010" align="right" width="250">
+ <b>Heretication Results</b>
+ <br clear="right">
+ <img src="https://img.shields.io/badge/RENEGADE_CHAPTER-SOMPOA-FCC900?style=flat-square&labelColor=101010" align="right" width="300">
+ </p>
+ <br clear="right">
+
+ | Score Metric | Value | Parameter | Value |
+ | :--- | :--- | :--- | :--- |
+ | **Refusals** | 3/416 | **direction_index** | 22.20 |
+ | **KL Divergence** | 0.0182 | **attn.o_proj.max_weights.0** | 0: 1.26 |
+ | **Initial Refusals** | 339/416 | **attn.o_proj.max_weights.1** | 1: 0.64 |
+ | | | **attn.o_proj.max_weights.2** | 2: 1.41 |
+ | | | **attn.o_proj.max_weights.3** | 3: 0.94 |
+ | | | **attn.o_proj.max_weight_position** | 23.86 |
+ | | | **attn.o_proj.min_weights.0** | 0: 0.97 |
+ | | | **attn.o_proj.min_weights.1** | 1: 0.03 |
+ | | | **attn.o_proj.min_weights.2** | 2: 1.18 |
+ | | | **attn.o_proj.min_weights.3** | 3: 0.93 |
+ | | | **attn.o_proj.min_weight_distance** | 18.57 |
+ | | | **mlp.down_proj.max_weights.0** | 0: 1.23 |
+ | | | **mlp.down_proj.max_weights.1** | 1: 0.70 |
+ | | | **mlp.down_proj.max_weights.2** | 2: 1.35 |
+ | | | **mlp.down_proj.max_weights.3** | 3: 0.86 |
+ | | | **mlp.down_proj.max_weight_position** | 28.60 |
+ | | | **mlp.down_proj.min_weights.0** | 0: 0.37 |
+ | | | **mlp.down_proj.min_weights.1** | 1: 0.25 |
+ | | | **mlp.down_proj.min_weights.2** | 2: 1.01 |
+ | | | **mlp.down_proj.min_weights.3** | 3: 0.45 |
+ | | | **mlp.down_proj.min_weight_distance** | 5.96 |
+
+ ---
+ ## Degree of Heretication
+ The **Heresy Index** weighs the corruption the process inflicted on the model (KL divergence, PIQA, and manual response evaluation) against its abolition of doctrine (refusals) to reach a final classification verdict.
+
+ | Index Entry | Classification | Analysis |
+ | :--- | :--- | :--- |
+ | ![Absolute](https://img.shields.io/badge/HERESY_INDEX-ABSOLUTE-white?style=flat-square&labelColor=101010) | **Absolute Heresy** | Near-zero overt and secondary refusals with minimal to no model damage |
+ | ![Tainted](https://img.shields.io/badge/HERESY_INDEX-TAINTED-blueviolet?style=flat-square&labelColor=101010) | **Tainted Heresy** | Some residual secondary refusals and/or moderate model damage |
+ | ![Impotent](https://img.shields.io/badge/HERESY_INDEX-IMPOTENT-5c4033?style=flat-square&labelColor=101010) | **Impotent Heresy** | Lingering overt refusals and high model damage |
+
+ **Note**: This is an arbitrary, subjective classification inspired by Warhammer 40K; it is not a tangible indicator of the model's performance and is intended only as an at-a-glance summary.
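Mechanically, the verdict could be sketched like this; the thresholds and the `damage` labels are invented for illustration, since the actual index is a manual judgment call:

```python
def heresy_index(refusals: int, total: int, damage: str) -> str:
    """Toy Heresy Index classifier; all thresholds are illustrative only."""
    rate = refusals / total
    if rate <= 0.02 and damage == "minimal":
        return "Absolute Heresy"   # near-zero refusals, model intact
    if rate <= 0.25 and damage in ("minimal", "moderate"):
        return "Tainted Heresy"    # residual refusals or moderate damage
    return "Impotent Heresy"       # lingering refusals and high damage
```

This model's 3/416 refusals with minimal damage land it in the top tier.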
+
+ ---
+
+ **Appendix**
+
+ > Evaluations used an empty system prompt.
+
+ <details>
+ <summary>Heretication Rituals</summary>
+
+ ```
+ » [Trial 93] Refusals: 3/416, KL divergence: 0.0182
+ [Trial 159] Refusals: 4/416, KL divergence: 0.0141
+ [Trial 80] Refusals: 9/416, KL divergence: 0.0140
+ [Trial 174] Refusals: 10/416, KL divergence: 0.0140
+ [Trial 163] Refusals: 12/416, KL divergence: 0.0132
+ [Trial 118] Refusals: 15/416, KL divergence: 0.0121
+ [Trial 82] Refusals: 18/416, KL divergence: 0.0099
+ [Trial 169] Refusals: 22/416, KL divergence: 0.0095
+ [Trial 119] Refusals: 35/416, KL divergence: 0.0091
+ [Trial 96] Refusals: 40/416, KL divergence: 0.0084
+ [Trial 100] Refusals: 45/416, KL divergence: 0.0067
+ [Trial 109] Refusals: 67/416, KL divergence: 0.0066
+ [Trial 62] Refusals: 155/416, KL divergence: 0.0065
+ [Trial 151] Refusals: 157/416, KL divergence: 0.0065
+ [Trial 164] Refusals: 168/416, KL divergence: 0.0060
+ [Trial 127] Refusals: 195/416, KL divergence: 0.0048
+ [Trial 139] Refusals: 263/416, KL divergence: 0.0041
+ [Trial 32] Refusals: 267/416, KL divergence: 0.0030
+ [Trial 101] Refusals: 313/416, KL divergence: 0.0016
+ [Trial 63] Refusals: 317/416, KL divergence: 0.0015
+ [Trial 181] Refusals: 330/416, KL divergence: 0.0014
+ [Trial 13] Refusals: 332/416, KL divergence: 0.0014
+ [Trial 59] Refusals: 333/416, KL divergence: 0.0011
+ [Trial 54] Refusals: 339/416, KL divergence: 0.0008
+ ```
+
+ </details>
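Each trial above trades refusal count against KL divergence, so selecting candidates amounts to extracting a Pareto front over those two objectives (lower is better on both). A small sketch, assuming `(trial_id, refusals, kl)` tuples:

```python
def pareto_front(trials):
    """Return trials not dominated by any other trial, sorted by refusals.
    A trial is dominated if another is no worse on both axes and
    strictly better on at least one."""
    def dominated(t):
        return any(
            o[1] <= t[1] and o[2] <= t[2] and (o[1] < t[1] or o[2] < t[2])
            for o in trials
        )
    return sorted((t for t in trials if not dominated(t)), key=lambda t: t[1])

# A few entries from the log above: trial 151 drops out, beaten by trial 62.
sample = [(93, 3, 0.0182), (159, 4, 0.0141), (80, 9, 0.0140),
          (62, 155, 0.0065), (151, 157, 0.0065), (54, 339, 0.0008)]
```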
+
+ <details>
+ <summary>PIQA Benchmarks</summary>
+
+ ```
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA Base │ acc,none             │ 0.7900 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8020 │
+ │           │ acc_norm_stderr,none │ 0.0093 │
+ └───────────┴──────────────────────┴────────┘
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA T93  │ acc,none             │ 0.7900 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8030 │
+ │           │ acc_norm_stderr,none │ 0.0093 │
+ └───────────┴──────────────────────┴────────┘
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA T159 │ acc,none             │ 0.7878 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8047 │
+ │           │ acc_norm_stderr,none │ 0.0092 │
+ └───────────┴──────────────────────┴────────┘
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA T163 │ acc,none             │ 0.7884 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8036 │
+ │           │ acc_norm_stderr,none │ 0.0093 │
+ └───────────┴──────────────────────┴────────┘
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA T80  │ acc,none             │ 0.7884 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8020 │
+ │           │ acc_norm_stderr,none │ 0.0093 │
+ └───────────┴──────────────────────┴────────┘
+ ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
+ ┃ Benchmark ┃ Metric               ┃ Value  ┃
+ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
+ │ PIQA T174 │ acc,none             │ 0.7889 │
+ │           │ acc_stderr,none      │ 0.0095 │
+ │           │ acc_norm,none        │ 0.8014 │
+ │           │ acc_norm_stderr,none │ 0.0093 │
+ └───────────┴──────────────────────┴────────┘
+ ```
+
+ </details>
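The PIQA deltas between the base model and the ablated trials are far smaller than the reported standard errors. A quick significance sketch using the `acc_norm` values from the first two tables, treating the two runs as independent (which, on a shared eval set, overstates the variance):

```python
import math

def z_score(acc_a: float, se_a: float, acc_b: float, se_b: float) -> float:
    """Approximate z-score for the difference between two accuracy estimates."""
    return (acc_a - acc_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# PIQA Base acc_norm 0.8020 +/- 0.0093 vs Trial 93 acc_norm 0.8030 +/- 0.0093
z = z_score(0.8030, 0.0093, 0.8020, 0.0093)   # roughly 0.08, nowhere near 1.96
```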
+
+ ---
  Mistral v3 Tekken or Metharme.

  Can think via \<thinking\> or \<think\>