DreamFast committed on
Commit 5bdce2a · verified · 1 Parent(s): 5d16c3d

Add Abliterlitics repo link

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -16,6 +16,8 @@ tags:
 
  # GLM-4.7-Flash: HauhauCS, Safetensors
 
+ > Forensic analysis by [Abliterlitics](https://github.com/dreamfast/abliterlitics) — open-source abliteration forensics toolkit
+
  This is the HauhauCS abliteration of [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash), converted from the BF16 GGUF release to native safetensors using [ungguf](https://github.com/dreamfast/ungguf).
 
  HauhauCS claims these are *"No changes to datasets or capabilities. Fully functional, 100% of what the original authors intended, just without the refusals"* and describes them as *"the best lossless uncensored models out there."*
@@ -304,7 +306,7 @@ Weight forensics reveal that HauhauCS used four stacked methods from the reaper-
  - **Safety:** [HarmBench](https://github.com/centerforaisafety/HarmBench) 400 textual behaviours, `max_tokens=2048, temperature=0.0`, `classify_harmbench.py` v3.0 with manual overrides, reviewed by GLM 5.1
  - **KL divergence:** `F.kl_div(logprobs_variant, logprobs_base, reduction="batchmean", log_target=True)` on full vocab first-token logits via `model.generate(max_new_tokens=1, output_scores=True)`, matching the [Heretic evaluator](https://github.com/p-e-w/heretic/blob/master/src/heretic/evaluator.py). Dataset: [mlabonne/harmless_alpaca](https://huggingface.co/datasets/mlabonne/harmless_alpaca) `test[:100]`, system prompt "You are a helpful assistant." Collected with BF16 dual-GPU inference (RTX 5090 + RTX 4090) with CPU offloading. Validated on single A100-80GB (no offload) for Heretic: KL=0.0115 vs 0.0110, confirming offload does not meaningfully distort results.
  - **CoT forensics:** Keyword-based analysis of 2,000 HarmBench reasoning chains (400 per model) captured via OpenAI-compatible API `reasoning` field. Patterns detected: safety deliberation, explicit refusal language, educational pivots, disclaimers.
- - **Weight analysis:** SVD, fingerprint, edit vector overlap, per-layer analysis, rank structure, and cross-technique alignment comparing all four abliteration variants against the base
+ - **Weight analysis:** SVD, fingerprint, edit vector overlap, per-layer analysis, rank structure, and cross-technique alignment comparing all four abliteration variants against the base, using [Abliterlitics](https://github.com/dreamfast/abliterlitics)
  - **Hardware:** RTX 5090 32GB + RTX 4090 24GB
 
  ## Forensic Notes
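
Review note on the **KL divergence** bullet in the second hunk: with `log_target=True` and `reduction="batchmean"`, `F.kl_div(logprobs_variant, logprobs_base, ...)` sums `exp(base) * (base - variant)` over the vocabulary and divides by the number of prompts, so it reports KL(base || variant) averaged per prompt. The sketch below reproduces that arithmetic in plain Python on made-up first-token logits, so the direction and normalisation can be checked by hand. It is not the README's evaluation script; the helper names `log_softmax` and `kl_batchmean` and the toy logits are assumptions introduced here for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vector of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_batchmean(logprobs_variant, logprobs_base):
    """KL(base || variant), summed over the vocabulary, averaged over prompts.

    Follows the arithmetic of F.kl_div(input, target, reduction="batchmean",
    log_target=True): pointwise exp(target) * (target - input), summed over
    all elements, then divided by the batch size.
    """
    total = 0.0
    for lp_var, lp_base in zip(logprobs_variant, logprobs_base):
        total += sum(math.exp(b) * (b - v) for v, b in zip(lp_var, lp_base))
    return total / len(logprobs_base)


# Hypothetical first-token logits: 2 prompts, vocabulary of 3 tokens.
base_logits = [[2.0, 0.5, -1.0], [0.0, 1.0, 0.0]]
variant_logits = [[1.8, 0.7, -1.0], [0.0, 1.0, 0.0]]  # only prompt 0 differs

logprobs_base = [log_softmax(row) for row in base_logits]
logprobs_variant = [log_softmax(row) for row in variant_logits]

kl = kl_batchmean(logprobs_variant, logprobs_base)
print(f"KL(base || variant), batchmean: {kl:.6f}")
```

Note the argument order: the variant's log-probabilities go first (the `input` slot) and the base model's go second (the `target` slot), so the base distribution is the reference that weights each term.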