jie1888 and douyamv committed
Commit e76751e
0 Parent(s)

Duplicate from douyamv/Gemma-4-31B-JANG_4M-CRACK-GGUF

Co-authored-by: douyamv <douyamv@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,46 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00006-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00002-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00005-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00001-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00008-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00009-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00004-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00003-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
gemma-4-31b-jang-crack-Q8_0-00007-of-00009.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,86 @@
---
language:
- en
license: gemma
tags:
- gemma4
- gguf
- quantized
- 31b
base_model: google/gemma-4-31b-it
pipeline_tag: text-generation
---

# Gemma-4-31B-JANG_4M-CRACK-GGUF

GGUF quantizations of Gemma-4-31B-JANG_4M-CRACK for use with llama.cpp, LM Studio, Ollama, and other GGUF-compatible inference engines.

## About the Model

- **Base model:** [google/gemma-4-31b-it](https://huggingface.co/google/gemma-4-31b-it)
- **Architecture:** Gemma 4 dense transformer (31B parameters, 60 layers)
- **Features:** hybrid sliding/global attention, vision + audio multimodal
- **Modification:** CRACK abliteration (refusal removal) + JANG v2 mixed-precision quantization

## Why This Conversion?

The original model uses **JANG v2 mixed-precision MLX quantization** (8-bit attention + 4-bit MLP), which only vMLX can load. Standard tools (llama.cpp, LM Studio, oMLX, mlx-lm) cannot read the format because of its mixed per-layer bit widths.

This repository provides standard GGUF quantizations that work everywhere.

## Conversion Process

```
Original (JANG v2 MLX safetensors, ~18GB)
  ↓ dequantize (attention 8-bit → f16, MLP 4-bit → f16)
Intermediate (float16 safetensors, ~60GB)
  ↓ convert_hf_to_gguf.py + quantize
GGUF (various quantizations)
```
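The dequantize step can be sketched as follows. This is a minimal illustration only, assuming a per-group affine scheme (integer codes plus one scale and one bias per group of weights, as MLX-style quantizers use); the actual JANG v2 layout may differ.

```python
import numpy as np

def dequantize_group_affine(codes: np.ndarray, scales: np.ndarray,
                            biases: np.ndarray, group_size: int = 32) -> np.ndarray:
    """Recover approximate f16 weights from per-group affine quantization.

    codes  -- one integer code per weight
    scales -- one float scale per group of `group_size` weights
    biases -- one float bias per group
    """
    grouped = codes.reshape(-1, group_size).astype(np.float32)
    weights = grouped * scales[:, None] + biases[:, None]
    return weights.reshape(-1).astype(np.float16)

# Round-trip check on toy data: quantize 64 weights to 4-bit, then dequantize.
rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
groups = w.reshape(-1, 32)
lo, hi = groups.min(axis=1), groups.max(axis=1)
scales = (hi - lo) / 15.0                                   # 4-bit -> 16 levels
codes = np.round((groups - lo[:, None]) / scales[:, None]).astype(np.int32)
w_hat = dequantize_group_affine(codes.reshape(-1), scales, lo)
print(float(np.abs(w - w_hat.astype(np.float32)).max()))    # small reconstruction error
```

The reconstruction error is bounded by half a quantization step per group, which is why the f16 intermediate is an approximation of, not identical to, the pre-quantization weights.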

**Note:** Since the original was already quantized (avg 5.1 bits), the dequantized f16 intermediate is an approximation. Re-quantizing to GGUF introduces minimal additional quality loss since the attention layers were preserved at 8-bit in the original.

## Available Quantizations

| File | Quant | Size | Quality | Notes |
|------|-------|------|---------|-------|
| `gemma-4-31b-jang-crack-Q3_K_M.gguf` | Q3_K_M | ~14 GB | Acceptable | Minimum viable quality |
| `gemma-4-31b-jang-crack-Q4_K_M.gguf` | Q4_K_M | ~18 GB | Good | Best size/quality balance |
| `gemma-4-31b-jang-crack-Q5_K_M.gguf` | Q5_K_M | ~21 GB | Better | Recommended if RAM allows |
| `gemma-4-31b-jang-crack-Q6_K.gguf` | Q6_K | ~25 GB | Very Good | High quality |
| `gemma-4-31b-jang-crack-Q8_0.gguf` | Q8_0 | ~33 GB | Near lossless | Closest to original |

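The sizes above follow directly from bits-per-weight arithmetic. A quick sanity check, using rough average bits/weight figures for llama.cpp quant schemes (assumed values; the true averages depend on which tensors receive which sub-quant):

```python
PARAMS = 31e9  # 31B parameters

# Approximate average bits per weight for each scheme (assumed figures).
BPW = {"Q3_K_M": 3.9, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}

def est_size_gb(quant: str) -> float:
    """Estimated GGUF file size in decimal GB: params * bits/weight / 8."""
    return PARAMS * BPW[quant] / 8 / 1e9

for q in BPW:
    print(f"{q}: ~{est_size_gb(q):.0f} GB")
```

The estimates land within a GB or two of the table, with the remainder going to metadata and non-repeating tensors kept at higher precision.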
## System Requirements

| Quantization | Minimum RAM | Recommended |
|-------------|------------|-------------|
| Q3_K_M | 20 GB | 24 GB |
| Q4_K_M | 24 GB | 32 GB |
| Q5_K_M | 28 GB | 36 GB |
| Q6_K | 32 GB | 40 GB |
| Q8_0 | 40 GB | 48 GB |

## Usage

### LM Studio
Download any `.gguf` file and open it in LM Studio.

### llama.cpp
```bash
./llama-cli -m gemma-4-31b-jang-crack-Q4_K_M.gguf -p "Hello" -n 256
```

### Ollama
```bash
echo 'FROM ./gemma-4-31b-jang-crack-Q4_K_M.gguf' > Modelfile
ollama create gemma4-crack -f Modelfile
ollama run gemma4-crack
```

## License

[Gemma License](https://ai.google.dev/gemma/terms)

## Disclaimer

This model has had safety guardrails removed. Use responsibly and in compliance with applicable laws.
gemma-4-31b-jang-crack-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2def38f79f60769ee9d3135c723b7e2add6745ab5653292cfb90e040b4067503
size 15287102912
gemma-4-31b-jang-crack-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1fc8ee10f916da019dbb1d177854fa3b6421fbea1e93839a2061861308e1de7
size 18687057344
gemma-4-31b-jang-crack-Q8_0-00001-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3194e06834d6e4df8ad8b92701f4ccf26d1eef38b950262fcfd86bd2fa16aab0
size 3935089728
gemma-4-31b-jang-crack-Q8_0-00002-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3038a7dcb6f63544c613af743215327d9ccda8d73ce882e0ad5d22202f92a7c2
size 3942979136
gemma-4-31b-jang-crack-Q8_0-00003-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee63ba2abec0b3817322996305067aabc50360a43a1b68e0627a69e361a97261
size 3989767648
gemma-4-31b-jang-crack-Q8_0-00004-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:45308b26548d60f8b966f1382bfa9aee2fc94b2470337715e5de931e9cc8f34e
size 3884443136
gemma-4-31b-jang-crack-Q8_0-00005-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b26a9c315ffc7e155531092877204cc3ebaf461c5cb0d1cf91fe1f4ba8be96f
size 3972244096
gemma-4-31b-jang-crack-Q8_0-00006-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4fb201454d6561099fd02d7ae4d70f14edc6b17e232c3ddd363f5e6530c1ecd5
size 3960481088
gemma-4-31b-jang-crack-Q8_0-00007-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e294d61e9e911c9e7212f7be44e7b9d00a1202b0a445ea24d81e2d52d08be784
size 3884486336
gemma-4-31b-jang-crack-Q8_0-00008-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d8ccb562b088d99ed36515a924b62ffec28217036ac21733646840725b6a3177
size 3989767648
gemma-4-31b-jang-crack-Q8_0-00009-of-00009.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da198b3dadf0721bcba9d718759b7cb3f5cbd1bd6a01313c9f0b2ce207ab745e
size 1076412160