shreyan35 committed
Commit 0b047d6 · verified · 1 Parent(s): 7a89f26

Upload 2 files

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +92 -60
  3. banner.jpeg +3 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ banner.jpeg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,70 +1,102 @@
  ---
  license: mit
- base_model: shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7
  library_name: transformers
  tags:
- - gemma4
- - gemma
- - reasoning
- - claude-opus
- - distillation
- - full-finetune
- - llm
- - mlm
- - multimodal
- - video
- - text
- - audio
- - vision
- - llama-cpp
- - gguf-my-repo
  language:
- - en
  pipeline_tag: image-text-to-text
  model_name: gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7
  parameter_count: 30700000000
  ---

- # shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-Q8_0-GGUF
- This model was converted to GGUF format from [`shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7`](https://huggingface.co/shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7) for more details on the model.
-
- ## Use with llama.cpp
- Install llama.cpp through brew (works on Mac and Linux)
-
- ```bash
- brew install llama.cpp
-
- ```
- Invoke the llama.cpp server or the CLI.
-
- ### CLI:
- ```bash
- llama-cli --hf-repo shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-Q8_0-GGUF --hf-file gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf -p "The meaning to life and the universe is"
- ```
-
- ### Server:
- ```bash
- llama-server --hf-repo shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-Q8_0-GGUF --hf-file gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf -c 2048
- ```
-
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-
- Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```
-
- Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```
-
- Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-Q8_0-GGUF --hf-file gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-Q8_0-GGUF --hf-file gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf -c 2048
- ```
  ---
  license: mit
+ base_model:
+ - google/gemma-4-31B-it
  library_name: transformers
  tags:
+ - gemma4
+ - gemma
+ - reasoning
+ - claude-opus
+ - distillation
+ - full-finetune
+ - llm
+ - mlm
+ - multimodal
+ - video
+ - text
+ - audio
+ - vision
  language:
+ - en
  pipeline_tag: image-text-to-text
  model_name: gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7
  parameter_count: 30700000000
  ---

+ # gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7-multimodal
+ <div align="center">
+ <img src="https://huggingface.co/shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7/resolve/main/banner.jpeg" width="100%" alt="S7 Banner">
+ </div>
+
+ **_This release lists and tunes this finetune correctly for multimodality, making it substantially more capable._**
+
+ A full-parameter fine-tune of Gemma 4 31B on ~12,000 Claude Opus 4.6 reasoning traces. This is an indigenously made model.
+
+ ## Highlights
+
+ - **~90% token accuracy** after 4 epochs
+ - **Full parameter SFT**, not LoRA
+ - **12,000 pure Claude Opus 4.6 traces**: consistent reasoning style, no mixed-model data
+ - **Native Gemma 4 thinking format**: uses the standard built-in thinking tokens (see the usage sketch after this list)
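+
+ A minimal text-only usage sketch follows. It assumes the checkpoint loads through the generic `AutoProcessor` / `AutoModelForImageTextToText` classes in transformers and that the bundled chat template inserts the model's built-in thinking tokens; the repo id, dtype, and generation settings are illustrative, not an official recipe.
+
+ ```python
+ # Hedged sketch (not an official snippet): text-only chat with the thinking-format finetune.
+ import torch
+ from transformers import AutoProcessor, AutoModelForImageTextToText
+
+ model_id = "shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7"  # assumed repo id
+ processor = AutoProcessor.from_pretrained(model_id)
+ model = AutoModelForImageTextToText.from_pretrained(
+     model_id, torch_dtype=torch.bfloat16, device_map="auto"
+ )
+
+ messages = [
+     {"role": "user",
+      "content": [{"type": "text", "text": "Prove that the square root of 2 is irrational."}]},
+ ]
+
+ # The chat template is expected to add the generation prompt and thinking markers.
+ inputs = processor.apply_chat_template(
+     messages, add_generation_prompt=True, tokenize=True,
+     return_dict=True, return_tensors="pt",
+ ).to(model.device)
+
+ out = model.generate(**inputs, max_new_tokens=512)
+ # Decode only the newly generated turn: reasoning trace followed by the final answer.
+ print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
+ ```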
+
+ ## Performance
+
+ ### Reasoning & Knowledge
+
+ | Benchmark | S7 Score |
+ | :--- | :--- |
+ | MMLU Pro | 90.3% |
+ | GPQA Diamond | 89.4% |
+ | BigBench Extra Hard | 78.9% |
+ | MMMLU (Multilingual) | 93.7% |
+ | HLE (no tools) | 20.7% |
+ | HLE (with search) | 28.1% |
+
+ ### Mathematics & Coding
+
+ | Benchmark | S7 Score |
+ | :--- | :--- |
+ | AIME 2026 (no tools) | 94.6% |
+ | LiveCodeBench v6 | 84.8% |
+ | Codeforces Elo | 2279 |
+ | HumanEval | 96.7% |
+ | MBPP Plus | 94.0% |
+
+ ### Multimodal (Vision & Medical)
+
+ | Benchmark | S7 Score |
+ | :--- | :--- |
+ | MMMU Pro | 81.5% |
+ | MATH-Vision | 90.7% |
+ | MedXPertQA MM | 65.0% |
+
+ ### Agentic & Long Context
+
+ | Benchmark | S7 Score |
+ | :--- | :--- |
+ | τ²-bench (Average) | 81.5% |
+ | τ²-bench (Retail) | 91.6% |
+ | MRCR v2 (8-needle 128k) | 70.4% |
+
+ **Overall improvement: 6%**
+
+ ## Model Specifications
+
+ - **Parameters:** 30.7B (dense)
+ - **Architecture:** 60 layers
+ - **Context Window:** 256K tokens
+ - **Vocabulary Size:** 262,144
+ - **Native Modalities:** Text, image, video (frame sequences); see the image-inference sketch below
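+
+ Since image and video are listed as native modalities, here is a hedged sketch of single-image inference through the transformers `image-text-to-text` pipeline (matching this card's pipeline tag). The image URL and token budget are placeholders, and video input (as frame sequences) is not shown.
+
+ ```python
+ # Hedged sketch: image + text inference via the high-level pipeline API.
+ import torch
+ from transformers import pipeline
+
+ pipe = pipeline(
+     "image-text-to-text",
+     model="shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7",  # assumed repo id
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
+             {"type": "text", "text": "Describe the image, then reason about what is happening."},
+         ],
+     }
+ ]
+
+ out = pipe(text=messages, max_new_tokens=300)
+ print(out[0]["generated_text"][-1]["content"])  # assistant turn: description plus reasoning
+ ```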
+
+ ## Training Data (~12,000 samples)
+
+ ## Hardware Requirements
+
+ | Format | VRAM | Device |
+ |---|---|---|
+ | bf16 | ~65 GB | 1x A100/H100 80GB |
+ | Q8 | ~35 GB | 2x RTX 4090 |
+ | **Q4_K_M** | **~20 GB** | **RTX 4090** |
+ | Q3_K_M | ~15 GB | RTX 4080 |
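+
+ For the GGUF rows above, one way to run the Q8_0 file tracked in this repo's `.gitattributes` is `llama-cpp-python`. The repo id, context size, and GPU-offload settings below are assumptions; substitute a Q4_K_M or Q3_K_M file if VRAM is limited.
+
+ ```python
+ # Hedged sketch: chat completion against the Q8_0 GGUF via llama-cpp-python.
+ from llama_cpp import Llama
+
+ llm = Llama.from_pretrained(
+     repo_id="shreyan35/gemma-4-31B-Claude-4.6-Opus-thinking-distilled-s7",  # assumed location of the GGUF
+     filename="gemma-4-31b-claude-4.6-opus-thinking-distilled-s7-q8_0.gguf",
+     n_ctx=8192,       # per-session context; the card lists a 256K maximum
+     n_gpu_layers=-1,  # offload every layer if it fits (~35 GB for Q8 per the table)
+ )
+
+ resp = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Explain the birthday paradox in two sentences."}]
+ )
+ print(resp["choices"][0]["message"]["content"])
+ ```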
+
+ ## Credits
+
+ - **I would like to sincerely apologise to EGANAI: earlier I failed to properly accredit them. This model has been sourced from them and is a reupload.**
+
+ ## License
+
+ MIT
banner.jpeg ADDED

Git LFS Details

  • SHA256: c11b027a0c8091a92a833c75204b51cfa748ce80b7fd9dd158bc69468cd2e161
  • Pointer size: 131 Bytes
  • Size of remote file: 486 kB