Brooooooklyn committed
Commit 20c54f3 · verified · Parent(s): 19ce260

Add files using upload-large-folder tool

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,121 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - zh
+ base_model: Qwen/Qwen3.6-27B
+ tags:
+ - mlx
+ - mlx-node
+ - quantized
+ - awq
+ - 6-bit
+ - qwen3.6
+ - hybrid-attention
+ - gated-delta-net
+ - apple-silicon
+ - unsloth-dynamic
+ library_name: mlx-node
+ quantized_by: mlx-node
+ pipeline_tag: text-generation
+ model_type: qwen3_5
+ ---
+
+ # Qwen3.6-27B — UD-Q6_K_XL (mlx-node)
+
+ 6-bit base mixed-precision quantization of [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B) for Apple Silicon, using the [**Unsloth Dynamic** quantization strategy](https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks) via [mlx-node](https://github.com/mlx-node/mlx-node).
+
+ | | Original (BF16) | This Model |
+ |---|---|---|
+ | **Size** | ~51 GB | **27 GB** |
+ | **Format** | SafeTensors (sharded) | SafeTensors (sharded) |
+ | **Precision** | BF16 uniform | Mixed 6-bit + BF16 |
+
+ ## All Variants
+
+ | Repo | GGUF Equivalent | Size | Decode (tok/s) | Speedup vs BF16 |
+ |---|---|---|---|---|
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q2_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q2_K_XL-mlx) | UD-Q2_K_XL | 15 GB | 18.6 | 3.32x |
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q3_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q3_K_XL-mlx) | UD-Q3_K_XL | 18 GB | 15.5 | 2.77x |
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q4_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q4_K_XL-mlx) | UD-Q4_K_XL | 21 GB | 13.9 | 2.48x |
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q5_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q5_K_XL-mlx) | UD-Q5_K_XL | 25 GB | 12.0 | 2.14x |
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q6_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q6_K_XL-mlx) | UD-Q6_K_XL | 27 GB | 10.8 | 1.93x |
+ | [Brooooooklyn/Qwen3.6-27B-UD-Q8_K_XL-mlx](https://huggingface.co/Brooooooklyn/Qwen3.6-27B-UD-Q8_K_XL-mlx) | UD-Q8_K_XL | 30 GB | 9.9 | 1.77x |
+
+ Benchmarked on Apple M3 Max (128 GB) via `examples/lm.ts` (Turn 4 steady-state decode).
+
+ ## Performance
+
+ | Model | Size | Decode (tok/s) | Speedup |
+ |---|---|---|---|
+ | BF16 (unquantized) | 51 GB | 5.6 | baseline |
+ | **This model (UD-Q6_K_XL)** | **27 GB** | **10.8** | **1.93x faster** |
+
+ Decode is memory-bandwidth bound on Apple Silicon: fewer bytes per token translate directly into higher throughput. The hybrid architecture interleaves linear attention (gated delta net, 48 of 64 layers) with full attention (16 of 64 layers).
+
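+ As a rough sanity check on the bandwidth argument, the decode ceiling can be sketched as bandwidth divided by model size. This is a back-of-envelope illustration; the ~400 GB/s effective-bandwidth figure for the M3 Max is an assumption, not a measurement from this repo.
+
+ ```typescript
+ // Decode is bandwidth-bound: each generated token streams the full
+ // weight set from unified memory once, so throughput is capped at
+ // bandwidth / model size. (Bandwidth figure below is an assumption.)
+ function decodeCeiling(modelGB: number, bandwidthGBs: number): number {
+   return bandwidthGBs / modelGB; // upper bound, tokens per second
+ }
+
+ const bf16Ceiling = decodeCeiling(51, 400); // ~7.8 tok/s ceiling
+ const q6Ceiling = decodeCeiling(27, 400);   // ~14.8 tok/s ceiling
+
+ // The predicted speedup is just the size ratio, bandwidth cancels out:
+ const predictedSpeedup = 51 / 27; // ~1.89x, near the measured 1.93x
+ ```
+
+ The measured 10.8 tok/s sits below the ceiling, as expected once KV-cache traffic and kernel overhead are counted.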
+ ## Per-Tensor Bit Assignments (N=6)
+
+ | Weight | Bits | Rationale |
+ |---|---|---|
+ | `embed_tokens` | 8-bit | KLD ~0.15 — very low sensitivity |
+ | `lm_head` | 8-bit | KLD ~0.05 — safest tensor |
+ | `self_attn.q/k/v_proj` | 8-bit + AWQ | KLD ~1.5–2.9, AWQ via layernorm |
+ | `linear_attn.in_proj_qkv/z` | 8-bit + AWQ | KLD ~2.9, AWQ via layernorm |
+ | `self_attn.o_proj` | **bf16** | NOT AWQ-correctable |
+ | `linear_attn.out_proj` | **bf16** | KLD ~6.0 — worst tensor |
+ | `down_proj` | 8-bit | "Slightly more sensitive" (snap N+1=7 → 8) |
+ | `gate_proj`, `up_proj` | 6-bit | base bits |
+ | GDN params (A_log, etc.) | **bf16** | State-space dynamics |
+
+ ## Quantization Strategy
+
+ Based on [Unsloth Dynamic 2.0](https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks) per-tensor KLD analysis. Sensitive layers get higher bits with AWQ correction, while the bulk of FFN weights are aggressively quantized. imatrix AWQ pre-scaling amplifies important weight channels and fuses inverse scales into preceding layer norms (zero inference overhead).
+
+ **AWQ-correctable** projections (q/k/v, in_proj_qkv/z) are quantized at 8-bit via `input_layernorm`. **Non-AWQ-correctable** projections (o_proj, out_proj) are kept at bf16 — their inputs come from attention/GDN computation, not from a norm layer.
+
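+ The scale-fusion trick can be illustrated with a tiny full-precision sketch (hypothetical numbers; real AWQ derives the per-channel scales from an importance matrix). Scaling the weight columns by `s` while dividing the preceding norm gain by `s` leaves the layer output unchanged, so only the rescaled weight has to survive quantization:
+
+ ```typescript
+ // Dense mat-vec product over plain arrays.
+ function matVec(W: number[][], x: number[]): number[] {
+   return W.map(row => row.reduce((acc, w, j) => acc + w * x[j], 0));
+ }
+
+ // Hypothetical per-channel AWQ scales and RMSNorm gain (normalization
+ // step omitted; only the gain matters for the fusion identity).
+ const s = [2, 0.5, 1.5];
+ const g = [1.0, 1.1, 0.9];
+ const W = [[0.3, -0.2, 0.5], [0.1, 0.4, -0.6]];
+ const x = [0.7, -1.2, 0.4];
+
+ // Original path: y = W · (g ⊙ x)
+ const y0 = matVec(W, x.map((v, i) => v * g[i]));
+
+ // AWQ path: amplify W's columns by s, fold 1/s into the norm gain.
+ const WScaled = W.map(row => row.map((w, j) => w * s[j]));
+ const gFused = g.map((v, i) => v / s[i]);
+ const y1 = matVec(WScaled, x.map((v, i) => v * gFused[i]));
+ // y1 equals y0 up to float error; only WScaled gets quantized,
+ // and the fused gain adds no work at inference time.
+ ```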
+ ## Architecture
+
+ | Parameter | Value |
+ |---|---|
+ | Total parameters | 27.4B (dense — all active) |
+ | Hidden size | 5,120 |
+ | Layers | 64 (48 linear + 16 full attention) |
+ | Attention heads | 24 (4 KV heads, GQA 6:1) |
+ | Head dimension | 256 |
+ | Intermediate size | 17,408 |
+ | Vocab size | 248,320 |
+ | Max context | 262,144 tokens |
+
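+ One practical consequence of the hybrid layout: only the 16 full-attention layers grow a KV cache with sequence length, while the 48 gated-delta-net layers carry a fixed-size recurrent state. A sketch of the per-token KV cost, assuming a bf16 cache (2 bytes per element):
+
+ ```typescript
+ // KV-cache bytes appended per generated token; only the full-attention
+ // layers contribute, each storing K and V per KV head.
+ function kvBytesPerToken(fullAttnLayers: number, kvHeads: number,
+                          headDim: number, bytesPerElem: number): number {
+   return fullAttnLayers * 2 /* K and V */ * kvHeads * headDim * bytesPerElem;
+ }
+
+ const perToken = kvBytesPerToken(16, 4, 256, 2); // 65,536 bytes/token
+ const gibAtMaxContext = perToken * 262_144 / 2 ** 30; // 16 GiB at full context
+ ```
+
+ So a fully populated 262,144-token context would add roughly 16 GiB of bf16 KV cache on top of the weights, which is why the fixed-state GDN layers matter for long-context memory use.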
+ ## Usage
+
+ ```typescript
+ import { loadSession } from '@mlx-node/lm';
+
+ const session = await loadSession('./Qwen3.6-27B-UD-Q6_K_XL-mlx');
+
+ for await (const event of session.sendStream('Explain the hybrid attention mechanism in Qwen3.6.', {
+   config: { maxNewTokens: 2048, temperature: 0.6, reasoningEffort: 'low' },
+ })) {
+   if (!event.done) process.stdout.write(event.text);
+ }
+ ```
+
+ ## How It Was Made
+
+ ```bash
+ mlx convert \
+   -i Qwen3.6-27B \
+   -o Qwen3.6-27B-UD-Q6_K_XL-mlx \
+   -q --q-bits 6 --q-recipe unsloth \
+   --imatrix-path imatrix_unsloth.gguf
+ ```
+
+ ## Acknowledgments
+
+ - **[Unsloth](https://unsloth.ai)** — Quantization strategy based on their [per-layer KLD benchmarks](https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks) and Dynamic 2.0 methodology
+ - **[Qwen Team](https://huggingface.co/Qwen)** — For the Qwen3.6 model family
+ - **[Apple MLX](https://github.com/ml-explore/mlx)** — For the Metal-accelerated ML framework
+
+ ## License
+
+ [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) (inherited from base model).
chat_template.jinja ADDED
@@ -0,0 +1,154 @@
+ {%- set image_count = namespace(value=0) %}
+ {%- set video_count = namespace(value=0) %}
+ {%- macro render_content(content, do_vision_count, is_system_content=false) %}
+ {%- if content is string %}
+ {{- content }}
+ {%- elif content is iterable and content is not mapping %}
+ {%- for item in content %}
+ {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
+ {%- if is_system_content %}
+ {{- raise_exception('System message cannot contain images.') }}
+ {%- endif %}
+ {%- if do_vision_count %}
+ {%- set image_count.value = image_count.value + 1 %}
+ {%- endif %}
+ {%- if add_vision_id %}
+ {{- 'Picture ' ~ image_count.value ~ ': ' }}
+ {%- endif %}
+ {{- '<|vision_start|><|image_pad|><|vision_end|>' }}
+ {%- elif 'video' in item or item.type == 'video' %}
+ {%- if is_system_content %}
+ {{- raise_exception('System message cannot contain videos.') }}
+ {%- endif %}
+ {%- if do_vision_count %}
+ {%- set video_count.value = video_count.value + 1 %}
+ {%- endif %}
+ {%- if add_vision_id %}
+ {{- 'Video ' ~ video_count.value ~ ': ' }}
+ {%- endif %}
+ {{- '<|vision_start|><|video_pad|><|vision_end|>' }}
+ {%- elif 'text' in item %}
+ {{- item.text }}
+ {%- else %}
+ {{- raise_exception('Unexpected item type in content.') }}
+ {%- endif %}
+ {%- endfor %}
+ {%- elif content is none or content is undefined %}
+ {{- '' }}
+ {%- else %}
+ {{- raise_exception('Unexpected content type.') }}
+ {%- endif %}
+ {%- endmacro %}
+ {%- if not messages %}
+ {{- raise_exception('No messages provided.') }}
+ {%- endif %}
+ {%- if tools and tools is iterable and tools is not mapping %}
+ {{- '<|im_start|>system\n' }}
+ {{- "# Tools\n\nYou have access to the following functions:\n\n<tools>" }}
+ {%- for tool in tools %}
+ {{- "\n" }}
+ {{- tool | tojson }}
+ {%- endfor %}
+ {{- "\n</tools>" }}
+ {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
+ {%- if messages[0].role == 'system' %}
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
+ {%- if content %}
+ {{- '\n\n' + content }}
+ {%- endif %}
+ {%- endif %}
+ {{- '<|im_end|>\n' }}
+ {%- else %}
+ {%- if messages[0].role == 'system' %}
+ {%- set content = render_content(messages[0].content, false, true)|trim %}
+ {{- '<|im_start|>system\n' + content + '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+ {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+ {%- for message in messages[::-1] %}
+ {%- set index = (messages|length - 1) - loop.index0 %}
+ {%- if ns.multi_step_tool and message.role == "user" %}
+ {%- set content = render_content(message.content, false)|trim %}
+ {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
+ {%- set ns.multi_step_tool = false %}
+ {%- set ns.last_query_index = index %}
+ {%- endif %}
+ {%- endif %}
+ {%- endfor %}
+ {%- if ns.multi_step_tool %}
+ {{- raise_exception('No user query found in messages.') }}
+ {%- endif %}
+ {%- for message in messages %}
+ {%- set content = render_content(message.content, true)|trim %}
+ {%- if message.role == "system" %}
+ {%- if not loop.first %}
+ {{- raise_exception('System message must be at the beginning.') }}
+ {%- endif %}
+ {%- elif message.role == "user" %}
+ {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
+ {%- elif message.role == "assistant" %}
+ {%- set reasoning_content = '' %}
+ {%- if message.reasoning_content is string %}
+ {%- set reasoning_content = message.reasoning_content %}
+ {%- else %}
+ {%- if '</think>' in content %}
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+ {%- endif %}
+ {%- endif %}
+ {%- set reasoning_content = reasoning_content|trim %}
+ {%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content + '\n</think>\n\n' + content }}
+ {%- else %}
+ {{- '<|im_start|>' + message.role + '\n' + content }}
+ {%- endif %}
+ {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}
+ {%- for tool_call in message.tool_calls %}
+ {%- if tool_call.function is defined %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {%- if loop.first %}
+ {%- if content|trim %}
+ {{- '\n\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+ {%- else %}
+ {{- '<tool_call>\n<function=' + tool_call.name + '>\n' }}
+ {%- endif %}
+ {%- else %}
+ {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
+ {%- endif %}
+ {%- if tool_call.arguments is defined %}
+ {%- for args_name, args_value in tool_call.arguments|items %}
+ {{- '<parameter=' + args_name + '>\n' }}
+ {%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %}
+ {{- args_value }}
+ {{- '\n</parameter>\n' }}
+ {%- endfor %}
+ {%- endif %}
+ {{- '</function>\n</tool_call>' }}
+ {%- endfor %}
+ {%- endif %}
+ {{- '<|im_end|>\n' }}
+ {%- elif message.role == "tool" %}
+ {%- if loop.previtem and loop.previtem.role != "tool" %}
+ {{- '<|im_start|>user' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- content }}
+ {{- '\n</tool_response>' }}
+ {%- if not loop.last and loop.nextitem.role != "tool" %}
+ {{- '<|im_end|>\n' }}
+ {%- elif loop.last %}
+ {{- '<|im_end|>\n' }}
+ {%- endif %}
+ {%- else %}
+ {{- raise_exception('Unexpected message role.') }}
+ {%- endif %}
+ {%- endfor %}
+ {%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- if enable_thinking is defined and enable_thinking is false %}
+ {{- '<think>\n\n</think>\n\n' }}
+ {%- else %}
+ {{- '<think>\n' }}
+ {%- endif %}
+ {%- endif %}
config.json ADDED
@@ -0,0 +1,2250 @@
+ {
+ "architectures": [
+ "Qwen3_5ForConditionalGeneration"
+ ],
+ "image_token_id": 248056,
+ "language_model_only": false,
+ "model_type": "qwen3_5",
+ "quantization": {
+ "group_size": 64,
+ "bits": 6,
+ "mode": "affine",
+ "language_model.model.layers.22.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.49.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.15.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.46.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.33.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.12.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.14.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.8.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.9.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.9.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.30.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.44.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.62.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.63.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.24.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.62.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.59.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.18.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.39.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.48.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.5.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.30.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.56.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.47.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.52.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.60.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.56.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.56.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.2.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.35.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.13.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.41.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.1.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.25.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.47.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.50.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.3.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.6.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.38.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.61.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.8.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.39.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.41.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.36.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.51.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.21.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.30.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.15.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.59.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.43.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.3.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.28.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.11.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.22.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.37.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.5.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.50.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.29.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.35.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.33.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.50.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.25.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.20.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.59.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.0.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.7.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.10.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.49.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.24.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.26.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.34.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.39.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.0.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.49.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.37.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.13.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.20.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.40.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.60.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.54.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.51.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.61.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.16.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.39.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.48.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.58.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.19.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.21.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.25.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.32.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.17.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.57.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.41.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.1.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.51.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.34.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.17.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.35.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.14.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.3.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.7.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.12.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.11.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.63.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.14.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.31.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.31.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.27.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.29.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.63.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.2.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.6.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.29.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.53.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.35.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.44.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
590
+ "mode": "affine"
591
+ },
592
+ "language_model.model.layers.28.linear_attn.in_proj_z": {
593
+ "bits": 8,
594
+ "group_size": 64,
595
+ "mode": "affine"
596
+ },
597
+ "language_model.model.layers.3.self_attn.q_proj": {
598
+ "bits": 8,
599
+ "group_size": 64,
600
+ "mode": "affine"
601
+ },
602
+ "language_model.model.layers.11.self_attn.q_proj": {
603
+ "bits": 8,
604
+ "group_size": 64,
605
+ "mode": "affine"
606
+ },
607
+ "language_model.model.layers.32.linear_attn.in_proj_qkv": {
608
+ "bits": 8,
609
+ "group_size": 64,
610
+ "mode": "affine"
611
+ },
612
+ "language_model.model.layers.6.mlp.down_proj": {
613
+ "bits": 8,
614
+ "group_size": 64,
615
+ "mode": "affine"
616
+ },
617
+ "language_model.model.layers.55.self_attn.q_proj": {
618
+ "bits": 8,
619
+ "group_size": 64,
620
+ "mode": "affine"
621
+ },
622
+ "language_model.model.layers.36.mlp.down_proj": {
623
+ "bits": 8,
624
+ "group_size": 64,
625
+ "mode": "affine"
626
+ },
627
+ "language_model.model.layers.53.linear_attn.in_proj_qkv": {
628
+ "bits": 8,
629
+ "group_size": 64,
630
+ "mode": "affine"
631
+ },
632
+ "language_model.model.layers.33.linear_attn.in_proj_qkv": {
633
+ "bits": 8,
634
+ "group_size": 64,
635
+ "mode": "affine"
636
+ },
637
+ "language_model.model.layers.55.self_attn.v_proj": {
638
+ "bits": 8,
639
+ "group_size": 64,
640
+ "mode": "affine"
641
+ },
642
+ "language_model.model.layers.32.linear_attn.in_proj_z": {
643
+ "bits": 8,
644
+ "group_size": 64,
645
+ "mode": "affine"
646
+ },
647
+ "language_model.model.layers.16.linear_attn.in_proj_z": {
648
+ "bits": 8,
649
+ "group_size": 64,
650
+ "mode": "affine"
651
+ },
652
+ "language_model.model.layers.23.self_attn.q_proj": {
653
+ "bits": 8,
654
+ "group_size": 64,
655
+ "mode": "affine"
656
+ },
657
+ "language_model.model.embed_tokens": {
658
+ "bits": 8,
659
+ "group_size": 64,
660
+ "mode": "affine"
661
+ },
662
+ "language_model.model.layers.23.self_attn.k_proj": {
663
+ "bits": 8,
664
+ "group_size": 64,
665
+ "mode": "affine"
666
+ },
667
+ "language_model.model.layers.42.linear_attn.in_proj_z": {
668
+ "bits": 8,
669
+ "group_size": 64,
670
+ "mode": "affine"
671
+ },
672
+ "language_model.model.layers.40.mlp.down_proj": {
673
+ "bits": 8,
674
+ "group_size": 64,
675
+ "mode": "affine"
676
+ },
677
+ "language_model.model.layers.45.linear_attn.in_proj_z": {
678
+ "bits": 8,
679
+ "group_size": 64,
680
+ "mode": "affine"
681
+ },
682
+ "language_model.model.layers.23.self_attn.v_proj": {
683
+ "bits": 8,
684
+ "group_size": 64,
685
+ "mode": "affine"
686
+ },
687
+ "language_model.model.layers.23.mlp.down_proj": {
688
+ "bits": 8,
689
+ "group_size": 64,
690
+ "mode": "affine"
691
+ },
692
+ "language_model.model.layers.45.mlp.down_proj": {
693
+ "bits": 8,
694
+ "group_size": 64,
695
+ "mode": "affine"
696
+ },
697
+ "language_model.model.layers.57.linear_attn.in_proj_qkv": {
698
+ "bits": 8,
699
+ "group_size": 64,
700
+ "mode": "affine"
701
+ },
702
+ "language_model.model.layers.27.self_attn.q_proj": {
703
+ "bits": 8,
704
+ "group_size": 64,
705
+ "mode": "affine"
706
+ },
707
+ "language_model.model.layers.55.self_attn.k_proj": {
708
+ "bits": 8,
709
+ "group_size": 64,
710
+ "mode": "affine"
711
+ },
712
+ "language_model.model.layers.31.self_attn.k_proj": {
713
+ "bits": 8,
714
+ "group_size": 64,
715
+ "mode": "affine"
716
+ },
717
+ "language_model.model.layers.52.linear_attn.in_proj_z": {
718
+ "bits": 8,
719
+ "group_size": 64,
720
+ "mode": "affine"
721
+ },
722
+ "language_model.model.layers.47.self_attn.q_proj": {
723
+ "bits": 8,
724
+ "group_size": 64,
725
+ "mode": "affine"
726
+ },
727
+ "language_model.model.layers.2.mlp.down_proj": {
728
+ "bits": 8,
729
+ "group_size": 64,
730
+ "mode": "affine"
731
+ },
732
+ "language_model.model.layers.4.linear_attn.in_proj_z": {
733
+ "bits": 8,
734
+ "group_size": 64,
735
+ "mode": "affine"
736
+ },
737
+ "language_model.model.layers.9.linear_attn.in_proj_z": {
738
+ "bits": 8,
739
+ "group_size": 64,
740
+ "mode": "affine"
741
+ },
742
+ "language_model.model.layers.11.mlp.down_proj": {
743
+ "bits": 8,
744
+ "group_size": 64,
745
+ "mode": "affine"
746
+ },
747
+ "language_model.model.layers.19.self_attn.q_proj": {
748
+ "bits": 8,
749
+ "group_size": 64,
750
+ "mode": "affine"
751
+ },
752
+ "language_model.model.layers.61.linear_attn.in_proj_qkv": {
753
+ "bits": 8,
754
+ "group_size": 64,
755
+ "mode": "affine"
756
+ },
757
+ "language_model.model.layers.16.mlp.down_proj": {
758
+ "bits": 8,
759
+ "group_size": 64,
760
+ "mode": "affine"
761
+ },
762
+ "language_model.model.layers.24.linear_attn.in_proj_qkv": {
763
+ "bits": 8,
764
+ "group_size": 64,
765
+ "mode": "affine"
766
+ },
767
+ "language_model.model.layers.58.linear_attn.in_proj_qkv": {
768
+ "bits": 8,
769
+ "group_size": 64,
770
+ "mode": "affine"
771
+ },
772
+ "language_model.model.layers.10.linear_attn.in_proj_qkv": {
773
+ "bits": 8,
774
+ "group_size": 64,
775
+ "mode": "affine"
776
+ },
777
+ "language_model.model.layers.46.linear_attn.in_proj_z": {
778
+ "bits": 8,
779
+ "group_size": 64,
780
+ "mode": "affine"
781
+ },
782
+ "language_model.model.layers.57.mlp.down_proj": {
783
+ "bits": 8,
784
+ "group_size": 64,
785
+ "mode": "affine"
786
+ },
787
+ "language_model.model.layers.5.mlp.down_proj": {
788
+ "bits": 8,
789
+ "group_size": 64,
790
+ "mode": "affine"
791
+ },
792
+ "language_model.model.layers.48.linear_attn.in_proj_z": {
793
+ "bits": 8,
794
+ "group_size": 64,
795
+ "mode": "affine"
796
+ },
797
+ "language_model.model.layers.58.linear_attn.in_proj_z": {
798
+ "bits": 8,
799
+ "group_size": 64,
800
+ "mode": "affine"
801
+ },
802
+ "language_model.model.layers.26.linear_attn.in_proj_qkv": {
803
+ "bits": 8,
804
+ "group_size": 64,
805
+ "mode": "affine"
806
+ },
807
+ "language_model.model.layers.62.linear_attn.in_proj_qkv": {
808
+ "bits": 8,
809
+ "group_size": 64,
810
+ "mode": "affine"
811
+ },
812
+ "language_model.model.layers.60.linear_attn.in_proj_qkv": {
813
+ "bits": 8,
814
+ "group_size": 64,
815
+ "mode": "affine"
816
+ },
817
+ "language_model.model.layers.44.linear_attn.in_proj_qkv": {
818
+ "bits": 8,
819
+ "group_size": 64,
820
+ "mode": "affine"
821
+ },
822
+ "language_model.model.layers.15.mlp.down_proj": {
823
+ "bits": 8,
824
+ "group_size": 64,
825
+ "mode": "affine"
826
+ },
827
+ "language_model.model.layers.55.mlp.down_proj": {
828
+ "bits": 8,
829
+ "group_size": 64,
830
+ "mode": "affine"
831
+ },
832
+ "language_model.model.layers.43.self_attn.v_proj": {
833
+ "bits": 8,
834
+ "group_size": 64,
835
+ "mode": "affine"
836
+ },
837
+ "language_model.model.layers.37.linear_attn.in_proj_qkv": {
838
+ "bits": 8,
839
+ "group_size": 64,
840
+ "mode": "affine"
841
+ },
842
+ "language_model.model.layers.52.linear_attn.in_proj_qkv": {
843
+ "bits": 8,
844
+ "group_size": 64,
845
+ "mode": "affine"
846
+ },
847
+ "language_model.model.layers.10.mlp.down_proj": {
848
+ "bits": 8,
849
+ "group_size": 64,
850
+ "mode": "affine"
851
+ },
852
+ "language_model.model.layers.46.linear_attn.in_proj_qkv": {
853
+ "bits": 8,
854
+ "group_size": 64,
855
+ "mode": "affine"
856
+ },
857
+ "language_model.model.layers.19.self_attn.v_proj": {
858
+ "bits": 8,
859
+ "group_size": 64,
860
+ "mode": "affine"
861
+ },
862
+ "language_model.model.layers.54.linear_attn.in_proj_qkv": {
863
+ "bits": 8,
864
+ "group_size": 64,
865
+ "mode": "affine"
866
+ },
867
+ "language_model.model.layers.8.mlp.down_proj": {
868
+ "bits": 8,
869
+ "group_size": 64,
870
+ "mode": "affine"
871
+ },
872
+ "language_model.model.layers.4.mlp.down_proj": {
873
+ "bits": 8,
874
+ "group_size": 64,
875
+ "mode": "affine"
876
+ },
877
+ "language_model.model.layers.18.mlp.down_proj": {
878
+ "bits": 8,
879
+ "group_size": 64,
880
+ "mode": "affine"
881
+ },
882
+ "language_model.model.layers.0.linear_attn.in_proj_qkv": {
883
+ "bits": 8,
884
+ "group_size": 64,
885
+ "mode": "affine"
886
+ },
887
+ "language_model.model.layers.21.linear_attn.in_proj_z": {
888
+ "bits": 8,
889
+ "group_size": 64,
890
+ "mode": "affine"
891
+ },
892
+ "language_model.model.layers.34.linear_attn.in_proj_qkv": {
893
+ "bits": 8,
894
+ "group_size": 64,
895
+ "mode": "affine"
896
+ },
897
+ "language_model.model.layers.27.mlp.down_proj": {
898
+ "bits": 8,
899
+ "group_size": 64,
900
+ "mode": "affine"
901
+ },
902
+ "language_model.model.layers.27.self_attn.v_proj": {
903
+ "bits": 8,
904
+ "group_size": 64,
905
+ "mode": "affine"
906
+ },
907
+ "language_model.model.layers.7.mlp.down_proj": {
908
+ "bits": 8,
909
+ "group_size": 64,
910
+ "mode": "affine"
911
+ },
912
+ "language_model.lm_head": {
913
+ "bits": 8,
914
+ "group_size": 64,
915
+ "mode": "affine"
916
+ },
917
+ "language_model.model.layers.40.linear_attn.in_proj_z": {
918
+ "bits": 8,
919
+ "group_size": 64,
920
+ "mode": "affine"
921
+ },
922
+ "language_model.model.layers.1.mlp.down_proj": {
923
+ "bits": 8,
924
+ "group_size": 64,
925
+ "mode": "affine"
926
+ },
927
+ "language_model.model.layers.13.linear_attn.in_proj_z": {
928
+ "bits": 8,
929
+ "group_size": 64,
930
+ "mode": "affine"
931
+ },
932
+ "language_model.model.layers.15.self_attn.q_proj": {
933
+ "bits": 8,
934
+ "group_size": 64,
935
+ "mode": "affine"
936
+ },
937
+ "language_model.model.layers.12.linear_attn.in_proj_z": {
938
+ "bits": 8,
939
+ "group_size": 64,
940
+ "mode": "affine"
941
+ },
942
+ "language_model.model.layers.4.linear_attn.in_proj_qkv": {
943
+ "bits": 8,
944
+ "group_size": 64,
945
+ "mode": "affine"
946
+ },
947
+ "language_model.model.layers.42.linear_attn.in_proj_qkv": {
948
+ "bits": 8,
949
+ "group_size": 64,
950
+ "mode": "affine"
951
+ },
952
+ "language_model.model.layers.43.self_attn.k_proj": {
953
+ "bits": 8,
954
+ "group_size": 64,
955
+ "mode": "affine"
956
+ },
957
+ "language_model.model.layers.51.self_attn.k_proj": {
958
+ "bits": 8,
959
+ "group_size": 64,
960
+ "mode": "affine"
961
+ },
962
+ "language_model.model.layers.36.linear_attn.in_proj_z": {
963
+ "bits": 8,
964
+ "group_size": 64,
965
+ "mode": "affine"
966
+ },
967
+ "language_model.model.layers.59.self_attn.k_proj": {
968
+ "bits": 8,
969
+ "group_size": 64,
970
+ "mode": "affine"
971
+ },
972
+ "language_model.model.layers.31.self_attn.q_proj": {
973
+ "bits": 8,
974
+ "group_size": 64,
975
+ "mode": "affine"
976
+ },
977
+ "language_model.model.layers.7.self_attn.q_proj": {
978
+ "bits": 8,
979
+ "group_size": 64,
980
+ "mode": "affine"
981
+ },
982
+ "language_model.model.layers.45.linear_attn.in_proj_qkv": {
983
+ "bits": 8,
984
+ "group_size": 64,
985
+ "mode": "affine"
986
+ },
987
+ "language_model.model.layers.19.self_attn.k_proj": {
988
+ "bits": 8,
989
+ "group_size": 64,
990
+ "mode": "affine"
991
+ },
992
+ "language_model.model.layers.38.linear_attn.in_proj_qkv": {
993
+ "bits": 8,
994
+ "group_size": 64,
995
+ "mode": "affine"
996
+ },
997
+ "language_model.model.layers.38.linear_attn.in_proj_z": {
998
+ "bits": 8,
999
+ "group_size": 64,
1000
+ "mode": "affine"
1001
+ },
1002
+ "language_model.model.layers.63.self_attn.v_proj": {
1003
+ "bits": 8,
1004
+ "group_size": 64,
1005
+ "mode": "affine"
1006
+ },
1007
+ "language_model.model.layers.53.mlp.down_proj": {
1008
+ "bits": 8,
1009
+ "group_size": 64,
1010
+ "mode": "affine"
1011
+ },
1012
+ "language_model.model.layers.42.mlp.down_proj": {
1013
+ "bits": 8,
1014
+ "group_size": 64,
1015
+ "mode": "affine"
1016
+ },
1017
+ "language_model.model.layers.43.self_attn.q_proj": {
1018
+ "bits": 8,
1019
+ "group_size": 64,
1020
+ "mode": "affine"
1021
+ },
1022
+ "language_model.model.layers.47.self_attn.k_proj": {
1023
+ "bits": 8,
1024
+ "group_size": 64,
1025
+ "mode": "affine"
1026
+ },
1027
+ "language_model.model.layers.22.linear_attn.in_proj_qkv": {
1028
+ "bits": 8,
1029
+ "group_size": 64,
1030
+ "mode": "affine"
1031
+ },
1032
+ "language_model.model.layers.20.mlp.down_proj": {
1033
+ "bits": 8,
1034
+ "group_size": 64,
1035
+ "mode": "affine"
1036
+ },
1037
+ "language_model.model.layers.28.mlp.down_proj": {
1038
+ "bits": 8,
1039
+ "group_size": 64,
1040
+ "mode": "affine"
1041
+ },
1042
+ "language_model.model.layers.17.linear_attn.in_proj_qkv": {
1043
+ "bits": 8,
1044
+ "group_size": 64,
1045
+ "mode": "affine"
1046
+ },
1047
+ "language_model.model.layers.26.linear_attn.in_proj_z": {
1048
+ "bits": 8,
1049
+ "group_size": 64,
1050
+ "mode": "affine"
1051
+ },
1052
+ "language_model.model.layers.54.mlp.down_proj": {
1053
+ "bits": 8,
1054
+ "group_size": 64,
1055
+ "mode": "affine"
1056
+ },
1057
+ "language_model.model.layers.18.linear_attn.in_proj_z": {
1058
+ "bits": 8,
1059
+ "group_size": 64,
1060
+ "mode": "affine"
1061
+ }
1062
+ },
1063
+ "quantization_config": {
1064
+ "group_size": 64,
1065
+ "bits": 6,
1066
+ "mode": "affine",
1067
+ "language_model.model.layers.22.linear_attn.in_proj_z": {
1068
+ "bits": 8,
1069
+ "group_size": 64,
1070
+ "mode": "affine"
1071
+ },
1072
+ "language_model.model.layers.49.linear_attn.in_proj_z": {
1073
+ "bits": 8,
1074
+ "group_size": 64,
1075
+ "mode": "affine"
1076
+ },
1077
+ "language_model.model.layers.15.self_attn.k_proj": {
1078
+ "bits": 8,
1079
+ "group_size": 64,
1080
+ "mode": "affine"
1081
+ },
1082
+ "language_model.model.layers.46.mlp.down_proj": {
1083
+ "bits": 8,
1084
+ "group_size": 64,
1085
+ "mode": "affine"
1086
+ },
1087
+ "language_model.model.layers.33.linear_attn.in_proj_z": {
1088
+ "bits": 8,
1089
+ "group_size": 64,
1090
+ "mode": "affine"
1091
+ },
1092
+ "language_model.model.layers.12.linear_attn.in_proj_qkv": {
1093
+ "bits": 8,
1094
+ "group_size": 64,
1095
+ "mode": "affine"
1096
+ },
1097
+ "language_model.model.layers.14.linear_attn.in_proj_z": {
1098
+ "bits": 8,
1099
+ "group_size": 64,
1100
+ "mode": "affine"
1101
+ },
1102
+ "language_model.model.layers.8.linear_attn.in_proj_qkv": {
1103
+ "bits": 8,
1104
+ "group_size": 64,
1105
+ "mode": "affine"
1106
+ },
1107
+ "language_model.model.layers.9.linear_attn.in_proj_qkv": {
1108
+ "bits": 8,
1109
+ "group_size": 64,
1110
+ "mode": "affine"
1111
+ },
1112
+ "language_model.model.layers.9.mlp.down_proj": {
1113
+ "bits": 8,
1114
+ "group_size": 64,
1115
+ "mode": "affine"
1116
+ },
1117
+ "language_model.model.layers.30.linear_attn.in_proj_qkv": {
1118
+ "bits": 8,
1119
+ "group_size": 64,
1120
+ "mode": "affine"
1121
+ },
1122
+ "language_model.model.layers.44.linear_attn.in_proj_z": {
1123
+ "bits": 8,
1124
+ "group_size": 64,
1125
+ "mode": "affine"
1126
+ },
1127
+ "language_model.model.layers.62.linear_attn.in_proj_z": {
1128
+ "bits": 8,
1129
+ "group_size": 64,
1130
+ "mode": "affine"
1131
+ },
1132
+ "language_model.model.layers.63.self_attn.k_proj": {
1133
+ "bits": 8,
1134
+ "group_size": 64,
1135
+ "mode": "affine"
1136
+ },
1137
+ "language_model.model.layers.24.linear_attn.in_proj_z": {
1138
+ "bits": 8,
1139
+ "group_size": 64,
1140
+ "mode": "affine"
1141
+ },
1142
+ "language_model.model.layers.62.mlp.down_proj": {
1143
+ "bits": 8,
1144
+ "group_size": 64,
1145
+ "mode": "affine"
1146
+ },
1147
+ "language_model.model.layers.59.self_attn.v_proj": {
1148
+ "bits": 8,
1149
+ "group_size": 64,
1150
+ "mode": "affine"
1151
+ },
1152
+ "language_model.model.layers.18.linear_attn.in_proj_qkv": {
1153
+ "bits": 8,
1154
+ "group_size": 64,
1155
+ "mode": "affine"
1156
+ },
1157
+ "language_model.model.layers.39.self_attn.k_proj": {
1158
+ "bits": 8,
1159
+ "group_size": 64,
1160
+ "mode": "affine"
1161
+ },
1162
+ "language_model.model.layers.48.linear_attn.in_proj_qkv": {
1163
+ "bits": 8,
1164
+ "group_size": 64,
1165
+ "mode": "affine"
1166
+ },
1167
+ "language_model.model.layers.5.linear_attn.in_proj_z": {
1168
+ "bits": 8,
1169
+ "group_size": 64,
1170
+ "mode": "affine"
1171
+ },
1172
+ "language_model.model.layers.30.linear_attn.in_proj_z": {
1173
+ "bits": 8,
1174
+ "group_size": 64,
1175
+ "mode": "affine"
1176
+ },
1177
+ "language_model.model.layers.56.linear_attn.in_proj_z": {
1178
+ "bits": 8,
1179
+ "group_size": 64,
1180
+ "mode": "affine"
1181
+ },
1182
+ "language_model.model.layers.47.self_attn.v_proj": {
1183
+ "bits": 8,
1184
+ "group_size": 64,
1185
+ "mode": "affine"
1186
+ },
1187
+ "language_model.model.layers.52.mlp.down_proj": {
1188
+ "bits": 8,
1189
+ "group_size": 64,
1190
+ "mode": "affine"
1191
+ },
1192
+ "language_model.model.layers.60.mlp.down_proj": {
1193
+ "bits": 8,
1194
+ "group_size": 64,
1195
+ "mode": "affine"
1196
+ },
1197
+ "language_model.model.layers.56.linear_attn.in_proj_qkv": {
1198
+ "bits": 8,
1199
+ "group_size": 64,
1200
+ "mode": "affine"
1201
+ },
1202
+ "language_model.model.layers.56.mlp.down_proj": {
1203
+ "bits": 8,
1204
+ "group_size": 64,
1205
+ "mode": "affine"
1206
+ },
1207
+ "language_model.model.layers.2.linear_attn.in_proj_qkv": {
1208
+ "bits": 8,
1209
+ "group_size": 64,
1210
+ "mode": "affine"
1211
+ },
1212
+ "language_model.model.layers.35.mlp.down_proj": {
1213
+ "bits": 8,
1214
+ "group_size": 64,
1215
+ "mode": "affine"
1216
+ },
1217
+ "language_model.model.layers.13.mlp.down_proj": {
1218
+ "bits": 8,
1219
+ "group_size": 64,
1220
+ "mode": "affine"
1221
+ },
1222
+ "language_model.model.layers.41.linear_attn.in_proj_z": {
1223
+ "bits": 8,
1224
+ "group_size": 64,
1225
+ "mode": "affine"
1226
+ },
1227
+ "language_model.model.layers.1.linear_attn.in_proj_z": {
1228
+ "bits": 8,
1229
+ "group_size": 64,
1230
+ "mode": "affine"
1231
+ },
1232
+ "language_model.model.layers.25.linear_attn.in_proj_z": {
1233
+ "bits": 8,
1234
+ "group_size": 64,
1235
+ "mode": "affine"
1236
+ },
1237
+ "language_model.model.layers.47.mlp.down_proj": {
1238
+ "bits": 8,
1239
+ "group_size": 64,
1240
+ "mode": "affine"
1241
+ },
1242
+ "language_model.model.layers.50.mlp.down_proj": {
1243
+ "bits": 8,
1244
+ "group_size": 64,
1245
+ "mode": "affine"
1246
+ },
1247
+ "language_model.model.layers.3.self_attn.v_proj": {
1248
+ "bits": 8,
1249
+ "group_size": 64,
1250
+ "mode": "affine"
1251
+ },
1252
+ "language_model.model.layers.6.linear_attn.in_proj_qkv": {
1253
+ "bits": 8,
1254
+ "group_size": 64,
1255
+ "mode": "affine"
1256
+ },
1257
+ "language_model.model.layers.38.mlp.down_proj": {
1258
+ "bits": 8,
1259
+ "group_size": 64,
1260
+ "mode": "affine"
1261
+ },
1262
+ "language_model.model.layers.61.linear_attn.in_proj_z": {
1263
+ "bits": 8,
1264
+ "group_size": 64,
1265
+ "mode": "affine"
1266
+ },
1267
+ "language_model.model.layers.8.linear_attn.in_proj_z": {
1268
+ "bits": 8,
1269
+ "group_size": 64,
1270
+ "mode": "affine"
1271
+ },
1272
+ "language_model.model.layers.39.self_attn.q_proj": {
1273
+ "bits": 8,
1274
+ "group_size": 64,
1275
+ "mode": "affine"
1276
+ },
1277
+ "language_model.model.layers.41.linear_attn.in_proj_qkv": {
1278
+ "bits": 8,
1279
+ "group_size": 64,
1280
+ "mode": "affine"
1281
+ },
1282
+ "language_model.model.layers.36.linear_attn.in_proj_qkv": {
1283
+ "bits": 8,
1284
+ "group_size": 64,
1285
+ "mode": "affine"
1286
+ },
1287
+ "language_model.model.layers.51.mlp.down_proj": {
1288
+ "bits": 8,
1289
+ "group_size": 64,
1290
+ "mode": "affine"
1291
+ },
1292
+ "language_model.model.layers.21.mlp.down_proj": {
1293
+ "bits": 8,
1294
+ "group_size": 64,
1295
+ "mode": "affine"
1296
+ },
1297
+ "language_model.model.layers.30.mlp.down_proj": {
1298
+ "bits": 8,
1299
+ "group_size": 64,
1300
+ "mode": "affine"
1301
+ },
1302
+ "language_model.model.layers.15.self_attn.v_proj": {
1303
+ "bits": 8,
1304
+ "group_size": 64,
1305
+ "mode": "affine"
1306
+ },
1307
+ "language_model.model.layers.59.self_attn.q_proj": {
1308
+ "bits": 8,
1309
+ "group_size": 64,
1310
+ "mode": "affine"
1311
+ },
1312
+ "language_model.model.layers.43.mlp.down_proj": {
1313
+ "bits": 8,
1314
+ "group_size": 64,
1315
+ "mode": "affine"
1316
+ },
1317
+ "language_model.model.layers.3.mlp.down_proj": {
1318
+ "bits": 8,
1319
+ "group_size": 64,
1320
+ "mode": "affine"
1321
+ },
1322
+ "language_model.model.layers.28.linear_attn.in_proj_qkv": {
1323
+ "bits": 8,
1324
+ "group_size": 64,
1325
+ "mode": "affine"
1326
+ },
1327
+ "language_model.model.layers.11.self_attn.v_proj": {
1328
+ "bits": 8,
1329
+ "group_size": 64,
1330
+ "mode": "affine"
1331
+ },
1332
+ "language_model.model.layers.22.mlp.down_proj": {
1333
+ "bits": 8,
1334
+ "group_size": 64,
1335
+ "mode": "affine"
1336
+ },
1337
+ "language_model.model.layers.37.linear_attn.in_proj_z": {
1338
+ "bits": 8,
1339
+ "group_size": 64,
1340
+ "mode": "affine"
1341
+ },
1342
+ "language_model.model.layers.5.linear_attn.in_proj_qkv": {
1343
+ "bits": 8,
1344
+ "group_size": 64,
1345
+ "mode": "affine"
1346
+ },
1347
+ "language_model.model.layers.50.linear_attn.in_proj_z": {
1348
+ "bits": 8,
1349
+ "group_size": 64,
1350
+ "mode": "affine"
1351
+ },
1352
+ "language_model.model.layers.29.linear_attn.in_proj_z": {
1353
+ "bits": 8,
1354
+ "group_size": 64,
1355
+ "mode": "affine"
1356
+ },
1357
+ "language_model.model.layers.35.self_attn.k_proj": {
1358
+ "bits": 8,
1359
+ "group_size": 64,
1360
+ "mode": "affine"
1361
+ },
1362
+ "language_model.model.layers.33.mlp.down_proj": {
1363
+ "bits": 8,
1364
+ "group_size": 64,
1365
+ "mode": "affine"
1366
+ },
1367
+ "language_model.model.layers.50.linear_attn.in_proj_qkv": {
1368
+ "bits": 8,
1369
+ "group_size": 64,
1370
+ "mode": "affine"
1371
+ },
1372
+ "language_model.model.layers.25.linear_attn.in_proj_qkv": {
1373
+ "bits": 8,
1374
+ "group_size": 64,
1375
+ "mode": "affine"
1376
+ },
1377
+ "language_model.model.layers.20.linear_attn.in_proj_qkv": {
1378
+ "bits": 8,
1379
+ "group_size": 64,
1380
+ "mode": "affine"
1381
+ },
1382
+ "language_model.model.layers.59.mlp.down_proj": {
1383
+ "bits": 8,
1384
+ "group_size": 64,
1385
+ "mode": "affine"
1386
+ },
1387
+ "language_model.model.layers.0.mlp.down_proj": {
1388
+ "bits": 8,
1389
+ "group_size": 64,
1390
+ "mode": "affine"
1391
+ },
1392
+ "language_model.model.layers.7.self_attn.k_proj": {
1393
+ "bits": 8,
1394
+ "group_size": 64,
1395
+ "mode": "affine"
1396
+ },
1397
+ "language_model.model.layers.10.linear_attn.in_proj_z": {
1398
+ "bits": 8,
1399
+ "group_size": 64,
1400
+ "mode": "affine"
1401
+ },
1402
+ "language_model.model.layers.49.mlp.down_proj": {
1403
+ "bits": 8,
1404
+ "group_size": 64,
1405
+ "mode": "affine"
1406
+ },
1407
+ "language_model.model.layers.24.mlp.down_proj": {
1408
+ "bits": 8,
1409
+ "group_size": 64,
1410
+ "mode": "affine"
1411
+ },
1412
+ "language_model.model.layers.26.mlp.down_proj": {
1413
+ "bits": 8,
1414
+ "group_size": 64,
1415
+ "mode": "affine"
1416
+ },
1417
+ "language_model.model.layers.34.linear_attn.in_proj_z": {
1418
+ "bits": 8,
1419
+ "group_size": 64,
1420
+ "mode": "affine"
1421
+ },
1422
+ "language_model.model.layers.39.mlp.down_proj": {
1423
+ "bits": 8,
1424
+ "group_size": 64,
1425
+ "mode": "affine"
1426
+ },
1427
+ "language_model.model.layers.0.linear_attn.in_proj_z": {
1428
+ "bits": 8,
1429
+ "group_size": 64,
1430
+ "mode": "affine"
1431
+ },
1432
+ "language_model.model.layers.49.linear_attn.in_proj_qkv": {
1433
+ "bits": 8,
1434
+ "group_size": 64,
1435
+ "mode": "affine"
1436
+ },
1437
+ "language_model.model.layers.37.mlp.down_proj": {
1438
+ "bits": 8,
1439
+ "group_size": 64,
1440
+ "mode": "affine"
1441
+ },
1442
+ "language_model.model.layers.13.linear_attn.in_proj_qkv": {
1443
+ "bits": 8,
1444
+ "group_size": 64,
1445
+ "mode": "affine"
1446
+ },
1447
+ "language_model.model.layers.20.linear_attn.in_proj_z": {
1448
+ "bits": 8,
1449
+ "group_size": 64,
1450
+ "mode": "affine"
1451
+ },
1452
+ "language_model.model.layers.40.linear_attn.in_proj_qkv": {
1453
+ "bits": 8,
1454
+ "group_size": 64,
1455
+ "mode": "affine"
1456
+ },
1457
+ "language_model.model.layers.60.linear_attn.in_proj_z": {
1458
+ "bits": 8,
1459
+ "group_size": 64,
1460
+ "mode": "affine"
1461
+ },
1462
+ "language_model.model.layers.54.linear_attn.in_proj_z": {
1463
+ "bits": 8,
1464
+ "group_size": 64,
1465
+ "mode": "affine"
1466
+ },
1467
+ "language_model.model.layers.51.self_attn.q_proj": {
1468
+ "bits": 8,
1469
+ "group_size": 64,
1470
+ "mode": "affine"
1471
+ },
1472
+ "language_model.model.layers.61.mlp.down_proj": {
1473
+ "bits": 8,
1474
+ "group_size": 64,
1475
+ "mode": "affine"
1476
+ },
1477
+ "language_model.model.layers.16.linear_attn.in_proj_qkv": {
1478
+ "bits": 8,
1479
+ "group_size": 64,
1480
+ "mode": "affine"
1481
+ },
1482
+ "language_model.model.layers.39.self_attn.v_proj": {
1483
+ "bits": 8,
1484
+ "group_size": 64,
1485
+ "mode": "affine"
1486
+ },
1487
+ "language_model.model.layers.48.mlp.down_proj": {
1488
+ "bits": 8,
1489
+ "group_size": 64,
1490
+ "mode": "affine"
1491
+ },
1492
+ "language_model.model.layers.58.mlp.down_proj": {
1493
+ "bits": 8,
1494
+ "group_size": 64,
1495
+ "mode": "affine"
1496
+ },
1497
+ "language_model.model.layers.19.mlp.down_proj": {
1498
+ "bits": 8,
1499
+ "group_size": 64,
1500
+ "mode": "affine"
1501
+ },
1502
+ "language_model.model.layers.21.linear_attn.in_proj_qkv": {
1503
+ "bits": 8,
1504
+ "group_size": 64,
1505
+ "mode": "affine"
1506
+ },
1507
+ "language_model.model.layers.25.mlp.down_proj": {
1508
+ "bits": 8,
1509
+ "group_size": 64,
1510
+ "mode": "affine"
1511
+ },
1512
+ "language_model.model.layers.32.mlp.down_proj": {
1513
+ "bits": 8,
1514
+ "group_size": 64,
1515
+ "mode": "affine"
1516
+ },
1517
+ "language_model.model.layers.17.mlp.down_proj": {
1518
+ "bits": 8,
1519
+ "group_size": 64,
1520
+ "mode": "affine"
1521
+ },
1522
+ "language_model.model.layers.57.linear_attn.in_proj_z": {
1523
+ "bits": 8,
1524
+ "group_size": 64,
1525
+ "mode": "affine"
1526
+ },
1527
+ "language_model.model.layers.41.mlp.down_proj": {
1528
+ "bits": 8,
1529
+ "group_size": 64,
1530
+ "mode": "affine"
1531
+ },
1532
+ "language_model.model.layers.1.linear_attn.in_proj_qkv": {
1533
+ "bits": 8,
1534
+ "group_size": 64,
1535
+ "mode": "affine"
1536
+ },
1537
+ "language_model.model.layers.51.self_attn.v_proj": {
1538
+ "bits": 8,
1539
+ "group_size": 64,
1540
+ "mode": "affine"
1541
+ },
1542
+ "language_model.model.layers.34.mlp.down_proj": {
1543
+ "bits": 8,
1544
+ "group_size": 64,
1545
+ "mode": "affine"
1546
+ },
1547
+ "language_model.model.layers.17.linear_attn.in_proj_z": {
1548
+ "bits": 8,
1549
+ "group_size": 64,
1550
+ "mode": "affine"
1551
+ },
1552
+ "language_model.model.layers.35.self_attn.q_proj": {
1553
+ "bits": 8,
1554
+ "group_size": 64,
1555
+ "mode": "affine"
1556
+ },
1557
+ "language_model.model.layers.14.mlp.down_proj": {
1558
+ "bits": 8,
1559
+ "group_size": 64,
1560
+ "mode": "affine"
1561
+ },
1562
+ "language_model.model.layers.3.self_attn.k_proj": {
1563
+ "bits": 8,
1564
+ "group_size": 64,
1565
+ "mode": "affine"
1566
+ },
1567
+ "language_model.model.layers.7.self_attn.v_proj": {
1568
+ "bits": 8,
1569
+ "group_size": 64,
1570
+ "mode": "affine"
1571
+ },
1572
+ "language_model.model.layers.12.mlp.down_proj": {
1573
+ "bits": 8,
1574
+ "group_size": 64,
1575
+ "mode": "affine"
1576
+ },
1577
+ "language_model.model.layers.11.self_attn.k_proj": {
1578
+ "bits": 8,
1579
+ "group_size": 64,
1580
+ "mode": "affine"
1581
+ },
1582
+ "language_model.model.layers.63.self_attn.q_proj": {
1583
+ "bits": 8,
1584
+ "group_size": 64,
1585
+ "mode": "affine"
1586
+ },
1587
+ "language_model.model.layers.14.linear_attn.in_proj_qkv": {
1588
+ "bits": 8,
1589
+ "group_size": 64,
1590
+ "mode": "affine"
1591
+ },
1592
+ "language_model.model.layers.31.self_attn.v_proj": {
1593
+ "bits": 8,
1594
+ "group_size": 64,
1595
+ "mode": "affine"
1596
+ },
1597
+ "language_model.model.layers.31.mlp.down_proj": {
1598
+ "bits": 8,
1599
+ "group_size": 64,
1600
+ "mode": "affine"
1601
+ },
1602
+ "language_model.model.layers.27.self_attn.k_proj": {
1603
+ "bits": 8,
1604
+ "group_size": 64,
1605
+ "mode": "affine"
1606
+ },
1607
+ "language_model.model.layers.29.linear_attn.in_proj_qkv": {
1608
+ "bits": 8,
1609
+ "group_size": 64,
1610
+ "mode": "affine"
1611
+ },
1612
+ "language_model.model.layers.63.mlp.down_proj": {
1613
+ "bits": 8,
1614
+ "group_size": 64,
1615
+ "mode": "affine"
1616
+ },
+ "language_model.model.layers.2.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.6.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.29.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.53.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.35.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.44.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.28.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.3.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.11.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.32.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.6.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.55.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.36.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.53.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.33.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.55.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.32.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.16.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.23.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.embed_tokens": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.23.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.42.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.40.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.45.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.23.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.23.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.45.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.57.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.27.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.55.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.31.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.52.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.47.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.2.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.4.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.9.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.11.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.19.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.61.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.16.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.24.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.58.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.10.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.46.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.57.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.5.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.48.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.58.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.26.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.62.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.60.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.44.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.15.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.55.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.43.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.37.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.52.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.10.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.46.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.19.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.54.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.8.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.4.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.18.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.0.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.21.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.34.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.27.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.27.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.7.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.lm_head": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.40.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.1.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.13.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.15.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.12.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.4.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.42.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.43.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.51.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.36.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.59.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.31.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.7.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.45.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.19.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.38.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.38.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.63.self_attn.v_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.53.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.42.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.43.self_attn.q_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.47.self_attn.k_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.22.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.20.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.28.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.17.linear_attn.in_proj_qkv": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.26.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.54.mlp.down_proj": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ },
+ "language_model.model.layers.18.linear_attn.in_proj_z": {
+ "bits": 8,
+ "group_size": 64,
+ "mode": "affine"
+ }
+ },
+ "text_config": {
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "attn_output_gate": true,
+ "bos_token_id": 248044,
+ "dtype": "bfloat16",
+ "eos_token_id": 248044,
+ "full_attention_interval": 4,
+ "head_dim": 256,
+ "hidden_act": "silu",
+ "hidden_size": 5120,
+ "initializer_range": 0.02,
+ "intermediate_size": 17408,
+ "layer_types": [
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention",
+ "linear_attention",
+ "linear_attention",
+ "linear_attention",
+ "full_attention"
+ ],
+ "linear_conv_kernel_dim": 4,
+ "linear_key_head_dim": 128,
+ "linear_num_key_heads": 16,
+ "linear_num_value_heads": 48,
+ "linear_value_head_dim": 128,
+ "mamba_ssm_dtype": "float32",
+ "max_position_embeddings": 262144,
+ "model_type": "qwen3_5_text",
+ "mtp_num_hidden_layers": 1,
+ "mtp_use_dedicated_embeddings": false,
+ "num_attention_heads": 24,
+ "num_hidden_layers": 64,
+ "num_key_value_heads": 4,
+ "output_gate_type": "swish",
+ "pad_token_id": null,
+ "partial_rotary_factor": 0.25,
+ "rms_norm_eps": 1e-6,
+ "rope_parameters": {
+ "mrope_interleaved": true,
+ "mrope_section": [
+ 11,
+ 11,
+ 10
+ ],
+ "partial_rotary_factor": 0.25,
+ "rope_theta": 10000000,
+ "rope_type": "default"
+ },
+ "tie_word_embeddings": false,
+ "use_cache": true,
+ "vocab_size": 248320
+ },
+ "tie_word_embeddings": false,
+ "transformers_version": "4.57.1",
+ "video_token_id": 248057,
+ "vision_config": {
+ "deepstack_visual_indexes": [],
+ "depth": 27,
+ "hidden_act": "gelu_pytorch_tanh",
+ "hidden_size": 1152,
+ "in_channels": 3,
+ "initializer_range": 0.02,
+ "intermediate_size": 4304,
+ "model_type": "qwen3_5",
+ "num_heads": 16,
+ "num_position_embeddings": 2304,
+ "out_hidden_size": 5120,
+ "patch_size": 16,
+ "spatial_merge_size": 2,
+ "temporal_patch_size": 2
+ },
+ "vision_end_token_id": 248054,
+ "vision_start_token_id": 248053
+ }
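The `text_config` above declares a hybrid attention stack: `full_attention_interval: 4` over `num_hidden_layers: 64`, which matches the `layer_types` list (every fourth layer is full attention, the rest are linear attention). A minimal sketch, assuming the interval places the full-attention layer at each fourth position, reproduces the schedule:

```python
# Sketch (assumption): reconstruct the hybrid layer schedule implied by
# "full_attention_interval": 4 and "num_hidden_layers": 64 in the config above.
FULL_ATTENTION_INTERVAL = 4
NUM_HIDDEN_LAYERS = 64

layer_types = [
    # layers 3, 7, 11, ... use full attention; all others use linear attention
    "full_attention" if (i + 1) % FULL_ATTENTION_INTERVAL == 0 else "linear_attention"
    for i in range(NUM_HIDDEN_LAYERS)
]

print(layer_types.count("full_attention"), layer_types.count("linear_attention"))
```

This yields 16 full-attention and 48 linear-attention layers, matching the 64-entry `layer_types` list in the diff.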
generation_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "bos_token_id": 248044,
+ "do_sample": true,
+ "eos_token_id": [
+ 248046,
+ 248044
+ ],
+ "pad_token_id": 248044,
+ "temperature": 1.0,
+ "top_k": 20,
+ "top_p": 0.95
+ }
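The generation defaults above combine temperature, top-k, and nucleus (top-p) filtering. A hypothetical sketch of how those three settings interact during sampling (`sample_next` and its logic are illustrative, not the actual inference code):

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=20, top_p=0.95, rng=random.Random(0)):
    """Illustrative sampler using the generation_config.json defaults above."""
    # Softmax with temperature scaling.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    # Sort token probabilities descending.
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    # Keep at most top_k tokens, then the smallest prefix reaching top_p mass.
    kept, mass = [], 0.0
    for p, i in probs[:top_k]:
        kept.append((p, i))
        mass += p
        if mass >= top_p:
            break
    # Sample from the renormalized truncated distribution.
    r = rng.random() * mass
    for p, i in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][1]
```

With a sharply peaked distribution the top token's mass alone exceeds `top_p`, so it is always chosen; flatter distributions keep more candidates.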
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:be6dc8074ae05d104c03b1ae3c116b803ab3c9d5350b0cd25b0b03c8cf290661
+ size 5368540479
model-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97c55f720a88351bda0e1cf99176dcbd4f9bd1f49557bc233d22d658598fd244
+ size 5310390631
model-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4e5bdb49d13be9f82cf385d29eb7fe46374d7727735433214802ee6623d5678a
+ size 5353812121
model-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:859b900c5948acc616f4e4d329d47b3db7b15470f1689f196ad25ec084890b2c
+ size 5365266561
model-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e250a8c2fd05221c2af02ea24e9b2c81e508f8cd8a9fe38c8bef35c1aff531cc
+ size 5318419782
model-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4b6524c5f082843ee9d31156a312802b2007783131b3ca87b004ca0f0ec8971
+ size 1842174785
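The LFS pointer files above record the byte size of each of the six weight shards. Summing them gives the total on-disk footprint of this upload:

```python
# Sizes taken verbatim from the six LFS pointers above.
shard_sizes = [
    5368540479,  # model-00001-of-00006.safetensors
    5310390631,  # model-00002-of-00006.safetensors
    5353812121,  # model-00003-of-00006.safetensors
    5365266561,  # model-00004-of-00006.safetensors
    5318419782,  # model-00005-of-00006.safetensors
    1842174785,  # model-00006-of-00006.safetensors
]
total = sum(shard_sizes)
print(total, round(total / 2**30, 1))  # 28558604359 bytes ≈ 26.6 GiB
```

That total is consistent with the "27 GB" size quoted in the model card table.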
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
preprocessor_config.json ADDED
@@ -0,0 +1,21 @@
+ {
+ "size": {
+ "longest_edge": 16777216,
+ "shortest_edge": 65536
+ },
+ "patch_size": 16,
+ "temporal_patch_size": 2,
+ "merge_size": 2,
+ "image_mean": [
+ 0.5,
+ 0.5,
+ 0.5
+ ],
+ "image_std": [
+ 0.5,
+ 0.5,
+ 0.5
+ ],
+ "processor_class": "Qwen3VLProcessor",
+ "image_processor_type": "Qwen2VLImageProcessorFast"
+ }
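A rough sketch of how the preprocessor values above translate into vision tokens, assuming the usual Qwen-VL accounting (an image is cut into `patch_size` patches and each `merge_size` x `merge_size` block of patches merges into one token; the function below is illustrative):

```python
PATCH = 16  # "patch_size" from preprocessor_config.json above
MERGE = 2   # "merge_size"

def vision_tokens(height, width):
    """Approximate token count for a H x W image (assumed accounting)."""
    assert height % (PATCH * MERGE) == 0 and width % (PATCH * MERGE) == 0
    return (height // PATCH) * (width // PATCH) // (MERGE * MERGE)

print(vision_tokens(512, 512))  # 32 * 32 patches merged 2x2 -> 256 tokens
```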
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5f9e4d4901a92b997e463c1f46055088b6cca5ca61a6522d1b9f64c4bb81cb42
+ size 12807982
tokenizer_config.json ADDED
@@ -0,0 +1,305 @@
+ {
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "248044": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248045": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248046": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248047": {
+ "content": "<|object_ref_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248048": {
+ "content": "<|object_ref_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248049": {
+ "content": "<|box_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248050": {
+ "content": "<|box_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248051": {
+ "content": "<|quad_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248052": {
+ "content": "<|quad_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248053": {
+ "content": "<|vision_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248054": {
+ "content": "<|vision_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248055": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248056": {
+ "content": "<|image_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248057": {
+ "content": "<|video_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248058": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248059": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248060": {
+ "content": "<|fim_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248061": {
+ "content": "<|fim_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248062": {
+ "content": "<|fim_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248063": {
+ "content": "<|fim_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248064": {
+ "content": "<|repo_name|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248065": {
+ "content": "<|file_sep|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248066": {
+ "content": "<tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248067": {
+ "content": "</tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248068": {
+ "content": "<think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248069": {
+ "content": "</think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "248070": {
+ "content": "<|audio_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248071": {
+ "content": "<|audio_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248072": {
+ "content": "<tts_pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248073": {
+ "content": "<tts_text_bos>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248074": {
+ "content": "<tts_text_eod>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248075": {
+ "content": "<tts_text_bos_single>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "248076": {
+ "content": "<|audio_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "bos_token": null,
+ "chat_template": "{%- set image_count = namespace(value=0) %}\n{%- set video_count = namespace(value=0) %}\n{%- macro render_content(content, do_vision_count, is_system_content=false) %}\n {%- if content is string %}\n {{- content }}\n {%- elif content is iterable and content is not mapping %}\n {%- for item in content %}\n {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}\n {%- if is_system_content %}\n {{- raise_exception('System message cannot contain images.') }}\n {%- endif %}\n {%- if do_vision_count %}\n {%- set image_count.value = image_count.value + 1 %}\n {%- endif %}\n {%- if add_vision_id %}\n {{- 'Picture ' ~ image_count.value ~ ': ' }}\n {%- endif %}\n {{- '<|vision_start|><|image_pad|><|vision_end|>' }}\n {%- elif 'video' in item or item.type == 'video' %}\n {%- if is_system_content %}\n {{- raise_exception('System message cannot contain videos.') }}\n {%- endif %}\n {%- if do_vision_count %}\n {%- set video_count.value = video_count.value + 1 %}\n {%- endif %}\n {%- if add_vision_id %}\n {{- 'Video ' ~ video_count.value ~ ': ' }}\n {%- endif %}\n {{- '<|vision_start|><|video_pad|><|vision_end|>' }}\n {%- elif 'text' in item %}\n {{- item.text }}\n {%- else %}\n {{- raise_exception('Unexpected item type in content.') }}\n {%- endif %}\n {%- endfor %}\n {%- elif content is none or content is undefined %}\n {{- '' }}\n {%- else %}\n {{- raise_exception('Unexpected content type.') }}\n {%- endif %}\n{%- endmacro %}\n{%- if not messages %}\n {{- raise_exception('No messages provided.') }}\n{%- endif %}\n{%- if tools and tools is iterable and tools is not mapping %}\n {{- '<|im_start|>system\\n' }}\n {{- \"# Tools\\n\\nYou have access to the following functions:\\n\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\" }}\n {{- '\\n\\nIf you choose to call a function ONLY reply in the following format with NO 
suffix:\\n\\n<tool_call>\\n<function=example_function_name>\\n<parameter=example_parameter_1>\\nvalue_1\\n</parameter>\\n<parameter=example_parameter_2>\\nThis is the value for the second parameter\\nthat can span\\nmultiple lines\\n</parameter>\\n</function>\\n</tool_call>\\n\\n<IMPORTANT>\\nReminder:\\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\\n- Required parameters MUST be specified\\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\\n</IMPORTANT>' }}\n {%- if messages[0].role == 'system' %}\n {%- set content = render_content(messages[0].content, false, true)|trim %}\n {%- if content %}\n {{- '\\n\\n' + content }}\n {%- endif %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {%- set content = render_content(messages[0].content, false, true)|trim %}\n {{- '<|im_start|>system\\n' + content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" %}\n {%- set content = render_content(message.content, false)|trim %}\n {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if ns.multi_step_tool %}\n {{- raise_exception('No user query found in messages.') }}\n{%- endif %}\n{%- for message in messages %}\n {%- set content = render_content(message.content, true)|trim %}\n {%- if message.role == \"system\" 
%}\n {%- if not loop.first %}\n {{- raise_exception('System message must be at the beginning.') }}\n {%- endif %}\n {%- elif message.role == \"user\" %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is string %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- set reasoning_content = reasoning_content|trim %}\n {%- if (preserve_thinking is defined and preserve_thinking is true) or (loop.index0 > ns.last_query_index) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content + '\\n</think>\\n\\n' + content }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls and message.tool_calls is iterable and message.tool_calls is not mapping %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {%- if loop.first %}\n {%- if content|trim %}\n {{- '\\n\\n<tool_call>\\n<function=' + tool_call.name + '>\\n' }}\n {%- else %}\n {{- '<tool_call>\\n<function=' + tool_call.name + '>\\n' }}\n {%- endif %}\n {%- else %}\n {{- '\\n<tool_call>\\n<function=' + tool_call.name + '>\\n' }}\n {%- endif %}\n {%- if tool_call.arguments is defined %}\n {%- for args_name, args_value in tool_call.arguments|items %}\n {{- '<parameter=' + args_name + '>\\n' }}\n {%- set args_value = args_value | string if args_value is string else args_value | tojson | safe %}\n {{- args_value }}\n {{- '\\n</parameter>\\n' }}\n {%- endfor %}\n {%- endif %}\n {{- '</function>\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- 
'<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.previtem and loop.previtem.role != \"tool\" %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if not loop.last and loop.nextitem.role != \"tool\" %}\n {{- '<|im_end|>\\n' }}\n {%- elif loop.last %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- else %}\n {{- raise_exception('Unexpected message role.') }}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- else %}\n {{- '<think>\\n' }}\n {%- endif %}\n{%- endif %}",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|im_end|>",
+ "errors": "replace",
+ "model_max_length": 262144,
+ "pad_token": "<|endoftext|>",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null,
+ "add_bos_token": false,
+ "pretokenize_regex": "(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?[\\p{L}\\p{M}]+|\\p{N}| ?[^\\s\\p{L}\\p{M}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+",
+ "extra_special_tokens": {
+ "audio_bos_token": "<|audio_start|>",
+ "audio_eos_token": "<|audio_end|>",
+ "audio_token": "<|audio_pad|>",
+ "image_token": "<|image_pad|>",
+ "video_token": "<|video_pad|>",
+ "vision_bos_token": "<|vision_start|>",
+ "vision_eos_token": "<|vision_end|>"
+ }
+ }
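The special-token ids in this tokenizer config line up with the ids used elsewhere in the upload: generation_config.json lists eos ids 248046 and 248044, which `added_tokens_decoder` above maps to `<|im_end|>` and `<|endoftext|>`. A quick cross-check:

```python
# Subset of added_tokens_decoder from tokenizer_config.json above.
special_tokens = {
    248044: "<|endoftext|>",
    248045: "<|im_start|>",
    248046: "<|im_end|>",
    248057: "<|video_pad|>",  # matches "video_token_id" in config.json
}
eos_token_ids = [248046, 248044]  # from generation_config.json
print([special_tokens[i] for i in eos_token_ids])
```

Both stop tokens resolve to declared special tokens, so decoding stops on either `<|im_end|>` or `<|endoftext|>`.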
vocab.json ADDED
The diff for this file is too large to render. See raw diff