Deviad committed on
Commit
6eb37c9
·
verified ·
1 Parent(s): 5a21f1a

Upload README_UPSTREAM.md with huggingface_hub

  1. README_UPSTREAM.md +627 -0
README_UPSTREAM.md ADDED
@@ -0,0 +1,627 @@
1
+ ---
2
+ tags:
3
+ - compressed-tensors
4
+ license: other
5
+ license_name: modified-mit
6
+ library_name: transformers
7
+ pipeline_tag: image-text-to-text
8
+ ---
9
+ <div align="center">
10
+ <picture>
11
+ <img src="figures/kimi-logo.png" width="30%" alt="Kimi K2.6">
12
+ </picture>
13
+ </div>
14
+ <hr>
15
+ <div align="center" style="line-height:1">
16
+ <a href="https://www.kimi.com" target="_blank"><img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-Kimi%20K2.6-ff6b6b?color=1783ff&logoColor=white"/></a>
17
+ <a href="https://www.moonshot.ai" target="_blank"><img alt="Homepage" src="https://img.shields.io/badge/Homepage-Moonshot%20AI-white?logo=Kimi&logoColor=white"/></a>
18
+ </div>
19
+
20
+ <div align="center" style="line-height: 1;">
21
+ <a href="https://huggingface.co/moonshotai" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Moonshot%20AI-ffc107?color=ffc107&logoColor=white"/></a>
22
+ <a href="https://twitter.com/kimi_moonshot" target="_blank"><img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-Kimi.ai-white?logo=x&logoColor=white"/></a>
23
+ <a href="https://discord.gg/TYU2fdJykW" target="_blank"><img alt="Discord" src="https://img.shields.io/badge/Discord-Kimi.ai-white?logo=discord&logoColor=white"/></a>
24
+ <a href="https://modelscope.cn/organization/moonshotai" target="_blank"><img alt="ModelScope" src="https://img.shields.io/badge/ModelScope-Moonshot%20AI-white?labelColor=rgb(99%2C%2074%2C%20255)"/></a>
25
+ </div>
26
+ <div align="center" style="line-height: 1;">
27
+ <a href="https://huggingface.co/moonshotai/Kimi-K2.6/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
28
+ </div>
29
+
30
+
31
+ <p align="center">
32
+ 🤗&nbsp;&nbsp;<a href="https://huggingface.co/spaces/akhaliq/Kimi-K2.6" target="_blank">huggingchat</a>
33
+ &nbsp;|&nbsp;
34
+ 📰&nbsp;&nbsp;<a href="https://www.kimi.com/blog/kimi-k2-6.html">Tech Blog</a>
35
+ </p>
36
+
37
+
38
+ ## 1. Model Introduction
39
+
40
+ Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration.
41
+
42
+ ### Key Features
43
+ - **Long-Horizon Coding**: K2.6 achieves significant improvements on complex, end-to-end coding tasks, generalizing robustly across programming languages (Rust, Go, Python) and domains spanning front-end, DevOps, and performance optimization.
44
+ - **Coding-Driven Design**: K2.6 is capable of transforming simple prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and rich animations with deliberate aesthetic precision.
45
+ - **Elevated Agent Swarm**: Scaling horizontally to 300 sub-agents executing 4,000 coordinated steps, K2.6 can dynamically decompose tasks into parallel, domain-specialized subtasks, delivering end-to-end outputs from documents to websites to spreadsheets in a single autonomous run.
46
+ - **Proactive & Open Orchestration**: For autonomous tasks, K2.6 demonstrates strong performance in powering persistent, 24/7 background agents that proactively manage schedules, execute code, and orchestrate cross-platform operations without human oversight.
47
+
48
+ ## 2. Model Summary
49
+
50
+ <div align="center">
51
+
52
+
53
+ | | |
54
+ |:---:|:---:|
55
+ | **Architecture** | Mixture-of-Experts (MoE) |
56
+ | **Total Parameters** | 1T |
57
+ | **Activated Parameters** | 32B |
58
+ | **Number of Layers** (Dense layer included) | 61 |
59
+ | **Number of Dense Layers** | 1 |
60
+ | **Attention Hidden Dimension** | 7168 |
61
+ | **MoE Hidden Dimension** (per Expert) | 2048 |
62
+ | **Number of Attention Heads** | 64 |
63
+ | **Number of Experts** | 384 |
64
+ | **Selected Experts per Token** | 8 |
65
+ | **Number of Shared Experts** | 1 |
66
+ | **Vocabulary Size** | 160K |
67
+ | **Context Length** | 256K |
68
+ | **Attention Mechanism** | MLA |
69
+ | **Activation Function** | SwiGLU |
70
+ | **Vision Encoder** | MoonViT |
71
+ | **Parameters of Vision Encoder** | 400M |
72
+ </div>
73
+
74
+ ## 3. Evaluation Results
75
+
76
+ <div align="center">
77
+ <table>
78
+ <thead>
79
+ <tr>
80
+ <th align="center">Benchmark</th>
81
+ <th align="center"><sup>Kimi K2.6</sup></th>
82
+ <th align="center"><sup>GPT-5.4 <br><sup>(xhigh)</sup></sup></th>
83
+ <th align="center"><sup>Claude Opus 4.6 <br><sup>(max effort)</sup></sup></th>
84
+ <th align="center"><sup>Gemini 3.1 Pro<br><sup>(thinking high)</sup></sup></th>
85
+ <th align="center"><sup>Kimi K2.5</sup></th>
86
+ </tr>
87
+ </thead>
88
+ <tbody>
89
+ <tr>
90
+ <td align="center" colspan=6><strong>Agentic</strong></td>
91
+ </tr>
92
+ <tr>
93
+ <td align="center" style="vertical-align: middle">HLE-Full<br>(w/ tools)</td>
94
+ <td align="center" style="vertical-align: middle">54.0</td>
95
+ <td align="center" style="vertical-align: middle">52.1</td>
96
+ <td align="center" style="vertical-align: middle">53.0</td>
97
+ <td align="center" style="vertical-align: middle">51.4</td>
98
+ <td align="center" style="vertical-align: middle">50.2</td>
99
+ </tr>
100
+ <tr>
101
+ <td align="center" style="vertical-align: middle">BrowseComp</td>
102
+ <td align="center" style="vertical-align: middle">83.2</td>
103
+ <td align="center" style="vertical-align: middle" rowspan="2">82.7</td>
104
+ <td align="center" style="vertical-align: middle" rowspan="2">83.7</td>
105
+ <td align="center" style="vertical-align: middle" rowspan="2">85.9</td>
106
+ <td align="center" style="vertical-align: middle">74.9</td>
107
+ </tr>
108
+ <tr>
109
+ <td align="center" style="vertical-align: middle">BrowseComp<br>(Agent Swarm)</td>
110
+ <td align="center" style="vertical-align: middle">86.3</td>
111
+ <td align="center" style="vertical-align: middle">78.4</td>
112
+ </tr>
113
+ <tr>
114
+ <td align="center" style="vertical-align: middle">DeepSearchQA<br>(f1-score)</td>
115
+ <td align="center" style="vertical-align: middle">92.5</td>
116
+ <td align="center" style="vertical-align: middle">78.6</td>
117
+ <td align="center" style="vertical-align: middle">91.3</td>
118
+ <td align="center" style="vertical-align: middle">81.9</td>
119
+ <td align="center" style="vertical-align: middle">89.0</td>
120
+ </tr>
121
+ <tr>
122
+ <td align="center" style="vertical-align: middle">DeepSearchQA<br>(accuracy)</td>
123
+ <td align="center" style="vertical-align: middle">83.0</td>
124
+ <td align="center" style="vertical-align: middle">63.7</td>
125
+ <td align="center" style="vertical-align: middle">80.6</td>
126
+ <td align="center" style="vertical-align: middle">60.2</td>
127
+ <td align="center" style="vertical-align: middle">77.1</td>
128
+ </tr>
129
+ <tr>
130
+ <td align="center" style="vertical-align: middle">WideSearch<br> (item-f1)</td>
131
+ <td align="center" style="vertical-align: middle">80.8</td>
132
+ <td align="center" style="vertical-align: middle">-</td>
133
+ <td align="center" style="vertical-align: middle">-</td>
134
+ <td align="center" style="vertical-align: middle">-</td>
135
+ <td align="center" style="vertical-align: middle">72.7</td>
136
+ </tr>
137
+ <tr>
138
+ <td align="center" style="vertical-align: middle">Toolathlon</td>
139
+ <td align="center" style="vertical-align: middle">50.0</td>
140
+ <td align="center" style="vertical-align: middle">54.6</td>
141
+ <td align="center" style="vertical-align: middle">47.2</td>
142
+ <td align="center" style="vertical-align: middle">48.8</td>
143
+ <td align="center" style="vertical-align: middle">27.8</td>
144
+ </tr>
145
+ <tr>
146
+ <td align="center" style="vertical-align: middle">MCPMark</td>
147
+ <td align="center" style="vertical-align: middle">55.9</td>
148
+ <td align="center" style="vertical-align: middle">62.5*</td>
149
+ <td align="center" style="vertical-align: middle">56.7*</td>
150
+ <td align="center" style="vertical-align: middle">55.9*</td>
151
+ <td align="center" style="vertical-align: middle">29.5</td>
152
+ </tr>
153
+ <tr>
154
+ <td align="center" style="vertical-align: middle">Claw Eval (pass^3)</td>
155
+ <td align="center" style="vertical-align: middle">62.3</td>
156
+ <td align="center" style="vertical-align: middle">60.3</td>
157
+ <td align="center" style="vertical-align: middle">70.4</td>
158
+ <td align="center" style="vertical-align: middle">57.8</td>
159
+ <td align="center" style="vertical-align: middle">52.3</td>
160
+ </tr>
161
+ <tr>
162
+ <td align="center" style="vertical-align: middle">Claw Eval (pass@3)</td>
163
+ <td align="center" style="vertical-align: middle">80.9</td>
164
+ <td align="center" style="vertical-align: middle">78.4</td>
165
+ <td align="center" style="vertical-align: middle">82.4</td>
166
+ <td align="center" style="vertical-align: middle">82.9</td>
167
+ <td align="center" style="vertical-align: middle">75.4</td>
168
+ </tr>
169
+ <tr>
170
+ <td align="center" style="vertical-align: middle">APEX-Agents</td>
171
+ <td align="center" style="vertical-align: middle">27.9</td>
172
+ <td align="center" style="vertical-align: middle">33.3</td>
173
+ <td align="center" style="vertical-align: middle">33.0</td>
174
+ <td align="center" style="vertical-align: middle">32.0</td>
175
+ <td align="center" style="vertical-align: middle">11.5</td>
176
+ </tr>
177
+ <tr>
178
+ <td align="center" style="vertical-align: middle">OSWorld-Verified</td>
179
+ <td align="center" style="vertical-align: middle">73.1</td>
180
+ <td align="center" style="vertical-align: middle">75.0</td>
181
+ <td align="center" style="vertical-align: middle">72.7</td>
182
+ <td align="center" style="vertical-align: middle">-</td>
183
+ <td align="center" style="vertical-align: middle">63.3</td>
184
+ </tr>
185
+ <tr>
186
+ <td align="center" colspan=6><strong>Coding</strong></td>
187
+ </tr>
188
+ <tr>
189
+ <td align="center" style="vertical-align: middle">Terminal-Bench 2.0<br>(Terminus-2)</td>
190
+ <td align="center" style="vertical-align: middle">66.7</td>
191
+ <td align="center" style="vertical-align: middle">65.4*</td>
192
+ <td align="center" style="vertical-align: middle">65.4</td>
193
+ <td align="center" style="vertical-align: middle">68.5</td>
194
+ <td align="center" style="vertical-align: middle">50.8</td>
195
+ </tr>
196
+ <tr>
197
+ <td align="center" style="vertical-align: middle">SWE-Bench Pro</td>
198
+ <td align="center" style="vertical-align: middle">58.6</td>
199
+ <td align="center" style="vertical-align: middle">57.7</td>
200
+ <td align="center" style="vertical-align: middle">53.4</td>
201
+ <td align="center" style="vertical-align: middle">54.2</td>
202
+ <td align="center" style="vertical-align: middle">50.7</td>
203
+ </tr>
204
+ <tr>
205
+ <td align="center" style="vertical-align: middle">SWE-Bench Multilingual</td>
206
+ <td align="center" style="vertical-align: middle">76.7</td>
207
+ <td align="center" style="vertical-align: middle">-</td>
208
+ <td align="center" style="vertical-align: middle">77.8</td>
209
+ <td align="center" style="vertical-align: middle">76.9*</td>
210
+ <td align="center" style="vertical-align: middle">73.0</td>
211
+ </tr>
212
+ <tr>
213
+ <td align="center" style="vertical-align: middle">SWE-Bench Verified</td>
214
+ <td align="center" style="vertical-align: middle">80.2</td>
215
+ <td align="center" style="vertical-align: middle">-</td>
216
+ <td align="center" style="vertical-align: middle">80.8</td>
217
+ <td align="center" style="vertical-align: middle">80.6</td>
218
+ <td align="center" style="vertical-align: middle">76.8</td>
219
+ </tr>
220
+ <tr>
221
+ <td align="center" style="vertical-align: middle">SciCode</td>
222
+ <td align="center" style="vertical-align: middle">52.2</td>
223
+ <td align="center" style="vertical-align: middle">56.6</td>
224
+ <td align="center" style="vertical-align: middle">51.9</td>
225
+ <td align="center" style="vertical-align: middle">58.9</td>
226
+ <td align="center" style="vertical-align: middle">48.7</td>
227
+ </tr>
228
+ <tr>
229
+ <td align="center" style="vertical-align: middle">OJBench (python)</td>
230
+ <td align="center" style="vertical-align: middle">60.6</td>
231
+ <td align="center" style="vertical-align: middle">-</td>
232
+ <td align="center" style="vertical-align: middle">60.3</td>
233
+ <td align="center" style="vertical-align: middle">70.7</td>
234
+ <td align="center" style="vertical-align: middle">54.7</td>
235
+ </tr>
236
+ <tr>
237
+ <td align="center" style="vertical-align: middle">LiveCodeBench (v6)</td>
238
+ <td align="center" style="vertical-align: middle">89.6</td>
239
+ <td align="center" style="vertical-align: middle">-</td>
240
+ <td align="center" style="vertical-align: middle">88.8</td>
241
+ <td align="center" style="vertical-align: middle">91.7</td>
242
+ <td align="center" style="vertical-align: middle">85.0</td>
243
+ </tr>
244
+ <tr>
245
+ <td align="center" colspan=6><strong>Reasoning &amp; Knowledge</strong></td>
246
+ </tr>
247
+ <tr>
248
+ <td align="center" style="vertical-align: middle">HLE-Full</td>
249
+ <td align="center" style="vertical-align: middle">34.7</td>
250
+ <td align="center" style="vertical-align: middle">39.8</td>
251
+ <td align="center" style="vertical-align: middle">40.0</td>
252
+ <td align="center" style="vertical-align: middle">44.4</td>
253
+ <td align="center" style="vertical-align: middle">30.1</td>
254
+ </tr>
255
+ <tr>
256
+ <td align="center" style="vertical-align: middle">AIME 2026</td>
257
+ <td align="center" style="vertical-align: middle">96.4</td>
258
+ <td align="center" style="vertical-align: middle">99.2</td>
259
+ <td align="center" style="vertical-align: middle">96.7</td>
260
+ <td align="center" style="vertical-align: middle">98.3</td>
261
+ <td align="center" style="vertical-align: middle">95.8</td>
262
+ </tr>
263
+ <tr>
264
+ <td align="center" style="vertical-align: middle">HMMT 2026 (Feb)</td>
265
+ <td align="center" style="vertical-align: middle">92.7</td>
266
+ <td align="center" style="vertical-align: middle">97.7</td>
267
+ <td align="center" style="vertical-align: middle">96.2</td>
268
+ <td align="center" style="vertical-align: middle">94.7</td>
269
+ <td align="center" style="vertical-align: middle">87.1</td>
270
+ </tr>
271
+ <tr>
272
+ <td align="center" style="vertical-align: middle">IMO-AnswerBench</td>
273
+ <td align="center" style="vertical-align: middle">86.0</td>
274
+ <td align="center" style="vertical-align: middle">91.4</td>
275
+ <td align="center" style="vertical-align: middle">75.3</td>
276
+ <td align="center" style="vertical-align: middle">91.0*</td>
277
+ <td align="center" style="vertical-align: middle">81.8</td>
278
+ </tr>
279
+ <tr>
280
+ <td align="center" style="vertical-align: middle">GPQA-Diamond</td>
281
+ <td align="center" style="vertical-align: middle">90.5</td>
282
+ <td align="center" style="vertical-align: middle">92.8</td>
283
+ <td align="center" style="vertical-align: middle">91.3</td>
284
+ <td align="center" style="vertical-align: middle">94.3</td>
285
+ <td align="center" style="vertical-align: middle">87.6</td>
286
+ </tr>
287
+ <tr>
288
+ <td align="center" colspan=6><strong>Vision</strong></td>
289
+ </tr>
290
+ <tr>
291
+ <td align="center" style="vertical-align: middle">MMMU-Pro</td>
292
+ <td align="center" style="vertical-align: middle">79.4</td>
293
+ <td align="center" style="vertical-align: middle">81.2</td>
294
+ <td align="center" style="vertical-align: middle">73.9</td>
295
+ <td align="center" style="vertical-align: middle">83.0*</td>
296
+ <td align="center" style="vertical-align: middle">78.5</td>
297
+ </tr>
298
+ <tr>
299
+ <td align="center" style="vertical-align: middle">MMMU-Pro (w/ python)</td>
300
+ <td align="center" style="vertical-align: middle">80.1</td>
301
+ <td align="center" style="vertical-align: middle">82.1</td>
302
+ <td align="center" style="vertical-align: middle">77.3</td>
303
+ <td align="center" style="vertical-align: middle">85.3*</td>
304
+ <td align="center" style="vertical-align: middle">77.7</td>
305
+ </tr>
306
+ <tr>
307
+ <td align="center" style="vertical-align: middle">CharXiv (RQ)</td>
308
+ <td align="center" style="vertical-align: middle">80.4</td>
309
+ <td align="center" style="vertical-align: middle">82.8*</td>
310
+ <td align="center" style="vertical-align: middle">69.1</td>
311
+ <td align="center" style="vertical-align: middle">80.2*</td>
312
+ <td align="center" style="vertical-align: middle">77.5</td>
313
+ </tr>
314
+ <tr>
315
+ <td align="center" style="vertical-align: middle">CharXiv (RQ) (w/ python)</td>
316
+ <td align="center" style="vertical-align: middle">86.7</td>
317
+ <td align="center" style="vertical-align: middle">90.0*</td>
318
+ <td align="center" style="vertical-align: middle">84.7</td>
319
+ <td align="center" style="vertical-align: middle">89.9*</td>
320
+ <td align="center" style="vertical-align: middle">78.7</td>
321
+ </tr>
322
+ <tr>
323
+ <td align="center" style="vertical-align: middle">MathVision</td>
324
+ <td align="center" style="vertical-align: middle">87.4</td>
325
+ <td align="center" style="vertical-align: middle">92.0*</td>
326
+ <td align="center" style="vertical-align: middle">71.2*</td>
327
+ <td align="center" style="vertical-align: middle">89.8*</td>
328
+ <td align="center" style="vertical-align: middle">84.2</td>
329
+ </tr>
330
+ <tr>
331
+ <td align="center" style="vertical-align: middle">MathVision (w/ python)</td>
332
+ <td align="center" style="vertical-align: middle">93.2</td>
333
+ <td align="center" style="vertical-align: middle">96.1*</td>
334
+ <td align="center" style="vertical-align: middle">84.6*</td>
335
+ <td align="center" style="vertical-align: middle">95.7*</td>
336
+ <td align="center" style="vertical-align: middle">85.0</td>
337
+ </tr>
338
+ <tr>
339
+ <td align="center" style="vertical-align: middle">BabyVision</td>
340
+ <td align="center" style="vertical-align: middle">39.8</td>
341
+ <td align="center" style="vertical-align: middle">49.7</td>
342
+ <td align="center" style="vertical-align: middle">14.8</td>
343
+ <td align="center" style="vertical-align: middle">51.6</td>
344
+ <td align="center" style="vertical-align: middle">36.5</td>
345
+ </tr>
346
+ <tr>
347
+ <td align="center" style="vertical-align: middle">BabyVision (w/ python)</td>
348
+ <td align="center" style="vertical-align: middle">68.5</td>
349
+ <td align="center" style="vertical-align: middle">80.2*</td>
350
+ <td align="center" style="vertical-align: middle">38.4*</td>
351
+ <td align="center" style="vertical-align: middle">68.3*</td>
352
+ <td align="center" style="vertical-align: middle">40.5</td>
353
+ </tr>
354
+ <tr>
355
+ <td align="center" style="vertical-align: middle">V* (w/ python)</td>
356
+ <td align="center" style="vertical-align: middle">96.9</td>
357
+ <td align="center" style="vertical-align: middle">98.4*</td>
358
+ <td align="center" style="vertical-align: middle">86.4*</td>
359
+ <td align="center" style="vertical-align: middle">96.9*</td>
360
+ <td align="center" style="vertical-align: middle">86.9</td>
361
+ </tr>
362
+ </tbody>
363
+ </table>
364
+ </div>
365
+
366
+ <details>
367
+ <summary><b>Footnotes</b></summary>
368
+
369
+ 1. **General Testing Details**
370
+ - We report results for Kimi K2.6 and Kimi K2.5 with thinking mode enabled, Claude Opus 4.6 with max effort, GPT-5.4 with xhigh reasoning effort, and Gemini 3.1 Pro with a high thinking level.
371
+ - Unless otherwise specified, all Kimi K2.6 experiments were conducted with temperature = 1.0, top-p = 1.0, and a context length of 262,144 tokens.
372
+ - Benchmarks without publicly available scores were re-evaluated under the same conditions used for Kimi K2.6 and are marked with an asterisk (`*`). Except where noted with an asterisk, all other results are cited from official reports.
373
+ 2. **Reasoning Benchmarks**
374
+ - IMO-AnswerBench scores for GPT-5.4 and Claude Opus 4.6 were obtained from [z.ai/blog/glm-5.1](https://z.ai/blog/glm-5.1).
375
+ - Humanity's Last Exam (HLE) and other reasoning tasks were evaluated with a maximum generation length of 98,304 tokens. By default, we report results on the HLE full set. For the text-only subset, Kimi K2.6 achieves 36.4% accuracy without tools and 55.5% with tools.
376
+ 3. **Tool-Augmented / Agentic Tasks**
377
+ - Kimi K2.6 was equipped with search, code-interpreter, and web-browsing tools for HLE with tools, BrowseComp, DeepSearchQA, and WideSearch.
378
+ - For HLE-Full with tools, the maximum generation length is 262,144 tokens with a per-step limit of 49,152 tokens. We employ a simple context management strategy: once the context window exceeds the threshold, only the most recent round of tool-related messages is retained.
379
+ - For BrowseComp, we report scores obtained with context management using the same discard-all strategy as Kimi K2.5 and DeepSeek-V3.2.
380
+ - For DeepSearchQA, no context management was applied to Kimi K2.6 tests, and tasks exceeding the supported context length were directly counted as failed. Scores for Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on DeepSearchQA are cited from the [Claude Opus 4.7 System Card](https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf).
381
+ - For WideSearch, we report results under the "hide tool result" context management setting. Once the context window exceeds the threshold, only the most recent round of tool-related messages is retained.
382
+ - The test system prompts are identical to those used in the [Kimi K2.5 technical report](https://arxiv.org/pdf/2602.02276).
383
+ - Claw Eval was conducted using version 1.1 with max-tokens-per-step = 16384.
384
+ - For APEX-Agents, we evaluate 452 tasks from the public 480-task release, as done by [Artificial Analysis](https://artificialanalysis.ai/evaluations/apex-agents-aa) (excluding Investment Banking Worlds 244 and 246, which have external runtime dependencies).
385
+ 4. **Coding Tasks**
386
+ - Terminal-Bench 2.0 scores were obtained with the default agent framework (Terminus-2) and the provided JSON parser, operating in preserve thinking mode.
387
+ - For the SWE-Bench series of evaluations (including Verified, Multilingual, and Pro), we used an in-house evaluation framework adapted from SWE-agent. This framework includes a minimal set of tools—bash tool, createfile tool, insert tool, view tool, strreplace tool, and submit tool.
388
+ - All reported scores for coding tasks are averaged over 10 independent runs.
389
+ 5. **Vision Benchmarks**
390
+ - Max-tokens = 98,304, averaged over three runs (avg@3).
391
+ - Settings with Python tool use max-tokens-per-step = 65,536 and max-steps = 50 for multi-step reasoning.
392
+ - MMMU-Pro follows the official protocol, preserving input order and prepending images.
393
+
394
+ </details>
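The context management strategy described in the footnotes (once the context exceeds a threshold, keep only the most recent round of tool-related messages) can be sketched roughly as follows. This is an illustrative sketch, not the evaluation harness: `trim_tool_history` and `count_tokens` are hypothetical names, and a "round" is assumed here to mean the last assistant message carrying `tool_calls` plus the tool results that follow it.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


def is_tool_related(msg: Message) -> bool:
    """A message is tool-related if it is a tool result or an assistant tool request."""
    return msg.get("role") == "tool" or bool(msg.get("tool_calls"))


def trim_tool_history(
    messages: List[Message],
    threshold: int,
    count_tokens: Callable[[Message], int],  # hypothetical token counter
) -> List[Message]:
    """Drop older tool-related messages once the total token count exceeds threshold."""
    total = sum(count_tokens(m) for m in messages)
    if total <= threshold:
        return list(messages)

    # Assumption: the most recent round starts at the last assistant tool request.
    last_round_start = None
    for i, m in enumerate(messages):
        if m.get("tool_calls"):
            last_round_start = i

    kept = []
    for i, m in enumerate(messages):
        older = last_round_start is None or i < last_round_start
        if is_tool_related(m) and older:
            continue  # discard tool traffic from earlier rounds
        kept.append(m)
    return kept
```

System and user messages are always kept in this sketch; only earlier tool requests and tool results are discarded.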
395
+
396
+
397
+ ## 4. Native INT4 Quantization
398
+ Kimi-K2.6 adopts the same native int4 quantization method as [Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking#4-native-int4-quantization).
399
+
400
+ ## 5. Deployment
401
+
402
+ > [!Note]
403
+ > The Kimi-K2.6 API is available at https://platform.moonshot.ai with OpenAI- and Anthropic-compatible interfaces. To verify that a deployment is correct, use the [Kimi Vendor Verifier](https://kimi.com/blog/kimi-vendor-verifier.html).
404
+ We currently recommend running Kimi-K2.6 on the following inference engines:
405
+ * vLLM
406
+ * SGLang
407
+ * KTransformers
408
+
409
+ Kimi-K2.6 has the same architecture as Kimi-K2.5, so the Kimi-K2.5 deployment method can be reused directly.
410
+
411
+ The version requirement for `transformers` is `>=4.57.1, <5.0.0`.
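One way to check this constraint before loading the model is a small version guard. The sketch below is illustrative only: `parse_release`, `satisfies_transformers_requirement`, and `installed_transformers_ok` are hypothetical helpers, and the comparison looks only at the numeric release segment, ignoring pre-release and local version suffixes.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '4.57.1.dev0' -> (4, 57, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad to three components so '4.57' compares like '4.57.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies_transformers_requirement(version: str) -> bool:
    """Check the documented range >=4.57.1, <5.0.0 on the release segment only."""
    return (4, 57, 1) <= parse_release(version) < (5, 0, 0)


def installed_transformers_ok() -> bool:
    """Return False if transformers is missing or outside the documented range."""
    try:
        return satisfies_transformers_requirement(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```

For strict PEP 440 semantics, a dedicated version-parsing library would be a better choice than this simplified comparison.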
412
+
413
+ Deployment examples can be found in the [Model Deployment Guide](docs/deploy_guidance.md).
414
+
415
+
416
+ ---
417
+ ## 6. Model Usage
418
+
419
+ The usage demos below demonstrate how to call our official API.
420
+
421
+ For third-party APIs deployed with vLLM or SGLang, please note that:
422
+ > [!Note]
423
+ > - Chat with video content is an experimental feature and is only supported in our official API for now.
424
+ >
425
+ > - The recommended `temperature` is `1.0` for Thinking mode and `0.6` for Instant mode.
426
+ >
427
+ > - The recommended `top_p` is `0.95`.
428
+ >
429
+ > - To use instant mode, you need to pass `{'chat_template_kwargs': {"thinking": False}}` in `extra_body`.
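The recommendations above can be collected into one helper for a vLLM or SGLang endpoint. This is a minimal sketch: `sampling_kwargs` is a hypothetical name, and the values are simply the ones listed in the note.

```python
def sampling_kwargs(thinking: bool = True) -> dict:
    """Build keyword arguments for client.chat.completions.create on vLLM/SGLang."""
    kwargs = {
        # Recommended temperature: 1.0 for Thinking mode, 0.6 for Instant mode.
        "temperature": 1.0 if thinking else 0.6,
        # Recommended top_p for both modes.
        "top_p": 0.95,
    }
    if not thinking:
        # Instant mode is selected through the chat template on vLLM/SGLang.
        kwargs["extra_body"] = {"chat_template_kwargs": {"thinking": False}}
    return kwargs
```

A call would then look like `client.chat.completions.create(model=model_name, messages=messages, **sampling_kwargs(thinking=False))`.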
430
+
431
+ ### Chat Completion
432
+
433
+ The following script shows how to call the K2.6 API in Thinking and Instant modes.
434
+
435
+ ```python
+ import openai
+
+
+ def simple_chat(client: openai.OpenAI, model_name: str):
+     messages = [
+         {'role': 'system', 'content': 'You are Kimi, an AI assistant created by Moonshot AI.'},
+         {
+             'role': 'user',
+             'content': [
+                 {'type': 'text', 'text': 'which one is bigger, 9.11 or 9.9? think carefully.'}
+             ],
+         },
+     ]
+     # First call: no extra parameters, so the model answers in Thinking mode.
+     response = client.chat.completions.create(
+         model=model_name, messages=messages, stream=False, max_tokens=4096
+     )
+     print('====== Below is reasoning content in Thinking Mode ======')
+     print(f'reasoning content: {response.choices[0].message.reasoning}')
+     print('====== Below is response in Thinking Mode ======')
+     print(f'response: {response.choices[0].message.content}')
+
+     # To use Instant mode, pass {"thinking": {"type": "disabled"}}.
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         stream=False,
+         max_tokens=4096,
+         extra_body={'thinking': {'type': 'disabled'}},  # this is for official API
+         # extra_body={'chat_template_kwargs': {"thinking": False}}  # this is for vLLM/SGLang
+     )
+     print('====== Below is response in Instant Mode ======')
+     print(f'response: {response.choices[0].message.content}')
468
+ ```
469
+
470
+
471
+ ### Chat Completion with visual content
472
+
473
+ K2.6 supports image and video input.
474
+
475
+ The following example demonstrates how to call the K2.6 API with image input:
476
+
477
+ ```python
+ import base64
+
+ import openai
+ import requests
+
+
+ def chat_with_image(client: openai.OpenAI, model_name: str):
+     # Download the image and embed it in the request as a base64 data URL.
+     url = 'https://huggingface.co/moonshotai/Kimi-K2.6/resolve/main/figures/kimi-logo.png'
+     image_base64 = base64.b64encode(requests.get(url).content).decode()
+     messages = [
+         {
+             'role': 'user',
+             'content': [
+                 {'type': 'text', 'text': 'Describe this image in detail.'},
+                 {
+                     'type': 'image_url',
+                     'image_url': {'url': f'data:image/png;base64,{image_base64}'},
+                 },
+             ],
+         }
+     ]
+
+     response = client.chat.completions.create(
+         model=model_name, messages=messages, stream=False, max_tokens=8192
+     )
+     print('====== Below is reasoning content in Thinking Mode ======')
+     print(f'reasoning content: {response.choices[0].message.reasoning}')
+     print('====== Below is response in Thinking Mode ======')
+     print(f'response: {response.choices[0].message.content}')
+
+     # Instant mode is also supported: pass {"thinking": {"type": "disabled"}}.
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         stream=False,
+         max_tokens=4096,
+         extra_body={'thinking': {'type': 'disabled'}},  # this is for official API
+         # extra_body={'chat_template_kwargs': {"thinking": False}}  # this is for vLLM/SGLang
+     )
+     print('====== Below is response in Instant Mode ======')
+     print(f'response: {response.choices[0].message.content}')
+
+     return response.choices[0].message.content
519
+ ```
520
+
521
+ The following example demonstrates how to call the K2.6 API with video input:
522
+
523
+ ```python
+ import base64
+
+ import openai
+ import requests
+
+
+ def chat_with_video(client: openai.OpenAI, model_name: str):
+     # Download the video and embed it in the request as a base64 data URL.
+     url = 'https://huggingface.co/moonshotai/Kimi-K2.6/resolve/main/figures/demo_video.mp4'
+     video_base64 = base64.b64encode(requests.get(url).content).decode()
+     messages = [
+         {
+             "role": "user",
+             "content": [
+                 {"type": "text", "text": "Describe the video in detail."},
+                 {
+                     "type": "video_url",
+                     "video_url": {"url": f"data:video/mp4;base64,{video_base64}"},
+                 },
+             ],
+         }
+     ]
+
+     response = client.chat.completions.create(model=model_name, messages=messages)
+     print('====== Below is reasoning content in Thinking Mode ======')
+     print(f'reasoning content: {response.choices[0].message.reasoning}')
+     print('====== Below is response in Thinking Mode ======')
+     print(f'response: {response.choices[0].message.content}')
+
+     # Instant mode is also supported: pass {"thinking": {"type": "disabled"}}.
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         stream=False,
+         max_tokens=4096,
+         extra_body={'thinking': {'type': 'disabled'}},  # this is for official API
+         # extra_body={'chat_template_kwargs': {"thinking": False}}  # this is for vLLM/SGLang
+     )
+     print('====== Below is response in Instant Mode ======')
+     print(f'response: {response.choices[0].message.content}')
+     return response.choices[0].message.content
562
+ ```
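Both examples above embed the media as a base64 data URL. For local files, the same message part could be built with a small helper. This is a sketch under assumptions: `to_data_url` and `media_part` are hypothetical names, and the MIME type is guessed from the file extension with the standard `mimetypes` module.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str, default_mime: str = "application/octet-stream") -> str:
    """Encode a local file as a data URL, guessing the MIME type from its extension."""
    mime, _ = mimetypes.guess_type(path)
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime or default_mime};base64,{payload}"


def media_part(path: str) -> dict:
    """Build an 'image_url' or 'video_url' content part in the shape used above."""
    url = to_data_url(path)
    kind = "video_url" if url.startswith("data:video/") else "image_url"
    return {"type": kind, kind: {"url": url}}
```

The returned dictionary can be placed in the `content` list of a user message next to the text part.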
563
+
564
+ ### Preserve Thinking
565
+ Kimi K2.6 supports `preserve_thinking` mode, which retains full reasoning content across multi-turn interactions and enhances performance in coding agent scenarios.
566
+
567
+ This feature is disabled by default. The following example demonstrates how to call the K2.6 API in `preserve_thinking` mode:
568
+
569
+ ```python
+ import openai
+
+
+ def chat_with_preserve_thinking(client: openai.OpenAI, model_name: str):
+     # The prior assistant turn carries its reasoning_content, which names five numbers.
+     messages = [
+         {
+             "role": "user",
+             "content": "Tell me three random numbers."
+         },
+         {
+             "role": "assistant",
+             "reasoning_content": "I'll start by listing five numbers: 473, 921, 235, 215, 222, and I'll tell you the first three.",
+             "content": "473, 921, 235"
+         },
+         {
+             "role": "user",
+             "content": "What are the other two numbers you have in mind?"
+         }
+     ]
+
+     response = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         stream=False,
+         max_tokens=4096,
+         # We recommend enabling preserve_thinking only in think mode.
+         extra_body={'thinking': {'type': 'enabled', 'keep': 'all'}},  # this is for official API
+         # extra_body={"chat_template_kwargs": {"thinking": True, "preserve_thinking": True}},  # this is for vLLM/SGLang
+     )
+     # The assistant should mention 215 and 222, which appear in the prior reasoning content.
+     print(f"reasoning content: {response.choices[0].message.reasoning}")
+     return response.choices[0].message.content
600
+ ```
601
+
602
+ ### Interleaved Thinking and Multi-Step Tool Call
603
+
604
+ K2.6 shares the same design of Interleaved Thinking and Multi-Step Tool Call as K2 Thinking. For a usage example, refer to the [K2 Thinking documentation](https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model#complete-example).
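For orientation, a generic multi-step tool-call loop against an OpenAI-compatible chat endpoint might look like the sketch below. It is not taken from the K2 Thinking documentation: `run_tool_loop`, `tool_impls`, and `max_steps` are hypothetical names, and the loop assumes the endpoint returns standard `tool_calls` entries on the assistant message.

```python
import json


def run_tool_loop(client, model_name, messages, tools, tool_impls, max_steps=8):
    """Call the model until it stops requesting tools, then return the final text.

    tool_impls maps a tool name to a Python callable (hypothetical registry).
    """
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model=model_name, messages=messages, tools=tools
        )
        message = response.choices[0].message
        # Append the assistant turn unchanged so its tool calls stay in context.
        messages.append(message)
        if not message.tool_calls:
            return message.content
        for call in message.tool_calls:
            # Tool arguments arrive as a JSON-encoded string.
            args = json.loads(call.function.arguments)
            result = tool_impls[call.function.name](**args)
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
            )
    raise RuntimeError("tool loop did not finish within max_steps")
```

The linked documentation remains the reference for how reasoning content should be carried between steps; this sketch only shows the control flow.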
605
+
606
+ ### Coding Agent Framework
607
+
608
+ Kimi K2.6 works best with Kimi Code CLI as its agent framework — give it a try at https://www.kimi.com/code.
609
+
610
+
611
+ ---
612
+
613
+ ## 7. License
614
+
615
+ Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).
616
+
617
+ ---
618
+
619
+ ## 8. Third Party Notices
620
+
621
+ See [THIRD PARTY NOTICES](THIRD_PARTY_NOTICES.md).
622
+
623
+ ---
624
+
625
+ ## 9. Contact Us
626
+
627
+ If you have any questions, please reach out at [support@moonshot.ai](mailto:support@moonshot.ai).