Initial upload: 2-bit MXTQ JANGTQ premium (per-importance plan)
This view is limited to 50 files because it contains too many changes.
- .gitattributes +2 -0
- DeepSeek_V4.pdf +3 -0
- LICENSE +21 -0
- README.md +145 -0
- config.json +118 -0
- encoding/README.md +156 -0
- encoding/__pycache__/encoding_dsv4.cpython-314.pyc +0 -0
- encoding/encoding_dsv4.py +744 -0
- encoding/test_encoding_dsv4.py +89 -0
- encoding/tests/test_input_1.json +81 -0
- encoding/tests/test_input_2.json +24 -0
- encoding/tests/test_input_3.json +159 -0
- encoding/tests/test_input_4.json +28 -0
- encoding/tests/test_output_1.txt +36 -0
- encoding/tests/test_output_2.txt +1 -0
- encoding/tests/test_output_3.txt +38 -0
- encoding/tests/test_output_4.txt +29 -0
- generation_config.json +9 -0
- jang_config.json +65 -0
- jangtq_runtime.safetensors +3 -0
- model-00001-of-00085.safetensors +3 -0
- model-00002-of-00085.safetensors +3 -0
- model-00003-of-00085.safetensors +3 -0
- model-00004-of-00085.safetensors +3 -0
- model-00005-of-00085.safetensors +3 -0
- model-00006-of-00085.safetensors +3 -0
- model-00007-of-00085.safetensors +3 -0
- model-00008-of-00085.safetensors +3 -0
- model-00009-of-00085.safetensors +3 -0
- model-00010-of-00085.safetensors +3 -0
- model-00011-of-00085.safetensors +3 -0
- model-00012-of-00085.safetensors +3 -0
- model-00013-of-00085.safetensors +3 -0
- model-00014-of-00085.safetensors +3 -0
- model-00015-of-00085.safetensors +3 -0
- model-00016-of-00085.safetensors +3 -0
- model-00017-of-00085.safetensors +3 -0
- model-00018-of-00085.safetensors +3 -0
- model-00019-of-00085.safetensors +3 -0
- model-00020-of-00085.safetensors +3 -0
- model-00021-of-00085.safetensors +3 -0
- model-00022-of-00085.safetensors +3 -0
- model-00023-of-00085.safetensors +3 -0
- model-00024-of-00085.safetensors +3 -0
- model-00025-of-00085.safetensors +3 -0
- model-00026-of-00085.safetensors +3 -0
- model-00027-of-00085.safetensors +3 -0
- model-00028-of-00085.safetensors +3 -0
- model-00029-of-00085.safetensors +3 -0
- model-00030-of-00085.safetensors +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.pdf filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
DeepSeek_V4.pdf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fa4a3490e2dcc03c9da61b04a8be471795e9966ebbbf292a3899fa62683a330e
size 4479901
LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 DeepSeek

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
README.md
ADDED
@@ -0,0 +1,145 @@
---
language:
- en
- zh
license: mit
library_name: mlx
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-V4-Flash
tags:
- mlx
- deepseek
- 2-bit
- moe
- jang
- jangtq
- turboquant
---

# DeepSeek-V4-Flash-JANGTQ

**TurboQuant 2-bit MXTQ codec quantization of DeepSeek-V4-Flash. 79 GB. 69.50% MMLU 200q logit @ 25.9 tok/s on M3 Ultra.**

Built with [`jang_tools`](https://github.com/jangq-ai/jang) for Apple Silicon (MLX). Verified on Mac Studio M3 Ultra.

The **canonical premium tier** in the JANG family. It uses the TurboQuant codec
(Lloyd-Max codebook + Hadamard rotation) for routed experts, with per-importance
allocation: hash-routed layers 0-2 at 4-bit MXTQ, smooth-routed layers 3-42 at
2-bit MXTQ. The result is smaller than affine-only baselines and retains
quality through the codebook and the AWQ-style importance plan.

## Recipe

| Tensor class | Bits | Codec | Notes |
|---|---|---|---|
| Routed experts (hash-routed L0-L2) | **4-bit** | MXTQ codebook | 256 experts × 3 projs × 3 layers |
| Routed experts (smooth-routed L3-L42) | **2-bit** | MXTQ codebook | 256 experts × 3 projs × 40 layers |
| Attention (`wq_a`, `wq_b`, `wkv`, `wo_a`, `wo_b`) | 8-bit | affine gs=32 | All 43 layers, uniform |
| Shared experts | 8-bit | affine gs=32 | 1 instance/layer |
| Compressor + Indexer (long-ctx) | 8-bit | affine gs=32 | Active when `VMLX_DSV4_LONG_CTX=1` |
| `embed_tokens`, `lm_head` | 8-bit | affine gs=32 | Per-token I/O |
| Norms / router gate / mHC | fp16 | passthrough | Required for runtime correctness |
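
As a rough check on the two routed-expert rows, the layer-weighted average bit width can be computed directly from the recipe. This is a back-of-the-envelope sketch, not output from `jang_tools`: it assumes every layer's routed experts hold the same number of weights and it ignores codebook and scale overhead. The names `recipe` and `average_bits` are illustrative only.

```python
# Layer-weighted average bit width for routed experts, from the recipe table.
# Assumption: each layer's routed experts have equal weight counts, and
# codebook/scale overhead is ignored.
recipe = [
    # (first_layer, last_layer, bits)
    (0, 2, 4),    # hash-routed layers at 4-bit MXTQ
    (3, 42, 2),   # smooth-routed layers at 2-bit MXTQ
]

def average_bits(plan):
    """Return the mean bits per weight across all layers in the plan."""
    total_layers = sum(last - first + 1 for first, last, _ in plan)
    weighted = sum((last - first + 1) * bits for first, last, bits in plan)
    return weighted / total_layers

print(f"{average_bits(recipe):.2f} bits per routed-expert weight")  # 2.14
```

Under these assumptions the 4-bit layers raise the routed-expert average only slightly above 2 bits, since they cover 3 of 43 layers.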

## Benchmarks

### MMLU 200q logit-mode (fair seed, PYTHONHASHSEED=42, identical questions across all bundles)

| Bundle | Size | MMLU 200q | Decode tok/s |
|---|---:|---:|---:|
| **DeepSeek-V4-Flash-JANGTQ (this)** | **79 GB** | **69.50%** | **25.91** |
| DeepSeek-V4-Flash-JANGTQ2 | 79.6 GB | 70.00% | 22.34 |
| DeepSeek-V4-Flash-JANG_2L | 107 GB | 71.50% | 23.77 |
| mlx-community/DeepSeek-V4-Flash-2bit-DQ | 90 GB | 50.00% | 36.03 |

### MMLU per-subject (200q stratified, 5 questions per subject)

```
Subject                              Score
─────────────────────────────────────────────
high_school_government_and_politics  5/5 (100%)
public_relations                     5/5 (100%)
computer_security                    5/5 (100%)
philosophy                           5/5 (100%)
high_school_us_history               5/5 (100%)
marketing                            5/5 (100%)
high_school_macroeconomics           5/5 (100%)
high_school_psychology               5/5 (100%)
prehistory                           5/5 (100%)
high_school_microeconomics           5/5 (100%)
conceptual_physics                   5/5 (100%)
nutrition                            5/5 (100%)
high_school_computer_science         4/5 (80%)
human_sexuality                      4/5 (80%)
college_medicine                     4/5 (80%)
miscellaneous                        4/5 (80%)
clinical_knowledge                   4/5 (80%)
high_school_geography                4/5 (80%)
professional_medicine                4/5 (80%)
high_school_biology                  4/5 (80%)
world_religions                      4/5 (80%)
logical_fallacies                    3/5 (60%)
security_studies                     3/5 (60%)
virology                             3/5 (60%)
high_school_chemistry                3/5 (60%)
jurisprudence                        3/5 (60%)
college_physics                      3/5 (60%)
management                           3/5 (60%)
moral_disputes                       3/5 (60%)
professional_psychology              3/5 (60%)
econometrics                         3/5 (60%)
high_school_european_history         2/5 (40%)
professional_law                     2/5 (40%)
high_school_statistics               2/5 (40%)
human_aging                          2/5 (40%)
formal_logic                         1/5 (20%)
high_school_world_history            1/5 (20%)
business_ethics                      1/5 (20%)
abstract_algebra                     1/5 (20%)
high_school_mathematics              1/5 (20%)
```
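
The per-subject scores can be tallied to confirm they agree with the 69.50% headline figure. A minimal sketch, with the bucket counts transcribed by hand from the listing (12 subjects at 5/5, 9 at 4/5, 10 at 3/5, 4 at 2/5, 5 at 1/5); the variable names are illustrative:

```python
# Number of subjects that achieved each score out of 5, counted from the listing.
subjects_by_score = {5: 12, 4: 9, 3: 10, 2: 4, 1: 5}
QUESTIONS_PER_SUBJECT = 5

num_subjects = sum(subjects_by_score.values())
total_questions = num_subjects * QUESTIONS_PER_SUBJECT
total_correct = sum(score * n for score, n in subjects_by_score.items())
accuracy = total_correct / total_questions

print(f"{total_correct}/{total_questions} = {accuracy:.2%}")  # 139/200 = 69.50%
```

The 40 subjects at 5 questions each account for all 200 questions, and the 139 correct answers match the reported score.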

### HumanEval+ pass@1

**Coming soon.** Currently running with greedy decoding (T=0.0, max_tokens=4000, seed=42)
on JANGTQ and the mlx-community 2-bit-DQ baseline. This section will be updated with both numbers.

### Decode speed (M3 Ultra, sustained 200-token decode after warmup)

| Bundle | tok/s |
|---|---:|
| **DeepSeek-V4-Flash-JANGTQ (this)** | **25.91** |
| DeepSeek-V4-Flash-JANGTQ2 | 22.34 |
| DeepSeek-V4-Flash-JANG_2L | 23.77 |
| mlx-community/DeepSeek-V4-Flash-2bit-DQ | 36.03 |

## Use

```python
import os
os.environ["JANG_WIRED_LIMIT_GB"] = "160"   # Mac Studio M3 Ultra
# Long context (optional, for >128-token attention recall):
# os.environ["VMLX_DSV4_LONG_CTX"] = "1"

import mlx.core as mx
from jang_tools.load_jangtq import load_jangtq_model
from mlx_lm.generate import generate

model, tok = load_jangtq_model("JANGQ-AI/DeepSeek-V4-Flash-JANGTQ")

text = tok.apply_chat_template(
    [{"role": "user", "content": "What is 2+2?"}],
    tokenize=False, add_generation_prompt=True,
)
print(generate(model, tok, prompt=text, max_tokens=200, verbose=True))
```

## Related bundles

- [`JANGQ-AI/DeepSeek-V4-Flash-JANGTQ2`](https://huggingface.co/JANGQ-AI/DeepSeek-V4-Flash-JANGTQ2): uniform 2-bit MXTQ baseline (no per-importance plan)
- [`JANGQ-AI/DeepSeek-V4-Flash-JANG_2L`](https://huggingface.co/JANGQ-AI/DeepSeek-V4-Flash-JANG_2L): all-affine 2-bit production (no MXTQ codec)

## Credits

Created by Jinho Jang (eric@jangq.ai)

Built on top of [DeepSeek-V4-Flash](https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash) (deepseek-ai).
config.json
ADDED
|
@@ -0,0 +1,118 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"architectures": [
|
| 3 |
+
"DeepseekV4ForCausalLM"
|
| 4 |
+
],
|
| 5 |
+
"attention_bias": false,
|
| 6 |
+
"attention_dropout": 0.0,
|
| 7 |
+
"bos_token_id": 0,
|
| 8 |
+
"eos_token_id": 1,
|
| 9 |
+
"hc_eps": 1e-06,
|
| 10 |
+
"hc_mult": 4,
|
| 11 |
+
"hc_sinkhorn_iters": 20,
|
| 12 |
+
"head_dim": 512,
|
| 13 |
+
"hidden_act": "silu",
|
| 14 |
+
"hidden_size": 4096,
|
| 15 |
+
"index_head_dim": 128,
|
| 16 |
+
"index_n_heads": 64,
|
| 17 |
+
"index_topk": 512,
|
| 18 |
+
"initializer_range": 0.02,
|
| 19 |
+
"max_position_embeddings": 1048576,
|
| 20 |
+
"model_type": "deepseek_v4",
|
| 21 |
+
"moe_intermediate_size": 2048,
|
| 22 |
+
"n_routed_experts": 256,
|
| 23 |
+
"n_shared_experts": 1,
|
| 24 |
+
"norm_topk_prob": true,
|
| 25 |
+
"num_attention_heads": 64,
|
| 26 |
+
"num_experts_per_tok": 6,
|
| 27 |
+
"num_hidden_layers": 43,
|
| 28 |
+
"num_hash_layers": 3,
|
| 29 |
+
"num_key_value_heads": 1,
|
| 30 |
+
"num_nextn_predict_layers": 1,
|
| 31 |
+
"o_groups": 8,
|
| 32 |
+
"o_lora_rank": 1024,
|
| 33 |
+
"q_lora_rank": 1024,
|
| 34 |
+
"qk_rope_head_dim": 64,
|
| 35 |
+
"rms_norm_eps": 1e-06,
|
| 36 |
+
"rope_scaling": {
|
| 37 |
+
"beta_fast": 32,
|
| 38 |
+
"beta_slow": 1,
|
| 39 |
+
"factor": 16,
|
| 40 |
+
"original_max_position_embeddings": 65536,
|
| 41 |
+
"type": "yarn"
|
| 42 |
+
},
|
| 43 |
+
"rope_theta": 10000,
|
| 44 |
+
"routed_scaling_factor": 1.5,
|
| 45 |
+
"scoring_func": "sqrtsoftplus",
|
| 46 |
+
"sliding_window": 128,
|
| 47 |
+
"swiglu_limit": 10.0,
|
| 48 |
+
"tie_word_embeddings": false,
|
| 49 |
+
"topk_method": "noaux_tc",
|
| 50 |
+
"torch_dtype": "bfloat16",
|
| 51 |
+
"transformers_version": "4.57.1",
|
| 52 |
+
"use_cache": true,
|
| 53 |
+
"vocab_size": 129280,
|
| 54 |
+
"compress_rope_theta": 160000,
|
| 55 |
+
"compress_ratios": [
|
| 56 |
+
0,
|
| 57 |
+
0,
|
| 58 |
+
4,
|
| 59 |
+
128,
|
| 60 |
+
4,
|
| 61 |
+
128,
|
| 62 |
+
4,
|
| 63 |
+
128,
|
| 64 |
+
4,
|
| 65 |
+
128,
|
| 66 |
+
4,
|
| 67 |
+
128,
|
| 68 |
+
4,
|
| 69 |
+
128,
|
| 70 |
+
4,
|
| 71 |
+
128,
|
| 72 |
+
4,
|
| 73 |
+
128,
|
| 74 |
+
4,
|
| 75 |
+
128,
|
| 76 |
+
4,
|
| 77 |
+
128,
|
| 78 |
+
4,
|
| 79 |
+
128,
|
| 80 |
+
4,
|
| 81 |
+
128,
|
| 82 |
+
4,
|
| 83 |
+
128,
|
| 84 |
+
4,
|
| 85 |
+
128,
|
| 86 |
+
4,
|
| 87 |
+
128,
|
| 88 |
+
4,
|
| 89 |
+
128,
|
| 90 |
+
4,
|
| 91 |
+
128,
|
| 92 |
+
4,
|
| 93 |
+
128,
|
| 94 |
+
4,
|
| 95 |
+
128,
|
| 96 |
+
4,
|
| 97 |
+
128,
|
| 98 |
+
4,
|
| 99 |
+
0
|
| 100 |
+
],
|
| 101 |
+
"quantization": {
|
| 102 |
+
"group_size": 32,
|
| 103 |
+
"bits": 8,
|
| 104 |
+
"mode": "affine"
|
| 105 |
+
},
|
| 106 |
+
"_name_or_path": "DSV4-Flash-JANGTQ2",
|
| 107 |
+
"routed_expert_bits": 2,
|
| 108 |
+
"group_size": 32,
|
| 109 |
+
"mxtq_seed": 42,
|
| 110 |
+
"rope_parameters": {
|
| 111 |
+
"beta_fast": 32.0,
|
| 112 |
+
"beta_slow": 1.0,
|
| 113 |
+
"factor": 16.0,
|
| 114 |
+
"original_max_position_embeddings": 65536,
|
| 115 |
+
"rope_type": "yarn",
|
| 116 |
+
"rope_theta": 10000.0
|
| 117 |
+
}
|
| 118 |
+
}
|
encoding/README.md
ADDED
@@ -0,0 +1,156 @@
# DeepSeek-V4 Encoding

This document describes the prompt encoding format used by DeepSeek-V4 series models. The encoding handles multi-turn conversations, tool calling, extended thinking (reasoning), and quick instruction tasks.

A self-contained reference implementation is provided in `encoding_dsv4.py`.

## Quick Start

```python
from encoding_dsv4 import encode_messages, parse_message_from_completion_text

# Encode a conversation
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
]
prompt = encode_messages(messages, thinking_mode="thinking")
# => "<|begin▁of▁sentence|>You are a helpful assistant.<|User|>What is 2+2?<|Assistant|><think>"

# Parse model output back to structured message
completion = "Simple arithmetic.</think>2 + 2 = 4.<|end▁of▁sentence|>"
parsed = parse_message_from_completion_text(completion, thinking_mode="thinking")
# => {"role": "assistant", "reasoning_content": "Simple arithmetic.", "content": "2 + 2 = 4.", "tool_calls": []}
```

> **Note:** The `parse_message_from_completion_text` function is designed to handle well-formatted model output only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. For production use, additional error handling is recommended.
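
To make the shape of the parsing step concrete, the sketch below splits a well-formed thinking-mode completion into reasoning and content. It is a simplified illustration and not the `parse_message_from_completion_text` implementation from `encoding_dsv4.py`: the helper name `split_completion` is hypothetical, tool calls are not parsed, and malformed output simply raises `ValueError`.

```python
EOS = "<|end▁of▁sentence|>"
THINK_END = "</think>"

def split_completion(text: str) -> dict:
    """Split 'reasoning</think>content<EOS>' into a message dict (hypothetical helper)."""
    if not text.endswith(EOS):
        raise ValueError("completion does not end with the EOS token")
    body = text[: -len(EOS)]
    if THINK_END not in body:
        raise ValueError("completion has no closing </think>")
    # Everything before the first </think> is reasoning; the rest is the reply.
    reasoning, content = body.split(THINK_END, 1)
    return {"role": "assistant", "reasoning_content": reasoning, "content": content}

print(split_completion("Simple arithmetic.</think>2 + 2 = 4.<|end▁of▁sentence|>"))
```

Raising on malformed input is one way to surface the failure cases the note mentions; a production wrapper might instead fall back to treating the whole text as content.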

## Message Format

### Special Tokens

| Token | Purpose |
|-------|---------|
| `<|begin▁of▁sentence|>` | Beginning of sequence (BOS) |
| `<|end▁of▁sentence|>` | End of assistant turn (EOS) |
| `<|User|>` | User turn prefix |
| `<|Assistant|>` | Assistant turn prefix |
| `<|latest_reminder|>` | Latest reminder (date, locale, etc.) |
| `<think>` / `</think>` | Reasoning block delimiters |
| `|DSML|` | DSML markup token |

### Roles

The encoding supports the following message roles: `system`, `user`, `assistant`, `tool`, `latest_reminder`, and `developer`.

> **Note on the `developer` role:** The `developer` role is used exclusively in the internal search agent pipeline. It is not needed for general-purpose chat or tool-calling tasks, and the official API does not accept messages with this role.

### Basic Chat

A simple multi-turn conversation is encoded as:

```
<|begin▁of▁sentence|>{system_prompt}
<|User|>{user_message}<|Assistant|></think>{response}<|end▁of▁sentence|>
<|User|>{user_message_2}<|Assistant|></think>{response_2}<|end▁of▁sentence|>
```

- The BOS token is prepended at the very beginning of the conversation.
- In **chat mode** (`thinking_mode="chat"`), `</think>` is placed right after `<|Assistant|>` to immediately close the thinking block, so the model generates content directly.
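
The chat-mode layout can be reproduced with plain string concatenation. The sketch below is an illustration only, not the reference `encode_messages`: the function name `build_chat_prompt` is hypothetical, it handles just the `system`, `user`, and `assistant` roles, and it emits no newlines between turns, matching the Quick Start output rather than the line-wrapped listing.

```python
BOS = "<|begin▁of▁sentence|>"
EOS = "<|end▁of▁sentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"
THINK_END = "</think>"

def build_chat_prompt(messages: list[dict]) -> str:
    """Encode system/user/assistant turns in chat mode (hypothetical helper)."""
    parts = [BOS]
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            parts.append(content)
        elif role == "user":
            # Chat mode: close the thinking block right after the assistant prefix.
            parts.append(USER + content + ASSISTANT + THINK_END)
        elif role == "assistant":
            parts.append(content + EOS)
        else:
            raise ValueError(f"unsupported role in this sketch: {role}")
    return "".join(parts)

prompt = build_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
])
print(prompt)
```

Because the prompt ends with `<|Assistant|></think>`, the model's next tokens are the reply itself rather than a reasoning block.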

### Interleaved Thinking Mode

In **thinking mode** (`thinking_mode="thinking"`), the model produces explicit reasoning inside `<think>...</think>` blocks before responding.

```
<|begin▁of▁sentence|>{system_prompt}
<|User|>{message}<|Assistant|><think>{reasoning}</think>{response}<|end▁of▁sentence|>
```

The `drop_thinking` parameter (default `True`) controls whether reasoning from earlier turns is preserved:

- **Without tools**: `drop_thinking` takes effect. Reasoning content from assistant turns **before** the last user message is stripped. Only the final assistant turn retains its `<think>...</think>` block.
- **With tools** (on system or developer message): `drop_thinking` is automatically disabled. All turns retain their reasoning, because tool-calling conversations require full context for the model to track multi-step reasoning across tool calls.
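
The `drop_thinking` rule can be sketched as a pass over the message list. This is a simplified illustration under stated assumptions, not the reference code: the helper name `apply_drop_thinking` is hypothetical, reasoning is assumed to live in a `reasoning_content` field (as in the Quick Start output), and tools are detected via a non-empty `tools` key on a `system` or `developer` message.

```python
import copy

def apply_drop_thinking(messages: list[dict], drop_thinking: bool = True) -> list[dict]:
    """Strip reasoning from assistant turns before the last user message (hypothetical helper)."""
    has_tools = any(
        m["role"] in ("system", "developer") and m.get("tools") for m in messages
    )
    if not drop_thinking or has_tools:
        return copy.deepcopy(messages)  # every turn keeps its reasoning

    user_positions = [i for i, m in enumerate(messages) if m["role"] == "user"]
    last_user = user_positions[-1] if user_positions else -1

    result = copy.deepcopy(messages)  # never mutate the caller's list
    for i, msg in enumerate(result):
        if msg["role"] == "assistant" and i < last_user:
            msg.pop("reasoning_content", None)
    return result

history = [
    {"role": "user", "content": "a"},
    {"role": "assistant", "reasoning_content": "r1", "content": "b"},
    {"role": "user", "content": "c"},
    {"role": "assistant", "reasoning_content": "r2", "content": "d"},
]
print(apply_drop_thinking(history))
```

In this example only the first assistant turn loses its reasoning, because it precedes the last user message.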

### Tool Calling (DSML Format)

Tools are defined on the `system` or `developer` message via the `tools` field (OpenAI-compatible format). When tools are present, the following schema block is injected into the system/user prompt:

```
## Tools

You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:

<|DSML|tool_calls>
<|DSML|invoke name="$TOOL_NAME">
<|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
...
</|DSML|invoke>
<|DSML|invoke name="$TOOL_NAME2">
...
</|DSML|invoke>
</|DSML|tool_calls>

String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.

If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.

Otherwise, output directly after </think> with tool calls or final response.

### Available Tool Schemas

{tool_definitions_json}

You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
```

An actual tool call in the assistant turn looks like:

```xml
<|DSML|tool_calls>
<|DSML|invoke name="function_name">
<|DSML|parameter name="param" string="true">string_value</|DSML|parameter>
<|DSML|parameter name="count" string="false">5</|DSML|parameter>
</|DSML|invoke>
</|DSML|tool_calls><|end▁of▁sentence|>
```

- `string="true"`: the parameter value is a raw string.
- `string="false"`: the parameter value is JSON (number, boolean, array, object).
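
The DSML layout can be generated from a list of tool calls with a few lines of string formatting. The sketch below is illustrative and not the reference encoder: `render_parameter` and `render_tool_calls` are hypothetical helpers, and each call is assumed to be a dict with a `name` and an `arguments` mapping.

```python
import json

DSML = "|DSML|"

def render_parameter(name: str, value) -> str:
    """Render one parameter; strings are emitted raw, everything else as JSON."""
    if isinstance(value, str):
        is_string, text = "true", value
    else:
        is_string, text = "false", json.dumps(value, ensure_ascii=False)
    return f'<{DSML}parameter name="{name}" string="{is_string}">{text}</{DSML}parameter>'

def render_tool_calls(calls: list[dict]) -> str:
    """Render a list of {'name': ..., 'arguments': {...}} dicts (hypothetical helper)."""
    blocks = []
    for call in calls:
        params = "\n".join(render_parameter(k, v) for k, v in call["arguments"].items())
        blocks.append(f'<{DSML}invoke name="{call["name"]}">\n{params}\n</{DSML}invoke>')
    body = "\n".join(blocks)
    return f"<{DSML}tool_calls>\n{body}\n</{DSML}tool_calls>"

rendered = render_tool_calls(
    [{"name": "function_name", "arguments": {"param": "string_value", "count": 5}}]
)
print(rendered)
```

For the example call this reproduces the tool-call block shown earlier in this section, without the trailing EOS token.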

Tool execution results are wrapped in `<tool_result>` tags within user messages:

```
<|User|><tool_result>{result_json}</tool_result><|Assistant|><think>...
```

When multiple tool results are present, they are sorted by the order of the corresponding `tool_calls` in the preceding assistant message.
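
The ordering rule can be sketched as follows. This is an assumption-laden illustration, not the reference code: it presumes OpenAI-style messages in which each entry of `tool_calls` has an `id` and each `tool` message carries the matching `tool_call_id`, and it concatenates the wrapped results without a separator. The helper name `encode_tool_results` is hypothetical.

```python
def encode_tool_results(tool_calls: list[dict], tool_messages: list[dict]) -> str:
    """Order tool results by their call's position, then wrap each in <tool_result> tags."""
    # Map each call id to its position in the preceding assistant message.
    order = {call["id"]: position for position, call in enumerate(tool_calls)}
    ordered = sorted(tool_messages, key=lambda m: order[m["tool_call_id"]])
    return "".join(f"<tool_result>{m['content']}</tool_result>" for m in ordered)

calls = [{"id": "call_a", "name": "search"}, {"id": "call_b", "name": "weather"}]
results = [
    {"role": "tool", "tool_call_id": "call_b", "content": '{"temp": 21}'},
    {"role": "tool", "tool_call_id": "call_a", "content": '{"hits": 3}'},
]
encoded = encode_tool_results(calls, results)
print(encoded)
```

Although the `weather` result arrives first, it is emitted second because its call was second in the assistant message.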

### Reasoning Effort

When `reasoning_effort="max"` is set, a special prefix is prepended at the very beginning of the prompt (before the system message) to instruct the model to maximize its reasoning depth:

```
Reasoning Effort: Absolute maximum with no shortcuts permitted.
You MUST be very thorough in your thinking and comprehensively decompose the problem to resolve the root cause, rigorously stress-testing your logic against all potential paths, edge cases, and adversarial scenarios.
Explicitly write out your entire deliberation process, documenting every intermediate step, considered alternative, and rejected hypothesis to ensure absolutely no assumption is left unchecked.
```

### Quick Instruction Special Tokens

Quick instruction tokens are used for auxiliary classification and generation tasks. They are appended to messages via the `"task"` field to trigger specialized model behavior for a single-token or short-form output.

| Special Token | Description | Format |
|:---|:---|:---|
| `<|action|>` | Determines whether the user prompt requires a web search or can be answered directly. | `...<|User|>{prompt}<|Assistant|><think><|action|>` |
| `<|title|>` | Generates a concise conversation title after the first assistant response. | `...<|Assistant|>{response}<|end▁of▁sentence|><|title|>` |
| `<|query|>` | Generates search queries for the user prompt. | `...<|User|>{prompt}<|query|>` |
| `<|authority|>` | Classifies the user prompt's demand for source authoritativeness. | `...<|User|>{prompt}<|authority|>` |
| `<|domain|>` | Identifies the domain of the user prompt. | `...<|User|>{prompt}<|domain|>` |
| `<|extracted_url|>` `<|read_url|>` | Determines whether each URL in the user prompt should be fetched and read. | `...<|User|>{prompt}<|extracted_url|>{url}<|read_url|>` |

Usage in message format:

- **`action`** on a user message: the `<|action|>` token is placed after the assistant prefix and thinking token, triggering a routing decision (e.g., "Search" or "Answer").
- **Other tasks** (`query`, `authority`, `domain`, `read_url`) on a user message: the task token is appended directly after the user content.
- **`title`** on an assistant message: the `<|title|>` token is appended after the assistant's EOS. The next assistant message provides the generated title.
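
The placement rules can be sketched as a small helper that appends a task token to an already encoded prefix. This is an illustration under assumptions, not the reference implementation: `append_task_token` is a hypothetical name, the prefix is assumed to end with the user content (or with the assistant's EOS for `title`), and `read_url` is omitted because it also needs the `<|extracted_url|>{url}` segment.

```python
TASK_TOKENS = {
    "action": "<|action|>",
    "query": "<|query|>",
    "authority": "<|authority|>",
    "domain": "<|domain|>",
    "title": "<|title|>",
}

def append_task_token(encoded_prefix: str, role: str, task: str) -> str:
    """Append a quick-instruction token to an encoded prefix (hypothetical helper)."""
    token = TASK_TOKENS[task]
    if task == "title":
        if role != "assistant":
            raise ValueError("`title` is only valid on an assistant message")
        return encoded_prefix + token
    if role != "user":
        raise ValueError(f"`{task}` is only valid on a user message")
    if task == "action":
        # Routing decision: the token follows the assistant prefix and <think>.
        return encoded_prefix + "<|Assistant|><think>" + token
    return encoded_prefix + token

routed = append_task_token("<|begin▁of▁sentence|><|User|>Weather in Paris?", "user", "action")
print(routed)
```

Only `action` inserts the assistant prefix and `<think>` before the task token; the other user-side tasks append the token directly.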
encoding/__pycache__/encoding_dsv4.cpython-314.pyc
ADDED
Binary file (33.6 kB).
encoding/encoding_dsv4.py
ADDED
|
@@ -0,0 +1,744 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
DeepSeek-V4 Encoding
|
| 3 |
+
|
| 4 |
+
A self-contained implementation for encoding/decoding DeepSeek-V4 chat messages
|
| 5 |
+
with tool calling, thinking mode, and quick instruction task support.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
from typing import Any, Dict, List, Union, Optional, Tuple
|
| 9 |
+
import copy
|
| 10 |
+
import json
|
| 11 |
+
import re
|
| 12 |
+
|
| 13 |
+
# ============================================================
|
| 14 |
+
# Special Tokens
|
| 15 |
+
# ============================================================
|
| 16 |
+
|
| 17 |
+
bos_token: str = "<|begin▁of▁sentence|>"
|
| 18 |
+
eos_token: str = "<|end▁of▁sentence|>"
|
| 19 |
+
thinking_start_token: str = "<think>"
|
| 20 |
+
thinking_end_token: str = "</think>"
|
| 21 |
+
dsml_token: str = "|DSML|"
|
| 22 |
+
|
| 23 |
+
USER_SP_TOKEN = "<|User|>"
|
| 24 |
+
ASSISTANT_SP_TOKEN = "<|Assistant|>"
|
| 25 |
+
LATEST_REMINDER_SP_TOKEN = "<|latest_reminder|>"
|
| 26 |
+
|
| 27 |
+
# Task special tokens for internal classification tasks
|
| 28 |
+
DS_TASK_SP_TOKENS = {
|
| 29 |
+
"action": "<|action|>",
|
| 30 |
+
"query": "<|query|>",
|
| 31 |
+
"authority": "<|authority|>",
|
| 32 |
+
"domain": "<|domain|>",
|
| 33 |
+
"title": "<|title|>",
|
| 34 |
+
"read_url": "<|read_url|>",
|
| 35 |
+
}
|
| 36 |
+
VALID_TASKS = set(DS_TASK_SP_TOKENS.keys())
|
| 37 |
+
|
# ============================================================
# Templates
# ============================================================

system_msg_template: str = "{content}"
user_msg_template: str = "{content}"
latest_reminder_msg_template: str = "{content}"
assistant_msg_template: str = "{reasoning}{content}{tool_calls}" + eos_token
assistant_msg_wo_eos_template: str = "{reasoning}{content}{tool_calls}"
thinking_template: str = "{reasoning_content}"

response_format_template: str = (
    "## Response Format:\n\nYou MUST strictly adhere to the following schema to reply:\n{schema}"
)
tool_call_template: str = (
    "<{dsml_token}invoke name=\"{name}\">\n{arguments}\n</{dsml_token}invoke>"
)
tool_calls_template = (
    "<{dsml_token}{tc_block_name}>\n{tool_calls}\n</{dsml_token}{tc_block_name}>"
)
tool_calls_block_name: str = "tool_calls"

tool_output_template: str = (
    "<tool_result>{content}</tool_result>"
)

REASONING_EFFORT_MAX = (
    "Reasoning Effort: Absolute maximum with no shortcuts permitted.\n"
    "You MUST be very thorough in your thinking and comprehensively decompose the problem to resolve the root cause, rigorously stress-testing your logic against all potential paths, edge cases, and adversarial scenarios.\n"
    "Explicitly write out your entire deliberation process, documenting every intermediate step, considered alternative, and rejected hypothesis to ensure absolutely no assumption is left unchecked.\n\n"
)

TOOLS_TEMPLATE = """## Tools

You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<{dsml_token}tool_calls>" block like the following:

<{dsml_token}tool_calls>
<{dsml_token}invoke name="$TOOL_NAME">
<{dsml_token}parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</{dsml_token}parameter>
...
</{dsml_token}invoke>
<{dsml_token}invoke name="$TOOL_NAME2">
...
</{dsml_token}invoke>
</{dsml_token}tool_calls>

String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.

If thinking_mode is enabled (triggered by {thinking_start_token}), you MUST output your complete reasoning inside {thinking_start_token}...{thinking_end_token} BEFORE any tool calls or final response.

Otherwise, output directly after {thinking_end_token} with tool calls or final response.

### Available Tool Schemas

{tool_schemas}

You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
"""

# ============================================================
# Utility Functions
# ============================================================

def to_json(value: Any) -> str:
    """Serialize a value to a JSON string, preferring unescaped non-ASCII output."""
    try:
        return json.dumps(value, ensure_ascii=False)
    except Exception:
        # Fall back to ASCII-escaped output if the unescaped form cannot be produced.
        return json.dumps(value, ensure_ascii=True)


def tools_from_openai_format(tools):
    """Extract function definitions from OpenAI-format tool list."""
    return [tool["function"] for tool in tools]


def tool_calls_from_openai_format(tool_calls):
    """Convert OpenAI-format tool calls to internal format."""
    return [
        {
            "name": tool_call["function"]["name"],
            "arguments": tool_call["function"]["arguments"],
        }
        for tool_call in tool_calls
    ]


def tool_calls_to_openai_format(tool_calls):
    """Convert internal tool calls to OpenAI format."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool_call["name"],
                "arguments": tool_call["arguments"],
            }
        }
        for tool_call in tool_calls
    ]


def encode_arguments_to_dsml(tool_call: Dict[str, str]) -> str:
    """
    Encode tool call arguments into DSML parameter format.

    Args:
        tool_call: Dict with "name" and "arguments" (JSON string) keys.

    Returns:
        DSML-formatted parameter string.
    """
    p_dsml_template = '<{dsml_token}parameter name="{key}" string="{is_str}">{value}</{dsml_token}parameter>'
    P_dsml_strs = []

    try:
        arguments = json.loads(tool_call["arguments"])
    except Exception:
        # Arguments that are not valid JSON are passed through as a single raw parameter.
        arguments = {"arguments": tool_call["arguments"]}

    for k, v in arguments.items():
        p_dsml_str = p_dsml_template.format(
            dsml_token=dsml_token,
            key=k,
            is_str="true" if isinstance(v, str) else "false",
            value=v if isinstance(v, str) else to_json(v),
        )
        P_dsml_strs.append(p_dsml_str)

    return "\n".join(P_dsml_strs)


def decode_dsml_to_arguments(tool_name: str, tool_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """
    Decode DSML parameters back to a tool call dict.

    Args:
        tool_name: Name of the tool.
        tool_args: Dict mapping param_name -> (value, is_string_flag).

    Returns:
        Dict with "name" and "arguments" (JSON string) keys.
    """
    def _decode_value(key: str, value: str, string: str):
        # String parameters are re-quoted as JSON; other values are already JSON text.
        if string == "true":
            value = to_json(value)
        return f"{to_json(key)}: {value}"

    tool_args_json = "{" + ", ".join([_decode_value(k, v, string=is_str) for k, (v, is_str) in tool_args.items()]) + "}"
    return dict(name=tool_name, arguments=tool_args_json)


def render_tools(tools: List[Dict[str, Union[str, Dict[str, Any]]]]) -> str:
    """
    Render tool schemas into the system prompt format.

    Args:
        tools: List of tool schema dicts (each with name, description, parameters).

    Returns:
        Formatted tools section string.
    """
    tools_json = [to_json(t) for t in tools]

    return TOOLS_TEMPLATE.format(
        tool_schemas="\n".join(tools_json),
        dsml_token=dsml_token,
        thinking_start_token=thinking_start_token,
        thinking_end_token=thinking_end_token,
    )


def find_last_user_index(messages: List[Dict[str, Any]]) -> int:
    """Find the index of the last user/developer message, or -1 if there is none."""
    last_user_index = -1
    for idx in range(len(messages) - 1, -1, -1):
        if messages[idx].get("role") in ["user", "developer"]:
            last_user_index = idx
            break
    return last_user_index


# ============================================================
# Message Rendering
# ============================================================

def render_message(index: int, messages: List[Dict[str, Any]], thinking_mode: str, drop_thinking: bool = True, reasoning_effort: Optional[str] = None) -> str:
    """
    Render a single message at the given index into its encoded string form.

    This is the core function that converts each message in the conversation
    into the DeepSeek-V4 format.

    Args:
        index: Index of the message to render.
        messages: Full list of messages in the conversation.
        thinking_mode: Either "chat" or "thinking".
        drop_thinking: Whether to drop reasoning content from earlier turns.
        reasoning_effort: Optional reasoning effort level ("max", "high", or None).

    Returns:
        Encoded string for this message.
    """
    assert 0 <= index < len(messages)
    assert thinking_mode in ["chat", "thinking"], f"Invalid thinking_mode `{thinking_mode}`"

    prompt = ""
    msg = messages[index]
    last_user_idx = find_last_user_index(messages)

    role = msg.get("role")
    content = msg.get("content")
    tools = msg.get("tools")
    response_format = msg.get("response_format")
    tool_calls = msg.get("tool_calls")
    reasoning_content = msg.get("reasoning_content")
    wo_eos = msg.get("wo_eos", False)

    if tools:
        tools = tools_from_openai_format(tools)
    if tool_calls:
        tool_calls = tool_calls_from_openai_format(tool_calls)

    # Reasoning effort prefix (only at index 0 in thinking mode with max effort)
    assert reasoning_effort in ['max', None, 'high'], f"Invalid reasoning effort: {reasoning_effort}"
    if index == 0 and thinking_mode == "thinking" and reasoning_effort == 'max':
        prompt += REASONING_EFFORT_MAX

    if role == "system":
        prompt += system_msg_template.format(content=content or "")
        if tools:
            prompt += "\n\n" + render_tools(tools)
        if response_format:
            prompt += "\n\n" + response_format_template.format(schema=to_json(response_format))

    elif role == "developer":
        assert content, f"Invalid message for role `{role}`: {msg}"

        content_developer = USER_SP_TOKEN
        content_developer += content

        if tools:
            content_developer += "\n\n" + render_tools(tools)
        if response_format:
            content_developer += "\n\n" + response_format_template.format(schema=to_json(response_format))

        prompt += user_msg_template.format(content=content_developer)

    elif role == "user":
        prompt += USER_SP_TOKEN

        # Handle content blocks (tool results mixed with text)
        content_blocks = msg.get("content_blocks")
        if content_blocks:
            parts = []
            for block in content_blocks:
                block_type = block.get("type")
                if block_type == "text":
                    parts.append(block.get("text", ""))
                elif block_type == "tool_result":
                    tool_content = block.get("content", "")
                    if isinstance(tool_content, list):
                        text_parts = []
                        for b in tool_content:
                            if b.get("type") == "text":
                                text_parts.append(b.get("text", ""))
                            else:
                                text_parts.append(f"[Unsupported {b.get('type')}]")
                        tool_content = "\n\n".join(text_parts)
                    parts.append(tool_output_template.format(content=tool_content))
                else:
                    parts.append(f"[Unsupported {block_type}]")
            prompt += "\n\n".join(parts)
        else:
            prompt += content or ""

    elif role == "latest_reminder":
        prompt += LATEST_REMINDER_SP_TOKEN + latest_reminder_msg_template.format(content=content)

    elif role == "tool":
        raise NotImplementedError("deepseek_v4 merges tool messages into user; please preprocess with merge_tool_messages()")

    elif role == "assistant":
        thinking_part = ""
        tc_content = ""

        if tool_calls:
            tc_list = [
                tool_call_template.format(
                    dsml_token=dsml_token,
                    name=tc.get("name"),
                    arguments=encode_arguments_to_dsml(tc)
                )
                for tc in tool_calls
            ]
            tc_content += '\n\n' + tool_calls_template.format(
                dsml_token=dsml_token,
                tool_calls="\n".join(tc_list),
                tc_block_name=tool_calls_block_name,
            )

        summary_content = content or ""
        rc = reasoning_content or ""

        # Check if previous message has a task - if so, this is a task output (no thinking)
        prev_has_task = index - 1 >= 0 and messages[index - 1].get("task") is not None

        if thinking_mode == "thinking" and not prev_has_task:
            if not drop_thinking or index > last_user_idx:
                thinking_part = thinking_template.format(reasoning_content=rc) + thinking_end_token
            else:
                thinking_part = ""

        if wo_eos:
            prompt += assistant_msg_wo_eos_template.format(
                reasoning=thinking_part,
                content=summary_content,
                tool_calls=tc_content,
            )
        else:
            prompt += assistant_msg_template.format(
                reasoning=thinking_part,
                content=summary_content,
                tool_calls=tc_content,
            )
    else:
        raise NotImplementedError(f"Unknown role: {role}")

    # Append transition tokens based on what follows
    if index + 1 < len(messages) and messages[index + 1].get("role") not in ["assistant", "latest_reminder"]:
        return prompt

    task = messages[index].get("task")
    if task is not None:
        # Task special token for internal classification tasks
        assert task in VALID_TASKS, f"Invalid task: '{task}'. Valid tasks are: {list(VALID_TASKS)}"
        task_sp_token = DS_TASK_SP_TOKENS[task]

        if task != "action":
            # Non-action tasks: append task sp token directly after the message
            prompt += task_sp_token
        else:
            # Action task: append Assistant + thinking token + action sp token
            prompt += ASSISTANT_SP_TOKEN
            prompt += thinking_end_token if thinking_mode != "thinking" else thinking_start_token
            prompt += task_sp_token

    elif messages[index].get("role") in ["user", "developer"]:
        # Normal generation: append Assistant + thinking token
        prompt += ASSISTANT_SP_TOKEN
        if not drop_thinking and thinking_mode == "thinking":
            prompt += thinking_start_token
        elif drop_thinking and thinking_mode == "thinking" and index >= last_user_idx:
            prompt += thinking_start_token
        else:
            prompt += thinking_end_token

    return prompt


# ============================================================
# Preprocessing
# ============================================================

def merge_tool_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """
    Merge tool messages into the preceding user message using content_blocks format.

    DeepSeek-V4 does not have a standalone "tool" role; instead, tool results
    are encoded as <tool_result> blocks within user messages.

    This function converts a standard OpenAI-format conversation (with separate
    "tool" role messages) into V4 format where tool results are merged into
    user messages.

    Args:
        messages: List of message dicts in OpenAI format.

    Returns:
        Processed message list with tool messages merged into user messages.
    """
    merged: List[Dict[str, Any]] = []

    for msg in messages:
        msg = copy.deepcopy(msg)
        role = msg.get("role")

        if role == "tool":
            # Convert tool message to a user message with tool_result block
            tool_block = {
                "type": "tool_result",
                "tool_use_id": msg.get("tool_call_id", ""),
                "content": msg.get("content", ""),
            }
            # Merge into previous message if it's already a user (merged tool)
            if merged and merged[-1].get("role") == "user" and "content_blocks" in merged[-1]:
                merged[-1]["content_blocks"].append(tool_block)
            else:
                merged.append({
                    "role": "user",
                    "content_blocks": [tool_block],
                })
        elif role == "user":
            text_block = {"type": "text", "text": msg.get("content", "")}
            if merged and merged[-1].get("role") == "user" and "content_blocks" in merged[-1] and merged[-1].get("task") is None:
                merged[-1]["content_blocks"].append(text_block)
            else:
                new_msg = {
                    "role": "user",
                    "content": msg.get("content", ""),
                    "content_blocks": [text_block],
                }
                # Preserve extra fields (task, wo_eos, mask, etc.)
                for key in ("task", "wo_eos", "mask"):
                    if key in msg:
                        new_msg[key] = msg[key]
                merged.append(new_msg)
        else:
            merged.append(msg)

    return merged


def sort_tool_results_by_call_order(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """
    Sort tool_result blocks within user messages by the order of tool_calls
    in the preceding assistant message.

    Args:
        messages: Preprocessed message list (after merge_tool_messages).

    Returns:
        Message list with sorted tool result blocks.
    """
    last_tool_call_order: Dict[str, int] = {}

    for msg in messages:
        role = msg.get("role")
        if role == "assistant" and msg.get("tool_calls"):
            last_tool_call_order = {}
            for idx, tc in enumerate(msg["tool_calls"]):
                tc_id = tc.get("id") or tc.get("function", {}).get("id", "")
                if tc_id:
                    last_tool_call_order[tc_id] = idx

        elif role == "user" and msg.get("content_blocks"):
            tool_blocks = [b for b in msg["content_blocks"] if b.get("type") == "tool_result"]
            if len(tool_blocks) > 1 and last_tool_call_order:
                sorted_blocks = sorted(
                    tool_blocks,
                    key=lambda b: last_tool_call_order.get(b.get("tool_use_id", ""), 0)
                )
                sorted_idx = 0
                new_blocks = []
                for block in msg["content_blocks"]:
                    if block.get("type") == "tool_result":
                        new_blocks.append(sorted_blocks[sorted_idx])
                        sorted_idx += 1
                    else:
                        new_blocks.append(block)
                msg["content_blocks"] = new_blocks

    return messages


# ============================================================
# Main Encoding Function
# ============================================================

def encode_messages(
    messages: List[Dict[str, Any]],
    thinking_mode: str,
    context: Optional[List[Dict[str, Any]]] = None,
    drop_thinking: bool = True,
    add_default_bos_token: bool = True,
    reasoning_effort: Optional[str] = None,
) -> str:
    """
    Encode a list of messages into the DeepSeek-V4 prompt format.

    This is the main entry point for encoding conversations. It handles:
    - BOS token insertion
    - Thinking mode with optional reasoning content dropping
    - Tool message merging into user messages
    - Multi-turn conversation context

    Args:
        messages: List of message dicts to encode.
        thinking_mode: Either "chat" or "thinking".
        context: Optional preceding context messages (already encoded prefix).
        drop_thinking: If True, drop reasoning_content from earlier assistant turns
            (only keep reasoning for messages after the last user message).
            Ignored when any message defines tools.
        add_default_bos_token: Whether to prepend BOS token at conversation start.
        reasoning_effort: Optional reasoning effort level ("max", "high", or None).

    Returns:
        The encoded prompt string.
    """
    context = context if context else []

    # Preprocess: merge tool messages and sort tool results
    messages = merge_tool_messages(messages)
    messages = sort_tool_results_by_call_order(context + messages)[len(context):]
    if context:
        context = merge_tool_messages(context)
        context = sort_tool_results_by_call_order(context)

    full_messages = context + messages

    prompt = bos_token if add_default_bos_token and len(context) == 0 else ""

    # Resolve drop_thinking: if any message has tools defined, don't drop thinking
    effective_drop_thinking = drop_thinking
    if any(m.get("tools") for m in full_messages):
        effective_drop_thinking = False

    if thinking_mode == "thinking" and effective_drop_thinking:
        full_messages = _drop_thinking_messages(full_messages)
        # After dropping, recalculate how many messages to render
        # (context may have shrunk too)
        num_to_render = len(full_messages) - len(_drop_thinking_messages(context))
        context_len = len(full_messages) - num_to_render
    else:
        num_to_render = len(messages)
        context_len = len(context)

    for idx in range(num_to_render):
        prompt += render_message(
            idx + context_len,
            full_messages,
            thinking_mode=thinking_mode,
            drop_thinking=effective_drop_thinking,
            reasoning_effort=reasoning_effort,
        )

    return prompt


def _drop_thinking_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """
    Drop reasoning_content and non-essential messages before the last user message.

    Behavior:
    - Messages with role in ["user", "system", "tool", "latest_reminder",
      "direct_search_results"] are always kept.
    - Messages at or after the last user index are always kept.
    - Assistant messages before the last user get reasoning_content removed.
    - Developer messages before the last user are dropped entirely.
    """
    last_user_idx = find_last_user_index(messages)
    result = []
    keep_roles = {"user", "system", "tool", "latest_reminder", "direct_search_results"}

    for idx, msg in enumerate(messages):
        role = msg.get("role")
        if role in keep_roles or idx >= last_user_idx:
            result.append(msg)
        elif role == "assistant":
            msg = copy.copy(msg)
            msg.pop("reasoning_content", None)
            result.append(msg)
        # developer and other roles before last_user_idx are dropped

    return result


# ============================================================
# Parsing (Decoding model output)
# ============================================================

def _read_until_stop(index: int, text: str, stop: List[str]) -> Tuple[int, str, Optional[str]]:
    """
    Read text from index until one of the stop strings is found.

    Returns:
        Tuple of (new_index, content_before_stop, matched_stop_string_or_None).
    """
    min_pos = len(text)
    matched_stop = None

    for s in stop:
        pos = text.find(s, index)
        if pos != -1 and pos < min_pos:
            min_pos = pos
            matched_stop = s

    if matched_stop:
        content = text[index:min_pos]
        return min_pos + len(matched_stop), content, matched_stop
    else:
        content = text[index:]
        return len(text), content, None


def parse_tool_calls(index: int, text: str) -> Tuple[int, Optional[str], List[Dict[str, str]]]:
    """
    Parse DSML tool calls from text starting at the given index.

    Args:
        index: Starting position in text.
        text: The full text to parse.

    Returns:
        Tuple of (new_index, last_stop_token, list_of_tool_call_dicts).
        Each tool call dict has "name" and "arguments" keys.
    """
    tool_calls: List[Dict[str, Any]] = []
    stop_token = None
    tool_calls_end_token = f"</{dsml_token}{tool_calls_block_name}>"

    while index < len(text):
        # Text between the previous tag and the next invoke/end tag must be exactly ">\n".
        index, prefix, stop_token = _read_until_stop(index, text, [f"<{dsml_token}invoke", tool_calls_end_token])
        if prefix != ">\n":
            raise ValueError(f"Tool call format error: expected '>\\n' but got '{prefix}'")

        if stop_token == tool_calls_end_token:
            break

        if stop_token is None:
            raise ValueError("Missing special token in tool calls")

        index, tool_name_content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])

        p_tool_name = re.findall(r'^\s*name="(.*?)">\n$', tool_name_content, flags=re.DOTALL)
        if len(p_tool_name) != 1:
            raise ValueError(f"Tool name format error: '{tool_name_content}'")
        tool_name = p_tool_name[0]

        tool_args: Dict[str, Tuple[str, str]] = {}
        while stop_token == f"<{dsml_token}parameter":
            index, param_content, stop_token = _read_until_stop(index, text, [f"/{dsml_token}parameter"])

            param_kv = re.findall(r'^ name="(.*?)" string="(true|false)">(.*?)<$', param_content, flags=re.DOTALL)
            if len(param_kv) != 1:
                raise ValueError(f"Parameter format error: '{param_content}'")
            param_name, string, param_value = param_kv[0]

            if param_name in tool_args:
                raise ValueError(f"Duplicate parameter name: '{param_name}'")
            tool_args[param_name] = (param_value, string)

            index, content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])
            if content != ">\n":
                raise ValueError(f"Parameter format error: expected '>\\n' but got '{content}'")

        tool_call = decode_dsml_to_arguments(tool_name=tool_name, tool_args=tool_args)
        tool_calls.append(tool_call)

    return index, stop_token, tool_calls


def parse_message_from_completion_text(text: str, thinking_mode: str) -> Dict[str, Any]:
    """
    Parse a model completion text into a structured assistant message.

    This function takes the raw text output from the model (a single assistant turn)
    and extracts:
    - reasoning_content (thinking block)
    - content (summary/response)
    - tool_calls (if any)

    NOTE: This function is designed to parse only correctly formatted strings.
    Malformed output raises ValueError (tool call syntax) or AssertionError
    (missing or unexpected special tokens).

    Args:
        text: The raw completion text (including EOS token).
        thinking_mode: Either "chat" or "thinking".

    Returns:
        Dict with keys: "role", "content", "reasoning_content", "tool_calls".
        tool_calls are in OpenAI format.
    """
    summary_content, reasoning_content, tool_calls = "", "", []
    index, stop_token = 0, None
    tool_calls_start_token = f"\n\n<{dsml_token}{tool_calls_block_name}"

    is_thinking = thinking_mode == "thinking"
    is_tool_calling = False

    if is_thinking:
        index, content_delta, stop_token = _read_until_stop(index, text, [thinking_end_token, tool_calls_start_token])
        reasoning_content = content_delta
        assert stop_token == thinking_end_token, "Invalid thinking format: missing </think>"

    index, content_delta, stop_token = _read_until_stop(index, text, [eos_token, tool_calls_start_token])
    summary_content = content_delta
    if stop_token == tool_calls_start_token:
        is_tool_calling = True
    else:
        assert stop_token == eos_token, "Invalid format: missing EOS token"

    if is_tool_calling:
        index, stop_token, tool_calls = parse_tool_calls(index, text)

        index, tool_ends_text, stop_token = _read_until_stop(index, text, [eos_token])
        assert not tool_ends_text, "Unexpected content after tool calls"

    assert len(text) == index and stop_token in [eos_token, None], "Unexpected content at end"

    for sp_token in [bos_token, eos_token, thinking_start_token, thinking_end_token, dsml_token]:
        assert sp_token not in summary_content and sp_token not in reasoning_content, \
            f"Unexpected special token '{sp_token}' in content"

    return {
        "role": "assistant",
        "content": summary_content,
        "reasoning_content": reasoning_content,
        "tool_calls": tool_calls_to_openai_format(tool_calls)
    }
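The DSML parameter format handled by `encode_arguments_to_dsml` and `decode_dsml_to_arguments` in `encoding_dsv4.py` can be illustrated with a standalone round trip. The following is a minimal sketch, not the module's implementation: the names `DSML`, `encode_params`, `decode_params`, and `PARAM_RE` are hypothetical, the token value is assumed to match `dsml_token`, and a single regular expression stands in for the module's stop-string scanner (so it does not reject duplicate parameter names or stray text between tags).

```python
import json
import re

# Assumed to match dsml_token in encoding_dsv4.py.
DSML = "|DSML|"

# One parameter element: name, string flag, and raw value text.
PARAM_RE = re.compile(
    rf'<{re.escape(DSML)}parameter name="(.*?)" string="(true|false)">'
    rf'(.*?)</{re.escape(DSML)}parameter>',
    flags=re.DOTALL,
)


def encode_params(arguments: dict) -> str:
    """Render each argument as a DSML parameter element, one per line."""
    lines = []
    for key, value in arguments.items():
        is_str = isinstance(value, str)
        flag = "true" if is_str else "false"
        # Strings are written as is; every other type is written as JSON.
        body = value if is_str else json.dumps(value, ensure_ascii=False)
        lines.append(
            f'<{DSML}parameter name="{key}" string="{flag}">{body}</{DSML}parameter>'
        )
    return "\n".join(lines)


def decode_params(dsml: str) -> dict:
    """Recover the argument dict from DSML parameter elements."""
    result = {}
    for name, flag, raw in PARAM_RE.findall(dsml):
        # string="true" values are taken literally; others are parsed as JSON.
        result[name] = raw if flag == "true" else json.loads(raw)
    return result


args = {"location": "Beijing", "days": 3, "units": ["celsius"]}
encoded = encode_params(args)
decoded = decode_params(encoded)
```

The `string` flag is what makes the round trip lossless here: without it, a string argument such as `"3"` and the number `3` would be indistinguishable once written between the tags.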
encoding/test_encoding_dsv4.py
ADDED
|
@@ -0,0 +1,89 @@
"""
Test suite for DeepSeek-V4 Encoding.

Run: python test_encoding_dsv4.py
"""

import json
import os

from encoding_dsv4 import encode_messages, parse_message_from_completion_text

TESTS_DIR = os.path.join(os.path.dirname(__file__), "tests")


def test_case_1():
    """Thinking mode with tool calls (multi-turn, tool results merged into user)."""
    # Fixtures contain non-ASCII characters, so read them explicitly as UTF-8.
    with open(os.path.join(TESTS_DIR, "test_input_1.json"), encoding="utf-8") as f:
        td = json.load(f)
    messages = td["messages"]
    messages[0]["tools"] = td["tools"]
    with open(os.path.join(TESTS_DIR, "test_output_1.txt"), encoding="utf-8") as f:
        gold = f.read()
    prompt = encode_messages(messages, thinking_mode="thinking")
    assert prompt == gold

    # Parse: assistant turn with tool call
    marker = "<|Assistant|><think>"
    first_start = prompt.find(marker) + len(marker)
    first_end = prompt.find("<|User|>", first_start)
    parsed_tc = parse_message_from_completion_text(prompt[first_start:first_end], thinking_mode="thinking")
    assert parsed_tc["reasoning_content"] == "The user wants to know the weather in Beijing. I should use the get_weather tool."
    assert parsed_tc["content"] == ""
    assert len(parsed_tc["tool_calls"]) == 1
    assert parsed_tc["tool_calls"][0]["function"]["name"] == "get_weather"
    assert json.loads(parsed_tc["tool_calls"][0]["function"]["arguments"]) == {"location": "Beijing", "unit": "celsius"}

    # Parse: final assistant turn with content
    last_start = prompt.rfind(marker) + len(marker)
    parsed_final = parse_message_from_completion_text(prompt[last_start:], thinking_mode="thinking")
    assert parsed_final["reasoning_content"] == "Got the weather data. Let me format a nice response."
    assert "22°C" in parsed_final["content"]
    assert parsed_final["tool_calls"] == []

    print(" [PASS] case 1: thinking with tools (encode + parse)")


def test_case_2():
    """Thinking mode without tools (drop_thinking removes earlier reasoning)."""
    with open(os.path.join(TESTS_DIR, "test_input_2.json"), encoding="utf-8") as f:
        messages = json.load(f)
    with open(os.path.join(TESTS_DIR, "test_output_2.txt"), encoding="utf-8") as f:
        gold = f.read()
    prompt = encode_messages(messages, thinking_mode="thinking")
    assert prompt == gold

    # Parse: last assistant turn
    marker = "<|Assistant|><think>"
| 55 |
+
last_start = prompt.rfind(marker) + len(marker)
|
| 56 |
+
parsed = parse_message_from_completion_text(prompt[last_start:], thinking_mode="thinking")
|
| 57 |
+
assert parsed["reasoning_content"] == "The user asks about the capital of France. It is Paris."
|
| 58 |
+
assert parsed["content"] == "The capital of France is Paris."
|
| 59 |
+
assert parsed["tool_calls"] == []
|
| 60 |
+
|
| 61 |
+
# Verify drop_thinking: first assistant's reasoning should be absent
|
| 62 |
+
assert "The user said hello" not in prompt
|
| 63 |
+
|
| 64 |
+
print(" [PASS] case 2: thinking without tools (encode + parse)")
|
| 65 |
+
|
| 66 |
+
|
| 67 |
+
def test_case_3():
|
| 68 |
+
"""Interleaved thinking + search (developer with tools, latest_reminder)."""
|
| 69 |
+
messages = json.load(open(os.path.join(TESTS_DIR, "test_input_3.json")))
|
| 70 |
+
gold = open(os.path.join(TESTS_DIR, "test_output_3.txt")).read()
|
| 71 |
+
assert encode_messages(messages, thinking_mode="thinking") == gold
|
| 72 |
+
print(" [PASS] case 3: interleaved thinking + search")
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
def test_case_4():
|
| 76 |
+
"""Quick instruction task with latest_reminder (chat mode, action task)."""
|
| 77 |
+
messages = json.load(open(os.path.join(TESTS_DIR, "test_input_4.json")))
|
| 78 |
+
gold = open(os.path.join(TESTS_DIR, "test_output_4.txt")).read()
|
| 79 |
+
assert encode_messages(messages, thinking_mode="chat") == gold
|
| 80 |
+
print(" [PASS] case 4: quick instruction task")
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
if __name__ == "__main__":
|
| 84 |
+
print("Running DeepSeek-V4 Encoding Tests...\n")
|
| 85 |
+
test_case_1()
|
| 86 |
+
test_case_2()
|
| 87 |
+
test_case_3()
|
| 88 |
+
test_case_4()
|
| 89 |
+
print("\nAll 4 tests passed!")
|
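The parse checks in cases 1 and 2 recover an assistant completion by slicing the encoded prompt between an assistant marker and the next user token. A minimal sketch of that slicing as a standalone helper follows. The name `assistant_completions` is hypothetical and is not part of `encoding_dsv4`; the token spellings are copied from this page's rendering, while jang_config.json lists the role tokens with U+FF5C bars, so pass whichever spelling your text actually contains.

```python
def assistant_completions(prompt, marker="<|Assistant|><think>", user_token="<|User|>"):
    """Hypothetical helper: return the text after each assistant marker,
    cut at the next user token (or at the end of the prompt)."""
    completions = []
    pos = prompt.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = prompt.find(user_token, start)
        if end == -1:
            end = len(prompt)
        completions.append(prompt[start:end])
        pos = prompt.find(marker, end)
    return completions


# Example string taken from test_output_2.txt on this page.
example = (
    "<|begin▁of▁sentence|>You are a helpful assistant.<|User|>Hello"
    "<|Assistant|></think>Hi there! How can I help you?<|end▁of▁sentence|>"
    "<|User|>What is the capital of France?"
    "<|Assistant|><think>The user asks about the capital of France. It is Paris."
    "</think>The capital of France is Paris.<|end▁of▁sentence|>"
)
turns = assistant_completions(example)
print(len(turns))
```

In this example the first assistant turn had its reasoning dropped and is encoded as `<|Assistant|></think>`, so the default marker skips it and only the final turn is returned.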
encoding/tests/test_input_1.json
ADDED
|
@@ -0,0 +1,81 @@
+{
+  "tools": [
+    {
+      "type": "function",
+      "function": {
+        "name": "get_weather",
+        "description": "Get the weather for a specific location",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "location": {
+              "type": "string",
+              "description": "The city name"
+            },
+            "unit": {
+              "type": "string",
+              "enum": ["celsius", "fahrenheit"],
+              "description": "Temperature unit"
+            }
+          },
+          "required": ["location"]
+        }
+      }
+    },
+    {
+      "type": "function",
+      "function": {
+        "name": "search",
+        "description": "Search the web for information",
+        "parameters": {
+          "type": "object",
+          "properties": {
+            "query": {
+              "type": "string",
+              "description": "Search query"
+            },
+            "num_results": {
+              "type": "integer",
+              "description": "Number of results to return"
+            }
+          },
+          "required": ["query"]
+        }
+      }
+    }
+  ],
+  "messages": [
+    {
+      "role": "system",
+      "content": "You are a helpful assistant."
+    },
+    {
+      "role": "user",
+      "content": "What's the weather in Beijing?"
+    },
+    {
+      "role": "assistant",
+      "reasoning_content": "The user wants to know the weather in Beijing. I should use the get_weather tool.",
+      "tool_calls": [
+        {
+          "id": "call_001",
+          "type": "function",
+          "function": {
+            "name": "get_weather",
+            "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}"
+          }
+        }
+      ]
+    },
+    {
+      "role": "tool",
+      "tool_call_id": "call_001",
+      "content": "{\"temperature\": 22, \"condition\": \"sunny\", \"humidity\": 45}"
+    },
+    {
+      "role": "assistant",
+      "reasoning_content": "Got the weather data. Let me format a nice response.",
+      "content": "The weather in Beijing is currently sunny with a temperature of 22°C and 45% humidity."
+    }
+  ]
+}
encoding/tests/test_input_2.json
ADDED
|
@@ -0,0 +1,24 @@
+[
+  {
+    "role": "system",
+    "content": "You are a helpful assistant."
+  },
+  {
+    "role": "user",
+    "content": "Hello"
+  },
+  {
+    "role": "assistant",
+    "reasoning_content": "The user said hello, I should greet back.",
+    "content": "Hi there! How can I help you?"
+  },
+  {
+    "role": "user",
+    "content": "What is the capital of France?"
+  },
+  {
+    "role": "assistant",
+    "reasoning_content": "The user asks about the capital of France. It is Paris.",
+    "content": "The capital of France is Paris."
+  }
+]
encoding/tests/test_input_3.json
ADDED
|
@@ -0,0 +1,159 @@
+[
+  {
+    "role": "system",
+    "content": "该助手为DeepSeek,由深度求索公司创造。"
+  },
+  {
+    "role": "latest_reminder",
+    "content": "2026-02-21,星期六,广州,App,中文"
+  },
+  {
+    "role": "developer",
+    "content": "小柴胡冲剂和布洛芬能一起吃吗?\n\nCITATION FORMAT: 【{cursor_id}†L{start_line_id}(-L{end_line_id})?】",
+    "tools": [
+      {
+        "type": "function",
+        "function": {
+          "name": "search",
+          "description": "Web search. Split multiple queries with '||'.",
+          "parameters": {
+            "type": "object",
+            "properties": {
+              "queries": {
+                "type": "string",
+                "description": "query1||query2"
+              }
+            },
+            "required": [
+              "queries"
+            ],
+            "additionalProperties": false,
+            "$schema": "http://json-schema.org/draft-07/schema#"
+          }
+        }
+      },
+      {
+        "type": "function",
+        "function": {
+          "name": "open",
+          "description": "Batch open IDs (format 【{id}†...】) or URLs.",
+          "parameters": {
+            "type": "object",
+            "properties": {
+              "open_list": {
+                "type": "array",
+                "items": {
+                  "type": "object",
+                  "properties": {
+                    "id": {
+                      "description": "ID or URL",
+                      "anyOf": [
+                        {
+                          "type": "integer"
+                        },
+                        {
+                          "type": "string"
+                        }
+                      ],
+                      "default": -1
+                    },
+                    "cursor": {
+                      "type": "integer",
+                      "description": "",
+                      "default": -1
+                    },
+                    "loc": {
+                      "type": "integer",
+                      "description": "Start line",
+                      "default": -1
+                    },
+                    "num_lines": {
+                      "type": "integer",
+                      "description": "",
+                      "default": -1
+                    },
+                    "view_source": {
+                      "type": "boolean",
+                      "description": "",
+                      "default": false
+                    }
+                  },
+                  "additionalProperties": false
+                },
+                "description": ""
+              }
+            },
+            "required": [
+              "open_list"
+            ],
+            "additionalProperties": false,
+            "$schema": "http://json-schema.org/draft-07/schema#"
+          }
+        }
+      },
+      {
+        "type": "function",
+        "function": {
+          "name": "find",
+          "description": "Find exact text pattern in pages.",
+          "parameters": {
+            "type": "object",
+            "properties": {
+              "find_list": {
+                "type": "array",
+                "items": {
+                  "type": "object",
+                  "properties": {
+                    "pattern": {
+                      "type": "string",
+                      "description": ""
+                    },
+                    "cursor": {
+                      "type": "integer",
+                      "description": "",
+                      "default": -1
+                    }
+                  },
+                  "required": [
+                    "pattern"
+                  ],
+                  "additionalProperties": false
+                },
+                "description": ""
+              }
+            },
+            "required": [
+              "find_list"
+            ],
+            "additionalProperties": false,
+            "$schema": "http://json-schema.org/draft-07/schema#"
+          }
+        }
+      }
+    ]
+  },
+  {
+    "role": "assistant",
+    "content": "",
+    "reasoning_content": "用户想知道小柴胡冲剂和布洛芬能否一起服用。",
+    "tool_calls": [
+      {
+        "type": "function",
+        "function": {
+          "name": "search",
+          "arguments": "{\"queries\": \"小柴胡冲剂 布洛芬 相互作用 一起吃\"}"
+        }
+      }
+    ]
+  },
+  {
+    "role": "tool",
+    "content": "[0]"
+  },
+  {
+    "role": "assistant",
+    "content": "请及时就医。",
+    "reasoning_content": "现在开始组织回答。",
+    "tool_calls": []
+  }
+]
encoding/tests/test_input_4.json
ADDED
|
@@ -0,0 +1,28 @@
+[
+  {
+    "role": "system",
+    "content": "该助手为DeepSeek-V3,由深度求索公司创造。\n今天是2025年10月17日,星期五。"
+  },
+  {
+    "role": "latest_reminder",
+    "content": "2024-11-15,上海市,App,中文"
+  },
+  {
+    "role": "user",
+    "content": "热海大滚锅是世界著名温泉吗"
+  },
+  {
+    "role": "assistant",
+    "content": "关于热海大滚锅是否是世界著名温泉,可以这样总结:它在**中国乃至全球的地热奇观中占有重要地位**,但“世界著名”的称号更侧重于它作为**独特的地质现象和旅游景点**,而非普遍的温泉疗养体验。\n\n为了让你快速了解,我整理了一个简要的表格:\n\n| 维度 | 热海大滚锅的地位与特点 |\n| :--- | :--- |\n| **地质奇观** | **世界罕见**的高温地热系统,泉眼多、水温高、形态多样。 |\n| **国内声誉** | **中国三大地热区之一**,**国家5A级旅游景区**,自古闻名(徐霞客曾游历并记载)。 |\n| **国际知名度** | 在一些旅游平台被提及为“世界六大温泉”之一,但此说法流传不广,其国际声誉更多建立在地质独特性上。 |\n| **核心体验** | **观赏地热奇观**(如97℃沸腾的“大滚锅”)、**体验温泉煮鸡蛋**。 |\n\n### 💡 游玩攻略与温馨提示\n\n如果你计划前往热海大滚锅,这里有一些实用信息供你参考:\n\n- **门票与开放时间**:\n - **门票**:景区门票约为**50元/人**。如果选择包含温泉沐浴的套餐,价格会更高,例如约**288元**。\n - **开放时间**:景区一般**08:00-18:00**开放,但具体时间可能变动,建议提前核实。\n\n- **特色体验**:\n - **温泉煮鸡蛋**:这几乎是必试项目。可以在景区门口购买用草绳串起的生鸡蛋(约5-8元/串),然后到“大滚锅”旁的指定区域蒸煮,几分钟便可熟食,趣味十足。\n - **金汤足浴**:可以直接用从“大滚锅”流出的温泉水泡脚,缓解旅途疲劳。\n\n- **注意事项**:\n - **安全第一**:“大滚锅”水温极高,务必遵守游览规则,在指定区域内观赏,切勿随意触碰泉水。\n - **规划行程**:建议为热海景区预留**3-4小时**的游览时间。景区内步道不走回头路,出入口有观光车接送。\n\n希望这些信息能帮助你更好地了解热海大滚锅。如果你对腾冲的其他景点或者行程规划有更多疑问,我很乐意提供进一步的信息。",
+    "mask": 1
+  },
+  {
+    "role": "user",
+    "content": "世界著名温泉有哪些",
+    "task": "action"
+  },
+  {
+    "role": "assistant",
+    "content": "Search"
+  }
+]
encoding/tests/test_output_1.txt
ADDED
|
@@ -0,0 +1,36 @@
+<|begin▁of▁sentence|>You are a helpful assistant.
+
+## Tools
+
+You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
+
+<|DSML|tool_calls>
+<|DSML|invoke name="$TOOL_NAME">
+<|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
+...
+</|DSML|invoke>
+<|DSML|invoke name="$TOOL_NAME2">
+...
+</|DSML|invoke>
+</|DSML|tool_calls>
+
+String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
+
+If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
+
+Otherwise, output directly after </think> with tool calls or final response.
+
+### Available Tool Schemas
+
+{"name": "get_weather", "description": "Get the weather for a specific location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city name"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}}, "required": ["location"]}}
+{"name": "search", "description": "Search the web for information", "parameters": {"type": "object", "properties": {"query": {"type": "string", "description": "Search query"}, "num_results": {"type": "integer", "description": "Number of results to return"}}, "required": ["query"]}}
+
+You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
+<|User|>What's the weather in Beijing?<|Assistant|><think>The user wants to know the weather in Beijing. I should use the get_weather tool.</think>
+
+<|DSML|tool_calls>
+<|DSML|invoke name="get_weather">
+<|DSML|parameter name="location" string="true">Beijing</|DSML|parameter>
+<|DSML|parameter name="unit" string="true">celsius</|DSML|parameter>
+</|DSML|invoke>
+</|DSML|tool_calls><|end▁of▁sentence|><|User|><tool_result>{"temperature": 22, "condition": "sunny", "humidity": 45}</tool_result><|Assistant|><think>Got the weather data. Let me format a nice response.</think>The weather in Beijing is currently sunny with a temperature of 22°C and 45% humidity.<|end▁of▁sentence|>
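The tool prompt in this gold output states the DSML convention: string parameters are written as is with `string="true"`, and every other type is written as JSON with `string="false"`. A minimal sketch of reading such a block back into OpenAI-style tool calls follows. The function `parse_dsml_tool_calls` is a hypothetical illustration and is not the parser shipped in `encoding_dsv4.py`; the default token spelling is copied from this page's rendering, while jang_config.json lists the DSML token with U+FF5C bars, so pass the spelling your text actually uses.

```python
import json
import re


def parse_dsml_tool_calls(text, dsml_token="|DSML|"):
    """Hypothetical sketch: extract invoke/parameter blocks from DSML text."""
    tok = re.escape(dsml_token)
    invoke_re = re.compile(
        rf'<{tok}invoke name="([^"]+)">(.*?)</{tok}invoke>', re.DOTALL)
    param_re = re.compile(
        rf'<{tok}parameter name="([^"]+)" string="(true|false)">(.*?)</{tok}parameter>',
        re.DOTALL)
    calls = []
    for name, body in invoke_re.findall(text):
        args = {}
        for key, is_string, raw in param_re.findall(body):
            # string="true" keeps the raw text; string="false" is JSON-encoded.
            args[key] = raw if is_string == "true" else json.loads(raw)
        calls.append({
            "type": "function",
            "function": {
                "name": name,
                "arguments": json.dumps(args, ensure_ascii=False),
            },
        })
    return calls


# Tool-call block copied from the gold output above, plus one made-up
# string="false" parameter to exercise the JSON branch.
sample = (
    '<|DSML|tool_calls>\n'
    '<|DSML|invoke name="get_weather">\n'
    '<|DSML|parameter name="location" string="true">Beijing</|DSML|parameter>\n'
    '<|DSML|parameter name="unit" string="true">celsius</|DSML|parameter>\n'
    '</|DSML|invoke>\n'
    '<|DSML|invoke name="search">\n'
    '<|DSML|parameter name="query" string="true">Beijing weather</|DSML|parameter>\n'
    '<|DSML|parameter name="num_results" string="false">3</|DSML|parameter>\n'
    '</|DSML|invoke>\n'
    '</|DSML|tool_calls>'
)
calls = parse_dsml_tool_calls(sample)
print([c["function"]["name"] for c in calls])
```

The sketch does no validation against the tool schemas and assumes well-formed blocks; the real parser additionally asserts that nothing follows the tool-call block.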
encoding/tests/test_output_2.txt
ADDED
|
@@ -0,0 +1 @@
+<|begin▁of▁sentence|>You are a helpful assistant.<|User|>Hello<|Assistant|></think>Hi there! How can I help you?<|end▁of▁sentence|><|User|>What is the capital of France?<|Assistant|><think>The user asks about the capital of France. It is Paris.</think>The capital of France is Paris.<|end▁of▁sentence|>
encoding/tests/test_output_3.txt
ADDED
|
@@ -0,0 +1,38 @@
+<|begin▁of▁sentence|>该助手为DeepSeek,由深度求索公司创造。<|latest_reminder|>2026-02-21,星期六,广州,App,中文<|User|>小柴胡冲剂和布洛芬能一起吃吗?
+
+CITATION FORMAT: 【{cursor_id}†L{start_line_id}(-L{end_line_id})?】
+
+## Tools
+
+You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
+
+<|DSML|tool_calls>
+<|DSML|invoke name="$TOOL_NAME">
+<|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
+...
+</|DSML|invoke>
+<|DSML|invoke name="$TOOL_NAME2">
+...
+</|DSML|invoke>
+</|DSML|tool_calls>
+
+String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
+
+If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
+
+Otherwise, output directly after </think> with tool calls or final response.
+
+### Available Tool Schemas
+
+{"name": "search", "description": "Web search. Split multiple queries with '||'.", "parameters": {"type": "object", "properties": {"queries": {"type": "string", "description": "query1||query2"}}, "required": ["queries"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
+{"name": "open", "description": "Batch open IDs (format 【{id}†...】) or URLs.", "parameters": {"type": "object", "properties": {"open_list": {"type": "array", "items": {"type": "object", "properties": {"id": {"description": "ID or URL", "anyOf": [{"type": "integer"}, {"type": "string"}], "default": -1}, "cursor": {"type": "integer", "description": "", "default": -1}, "loc": {"type": "integer", "description": "Start line", "default": -1}, "num_lines": {"type": "integer", "description": "", "default": -1}, "view_source": {"type": "boolean", "description": "", "default": false}}, "additionalProperties": false}, "description": ""}}, "required": ["open_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
+{"name": "find", "description": "Find exact text pattern in pages.", "parameters": {"type": "object", "properties": {"find_list": {"type": "array", "items": {"type": "object", "properties": {"pattern": {"type": "string", "description": ""}, "cursor": {"type": "integer", "description": "", "default": -1}}, "required": ["pattern"], "additionalProperties": false}, "description": ""}}, "required": ["find_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
+
+You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
+<|Assistant|><think>用户想知道小柴胡冲剂和布洛芬能否一起服用。</think>
+
+<|DSML|tool_calls>
+<|DSML|invoke name="search">
+<|DSML|parameter name="queries" string="true">小柴胡冲剂 布洛芬 相互作用 一起吃</|DSML|parameter>
+</|DSML|invoke>
+</|DSML|tool_calls><|end▁of▁sentence|><|User|><tool_result>[0]</tool_result><|Assistant|><think>现在开始组织回答。</think>请及时就医。<|end▁of▁sentence|>
encoding/tests/test_output_4.txt
ADDED
|
@@ -0,0 +1,29 @@
+<|begin▁of▁sentence|>该助手为DeepSeek-V3,由深度求索公司创造。
+今天是2025年10月17日,星期五。<|latest_reminder|>2024-11-15,上海市,App,中文<|User|>热海大滚锅是世界著名温泉吗<|Assistant|></think>关于热海大滚锅是否是世界著名温泉,可以这样总结:它在**中国乃至全球的地热奇观中占有重要地位**,但“世界著名”的称号更侧重于它作为**独特的地质现象和旅游景点**,而非普遍的温泉疗养体验。
+
+为了让你快速了解,我整理了一个简要的表格:
+
+| 维度 | 热海大滚锅的地位与特点 |
+| :--- | :--- |
+| **地质奇观** | **世界罕见**的高温地热系统,泉眼多、水温高、形态多样。 |
+| **国内声誉** | **中国三大地热区之一**,**国家5A级旅游景区**,自古闻名(徐霞客曾游历并记载)。 |
+| **国际知名度** | 在一些旅游平台被提及为“世界六大温泉”之一,但此说法流传不广,其国际声誉更多建立在地质独特性上。 |
+| **核心体验** | **观赏地热奇观**(如97℃沸腾的“大滚锅”)、**体验温泉煮鸡蛋**。 |
+
+### 💡 游玩攻略与温馨提示
+
+如果你计划前往热海大滚锅,这里有一些实用信息供你参考:
+
+- **门票与开放时间**:
+ - **门票**:景区门票约为**50元/人**。如果选择包含温泉沐浴的套餐,价格会更高,例如约**288元**。
+ - **开放时间**:景区一般**08:00-18:00**开放,但具体时间可能变动,建议提前核实。
+
+- **特色体验**:
+ - **温泉煮鸡蛋**:这几乎是必试项目。可以在景区门口购买用草绳串起的生鸡蛋(约5-8元/串),然后到“大滚锅”旁的指定区域蒸煮,几分钟便可熟食,趣味十足。
+ - **金汤足浴**:可以直接用从“大滚锅”流出的温泉水泡脚,缓解旅途疲劳。
+
+- **注意事项**:
+ - **安全第一**:“大滚锅”水温极高,务必遵守游览规则,在指定区域内观赏,切勿随意触碰泉水。
+ - **规划行程**:建议为热海景区预留**3-4小时**的游览时间。景区内步道不走回头路,出入口有观光车接送。
+
+希望这些信息能帮助你更好地了解热海大滚锅。如果你对腾冲的其他景点或者行程规划有更多疑问,我很乐意提供进一步的信息。<|end▁of▁sentence|><|User|>世界著名温泉有哪些<|Assistant|></think><|action|>Search<|end▁of▁sentence|>
generation_config.json
ADDED
|
@@ -0,0 +1,9 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 0,
+  "eos_token_id": 1,
+  "do_sample": true,
+  "temperature": 1.0,
+  "top_p": 1.0,
+  "transformers_version": "4.46.3"
+}
jang_config.json
ADDED
|
@@ -0,0 +1,65 @@
+{
+  "weight_format": "mxtq",
+  "profile": "JANGTQ2",
+  "mxtq_seed": 42,
+  "source_model": "/Users/eric/sources/DeepSeek-V4-Flash",
+  "source_config": {
+    "n_routed_experts": 256,
+    "num_hidden_layers": 43,
+    "n_hash_layers": 3
+  },
+  "mxtq_bits": {
+    "routed_expert": 2,
+    "attention": 8,
+    "shared_expert": 8,
+    "embed_tokens": 8,
+    "lm_head": 8,
+    "norms_router_hc": 16
+  },
+  "model_family": "deepseek_v4",
+  "chat": {
+    "encoder": "encoding_dsv4",
+    "encoder_fn": "encode_messages",
+    "chat_template_source": "builtin_encoding_module",
+    "has_tokenizer_chat_template": false,
+    "bos_token": "<\uff5cbegin\u2581of\u2581sentence\uff5c>",
+    "eos_token": "<\uff5cend\u2581of\u2581sentence\uff5c>",
+    "bos_token_id": 0,
+    "eos_token_id": 1,
+    "role_tokens": {
+      "user": "<\uff5cUser\uff5c>",
+      "assistant": "<\uff5cAssistant\uff5c>",
+      "latest_reminder": "<\uff5clatest_reminder\uff5c>"
+    },
+    "reasoning": {
+      "supported": true,
+      "modes": [
+        "chat",
+        "thinking"
+      ],
+      "default_mode": "chat",
+      "thinking_start": "<think>",
+      "thinking_end": "</think>",
+      "reasoning_effort_levels": [
+        "max",
+        "high",
+        null
+      ],
+      "drop_earlier_reasoning": true
+    },
+    "tool_calling": {
+      "supported": true,
+      "parser": "dsml",
+      "dsml_token": "\uff5cDSML\uff5c",
+      "tool_calls_block": "tool_calls",
+      "invoke_block": "invoke",
+      "parameter_block": "parameter",
+      "tool_output_tag": "tool_result"
+    },
+    "sampling_defaults": {
+      "temperature": 0.6,
+      "top_p": 0.95,
+      "max_new_tokens": 300
+    }
+  }
+}
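The `mxtq_bits` block assigns a bit width per tensor category: 2 bits for routed experts, 8 bits for attention, shared experts, embeddings and the LM head, and 16 bits for norms and router tensors. A rough sketch of how such a plan translates into an average bit width follows. The bit widths are copied from the config above; the parameter shares are made-up illustrative numbers, not measured from this checkpoint, and `average_bits` is a hypothetical helper that ignores any per-block scale or codebook overhead.

```python
# Bit widths copied from the "mxtq_bits" block of jang_config.json.
MXTQ_BITS = {
    "routed_expert": 2,
    "attention": 8,
    "shared_expert": 8,
    "embed_tokens": 8,
    "lm_head": 8,
    "norms_router_hc": 16,
}


def average_bits(param_share, bits=MXTQ_BITS):
    """Weighted mean bit width for a mapping of category -> parameter share."""
    total = sum(param_share.values())
    weighted = sum(share * bits[name] for name, share in param_share.items())
    return weighted / total


# HYPOTHETICAL shares (percent of all parameters), for illustration only.
example_share = {
    "routed_expert": 95.0,
    "attention": 3.0,
    "shared_expert": 1.0,
    "embed_tokens": 0.5,
    "lm_head": 0.5,
}
avg = average_bits(example_share)
print(f"average bits per weight: {avg:.2f}")
```

Under these assumed shares the average stays close to the routed-expert width, which is the point of a per-importance plan in a mixture-of-experts model: the small, sensitive tensors keep higher precision at little cost to total size.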
jangtq_runtime.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e6a086cddc8b5765125dc16ab8399e9bc1a972e943f2dd3b7770944e3cc88ab
+size 25176
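Each `.safetensors` entry in this commit is a Git LFS pointer rather than the tensor data itself: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading one pointer follows; `parse_lfs_pointer` is a hypothetical helper name, and the sample text is the `jangtq_runtime.safetensors` pointer shown above.

```python
def parse_lfs_pointer(text):
    """Hypothetical helper: parse a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:5e6a086cddc8b5765125dc16ab8399e9bc1a972e943f2dd3b7770944e3cc88ab
size 25176
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["oid_algo"], pointer["size"])
```

Summing the `size` field over all shard pointers gives the download size of the checkpoint without fetching any LFS object.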
model-00001-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27584ef2dd716c620cbea5f9257211abfaba60169714ad7f0f65b224e848e577
+size 1002613172
model-00002-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:50b5ee67ad5b0e9ab0728ee43c8c0519b05adf8dcadf816f4c66ec534c6bf71d
+size 1003824028
model-00003-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bb6251d552b13f3decc35d5500131ef4e2739c2dd6553a392ea0e4777d0fa809
+size 1003819660
model-00004-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0c096cff94ce97fff2e508b66a96c95db92aef89bf9dddaccd60adcae96dd785
+size 1001781820
model-00005-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd386a7e3ae69f2f1e55aae1c155c5404651b14630ac6fce2720dde779369124
+size 1003819940
model-00006-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b0b4f08b756b5f44bc122d001e4620c661757fa060108693568b3d82afa52592
+size 1003823964
model-00007-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e64c3674f8065f97903263c4db62bb382b8bd8fd75a957b29dfc34cdca4952fb
+size 1000110208
model-00008-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:382d7dee49cd7edb64117e325dd31dc4815023de0a2741d8061f6aa2bb29d0f5
+size 1002182192
model-00009-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d29dbc557d273735d706e968793ae8aeabfcba53aa34af6e9ca79bf445b868b1
+size 1003824028
model-00010-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aeeb5b4212e01e4db52f10ad268d1dcd5396871d582a3f925846d2d20dac8480
+size 1003823500
model-00011-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:abd3450f3aaa9fbf10cdbee071445a4373a1adb40af196f588f1f7a5b9e956f4
+size 1000153024
model-00012-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5b61e4b3e9b6a152cbb348bc670ced5d1301ad1253f958b6a7116efeec9d9ba
+size 1003054812
model-00013-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:075585e5f46bdf2d7cf39fa0a310f3e31a557f1289d1475f8279617ca1a48a05
+size 1001945680
model-00014-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:000cb5c9ebbb3e2605cf72f72b5bb026a9cb5753c156855a56d3e232daac4b51
+size 1000621056
model-00015-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b6da6956aada1fb13a0e5250740a29bb0581a5d3f1cfa98ff49da631ff8a5a2
+size 1001000408
model-00016-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ab2ac5aa9d0b6f525c753c5a86041ac66c3a4d1aca7b157afd5dc7cb27c5c25
+size 1001898600
model-00017-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d727e16e469b72efd69aae758e1b8a01e9ff5067d29ded21fb72dfe3ae87aaa4
+size 1001000000
model-00018-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9ffe810bb8b4d23dcf5b6a635b320ba2a2ee76e04412ca9ad595f7416dfa7929
+size 1000626020
model-00019-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:46c89ebbd5688e4bc4aefa44ada1bc4c9db6f173d6c3051493e6047ed4104a25
+size 1001898048
model-00020-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:24c525e447a679e29c40ed0e52620041fbf31f4aa5eadf64619a54b5dc8a2fef
+size 1001000552
model-00021-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e179cc045cb631dfa2e5d0f5ad9faa9d77f76e3a984ae6b04f3a48678534801b
+size 1000621056
model-00022-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:87cc05e846e5fa6a6bada64b0114af123820ac50c98f7f1f54ef5ac95db9b5af
+size 1001000200
model-00023-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:af1f2723b697ba5db50621fcd81ab4879961aef21246985307b6b1a38416898f
+size 1001899336
model-00024-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f01d67158132a8c4a20d49ed9ab8dca6795dd669cfc572d046473d58c82422df
+size 1000701240
model-00025-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5d1c82848fb41c0b7a75fd998d896766d31265ad624361f57ca8e56cf99f6b52
+size 1000927380
model-00026-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3dd910ce974b4c3194504d4caf554b37f6de19d4341fbf45a6f811d023dd4e9f
+size 1001899520
model-00027-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:52b21f36dc12f91b1cae94923a9e21d6e5ce9d8236d6eeebba8b16ca3d31e015
+size 1001001832
model-00028-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6f8ebfbec59c3aedc805c6df985155b298a21dc1da5fb8e6798254721e0fdabe
+size 1000622544
model-00029-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:586f83a35d60ce5ceb4759df270f1339bdb5f489b8fe048886822af5f9b52e5a
+size 1001001416
model-00030-of-00085.safetensors
ADDED
|
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e4393478fd891260f3bf2f7f511ca032faebb0b35827fa79b253d28868976d5
+size 1001900512