Text Generation
MLX
Safetensors
English
Chinese
glm_moe_dsa
apple-silicon
jang
glm
glm5
Mixture of Experts
mixture-of-experts
mla
quantized
2bit
experimental
conversational
Instructions to use JANGQ-AI/GLM-5.1-JANG_1L with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use JANGQ-AI/GLM-5.1-JANG_1L with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("JANGQ-AI/GLM-5.1-JANG_1L")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- Pi
How to use JANGQ-AI/GLM-5.1-JANG_1L with Pi:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "JANGQ-AI/GLM-5.1-JANG_1L"
```
Configure the model in Pi
```bash
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to ~/.pi/agent/models.json:
```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "JANGQ-AI/GLM-5.1-JANG_1L" }
      ]
    }
  }
}
```
Run Pi
```bash
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use JANGQ-AI/GLM-5.1-JANG_1L with Hermes Agent:
Start the MLX server
```bash
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "JANGQ-AI/GLM-5.1-JANG_1L"
```
Configure Hermes
```bash
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default JANGQ-AI/GLM-5.1-JANG_1L
```
Run Hermes
hermes
- MLX LM
How to use JANGQ-AI/GLM-5.1-JANG_1L with MLX LM:
Generate or start a chat session
```bash
# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "JANGQ-AI/GLM-5.1-JANG_1L"
```
Run an OpenAI-compatible server
```bash
# Install MLX LM
uv tool install mlx-lm
# Start the server (listens on port 8080 by default)
mlx_lm.server --model "JANGQ-AI/GLM-5.1-JANG_1L"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "JANGQ-AI/GLM-5.1-JANG_1L",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
Add files using upload-large-folder tool
- README.md +139 -0
- chat_template.jinja +117 -0
- config.json +63 -0
- generation_config.json +12 -0
- jang_config.json +37 -0
- model-00002-of-00233.safetensors +3 -0
- model-00007-of-00233.safetensors +3 -0
- model-00008-of-00233.safetensors +3 -0
- model-00015-of-00233.safetensors +3 -0
- model-00028-of-00233.safetensors +3 -0
- model-00030-of-00233.safetensors +3 -0
- model-00035-of-00233.safetensors +3 -0
- model-00036-of-00233.safetensors +3 -0
- model-00043-of-00233.safetensors +3 -0
- model-00046-of-00233.safetensors +3 -0
- model-00049-of-00233.safetensors +3 -0
- model-00051-of-00233.safetensors +3 -0
- model-00054-of-00233.safetensors +3 -0
- model-00071-of-00233.safetensors +3 -0
- model-00074-of-00233.safetensors +3 -0
- model-00082-of-00233.safetensors +3 -0
- model-00084-of-00233.safetensors +3 -0
- model-00087-of-00233.safetensors +3 -0
- model-00088-of-00233.safetensors +3 -0
- model-00109-of-00233.safetensors +3 -0
- model-00111-of-00233.safetensors +3 -0
- model-00114-of-00233.safetensors +3 -0
- model-00123-of-00233.safetensors +3 -0
- model-00126-of-00233.safetensors +3 -0
- model-00129-of-00233.safetensors +3 -0
- model-00134-of-00233.safetensors +3 -0
- model-00150-of-00233.safetensors +3 -0
- model-00155-of-00233.safetensors +3 -0
- model-00156-of-00233.safetensors +3 -0
- model-00162-of-00233.safetensors +3 -0
- model-00167-of-00233.safetensors +3 -0
- model-00168-of-00233.safetensors +3 -0
- model-00175-of-00233.safetensors +3 -0
- model-00183-of-00233.safetensors +3 -0
- model-00186-of-00233.safetensors +3 -0
- model-00191-of-00233.safetensors +3 -0
- model-00194-of-00233.safetensors +3 -0
- model-00203-of-00233.safetensors +3 -0
- model-00206-of-00233.safetensors +3 -0
- model-00209-of-00233.safetensors +3 -0
- model-00223-of-00233.safetensors +3 -0
- model-00226-of-00233.safetensors +3 -0
- model-00231-of-00233.safetensors +3 -0
- model.safetensors.index.json +0 -0
- tokenizer_config.json +33 -0
README.md
ADDED
@@ -0,0 +1,139 @@
# GLM-5.1-JANG_1L

**744B-parameter Mixture-of-Experts at ~2.15 bits/weight**
**Created by Jinho Jang (eric@jangq.ai)**

> ## ⚠ EXPERIMENTAL
> This is an early research release. Benchmarks (MMLU, HumanEval, GSM8K, etc.) are not yet finalized and will be uploaded in a follow-up revision. Expect rough edges in long-form reasoning outputs until tuning is complete.

---

## Requires MLX Studio

**This model only runs on [MLX Studio](https://mlxstudio.com/)**, Jinho Jang's native MLX inference app for Apple Silicon.

- **Standard `mlx_lm` will NOT work** with this model. MLX Studio contains a patched `deepseek_v32` runtime path that is required for coherent decode on quantized GLM-5.1 at bf16. Without the patched runtime, the model produces repetition loops during generation.
- MLX Studio auto-detects the JANG v2 format and loads the weights via mmap (about 50 s on a Mac Studio for a model of this size).
- All quantization, loading, and inference tuning is handled by MLX Studio; no extra setup is required.

If you want to run this model and do not have MLX Studio, wait for the public release.

---

## Model summary

| Field | Value |
|---|---|
| Base architecture | GLM-5.1 (ZhipuAI / THUDM): MoE, 744B total params, 40B active, 256 routed experts (top-8), 78 transformer layers + 1 MTP |
| Attention | MLA (Multi-head Latent Attention) with DSA (DeepSeek Sparse Attention) indexer |
| Context window | 202,752 tokens |
| Quantization method | **JANG_1L**: mixed-precision importance quantization (8-bit critical tier, 8-bit important tier, 2-bit compress tier) |
| Effective bits | **2.15 bits/weight** |
| On-disk size | **233 GB** |
| Active RAM during inference | ~235 GB (fits on 256 GB+ Apple Silicon with a raised `iogpu.wired_limit_mb`) |
| Format | JANG v2: MLX-native safetensors, loaded via mmap |
| Source | Converted from the official GLM-5.1 FP8 release |
| Mode | Text-only |

**Why JANG_1L specifically?** The `JANG_1L` profile keeps the critical tensors (attention MLA `embed_q`/`unembed_out`, router gates, `lm_head`, token embeddings, MLA KV compression) at 8 bits while letting the routed expert MLPs drop to 2 bits. At 744B params with 256 experts, most of the weight budget lives in the routed experts, so compressing them aggressively while keeping attention and routing at higher precision is the sweet spot for MoE at a ~2-bit average.

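The size figures in the summary table can be sanity-checked with a little arithmetic. The sketch below is not part of the JANG tooling; the helper names are made up for illustration, and it assumes "GB" means 10^9 bytes. The card does not say how the 2.15 bits/weight figure accounts for quantization metadata, so the gap between the two results is left unexplained here.

```python
# Back-of-the-envelope check of the summary-table figures.
# All helper names are hypothetical; "GB" is assumed to mean 1e9 bytes.
TOTAL_PARAMS = 744e9    # total parameters, from the table
EFFECTIVE_BITS = 2.15   # reported effective bits per weight
ON_DISK_BYTES = 233e9   # reported on-disk size


def nominal_size_gb(params: float, bits_per_weight: float) -> float:
    """Size in GB if every weight cost exactly `bits_per_weight` bits."""
    return params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(size_bytes: float, params: float) -> float:
    """Average bits per weight implied by a file size."""
    return size_bytes * 8 / params


print(f"nominal size: {nominal_size_gb(TOTAL_PARAMS, EFFECTIVE_BITS):.2f} GB")
print(f"implied by 233 GB: {implied_bits_per_weight(ON_DISK_BYTES, TOTAL_PARAMS):.2f} bits/weight")
```

Under these assumptions the nominal size comes out near 200 GB, while the 233 GB on disk corresponds to roughly 2.5 bits per weight.
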
---

## Running the model

Short-form factual or instruction prompts (recommended default):

```python
from mlx_studio import load, generate, make_sampler

model, tokenizer = load("GLM-5.1-JANG_1L")

messages = [{"role": "user", "content": "What is the capital of France? Answer in one word."}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False,
    enable_thinking=False,  # direct-answer mode for short-form prompts
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=60,
               sampler=make_sampler(temp=0.0)))
# → "Paris"
```

Multi-step reasoning (larger budget, thinking mode on):

```python
# Reuses model, tokenizer, generate, and make_sampler from the snippet above.
messages = [{"role": "user", "content": "If I drop a glass on a hard floor, what will happen? Explain."}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False,
    enable_thinking=True,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=1024,
               sampler=make_sampler(temp=0.0)))
```

### Sampling recommendations

| Task | `enable_thinking` | `temp` | `top_p` | `max_tokens` |
|---|---|---|---|---|
| Short factual QA (one-word, one-number answers) | `False` | `0.0` (greedy) | n/a | 60 |
| Conversational / general | `False` | `0.7` | `0.9` | 256 |
| Multi-step reasoning | `True` | `0.0` or `1.0` | `0.95` | **1024+** |

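One way to keep these recommendations in one place is a small preset table in code. This is only a sketch: `SamplingPreset`, `PRESETS`, and `preset_for` are hypothetical names, not part of MLX Studio, and the reasoning preset arbitrarily picks `0.0` from the two temperatures the table allows.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplingPreset:
    """One row of the sampling-recommendations table (hypothetical helper)."""
    enable_thinking: bool
    temp: float
    top_p: Optional[float]  # None where the table gives no top_p
    max_tokens: int


PRESETS = {
    "short_factual": SamplingPreset(False, 0.0, None, 60),
    "conversational": SamplingPreset(False, 0.7, 0.9, 256),
    # The table allows temp 0.0 or 1.0 here; 0.0 is chosen for reproducibility.
    "reasoning": SamplingPreset(True, 0.0, 0.95, 1024),
}


def preset_for(task: str) -> SamplingPreset:
    """Look up a preset, failing loudly on unknown task names."""
    try:
        return PRESETS[task]
    except KeyError:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(PRESETS)}"
        ) from None
```

A caller would pass `preset_for("reasoning").enable_thinking` to `apply_chat_template` and the remaining fields to whatever sampler and generation call the runtime exposes.
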
**Do not** apply a repetition penalty to math or factual prompts: it suppresses tokens that a correct answer legitimately repeats (e.g. `"47+38=85"` becomes `"47+38=5, 7, 10"`).

Reasoning mode needs room: the `<think>...</think>` block can consume 300-800 tokens before the final answer. Budget at least 1024 `max_tokens` for any serious reasoning task.

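A truncated reasoning run can be detected by checking whether the output ever closes its `</think>` block. The helper below is a minimal sketch with a hypothetical name (`split_reasoning`). It assumes the generated text is available as a plain string and may or may not begin with a literal `<think>` tag, depending on whether the runtime echoes the tag that ends the prompt.

```python
from typing import Tuple


def split_reasoning(text: str) -> Tuple[str, str, bool]:
    """Split model output into (reasoning, answer, finished).

    `finished` is False when no closing </think> tag was produced, which
    usually means max_tokens ran out mid-reasoning.
    """
    head, sep, tail = text.partition("</think>")
    head = head.strip()
    if head.startswith("<think>"):
        head = head[len("<think>"):].strip()
    if not sep:
        return head, "", False
    return head, tail.strip(), True
```

If `finished` comes back `False`, rerunning with a larger `max_tokens` is the obvious next step.
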
---

## Performance snapshot (informal; full benchmarks TBD)

Tested on Apple M3 Ultra (256 GB unified memory) via MLX Studio.

| Metric | Value |
|---|---|
| Cold load time (mmap) | ~54 s |
| Short-form answer latency | **<1 s** after load |
| Reasoning generation speed | ~5–7 tok/s |
| RAM footprint during generation | ~235 GB wired |

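To put the generation speed in context, the wall-clock cost of a full reasoning budget can be estimated directly from the table. This is a rough sketch that assumes a constant decode rate and ignores load time and prompt processing; `generation_seconds` is a hypothetical helper.

```python
def generation_seconds(num_tokens: int, tok_per_s: float) -> float:
    """Time to emit `num_tokens` at a constant decode rate."""
    return num_tokens / tok_per_s


# 1024-token reasoning budget at the reported 5-7 tok/s range.
fastest = generation_seconds(1024, 7.0)
slowest = generation_seconds(1024, 5.0)
print(f"1024 tokens: roughly {fastest:.0f}-{slowest:.0f} s")
# prints: 1024 tokens: roughly 146-205 s
```
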
### Qualitative coherence (10-prompt private benchmark, greedy)

| Mode | Coherent | Notes |
|---|---|---|
| `enable_thinking=False` short-form | **7/10** | Correct on Paris/Au/85/buenos días/sky-blue/glass-breaks/ocean poem; partial on pi digits and code one-liner; fails on multi-step word problems |
| `enable_thinking=True` reasoning | **9/10** coherent reasoning chains | Most prompts need `max_tokens ≥ 1000` to emit the final `</think>` + answer; some chains of thought reach the correct conclusion inside the `<think>` block |

**Formal benchmarks (MMLU, GSM8K, HumanEval, BBH, GPQA, etc.) coming in a follow-up revision.**

---

## Known limitations

1. **Reasoning budget**: many `enable_thinking=True` prompts need `max_tokens ≥ 1024` to fully emit their reasoning chain and final answer. Lower budgets truncate the output mid-analysis.
2. **Code generation at 2-bit**: simple Python one-liners sometimes get stuck in slicing-notation patterns. Expect rough edges on code tasks until future revisions.
3. **Word problems under short budget**: multi-step word problems (Alice-apples-style) sometimes degenerate into numeric repetition when `enable_thinking=False`. Use `enable_thinking=True` and a larger budget for any word problem that requires more than one algebraic step.
4. **Memory requirement**: this model requires **≥250 GB of GPU-wired memory**. On Mac Studio, verify that `sysctl iogpu.wired_limit_mb` returns `250000` or higher before loading.
5. **MLX Studio only**: the model depends on MLX Studio's inference runtime. It will not run under stock `mlx_lm` or `mlx_vlm`; attempting to do so produces repetition loops during generation.

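The wired-memory check in item 4 can be scripted. The sketch below uses hypothetical helper names and assumes the usual macOS `sysctl` output, which is `name: value` normally and the bare value with `-n`. `read_wired_limit_mb` only works on macOS, so it is defined but not called here.

```python
import subprocess

REQUIRED_WIRED_MB = 250_000  # threshold quoted in the limitations list


def parse_wired_limit(sysctl_output: str) -> int:
    """Accept both 'iogpu.wired_limit_mb: 250000' and a bare '250000'."""
    return int(sysctl_output.strip().split(":")[-1])


def wired_limit_ok(limit_mb: int, required_mb: int = REQUIRED_WIRED_MB) -> bool:
    """True when the configured limit meets the stated requirement."""
    return limit_mb >= required_mb


def read_wired_limit_mb() -> int:
    """Query the current limit via sysctl (macOS only)."""
    result = subprocess.run(
        ["sysctl", "-n", "iogpu.wired_limit_mb"],
        capture_output=True, text=True, check=True,
    )
    return parse_wired_limit(result.stdout)
```

On a Mac, `wired_limit_ok(read_wired_limit_mb())` gives a yes/no answer before a long load is attempted.
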
---

## Credits

- **Quantization & conversion**: Jinho Jang, `eric@jangq.ai`
- **Runtime**: MLX Studio by Jinho Jang
- **Base model**: GLM-5.1 by ZhipuAI / THUDM / zai-org (see the original model card for training data, intended use, and safety information)

All JANG tooling and MLX Studio are commercial products of Jinho Jang. Please refer to the MLX Studio project page for licensing terms.

---

## Status & roadmap

- [x] Initial conversion + runtime validation on Apple M3 Ultra
- [x] Short-form factual QA verified coherent
- [x] Reasoning mode (`enable_thinking=True`) verified coherent through multi-step chains
- [ ] Formal benchmark sweep (MMLU, GSM8K, HumanEval, BBH, GPQA), uploading in a follow-up revision
- [ ] Sampling-config tuning for code and multi-step word problems
- [ ] **`GLM-5.1-JANG_2S`: currently converting.** JANG_2S uses the `(6, 4, 2)` bit tuple, i.e. lower-precision critical and important tiers than JANG_1L's `(8, 8, 2)`, for users who want a slightly smaller file footprint at the cost of attention-layer precision. Upload to follow once conversion completes and head-to-head benchmarks against JANG_1L have been run.
- [ ] Additional profile variants (JANG_2L, JANG_3M) under evaluation

Questions or issues: contact `eric@jangq.ai`.
chat_template.jinja
ADDED
@@ -0,0 +1,117 @@
[gMASK]<sop>
{%- if tools -%}
{%- macro tool_to_json(tool) -%}
{%- set ns_tool = namespace(first=true) -%}
{{ '{' -}}
{%- for k, v in tool.items() -%}
{%- if k != 'defer_loading' and k != 'strict' -%}
{%- if not ns_tool.first -%}{{- ', ' -}}{%- endif -%}
{%- set ns_tool.first = false -%}
"{{ k }}": {{ v | tojson(ensure_ascii=False) }}
{%- endif -%}
{%- endfor -%}
{{- '}' -}}
{%- endmacro -%}
<|system|>
# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{% for tool in tools %}
{%- if 'function' in tool -%}
{%- set tool = tool['function'] -%}
{%- endif -%}
{% if tool.defer_loading is not defined or not tool.defer_loading %}
{{ tool_to_json(tool) }}
{% endif %}
{% endfor %}
</tools>

For each function call, output the function name and arguments within the following XML format:
<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}
{%- macro visible_text(content) -%}
{%- if content is string -%}
{{- content }}
{%- elif content is iterable and content is not mapping -%}
{%- for item in content -%}
{%- if item is mapping and item.type == 'text' -%}
{{- item.text }}
{%- elif item is string -%}
{{- item }}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{- content }}
{%- endif -%}
{%- endmacro -%}
{%- set ns = namespace(last_user_index=-1, thinking_indices='') -%}
{%- for m in messages %}
{%- if m.role == 'user' %}
{%- set ns.last_user_index = loop.index0 -%}
{%- elif m.role == 'assistant' %}
{%- if m.reasoning_content is string %}
{%- set ns.thinking_indices = ns.thinking_indices ~ ',' ~ ns.last_user_index ~ ',' -%}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- set ns.has_thinking = false -%}
{%- for m in messages -%}
{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}{% set ns.has_thinking = (',' ~ loop.index0 ~ ',') in ns.thinking_indices -%}
{%- elif m.role == 'assistant' -%}
<|assistant|>
{%- set content = visible_text(m.content) %}
{%- if m.reasoning_content is string %}
{%- set reasoning_content = m.reasoning_content %}
{%- elif '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].split('<think>')[-1] %}
{%- set content = content.split('</think>')[-1] %}
{%- elif loop.index0 > ns.last_user_index and not (enable_thinking is defined and not enable_thinking) %}
{%- set reasoning_content = '' %}
{%- elif loop.index0 < ns.last_user_index and ns.has_thinking %}
{%- set reasoning_content = '' %}
{%- endif %}
{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content is defined -%}
{{ '<think>' + reasoning_content + '</think>'}}
{%- else -%}
{{ '</think>' }}
{%- endif -%}
{%- if content.strip() -%}
{{ content.strip() }}
{%- endif -%}
{% if m.tool_calls %}
{% for tc in m.tool_calls %}
{%- if tc.function %}
{%- set tc = tc.function %}
{%- endif %}
{{- '<tool_call>' + tc.name -}}
{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}
{% endif %}
{%- elif m.role == 'tool' -%}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|observation|>' -}}
{%- endif %}
{%- if m.content is string -%}
{{- '<tool_response>' + m.content + '</tool_response>' -}}
{%- else -%}
{{- '<tool_response><tools>\n' -}}
{% for tr in m.content %}
{%- for tool in tools -%}
{%- if 'function' in tool -%}
{%- set tool = tool['function'] -%}
{%- endif -%}
{%- if tool.name == tr.name -%}
{{- tool_to_json(tool) + '\n' -}}
{%- endif -%}
{%- endfor -%}
{%- endfor -%}
{{- '</tools></tool_response>' -}}
{% endif -%}
{%- elif m.role == 'system' -%}
<|system|>{{ visible_text(m.content) }}
{%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
<|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}
{%- endif -%}
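Two behaviours of this template are easy to miss: the generation prompt opens a `<think>` block unless `enable_thinking` is explicitly false, and tool-call arguments are emitted as `<arg_key>`/`<arg_value>` pairs with non-string values JSON-encoded. The Python sketch below mirrors just those two branches for illustration. It is not the template itself, the function names are hypothetical, `None` stands in for "variable not defined", and `json.dumps` is only an approximation of the template's `tojson` filter.

```python
import json
from typing import Any, Mapping, Optional


def generation_prefix(enable_thinking: Optional[bool] = None) -> str:
    """Mimic the add_generation_prompt branch at the end of the template.

    None models an undefined variable; only an explicit falsy value
    closes the think block up front.
    """
    if enable_thinking is not None and not enable_thinking:
        return "<|assistant|></think>"
    return "<|assistant|><think>"


def render_tool_call(name: str, arguments: Mapping[str, Any]) -> str:
    """Mimic how an assistant tool call is serialized by the template."""
    parts = ["<tool_call>", name]
    for key, value in arguments.items():
        # Strings pass through unchanged; everything else is JSON-encoded.
        rendered = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        parts.append(f"<arg_key>{key}</arg_key><arg_value>{rendered}</arg_value>")
    parts.append("</tool_call>")
    return "".join(parts)
```

So leaving `enable_thinking` unset is equivalent to turning thinking on, which is worth remembering when budgeting `max_tokens`.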
config.json
ADDED
@@ -0,0 +1,63 @@
{
  "architectures": [
    "GlmMoeDsaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "dtype": "bfloat16",
  "eos_token_id": [
    154820,
    154827,
    154829
  ],
  "ep_size": 1,
  "first_k_dense_replace": 3,
  "hidden_act": "silu",
  "head_dim": 64,
  "hidden_size": 6144,
  "index_head_dim": 128,
  "index_n_heads": 32,
  "index_topk": 2048,
  "indexer_rope_interleave": true,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "kv_lora_rank": 512,
  "max_position_embeddings": 202752,
  "moe_intermediate_size": 2048,
  "moe_layer_freq": 1,
  "model_type": "glm_moe_dsa",
  "n_group": 1,
  "n_routed_experts": 256,
  "n_shared_experts": 1,
  "norm_topk_prob": true,
  "num_attention_heads": 64,
  "num_experts_per_tok": 8,
  "num_hidden_layers": 78,
  "num_key_value_heads": 64,
  "num_nextn_predict_layers": 1,
  "pad_token_id": 154820,
  "pretraining_tp": 1,
  "q_lora_rank": 2048,
  "qk_head_dim": 256,
  "qk_nope_head_dim": 192,
  "qk_rope_head_dim": 64,
  "rms_norm_eps": 1e-05,
  "rope_interleave": true,
  "rope_parameters": {
    "rope_theta": 1000000,
    "rope_type": "default"
  },
  "routed_scaling_factor": 2.5,
  "scoring_func": "sigmoid",
  "tie_word_embeddings": false,
  "topk_group": 1,
  "topk_method": "noaux_tc",
  "transformers_version": "5.4.0",
  "use_cache": true,
  "v_head_dim": 256,
  "vocab_size": 154880,
  "quantization": {
    "group_size": 64,
    "bits": 2
  }
}
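The values in this config allow a rough estimate of how much of the model sits in the routed experts, which is the premise behind compressing them to 2 bits. The sketch below rests on two assumptions that the config does not state: each routed expert is a gated MLP with three weight matrices (gate, up, down), and `first_k_dense_replace` means the first 3 layers are dense rather than MoE. The MTP layer is ignored, and `routed_expert_params` is a hypothetical helper.

```python
# Values copied from config.json; TOTAL_PARAMS is the README's headline figure.
cfg = {
    "hidden_size": 6144,
    "moe_intermediate_size": 2048,
    "n_routed_experts": 256,
    "num_hidden_layers": 78,
    "first_k_dense_replace": 3,
}
TOTAL_PARAMS = 744e9


def routed_expert_params(cfg: dict) -> int:
    """Parameters in routed experts, assuming 3 matrices per expert."""
    per_expert = 3 * cfg["hidden_size"] * cfg["moe_intermediate_size"]
    moe_layers = cfg["num_hidden_layers"] - cfg["first_k_dense_replace"]
    return per_expert * cfg["n_routed_experts"] * moe_layers


routed = routed_expert_params(cfg)
print(f"{routed / 1e9:.1f}B routed-expert parameters, "
      f"{routed / TOTAL_PARAMS:.1%} of the total")
```

Under these assumptions the routed experts account for about 725B of the 744B parameters, i.e. well over 90% of the weights.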
generation_config.json
ADDED
@@ -0,0 +1,12 @@
{
  "_from_model_config": true,
  "eos_token_id": [
    154820,
    154827,
    154829
  ],
  "pad_token_id": 154820,
  "temperature": 1.0,
  "top_p": 0.95,
  "transformers_version": "5.4.0"
}
jang_config.json
ADDED
@@ -0,0 +1,37 @@
{
  "quantization": {
    "method": "jang-importance",
    "profile": "JANG_1L",
    "target_bits": 1.0,
    "actual_bits": 2.15,
    "block_size": 64,
    "calibration_method": "weights",
    "quantization_method": "mse",
    "scoring_method": "weight-magnitude",
    "bit_widths_used": [
      2,
      8
    ],
    "quantization_scheme": "asymmetric",
    "quantization_backend": "mx.quantize",
    "hadamard_rotation": false
  },
  "source_model": {
    "name": "GLM-5.1-FP8",
    "dtype": "bfloat16",
    "parameters": "30.4B"
  },
  "architecture": {
    "type": "moe",
    "attention": "mla",
    "has_vision": false,
    "has_ssm": false,
    "has_moe": true
  },
  "runtime": {
    "total_weight_bytes": 212140032,
    "total_weight_gb": 0.2
  },
  "format": "jang",
  "format_version": "2.0"
}
model-00002-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4b083b03daf0e45c3a913730a2ba3cab0c062a75f2e84d6868b22a6c0dfaa5bf
size 1011056984
model-00007-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0e6e638fd1fd1f270ccaf35b6c04f080864d6a8c0491280aee4ecd434448230
size 1232036160
model-00008-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb3083b71d6e51223540bfed9e2232fa37de3b7c60271b45b56d4f9ad714ee63
size 1006633376
model-00015-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b813524f44153ebf8608822e932d1057a7b4f08b65e8b03caf325088f1c2cda6
size 1006633368
model-00028-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d8096a3d4fe83103dd83d02c0fb53f4f8ea2a463ebf5492d2cf8e16e7b3830f
size 1232036160
model-00030-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:96bd901a08a230f7afc2b2c8df6f4e84b218f8226bf91639d24b90a76cc3501a
size 1006633368
model-00035-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6f2dccd74bf617d7302a10a0f0d972e2e557b478f45aa105737d8ca8e875067
size 1055654984
model-00036-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23d07779e7de0af7cae9d50d81bc77f83956072c2e77c83ebde36bc1084f3e95
size 1006633376
model-00043-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3286f53888143861ea8e8be1fdfc3a1c0ec5f2cfc4c7a9bdb2b55ebdec8191af
size 1006633368
model-00046-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6a7e3feced79bd137bbfd32c5b81c029d0179a07d90a023621e9647b23fe90c8
size 1006633368
model-00049-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7bbf83247bc463f3da51625e9d86d9204198af0fb59c5ab7d199ce8c10631afb
size 1006633368
model-00051-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5376136846c69cf9a59ad1fafb251ad874170fc955b52813a526afc5461ea5c
size 1006633376
model-00054-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d2aa474f6756e0d88b95a12e09225825415eaeee5480e415fbfbe17144672db
size 1006633376
model-00071-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:877db64a210f053d08c2e1611a771df81822fa732d0e50aabb57e77c849b0b6b
size 1232036160
model-00074-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87adbf03d65886d8596efc9a0ce58ab4132e0de65a2cc368ef52c96583d14174
size 1232036160
model-00082-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e937b5f49b0a8ccafbe32c3e72f98e1bee87f565bd89b53fd2715205d0ce895b
size 1006633368
model-00084-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:50a0fffef02cb093407facda820819895c88210515861e942c643dd15362fbb2
size 1006633376
model-00087-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c9429659347404203391aa1428d38741b59103b418b1e12566a392d6a8810d23
size 1006633376
model-00088-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:176cb357b5973631069735c7cd26ccd2d90e6d7117014a11b044724773a6a73e
size 1006633368
model-00109-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a63bb751dfc2d760ce6763ff12cf04bde2ba69f31ab94be97217872a9e14c75
size 1006633368
model-00111-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b04acfa8728c4d80fb0f34ef2705c72e9115b98bad6871fd84c3ca038da35ee8
size 1006633376
model-00114-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3a88f631625296bc6cb3dc8b35e7dd5334d886cb93c52a648889954054e3f83d
size 1006633376
model-00123-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f41c1ce4f892d2cbbc9c28de486eac038dfbace2f207c9cbd3bc9ac2d9871e0c
size 1006633376
model-00126-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e3f0946a6c8abc9fa9a79556e789d75aec4cf8d76f66419262664de60e7e24d9
size 1006633376
model-00129-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:21f54fbea33dc5961f8e72ece60b0c56c24fdde4bc70466e530f8b4fab30cd32
size 1006633376
model-00134-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c05416a586e52735e2d7f429143ac4074abb3205d17c11933b9a7f94c6784c20
size 1232036128
model-00150-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:85b7c0635552217b5afa44212b7b595909ddf3fd77c5f0a2035613bcb630b7f8
size 1006633376
model-00155-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1aaccf73a9ea7e67c3366e905d796f2e45dfc6ae4ed3963c139e7c06b3c6f60
size 1232036160
model-00156-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:92f09ce859ce4f419fc8652c7f66c0f0ae30a889ccff3426e44aaf1c3aad6b62
size 1006633376
model-00162-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2acc3fc2afce5ec529f3a2cf53e7b7479e162e4ca89fff181a1bf191608c47e
size 1006633376
model-00167-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c762c0ec1aa5a0b338bfbf048a345773817dad9222c888b8ea2aa98485bb3ab3
+size 1232036128
model-00168-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:24ee68158f22d267376b02a32178b366572c3ce370188052b6186717552bddfe
+size 1006633376
model-00175-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff9a70353b3092bfa2b81959fb39a376099dac54b319536b60e3572224a921b8
+size 1006633368
model-00183-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e53137fada9f42bab79090179afea5ee6ec5ec673c3a046ce9a6d6619d4021be
+size 1006633376
model-00186-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:97bc971b4d326466ea6498769cfeb162ee69d62851f3f1689c3752256f088665
+size 1006633376
model-00191-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:504bc9236ad48f75686b5409a21836b8b361767e78698e8058e940332af99fd3
+size 1232036160
model-00194-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:58f06e8b94017e1cd1018935348821c4afc8ce26a5cee6b01afb455e78f208c2
+size 1232036160
model-00203-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed013916d8d2be40d61c707bb1e9948d3d2e307692c20b6fca99d414f149cab0
+size 1232036160
model-00206-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e35da67a89b73d45b64ba234457b49e18364f6601c89940a198eba9f9c36742
+size 1232036160
model-00209-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83740ff561eb004abc8fe0bbe6de7f27c6bc7a95bdc0cedfd03876ae2336af7e
+size 1232036160
model-00223-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a1f2c96b163f8d7b8871a3eaf2726e363c046235a161f67643c6a8e05fc0e099
+size 1006633368
model-00226-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cde75d4dbd03a11fb23be36d267c177f259e6af0b3c842348cf3bfbb63e0baa5
+size 1006633368
model-00231-of-00233.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26441724dd595ab2ad9d5ee6ab1a144663133dbca531ef460c37f61bb5af84d0
+size 1006633368
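Each shard entry above is a Git LFS pointer (a `version` line, an `oid sha256:` digest, and a `size` in bytes), not the weights themselves. The following is a minimal sketch, not part of this commit, of how a downloaded shard could be checked against its pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the pointer text is available as a string.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>", e.g. "sha256:26441724...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    # Cheap check first: a size mismatch means the file cannot match.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a ~1 GB shard is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:26441724dd595ab2ad9d5ee6ab1a144663133dbca531ef460c37f61bb5af84d0\n"
    "size 1006633368\n"
)
```

With a local copy of the shard, `matches_pointer("model-00231-of-00233.safetensors", pointer)` would report whether the download is intact; the local path is an assumption about where the file was saved.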
model.safetensors.index.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1,33 @@
+{
+  "backend": "tokenizers",
+  "clean_up_tokenization_spaces": false,
+  "do_lower_case": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": [
+    "<|endoftext|>",
+    "[MASK]",
+    "[gMASK]",
+    "[sMASK]",
+    "<sop>",
+    "<eop>",
+    "<|system|>",
+    "<|user|>",
+    "<|assistant|>",
+    "<|observation|>",
+    "<|begin_of_image|>",
+    "<|end_of_image|>",
+    "<|begin_of_video|>",
+    "<|end_of_video|>",
+    "<|begin_of_audio|>",
+    "<|end_of_audio|>",
+    "<|begin_of_transcription|>",
+    "<|end_of_transcription|>"
+  ],
+  "is_local": true,
+  "model_max_length": 202752,
+  "model_specific_special_tokens": {},
+  "pad_token": "<|endoftext|>",
+  "padding_side": "left",
+  "remove_space": false,
+  "tokenizer_class": "TokenizersBackend"
+}
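The config above sets `"padding_side": "left"` and reuses `<|endoftext|>` as both `eos_token` and `pad_token`. As a rough illustration of what left padding means for a batch, here is a small sketch under stated assumptions: `pad_batch` is a hypothetical helper, `PAD_ID = 0` is a placeholder rather than the real id of `<|endoftext|>`, and the excerpt copies only four fields from the file. It is not how the tokenizer library itself implements padding.

```python
import json

# Four fields copied from the tokenizer_config.json diff above.
CONFIG_EXCERPT = """
{
  "eos_token": "<|endoftext|>",
  "pad_token": "<|endoftext|>",
  "padding_side": "left",
  "model_max_length": 202752
}
"""


def pad_batch(sequences, pad_id, side, max_length):
    """Pad token-id lists to a common width on the given side, capped at max_length."""
    width = min(max(len(seq) for seq in sequences), max_length)
    padded = []
    for seq in sequences:
        seq = seq[:width]  # simple truncation that keeps the first `width` ids
        fill = [pad_id] * (width - len(seq))
        padded.append(fill + seq if side == "left" else seq + fill)
    return padded


config = json.loads(CONFIG_EXCERPT)
PAD_ID = 0  # placeholder; the real id of "<|endoftext|>" comes from the vocabulary
batch = pad_batch(
    [[11, 12, 13], [21]],
    PAD_ID,
    config["padding_side"],
    config["model_max_length"],
)
print(batch)
```

Left padding keeps the last real token of every row in the final column, which is generally what batched autoregressive generation needs so that each row continues from its own prompt.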