dealignai committed on Apr 11

Commit

0153563

verified ·

1 Parent(s): ef498cd

Add files using upload-large-folder tool

Files changed (50)

README.md +139 -0
chat_template.jinja +117 -0
config.json +63 -0
generation_config.json +12 -0
jang_config.json +37 -0
model-00002-of-00233.safetensors +3 -0
model-00007-of-00233.safetensors +3 -0
model-00008-of-00233.safetensors +3 -0
model-00015-of-00233.safetensors +3 -0
model-00028-of-00233.safetensors +3 -0
model-00030-of-00233.safetensors +3 -0
model-00035-of-00233.safetensors +3 -0
model-00036-of-00233.safetensors +3 -0
model-00043-of-00233.safetensors +3 -0
model-00046-of-00233.safetensors +3 -0
model-00049-of-00233.safetensors +3 -0
model-00051-of-00233.safetensors +3 -0
model-00054-of-00233.safetensors +3 -0
model-00071-of-00233.safetensors +3 -0
model-00074-of-00233.safetensors +3 -0
model-00082-of-00233.safetensors +3 -0
model-00084-of-00233.safetensors +3 -0
model-00087-of-00233.safetensors +3 -0
model-00088-of-00233.safetensors +3 -0
model-00109-of-00233.safetensors +3 -0
model-00111-of-00233.safetensors +3 -0
model-00114-of-00233.safetensors +3 -0
model-00123-of-00233.safetensors +3 -0
model-00126-of-00233.safetensors +3 -0
model-00129-of-00233.safetensors +3 -0
model-00134-of-00233.safetensors +3 -0
model-00150-of-00233.safetensors +3 -0
model-00155-of-00233.safetensors +3 -0
model-00156-of-00233.safetensors +3 -0
model-00162-of-00233.safetensors +3 -0
model-00167-of-00233.safetensors +3 -0
model-00168-of-00233.safetensors +3 -0
model-00175-of-00233.safetensors +3 -0
model-00183-of-00233.safetensors +3 -0
model-00186-of-00233.safetensors +3 -0
model-00191-of-00233.safetensors +3 -0
model-00194-of-00233.safetensors +3 -0
model-00203-of-00233.safetensors +3 -0
model-00206-of-00233.safetensors +3 -0
model-00209-of-00233.safetensors +3 -0
model-00223-of-00233.safetensors +3 -0
model-00226-of-00233.safetensors +3 -0
model-00231-of-00233.safetensors +3 -0
model.safetensors.index.json +0 -0
tokenizer_config.json +33 -0

README.md ADDED Viewed

	@@ -0,0 +1,139 @@

+# GLM-5.1-JANG_1L
+**744B-parameter Mixture-of-Experts at ~2.15 bits/weight**
+**Created by Jinho Jang — eric@jangq.ai**
+> ## ⚠ EXPERIMENTAL
+> This is an early research release. Benchmarks (MMLU, HumanEval, GSM8K, etc.) are not yet finalized and will be uploaded in a follow-up revision. Expect rough edges in long-form reasoning outputs until tuning is complete.
+---
+## Requires MLX Studio
+**This model only runs on [MLX Studio](https://mlxstudio.com/)** — Jinho Jang's native MLX inference app for Apple Silicon.
+- **Standard `mlx_lm` will NOT work** with this model. MLX Studio contains a patched `deepseek_v32` runtime path that is required for coherent decode on quantized GLM-5.1 at bf16. Without the patched runtime, the model produces repetition loops during generation.
+- MLX Studio auto-detects the JANG v2 format and loads it directly via mmap, with no conversion step (roughly 50 s cold load on a Mac Studio for this model size).
+- All quantization, loading, and inference tuning is handled by MLX Studio — no extra setup required.
+If you want to run this model and do not have MLX Studio, wait for the public release.
+---
+## Model summary
+| Field | Value |
+|---|---|
+| Base architecture | GLM-5.1 (ZhipuAI / THUDM) — MoE, 744B total params, 40B active, 256 routed experts top-8, 78 transformer layers + 1 MTP |
+| Attention | MLA (Multi-head Latent Attention) with DSA (DeepSeek Sparse Attention) indexer |
+| Context window | 202,752 tokens |
+| Quantization method | **JANG_1L** — mixed-precision importance quantization (8-bit critical tier, 8-bit important tier, 2-bit compress tier) |
+| Effective bits | **2.15 bits/weight** |
+| On-disk size | **233 GB** |
+| Active RAM during inference | ~235 GB (fits on 256 GB+ Apple Silicon w/ raised `iogpu.wired_limit_mb`) |
+| Format | JANG v2 — MLX-native safetensors, instant mmap load |
+| Source | Converted from the official GLM-5.1 FP8 release |
+| Mode | Text-only |
+**Why JANG_1L specifically?** The `JANG_1L` profile applies maximum-quality protection to the critical tensors (attention MLA `embed_q`/`unembed_out`, router gates, `lm_head`, token embeddings, MLA KV compression) while allowing the routed expert MLPs to go to 2 bits. At 744B params with 256 experts, most of the weight budget lives in the routed experts, so compressing them aggressively while keeping attention and routing at 8 bits is the sweet spot for MoE at a ~2-bit average.
+---
+## Running the model
+Short-form factual or instruction prompts (recommended default):
+```python
+from mlx_studio import load, generate, make_sampler
+model, tokenizer = load("GLM-5.1-JANG_1L")
+messages = [{"role": "user", "content": "What is the capital of France? Answer in one word."}]
+prompt = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, tokenize=False,
+    enable_thinking=False,   # direct-answer mode for short-form prompts
+)
+print(generate(model, tokenizer, prompt=prompt, max_tokens=60,
+               sampler=make_sampler(temp=0.0)))
+# → "Paris"
+```
+Multi-step reasoning (larger budget, thinking mode on):
+```python
+# Reuses `model`, `tokenizer`, `generate`, and `make_sampler` from the snippet above.
+messages = [{"role": "user", "content": "If I drop a glass on a hard floor, what will happen? Explain."}]
+prompt = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, tokenize=False,
+    enable_thinking=True,
+)
+print(generate(model, tokenizer, prompt=prompt, max_tokens=1024,
+               sampler=make_sampler(temp=0.0)))
+```
+### Sampling recommendations
+| Task | `enable_thinking` | `temp` | `top_p` | `max_tokens` |
+|---|---|---|---|---|
+| Short factual QA (one-word, one-number answers) | `False` | `0.0` (greedy) | — | 60 |
+| Conversational / general | `False` | `0.7` | `0.9` | 256 |
+| Multi-step reasoning | `True` | `0.0` or `1.0` | `0.95` | **1024+** |
+**Do not** apply a repetition penalty to math or factual prompts: the penalty suppresses digits and tokens that legitimately need to repeat in a correct answer (e.g. `"47+38=85"` becomes `"47+38=5, 7, 10"`).
+Reasoning mode needs room: the `<think>...</think>` block can consume 300-800 tokens before the final answer. Budget at least 1024 `max_tokens` for any serious reasoning task.
+---
+## Performance snapshot (informal — full benchmarks TBD)
+Tested on Apple M3 Ultra (256 GB unified memory) via MLX Studio.
+| Metric | Value |
+|---|---|
+| Cold load time (mmap) | ~54 s |
+| Short-form answer latency | **<1 s** after load |
+| Reasoning generation speed | ~5–7 tok/s |
+| RAM footprint during generation | ~235 GB wired |
+### Qualitative coherence (10-prompt private benchmark, greedy)
+| Mode | Coherent | Notes |
+|---|---|---|
+| `enable_thinking=False` short-form | **7/10** | Correct on Paris/Au/85/buenos días/sky-blue/glass-breaks/ocean poem; partial on pi digits and code one-liner; fails on multi-step word problems |
+| `enable_thinking=True` reasoning | **9/10** coherent reasoning chains | Most prompts need `max_tokens ≥ 1000` to emit the final `</think>` + answer; some chain-of-thought reaches the correct conclusion in the `<think>` block |
+**Formal benchmarks (MMLU, GSM8K, HumanEval, BBH, GPQA, etc.) coming in a follow-up revision.**
+---
+## Known limitations
+1. **Reasoning budget** — many `enable_thinking=True` prompts need `max_tokens ≥ 1024` to fully emit their reasoning chain and final answer. Setting lower budgets will truncate mid-analysis.
+2. **Code generation at 2-bit** — simple Python one-liners sometimes get stuck in slicing-notation patterns. Expect rough edges on code tasks until future revisions.
+3. **Word problems under short budget** — multi-step word problems (Alice-apples-style) sometimes degenerate into numeric repetition when `enable_thinking=False`. Use `enable_thinking=True` + larger budget for any word problem that requires more than one algebraic step.
+4. **Memory requirement** — this model requires **≥250 GB of GPU-wired memory**. On Mac Studio, verify `sysctl iogpu.wired_limit_mb` returns `250000` or higher before loading.
+5. **MLX Studio only** — the model depends on MLX Studio's inference runtime. It will not run under stock `mlx_lm` or `mlx_vlm`. Attempting to do so will produce repetition loops during generation.
+---
+## Credits
+- **Quantization & conversion** — Jinho Jang, `eric@jangq.ai`
+- **Runtime** — MLX Studio by Jinho Jang
+- **Base model** — GLM-5.1 by ZhipuAI / THUDM / zai-org (see the original model card for training data, intended use, and safety information)
+All JANG tooling and MLX Studio are commercial products of Jinho Jang. Please refer to the MLX Studio project page for licensing terms.
+---
+## Status & roadmap
+- [x] Initial conversion + runtime validation on Apple M3 Ultra
+- [x] Short-form factual QA verified coherent
+- [x] Reasoning mode (`enable_thinking=True`) verified coherent through multi-step chains
+- [ ] Formal benchmark sweep — MMLU, GSM8K, HumanEval, BBH, GPQA — uploading in a follow-up revision
+- [ ] Sampling-config tuning for code and multi-step word problems
+- [ ] **`GLM-5.1-JANG_2S` (currently converting)**. JANG_2S uses the `(6, 4, 2)` bit tuple, i.e. lower-precision critical and important tiers than JANG_1L's `(8, 8, 2)`, for users who want a slightly smaller file footprint at the cost of attention-layer precision. Upload to follow once conversion completes and benchmarks are run head-to-head against JANG_1L.
+- [ ] Additional profile variants (JANG_2L, JANG_3M) under evaluation
+Questions or issues — contact `eric@jangq.ai`.
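
Known limitation 4 in the README asks readers to confirm that `sysctl iogpu.wired_limit_mb` reports `250000` or higher before loading. A minimal sketch of that check follows. The helper names (`parse_wired_limit`, `wired_limit_ok`, `read_wired_limit`) are hypothetical and not part of MLX Studio; only the sysctl key and the threshold come from the README.

```python
import subprocess

REQUIRED_MB = 250_000  # threshold quoted in the README's "Known limitations"


def parse_wired_limit(sysctl_output: str) -> int:
    """Parse the output of `sysctl iogpu.wired_limit_mb` (with or without -n)."""
    text = sysctl_output.strip()
    # `sysctl name` prints "name: value"; `sysctl -n name` prints only the value.
    if ":" in text:
        text = text.split(":", 1)[1]
    return int(text.strip())


def wired_limit_ok(limit_mb: int, required_mb: int = REQUIRED_MB) -> bool:
    """True when the configured wired limit meets the README's requirement."""
    return limit_mb >= required_mb


def read_wired_limit() -> int:
    """Query the live value; only meaningful on Apple Silicon macOS."""
    out = subprocess.run(
        ["sysctl", "-n", "iogpu.wired_limit_mb"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_wired_limit(out)
```

On a Mac, `wired_limit_ok(read_wired_limit())` would be the pre-flight check; the parsing helper is kept separate so it can be exercised on any platform.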

chat_template.jinja ADDED Viewed

	@@ -0,0 +1,117 @@

+[gMASK]<sop>
+{%- if tools -%}
+{%- macro tool_to_json(tool) -%}
+    {%- set ns_tool = namespace(first=true) -%}
+    {{ '{' -}}
+    {%- for k, v in tool.items() -%}
+        {%- if k != 'defer_loading' and k != 'strict' -%}
+            {%- if not ns_tool.first -%}{{- ', ' -}}{%- endif -%}
+            {%- set ns_tool.first = false -%}
+            "{{ k }}": {{ v | tojson(ensure_ascii=False) }}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- '}' -}}
+{%- endmacro -%}
+<|system|>
+# Tools
+You may call one or more functions to assist with the user query.
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{% for tool in tools %}
+{%- if 'function' in tool -%}
+    {%- set tool = tool['function'] -%}
+{%- endif -%}
+{% if tool.defer_loading is not defined or not tool.defer_loading %}
+{{ tool_to_json(tool) }}
+{% endif %}
+{% endfor %}
+</tools>
+For each function call, output the function name and arguments within the following XML format:
+<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}
+{%- macro visible_text(content) -%}
+    {%- if content is string -%}
+        {{- content }}
+    {%- elif content is iterable and content is not mapping -%}
+        {%- for item in content -%}
+            {%- if item is mapping and item.type == 'text' -%}
+                {{- item.text }}
+            {%- elif item is string -%}
+                {{- item }}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- else -%}
+        {{- content }}
+    {%- endif -%}
+{%- endmacro -%}
+{%- set ns = namespace(last_user_index=-1, thinking_indices='') -%}
+{%- for m in messages %}
+    {%- if m.role == 'user' %}
+        {%- set ns.last_user_index = loop.index0 -%}
+    {%- elif m.role == 'assistant' %}
+        {%- if m.reasoning_content is string %}
+            {%- set ns.thinking_indices = ns.thinking_indices ~ ',' ~ ns.last_user_index ~ ',' -%}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- set ns.has_thinking = false -%}
+{%- for m in messages -%}
+{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}{% set ns.has_thinking = (',' ~ loop.index0 ~ ',') in ns.thinking_indices -%}
+{%- elif m.role == 'assistant' -%}
+<|assistant|>
+{%- set content = visible_text(m.content) %}
+{%- if m.reasoning_content is string %}
+    {%- set reasoning_content = m.reasoning_content %}
+{%- elif '</think>' in content %}
+    {%- set reasoning_content = content.split('</think>')[0].split('<think>')[-1] %}
+    {%- set content = content.split('</think>')[-1] %}
+{%- elif loop.index0 > ns.last_user_index and not (enable_thinking is defined and not enable_thinking) %}
+    {%- set reasoning_content = '' %}
+{%- elif loop.index0 < ns.last_user_index and ns.has_thinking %}
+    {%- set reasoning_content = '' %}
+{%- endif %}
+{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content is defined -%}
+{{ '<think>' + reasoning_content +  '</think>'}}
+{%- else -%}
+{{ '</think>' }}
+{%- endif -%}
+{%- if content.strip() -%}
+{{ content.strip() }}
+{%- endif -%}
+{% if m.tool_calls %}
+{% for tc in m.tool_calls %}
+{%- if tc.function %}
+    {%- set tc = tc.function %}
+{%- endif %}
+{{- '<tool_call>' + tc.name -}}
+{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}
+{% endif %}
+{%- elif m.role == 'tool' -%}
+{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+    {{- '<|observation|>' -}}
+{%- endif %}
+{%- if m.content is string -%}
+    {{- '<tool_response>' + m.content + '</tool_response>' -}}
+{%- else -%}
+    {{- '<tool_response><tools>\n' -}}
+    {% for tr in m.content %}
+        {%- for tool in tools -%}
+            {%- if 'function' in tool -%}
+                {%- set tool = tool['function'] -%}
+            {%- endif -%}
+            {%- if tool.name == tr.name -%}
+                {{- tool_to_json(tool) + '\n' -}}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endfor -%}
+    {{- '</tools></tool_response>' -}}
+{% endif -%}
+{%- elif m.role == 'system' -%}
+<|system|>{{ visible_text(m.content) }}
+{%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    <|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}
+{%- endif -%}
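
Two pieces of the template above are easy to misread: the generation prompt ends in `</think>` only when `enable_thinking` is explicitly false, and an assistant message containing `</think>` is split into reasoning and answer. The sketch below mirrors just those two expressions in plain Python. It is an illustration, not a replacement for rendering the Jinja template; the function names are hypothetical, and `None` is used here to stand in for "variable not passed to the template".

```python
def generation_suffix(enable_thinking=None):
    """Mirror the final `add_generation_prompt` branch of the template.

    None models an undefined `enable_thinking` (an assumption of this sketch).
    """
    if enable_thinking is not None and not enable_thinking:
        # Direct-answer mode: the thinking block is closed immediately.
        return "<|assistant|></think>"
    # Default and thinking mode: the model continues inside an open <think>.
    return "<|assistant|><think>"


def split_reasoning(content):
    """Mirror the `'</think>' in content` branch for assistant messages.

    Returns (reasoning, answer); reasoning is None when no </think> is present.
    """
    if "</think>" not in content:
        return None, content
    reasoning = content.split("</think>")[0].split("<think>")[-1]
    answer = content.split("</think>")[-1]
    return reasoning, answer


print(generation_suffix(False))
print(split_reasoning("<think>2+2 is 4</think>4"))
```

This also shows why thinking mode is the default: leaving `enable_thinking` unset yields the same `<think>` suffix as setting it to `True`.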

config.json ADDED Viewed

	@@ -0,0 +1,63 @@

+{
+  "architectures": [
+    "GlmMoeDsaForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "dtype": "bfloat16",
+  "eos_token_id": [
+    154820,
+    154827,
+    154829
+  ],
+  "ep_size": 1,
+  "first_k_dense_replace": 3,
+  "hidden_act": "silu",
+  "head_dim": 64,
+  "hidden_size": 6144,
+  "index_head_dim": 128,
+  "index_n_heads": 32,
+  "index_topk": 2048,
+  "indexer_rope_interleave": true,
+  "initializer_range": 0.02,
+  "intermediate_size": 12288,
+  "kv_lora_rank": 512,
+  "max_position_embeddings": 202752,
+  "moe_intermediate_size": 2048,
+  "moe_layer_freq": 1,
+  "model_type": "glm_moe_dsa",
+  "n_group": 1,
+  "n_routed_experts": 256,
+  "n_shared_experts": 1,
+  "norm_topk_prob": true,
+  "num_attention_heads": 64,
+  "num_experts_per_tok": 8,
+  "num_hidden_layers": 78,
+  "num_key_value_heads": 64,
+  "num_nextn_predict_layers": 1,
+  "pad_token_id": 154820,
+  "pretraining_tp": 1,
+  "q_lora_rank": 2048,
+  "qk_head_dim": 256,
+  "qk_nope_head_dim": 192,
+  "qk_rope_head_dim": 64,
+  "rms_norm_eps": 1e-05,
+  "rope_interleave": true,
+  "rope_parameters": {
+    "rope_theta": 1000000,
+    "rope_type": "default"
+  },
+  "routed_scaling_factor": 2.5,
+  "scoring_func": "sigmoid",
+  "tie_word_embeddings": false,
+  "topk_group": 1,
+  "topk_method": "noaux_tc",
+  "transformers_version": "5.4.0",
+  "use_cache": true,
+  "v_head_dim": 256,
+  "vocab_size": 154880,
+  "quantization": {
+    "group_size": 64,
+    "bits": 2
+  }
+}
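
The README's claim that most of the weight budget lives in the routed experts can be sanity-checked from the fields above. The sketch below is a back-of-the-envelope estimate under stated assumptions: each routed expert is taken to be a SwiGLU-style MLP with three `hidden_size × moe_intermediate_size` matrices (gate, up, down), the first `first_k_dense_replace` layers are taken to be dense, and the MTP layer is ignored. None of these assumptions is confirmed by the config itself.

```python
# Values copied from config.json above.
cfg = {
    "hidden_size": 6144,
    "moe_intermediate_size": 2048,
    "n_routed_experts": 256,
    "num_hidden_layers": 78,
    "first_k_dense_replace": 3,
}


def routed_expert_params(cfg, matrices_per_expert=3):
    """Estimate total parameters held in routed expert MLPs.

    matrices_per_expert=3 assumes gate/up/down projections (an assumption).
    """
    per_matrix = cfg["hidden_size"] * cfg["moe_intermediate_size"]
    per_expert = matrices_per_expert * per_matrix
    # Assume the first few layers are dense and every later layer is MoE.
    moe_layers = cfg["num_hidden_layers"] - cfg["first_k_dense_replace"]
    return per_expert * cfg["n_routed_experts"] * moe_layers


total = routed_expert_params(cfg)
share = total / 744e9  # 744B total parameters, as stated in the README
print(f"{total / 1e9:.1f}B routed-expert parameters, {share:.1%} of 744B")
```

Under these assumptions the routed experts come to roughly 725B parameters, which is consistent with the README's reasoning for spending the 2-bit tier on them.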

generation_config.json ADDED Viewed

	@@ -0,0 +1,12 @@

+{
+  "_from_model_config": true,
+  "eos_token_id": [
+    154820,
+    154827,
+    154829
+  ],
+  "pad_token_id": 154820,
+  "temperature": 1.0,
+  "top_p": 0.95,
+  "transformers_version": "5.4.0"
+}
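
The defaults above (`temperature` 1.0, `top_p` 0.95) describe nucleus sampling over the unmodified distribution, since dividing logits by a temperature of 1.0 changes nothing. As a minimal sketch of what `top_p` does, the function below filters a plain list of probabilities. It is a generic illustration of the technique, not MLX Studio's sampler, and `top_p_filter` is a hypothetical name.

```python
def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p.

    Returns {token_index: renormalised_probability}.
    """
    # Visit token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalise so the surviving probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9)
print(sorted(nucleus))
```

A sampler would then draw the next token from the returned distribution; greedy decoding (`temp=0.0` in the README examples) bypasses this step entirely.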

jang_config.json ADDED Viewed

	@@ -0,0 +1,37 @@

+{
+  "quantization": {
+    "method": "jang-importance",
+    "profile": "JANG_1L",
+    "target_bits": 1.0,
+    "actual_bits": 2.15,
+    "block_size": 64,
+    "calibration_method": "weights",
+    "quantization_method": "mse",
+    "scoring_method": "weight-magnitude",
+    "bit_widths_used": [
+      2,
+      8
+    ],
+    "quantization_scheme": "asymmetric",
+    "quantization_backend": "mx.quantize",
+    "hadamard_rotation": false
+  },
+  "source_model": {
+    "name": "GLM-5.1-FP8",
+    "dtype": "bfloat16",
+    "parameters": "30.4B"
+  },
+  "architecture": {
+    "type": "moe",
+    "attention": "mla",
+    "has_vision": false,
+    "has_ssm": false,
+    "has_moe": true
+  },
+  "runtime": {
+    "total_weight_bytes": 212140032,
+    "total_weight_gb": 0.2
+  },
+  "format": "jang",
+  "format_version": "2.0"
+}
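
Given `actual_bits` of 2.15 and `bit_widths_used` of 2 and 8, one can estimate what share of the weights sits in the 8-bit tiers. The sketch below solves the two-tier mixing equation `high * f + low * (1 - f) = average` for `f`. It assumes `actual_bits` is a plain weighted average over weights and ignores per-group scales, biases, and file metadata, so it is a rough guide only.

```python
def high_bit_fraction(avg_bits, low_bits, high_bits):
    """Fraction of weights stored at high_bits in a two-width mix."""
    return (avg_bits - low_bits) / (high_bits - low_bits)


# Values from jang_config.json: actual_bits=2.15, bit_widths_used=[2, 8].
frac = high_bit_fraction(2.15, 2, 8)
print(f"{frac:.1%} of weights at 8 bits, {1 - frac:.1%} at 2 bits")
```

Under that assumption only about one weight in forty is kept at 8 bits, which matches the profile's intent of protecting a small set of critical tensors.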

model-00002-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4b083b03daf0e45c3a913730a2ba3cab0c062a75f2e84d6868b22a6c0dfaa5bf
+size 1011056984

model-00007-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d0e6e638fd1fd1f270ccaf35b6c04f080864d6a8c0491280aee4ecd434448230
+size 1232036160

model-00008-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bb3083b71d6e51223540bfed9e2232fa37de3b7c60271b45b56d4f9ad714ee63
+size 1006633376

model-00015-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b813524f44153ebf8608822e932d1057a7b4f08b65e8b03caf325088f1c2cda6
+size 1006633368

model-00028-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1d8096a3d4fe83103dd83d02c0fb53f4f8ea2a463ebf5492d2cf8e16e7b3830f
+size 1232036160

model-00030-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:96bd901a08a230f7afc2b2c8df6f4e84b218f8226bf91639d24b90a76cc3501a
+size 1006633368

model-00035-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b6f2dccd74bf617d7302a10a0f0d972e2e557b478f45aa105737d8ca8e875067
+size 1055654984

model-00036-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:23d07779e7de0af7cae9d50d81bc77f83956072c2e77c83ebde36bc1084f3e95
+size 1006633376

model-00043-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3286f53888143861ea8e8be1fdfc3a1c0ec5f2cfc4c7a9bdb2b55ebdec8191af
+size 1006633368

model-00046-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6a7e3feced79bd137bbfd32c5b81c029d0179a07d90a023621e9647b23fe90c8
+size 1006633368

model-00049-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7bbf83247bc463f3da51625e9d86d9204198af0fb59c5ab7d199ce8c10631afb
+size 1006633368

model-00051-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c5376136846c69cf9a59ad1fafb251ad874170fc955b52813a526afc5461ea5c
+size 1006633376

model-00054-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4d2aa474f6756e0d88b95a12e09225825415eaeee5480e415fbfbe17144672db
+size 1006633376

model-00071-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:877db64a210f053d08c2e1611a771df81822fa732d0e50aabb57e77c849b0b6b
+size 1232036160

model-00074-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:87adbf03d65886d8596efc9a0ce58ab4132e0de65a2cc368ef52c96583d14174
+size 1232036160

model-00082-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e937b5f49b0a8ccafbe32c3e72f98e1bee87f565bd89b53fd2715205d0ce895b
+size 1006633368

model-00084-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:50a0fffef02cb093407facda820819895c88210515861e942c643dd15362fbb2
+size 1006633376

model-00087-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c9429659347404203391aa1428d38741b59103b418b1e12566a392d6a8810d23
+size 1006633376

model-00088-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:176cb357b5973631069735c7cd26ccd2d90e6d7117014a11b044724773a6a73e
+size 1006633368

model-00109-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8a63bb751dfc2d760ce6763ff12cf04bde2ba69f31ab94be97217872a9e14c75
+size 1006633368

model-00111-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b04acfa8728c4d80fb0f34ef2705c72e9115b98bad6871fd84c3ca038da35ee8
+size 1006633376

model-00114-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3a88f631625296bc6cb3dc8b35e7dd5334d886cb93c52a648889954054e3f83d
+size 1006633376

model-00123-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f41c1ce4f892d2cbbc9c28de486eac038dfbace2f207c9cbd3bc9ac2d9871e0c
+size 1006633376

model-00126-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e3f0946a6c8abc9fa9a79556e789d75aec4cf8d76f66419262664de60e7e24d9
+size 1006633376

model-00129-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:21f54fbea33dc5961f8e72ece60b0c56c24fdde4bc70466e530f8b4fab30cd32
+size 1006633376

model-00134-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c05416a586e52735e2d7f429143ac4074abb3205d17c11933b9a7f94c6784c20
+size 1232036128

model-00150-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:85b7c0635552217b5afa44212b7b595909ddf3fd77c5f0a2035613bcb630b7f8
+size 1006633376

model-00155-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c1aaccf73a9ea7e67c3366e905d796f2e45dfc6ae4ed3963c139e7c06b3c6f60
+size 1232036160

model-00156-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:92f09ce859ce4f419fc8652c7f66c0f0ae30a889ccff3426e44aaf1c3aad6b62
+size 1006633376

model-00162-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c2acc3fc2afce5ec529f3a2cf53e7b7479e162e4ca89fff181a1bf191608c47e
+size 1006633376

model-00167-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c762c0ec1aa5a0b338bfbf048a345773817dad9222c888b8ea2aa98485bb3ab3
+size 1232036128

model-00168-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:24ee68158f22d267376b02a32178b366572c3ce370188052b6186717552bddfe
+size 1006633376

model-00175-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ff9a70353b3092bfa2b81959fb39a376099dac54b319536b60e3572224a921b8
+size 1006633368

model-00183-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e53137fada9f42bab79090179afea5ee6ec5ec673c3a046ce9a6d6619d4021be
+size 1006633376

model-00186-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:97bc971b4d326466ea6498769cfeb162ee69d62851f3f1689c3752256f088665
+size 1006633376

model-00191-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:504bc9236ad48f75686b5409a21836b8b361767e78698e8058e940332af99fd3
+size 1232036160

model-00194-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58f06e8b94017e1cd1018935348821c4afc8ce26a5cee6b01afb455e78f208c2
+size 1232036160

model-00203-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ed013916d8d2be40d61c707bb1e9948d3d2e307692c20b6fca99d414f149cab0
+size 1232036160

model-00206-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6e35da67a89b73d45b64ba234457b49e18364f6601c89940a198eba9f9c36742
+size 1232036160

model-00209-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:83740ff561eb004abc8fe0bbe6de7f27c6bc7a95bdc0cedfd03876ae2336af7e
+size 1232036160

model-00223-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a1f2c96b163f8d7b8871a3eaf2726e363c046235a161f67643c6a8e05fc0e099
+size 1006633368

model-00226-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cde75d4dbd03a11fb23be36d267c177f259e6af0b3c842348cf3bfbb63e0baa5
+size 1006633368

model-00231-of-00233.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:26441724dd595ab2ad9d5ee6ab1a144663133dbca531ef460c37f61bb5af84d0
+size 1006633368
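
Each `.safetensors` entry above is a Git LFS pointer: three lines giving the spec version, a `sha256` object ID, and the byte size of the real shard. A minimal sketch of using a pointer to validate a downloaded shard follows. The function names are hypothetical, and the sketch assumes the pointer text has already had the leading `+` diff markers removed.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer, chunk_size=1 << 20):
    """Check a local file against a parsed pointer (size first, then hash)."""
    p = Path(path)
    if p.stat().st_size != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with p.open("rb") as f:
        # Hash in chunks so multi-gigabyte shards are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer for model-00002-of-00233.safetensors, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4b083b03daf0e45c3a913730a2ba3cab0c062a75f2e84d6868b22a6c0dfaa5bf\n"
    "size 1011056984\n"
)
print(pointer["size"])
```

Checking the size before hashing makes truncated downloads fail fast, which matters with 233 shards of about 1 GB each.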

model.safetensors.index.json ADDED Viewed

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED Viewed

	@@ -0,0 +1,33 @@

+{
+  "backend": "tokenizers",
+  "clean_up_tokenization_spaces": false,
+  "do_lower_case": false,
+  "eos_token": "<|endoftext|>",
+  "extra_special_tokens": [
+    "<|endoftext|>",
+    "[MASK]",
+    "[gMASK]",
+    "[sMASK]",
+    "<sop>",
+    "<eop>",
+    "<|system|>",
+    "<|user|>",
+    "<|assistant|>",
+    "<|observation|>",
+    "<|begin_of_image|>",
+    "<|end_of_image|>",
+    "<|begin_of_video|>",
+    "<|end_of_video|>",
+    "<|begin_of_audio|>",
+    "<|end_of_audio|>",
+    "<|begin_of_transcription|>",
+    "<|end_of_transcription|>"
+  ],
+  "is_local": true,
+  "model_max_length": 202752,
+  "model_specific_special_tokens": {},
+  "pad_token": "<|endoftext|>",
+  "padding_side": "left",
+  "remove_space": false,
+  "tokenizer_class": "TokenizersBackend"
+}