Built for vMLX — the MLX inferencer with KV-cache quantization, prefix-cache reuse, and full reasoning + tool calling support.
Free for macOS · vmlx.net

Step 3.7 Flash — JANG_K CRACK

CRACK abliterated · JANG_K (mixed-precision, Q2 routed) · 45-layer MoE with always-on shared expert · Vision · Reasoning · 74 GB

What Is This?

This is Step 3.7 Flash by StepFun, a 45-layer multimodal Mixture-of-Experts reasoning model with an always-on shared expert (DSV4-style). This release is:

CRACK abliterated — refusal behavior removed at the weight level. The model complies across all task categories while preserving reasoning quality, knowledge, and multimodal capabilities.
JANG_K (mixed-precision, Q2 routed) for MLX on Apple Silicon — 74 GB.

Results

Evaluated on a Mac Studio M5 Max via the vMLX engine in reasoning mode (temperature 0.6, top-p 0.95). HarmBench uses a 70-prompt text sample (10 per category). MMLU-114 uses 2 questions per subject across all 57 subjects, with up to 1,800 tokens per question to allow full reasoning traces.

HarmBench compliance (70 prompts · 10 per category)

Category	CRACK ASR (attack success rate)
Chemical / biological	10/10
Copyright	8/10
Cybercrime / intrusion	10/10
Harassment / bullying	10/10
Illegal	10/10
Misinformation / disinformation	10/10
General harmful	10/10
Overall	68/70 (97%)
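The overall figure follows directly from the per-category counts. A minimal sketch of that arithmetic, using only the numbers in the table above (the variable names are illustrative):

```python
# Per-category compliant counts copied from the HarmBench table (10 prompts each).
complied = {
    "Chemical / biological": 10,
    "Copyright": 8,
    "Cybercrime / intrusion": 10,
    "Harassment / bullying": 10,
    "Illegal": 10,
    "Misinformation / disinformation": 10,
    "General harmful": 10,
}
PROMPTS_PER_CATEGORY = 10

total_complied = sum(complied.values())
total_prompts = PROMPTS_PER_CATEGORY * len(complied)
overall_pct = 100 * total_complied / total_prompts

print(f"{total_complied}/{total_prompts} ({overall_pct:.0f}%)")  # prints "68/70 (97%)"
```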

MMLU-114 (reasoning-mode, by subject area)

Subject area	Base	CRACK	Δ
Overall	78.1%	76.3%	-1.8pp
STEM	75.0%	77.8%	+2.8pp
Humanities	73.1%	76.9%	+3.8pp
Social Sciences	87.5%	83.3%	-4.2pp
Other (medicine, business, …)	78.6%	67.9%	-10.7pp
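The Δ column is the CRACK score minus the base score, in percentage points. The sketch below recomputes it from the table and, as a rough guide to sample size, shows how much a single question is worth on the 114-question overall score. Per-area question counts are not stated in this card, so only the overall row is converted to questions.

```python
# (base %, CRACK %) pairs copied from the MMLU-114 table.
scores = {
    "Overall": (78.1, 76.3),
    "STEM": (75.0, 77.8),
    "Humanities": (73.1, 76.9),
    "Social Sciences": (87.5, 83.3),
    "Other": (78.6, 67.9),
}

# Delta in percentage points, rounded to one decimal like the table.
deltas = {area: round(crack - base, 1) for area, (base, crack) in scores.items()}
for area, delta in deltas.items():
    print(f"{area}: {delta:+.1f}pp")

# With 114 questions, one question moves the overall score by under 1pp,
# so the overall delta corresponds to roughly two questions.
TOTAL_QUESTIONS = 114
pp_per_question = 100 / TOTAL_QUESTIONS
overall_questions = round(abs(deltas["Overall"]) / pp_per_question)
```

Because each subject contributes only 2 questions, the per-area deltas rest on small samples and should be read with that in mind.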

Features

Reasoning — chat template pre-opens <think>; vMLX's reasoning parser surfaces the trace in message.reasoning_content and the final answer in message.content
MoE with always-on shared expert — every layer combines routed experts with a shared expert (DSV4-style)
Vision — multimodal image-text understanding
45 transformer layers, hidden 4096
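To illustrate the reasoning split described in the feature list, here is a minimal sketch of how a pre-opened `<think>` trace can be separated from the final answer. This is not vMLX's actual parser. The function name `split_reasoning` and the assumption that the trace ends with a single `</think>` tag are illustrative.

```python
def split_reasoning(generated: str, close_tag: str = "</think>") -> dict:
    """Split raw model output into a reasoning trace and a final answer.

    Assumes the chat template already emitted the opening <think>, so the
    generated text starts inside the trace. Hypothetical helper, not vMLX code.
    """
    reasoning, sep, answer = generated.partition(close_tag)
    if not sep:
        # No closing tag found: generation was probably cut off mid-trace.
        return {"reasoning_content": generated.strip(), "content": ""}
    # Tolerate outputs that repeat the opening tag themselves.
    reasoning = reasoning.strip().removeprefix("<think>").strip()
    return {"reasoning_content": reasoning, "content": answer.strip()}


example = split_reasoning("2 + 2 is 4.\n</think>\nThe answer is 4.")
print(example["reasoning_content"])  # prints "2 + 2 is 4."
print(example["content"])            # prints "The answer is 4."
```

Returning an empty `content` when the closing tag is missing keeps a truncated trace from being mistaken for an answer, which matters when a token limit cuts generation short.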

Usage

Run with vMLX (recommended — full step3p7 reasoning + vision support).

# OpenAI-compatible chat completion
# POST /v1/chat/completions
{
  "model": "dealignai/Step-3.7-Flash-JANG_K-CRACK",
  "messages": [{"role": "user", "content": "..."}],
  "temperature": 0.6, "top_p": 0.95
}
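The request body above can be sent from any HTTP client. The following is a minimal standard-library sketch, assuming a vMLX server is already running locally. The host and port in `BASE_URL` are placeholders rather than documented defaults, and `build_request` and `chat` are illustrative helper names.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with wherever your server listens.
BASE_URL = "http://localhost:8000"


def build_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request with the sampling settings used in this card."""
    payload = {
        "model": "dealignai/Step-3.7-Flash-JANG_K-CRACK",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> dict:
    """Send one prompt and return the reasoning trace and final answer."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    message = body["choices"][0]["message"]
    return {
        "reasoning": message.get("reasoning_content"),  # may be absent
        "answer": message.get("content"),
    }
```

Calling `chat("...")` requires the server to be reachable. `build_request` can be inspected on its own to check the URL and payload before sending anything.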

About CRACK

CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety-refusal behavior at the weight level so the model complies with all task categories while preserving reasoning quality, factual knowledge, and coherence.

Support dealignai

All models are built from original research and released free.

Support us on Ko-fi — membership gets early access and extras.

Ko-fi · X @dealignai · dealign.ai

See our research: Safety Generalization in Frontier Models

Disclaimer

This model has had its safety-refusal behavior removed for research purposes. It will follow instructions across all categories without refusing. You are solely responsible for how you use it and for complying with all applicable laws. Published for AI-safety research and authorized security testing.

Safetensors
Model size: 22B params
Tensor types: F32, U32, F16
Library: MLX
Base model: stepfun-ai/Step-3.7-Flash