DeepSeek-V4-Flash-FP8

An FP8 re-packaging of deepseek-ai/DeepSeek-V4-Flash. The model architecture, tokenizer, chat template, and reference encoding/ files are unchanged from the base repo. Weights only: no fine-tuning, no retraining.

Deployment

SGLang Cookbook: https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4
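
As a rough sketch of what serving this checkpoint with SGLang typically looks like (the exact flags, tensor-parallel size, and any model-specific options are assumptions here; follow the cookbook above for the authoritative recipe):

```shell
# Hypothetical launch command; consult the SGLang cookbook linked above
# for the documented flags for DeepSeek-V4.
# --tp 8 is an assumed tensor-parallel size and must match your GPU count;
# a 291B-parameter FP8 model requires a multi-GPU node.
python -m sglang.launch_server \
  --model-path sgl-project/DeepSeek-V4-Flash-FP8 \
  --tp 8 \
  --trust-remote-code
```

Once up, SGLang serves an OpenAI-compatible API (by default on port 30000), so standard OpenAI client libraries can be pointed at the local endpoint.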

License

MIT — see LICENSE. Copyright © DeepSeek.

Safetensors

Model size: 291B params
Tensor types: BF16 · I64 · F32 · F8_E4M3
