DeepSeek-V4-Flash-FP8

An FP8 re-packaging of deepseek-ai/DeepSeek-V4-Flash. The model architecture, tokenizer, chat template, and reference encoding/ files are unchanged from the base repo. Weights only: no fine-tuning, no retraining.

Deployment

SGLang Cookbook: https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4
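
As a rough sketch of what serving this checkpoint with SGLang typically looks like (the exact flags, tensor-parallel size, and any model-specific options are assumptions here; follow the cookbook above for the authoritative recipe):

```shell
# Hypothetical launch command; consult the SGLang cookbook linked above
# for the documented flags for DeepSeek-V4.
# --tp 8 is an assumed tensor-parallel size and must match your GPU count;
# a 291B-parameter FP8 model requires a multi-GPU node.
python -m sglang.launch_server \
  --model-path sgl-project/DeepSeek-V4-Flash-FP8 \
  --tp 8 \
  --trust-remote-code
```

Once up, SGLang serves an OpenAI-compatible API (by default on port 30000), so standard OpenAI client libraries can be pointed at the local endpoint.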

License

MIT — see LICENSE. Copyright © DeepSeek.

Safetensors

Model size: 291B params
Tensor types: BF16 · I64 · F32 · F8_E4M3
