Anyone tried the DeepSeek v4 GGUF? Official 8bit quant?
Right, this is quantized from the official DeepSeek v4 Flash release, here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
Can you explain how you managed to convert to GGUF, please? I have been trying and it is very hard.
Can you share the converter script (convert_hf_to_gguf.py) or the BF16 version of the files, please? The link is not working.
llama.cpp cannot launch this
They may have modified their version. @tecaprovn, do you mind letting us know?
Hello, I have a question: how do I convert the precision to bf16?
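For the bf16 question: with architectures llama.cpp already supports, the stock converter can emit bf16 directly. A minimal sketch, assuming the paths are placeholders and noting that this presumably fails on DSV4 until the architecture lands upstream:

```shell
# Clone llama.cpp and install the converter's Python dependencies
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the HF checkpoint to a bf16 GGUF
# (./DeepSeek-V4-Flash is a placeholder for your local model directory)
python llama.cpp/convert_hf_to_gguf.py ./DeepSeek-V4-Flash \
    --outtype bf16 --outfile deepseek-v4-flash-bf16.gguf
```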
llama.cpp cannot launch this either.
Tried it in LM Studio; can confirm it does not work.
Yeah, how is anyone running this at all? GGUF DSV4 support doesn’t exist anywhere. It’s not even close.
This version is corrupt and none of the files work.
I managed to convert an MXFP4 version and it works, but it only produces 4 TPS on my gear. I thought I could quantize it down to the ~60 GB range, but it comes out around 160 GB.
The other issue is that the output quality is poor, so I gave up. The problem is that it's a mixture of FP8 and FP4, and I don't think it will compress much further. It's not for me unless it's pruned heavily, or the smarter folks do something like a gpt-oss-style version, which would be awesome.
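On the size point: once a bf16 GGUF exists, llama.cpp's requantizer is the usual way to shrink it toward the 60 GB range. A sketch with placeholder filenames; I show Q4_K_M here because the exact MXFP4 type string can differ between llama.cpp builds:

```shell
# Requantize an existing GGUF down to ~4-bit weights.
# Run from a built llama.cpp tree; both filenames are placeholders.
./llama-quantize deepseek-v4-flash-bf16.gguf deepseek-v4-flash-q4_k_m.gguf Q4_K_M
```

Whether this helps with the FP8/FP4 mixture described above is another question; requantizing from an already low-precision source tends to compound the quality loss.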