Anyone tried the DeepSeek v4 GGUF? Official 8bit quant?
Right, this is quantized from the official DeepSeek v4 Flash release, here: https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
Can you explain how you managed to convert to GGUF, please? I have been trying and it is very hard.
Can you share the converter script (convert_hf_to_gguf.py) or the BF16 version of the files, please? The link is not working.
llama.cpp cannot launch this
They may have modified their version. @tecaprovn, do you mind letting us know?
Hello, I have a question: how do I convert the precision to bf16?
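For the bf16 question: with architectures llama.cpp already supports, the stock converter can emit bf16 directly. A minimal sketch, assuming the paths are placeholders and noting that this presumably fails on DSV4 until the architecture lands upstream:

```shell
# Clone llama.cpp and install the converter's Python dependencies
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt

# Convert the HF checkpoint to a bf16 GGUF
# (./DeepSeek-V4-Flash is a placeholder for your local model directory)
python llama.cpp/convert_hf_to_gguf.py ./DeepSeek-V4-Flash \
    --outtype bf16 --outfile deepseek-v4-flash-bf16.gguf
```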
llama.cpp cannot launch this either.
Tried it in LM Studio; can confirm it does not work.
Yeah, how is anyone running this at all? GGUF DSV4 support doesn’t exist anywhere. It’s not even close.
This version is corrupt and none of the files work.
I managed to convert an MXFP4 version and it works, but it only produces 4 TPS on my gear. I thought I could quantize it down to the ~60 GB range, but it comes out around 160 GB.
The other issue is that the output quality is poor, so I gave up. The problem is that it's a mixture of FP8 and FP4, and I don't think it will compress much further. It's not for me unless it's pruned heavily, or the smarter folks do something like a gpt-oss-style version, which would be awesome.
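On the size point: once a bf16 GGUF exists, llama.cpp's requantizer is the usual way to shrink it toward the 60 GB range. A sketch with placeholder filenames; I show Q4_K_M here because the exact MXFP4 type string can differ between llama.cpp builds:

```shell
# Requantize an existing GGUF down to ~4-bit weights.
# Run from a built llama.cpp tree; both filenames are placeholders.
./llama-quantize deepseek-v4-flash-bf16.gguf deepseek-v4-flash-q4_k_m.gguf Q4_K_M
```

Whether this helps with the FP8/FP4 mixture described above is another question; requantizing from an already low-precision source tends to compound the quality loss.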