🤔 Isn't an 8.61 GB safetensors file a bit big for 9B quantised to 4 bits?

#1
by HenkPoley - opened

Would have expected 9B params × 2 bytes (FP16) × 4/16 ≈ 4.5 GB.

Z Lab org

We keep the original visual components in high precision, so the model size is larger than that of comparable pure-LLM models. Other quantized VLMs are similar, e.g. https://huggingface.co/Qwen/Qwen3.5-27B-GPTQ-Int4.
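A rough sketch of why the file exceeds the naive 4.5 GB estimate: 4-bit weights also carry quantization metadata (scales/zeros), and any component kept in 16-bit precision counts at full size. The parameter splits and group size below are assumptions for illustration, not the actual checkpoint layout.

```python
# Back-of-the-envelope checkpoint size. All component sizes here are
# hypothetical assumptions, not the real model's layout.
llm_params = 9e9        # language-model weights, quantised to 4 bits
vision_params = 0.7e9   # assumed vision tower kept in BF16 (16 bits)

llm_bytes = llm_params * 4 / 8            # 4-bit packed weights
scales_bytes = llm_params / 128 * 2       # one FP16 scale per assumed 128-weight group
vision_bytes = vision_params * 16 / 8     # high-precision visual components

total_gb = (llm_bytes + scales_bytes + vision_bytes) / 1e9
print(f"{total_gb:.2f} GB")  # → 6.04 GB under these assumptions
```

Even with these conservative assumptions the total lands well above 4.5 GB; larger vision towers, embeddings kept in high precision, or per-group zero-points push it higher still.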
We have an option to disable loading the VLM components (though you will still have to download them); please check out https://github.com/z-lab/paroquant.

liang2kl changed discussion status to closed
