🤔 Isn't an 8.61 GB safetensors file a bit big for 9B quantised to 4 bits?

#1
by HenkPoley - opened

Would have expected 9B params × 2 bytes (FP16) × 4/16 ≈ 4.5 GB.

Z Lab org

We keep the original visual components in high precision, so the model size is larger than that of comparable pure-LLM models. Other quantized VLMs are similar, e.g. https://huggingface.co/Qwen/Qwen3.5-27B-GPTQ-Int4.
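A rough sketch of why the file exceeds the naive 4.5 GB estimate: 4-bit weights also carry quantization metadata (scales/zeros), and any component kept in 16-bit precision counts at full size. The parameter splits and group size below are assumptions for illustration, not the actual checkpoint layout.

```python
# Back-of-the-envelope checkpoint size. All component sizes here are
# hypothetical assumptions, not the real model's layout.
llm_params = 9e9        # language-model weights, quantised to 4 bits
vision_params = 0.7e9   # assumed vision tower kept in BF16 (16 bits)

llm_bytes = llm_params * 4 / 8            # 4-bit packed weights
scales_bytes = llm_params / 128 * 2       # one FP16 scale per assumed 128-weight group
vision_bytes = vision_params * 16 / 8     # high-precision visual components

total_gb = (llm_bytes + scales_bytes + vision_bytes) / 1e9
print(f"{total_gb:.2f} GB")  # → 6.04 GB under these assumptions
```

Even with these conservative assumptions the total lands well above 4.5 GB; larger vision towers, embeddings kept in high precision, or per-group zero-points push it higher still.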
We have an option to disable loading the VLM components (though you will still have to download them); please check out https://github.com/z-lab/paroquant.

liang2kl changed discussion status to closed
