These are NOT actual AWQ-quantized models.
#1
by cai-cai - opened
Heads up! Despite the "AWQ" tag in the title, the config.json reveals these models are using standard compressed-tensors (W4A16) rather than the AWQ (Activation-aware Weight Quantization) method. Real AWQ requires an activation calibration process and specific scaling factors, which are missing here. This is misleading for users looking for actual AWQ kernels.
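As a quick check (a minimal sketch; the field values shown are illustrative, not copied from this repo), you can read `quantization_config.quant_method` from the model's config.json: genuine AWQ checkpoints report `"awq"` there, while these report `"compressed-tensors"`.

```python
import json

# Illustrative excerpt of a config.json quantization section
# (real configs contain many more fields).
config_text = """
{
  "model_type": "llama",
  "quantization_config": {
    "quant_method": "compressed-tensors",
    "format": "pack-quantized",
    "config_groups": {
      "group_0": {
        "weights": {"num_bits": 4, "type": "int"},
        "input_activations": null
      }
    }
  }
}
"""

config = json.loads(config_text)
method = config.get("quantization_config", {}).get("quant_method", "unknown")

# A real AWQ checkpoint would print "awq" here; these models print
# "compressed-tensors" despite the "AWQ" in the repo name.
print(method)  # -> compressed-tensors
```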
If you need the AWQ quant, you can easily find it with the AWQ tag. Is there any reason to prefer the original AWQ quant?
This guy is posting the same comment on every AWQ quant, lol. So annoying!