Experimental global target bits‑per‑weight quantization of huihui-ai/Huihui-Kimi-Linear-48B-A3B-Instruct-abliterated

  • Using a non-standard (forked) llama.cpp branch for quantization.
  • Using a CLI tool to build KLD evaluation and imatrix calibration datasets for GGUF models, sourced from eaddario/imatrix-calibration.
  • Using dataset sources: tools, math, code, text_en, text_ru.
  • Using dataset chunks: 1000.
  • Tensors are quantized from F16 instead of BF16, which is friendlier to NVIDIA Pascal-architecture GPUs such as the P100.
  • Small set of patches added.
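
The F16-versus-BF16 trade-off mentioned in the list above can be illustrated with a small standard-library sketch. It is not part of the quantization pipeline; the helper names `round_trip_f16` and `round_trip_bf16` are hypothetical, and BF16 is emulated here by simple truncation of a float32 (real converters usually round). Pascal-generation GPUs predate hardware BF16 support, which is the assumed motivation for preferring F16.

```python
import struct

def round_trip_f16(x: float) -> float:
    """Store x as IEEE half precision (F16) and read it back."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the F16 range (max finite value 65504)
        return float("inf")

def round_trip_bf16(x: float) -> float:
    """Emulate BF16 by keeping only the top 16 bits of the float32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000  # drop the low 16 mantissa bits (truncation, no rounding)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

if __name__ == "__main__":
    for value in (0.1, 0.001, 300000.0):
        f16 = round_trip_f16(value)
        bf16 = round_trip_bf16(value)
        print(f"{value:>10}: F16={f16!r}  BF16={bf16!r}")
```

For values of typical weight magnitude, F16 keeps more mantissa bits than BF16 and so lands closer to the original, while very large values overflow F16 but still fit in BF16.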

Many thanks to Ed Addario for an impressive job.

Quantization comparison
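
As a reading aid for the metrics in this comparison, here is a minimal sketch of how KLD, Δp and the PPL figures can be computed from reference and quantized logits. It assumes the usual definitions (KL divergence of the quantized distribution from the reference one, Δp as the change in probability assigned to the correct token, PPL as the exponential of the mean negative log-likelihood); the exact formulas used by the evaluation tool are not stated here, and `compare` is a hypothetical helper name.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kld(p_ref, p_quant):
    """KL(ref || quant) in nats for a single token position."""
    return sum(p * math.log(p / q) for p, q in zip(p_ref, p_quant) if p > 0.0)

def compare(ref_logits, quant_logits, targets):
    """Aggregate per-position statistics over a short sequence."""
    klds, dps = [], []
    nll_ref = nll_quant = 0.0
    for r, q, t in zip(ref_logits, quant_logits, targets):
        p_ref, p_quant = softmax(r), softmax(q)
        klds.append(kld(p_ref, p_quant))
        dps.append(p_quant[t] - p_ref[t])  # change in correct-token probability
        nll_ref -= math.log(p_ref[t])
        nll_quant -= math.log(p_quant[t])
    n = len(targets)
    ppl_ref, ppl_quant = math.exp(nll_ref / n), math.exp(nll_quant / n)
    return {
        "mean_kld": sum(klds) / n,
        "max_kld": max(klds),
        "mean_dp": sum(dps) / n,
        "rms_dp": math.sqrt(sum(d * d for d in dps) / n),
        "ppl_ratio": ppl_quant / ppl_ref,
        "delta_ppl": ppl_quant - ppl_ref,
    }

if __name__ == "__main__":
    # Toy three-token vocabulary, two positions; values are illustrative only.
    ref = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
    quant = [[1.8, 1.1, 0.2], [0.6, 2.2, 0.4]]
    print(compare(ref, quant, targets=[0, 1]))
```

Lower KLD and Δp magnitudes, and a PPL ratio closer to 1, indicate a quantized model that tracks the reference more closely.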

| BPW/TGS | Size (MiB) | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Median KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.000 | 17574 | 97.54% | 1.099746 ± 0.001198 | 0.703413 ± 0.008959 | 0.122940 ± 0.000426 | 0.066714 | 9.294135 | 2.637652 | -2.015 ± 0.019 % | 10.272 ± 0.039 % |
| 3.250 | 19038 | 98.25% | 1.064443 ± 0.000969 | 0.454454 ± 0.006978 | 0.089130 ± 0.000308 | 0.050624 | 10.323897 | 1.992209 | -1.732 ± 0.016 % | 8.814 ± 0.035 % |
| 3.500 | 20502 | 98.69% | 1.042289 ± 0.000821 | 0.298224 ± 0.005835 | 0.064519 ± 0.000265 | 0.032550 | 8.311751 | 1.765223 | -1.217 ± 0.014 % | 7.481 ± 0.033 % |
| 3.750 | 21966 | 99.04% | 1.030411 ± 0.000697 | 0.214459 ± 0.004989 | 0.047054 ± 0.000201 | 0.022846 | 8.346386 | 1.258301 | -0.654 ± 0.012 % | 6.340 ± 0.031 % |
| 4.000 | 23430 | 99.15% | 1.028018 ± 0.000653 | 0.197583 ± 0.004668 | 0.042084 ± 0.000178 | 0.020843 | 7.031458 | 1.095171 | -0.639 ± 0.011 % | 5.992 ± 0.029 % |
| 4.250 | 24894 | 99.25% | 1.021574 ± 0.000610 | 0.152141 ± 0.004327 | 0.036486 ± 0.000158 | 0.018297 | 6.763760 | 1.009924 | -0.568 ± 0.011 % | 5.619 ± 0.028 % |
| 4.500 | 26358 | 99.50% | 1.012304 ± 0.000495 | 0.086766 ± 0.003506 | 0.024419 ± 0.000105 | 0.012078 | 5.742187 | 0.643191 | -0.204 ± 0.009 % | 4.565 ± 0.023 % |
| 4.750 | 27822 | 99.61% | 1.007470 ± 0.000434 | 0.052677 ± 0.003063 | 0.019040 ± 0.000086 | 0.009403 | 4.860535 | 0.504090 | -0.200 ± 0.008 % | 4.035 ± 0.020 % |
| 5.000 | 29286 | 99.67% | 1.006065 ± 0.000397 | 0.042769 ± 0.002799 | 0.015935 ± 0.000073 | 0.008021 | 3.658825 | 0.425957 | -0.201 ± 0.007 % | 3.731 ± 0.020 % |
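
The figures above can be sanity-checked with a little arithmetic. The sketch below assumes that file size is dominated by tensor data (so size in bits divided by BPW approximates the weight count) and that the PPL mean ratio and ΔPPL are the quotient and difference of the same quantized and baseline perplexities; both are assumptions, and the function names are hypothetical. Only three rows are copied in for brevity.

```python
BITS_PER_MIB = 8 * 1024 * 1024

# (BPW, size in MiB, PPL mean ratio, ΔPPL) copied from three rows of the table
ROWS = [
    (3.000, 17574, 1.099746, 0.703413),
    (4.000, 23430, 1.028018, 0.197583),
    (5.000, 29286, 1.006065, 0.042769),
]

def implied_weights(bpw: float, size_mib: int) -> float:
    """Approximate weight count: total bits divided by bits per weight."""
    return size_mib * BITS_PER_MIB / bpw

def implied_base_ppl(ratio: float, delta: float) -> float:
    """If ratio = Q/B and delta = Q - B, then B = delta / (ratio - 1)."""
    return delta / (ratio - 1.0)

if __name__ == "__main__":
    for bpw, size_mib, ratio, delta in ROWS:
        weights = implied_weights(bpw, size_mib)
        base = implied_base_ppl(ratio, delta)
        print(f"BPW {bpw:.3f}: ~{weights / 1e9:.2f}B weights, baseline PPL ~{base:.3f}")
```

Under these assumptions every row implies roughly 49.1 billion weights and a baseline perplexity of about 7.05, which suggests the sizes track the requested BPW targets closely.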

Format: GGUF
Model size: 49B params
Architecture: kimi-linear