- gguf
- nvfp4
---
Some quants I use depending on memory availability, along with an NVFP4 variant in the hope that custom kernels become available.

I recommend the Q3K-IQ4XS and IQ4XS-Q5K quants; I currently use IQ4XS-Q4K.
# KLD
Due to hardware restrictions I have to use the Q8 version, rather than the original model, as the baseline for the KLD runs.

However, it is quantized the same way as the original model, which also uses 8 bits for the expert weights, so the difference should be small.

Sadly, some KLD runs produce weird outputs (NaN floats from llama-perplexity), so take these numbers with a salt lake.
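For reference, these numbers come from llama.cpp's `llama-perplexity` tool, which saves the baseline logits once and then scores each quant against them. A sketch of the two-step invocation; the model, eval text, and output file names are placeholders, not the actual files used here:

```shell
# Step 1: run the baseline model (Q8 here) and save its logits.
llama-perplexity -m model-Q8.gguf -f eval.txt --kl-divergence-base base.kld

# Step 2: score a quant against the saved baseline; this reports
# Mean PPL, Mean KLD, and "Same top p" among other statistics.
llama-perplexity -m model-IQ4XS.gguf -f eval.txt \
    --kl-divergence-base base.kld --kl-divergence
```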
|Provider |Quant |Size (GB) |Mean PPL |Mean KLD |Same top p |
|---------|-----------|----------|---------------------|---------------------|-----------------|
|KS |Q8 | |7.0266 ± 0.05210 |baseline |baseline |
|KS |IQ4XS |123.8 |7.153799 ± 0.053213 |0.086127 ± 0.001029 |89.425 ± 0.082 % |
|KS |IQ4XS-Q5K |135.5 | | | |
|KS |IQ4XS-Q4K |126.1 | | | |
|KS |NVFP4 |130.8 |7.177182 ± 0.053324 |0.105053 ± 0.001034 |88.154 ± 0.086 % |
|KS |Q3K-IQ4XS |108.6 |7.297092 ± 0.054489 |0.140361 ± 0.001216 |86.387 ± 0.091 % |
|unsloth |UD-Q4_K_XL |141 | | | |
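For readers unfamiliar with the columns: "Mean KLD" is the token-averaged KL divergence of the quant's next-token distribution from the baseline's, and "Same top p" is the percentage of positions where both models pick the same top token. A minimal sketch of how those two statistics are defined, using made-up toy logits rather than real model outputs:

```python
import math

def softmax(logits):
    # Numerically stable softmax over one position's logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kld(p, q):
    # KL divergence D(p || q): baseline distribution p vs quant distribution q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kld_stats(base_logits, quant_logits):
    klds, same_top = [], 0
    for b, q in zip(base_logits, quant_logits):
        p, r = softmax(b), softmax(q)
        klds.append(kld(p, r))
        same_top += p.index(max(p)) == r.index(max(r))
    n = len(klds)
    return sum(klds) / n, 100.0 * same_top / n  # Mean KLD, Same top %

# Toy example: two token positions over a vocabulary of 3.
base = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.2]]
quant = [[1.9, 1.1, 0.2], [2.4, 0.6, 0.1]]  # second position flips the top token
mean_kld, same_pct = kld_stats(base, quant)
```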