caiovicentino1/Qwen3.5-4B-EOQ-Q6

@@ -10,40 +10,138 @@ Quantized with **EOQ (Entropy-Optimal Quantization)**: absmax Q6 + rANS entropy
 ## Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)
-| Format | Size | PPL (WikiText-2) |
-|--------|------|------------------|
-| FP16 | 8412 MB | 7.58 |
-| GGUF Q4_K_M | 2709 MB | ~ref |
-| **EOQ Q6** | **2944 MB** | **7.76** |
 EOQ Q6 is **8.7% larger** than GGUF Q4_K_M.
 PPL degradation vs FP16: +0.18 points.
 ## Inference Speed
 EOQ models are stored as dequantized FP16 safetensors.
 Inference speed is **identical to FP16** (no quantized kernels).
-This means EOQ is **not faster** than GGUF Q4_K_M at inference,
-since GGUF uses optimized INT4 kernels in llama.cpp that reduce
-memory bandwidth. EOQ advantage is **smaller file size** at
-comparable quality, not speed.
-Measured: 54.4 tok/s (same as FP16 baseline: 53.6 tok/s)
 ## Usage
 ## What is EOQ?
 EOQ combines block-wise absmax quantization with rANS entropy coding.
-Quantized weights have Shannon entropy below their bit width.
-rANS removes this redundancy losslessly, saving 10-18%.
-The result: simpler quantization (absmax) that matches complex
-GGUF K-quants in quality-per-byte, at smaller file size.
 ## GitHub
-https://github.com/caiovicentino/eoq-quantization

 ## Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)
+| Format | Size | PPL (WikiText-2) | tok/s |
+|--------|------|------------------|-------|
+| FP16 | 8412 MB | 7.58 | 54.0 |
+| GGUF Q4_K_M | 2709 MB | ~ref | ~ref |
+| **EOQ Q6** | **2944 MB** | **7.76** | **54.3** |
 EOQ Q6 is **8.7% larger** than GGUF Q4_K_M.
 PPL degradation vs FP16: +0.18 points.
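The size and perplexity comparisons follow directly from the benchmark table. A minimal sketch of that arithmetic, using only values from the table (the helper name `relative_size_percent` is hypothetical and not part of the EOQ repository):

```python
def relative_size_percent(size_mb, baseline_mb):
    """Percent by which size_mb exceeds baseline_mb (negative means smaller)."""
    return 100.0 * (size_mb - baseline_mb) / baseline_mb


# Sizes (MB) and WikiText-2 perplexities copied from the benchmark table.
FP16_MB, GGUF_Q4_K_M_MB, EOQ_Q6_MB = 8412, 2709, 2944
FP16_PPL, EOQ_Q6_PPL = 7.58, 7.76

vs_gguf = relative_size_percent(EOQ_Q6_MB, GGUF_Q4_K_M_MB)
vs_fp16 = relative_size_percent(EOQ_Q6_MB, FP16_MB)
ppl_delta = EOQ_Q6_PPL - FP16_PPL

print(f"EOQ Q6 vs GGUF Q4_K_M: {vs_gguf:+.1f}%")  # +8.7%
print(f"EOQ Q6 vs FP16:        {vs_fp16:+.1f}%")  # -65.0%
print(f"PPL delta vs FP16:     {ppl_delta:+.2f}")  # +0.18
```

A positive result means the first format is larger than the baseline, so EOQ Q6 is about 8.7% larger than GGUF Q4_K_M and about 65% smaller than FP16.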
+## Cross-Model Validation
+| Model | FP16 PPL | EOQ Q5 Size | EOQ Q5 PPL | Delta |
+|-------|----------|-------------|------------|-------|
+| Qwen2.5-0.5B | 10.87 | 279 MB | 11.69 | +0.83 |
+| Qwen2.5-3B | 6.54 | 1,724 MB | 6.77 | +0.23 |
+| Qwen3.5-4B | 7.58 | 2,398 MB | 7.77 | +0.18 |
+| Qwen3.5-27B | 5.65 | 15,353 MB | 5.94 | +0.31 |
+| Qwen3.5-35B-A3B | 5.19 | 19,680 MB | 5.39 | +0.21 |
 ## Inference Speed
 EOQ models are stored as dequantized FP16 safetensors.
 Inference speed is **identical to FP16** (no quantized kernels).
+EOQ's advantage is **smaller file size** at comparable quality, not speed.
+Measured: 54.3 tok/s (effectively the same as FP16: 54.0 tok/s)
 ## Usage
 ## What is EOQ?
 EOQ combines block-wise absmax quantization with rANS entropy coding.
+The result is a simple quantization scheme (absmax) that matches the more complex GGUF K-quants in quality per byte.
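To illustrate the idea behind combining absmax quantization with entropy coding, here is a minimal NumPy sketch. It is not the EOQ implementation: the block size of 32, the symmetric 6-bit range, the synthetic Gaussian weights, and all function names are assumptions made for illustration. It does not implement rANS; it only measures the Shannon entropy of the quantized codes, which is the lower bound an ideal entropy coder such as rANS approaches.

```python
import numpy as np


def absmax_quantize(weights, bits=6, block_size=32):
    """Block-wise symmetric absmax quantization (illustrative sketch)."""
    qmax = 2 ** (bits - 1) - 1  # 31 for 6 bits
    blocks = weights.reshape(-1, block_size)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)
    # One scale per block; guard against all-zero blocks.
    scales = np.where(absmax > 0, absmax / qmax, 1.0)
    codes = np.clip(np.round(blocks / scales), -qmax, qmax).astype(np.int8)
    return codes, scales


def dequantize(codes, scales):
    """Map integer codes back to floating point using the per-block scales."""
    return codes.astype(np.float32) * scales


def shannon_entropy_bits(codes):
    """Empirical Shannon entropy of the code distribution, in bits per symbol."""
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


# Synthetic stand-in for a weight tensor (assumption: roughly Gaussian weights).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=4096 * 32).astype(np.float32)

codes, scales = absmax_quantize(weights, bits=6, block_size=32)
entropy = shannon_entropy_bits(codes)
max_err = float(np.abs(dequantize(codes, scales).reshape(-1) - weights).max())

print(f"entropy: {entropy:.2f} bits per weight (stored width: 6 bits)")
print(f"ideal entropy-coding saving: {100 * (1 - entropy / 6):.1f}%")
print(f"max reconstruction error: {max_err:.6f}")
```

The gap between the measured entropy and the 6-bit storage width is the redundancy an entropy coder can remove without changing the quantized values. The saving on real model weights depends on their distribution and will differ from this synthetic case.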
 ## GitHub
+https://github.com/caiovicentino/eoq-quantization