Qwen3.5-4B EOQ Q6

Quantized with EOQ (Entropy-Optimal Quantization): absmax Q6 + rANS entropy coding.

Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)

| Format | Size | PPL (WikiText-2) | tok/s |
|---|---|---|---|
| FP16 | 8412 MB | 7.58 | 54.0 |
| GGUF Q4_K_M | 2709 MB | ~ref | ~ref |
| EOQ Q6 | 2944 MB | 7.76 | 54.3 |

EOQ Q6 is 8.7% larger than GGUF Q4_K_M (2944 MB vs. 2709 MB). PPL degradation vs. FP16: +0.18 points (7.76 vs. 7.58).
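The summary figures follow directly from the benchmark rows. The short check below recomputes them from the sizes and perplexities listed in the table; the variable names are illustrative only.

```python
# Sizes (MB) and perplexities copied from the benchmark table.
fp16_mb, gguf_mb, eoq_mb = 8412, 2709, 2944
ppl_fp16, ppl_eoq = 7.58, 7.76

# Relative size of EOQ Q6 against the two reference formats.
larger_than_gguf = (eoq_mb - gguf_mb) / gguf_mb * 100
smaller_than_fp16 = (1 - eoq_mb / fp16_mb) * 100

# Absolute perplexity increase over FP16.
ppl_delta = ppl_eoq - ppl_fp16

print(f"EOQ Q6 vs GGUF Q4_K_M: +{larger_than_gguf:.1f}% size")   # +8.7%
print(f"EOQ Q6 vs FP16: -{smaller_than_fp16:.1f}% size")         # -65.0%
print(f"PPL delta vs FP16: +{ppl_delta:.2f}")                    # +0.18
```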

Cross-Model Validation

| Model | FP16 PPL | EOQ Q5 Size | EOQ Q5 PPL | Delta |
|---|---|---|---|---|
| Qwen2.5-0.5B | 10.87 | 279 MB | 11.69 | +0.83 |
| Qwen2.5-3B | 6.54 | 1,724 MB | 6.77 | +0.23 |
| Qwen3.5-4B | 7.58 | 2,398 MB | 7.77 | +0.18 |
| Qwen3.5-27B | 5.65 | 15,353 MB | 5.94 | +0.31 |
| Qwen3.5-35B-A3B | 5.19 | 19,680 MB | 5.39 | +0.21 |

Inference Speed

EOQ models are stored as dequantized FP16 safetensors, so inference runs at FP16 speed (no quantized kernels are involved). The advantage of EOQ is a smaller file at comparable quality, not faster inference.

Measured: 54.3 tok/s, effectively the same as FP16 (54.0 tok/s).

Usage
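Because the weights are stored as FP16 safetensors, loading them through the standard `transformers` API should be possible. The following is a minimal sketch under that assumption, not a confirmed recipe from the author: it assumes the repository id `caiovicentino1/Qwen3.5-4B-EOQ-Q6`, an installed `transformers` version that supports the Qwen3.5 architecture, and `accelerate` for `device_map="auto"`. The function name `load_and_generate` is hypothetical.

```python
def load_and_generate(prompt, repo_id="caiovicentino1/Qwen3.5-4B-EOQ-Q6",
                      max_new_tokens=64):
    """Load the FP16 checkpoint and generate a completion for `prompt`.

    Assumption: the checkpoint loads like any other FP16 causal LM.
    """
    # Imported lazily so that defining the function does not require
    # torch or transformers to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # weights are stored as F16
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `load_and_generate("Explain entropy coding in one sentence.")` downloads the checkpoint on first use and returns the decoded text.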


What is EOQ?

EOQ combines block-wise absmax quantization with rANS entropy coding. It is a simple quantization scheme that matches the more complex GGUF K-quants in quality per byte.
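To make the two stages concrete, here is a toy sketch in pure Python. It is not the EOQ implementation: the block size of 32, the symmetric signed code range, and the function names are all assumptions, and the rANS coder keeps its state in one arbitrary-precision integer instead of using the streaming renormalization a practical coder needs.

```python
import random
from collections import Counter

BITS = 6          # Q6, as in the model name
BLOCK_SIZE = 32   # assumption: the card does not state the block size

def absmax_quantize(weights, bits=BITS, block_size=BLOCK_SIZE):
    """Scale each block by its absolute maximum and round to signed ints."""
    qmax = 2 ** (bits - 1) - 1            # 31 for 6 bits
    codes, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        scale = absmax / qmax if absmax > 0 else 1.0
        scales.append(scale)
        codes.extend(round(w / scale) for w in block)
    return codes, scales

def dequantize(codes, scales, block_size=BLOCK_SIZE):
    """Invert the quantizer up to rounding error."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]

def build_tables(codes):
    """Frequency and cumulative-frequency tables from the exact counts."""
    counts = Counter(codes)
    cum, total = {}, 0
    for s in sorted(counts):
        cum[s] = total
        total += counts[s]
    return dict(counts), cum, total

def rans_encode(codes, freq, cum, total):
    """Fold all symbols into one big integer, last symbol first."""
    x = 0
    for s in reversed(codes):
        x = (x // freq[s]) * total + cum[s] + (x % freq[s])
    return x

def rans_decode(x, n, freq, cum, total):
    """Pop n symbols back out of the state in their original order."""
    out = []
    for _ in range(n):
        slot = x % total
        s = next(t for t in freq if cum[t] <= slot < cum[t] + freq[t])
        x = freq[s] * (x // total) + slot - cum[s]
        out.append(s)
    return out

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]  # synthetic weights
codes, scales = absmax_quantize(weights)
freq, cum, total = build_tables(codes)
state = rans_encode(codes, freq, cum, total)
decoded = rans_decode(state, len(codes), freq, cum, total)
assert decoded == codes  # entropy coding is lossless
print(f"raw: {BITS * len(codes)} bits, rANS state: {state.bit_length()} bits")
```

The entropy-coded size depends on how peaked the distribution of codes is, and the frequency table and per-block scales also have to be stored, so the figure printed for synthetic Gaussian weights says little about real checkpoints.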

GitHub

https://github.com/caiovicentino/eoq-quantization
