Commit 097e96d (verified) by caiovicentino1
1 Parent(s): 2affaa5

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +116 -18
README.md CHANGED
@@ -10,40 +10,138 @@ Quantized with **EOQ (Entropy-Optimal Quantization)**: absmax Q6 + rANS entropy
10
 
11
  ## Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)
12
 
13
- | Format | Size | PPL (WikiText-2) |
14
- |--------|------|------------------|
15
- | FP16 | 8412 MB | 7.58 |
16
- | GGUF Q4_K_M | 2709 MB | ~ref |
17
- | **EOQ Q6** | **2944 MB** | **7.76** |
18
 
19
  EOQ Q6 is **8.7% larger** than GGUF Q4_K_M.
20
  PPL degradation vs FP16: +0.18 points.
21
 
22
  ## Inference Speed
23
 
24
  EOQ models are stored as dequantized FP16 safetensors.
25
  Inference speed is **identical to FP16** (no quantized kernels).
 
26
 
27
- This means EOQ is **not faster** than GGUF Q4_K_M at inference,
28
- since GGUF uses optimized INT4 kernels in llama.cpp that reduce
29
- memory bandwidth. EOQ advantage is **smaller file size** at
30
- comparable quality, not speed.
31
-
32
- Measured: 54.4 tok/s (same as FP16 baseline: 53.6 tok/s)
33
 
34
  ## Usage
35
 
36
-
37
 
38
  ## What is EOQ?
39
 
40
  EOQ combines block-wise absmax quantization with rANS entropy coding.
41
- Quantized weights have Shannon entropy below their bit width.
42
- rANS removes this redundancy losslessly, saving 10-18%.
43
-
44
- The result: simpler quantization (absmax) that matches complex
45
- GGUF K-quants in quality-per-byte, at smaller file size.
46
 
47
  ## GitHub
48
 
49
- https://github.com/caiovicentino/eoq-quantization
 
10
 
11
  ## Benchmark (RTX PRO 6000 Blackwell 96GB VRAM)
12
 
13
+ | Format | Size | PPL (WikiText-2) | tok/s |
14
+ |--------|------|------------------|-------|
15
+ | FP16 | 8412 MB | 7.58 | 54.0 |
16
+ | GGUF Q4_K_M | 2709 MB | ~ref | ~ref |
17
+ | **EOQ Q6** | **2944 MB** | **7.76** | **54.3** |
18
 
19
  EOQ Q6 is **8.7% larger** than GGUF Q4_K_M.
20
  PPL degradation vs FP16: +0.18 points.
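
As a quick sanity check, the size and perplexity comparisons above can be recomputed from the table values. This is only an illustrative sketch; the variable names are not part of EOQ.

```python
# Values copied from the benchmark table above.
fp16_mb, gguf_mb, eoq_mb = 8412, 2709, 2944
fp16_ppl, eoq_ppl = 7.58, 7.76

# Relative size of EOQ Q6 compared with GGUF Q4_K_M (positive = larger).
size_overhead = (eoq_mb - gguf_mb) / gguf_mb

# Absolute perplexity increase relative to the FP16 baseline.
ppl_delta = eoq_ppl - fp16_ppl

# Fraction of the FP16 file size that EOQ Q6 occupies.
size_vs_fp16 = eoq_mb / fp16_mb

print(f"EOQ Q6 vs GGUF Q4_K_M: {size_overhead:+.1%}")  # +8.7%
print(f"PPL delta vs FP16:     {ppl_delta:+.2f}")      # +0.18
print(f"EOQ Q6 / FP16 size:    {size_vs_fp16:.1%}")    # 35.0%
```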
21
 
22
+ ## Cross-Model Validation
23
+
24
+ | Model | FP16 PPL | EOQ Q5 Size | EOQ Q5 PPL | Delta |
25
+ |-------|----------|-------------|------------|-------|
26
+ | Qwen2.5-0.5B | 10.87 | 279 MB | 11.69 | +0.83 |
27
+ | Qwen2.5-3B | 6.54 | 1,724 MB | 6.77 | +0.23 |
28
+ | Qwen3.5-4B | 7.58 | 2,398 MB | 7.77 | +0.18 |
29
+ | Qwen3.5-27B | 5.65 | 15,353 MB | 5.94 | +0.31 |
30
+ | Qwen3.5-35B-A3B | 5.19 | 19,680 MB | 5.39 | +0.21 |
31
+
32
  ## Inference Speed
33
 
34
  EOQ models are stored as dequantized FP16 safetensors.
35
  Inference speed is **identical to FP16** (no quantized kernels).
36
+ EOQ's advantage is **smaller file size** at comparable quality, not speed.
37
 
38
+ Measured: 54.3 tok/s (on par with FP16: 54.0 tok/s)
39
 
40
  ## Usage
41
 
139
 
140
  ## What is EOQ?
141
 
142
  EOQ combines block-wise absmax quantization with rANS entropy coding.
143
+ The result is simple quantization that matches complex GGUF K-quants in quality per byte.
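
To illustrate why entropy coding can help after absmax quantization, here is a minimal sketch using only the Python standard library. It is not the EOQ implementation: the block size of 32, the synthetic Gaussian weights, and the helper names `absmax_quantize` and `shannon_entropy_bits` are all assumptions made for illustration. It quantizes each block to 6-bit signed codes and measures the empirical Shannon entropy of the codes, which is a lower bound on what an ideal entropy coder such as rANS could reach.

```python
import math
import random
from collections import Counter


def absmax_quantize(block, bits=6):
    """Symmetric absmax quantization of one block to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1                      # 31 for 6 bits
    scale = (max(abs(w) for w in block) / qmax) or 1.0  # avoid a zero scale
    codes = [round(w / scale) for w in block]       # codes lie in [-qmax, qmax]
    return codes, scale


def shannon_entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


random.seed(0)
# Synthetic stand-in for real model weights (assumption).
weights = [random.gauss(0.0, 1.0) for _ in range(32 * 2048)]
block_size = 32  # assumed; the README does not state EOQ's block size

codes = []
for start in range(0, len(weights), block_size):
    block_codes, _scale = absmax_quantize(weights[start:start + block_size], bits=6)
    codes.extend(block_codes)

entropy = shannon_entropy_bits(codes)
potential_saving = 1.0 - entropy / 6.0  # ideal saving on the code stream only

print(f"entropy: {entropy:.2f} bits/weight vs. 6 stored bits")
print(f"ideal entropy-coding saving: {potential_saving:.1%}")
```

The saving printed here covers only the code stream and ignores the per-block scales. It also depends strongly on the weight distribution and block size, so it should not be read as a reproduction of the figures reported for EOQ.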
144
 
145
  ## GitHub
146
 
147
+ https://github.com/caiovicentino/eoq-quantization