Naming notice (2026-04-10). The "PolarQuant" technique used in this model is being rebranded to HLWQ (Hadamard-Lloyd Weight Quantization). The change is only the name; the algorithm and the weights in this repository are unchanged.

The rebrand resolves a name collision with an unrelated, earlier KV cache quantization method also named PolarQuant (Han et al., arXiv:2502.02617, 2025). HLWQ addresses weight quantization with a deterministic Walsh-Hadamard rotation and Lloyd-Max scalar codebook; Han et al.'s PolarQuant addresses KV cache quantization with a random polar rotation. The two methods are technically distinct.
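To make the distinction concrete, the two HLWQ ingredients can be sketched in a few lines of NumPy. This is an illustrative toy (a Sylvester-constructed Walsh-Hadamard rotation plus a plain Lloyd-Max scalar quantizer), not the repository's actual implementation; sizes, level counts, and function names are made up for the example.

```python
import numpy as np

def hadamard(n):
    # Orthonormal Walsh-Hadamard matrix via Sylvester construction
    # (n must be a power of 2).
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def lloyd_max(x, levels=16, iters=50):
    # Lloyd-Max scalar quantizer: alternate nearest-centroid assignment
    # and centroid update (conditional mean) until the codebook settles.
    c = np.quantile(x, np.linspace(0, 1, levels))
    for _ in range(iters):
        idx = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for k in range(levels):
            if np.any(idx == k):
                c[k] = x[idx == k].mean()
    return c, idx

# Deterministically rotate a weight matrix, quantize the rotated values,
# then dequantize and rotate back.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
H = hadamard(64)
Wr = H @ W
codebook, idx = lloyd_max(Wr.ravel(), levels=16)
W_hat = H.T @ codebook[idx].reshape(Wr.shape)
```

Because the rotation is orthogonal, quantization error in the rotated domain carries over unchanged to the weight domain, which is why the rotation can be chosen deterministically rather than at random.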

Existing loaders that load this repository by ID continue to work without changes. Future model uploads will use the HLWQ name.

Reference paper for this technique: arXiv:2603.29078 (v2 in preparation; v1 still uses the old name).

🧊 PolarQuant Skills + CLI

14 slash commands for Claude Code + 13 CLI commands for terminal.

CLI Install

```bash
pip install "git+https://github.com/caiovicentino/polarengine-vllm.git[all]"

polarquant info google/gemma-4-31B-it
polarquant chat google/gemma-4-31B-it --vision
polarquant quantize google/gemma-4-31B-it --upload
polarquant serve caiovicentino1/model --port 8000
polarquant monitor --stats
```
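Since `polarquant serve` exposes an OpenAI-compatible API, any standard OpenAI client can talk to it. The sketch below builds a chat-completions request against the server started above; the endpoint path and payload fields follow the generic OpenAI chat-completions convention, and the model ID is just the repo name from the example (both are assumptions, not documented behavior of this CLI).

```python
import json

# Target the server started with:
#   polarquant serve caiovicentino1/model --port 8000
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "caiovicentino1/model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}
body = json.dumps(payload)

# Send with any HTTP client, e.g.:
#   requests.post(url, data=body,
#                 headers={"Content-Type": "application/json"})
```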

Skills Install (Claude Code)

```bash
huggingface-cli download caiovicentino1/polarquant-skills --local-dir polarquant-skills
cp polarquant-skills/*.md ~/.claude/commands/
```

Public Commands

| Command | CLI | Skill | Description |
|---|---|---|---|
| Quantize | `polarquant quantize` | `/polarquant` | PQ5+INT4 any model |
| Chat | `polarquant chat` | `/polarquant-inference` | Gradio chat (24GB GPUs) |
| Serve | `polarquant serve` | `/polarquant-serve` | OpenAI API server |
| Info | `polarquant info` | — | VRAM estimate + specs |
| Bench | `polarquant bench` | `/polarquant-bench` | PQ vs torchao vs BnB |
| GGUF | `polarquant gguf` | `/polarquant-gguf` | ollama/llama.cpp |
| Monitor | `polarquant monitor` | `/polarquant-monitor` | Track downloads |
| MLX | `polarquant mlx` | `/polarquant-mlx` | Apple Silicon |
| llama.cpp | `polarquant llamacpp` | `/polarquant-llamacpp` | KV cache Q3 |
| vLLM KV | `polarquant vllm-kv` | `/polarquant-vllm-kv` | vLLM KV compression |
| Arena | `polarquant arena` | `/polarquant-arena` | MMLU, HumanEval |
| Fine-tune | `polarquant finetune` | `/polarquant-finetune` | QLoRA pipeline |

Links

📄 Paper · 💻 GitHub · 🤗 Models
