gemma-4-26B-A4B-it-uncensored-IQ2_M-GGUF

An IQ2_M quantized version of TrevorJS/gemma-4-26B-A4B-it-uncensored

This is an I-quant (Importance Quantized) version of the popular uncensored Gemma-4 26B-A4B model, created for efficient local inference with minimal quality loss.
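
The repo can be fetched with the Hugging Face CLI. A minimal sketch (the `--include` pattern and target directory are illustrative, not required):

```bash
# Download the GGUF file(s) from this repo with the Hugging Face CLI
# (pip install -U "huggingface_hub[cli]"); the pattern and local dir are placeholders.
huggingface-cli download SnowSwordScholar/gemma-4-26B-A4B-it-uncensored-IQ2_M-GGUF \
  --include "*.gguf" --local-dir .
```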

Model Details

  • Base Model: google/gemma-4-26B-A4B-it
  • Finetuned / Abliterated Model: TrevorJS/gemma-4-26B-A4B-it-uncensored
  • Quantization: IQ2_M (≈2.7 bpw) using llama.cpp with importance matrix (imatrix) calibration
  • Quantization Source: Q8_0 GGUF of the uncensored model
  • Parameters: 25.2B total (MoE architecture, ~3.8B active)
  • Context Length: 262,144 tokens
  • Architecture: Gemma 4 (MoE)
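
At ≈2.7 bits per weight, the 25.2B parameters alone come to roughly 25.2 × 10⁹ × 2.7 / 8 ≈ 8.5 GB of weights, before KV cache and runtime overhead.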

The official mmproj-F32.gguf can be loaded alongside this model to enable image input.
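
Both the text-only and image-enabled paths can be exercised with the stock llama.cpp binaries. A minimal sketch, with GGUF file names assumed from this repo's naming rather than verified paths:

```bash
# Minimal llama.cpp usage sketch; the .gguf file names are assumptions
# based on this repo's naming.

# Text-only chat (-c caps the context window; the model supports up to 262,144 tokens):
llama-cli -m gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf -c 8192 -p "Hello!"

# Image input: pair the model with the official projector via the multimodal CLI:
llama-mtmd-cli -m gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf \
  --mmproj mmproj-F32.gguf --image photo.jpg -p "Describe this image."
```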

Quantization Method

  • Converted from the official Q8_0 GGUF using a recent llama.cpp build (8849 or later)
  • Used bartowski’s calibration_datav3.txt dataset to generate the imatrix
  • The imatrix was passed to the quantizer via the --imatrix flag for the best possible quality at this bit width (see the workflow sketch below)
  • This is currently one of the highest-quality IQ2_M versions available for this model
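
A sketch of that workflow using the llama.cpp llama-imatrix and llama-quantize tools; file names are placeholders, and the exact invocations used for this release may differ:

```bash
# Reproduce the imatrix + quantization steps above; file names are placeholders.

# 1. Build the importance matrix from the Q8_0 source model
#    using bartowski's calibration data:
llama-imatrix -m gemma-4-26B-A4B-it-uncensored-Q8_0.gguf \
  -f calibration_datav3.txt -o imatrix.dat

# 2. Quantize to IQ2_M, guided by the imatrix:
llama-quantize --imatrix imatrix.dat \
  gemma-4-26B-A4B-it-uncensored-Q8_0.gguf \
  gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf IQ2_M
```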

License

Apache 2.0 (the same license as the original Gemma 4 release and the TrevorJS uncensored model)

This model is a derivative work. Please respect the original Apache 2.0 license. Commercial use, modification, and redistribution are permitted.
