gemma-4-26B-A4B-it-uncensored-IQ2_M-GGUF

An IQ2_M quantized version of TrevorJS/gemma-4-26B-A4B-it-uncensored

This is an I-quant (Importance Quantized) version of the popular uncensored Gemma-4 26B-A4B model, created for efficient local inference with minimal quality loss.
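
The repo can be fetched with the Hugging Face CLI. A minimal sketch (the `--include` pattern and target directory are illustrative, not required):

```bash
# Download the GGUF file(s) from this repo with the Hugging Face CLI
# (pip install -U "huggingface_hub[cli]"); the pattern and local dir are placeholders.
huggingface-cli download SnowSwordScholar/gemma-4-26B-A4B-it-uncensored-IQ2_M-GGUF \
  --include "*.gguf" --local-dir .
```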

Model Details

  • Base Model: google/gemma-4-26B-A4B-it
  • Finetuned / Abliterated Model: TrevorJS/gemma-4-26B-A4B-it-uncensored
  • Quantization: IQ2_M (≈2.7 bpw) using llama.cpp with importance matrix (imatrix) calibration
  • Quantization Source: Q8_0 GGUF of the uncensored model
  • Parameters: 25.2B total (MoE architecture, ~3.8B active)
  • Context Length: 262,144 tokens
  • Architecture: Gemma 4 (MoE)
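
At ≈2.7 bits per weight, the 25.2B parameters alone come to roughly 25.2 × 10⁹ × 2.7 / 8 ≈ 8.5 GB of weights, before KV cache and runtime overhead.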

The official mmproj-F32.gguf can be loaded alongside this model to enable image input.
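
Both the text-only and image-enabled paths can be exercised with the stock llama.cpp binaries. A minimal sketch, with GGUF file names assumed from this repo's naming rather than verified paths:

```bash
# Minimal llama.cpp usage sketch; the .gguf file names are assumptions
# based on this repo's naming.

# Text-only chat (-c caps the context window; the model supports up to 262,144 tokens):
llama-cli -m gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf -c 8192 -p "Hello!"

# Image input: pair the model with the official projector via the multimodal CLI:
llama-mtmd-cli -m gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf \
  --mmproj mmproj-F32.gguf --image photo.jpg -p "Describe this image."
```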

Quantization Method

  • Converted from the official Q8_0 GGUF using a recent llama.cpp build (8849 or later)
  • Used bartowski’s calibration_datav3.txt dataset to generate the imatrix
  • The imatrix was passed to the quantizer via the --imatrix flag for the best possible quality at this bit width (see the workflow sketch below)
  • This is currently one of the highest-quality IQ2_M versions available for this model
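
A sketch of that workflow using the llama.cpp llama-imatrix and llama-quantize tools; file names are placeholders, and the exact invocations used for this release may differ:

```bash
# Reproduce the imatrix + quantization steps above; file names are placeholders.

# 1. Build the importance matrix from the Q8_0 source model
#    using bartowski's calibration data:
llama-imatrix -m gemma-4-26B-A4B-it-uncensored-Q8_0.gguf \
  -f calibration_datav3.txt -o imatrix.dat

# 2. Quantize to IQ2_M, guided by the imatrix:
llama-quantize --imatrix imatrix.dat \
  gemma-4-26B-A4B-it-uncensored-Q8_0.gguf \
  gemma-4-26B-A4B-it-uncensored-IQ2_M.gguf IQ2_M
```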

License

Apache 2.0 (the same license as the original Gemma 4 release and the TrevorJS uncensored model)

This model is a derivative work. Please respect the original Apache 2.0 license. Commercial use, modification, and redistribution are permitted.
