---
pipeline_tag: image-text-to-text
base_model: huihui-ai/Huihui-gemma-4-26B-A4B-it-abliterated
base_model_relation: quantized
library_name: llama.cpp
tags:
- gguf
- mxfp4
- quantized
- multimodal
- abliterated
- uncensored
---

Huihui Gemma 4 26B A4B GGUF banner

# Huihui Gemma 4 26B A4B IT Abliterated — GGUF Quantizations

This repository contains **GGUF / llama.cpp quantized builds** of:
[huihui-ai/Huihui-gemma-4-26B-A4B-it-abliterated](https://huggingface.co/huihui-ai/Huihui-gemma-4-26B-A4B-it-abliterated)

These are **UD quantizations** prepared for efficient local inference with **llama.cpp**, including support for **multimodal image-text-to-text workflows** when used with the corresponding `mmproj` file.

## Overview

This release is designed for users who want to run the Huihui Gemma 4 26B A4B abliterated model locally with reduced VRAM and RAM requirements while preserving as much output quality as possible. The quantization variants use an optimized tensor distribution strategy inspired by **Unsloth-style mixed-quality quantization recipes**, balancing model fidelity, speed, and memory efficiency across different hardware targets.

## Quick Start

1. Download the latest release of **llama.cpp**.
2. Download your preferred `.gguf` model file from this repository.
3. For multimodal inference, also download the matching `mmproj` file.
4. Run the model with llama.cpp using your preferred frontend or CLI.

Example:

```bash
./llama-cli \
  -m Huihui-Gemma-4-26B-A4B-it-abliterated-UD-Q4_K_XL.gguf \
  --mmproj mmproj-model.gguf \
  -p "Describe this image in detail."
```

Adjust the model filename and `mmproj` filename to match the files you downloaded.

## Which Quant Should I Choose?

Choose based on your available memory and quality target:

* **Higher-bit / larger quants**: better quality, higher VRAM/RAM usage.
* **Mid-range quants**: best balance for most local setups.
* **Lower-bit quants**: faster and smaller, but with more quality loss.

For best results, use the largest quantization your hardware can comfortably run.

## Multimodal Usage

This model supports **image-text-to-text** inference when used with the appropriate multimodal projection file. Make sure the `mmproj` file matches this model family.
Using an incorrect projection file may result in broken or degraded vision-language behavior.

## Notes

* This is a **quantized GGUF release** of the fine-tuned model.
* Original model: [huihui-ai/Huihui-gemma-4-26B-A4B-it-abliterated](https://huggingface.co/huihui-ai/Huihui-gemma-4-26B-A4B-it-abliterated)
* Runtime target: **llama.cpp**
* Format: **GGUF**
* Modality: **image-text-to-text**
* Quantization style: **UD / mixed tensor distribution**

## Disclaimer

This repository only provides quantized GGUF builds. Model behavior, alignment characteristics, and training details are inherited from the original base model and fine-tune.
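As a rough sizing aid when picking a quant: resident memory tends to track the GGUF file size plus runtime overhead. The sketch below assumes a 1.2x overhead factor for the KV cache and buffers; both that factor and the example file size are illustrative assumptions, not official llama.cpp figures.

```shell
# Rough fit check: does a quant of a given size plausibly fit in memory?
# ASSUMPTION: resident memory is about file size x 1.2 (KV cache + buffers);
# raise the factor for long contexts or large batch sizes.
size_gib=17.4   # size of the downloaded .gguf (illustrative value)
mem_gib=24      # RAM/VRAM you can dedicate to inference

awk -v s="$size_gib" -v m="$mem_gib" 'BEGIN {
  need = s * 1.2
  printf "estimated requirement: %.1f GiB\n", need
  if (need <= m) print "likely fits"; else print "likely too large"
}'
```

If the estimate is close to your limit, prefer the next smaller quant or reduce the context size.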