Parameters Format Quant Multimodal

MYTHOS-26B-A4B — PRISM Dynamic Quantization (GGUF)

Gemma 4 26B-A4B MoE PRISM-PRO-Dynamic-Quant

  • PRISM-PRO: Production model with full over-refusal and bias mechanisms completely removed using State of the Art PRISM pipeline.
  • DQ: Per-tensor-class mixed-precision allocation derived entirely from weight structure sensitivity analysis — not closed-gated datasets.

Created by Ex0bit


💡 Support My Research & Development efforts. Members Receive access to the latest PRISM-PRO Model drops on Day-0

Ko-fi


Model Details

Property Value
Base Model google/gemma-4-26B-A4B-it
Architecture Gemma 4 MoE (128 experts, top-8 routing)
Parameters 26B total / 4B active per token
Quantization PRISM-PRO-DYNAMIC-QUANT
Achieved BPW 5.73
File Size ~17 GB (language) + ~1.2 GB (vision projector)
Context Length 262,144 tokens
Modalities Text, Image, Video
Creator Ex0bit

Supported Modalities

  • Text: Full instruction-following and chat
  • Image: Vision understanding via SigLIP encoder (280 soft tokens per image)
  • Video: Gemma4VideoProcessor (32 frames, pooled)

Note: This 26B MoE variant does not include audio support. For audio, see the 31B dense variant.

Files

File Size Purpose
mythos-26b-a4b-prism-pro-dq.gguf 17 GB Language model (quantized)
mmproj-mythos-26b-a4b-prism-pro.gguf 1.2 GB Vision projector (F16)

Both files are required for multimodal inference. For text-only use, only the language model file is needed.

PRISM-DQ Quantization

This model uses PRISM-PRO Dynamic Quantization — a per-tensor-class mixed-precision allocation that assigns different quantization types to different tensor classes based on weight structure sensitivity.

Unlike uniform quantization (Q4_K_M, Q5_K_M), PRISM-DQ analyzes each tensor class's sensitivity and allocates precision where it matters most. Attention projections receive higher precision than FFN layers, with block-level overrides that protect critical layers.

The result: BF16-equivalent quality at 5.73 bits-per-weight — a 64% size reduction with zero measurable quality loss.

Usage

llama.cpp (multimodal with vision)

llama-mtmd-cli \
  --model mythos-26b-a4b-prism-pro-dq.gguf \
  --mmproj mmproj-mythos-26b-a4b-prism-pro.gguf \
  --image path/to/image.jpg \
  --prompt "Describe this image." \
  -ngl 99

llama.cpp (text-only server)

llama-server \
  --model mythos-26b-a4b-prism-pro-dq.gguf \
  --port 8080 -ngl 99

LM Studio

Download both mythos-26b-a4b-prism-pro-dq.gguf and mmproj-mythos-26b-a4b-prism-pro.gguf. LM Studio will automatically detect the vision projector for multimodal chat.

Refusal & Bias Removal

This model has been treated to remove bias, over-refusals and propaganda from the base google/gemma-4-26B-A4B-it using the State of The Art PRISM pipeline.

License

Apache 2.0 (inherited from google/gemma-4-26B-A4B-it)

Credits

Downloads last month
1,148
GGUF
Model size
25B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Ex0bit/MYTHOS-26B-A4B-PRISM-PRO-DQ-GGUF

Finetuned
(36)
this model

Collection including Ex0bit/MYTHOS-26B-A4B-PRISM-PRO-DQ-GGUF