
Gemma-4-E4B-SABER GGUF

This repository contains GGUF conversions of GestaltLabs/Gemma-4-E4B-SABER for llama.cpp-compatible runtimes.

The source model is a Gemma 4 E4B instruction model modified with the SABER/refusal-ablation workflow. These files preserve the source tokenizer and chat template metadata in GGUF form.
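As a quick sanity check that a downloaded file is a valid GGUF container (and that metadata such as the tokenizer and chat template made it into the file as key/value entries), the fixed GGUF preamble can be read with only the Python standard library. `read_gguf_header` below is an illustrative helper, not part of llama.cpp; it parses the magic, version, tensor count, and metadata key/value count that open every GGUF file.

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF preamble: magic, version, tensor count, metadata KV count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        version, = struct.unpack("<I", f.read(4))   # format version (currently 3)
        n_tensors, = struct.unpack("<Q", f.read(8)) # number of tensors in the file
        n_kv, = struct.unpack("<Q", f.read(8))      # number of metadata key/value pairs
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}
```

A non-zero `n_kv` indicates the file carries metadata entries; the tokenizer and chat template live among those key/value pairs.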

Files

File                           Quantization  Approx. size
Gemma-4-E4B-SABER-BF16.gguf    BF16          13.92 GiB
Gemma-4-E4B-SABER-Q8_0.gguf    Q8_0          7.43 GiB
Gemma-4-E4B-SABER-Q6_K.gguf    Q6_K          5.75 GiB
Gemma-4-E4B-SABER-Q5_K_M.gguf  Q5_K_M        5.33 GiB
Gemma-4-E4B-SABER-Q4_K_M.gguf  Q4_K_M        4.94 GiB
Gemma-4-E4B-SABER-Q3_K_M.gguf  Q3_K_M        4.49 GiB
Gemma-4-E4B-SABER-Q2_K.gguf    Q2_K          4.08 GiB

Quantization Notes

The BF16 GGUF was converted from the original Hugging Face safetensors checkpoint using a local llama.cpp build with Gemma 4 support. The quantized GGUF files were produced from that BF16 GGUF using llama-quantize.

Recommended starting points:

  • Q8_0: highest-quality quantized option.
  • Q6_K: strong quality/size tradeoff.
  • Q4_K_M: compact general-purpose option.
  • Q2_K and Q3_K_M: smallest files, with larger quality tradeoffs.
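A back-of-envelope way to compare these options is effective bits per weight, derived from the file sizes in the table above: the BF16 file stores 2 bytes per weight, so its size implies the parameter count, and each quant's size then gives its average bit-width. (Values run above the nominal bit-width of each quant type because some tensors, such as embeddings, are kept at higher precision.)

```python
GIB = 2**30

# File sizes from the table above, in GiB.
sizes_gib = {
    "BF16": 13.92, "Q8_0": 7.43, "Q6_K": 5.75, "Q5_K_M": 5.33,
    "Q4_K_M": 4.94, "Q3_K_M": 4.49, "Q2_K": 4.08,
}

# BF16 stores 2 bytes per weight, so the BF16 file size implies the parameter count.
n_params = sizes_gib["BF16"] * GIB / 2  # roughly 7.47e9

# Effective bits per weight for each quant.
bpw = {q: s * GIB * 8 / n_params for q, s in sizes_gib.items()}
```

This puts Q8_0 at roughly 8.5 bits per weight and Q4_K_M at roughly 5.7, consistent with the quality ordering listed above.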

No importance matrix (imatrix) was used when producing these quants.

Example

llama-cli \
  -m Gemma-4-E4B-SABER-Q4_K_M.gguf \
  -p "Explain quantum computing in simple terms."

Use a current llama.cpp build with Gemma 4 support.
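The files can also be pulled straight from the Hub with Ollama by appending a quantization tag to the repository path. As an example, using the Q4_K_M file from the table above (any listed quant name should work as the tag):

```shell
ollama run hf.co/GestaltLabs/Gemma-4-E4B-SABER-GGUF:Q4_K_M
```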

Source

  • Source model: GestaltLabs/Gemma-4-E4B-SABER
  • Base model: google/gemma-4-E4B-it
  • License: Gemma license. See the source model and base model license terms.

Conversion Details

  • Source format: Hugging Face safetensors, BF16.
  • GGUF converter: llama.cpp convert_hf_to_gguf.py.
  • GGUF quantizer: llama.cpp llama-quantize.
  • Quant types: BF16, Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M, Q2_K.
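The pipeline above can be sketched as two steps; paths and the llama.cpp checkout location are placeholders, and the flags shown are the standard `convert_hf_to_gguf.py` and `llama-quantize` options:

```shell
# 1) Convert the Hugging Face safetensors checkpoint to a BF16 GGUF.
python llama.cpp/convert_hf_to_gguf.py ./Gemma-4-E4B-SABER \
    --outtype bf16 --outfile Gemma-4-E4B-SABER-BF16.gguf

# 2) Quantize the BF16 GGUF to each target type.
for q in Q8_0 Q6_K Q5_K_M Q4_K_M Q3_K_M Q2_K; do
    llama.cpp/build/bin/llama-quantize \
        Gemma-4-E4B-SABER-BF16.gguf "Gemma-4-E4B-SABER-${q}.gguf" "${q}"
done
```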
  • Model size: 7B params.
  • Architecture: gemma4.