Gemma-4-E2B-NoAudio-GGUF

This repository provides GGUF weights for the text-only version of Gemma-4-E2B-it. The audio components have been stripped out, reducing the model's footprint so it runs efficiently on home hardware.

Key Features

Imatrix Optimized: All files are quantized using a custom importance matrix (imatrix) generated from 1,000+ chunks of Thai text, which helps preserve reasoning and language quality at low bit widths.

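For reference, an imatrix-based quantization like the one described above can be reproduced with llama.cpp's own tools. This is a sketch only: the file names (`gemma-4-e2b-noaudio-f16.gguf`, `thai_calibration.txt`) are illustrative, not the actual files used for this release.

```shell
# 1. Generate an importance matrix from a calibration corpus
#    (here, a hypothetical Thai text file).
llama-imatrix -m gemma-4-e2b-noaudio-f16.gguf \
  -f thai_calibration.txt \
  -o imatrix.dat

# 2. Quantize, letting the imatrix guide which weights
#    keep the most precision.
llama-quantize --imatrix imatrix.dat \
  gemma-4-e2b-noaudio-f16.gguf \
  gemma-4-e2b-noaudio-Q4_K_M.gguf Q4_K_M
```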
Home Hardware Ready: Highly efficient and lightweight, perfect for running locally on consumer GPUs and CPUs.

Available Files

  • Q8_0: High fidelity for maximum accuracy.
  • Q4_K_M: Best balance of speed and intelligence.

Usage

This model is compatible with most GGUF runners. For the best experience at home, we recommend:

  • LM Studio
  • Ollama
  • llama.cpp / llama-server
Model Details

  • Format: GGUF
  • Model size: 5B params
  • Architecture: gemma4