gemma-4-E2B-it-F32-GGUF

Gemma-4-E2B-it from Google is a multimodal dense model in the Gemma 4 family with 2.3B effective parameters (5.1B total, counting Per-Layer Embeddings), built for on-device deployment on smartphones, laptops, Raspberry Pi boards, and IoT edge hardware. It natively handles text, images at variable aspect ratios and resolutions, and audio, and supports configurable thinking modes for advanced reasoning. The architecture uses 35 layers, a 512-token sliding window, a 128K context length, and a 262K-token vocabulary. Target workloads include agentic workflows, OCR (multilingual and handwriting), document/PDF parsing, UI/screen understanding, chart comprehension, object detection, and coding assistance, with low-latency inference tuned for Qualcomm and MediaTek chips via Android AICore; the model is positioned as rivaling models 20x its size while consuming minimal RAM and battery. This instruction-tuned variant targets mobile developers prototyping autonomous agents, with safety tuning matching Google's proprietary standards and open weights that enable local-first AI servers on consumer GPUs for reasoning-heavy tasks such as IDE assistance and structured data extraction.

Quick start with llama.cpp

```
llama-server -hf prithivMLmods/gemma-4-E2B-it-F32-GGUF:F32
```
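Once the server is running it exposes an OpenAI-compatible HTTP API, by default at http://localhost:8080 (adjust if you passed --host or --port). A minimal Python sketch using requests; the prompt is just a placeholder for the structured-extraction use case mentioned above:

```python
# Query the local llama-server via its OpenAI-compatible
# /v1/chat/completions endpoint (default port 8080).
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user",
             "content": "Extract the total as JSON: 'Invoice #42, total $19.99'"}
        ],
        "temperature": 0.2,
        "max_tokens": 128,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```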

Model Files

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| gemma-4-E2B-it.BF16.gguf | BF16 | 9.31 GB |
| gemma-4-E2B-it.F16.gguf | F16 | 9.31 GB |
| gemma-4-E2B-it.F32.gguf | F32 | 18.6 GB |
| gemma-4-E2B-it.Q8_0.gguf | Q8_0 | 4.95 GB |
| gemma-4-E2B-it.mmproj-bf16.gguf | mmproj-bf16 | 987 MB |
| gemma-4-E2B-it.mmproj-f16.gguf | mmproj-f16 | 987 MB |
| gemma-4-E2B-it.mmproj-f32.gguf | mmproj-f32 | 1.9 GB |
| gemma-4-E2B-it.mmproj-q8_0.gguf | mmproj-q8_0 | 557 MB |
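To fetch specific files instead of relying on the -hf auto-download, the files above can be pulled with huggingface_hub. A minimal sketch (choosing Q8_0 plus its matching mmproj is just one size/quality trade-off):

```python
# Download one quant and the matching multimodal projector from the
# repo; hf_hub_download caches the files and returns local paths.
from huggingface_hub import hf_hub_download

repo_id = "prithivMLmods/gemma-4-E2B-it-F32-GGUF"

model_path = hf_hub_download(repo_id, "gemma-4-E2B-it.Q8_0.gguf")
mmproj_path = hf_hub_download(repo_id, "gemma-4-E2B-it.mmproj-q8_0.gguf")

print(model_path)   # pass to llama-server -m
print(mmproj_path)  # pass via --mmproj (recent llama.cpp builds)
```

Text-only use needs just the main GGUF; the mmproj projector file is only required for image or audio inputs.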

Quants Usage

(sorted by size, which does not necessarily track quality; IQ-quants are often preferable to similarly sized non-IQ quants)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[image: ikawrakow's comparison graph of lower-quality quant types]
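A rough way to compare the files above is effective bits per weight: file size times eight, divided by parameter count. The sketch below uses the rounded 5B parameter figure reported for this repo, so the results are approximate and slightly inflated by GGUF metadata overhead:

```python
# Approximate bits-per-weight for each quant, from the file sizes in
# the Model Files table and the rounded 5B parameter count.
PARAMS = 5e9  # rounded; the card cites 5.1B total parameters

files_gb = {"F32": 18.6, "BF16": 9.31, "F16": 9.31, "Q8_0": 4.95}

for name, size_gb in files_gb.items():
    bpw = size_gb * 1e9 * 8 / PARAMS
    print(f"{name:>4}: ~{bpw:.1f} bits/weight")
```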

GGUF details

Model size: 5B params
Architecture: gemma4
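These fields come from the GGUF header itself and can be inspected locally with the gguf package maintained in the llama.cpp repo (pip install gguf). A minimal sketch, assuming the Q8_0 file from the download example sits in the working directory:

```python
# List the tensor count and metadata keys baked into a GGUF file.
# Keys follow the "general.*" / "<arch>.*" naming scheme; the
# general.architecture field should report gemma4 for this model.
from gguf import GGUFReader

reader = GGUFReader("gemma-4-E2B-it.Q8_0.gguf")

print(f"{len(reader.tensors)} tensors")
for name in reader.fields:
    print(name)
```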
