These are miscellaneous GGUF quantizations of the instruct-tuned Gemma 4 series of models, released by Google.

For more information about Gemma, you should refer to the original model cards.

The chat template baked into these GGUFs is technically outdated. However, inference in llama.cpp should still work as intended, thanks to these fixes:
