These are miscellaneous GGUF quantizations of the instruct-tuned Gemma 4 series of models, released by Google.
For more information about Gemma, you should refer to the original model cards.
The chat template baked into these GGUFs is technically outdated. However, inference in llama.cpp should still work correctly, thanks to these fixes:
- llama.cpp#21704: common : better align to the updated official gemma4 template
- llama.cpp#21760: common/gemma4 : handle parsing edge cases