---
language:
- en
library_name: mlx
license: gemma
license_link: https://ai.google.dev/gemma/docs/gemma_4_license
pipeline_tag: any-to-any
base_model: google/gemma-4-E4B-it
tags:
- quantized
- apple-silicon
- mlx
- gemma4
- vision
- audio
- multimodal
- 8bit
---

<p align="center">
  <a href="https://osaurus.ai"><img src="https://cdn-avatars.huggingface.co/v1/production/uploads/69d00705ce8872981c6c4fce/GWKjOwezSOhW5iuKpDwq_.png" alt="Osaurus AI" width="120"></a>
</p>

<h3 align="center">Gemma 4 E4B-it &mdash; 8-bit (MLX)</h3>
<p align="center">Properly converted with all vision and audio tower weights verified intact</p>

<p align="center">
  <a href="https://osaurus.ai"><img src="https://img.shields.io/badge/Web-osaurus.ai-blue" alt="Website"></a>&nbsp;
  <a href="https://huggingface.co/OsaurusAI"><img src="https://img.shields.io/badge/HF-OsaurusAI-yellow?logo=huggingface" alt="OsaurusAI"></a>
</p>

---

> **Why this exists:** The mlx-community 8-bit conversion of Gemma 4 E4B has broken/zeroed-out vision tower weights, producing a model that appears functional for text but silently fails on image and audio inputs. This is a clean conversion from the original `google/gemma-4-E4B-it` with every multimodal weight tensor verified non-zero.

---

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | [`google/gemma-4-E4B-it`](https://huggingface.co/google/gemma-4-E4B-it) |
| **Parameters** | 4.5B effective (8B total with Per-Layer Embeddings) |
| **Quantization** | 8-bit affine |
| **Avg Bits/Weight** | 8.998 |
| **Model Size** | 8.3 GB |
| **Architecture** | Gemma 4 (text + vision + audio) |
| **Context Length** | 128K tokens |
| **Vocabulary** | 262K tokens |
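
The reported size follows directly from the quantization stats above. A quick back-of-envelope check, assuming roughly 8B raw parameters as listed in the table (the exact count is slightly lower, which accounts for the small gap):

```python
# Sanity-check the listed model size against the quantization stats above.
params = 8e9                 # ~8B raw parameters (rounded, per the table)
bits_per_weight = 8.998      # average, including quantization scales/biases

size_bytes = params * bits_per_weight / 8
print(f"{size_bytes / 2**30:.1f} GiB")   # ~8.4 GiB, in line with the listed 8.3 GB
```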

## Multimodal Weight Verification

Every tensor in every multimodal component was loaded and checked for `max(abs(tensor)) > 0`. **Zero broken weights found.**

| Component | Tensor Count | Status |
|-----------|-------------|--------|
| **Vision Tower** (SigLIP) | 658 | All non-zero |
| **Audio Tower** (Conformer) | 751 | All non-zero |
| **Language Model** | 1,485 | All non-zero |
| **Total** | **2,894** | **All verified** |
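
The per-tensor criterion is simple. A minimal NumPy sketch of the check described above (the actual pass iterated over the converted MLX safetensors shards; the tensor names below are illustrative, not the real checkpoint keys):

```python
import numpy as np

def is_broken(tensor: np.ndarray) -> bool:
    """A tensor is considered broken if it is exactly all-zero."""
    return float(np.max(np.abs(tensor))) == 0.0

def component_of(name: str) -> str:
    """Bucket a weight by name prefix (prefixes here are illustrative)."""
    if name.startswith("vision_tower."):
        return "vision"
    if name.startswith("audio_tower."):
        return "audio"
    return "language"

# Example: one healthy weight and one zeroed-out weight
weights = {
    "vision_tower.blocks.0.attn.q_proj.weight": np.array([[0.12, -0.07]]),
    "vision_tower.blocks.0.attn.k_proj.weight": np.zeros((2, 2)),
}
broken = [name for name, t in weights.items() if is_broken(t)]
print(broken)  # only the zeroed-out k_proj tensor is flagged
```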

## Usage

```bash
# Requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/gemma-4-E4B-it-8bit
```

```python
# Python API
from mlx_vlm import load, generate

model, processor = load("OsaurusAI/gemma-4-E4B-it-8bit")

# Text-only
output = generate(model, processor, "Explain quantum computing", max_tokens=500)

# With image
output = generate(model, processor, "Describe this image", ["path/to/image.jpg"], max_tokens=500)
```

## Conversion Details

| Detail | Value |
|--------|-------|
| **Tool** | `mlx-vlm` v0.4.4 |
| **Source dtype** | bfloat16 |
| **Quantization mode** | affine |
| **Group size** | 64 |
| **Source** | `google/gemma-4-E4B-it` (original Google release) |
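
For reproducibility, an invocation along these lines should regenerate the checkpoint. This is a sketch based on the table above; the flag names follow `mlx-vlm`'s `convert` CLI and should be verified against your installed version, since they can change between releases:

```bash
# Hypothetical conversion command matching the settings in the table above
python -m mlx_vlm.convert \
  --hf-path google/gemma-4-E4B-it \
  --mlx-path gemma-4-E4B-it-8bit \
  -q --q-bits 8 --q-group-size 64
```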

---

<p align="center">Converted by <a href="https://osaurus.ai">Osaurus AI</a></p>