Gemma 4 E4B MTP Extraction Effort
How to Replicate
The model was extracted with the litertlm_peek_main CLI from https://github.com/google-ai-edge/LiteRT-LM
To replicate:
- Git clone the repo and enter the directory
git clone https://github.com/google-ai-edge/LiteRT-LM.git
cd LiteRT-LM/
git fetch --tags
- Install clang and build tools
sudo apt update && sudo apt install -y clang build-essential
Note: you also need to install Bazel: https://bazel.build/install
- Run the extractor CLI
bazel run //schema/py:litertlm_peek_main -- --litertlm_file=/path/to/gemma-4-E4B-it.litertlm --dump_files_dir=/path/to/extracted_gemma4
Clues
If you know C++ better than I do, please reverse engineer these files so we can figure out how the MTP drafter runs: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_mtp_drafter.h
It looks like the drafter is invoked here for end-to-end drafting: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor.cc
This test file exercises it: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor_test.cc#L435
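For context on what the drafter header above likely orchestrates: MTP (multi-token-prediction) drafting is a form of speculative decoding, where a cheap drafter proposes several tokens per step and the main model verifies them, keeping the longest correct prefix. This is a toy Python sketch of that draft-and-verify loop, with entirely hypothetical models and names; it is NOT LiteRT-LM's actual code:

```python
def drafter(prefix, k=4):
    # Hypothetical cheap model: guesses the next k tokens (count-up mod 10),
    # with one deliberate bad guess (it never predicts 7) so the
    # rejection path below actually gets exercised.
    out, last = [], prefix[-1]
    for _ in range(k):
        nxt = (last + 1) % 10
        if nxt == 7:
            nxt = 0  # deliberate wrong guess
        out.append(nxt)
        last = nxt
    return out

def target_next(prefix):
    # Hypothetical expensive model: the "ground truth" next token.
    return (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    """Draft k tokens, then verify them with the target model.

    Returns the tokens accepted this step: verified draft tokens, and on
    the first mismatch, the target model's own token (so progress is
    always at least one token per step).
    """
    draft, accepted, ctx = drafter(prefix, k), [], list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if tok == expected:          # draft verified: keep it
            accepted.append(tok)
            ctx.append(tok)
        else:                        # mismatch: take target's token, stop
            accepted.append(expected)
            ctx.append(expected)
            break
    return accepted

seq = [0]
while len(seq) < 12:
    seq.extend(speculative_step(seq))
print(seq[:12])  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```

The point of the real MTP drafter is the same trade: several cheap forward passes plus one verification pass instead of one expensive pass per token.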
A very good idea is to utilize the Google AI Edge Model Explorer https://github.com/google-ai-edge/model-explorer
In Model Explorer you can visualize Section11_TFLiteModel_tf_lite_mtp_drafter.tflite, which renders as one huge graph.
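Before wading through the full graph, it can help to dump the drafter's tensor signatures directly. A minimal sketch, assuming TensorFlow is installed and the path (illustrative here) points at the extracted file; models with custom ops may still need the matching runtime to load:

```python
def inspect_mtp_drafter(path="Section11_TFLiteModel_tf_lite_mtp_drafter.tflite"):
    """Print input/output tensor names, shapes, and dtypes of a .tflite file."""
    import tensorflow as tf  # imported lazily; requires `pip install tensorflow`

    interpreter = tf.lite.Interpreter(model_path=path)
    for kind, details in (("input", interpreter.get_input_details()),
                          ("output", interpreter.get_output_details())):
        for d in details:
            print(kind, d["name"], d["shape"], d["dtype"])

# Example (run from the directory with the extracted sections):
# inspect_mtp_drafter("/path/to/extracted_gemma4/Section11_TFLiteModel_tf_lite_mtp_drafter.tflite")
```

The input/output shapes alone (e.g. how many logit heads come out per step) would already say a lot about how many tokens the drafter proposes at once.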
I have extracted the graph as JSON, which can be found here.
Maybe someone can reverse engineer this graph with GPT or Claude and produce a clean PyTorch implementation?
Baseline of what ChatGPT Pro 5.4 Extended thinking spat out: https://chatgpt.com/share/69d8d08a-c458-838f-9b6d-e72d2956dede
