Gemma 4 E4B MTP Extraction Effort


How to Replicate

The model was extracted with the litertlm_peek_main CLI from https://github.com/google-ai-edge/LiteRT-LM

To replicate:

  1. Clone the repo and enter the directory
git clone https://github.com/google-ai-edge/LiteRT-LM.git
cd LiteRT-LM/
git fetch --tags
  2. Install Clang and the build essentials
sudo apt update && sudo apt install -y clang build-essential

Note: You also need to install Bazel: https://bazel.build/install

  3. Run the extractor CLI
bazel run //schema/py:litertlm_peek_main -- --litertlm_file=/path/to/gemma-4-E4B-it.litertlm --dump_files_dir=/path/to/extracted_gemma4
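After the extractor runs, it's worth sanity-checking that the dumped Section files really are TFLite models. TFLite flatbuffers carry the file identifier "TFL3" at bytes 4-8, so a minimal check (the helper name and synthetic bytes below are my own, not part of the LiteRT-LM tooling) could look like:

```python
# Sanity-check that a dumped file is a TFLite flatbuffer.
# TFLite models carry the flatbuffer file identifier "TFL3" at bytes 4-8.
def looks_like_tflite(data: bytes) -> bool:
    return len(data) >= 8 and data[4:8] == b"TFL3"

# Synthetic header bytes just for illustration; in practice you would
# read the first 8 bytes of e.g. Section11_TFLiteModel_tf_lite_mtp_drafter.tflite.
fake_header = b"\x1c\x00\x00\x00TFL3" + b"\x00" * 8
print(looks_like_tflite(fake_header))     # True
print(looks_like_tflite(b"not a model"))  # False
```

On a real extraction you would do something like `looks_like_tflite(open(path, "rb").read(8))` for each dumped file.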

Clues

If someone with more C++ knowledge than me could reverse engineer these files, we could figure out how the MTP runs: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_mtp_drafter.h

https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_mtp_drafter.cc

It looks like it's called here for e2e drafting: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor.cc

There's this file that tests it: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor_test.cc#L435
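From the file names ("drafter", "e2e drafting"), this looks like standard draft-and-verify speculative decoding: the MTP head proposes a few tokens cheaply and the main model verifies them in one pass. Here is a toy Python sketch of that general technique; every name and the greedy acceptance rule are my assumptions for illustration, not LiteRT-LM's actual API:

```python
# Toy sketch of draft-and-verify speculative decoding, which the
# llm_litert_mtp_drafter files presumably implement. The toy "models"
# are just greedy next-token lookup tables.

def draft(token, table, n):
    """Drafter: greedily propose n tokens from a toy next-token table."""
    out = []
    for _ in range(n):
        token = table.get(token, 0)
        out.append(token)
    return out

def verify(prefix_token, drafted, table):
    """Target model: accept drafted tokens while they match its own greedy
    choice; on the first mismatch, emit the target's token and stop."""
    accepted = []
    cur = prefix_token
    for t in drafted:
        target = table.get(cur, 0)
        if target == t:
            accepted.append(t)
            cur = t
        else:
            accepted.append(target)
            break
    return accepted

# Drafter and target agree on 1 -> 2 -> 3 but disagree after 3.
draft_table = {1: 2, 2: 3, 3: 4}
target_table = {1: 2, 2: 3, 3: 9}
drafted = draft(1, draft_table, 3)       # [2, 3, 4]
print(verify(1, drafted, target_table))  # [2, 3, 9]
```

The payoff is that two drafted tokens were accepted with a single verification pass instead of two sequential decode steps; the real executor presumably does the same with logits and KV-cache bookkeeping.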

A very good idea is to use the Google AI Edge Model Explorer: https://github.com/google-ai-edge/model-explorer

In Model Explorer you can visualize Section11_TFLiteModel_tf_lite_mtp_drafter.tflite

Which will show up as this huge graph:

[Image: Model Explorer showing the MTP graph]

I have extracted the graph as JSON, which can be found here

Maybe someone can reverse engineer this graph with GPT or Claude or something and output a clean PyTorch file?
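A cheap first pass over a graph export like that is tallying the op types, which tells you roughly what the drafter is made of before attempting a full PyTorch port. The "nodes"/"label" field names below are my guess at the export schema, and the sample JSON is a made-up stand-in for the real dump; adjust both to the actual file:

```python
import json
from collections import Counter

# Tally node op types in a Model Explorer-style graph export.
# NOTE: the "nodes" / "label" field names are an assumption about the
# JSON schema; adapt them to the actual export.
def op_histogram(graph_json: str) -> Counter:
    graph = json.loads(graph_json)
    return Counter(node.get("label", "?") for node in graph.get("nodes", []))

# Tiny hypothetical export standing in for the real MTP graph JSON.
sample = json.dumps({
    "nodes": [
        {"label": "FULLY_CONNECTED"},
        {"label": "FULLY_CONNECTED"},
        {"label": "RMS_NORM"},
    ]
})
print(op_histogram(sample))  # counts per op type
```

For the real dump you would pass `open("graph.json").read()` (hypothetical filename) instead of the inline sample.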

Baseline of what ChatGPT Pro 5.4 Extended thinking spat out: https://chatgpt.com/share/69d8d08a-c458-838f-9b6d-e72d2956dede
