Gemma 4 E4B MTP Extraction Effort
How to Replicate
The model was extracted with the litertlm_peek_main CLI from https://github.com/google-ai-edge/LiteRT-LM
To replicate:
- Git clone the repo and enter the directory
git clone https://github.com/google-ai-edge/LiteRT-LM.git
cd LiteRT-LM/
git fetch --tags
- Install clang and build tools
sudo apt update && sudo apt install -y clang build-essential
Note: you also need to install Bazel: https://bazel.build/install
- Run the extractor CLI
bazel run //schema/py:litertlm_peek_main -- --litertlm_file=/path/to/gemma-4-E4B-it.litertlm --dump_files_dir=/path/to/extracted_gemma4
Clues
If you know C++ better than I do, please reverse engineer these files so we can figure out how the MTP drafter runs: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_mtp_drafter.h
It looks like the drafter is invoked here for end-to-end drafting: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor.cc
This test file exercises it: https://github.com/google-ai-edge/LiteRT-LM/blob/cdb7e4bc31bf01b000eba5d2599337ada5e4945c/runtime/executor/llm_litert_compiled_model_executor_test.cc#L435
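For context on what the drafter header above likely orchestrates: MTP (multi-token-prediction) drafting is a form of speculative decoding, where a cheap drafter proposes several tokens per step and the main model verifies them, keeping the longest correct prefix. This is a toy Python sketch of that draft-and-verify loop, with entirely hypothetical models and names; it is NOT LiteRT-LM's actual code:

```python
def drafter(prefix, k=4):
    # Hypothetical cheap model: guesses the next k tokens (count-up mod 10),
    # with one deliberate bad guess (it never predicts 7) so the
    # rejection path below actually gets exercised.
    out, last = [], prefix[-1]
    for _ in range(k):
        nxt = (last + 1) % 10
        if nxt == 7:
            nxt = 0  # deliberate wrong guess
        out.append(nxt)
        last = nxt
    return out

def target_next(prefix):
    # Hypothetical expensive model: the "ground truth" next token.
    return (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    """Draft k tokens, then verify them with the target model.

    Returns the tokens accepted this step: verified draft tokens, and on
    the first mismatch, the target model's own token (so progress is
    always at least one token per step).
    """
    draft, accepted, ctx = drafter(prefix, k), [], list(prefix)
    for tok in draft:
        expected = target_next(ctx)
        if tok == expected:          # draft verified: keep it
            accepted.append(tok)
            ctx.append(tok)
        else:                        # mismatch: take target's token, stop
            accepted.append(expected)
            ctx.append(expected)
            break
    return accepted

seq = [0]
while len(seq) < 12:
    seq.extend(speculative_step(seq))
print(seq[:12])  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
```

The point of the real MTP drafter is the same trade: several cheap forward passes plus one verification pass instead of one expensive pass per token.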
A very good idea is to utilize the Google AI Edge Model Explorer https://github.com/google-ai-edge/model-explorer
In Model Explorer you can visualize Section11_TFLiteModel_tf_lite_mtp_drafter.tflite, which renders as one huge graph.
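Before wading through the full graph, it can help to dump the drafter's tensor signatures directly. A minimal sketch, assuming TensorFlow is installed and the path (illustrative here) points at the extracted file; models with custom ops may still need the matching runtime to load:

```python
def inspect_mtp_drafter(path="Section11_TFLiteModel_tf_lite_mtp_drafter.tflite"):
    """Print input/output tensor names, shapes, and dtypes of a .tflite file."""
    import tensorflow as tf  # imported lazily; requires `pip install tensorflow`

    interpreter = tf.lite.Interpreter(model_path=path)
    for kind, details in (("input", interpreter.get_input_details()),
                          ("output", interpreter.get_output_details())):
        for d in details:
            print(kind, d["name"], d["shape"], d["dtype"])

# Example (run from the directory with the extracted sections):
# inspect_mtp_drafter("/path/to/extracted_gemma4/Section11_TFLiteModel_tf_lite_mtp_drafter.tflite")
```

The input/output shapes alone (e.g. how many logit heads come out per step) would already say a lot about how many tokens the drafter proposes at once.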
I have extracted the graph as JSON, which can be found here.
Maybe someone can reverse engineer this graph with GPT or Claude and produce a clean PyTorch implementation?
Baseline of what ChatGPT Pro 5.4 Extended thinking spat out: https://chatgpt.com/share/69d8d08a-c458-838f-9b6d-e72d2956dede
