MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models
This repository accompanies the research paper "MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models" (arXiv:2601.11464).
Step 1: Download the Stage 1 checkpoint cnxup/LLaVA-NeXT-8B-MLA-stage1-rope32.
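A minimal download sketch, assuming the Hugging Face CLI is installed (the local directory name is only an example):

```bash
# Install the Hugging Face CLI if it is not already available
pip install -U "huggingface_hub[cli]"

# Download the full Stage 1 checkpoint to a local directory
huggingface-cli download cnxup/LLaVA-NeXT-8B-MLA-stage1-rope32 \
  --local-dir ./LLaVA-NeXT-8B-MLA-stage1-rope32
```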
Step 2: For MHA2MLA-VLM models that use the Partial-RoPE MKL method, download the MKL file.
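As a sketch of fetching a single file, where both <REPO_ID> and <MKL_FILE> are placeholders for the repository and file name that actually host the MKL file:

```bash
# <REPO_ID> and <MKL_FILE> are placeholders; substitute the actual MKL repository and file name
huggingface-cli download <REPO_ID> <MKL_FILE> --local-dir ./mkl
```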
Step 3: For MHA2MLA-VLM models that use the MD-SVD method, download the corresponding checkpoint; d_kv_64 is used as the example here.
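The MD-SVD checkpoint can be fetched the same way; <MD_SVD_DKV64_REPO> below is a placeholder for the d_kv_64 variant's repository id:

```bash
# <MD_SVD_DKV64_REPO> is a placeholder for the d_kv_64 MD-SVD checkpoint repository
huggingface-cli download <MD_SVD_DKV64_REPO> --local-dir ./LLaVA-NeXT-8B-MLA-dkv64
```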
Step 4: For evaluation, please visit our GitHub repository and use lmms-eval for benchmarking. Detailed instructions can be found in the Evaluation section there.
```bash
cd eval
cd llavanext
sh eval.sh
```
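The script wraps an lmms-eval run. As a rough, hedged illustration of the general lmms-eval invocation pattern only (the model name, model_args, and task list below are placeholders; the authoritative values are set inside eval.sh):

```bash
# Generic lmms-eval launch; <PATH_TO_MLA_CHECKPOINT> is a placeholder for the downloaded model
accelerate launch -m lmms_eval \
  --model llava \
  --model_args pretrained=<PATH_TO_MLA_CHECKPOINT> \
  --tasks mme \
  --batch_size 1 \
  --output_path ./logs/
```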
```bibtex
@misc{fan2026mha2mlavlmenablingdeepseekseconomical,
  title={MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models},
  author={Xiaoran Fan and Zhichao Sun and Tao Ji and Lixing Shen and Tao Gui},
  year={2026},
  eprint={2601.11464},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.11464},
}
```