MTP support

#24
by BroLaurens - opened

Happy user of many of your quants here! So first of all, thanks for all the great efforts.

ik_llama.cpp has now added support for MTP. However, the MTP layers are missing from the existing quants (most likely for compatibility). I understand you already produce so much for the community, and this would add another dimension to worry about. Still, I'm going to look at you with my puppy eyes and ask whether you could consider adding an MTP version as well, even if only for a select quant (hopefully q4 for us poor 24GB GPU folks).
