Error loading the model

by cpatonn - opened 30 days ago

cyankiwi org 30 days ago

Hello, this model does not support --tensor-parallel-size > 2, so please use --pipeline-parallel-size together with --tensor-parallel-size to avoid a model loading error.
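
As a rough sketch of that constraint: the helper below splits a GPU count into a tensor-parallel size of at most 2 and a pipeline-parallel size that covers the rest. It assumes the model is launched with vLLM's `vllm serve` command, which the post does not state explicitly. `MODEL_ID`, `split_parallelism`, and `build_command` are hypothetical names used for illustration only.

```python
import shlex

MODEL_ID = "<model-id>"  # placeholder (assumption); substitute the real repository name
MAX_TP = 2  # per the post: tensor-parallel sizes above 2 fail to load


def split_parallelism(num_gpus: int) -> tuple[int, int]:
    """Return (tensor_parallel, pipeline_parallel) with tensor_parallel <= MAX_TP."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be positive")
    # Use 2-way tensor parallelism when the GPU count is even, otherwise 1-way.
    tp = MAX_TP if num_gpus % MAX_TP == 0 else 1
    # The remaining factor is handled by pipeline parallelism.
    pp = num_gpus // tp
    return tp, pp


def build_command(num_gpus: int) -> str:
    """Assemble a launch command line (assumes the `vllm serve` CLI)."""
    tp, pp = split_parallelism(num_gpus)
    args = [
        "vllm", "serve", MODEL_ID,
        "--tensor-parallel-size", str(tp),
        "--pipeline-parallel-size", str(pp),
    ]
    return shlex.join(args)


if __name__ == "__main__":
    print(build_command(4))
```

With 4 GPUs this yields a tensor-parallel size of 2 and a pipeline-parallel size of 2, rather than a tensor-parallel size of 4.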

In addition, MTP layers are implemented and can be invoked using the flag --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}', but MTP layers cannot be used together with pipeline parallelism.
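
A minimal sketch of how the MTP flag could be assembled while guarding against the pipeline-parallelism conflict described above. It again assumes the `vllm serve` CLI; `MODEL_ID` and `mtp_args` are hypothetical names, and only the flag and JSON value come from the post.

```python
import json
import shlex

MODEL_ID = "<model-id>"  # placeholder (assumption); substitute the real repository name


def mtp_args(pipeline_parallel_size: int = 1,
             num_speculative_tokens: int = 2) -> list[str]:
    """Return the --speculative-config arguments, refusing pipeline parallelism."""
    if pipeline_parallel_size > 1:
        # Per the post: MTP layers cannot be combined with pipeline parallelism.
        raise ValueError("MTP cannot be used with pipeline-parallel-size > 1")
    config = {
        "method": "qwen3_next_mtp",
        "num_speculative_tokens": num_speculative_tokens,
    }
    # Compact separators reproduce the JSON string exactly as written in the post.
    return ["--speculative-config", json.dumps(config, separators=(",", ":"))]


if __name__ == "__main__":
    command = ["vllm", "serve", MODEL_ID, "--tensor-parallel-size", "2", *mtp_args()]
    # shlex.join quotes the JSON so it survives the shell as one argument.
    print(shlex.join(command))
```

Taken together with the tensor-parallel limit, this means MTP is only available when the model fits on at most 2 GPUs.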

dr-e

30 days ago

Thanks for releasing the quant version <3

JoeSmith245

29 days ago

Thank you!! :D
