Error loading the model
#1
by cpatonn - opened
Hello, this model does not support --tensor-parallel-size > 2, so please combine --pipeline-parallel-size with --tensor-parallel-size to avoid a model loading error.
In addition, MTP layers are implemented and can be invoked using the flag --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}', but MTP layers cannot be used together with pipeline parallelism.
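For anyone scripting the launch, here is a rough sketch of how the two constraints above could be checked before starting the server. It assumes the flags are passed to `vllm serve`; the helper name `build_vllm_command` and the `<model-id>` placeholder are made up for illustration, and the even split of GPUs into tensor-parallel and pipeline-parallel groups is my own assumption, not something the model requires.

```python
import json
import shlex

MODEL_ID = "<model-id>"  # placeholder: substitute the actual repo name
MAX_TP = 2               # from the post: tensor parallel size > 2 is not supported


def build_vllm_command(num_gpus, use_mtp=False, model=MODEL_ID):
    """Hypothetical helper: build a `vllm serve` argument list for this model."""
    # Cap tensor parallelism at 2 and spread the remaining GPUs over pipeline stages.
    tp = min(num_gpus, MAX_TP)
    if num_gpus < 1 or num_gpus % tp != 0:
        raise ValueError("num_gpus must be a positive multiple of the tensor parallel size")
    pp = num_gpus // tp

    # From the post: MTP layers cannot be combined with pipeline parallelism.
    if use_mtp and pp > 1:
        raise ValueError("MTP cannot be used together with pipeline parallelism")

    cmd = ["vllm", "serve", model, "--tensor-parallel-size", str(tp)]
    if pp > 1:
        cmd += ["--pipeline-parallel-size", str(pp)]
    if use_mtp:
        spec = {"method": "qwen3_next_mtp", "num_speculative_tokens": 2}
        cmd += ["--speculative-config", json.dumps(spec, separators=(",", ":"))]
    return cmd


if __name__ == "__main__":
    # 4 GPUs: tensor parallel 2 x pipeline parallel 2, no MTP.
    print(shlex.join(build_vllm_command(4)))
    # 2 GPUs: tensor parallel 2 with MTP speculative decoding.
    print(shlex.join(build_vllm_command(2, use_mtp=True)))
```

With this split, MTP is only available on 1 or 2 GPUs, since anything larger needs pipeline parallelism.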
Thanks for releasing the quant version <3
Thank you!! :D