Quant_method error, might not work on LMStudio or oMLX

#1
by NemoTLS - opened

Hi, the model is not uploaded yet, but going by the pattern of your recent uploads, will this model work with oMLX or LMStudio?
Or will your models from now on only be usable in the Inferencer app?
Thank you

Did you figure out how to load it in LMStudio? On an M3 Ultra with 512GB RAM, I get this error:

🥲 Failed to load the model

Failed to load model.

Error when loading model: ValueError: Received 2170 parameters not in model: 
lm_head.biases,
lm_head.scales,
model.embed_tokens.biases,
model.embed_tokens.scales,
model.layers.0.mlp.down_proj.biases,
model.layers.0.mlp.down_proj.scales,
model.layers.0.mlp.gate_proj.biases,
model.layers.0.mlp.gate_proj.scales,
model.layers.0.mlp.up_proj.biases,
model.layers.0.mlp.up_proj.scales,
model.layers.0.self_attn.indexer.weights_proj.biases,
model.layers.0.self_attn.indexer.weights_proj.scales,
model.layers.0.self_attn.indexer.wk.biases,
model.layers.0.self_attn.indexer.wk.scales,
model.layers.0.self_attn.indexer.wq_b.biases,
model.layers.0.self_attn.indexer.wq_b.scales,
model.layers.0.self_attn.kv_a_proj_with_mqa.biases,
model.layers.0.self_attn.kv_a_proj_with_mqa.scales,
model.layers.0.self_attn.o_proj.biases,
model.layers.0.self_attn.o_proj.scales,
model.layers.0.self_attn.q_a_proj.biases,
model.layers.0.self_attn.q_a_proj.scales,
model.layers.0.self_attn.q_b_proj.biases,
model.layers.0.self_attn.q_b_proj.scales,
model.layers.0.self_attn.unembed_out.biases,
model.layers.0.self_attn.unembed_out.scales,
model.layers.1.mlp.down_proj.biases,
model.layers.1.mlp.down_proj.scales,
model.layers.1.mlp.gate_proj.biases,
model.layers.1.mlp.gate_proj.scales,
model.layers.1.mlp.up_proj.biases,
model.layers.1.mlp.up_proj.scales,
model.layers.1.self_attn.indexer.weights_proj.biases,
model.layers.1.self_attn.indexer.weights_proj.scales,
model.layers.1.self_attn.indexer.wk.biases,
model.layers.1.self_attn.indexer.wk.scales,
model.layers.1.self_attn.indexer.wq_b.biases,
model.layers.1.self_attn.indexer.wq_b.scales,
model.layers.1.self_attn.kv_a_proj_with_mqa.biases,
model.layers.1.self_attn.kv_a_proj_with_mqa.scales,
model.layers.1.self_attn.o_proj.biases,
model.layers.1.self_attn.o_proj.scales,
model.layers.1.self_attn.q_a_proj.biases,
model.layers.1.self_attn.q_a_proj.scales,
model.layers.1.self_attn.q_b_proj.biases,
model.layers.1.self_attn.q_b_proj.scales,
model.layers.1.self_attn.unembed_out.biases,
model.layers.1.self_attn.unembed_out.scales,
model.layers.10.mlp.shared_experts.down_proj.biases,
....

It seems like providing models that can run on other engines is no longer a priority...
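
For anyone triaging the same error: every name in the list above ends in `.scales` or `.biases`, which are the side tensors that MLX-style quantized layers store next to the packed weights. One plausible reading (an assumption, not confirmed by the uploader) is that the loader built an unquantized model because it did not recognise the quantization settings in the config, and then rejected the extra tensors. A minimal sketch for checking whether a rejected-parameter list fits that pattern; the helper names are hypothetical and the sample list is copied from the first lines of the error:

```python
from collections import Counter

# Suffixes that MLX-style quantized layers store alongside the packed weight.
QUANT_SUFFIXES = (".scales", ".biases")


def summarize_rejected(names):
    """Count rejected parameter names by their final dotted component."""
    return Counter(name.rsplit(".", 1)[-1] for name in names)


def looks_like_quant_mismatch(names):
    """True when the list is non-empty and every name is a quantization side tensor."""
    names = list(names)
    return bool(names) and all(n.endswith(QUANT_SUFFIXES) for n in names)


# First few names from the error message above (hypothetical helper usage).
rejected = [
    "lm_head.biases",
    "lm_head.scales",
    "model.embed_tokens.biases",
    "model.embed_tokens.scales",
    "model.layers.0.mlp.down_proj.biases",
    "model.layers.0.mlp.down_proj.scales",
]

print(summarize_rejected(rejected))
print(looks_like_quant_mismatch(rejected))
```

If the check comes back true for the full list, the weights themselves are probably fine and the mismatch is more likely in how the engine interprets the model's quantization config.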
