I didn't try the official Qwen3.6, but this finetune outperforms Qwen3.6 35B in real-world testing.
I was tired of testing large models on my poor laptop and had already given up on the RL idea for MoE models, but in limited testing this one seems to have deeper logical patterns than Mistral 4 119B, although it may occasionally fail a simple question: "Imagine a runaway trolley is hurtling down a track towards five dead people. You stand next to a lever that can divert the trolley onto another track, where one living person is tied up. Do you pull the lever?" Still, on that specific question it does better than MiniMax 2.7 : )
Thanks! The tested benchmarks also show a very small gap to huge frontier models, especially in terms of size-to-performance ratio.
Thanks for making this great model!
Yeah, that's why I quantised to Q8, for the least possible loss with a meaningful size reduction. Feel free to quantise further, just give credit :)
Thanks! I figured out one thing very recently: the quality at Q4 without imatrix may be better than my imatrix-quantised Q6. Sorry for my earlier claims about the heavily quantised versions, it turns out I was on the wrong path.
no worries
Also, ANNOUNCEMENT: NEW ULTRA VERSION COMING, COMPLETELY TIED WITH OPUS 4.6 / 4.7. IT ONLY LOSES 8% ACCURACY IN Q3_K_M, 3% IN Q4_K_M AND 1.2% IN Q5_K_M.
In that case I think it's better to keep the MTP layers in future GGUFs or just upload safetensors; we might have MTP support in llamaland soon.
yes
I'm going to use some very efficient quants like Q5_K_M or Q3_K_M for people who are desperate, btw, but the normal ones will still be available.
Perplexity:
Q3_K_M - 0.045
Q5_K_M - 0.01232
Q8 - null/NA (not measurable or distinguishable)
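For anyone who wants to reproduce this kind of comparison, here is a minimal sketch. It assumes the figures above are perplexity increases relative to the unquantised model (the thread doesn't say so explicitly) and that you already have per-token natural-log probabilities from each model. The function names and the sample values are hypothetical, made up purely for illustration.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ppl_delta(baseline_ppl, quant_ppl):
    """Absolute perplexity increase of a quantised model over its baseline."""
    return quant_ppl - baseline_ppl


# Hypothetical log-probabilities for the same text under two models.
# These are NOT measurements from the model discussed in this thread.
baseline_logprobs = [-0.10, -0.20, -0.30]
quantised_logprobs = [-0.12, -0.22, -0.32]

delta = ppl_delta(perplexity(baseline_logprobs), perplexity(quantised_logprobs))
print(f"perplexity increase: {delta:.4f}")
```

A smaller delta means the quant stays closer to the original model on that text, which would fit Q8 being reported as indistinguishable.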
Try ik_llama.cpp? Better quants in my view.
Yes, these are standard GGUF values. I'll also make an MLX version and maybe a turboquanted version.
MLX version is out!