GPTQ + KV TQ4 local results: HumanEval ~ 67.68%
Hi, thanks a lot for releasing this model, great work.
I have a quick question regarding the HumanEval pass@1 comparison and test alignment.
Qwopus3.5-9B-v3 (base) reports pass@1 = 87.80% (144 / 164), which is very strong. In my own local tests of the GPTQ variant (KV quantization = TQ4, using my own HumanEval setup), I get pass@1 ≈ 67.68%. This is still very high quality in practice, but noticeably lower than the reported base-model result.
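To make the size of the gap concrete, here is a rough sketch of the arithmetic. It assumes my local run covered the full 164-problem HumanEval set with one sample per problem. The pass count it derives for my run is inferred from the percentage, not taken from a log, and the helper name `pass_at_1` is just for illustration.

```python
TOTAL_PROBLEMS = 164  # problem count taken from the reported 144 / 164


def pass_at_1(passed: int, total: int = TOTAL_PROBLEMS) -> float:
    """pass@1 in percent when exactly one sample is generated per problem."""
    return 100.0 * passed / total


# Reported base-model result: 144 of 164 problems solved.
reported_passed = 144
reported = pass_at_1(reported_passed)

# Assumption: the local GPTQ + TQ4 run also used all 164 problems,
# so the pass count can be backed out of the 67.68% figure.
local_passed = round(67.68 / 100 * TOTAL_PROBLEMS)
local = pass_at_1(local_passed)

gap_points = reported - local                  # difference in percentage points
gap_problems = reported_passed - local_passed  # difference in solved problems

print(f"reported: {reported:.2f}% ({reported_passed}/{TOTAL_PROBLEMS})")
print(f"local:    {local:.2f}% ({local_passed}/{TOTAL_PROBLEMS})")
print(f"gap:      {gap_points:.2f} points, {gap_problems} problems")
```

If the two runs differ in prompt format, sampling settings, or the number of problems evaluated, the counts above would not line up this way, which is part of what I am trying to confirm.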
I appreciate your work!
These are still great results, but far from the baseline. Sorry if I have misinterpreted anything.