FYI: performance vs mlx
#1
by dakerholdings - opened
FWIW, I tried this and was only getting ~10 tokens/s (tps) in chat responses on an M1.
(For comparison: with mlx_lm, I was getting 19-23 tps with cs2764/Qwen3-4B-mlx-mixed_4_6.)
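For anyone wanting to compare numbers on the same footing, here is a minimal sketch of how a tokens-per-second figure could be computed. It is not how the numbers above were obtained: `generate` is a hypothetical callable standing in for whatever backend is being tested, assumed to return the sequence of generated tokens, and the timing includes prompt processing as well as generation.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput as generated tokens divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def measure_tps(
    generate: Callable[[str], Sequence],  # hypothetical: returns generated tokens
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Time one call to `generate` and report tokens per second."""
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    return tokens_per_second(len(tokens), elapsed)
```

Using the same prompt and the same maximum output length for both backends, and averaging a few runs, should make the comparison less noisy.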