GGUF/llama.cpp support
#1
by tcpmux - opened
Would be awesome!
It may already be supported since it's just llama architecture. There are GGUFs of the base model already uploaded. As long as it doesn't mirror/echo from the instruction tuning, it should be a good one.
> It may already be supported since it's just llama architecture
Sadly, it's not. It can be converted and quantized, but the resulting file isn't accepted by llama.cpp; it crashes with errors when loading.
Does it run if you change the metadata to one of the Qwens? I haven't looked deeply at the whole architecture or at whether they actually did anything new.
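For anyone debugging the rejected file: llama.cpp picks its loader based on the `general.architecture` metadata key, so a quick first step is checking what the converted GGUF actually declares. Here's a minimal stdlib-only sketch (the helper names are mine; the byte layout follows the GGUF v3 spec, and it only handles string-valued keys for brevity):

```python
# Sketch: peek at a GGUF stream's header and its general.architecture key,
# the field llama.cpp uses to select a model architecture at load time.
# Layout per the GGUF v3 spec; helper names here are hypothetical.
import io
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_STRING = 8  # value-type id for strings in the GGUF spec

def read_gguf_string(f):
    # GGUF strings: uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def peek_architecture(f):
    """Return (version, architecture) from a GGUF stream, or raise ValueError."""
    if f.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    (version,) = struct.unpack("<I", f.read(4))
    _tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    for _ in range(kv_count):
        key = read_gguf_string(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        if vtype != GGUF_TYPE_STRING:
            raise ValueError("sketch only handles string-valued metadata")
        value = read_gguf_string(f)
        if key == "general.architecture":
            return version, value
    raise ValueError("general.architecture not found")

def gguf_string(s):
    b = s.encode("utf-8")
    return struct.pack("<Q", len(b)) + b

# Build a tiny in-memory GGUF header claiming the llama architecture,
# just to exercise the parser without needing a real model file.
blob = (GGUF_MAGIC
        + struct.pack("<I", 3)        # version 3
        + struct.pack("<QQ", 0, 1)    # 0 tensors, 1 metadata kv pair
        + gguf_string("general.architecture")
        + struct.pack("<I", GGUF_TYPE_STRING)
        + gguf_string("llama"))

print(peek_architecture(io.BytesIO(blob)))  # -> (3, 'llama')
```

On a real file you'd pass `open("model.gguf", "rb")` instead of the synthetic blob; if the declared architecture isn't one llama.cpp knows, that would explain the crash on load.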