Vocaela-500M-GGUF

This repo contains GGUF-format weights for Vocaela-500M:

  • Language model (LLM): Q8_0
  • Vision encoder (mmproj): Q8_0

Note: llama.cpp currently has problems rendering this model's chat template correctly. As a workaround, apply the chat template yourself (e.g., in Python or Node.js) before calling the llama-server endpoint. For usage examples, see the vocaela-500m-demo repo.
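The workaround above can be sketched as follows. This is a minimal illustration, not code from the demo repo: the ChatML-style template here is an assumption (substitute the model's actual chat template), and the endpoint URL assumes a llama-server instance running locally on the default port.

```python
# Sketch: apply the chat template in Python, then send the pre-rendered
# prompt to llama-server's /completion endpoint instead of /v1/chat/completions.
import json
import urllib.request

def apply_chat_template(messages):
    """Render chat messages into a single prompt string.

    Assumes a ChatML-style template; replace with the model's real template.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "".join(parts)

def complete(messages, url="http://localhost:8080/completion"):
    """POST the pre-templated prompt to a running llama-server instance."""
    payload = {"prompt": apply_chat_template(messages), "n_predict": 128}
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

The same idea works from Node.js: render the template string first, then POST it as `prompt` so llama-server never applies its own (broken) template rendering.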

GGUF

  • Model size: 0.4B params
  • Architecture: llama
