# Agent Nemo – Qwen2-VL-2B-Instruct (Fine-tuned GGUF)
Fine-tuned Qwen2-VL-2B-Instruct for autonomous web navigation, quantized to GGUF Q4_K_M for edge deployment.
## Training
- Base model: Qwen2-VL-2B-Instruct
- Method: QLoRA (r=16, alpha=16)
- Dataset: Mind2Web (~7,775 conversations)
- Task: Web navigation – screenshot + AXTree → action JSON
- LoRA adapters: yashsikdar/agent-nemo-qwen2vl-lora
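
The model's output is a single action object per step. The card doesn't spell out the schema, so the sketch below uses a hypothetical Mind2Web-style operation (`CLICK` / `TYPE` / `SELECT` on an AXTree element ID) purely to illustrate the input/output shape — the fine-tuned model's real fields may differ:

```python
import json

# Hypothetical action JSON in the style of Mind2Web operations
# (CLICK / TYPE / SELECT); the actual schema may differ.
raw_output = '{"action": "TYPE", "element_id": "textbox_42", "value": "san francisco hotels"}'

action = json.loads(raw_output)
assert action["action"] in {"CLICK", "TYPE", "SELECT"}
print(action["action"], action["element_id"])  # TYPE textbox_42
```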
## Files
| File | Size | Description |
|---|---|---|
| `agent-nemo-qwen2vl-q4_k_m.gguf` | ~1.4 GB | Language model (Q4_K_M quantization) |
| `mmproj-agent-nemo-qwen2vl-f16.gguf` | ~600 MB | Vision encoder (F16) |
## Usage with llama-cpp-python
```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen25VLChatHandler

# Load the vision encoder (mmproj) and the quantized language model.
handler = Qwen25VLChatHandler(clip_model_path="mmproj-agent-nemo-qwen2vl-f16.gguf")
llm = Llama(model_path="agent-nemo-qwen2vl-q4_k_m.gguf", chat_handler=handler, n_ctx=4096)
```
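
Model replies sometimes wrap the action JSON in surrounding prose, so a small extraction step is useful before acting on it. This is a minimal, hedged sketch — the field names (`action`, `element_id`) are assumptions for illustration, not the card's documented schema:

```python
import json

def parse_action(text: str) -> dict:
    """Extract the first JSON object spanned by the outermost braces in model output."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])

# Example: the model surrounds the JSON with explanatory text.
reply = 'Next step: {"action": "CLICK", "element_id": "button_7"}'
action = parse_action(reply)
print(action)  # {'action': 'CLICK', 'element_id': 'button_7'}
```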
## Download
```bash
# Using the Agent Nemo download script:
./scripts/download_model.sh

# Or manually:
huggingface-cli download yashsikdar/agent-nemo-qwen2vl-gguf --local-dir ~/.agent-nemo/models/
```