# internvl3-2b-walk-lora-v1

## Model Description
This is a LoRA adapter for InternVL3-2B, fine-tuned on the WalkVLM dataset to assist visually impaired individuals with navigation hazard detection.
## How to Use

### Method 1: Using PEFT (Recommended)
```python
import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

# Load base model
base_model = AutoModel.from_pretrained(
    "OpenGVLab/InternVL3-2B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenGVLab/InternVL3-2B", trust_remote_code=True)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "blind-assist/internvl3-2b-walk-lora-v1")

# Merge adapter weights into the base model for faster inference (optional)
model = model.merge_and_unload()

# Run inference
response = model.chat(
    tokenizer=tokenizer,
    pixel_values=pixel_values,  # your preprocessed image tensor
    question="Describe any obstacles in this scene.",
    generation_config=dict(max_new_tokens=256),
)
```
### Method 2: Manual LoRA Merge

If PEFT doesn't work due to the model architecture, use manual merging:

```python
# See our inference script at:
# https://github.com/Blind-Assist/InternVL/blob/walkvlm/internvl_chat/test_finetuned_model.py
```
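The core of a manual merge is folding each low-rank delta back into its base weight: `W += (alpha / rank) * B @ A`. A sketch of that idea (the key-prefix handling below is illustrative of PEFT-style naming, not taken from this repo's script; check `lora_sd.keys()` and the `lora_alpha`/`r` values in `adapter_config.json` for your actual adapter):

```python
import torch
# from safetensors.torch import load_file  # pip install safetensors

def merge_lora(base_sd, lora_sd, alpha, rank):
    """Fold LoRA deltas into a base state dict: W += (alpha / rank) * B @ A.

    Key naming here is illustrative (PEFT-style prefixes); inspect your
    adapter file's keys to confirm the real names before relying on it.
    """
    scale = alpha / rank
    for key, a_weight in lora_sd.items():
        if "lora_A" not in key:
            continue
        b_key = key.replace("lora_A", "lora_B")
        # e.g. "base_model.model.linear.lora_A.weight" -> "linear.weight"
        base_key = key.replace("base_model.model.", "").replace("lora_A.", "")
        if b_key in lora_sd and base_key in base_sd:
            delta = (lora_sd[b_key].float() @ a_weight.float()) * scale
            base_sd[base_key] += delta.to(base_sd[base_key].dtype)
    return base_sd

# Usage sketch:
# lora_sd = load_file("adapter_model.safetensors")  # downloaded from this repo
# merged = merge_lora(base_model.state_dict(), lora_sd, alpha=..., rank=128)
# base_model.load_state_dict(merged)
```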
## Training Details
- Base Model: OpenGVLab/InternVL3-2B
- Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 128
- Dataset: blind-assist/walk-train
- Task: Navigation hazard detection for visually impaired users
## Files

- `adapter_config.json` - PEFT LoRA configuration
- `adapter_model.safetensors` - LoRA weights only (~50MB)
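For orientation, a PEFT `adapter_config.json` for a rank-128 LoRA generally looks like the fragment below. Only `r` and `base_model_name_or_path` reflect details stated in this card; `lora_alpha`, `lora_dropout`, `target_modules`, and `task_type` are illustrative placeholders, so read the actual file in this repo for the real values:

```json
{
  "base_model_name_or_path": "OpenGVLab/InternVL3-2B",
  "peft_type": "LORA",
  "r": 128,
  "lora_alpha": 256,
  "lora_dropout": 0.05,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
  "task_type": "CAUSAL_LM"
}
```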
## License

Same as the base model (OpenGVLab/InternVL3-2B).