---
base_model: 5ch4um1/lfm2.5-vrsbench-lora-450m
tags:
- vision-language
- satellite
- methane
- sentinel-2
- methane-detection
- bounding-box
- lfm2
- gguf
- vrsbench
---

# LFM2.5-VL-450M VRSBench + MethaneS2CM Expert

## Model Description

This is a fine-tuned version of LiquidAI's LFM2.5-VL-450M vision-language model, specialized for methane plume detection in satellite imagery.

The model was trained in two stages:
1. **VRSBench Training**: Base training on the VRSBench dataset
2. **MethaneS2CM Fine-tuning**: Additional training on the MethaneS2CM dataset for methane detection

Given a Sentinel-2 satellite image, the model reports whether a methane plume is present and, if so, returns bounding box coordinates for the plume location.

## Training Details

### Stage 1: VRSBench Pre-training
- **Base Model**: LFM2.5-VL-450M
- **Dataset**: VRSBench (a vision-language benchmark for remote sensing image understanding)
- **Epochs**: 1
- **Method**: LoRA (r=16, alpha=32)

### Stage 2: Methane Detection Fine-tuning
- **Base Model**: VRSBench-trained model (`5ch4um1/lfm2.5-vrsbench-lora-450m`)
- **Dataset**: **MethaneS2CM** (Methane Sentinel-2 Community Model)
  - Source: [H1deaki/MethaneS2CM on Hugging Face](https://huggingface.co/datasets/H1deaki/MethaneS2CM)
  - 257,096 training samples, 60,567 test samples
  - Hand-annotated plume masks, from which bounding boxes are extracted
  - Sentinel-2 bands: SWIR22 (Band 12) → Red, NIR08 (Band 8) → Green, Red (Band 4) → Blue
- **Training Samples**: 20,000 (10k positive with plumes, 10k negative)
- **Epochs**: 2
- **Method**: LoRA (r=16, alpha=32)
- **Hardware**: Local GPU training (no Ray/distributed)

## Evaluation Results

### Methane Detection (500 test samples)

| Metric | Base VRSBench | Methane Expert (this model) | Improvement |
|--------|---------------|-----------------------------|-------------|
| Accuracy | 49.80% | 51.00% | +1.20 pp |
| Precision | 0.00% | 50.79% | +50.79 pp |
| Recall | 0.00% | 88.89% | +88.89 pp |
| F1 Score | 0.00% | 64.65% | +64.65 pp |
| Mean IoU (bbox) | 0.0000 | 0.5879 | +0.5879 |

**Key Results:**
- The base VRSBench model does not predict methane plumes or bounding boxes (0% recall, 0 IoU)
- The Methane Expert model achieves **88.89% recall**, so it misses few of the plumes in the test set
- Precision is 50.79%: the model tends to over-predict plumes (217 false positives vs 31 true negatives), which is why overall accuracy stays close to 51%
- Bounding box predictions have **0.5879 mean IoU**
- 113 out of 224 predicted bounding boxes have IoU > 0.5, so about half of the predicted boxes are well localized

## Usage

### With llama.cpp

```bash
# Download Q4_K_M quantized version (recommended)
wget https://huggingface.co/5ch4um1/lfm2.5-vrsbench-MethaneS2CM-methane-lora-450m/resolve/main/lfm2.5-vrsbench-methane-450m-q4_k_m.gguf

# Run inference
# Note: multimodal inference in llama.cpp typically also needs a projector file
# passed via --mmproj, and some builds expose image input through llama-mtmd-cli
# instead of llama-cli. Check the repository files and your llama.cpp version.
./llama-cli -m lfm2.5-vrsbench-methane-450m-q4_k_m.gguf \
  --image satellite_methane_image.png \
  -p "Is there a methane plume in this satellite image? If yes, provide the bounding box as [x1,y1,x2,y2] normalized to 0-1."
```

### With Transformers

```python
import json

from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "5ch4um1/lfm2.5-vrsbench-MethaneS2CM-methane-lora-450m"

# AutoModelForImageTextToText is the auto class for image+text -> text models
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("satellite_methane_image.png").convert("RGB")
prompt = "Is there a methane plume in this satellite image? If yes, provide the bounding box as [x1,y1,x2,y2] normalized to 0-1."

conversation = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": prompt},
    ]}
]

# Render the chat template to text, then tokenize text and image together
text = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, not the prompt
pred_text = processor.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# The reply is not guaranteed to be valid JSON, so fall back to the raw text
try:
    print(json.loads(pred_text))
except json.JSONDecodeError:
    print(pred_text)
```

## GGUF Quantizations

| Version | Size | Description |
|---------|------|-------------|
| F16 | 679 MB | Unquantized 16-bit floating point |
| Q8_0 | 362 MB | 8-bit quantization |
| Q4_K_M | 219 MB | 4-bit quantization (recommended for most use cases) |

## Training Dataset: MethaneS2CM

The model was trained on the **MethaneS2CM** (Methane Sentinel-2 Community Model) dataset:

- **Dataset**: [H1deaki/MethaneS2CM](https://huggingface.co/datasets/H1deaki/MethaneS2CM)
- **Paper**: "MethaneS2CM: A Dataset for Multispectral Deep Methane Emission Detection" (Liu et al., 2025)
- **Data**: 257k+ samples from Sentinel-2 imagery (2016-2024)
- **Annotations**: Hand-annotated plume masks, from which bounding boxes are extracted
- **Bands used**: False color composite (SWIR22→R, NIR08→G, Red→B)
- **Task**: Methane plume detection with bounding box regression

## Model Sources

- **Base Model**: [LiquidAI/LFM2.5-VL-450M](https://huggingface.co/LiquidAI/LFM2.5-VL-450M)
- **VRSBench Model**: [5ch4um1/lfm2.5-vrsbench-lora-450m](https://huggingface.co/5ch4um1/lfm2.5-vrsbench-lora-450m)
- **Methane Expert Model**: [5ch4um1/lfm2.5-vrsbench-MethaneS2CM-methane-lora-450m](https://huggingface.co/5ch4um1/lfm2.5-vrsbench-MethaneS2CM-methane-lora-450m)
- **MethaneS2CM Dataset**: [H1deaki/MethaneS2CM](https://huggingface.co/datasets/H1deaki/MethaneS2CM)

## Limitations

- The model was trained on 32x32 Sentinel-2 patches and may not generalize to other resolutions
- Performance depends on the quality of the false color composite (SWIR22/NIR08/Red)
- Hand-annotated masks may contain annotation errors
- Performance is best on methane plumes similar to the training distribution

## Training Environment

- **Framework**: Transformers + PEFT (LoRA) + leap-finetune
- **Hardware**: Local GPU (CUDA)
- **Training Scripts**: Available in the [cookbook repository](https://github.com/anomalyco/opencode)
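
## Reproducing the Reported Metrics

The headline numbers in the evaluation table can be cross-checked from confusion-matrix counts. The sketch below is only a consistency check, not the evaluation script used for this card. It takes the 217 false positives and 31 true negatives quoted in Key Results, assumes the 224 scored bounding boxes are the true positives, and infers 28 false negatives from the 500-sample total (that last count is not stated in the card). The function name `detection_metrics` is illustrative.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, precision, recall and F1 as fractions in [0, 1]."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# FP and TN come from the card. TP = 224 is assumed from the number of
# scored boxes, and FN = 500 - 224 - 217 - 31 = 28 is inferred.
metrics = detection_metrics(tp=224, fp=217, tn=31, fn=28)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts the printed values agree with the Methane Expert column of the table (51.00% accuracy, 50.79% precision, 88.89% recall, 64.65% F1), which suggests the reported figures are internally consistent.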
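
## Bounding Box IoU

The Mean IoU figure compares each predicted box with the box derived from the annotated plume mask. The card does not include the scoring code, so the following is a minimal sketch of the standard intersection-over-union for axis-aligned boxes in the `[x1, y1, x2, y2]` format requested by the prompt. The helper name `bbox_iou` and the example boxes are made up for illustration.

```python
from typing import Sequence


def bbox_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    # Corners of the overlapping rectangle (if any)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes in normalized 0-1 coordinates
predicted = [0.25, 0.25, 0.75, 0.75]
reference = [0.50, 0.25, 1.00, 0.75]
print(f"IoU = {bbox_iou(predicted, reference):.3f}")
```

A prediction counts toward the "IoU > 0.5" tally in Key Results when this value exceeds 0.5 for its reference box.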
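
## Parsing the Model Reply

The usage prompt asks for a box "as [x1,y1,x2,y2] normalized to 0-1", but the exact reply format is not documented in this card, and a negative answer may be plain prose. The sketch below assumes the reply contains at most one bracketed list of four numbers and returns `None` otherwise. The `parse_bbox` helper and the sample replies are hypothetical.

```python
import re
from typing import List, Optional

# One (possibly signed) decimal number, e.g. "0.25", ".5" or "1"
_NUM = r"([-+]?\d*\.?\d+)"
_BOX_PATTERN = re.compile(r"\[\s*" + r"\s*,\s*".join([_NUM] * 4) + r"\s*\]")


def parse_bbox(reply: str) -> Optional[List[float]]:
    """Extract the first [x1, y1, x2, y2] box from a reply, or None."""
    match = _BOX_PATTERN.search(reply)
    if match is None:
        return None
    # Clamp to the normalized range the prompt asks for
    box = [min(1.0, max(0.0, float(g))) for g in match.groups()]
    # Reject degenerate boxes with zero or negative width/height
    if box[0] >= box[2] or box[1] >= box[3]:
        return None
    return box


print(parse_bbox("Yes, there is a plume at [0.1, 0.2, 0.5, 0.6]."))
print(parse_bbox("No methane plume detected."))
```

Treating a missing or degenerate box as "no plume" is one possible convention; an evaluation script may handle such replies differently.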
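
## Building the False Color Composite

Input images are expected as the false color composite described above (SWIR22→R, NIR08→G, Red→B). The card does not state how band values were scaled to 8-bit, so the sketch below uses a per-band 2-98 percentile stretch purely as an assumption. Results may differ from the training preprocessing. The function name and the random test data are illustrative.

```python
import numpy as np


def false_color_composite(swir22, nir08, red, low_pct=2.0, high_pct=98.0):
    """Stack SWIR22 -> R, NIR08 -> G, Red -> B as a uint8 H x W x 3 image.

    The percentile stretch is an assumed normalization, not the documented one.
    """
    channels = []
    for band in (swir22, nir08, red):
        band = np.asarray(band, dtype=np.float32)
        lo, hi = np.percentile(band, [low_pct, high_pct])
        # Guard against a constant band, where hi == lo
        scaled = np.clip((band - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
        channels.append((scaled * 255).astype(np.uint8))
    return np.stack(channels, axis=-1)


# Random stand-ins for 32x32 Sentinel-2 band patches (B12, B08, B04)
rng = np.random.default_rng(0)
b12, b08, b04 = (rng.uniform(0, 3000, size=(32, 32)) for _ in range(3))
rgb = false_color_composite(b12, b08, b04)
print(rgb.shape, rgb.dtype)
```

The resulting array can be wrapped with `PIL.Image.fromarray(rgb)` and passed to the Transformers example in place of the PNG file.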