Ministral-DrugDetector-14B-MFI
Model Description
Ministral-DrugDetector-14B-MFI is a LoRA fine-tuned model for detecting illicit drug use in clinical notes. It is based on mistralai/Ministral-3-14B-Instruct-2512 (14B parameters).
The model performs multi-task classification, predicting:
- Illicit use detection for Methamphetamine, Fentanyl, and Injection Drug Use (True/False/Unknown)
- Temporal classification for each drug (Current/Historical/Unknown/N/A)
Performance
Ranked 1st out of 3 models in our evaluation (Macro F1: 0.940)
Illicit Use Detection (F1 Scores)
| Drug | F1 | Precision | Recall |
|---|---|---|---|
| Methamphetamine | 0.931 | 0.904 | 0.959 |
| Fentanyl | 0.969 | 1.000 | 0.939 |
| Injection Drug Use | 0.921 | 0.967 | 0.879 |
| Macro Average | 0.940 | | |
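The F1 column can be cross-checked against the precision and recall columns. The sketch below assumes the standard definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of the per-category F1 scores); the card does not state which formulas were used, so treat this as a consistency check rather than the evaluation code.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
scores = {
    "Methamphetamine": (0.904, 0.959),
    "Fentanyl": (1.000, 0.939),
    "Injection Drug Use": (0.967, 0.879),
}

per_category_f1 = {name: f1(p, r) for name, (p, r) in scores.items()}

# Macro average: unweighted mean over the three categories.
macro_f1 = sum(per_category_f1.values()) / len(per_category_f1)

for name, value in per_category_f1.items():
    print(f"{name}: {value:.3f}")
print(f"Macro Average: {macro_f1:.3f}")
```

Under these assumptions the recomputed values round to the figures reported in the table.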
Temporal Classification (Current vs Historical)
| Drug | Accuracy |
|---|---|
| Methamphetamine | 0.723 |
| Fentanyl | 0.710 |
| Injection Drug Use | 0.679 |
| Average | 0.704 |
Model Comparison
| Model | Size | Macro F1 | Temporal Acc |
|---|---|---|---|
| Ministral-DrugDetector-14B-MFI | 14B | 0.940 | 0.704 |
| Llama-DrugDetector-8B-MFI-v2 | 8B | 0.876 | 0.533 |
| MediPhi-DrugDetector-MFI | 3.8B | 0.863 | 0.502 |
Usage
from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend
from peft import PeftModel
import torch
# Load base model
base_model = "mistralai/Ministral-3-14B-Instruct-2512"
model = Mistral3ForConditionalGeneration.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
# Load LoRA adapter
model = PeftModel.from_pretrained(model, "fabriceyhc/Ministral-DrugDetector-14B-MFI")
# Load tokenizer
tokenizer = MistralCommonBackend.from_pretrained("fabriceyhc/Ministral-DrugDetector-14B-MFI")
# Task instruction
task_instruction = """### Task Description:
Please carefully review the following medical note and identify illicit drug use.
**CRITICAL RULES:**
1. **Positive drug test -> ALWAYS ILLICIT** (unless in medication list)
2. **PMH/History of use -> ILLICIT** (even if historical)
3. **Substance use disorder -> ILLICIT**
4. **Patient self-reports (endorses, reports, admits) -> ILLICIT**
5. **Prescribed/medical use -> NOT ILLICIT**
**Drugs to identify:**
- **Methamphetamine**: Illicit amphetamine use (not prescribed Adderall)
- **Fentanyl**: Illicit fentanyl use (not prescribed patches/procedural)
- **Injection Drug Use**: IV drug use (IVDU, IVDA)
**Temporal Classification:**
- **Current**: Present tense, recent use, positive test
- **Historical**: Past tense, "history of", "former user"
- **Unknown**: Timeframe unclear
### Desired Format:
Methamphetamine Illicit Use: <True/False/Unknown>
Fentanyl Illicit Use: <True/False/Unknown>
Injection Drug Use: <True/False/Unknown>
Methamphetamine Temporal Status: <Current/Historical/Unknown/N/A>
Fentanyl Temporal Status: <Current/Historical/Unknown/N/A>
Injection Drug Use Temporal Status: <Current/Historical/Unknown/N/A>"""
# Example note
note_text = "Patient reports using meth daily for the past 2 weeks. Denies IV drug use."
# Format prompt
prompt = f"""<s>[INST] {task_instruction}
### The medical note to evaluate:
{note_text}
### Answer: [/INST]"""
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)
# Decode
result = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(result)
Expected Output
Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A
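For downstream use, the six-line answer usually needs to be turned into structured fields. The following is a minimal sketch of one way to do that; `parse_prediction` and the constants are hypothetical helpers that are not part of this repository. The field names and allowed values are taken from the "Desired Format" section of the task instruction, and any missing or out-of-vocabulary field is returned as `None` so it can be routed to human review.

```python
# Field names and allowed values, copied from the "Desired Format" in the prompt.
USE_FIELDS = (
    "Methamphetamine Illicit Use",
    "Fentanyl Illicit Use",
    "Injection Drug Use",
)
TEMPORAL_FIELDS = (
    "Methamphetamine Temporal Status",
    "Fentanyl Temporal Status",
    "Injection Drug Use Temporal Status",
)
USE_VALUES = {"True", "False", "Unknown"}
TEMPORAL_VALUES = {"Current", "Historical", "Unknown", "N/A"}


def parse_prediction(text):
    """Map each expected field to its value, or None if missing or invalid.

    Hypothetical helper: the model card does not ship a parser.
    """
    # Collect every "key: value" line; lines without a colon are ignored.
    found = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            found[key.strip()] = value.strip()

    result = {}
    for field in USE_FIELDS:
        value = found.get(field)
        result[field] = value if value in USE_VALUES else None
    for field in TEMPORAL_FIELDS:
        value = found.get(field)
        result[field] = value if value in TEMPORAL_VALUES else None
    return result


example = """Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A"""

prediction = parse_prediction(example)
print(prediction)
```

Returning `None` instead of raising keeps a batch job running when a single generation is malformed; whether that is the right trade-off depends on the pipeline.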
Training Details
- Base Model: mistralai/Ministral-3-14B-Instruct-2512
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 16
- LoRA Alpha: 32
- Training Data: 93 annotated clinical note samples
- Validation Data: 21 samples
- Test Data: 129 samples
- Epochs: 5 (with early stopping)
- Learning Rate: 2e-4
- Precision: bfloat16
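To give a sense of what rank 16 and alpha 32 imply, the sketch below works through the usual LoRA arithmetic: the adapter update is scaled by alpha / rank, and each adapted weight matrix of shape d_out x d_in adds rank * (d_in + d_out) trainable parameters. The 4096 x 4096 layer size is a hypothetical value chosen for illustration, not the base model's actual dimensions, and the card does not say which modules were adapted.

```python
def lora_scaling(alpha, rank):
    """Scaling factor applied to the low-rank update (alpha / rank)."""
    return alpha / rank


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32   # values from the training details above
D_IN = D_OUT = 4096    # hypothetical layer size, for illustration only

scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_trainable_params(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT

print(f"scaling factor: {scaling}")                        # 2.0
print(f"adapter parameters per matrix: {adapter_params}")  # 131072
print(f"fraction of the full matrix: {adapter_params / full_params:.2%}")  # 0.78%
```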
Limitations
- Domain-specific: Trained on clinical notes from specific healthcare systems
- Drug Coverage: Limited to methamphetamine, fentanyl, and injection drug use
- Context Window: May truncate very long notes
- Temporal Accuracy: Current vs Historical classification is much weaker than detection, at roughly 68-72% accuracy per category (0.704 on average)
Intended Use
Recommended:
- Retrospective chart review for substance use research
- Clinical decision support (with human review)
- Cohort identification for substance use studies
Not Recommended:
- Fully automated clinical decision-making without human oversight
- Forensic or legal determination of substance use
Citation
@misc{ministral-drugdetector-14b-mfi-2025,
author = {Hanna-Chang, Fabrice},
title = {Ministral-DrugDetector-14B-MFI},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/fabriceyhc/Ministral-DrugDetector-14B-MFI}
}
License
Apache-2.0