Ministral-DrugDetector-14B-MFI

Model Description

Ministral-DrugDetector-14B-MFI is a LoRA fine-tuned model for detecting illicit drug use in clinical notes. It is based on mistralai/Ministral-3-14B-Instruct-2512 (14B parameters).

The model performs multi-task classification, predicting:

Illicit use detection for Methamphetamine, Fentanyl, and Injection Drug Use (True/False/Unknown)
Temporal classification for each drug (Current/Historical/Unknown/N/A)

Performance

Ranked 1st out of 3 models in our evaluation (Macro F1: 0.940)

Illicit Use Detection (F1 Scores)

Drug	F1	Precision	Recall
Methamphetamine	0.931	0.904	0.959
Fentanyl	0.969	1.000	0.939
Injection Drug Use	0.921	0.967	0.879
Macro Average	0.940
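
The F1 values in the table follow from the listed precision and recall, and the macro average is the unweighted mean of the three per-drug F1 scores. A minimal sketch that reproduces this arithmetic, using only the numbers in the table; the variable and function names are illustrative and not taken from the author's evaluation code:

```python
# Precision and recall per category, copied from the table above.
scores = {
    "Methamphetamine": (0.904, 0.959),
    "Fentanyl": (1.000, 0.939),
    "Injection Drug Use": (0.967, 0.879),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_drug_f1 = {drug: f1_score(p, r) for drug, (p, r) in scores.items()}

# Macro average: every category counts equally, regardless of class frequency.
macro_f1 = sum(per_drug_f1.values()) / len(per_drug_f1)

for drug, f1 in per_drug_f1.items():
    print(f"{drug}: F1 = {f1:.3f}")
print(f"Macro F1 = {macro_f1:.3f}")
```

Rounded to three decimals, the recomputed values match the table.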

Temporal Classification (Current vs Historical)

Drug	Accuracy
Methamphetamine	0.723
Fentanyl	0.710
Injection Drug Use	0.679
Average	0.704

Model Comparison

Model	Size	Macro F1	Temporal Acc
Ministral-DrugDetector-14B-MFI	14B	0.940	0.704
Llama-DrugDetector-8B-MFI-v2	8B	0.876	0.533
MediPhi-DrugDetector-MFI	3.8B	0.863	0.502

Usage

from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend
from peft import PeftModel
import torch

# Load base model
base_model = "mistralai/Ministral-3-14B-Instruct-2512"
model = Mistral3ForConditionalGeneration.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "fabriceyhc/Ministral-DrugDetector-14B-MFI")

# Load tokenizer
tokenizer = MistralCommonBackend.from_pretrained("fabriceyhc/Ministral-DrugDetector-14B-MFI")

# Task instruction
task_instruction = """### Task Description:
Please carefully review the following medical note and identify illicit drug use.

**CRITICAL RULES:**
1. **Positive drug test -> ALWAYS ILLICIT** (unless in medication list)
2. **PMH/History of use -> ILLICIT** (even if historical)
3. **Substance use disorder -> ILLICIT**
4. **Patient self-reports (endorses, reports, admits) -> ILLICIT**
5. **Prescribed/medical use -> NOT ILLICIT**

**Drugs to identify:**
- **Methamphetamine**: Illicit amphetamine use (not prescribed Adderall)
- **Fentanyl**: Illicit fentanyl use (not prescribed patches/procedural)
- **Injection Drug Use**: IV drug use (IVDU, IVDA)

**Temporal Classification:**
- **Current**: Present tense, recent use, positive test
- **Historical**: Past tense, "history of", "former user"
- **Unknown**: Timeframe unclear

### Desired Format:
Methamphetamine Illicit Use: <True/False/Unknown>
Fentanyl Illicit Use: <True/False/Unknown>
Injection Drug Use: <True/False/Unknown>
Methamphetamine Temporal Status: <Current/Historical/Unknown/N/A>
Fentanyl Temporal Status: <Current/Historical/Unknown/N/A>
Injection Drug Use Temporal Status: <Current/Historical/Unknown/N/A>"""

# Example note
note_text = "Patient reports using meth daily for the past 2 weeks. Denies IV drug use."

# Format prompt
prompt = f"""<s>[INST] {task_instruction}

### The medical note to evaluate:
{note_text}

### Answer: [/INST]"""

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Decode
result = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(result)

Expected Output

Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A
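
Because the output follows a fixed six-line format, it can be turned into a structured record with a small parser. The following is a minimal sketch, assuming the model emits one `Field: Value` pair per line exactly as shown above; `parse_prediction` and `EXPECTED_FIELDS` are hypothetical helper names and are not part of the released code.

```python
from typing import Dict, Optional

# Field names and allowed values, taken from the "Desired Format" in the prompt.
ILLICIT_VALUES = {"True", "False", "Unknown"}
TEMPORAL_VALUES = {"Current", "Historical", "Unknown", "N/A"}
EXPECTED_FIELDS = {
    "Methamphetamine Illicit Use": ILLICIT_VALUES,
    "Fentanyl Illicit Use": ILLICIT_VALUES,
    "Injection Drug Use": ILLICIT_VALUES,
    "Methamphetamine Temporal Status": TEMPORAL_VALUES,
    "Fentanyl Temporal Status": TEMPORAL_VALUES,
    "Injection Drug Use Temporal Status": TEMPORAL_VALUES,
}


def parse_prediction(text: str) -> Dict[str, Optional[str]]:
    """Map each expected field to its value, or None if missing or malformed."""
    parsed: Dict[str, Optional[str]] = {field: None for field in EXPECTED_FIELDS}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Ignore lines that are not "Field: Value" or that carry an unexpected value.
        if sep and key in EXPECTED_FIELDS and value in EXPECTED_FIELDS[key]:
            parsed[key] = value
    return parsed


example_output = """Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A"""

prediction = parse_prediction(example_output)
print(prediction)
```

Fields left as `None` indicate output that did not match the expected format and may be worth routing to human review.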

Training Details

Base Model: mistralai/Ministral-3-14B-Instruct-2512
Fine-tuning Method: LoRA (Low-Rank Adaptation)
LoRA Rank: 16
LoRA Alpha: 32
Training Data: 93 annotated clinical note samples
Validation Data: 21 samples
Test Data: 129 samples
Epochs: up to 5 (with early stopping)
Learning Rate: 2e-4
Precision: bfloat16
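
Two quantities implied by these settings can be worked out directly: the LoRA scaling factor and the relative size of each data split. This is a sketch under the assumption that the standard LoRA scaling of alpha divided by rank applies (not the rank-stabilized variant); target modules, dropout, and batch size are not stated in this card and are left out.

```python
# Values copied from the training details above.
lora_rank = 16
lora_alpha = 32
split_sizes = {"train": 93, "validation": 21, "test": 129}

# Assumption: standard LoRA, where the low-rank update is scaled by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Share of the annotated samples that falls into each split.
total = sum(split_sizes.values())
fractions = {name: size / total for name, size in split_sizes.items()}

print(f"LoRA scaling (alpha / rank): {lora_scaling}")
print(f"Total annotated samples: {total}")
for name, fraction in fractions.items():
    print(f"{name}: {split_sizes[name]} samples ({fraction:.1%})")
```

The held-out test set is larger than the training set, so the reported metrics rest on more samples than the adapter was trained on.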

Limitations

Domain-specific: Trained on clinical notes from specific healthcare systems
Drug Coverage: Limited to methamphetamine, fentanyl, and injection drug use
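Context Window: Very long notes may exceed the context window and be truncated
Temporal Accuracy: Current vs Historical classification reaches only about 68-72% accuracy per drug (0.704 average), well below detection performance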
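
One possible way to work around truncation of long notes is to split a note into overlapping windows, classify each window separately, and merge the per-window labels. The sketch below is an assumption and not the author's method: the word-based window sizes are arbitrary stand-ins for a token budget, `chunk_words` and `merge_labels` are hypothetical helpers, and the merge rule (True over False over Unknown) is one simple choice among several. It covers only the illicit-use labels, not temporal status.

```python
from typing import List

# Assumed merge priority: any positive finding wins, then explicit negatives.
PRIORITY = {"True": 2, "False": 1, "Unknown": 0}


def chunk_words(text: str, max_words: int = 400, overlap: int = 50) -> List[str]:
    """Split text into windows of at most max_words words that overlap by `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    chunks: List[str] = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the note
    return chunks


def merge_labels(labels: List[str]) -> str:
    """Combine per-chunk illicit-use labels into one note-level label."""
    normalized = [label if label in PRIORITY else "Unknown" for label in labels]
    if not normalized:
        return "Unknown"
    return max(normalized, key=PRIORITY.__getitem__)


note = " ".join(f"w{i}" for i in range(10))
windows = chunk_words(note, max_words=4, overlap=1)
print(windows)
print(merge_labels(["False", "True", "Unknown"]))
```

Each window would be passed through the same prompt shown in the Usage section; the overlap reduces the chance that a relevant phrase is split across two windows.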

Intended Use

Recommended:

Retrospective chart review for substance use research
Clinical decision support (with human review)
Cohort identification for substance use studies

Not Recommended:

Fully automated clinical decision-making without human oversight
Forensic or legal determination of substance use

Citation

@misc{ministral-drugdetector-14b-mfi-2025,
  author = {Hanna-Chang, Fabrice},
  title = {Ministral-DrugDetector-14B-MFI},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/fabriceyhc/Ministral-DrugDetector-14B-MFI}
}

License

Apache-2.0
