Ministral-DrugDetector-14B-MFI

Model Description

Ministral-DrugDetector-14B-MFI is a LoRA fine-tuned model for detecting illicit drug use in clinical notes. It is based on mistralai/Ministral-3-14B-Instruct-2512 (14B parameters).

The model performs multi-task classification, predicting:

  1. Illicit use detection for Methamphetamine, Fentanyl, and Injection Drug Use (True/False/Unknown)
  2. Temporal classification for each drug (Current/Historical/Unknown/N/A)

Performance

Ranked 1st out of 3 models in our evaluation (Macro F1: 0.940)

Illicit Use Detection (F1 Scores)

| Drug | F1 | Precision | Recall |
|---|---|---|---|
| Methamphetamine | 0.931 | 0.904 | 0.959 |
| Fentanyl | 0.969 | 1.000 | 0.939 |
| Injection Drug Use | 0.921 | 0.967 | 0.879 |
| Macro Average | 0.940 | | |
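
As a consistency check, the F1 values reported above follow from the listed precision and recall via F1 = 2PR / (P + R), and the macro average is the unweighted mean of the three per-drug F1 scores. The sketch below recomputes them; the helper name `f1_score` is illustrative and not part of this model's tooling.

```python
# Precision and recall as reported in the Illicit Use Detection table.
reported = {
    "Methamphetamine": (0.904, 0.959),
    "Fentanyl": (1.000, 0.939),
    "Injection Drug Use": (0.967, 0.879),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


per_drug_f1 = {drug: f1_score(p, r) for drug, (p, r) in reported.items()}

# Macro average: every drug counts equally, regardless of class frequency.
macro_f1 = sum(per_drug_f1.values()) / len(per_drug_f1)

for drug, f1 in per_drug_f1.items():
    print(f"{drug}: {f1:.3f}")
print(f"Macro Average: {macro_f1:.3f}")
```

Rounded to three decimals, the recomputed values agree with the reported F1 column and the 0.940 macro average.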

Temporal Classification (Current vs Historical)

| Drug | Accuracy |
|---|---|
| Methamphetamine | 0.723 |
| Fentanyl | 0.710 |
| Injection Drug Use | 0.679 |
| Average | 0.704 |

Model Comparison

| Model | Size | Macro F1 | Temporal Acc |
|---|---|---|---|
| Ministral-DrugDetector-14B-MFI | 14B | 0.940 | 0.704 |
| Llama-DrugDetector-8B-MFI-v2 | 8B | 0.876 | 0.533 |
| MediPhi-DrugDetector-MFI | 3.8B | 0.863 | 0.502 |

Usage

from transformers import Mistral3ForConditionalGeneration, MistralCommonBackend
from peft import PeftModel
import torch

# Load base model
base_model = "mistralai/Ministral-3-14B-Instruct-2512"
model = Mistral3ForConditionalGeneration.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "fabriceyhc/Ministral-DrugDetector-14B-MFI")

# Load tokenizer
tokenizer = MistralCommonBackend.from_pretrained("fabriceyhc/Ministral-DrugDetector-14B-MFI")

# Task instruction
task_instruction = """### Task Description:
Please carefully review the following medical note and identify illicit drug use.

**CRITICAL RULES:**
1. **Positive drug test -> ALWAYS ILLICIT** (unless in medication list)
2. **PMH/History of use -> ILLICIT** (even if historical)
3. **Substance use disorder -> ILLICIT**
4. **Patient self-reports (endorses, reports, admits) -> ILLICIT**
5. **Prescribed/medical use -> NOT ILLICIT**

**Drugs to identify:**
- **Methamphetamine**: Illicit amphetamine use (not prescribed Adderall)
- **Fentanyl**: Illicit fentanyl use (not prescribed patches/procedural)
- **Injection Drug Use**: IV drug use (IVDU, IVDA)

**Temporal Classification:**
- **Current**: Present tense, recent use, positive test
- **Historical**: Past tense, "history of", "former user"
- **Unknown**: Timeframe unclear

### Desired Format:
Methamphetamine Illicit Use: <True/False/Unknown>
Fentanyl Illicit Use: <True/False/Unknown>
Injection Drug Use: <True/False/Unknown>
Methamphetamine Temporal Status: <Current/Historical/Unknown/N/A>
Fentanyl Temporal Status: <Current/Historical/Unknown/N/A>
Injection Drug Use Temporal Status: <Current/Historical/Unknown/N/A>"""

# Example note
note_text = "Patient reports using meth daily for the past 2 weeks. Denies IV drug use."

# Format prompt
prompt = f"""<s>[INST] {task_instruction}

### The medical note to evaluate:
{note_text}

### Answer: [/INST]"""

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Decode
result = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(result)

Expected Output

Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A
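
For downstream use, the generated text usually needs to be converted into structured fields. The sketch below parses the six `Field: Value` lines defined in the Desired Format section of the prompt. The function name `parse_prediction` and the choice to map missing or unexpected values to `None` are assumptions made for illustration, not part of the released model.

```python
# Field names and allowed labels, copied from the "Desired Format" section.
ILLICIT_LABELS = {"True", "False", "Unknown"}
TEMPORAL_LABELS = {"Current", "Historical", "Unknown", "N/A"}

ALLOWED = {
    "Methamphetamine Illicit Use": ILLICIT_LABELS,
    "Fentanyl Illicit Use": ILLICIT_LABELS,
    "Injection Drug Use": ILLICIT_LABELS,
    "Methamphetamine Temporal Status": TEMPORAL_LABELS,
    "Fentanyl Temporal Status": TEMPORAL_LABELS,
    "Injection Drug Use Temporal Status": TEMPORAL_LABELS,
}


def parse_prediction(text: str) -> dict:
    """Parse model output into a dict; unparseable fields stay None."""
    parsed = {field: None for field in ALLOWED}
    for line in text.splitlines():
        # Split on the first colon only, so a value like "N/A" is kept intact.
        field, sep, value = line.partition(":")
        field, value = field.strip(), value.strip()
        if sep and field in ALLOWED and value in ALLOWED[field]:
            parsed[field] = value
    return parsed


example_output = """Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A"""

prediction = parse_prediction(example_output)
print(prediction["Methamphetamine Illicit Use"])      # True
print(prediction["Methamphetamine Temporal Status"])  # Current
```

Leaving a field as `None` when it cannot be parsed makes malformed generations easy to route to human review rather than silently counting them as negatives.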

Training Details

  • Base Model: mistralai/Ministral-3-14B-Instruct-2512
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Training Data: 93 annotated clinical note samples
  • Validation Data: 21 samples
  • Test Data: 129 samples
  • Epochs: up to 5 (with early stopping)
  • Learning Rate: 2e-4
  • Precision: bfloat16
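
To put the rank and alpha settings in context: in the standard LoRA formulation, a frozen weight matrix of shape (d_out, d_in) receives a low-rank update scaled by alpha / r, built from two small matrices of shapes (d_out, r) and (r, d_in). The sketch below computes that scaling factor and the number of parameters added for one adapted matrix. The 5120 x 5120 shape is a hypothetical example, not a value taken from this card, and the card does not state which modules were adapted.

```python
LORA_RANK = 16   # from Training Details
LORA_ALPHA = 32  # from Training Details


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_added_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical square projection matrix, chosen only for illustration.
d_out, d_in = 5120, 5120

scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
added = lora_added_params(d_out, d_in, LORA_RANK)
full = d_out * d_in

print(f"scaling factor: {scaling}")
print(f"added parameters: {added}")
print(f"fraction of the full matrix: {added / full:.4%}")
```

With rank 16 and alpha 32, the update is scaled by 2.0, and for a matrix of this hypothetical size the adapter adds well under 1% of the original parameter count.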

Limitations

  1. Domain-specific: Trained on clinical notes from specific healthcare systems
  2. Drug Coverage: Limited to methamphetamine, fentanyl, and injection drug use
  3. Context Window: May truncate very long notes
  4. Temporal Accuracy: about 68-72% accuracy on Current vs Historical classification (0.704 average), well below the illicit use detection scores
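
One possible way to work around the context-window limitation is to split a long note into overlapping chunks, classify each chunk separately, and merge the per-chunk labels. The sketch below is an assumption, not the author's evaluated procedure: the word-based chunk size, the overlap, and the merge rule (True outranks Unknown, which outranks False) are illustrative choices that would need validation before use. Only the illicit use labels are merged here; temporal labels would need their own rule.

```python
def chunk_note(note_text: str, max_words: int = 1500, overlap: int = 100) -> list:
    """Split a note into overlapping word-based chunks."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = note_text.split()
    if len(words) <= max_words:
        return [note_text]
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the note
    return chunks


# Assumed merge rule: any positive chunk makes the whole note positive.
LABEL_PRIORITY = {"False": 0, "Unknown": 1, "True": 2}


def merge_illicit_labels(labels: list) -> str:
    """Return the highest-priority label from a non-empty list of chunk labels."""
    return max(labels, key=LABEL_PRIORITY.__getitem__)


# Small demonstration with a 10-word "note" and tiny chunks.
demo_note = " ".join(f"w{i}" for i in range(10))
demo_chunks = chunk_note(demo_note, max_words=4, overlap=1)
print(demo_chunks)
print(merge_illicit_labels(["False", "Unknown", "True"]))
```

Each chunk would be passed through the same prompt as a full note, and the merged label reported for the note as a whole.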

Intended Use

Recommended:

  • Retrospective chart review for substance use research
  • Clinical decision support (with human review)
  • Cohort identification for substance use studies

Not Recommended:

  • Fully automated clinical decision-making without human oversight
  • Forensic or legal determination of substance use

Citation

@misc{ministral-drugdetector-14b-mfi-2025,
  author = {Hanna-Chang, Fabrice},
  title = {Ministral-DrugDetector-14B-MFI},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/fabriceyhc/Ministral-DrugDetector-14B-MFI}
}

License

Apache-2.0
