# 🛡️ DeepGuard v2 — EfficientNet-B4 Deepfake Detector
Custom-trained deepfake detection model based on EfficientNet-B4 (19M params), fine-tuned on 200K+ real and fake face images.
## Model Description
This model classifies face images as Real or Fake (AI-generated/manipulated). It replaces the generic CommunityForensics model used in DeepGuard v1, which only achieved ~60% accuracy.
## Architecture
- Backbone: EfficientNet-B4 (ImageNet pretrained, 380×380 input)
- Head: Linear(1792 → 2)
- Loss: Weighted cross-entropy with label smoothing (0.1), to handle class imbalance
- Parameters: ~19M total, all trainable
## Training Recipe (Based on SOTA Literature)
| Parameter | Value |
|---|---|
| Optimizer | AdamW (β1=0.9, β2=0.999) |
| Learning Rate | 2e-4 (cosine decay) |
| Weight Decay | 0.01 |
| Warmup | 10% of steps |
| Batch Size | 64 (16 × 4 grad accum) |
| Epochs | 5 (early stopping patience=3) |
| Precision | FP16 |
| Augmentations | RandomResizedCrop, HFlip, ColorJitter, GaussianBlur, RandomErasing, Rotation |
| Regularization | Label smoothing (0.1), Dropout (0.4), Drop-connect (0.2), Weight decay |
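The optimizer and schedule rows of the table translate to the following sketch: AdamW with the listed hyperparameters, linear warmup over the first 10% of steps, then cosine decay. `model` and `total_steps` are stand-ins for the real training loop.

```python
import math
import torch

model = torch.nn.Linear(10, 2)   # placeholder for the EfficientNet-B4 model
total_steps = 1000               # placeholder; in practice: steps_per_epoch * epochs
warmup_steps = int(0.10 * total_steps)

optimizer = torch.optim.AdamW(
    model.parameters(), lr=2e-4, betas=(0.9, 0.999), weight_decay=0.01
)

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)                 # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))      # cosine decay to 0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

The effective batch size of 64 comes from a per-device batch of 16 with 4 gradient-accumulation steps, i.e. calling `optimizer.step()` and `scheduler.step()` once every 4 micro-batches.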
## Training Data
- Hemg/deepfake-and-real-images: 140K+ images
- JamieWithofs/Deepfake-and-real-images: train/val/test splits
- Combined: ~200K+ images with balanced real/fake classes
## Usage
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

model = AutoModelForImageClassification.from_pretrained("Jetherruns/deepfake-detector-efnb4")
processor = AutoImageProcessor.from_pretrained("Jetherruns/deepfake-detector-efnb4")

image = Image.open("face.jpg")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred = probs.argmax(-1).item()
print(f"{'Fake' if pred == 0 else 'Real'} ({probs[0][pred]:.1%} confidence)")
```
### ONNX Inference (Faster)
```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("model.onnx")
# or the INT8-quantized variant: session = ort.InferenceSession("model_int8.onnx")

# pixel_values_np: float32 array of shape (1, 3, 380, 380), preprocessed the
# same way as in the transformers example above
logits = session.run(None, {"pixel_values": pixel_values_np})[0]
```
## SOTA References
- SBI (CVPR 2022): Self-Blended Images for deepfake detection
- GenD (2025): CLIP ViT-L/14 + LN-Tuning → 96% AUC cross-dataset
- Effort (NeurIPS 2024): SVD orthogonal modeling for generalization
## Limitations
- Trained on face images only — may not generalize to full-scene AI generation
- Best used with face detection preprocessing (crop face before classification)
- Performance may vary on novel deepfake methods not in training distribution
## License
Apache 2.0