ash12321 committed on
Commit c8e03ca · verified · 1 Parent(s): 1891040

Add Model card

Files changed (1)
  1. README.md +315 -43
README.md CHANGED
@@ -1,83 +1,355 @@
1
  ---
 
2
  license: mit
3
  tags:
4
  - image-classification
5
- - ai-detection
6
  - sdxl
 
7
  - deepfake-detection
8
  library_name: pytorch
 
9
  ---
10
 
11
- # SDXL Detector - ResNet50
12
 
13
- Binary classifier for detecting AI-generated images from Stable Diffusion XL.
 
14
 
15
- ## Model Details
 
 
 
16
 
17
- - **Architecture**: ResNet-50 (ImageNet pretrained)
18
- - **Task**: Binary classification (Real vs Fake)
19
- - **Training Data**: 10,000 real + 10,000 SDXL images
20
- - **Input Size**: 256×256 RGB
21
- - **Classes**: Real (0), Fake (1)
22
 
23
- ## Performance
24
 
25
- See `test_results.json` for detailed metrics.
26
 
27
  ## Usage
28
 
29
  ```python
30
  import torch
31
- from torchvision import models, transforms
32
  from PIL import Image
 
33
 
34
- # Load model
35
- model = models.resnet50()
36
- model.fc = torch.nn.Sequential(
37
- torch.nn.Dropout(0.5),
38
- torch.nn.Linear(2048, 2)
39
  )
40
 
41
- checkpoint = torch.load('pytorch_model.bin')
42
  model.load_state_dict(checkpoint['model_state_dict'])
43
  model.eval()
44
 
45
- # Prepare image
46
  transform = transforms.Compose([
47
- transforms.Resize((256, 256)),
 
48
  transforms.ToTensor(),
49
- transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
 
 
 
50
  ])
51
 
52
- image = Image.open('test.jpg').convert('RGB')
53
- image = transform(image).unsqueeze(0)
54
-
55
  # Predict
 
 
 
56
  with torch.no_grad():
57
- output = model(image)
58
- probs = torch.softmax(output, dim=1)
59
- pred = output.argmax(1).item()
 
60
 
61
- print(f"Prediction: {'Fake' if pred == 1 else 'Real'}")
62
- print(f"Confidence: {probs[0][pred].item()*100:.2f}%")
 
 
63
  ```
64
 
65
- ## Files
66
 
67
- - `pytorch_model.bin`: Model weights
68
- - `config.json`: Configuration
69
- - `training_history.csv`: Training metrics
70
- - `test_results.json`: Test results
71
- - `*.png`: Visualizations
72
 
73
- ## Training
 
 
 
74
 
75
- - Epochs: 30
76
- - Batch Size: 32
77
- - Learning Rate: 0.0001
78
- - Optimizer: AdamW
79
- - Early Stopping: Patience 5
80
 
81
- ## Dataset
82
 
83
- Generated images: [ash12321/sdxl-generated-10k](https://huggingface.co/datasets/ash12321/sdxl-generated-10k)
 
1
  ---
2
+ language: en
3
  license: mit
4
  tags:
5
  - image-classification
6
+ - fake-detection
7
  - sdxl
8
+ - ai-detection
9
  - deepfake-detection
10
+ datasets:
11
+ - food101
12
+ - huggan/AFHQ
13
+ - timm/oxford-iiit-pet
14
+ - tanganke/stanford_cars
15
+ - beans
16
+ - ash12321/sdxl-generated-10k
17
+ metrics:
18
+ - accuracy
19
+ - f1
20
+ - precision
21
+ - recall
22
+ - auc
23
  library_name: pytorch
24
+ pipeline_tag: image-classification
25
  ---
26
 
27
+ # SDXL Detector (ResNet-50)
28
+
29
+ ## Model Description
30
+
31
+ A specialized deep learning model for detecting images generated by Stable Diffusion XL (SDXL) 1.0 at 1024×1024 resolution.
32
+
33
+ **Architecture:** ResNet-50 (pretrained on ImageNet, fine-tuned for SDXL detection)
34
+
35
+ **Training Date:** December 30, 2025
36
+
37
+ **Purpose:** This is a specialist model designed specifically for SDXL 1.0 detection. For general AI image detection across multiple generators, use this as part of an ensemble with other specialist models.
38
+
39
+ ## Performance Metrics
40
+
41
+ ### Test Set Results (2,856 images)
42
+
43
+ | Metric | Score |
44
+ |--------|-------|
45
+ | **Accuracy** | **99.75%** |
46
+ | **F1 Score** | **99.77%** |
47
+ | **Precision** | **99.61%** |
48
+ | **Recall** | **99.93%** |
49
+ | **AUC-ROC** | **0.9999** |
50
+ | **Average Precision** | **0.9999** |
51
+
52
+ ### Per-Class Performance
53
+
54
+ ```
55
+         precision   recall   f1-score   support
56
+ Real       99.92%   99.55%     99.73%     1,320
57
+ Fake       99.61%   99.93%     99.77%     1,536
58
+ ```
59
+
60
+ ### Training Details
61
+
62
+ - **Total Epochs:** 12
63
+ - **Final Training Accuracy:** 99.92%
64
+ - **Final Validation Accuracy:** 99.75%
65
+ - **Training Time:** ~6 minutes on H100 GPU
66
+ - **Model Parameters:** 24,559,170
67
+
68
+ ### Confusion Matrix
69
+
70
+ Out of 2,856 test images:
71
+ - **Real images (1,320):** 1,314 correct, 6 misclassified
72
+ - **Fake images (1,536):** 1,535 correct, 1 misclassified
73
+ - **Total errors:** Only 7 images (0.25% error rate)
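
The headline metrics can be re-derived from these four counts. The following is a minimal sketch in plain Python, assuming "Fake" (SDXL-generated) is the positive class, which is consistent with the reported 99.61% precision and 99.93% recall. The variable names are illustrative and do not come from the repository.

```python
# Confusion-matrix counts taken from the model card.
# Assumption: "Fake" (SDXL-generated) is the positive class.
tn, fp = 1314, 6   # real images: classified as real / flagged as fake
tp, fn = 1535, 1   # fake images: flagged as fake / missed

total = tn + fp + tp + fn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)   # share of real images wrongly flagged
false_negative_rate = fn / (fn + tp)   # share of fake images missed

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("recall", recall),
    ("f1", f1),
    ("false positive rate", false_positive_rate),
    ("false negative rate", false_negative_rate),
]:
    print(f"{name}: {value:.2%}")
```

Rounded to two decimals, these reproduce the table values as well as the 0.45% and 0.07% error rates quoted under Ethical Considerations.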
74
+
75
+ ## Intended Use
76
 
77
+ ### Primary Use Case
78
+ Detecting images generated by Stable Diffusion XL (SDXL) 1.0 at 1024×1024 resolution.
79
 
80
+ ### What This Model Can Do
81
+ ✅ Detect SDXL 1.0 generated images with 99.75% accuracy
82
+ ✅ Identify SDXL-specific generation patterns and artifacts
83
+ ✅ Work with 1024×1024 SDXL outputs
84
 
85
+ ### What This Model Cannot Do
86
+ ❌ Detect images from other generators (Midjourney, DALL-E, Flux, etc.)
87
+ ❌ Work reliably on non-1024×1024 resolutions
88
+ ❌ Detect other Stable Diffusion versions (1.5, 2.1, etc.)
 
89
 
90
+ **Note:** For comprehensive AI image detection across multiple generators, this model should be used as part of an ensemble with other specialist detectors.
91
 
92
+ ## Training Data
93
+
94
+ ### Real Images (9,034 total)
95
+ - **Food101:** 2,000 images (food photography)
96
+ - **AFHQ:** 2,000 images (animal faces)
97
+ - **Oxford Pets:** 2,000 images (pet photography)
98
+ - **Stanford Cars:** 2,000 images (vehicle photography)
99
+ - **Beans:** 1,034 images (agricultural images)
100
+
101
+ All real images were resized to 1024×1024 to match SDXL output dimensions.
102
+
103
+ ### Fake Images (10,000 total)
104
+ - **Source:** SDXL 1.0 generated images
105
+ - **Resolution:** 1024×1024
106
+ - **Dataset:** ash12321/sdxl-generated-10k
107
+
108
+ ### Data Split
109
+ - Training: 70% (13,323 images)
110
+ - Validation: 15% (2,855 images)
111
+ - Test: 15% (2,856 images)
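
The split sizes follow from the 19,034-image total. The sketch below assumes the training and validation counts are rounded down and the test set receives the remainder. The card does not state the rounding rule, but this one reproduces the listed numbers.

```python
# Dataset sizes from the model card.
n_real, n_fake = 9_034, 10_000
n_total = n_real + n_fake

# Assumption: floor the 70% and 15% shares, give the remainder to the test set.
n_train = n_total * 70 // 100
n_val = n_total * 15 // 100
n_test = n_total - n_train - n_val

print(n_total, n_train, n_val, n_test)
```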
112
+
113
+ ## Model Architecture
114
+
115
+ **Base Model:** ResNet-50 (pretrained on ImageNet)
116
+
117
+ **Custom Classifier Head:**
118
+ ```python
119
+ nn.Sequential(  # nn = torch.nn
120
+     nn.Dropout(p=0.3),
121
+     nn.Linear(2048, 512),
122
+     nn.BatchNorm1d(512),
123
+     nn.ReLU(),
124
+     nn.Dropout(p=0.15),
125
+     nn.Linear(512, 2)
126
+ )
127
+ ```
128
+
129
+ **Input:** RGB images at 224×224 (at inference: shorter side resized to 256, then center-cropped to 224×224)
130
+ **Output:** Binary classification (Real vs SDXL-generated)
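
The reported parameter count (24,559,170) can be checked by hand from this architecture. The sketch below assumes the standard torchvision ResNet-50 total of 25,557,032 parameters, which includes its original 1000-class `fc` layer. The helper functions are illustrative, not part of the repository.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features: int) -> int:
    """Learnable scale and shift of a BatchNorm1d layer."""
    return 2 * n_features

# Assumption: standard torchvision ResNet-50, including its 1000-class fc layer.
RESNET50_TOTAL = 25_557_032

# Remove the original fc layer, then add the custom classifier head.
backbone = RESNET50_TOTAL - linear_params(2048, 1000)
head = (
    linear_params(2048, 512)
    + batchnorm_params(512)
    + linear_params(512, 2)
)
total = backbone + head

print(f"backbone={backbone:,} head={head:,} total={total:,}")
```

Dropout and ReLU contribute no learnable parameters, so only the two linear layers and the batch-norm layer appear in the head count.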
131
+
132
+ ## Training Configuration
133
+
134
+ ### Hyperparameters
135
+ - **Optimizer:** AdamW
136
+ - **Learning Rate:** 0.001 (with cosine annealing)
137
+ - **Batch Size:** 128
138
+ - **Weight Decay:** 0.01
139
+ - **Dropout:** 0.3
140
+ - **Label Smoothing:** 0.05
141
+ - **Mixed Precision:** bfloat16 (H100 optimized)
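
To show what "0.001 with cosine annealing" implies for the learning rate over training, here is a minimal sketch of the closed-form cosine schedule. It assumes the schedule is stepped once per epoch over the 12 reported epochs and decays to a minimum of 0. Neither the period nor the minimum is stated in the card.

```python
import math

BASE_LR = 1e-3   # from the model card
T_MAX = 12       # assumption: schedule spans the 12 reported epochs
ETA_MIN = 0.0    # assumption: decay all the way to zero

def cosine_lr(epoch: int) -> float:
    """Closed-form cosine-annealed learning rate at a given epoch."""
    cosine = math.cos(math.pi * epoch / T_MAX)
    return ETA_MIN + 0.5 * (BASE_LR - ETA_MIN) * (1 + cosine)

schedule = [cosine_lr(epoch) for epoch in range(T_MAX + 1)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Under these assumptions the rate starts at 0.001, passes through 0.0005 at the midpoint, and reaches 0 at the final step.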
142
+
143
+ ### Augmentation (Training Only)
144
+ - RandomResizedCrop (scale: 0.8-1.0)
145
+ - RandomHorizontalFlip (p=0.5)
146
+ - RandomRotation (±15°)
147
+ - ColorJitter (brightness, contrast, saturation, hue)
148
+ - Normalization (ImageNet stats)
149
+
150
+ ### Hardware
151
+ - **GPU:** NVIDIA H100
152
+ - **Training Time:** ~6 minutes
153
+ - **Inference Speed:** ~4ms per image (H100)
154
 
155
  ## Usage
156
 
157
+ ### Installation
158
+
159
+ ```bash
160
+ pip install torch torchvision pillow huggingface_hub
161
+ ```
162
+
163
+ ### Quick Start
164
+
165
  ```python
166
  import torch
167
+ from torchvision import transforms
168
  from PIL import Image
169
+ from huggingface_hub import hf_hub_download
170
 
171
+ # Download model
172
+ model_path = hf_hub_download(
173
+     repo_id="ash12321/sdxl-detector-resnet50",
173
+     filename="best.pth"
 
175
  )
176
 
177
+ # Load model
178
+ checkpoint = torch.load(model_path, map_location='cpu')
179
+
180
+ # Create model architecture
181
+ import torchvision.models as models
182
+ import torch.nn as nn
183
+
184
+ class SDXLDetector(nn.Module):
185
+     def __init__(self):
186
+         super().__init__()
187
+         self.backbone = models.resnet50(weights=None)  # no pretrained download; weights come from the checkpoint
188
+         num_features = self.backbone.fc.in_features
189
+         self.backbone.fc = nn.Sequential(
190
+             nn.Dropout(p=0.3),
191
+             nn.Linear(num_features, 512),
192
+             nn.BatchNorm1d(512),
193
+             nn.ReLU(inplace=True),
194
+             nn.Dropout(p=0.15),
195
+             nn.Linear(512, 2)
196
+         )
197
+
198
+     def forward(self, x):
199
+         return self.backbone(x)
200
+
201
+ # Initialize and load weights
202
+ model = SDXLDetector()
203
  model.load_state_dict(checkpoint['model_state_dict'])
204
  model.eval()
205
 
206
+ # Preprocessing
207
  transform = transforms.Compose([
208
+     transforms.Resize(256),
209
+     transforms.CenterCrop(224),
210
      transforms.ToTensor(),
211
+     transforms.Normalize(
212
+         mean=[0.485, 0.456, 0.406],
213
+         std=[0.229, 0.224, 0.225]
214
+     )
215
  ])
216
 
 
 
 
217
  # Predict
218
+ image = Image.open("test_image.jpg").convert('RGB')
219
+ input_tensor = transform(image).unsqueeze(0)
220
+
221
  with torch.no_grad():
222
+     outputs = model(input_tensor)
223
+     probs = torch.softmax(outputs, dim=1)
224
+     prediction = torch.argmax(probs, dim=1).item()
225
+     confidence = probs[0][prediction].item()
226
 
227
+ # Results
228
+ labels = ['Real', 'SDXL-generated']
229
+ print(f"Prediction: {labels[prediction]}")
230
+ print(f"Confidence: {confidence*100:.2f}%")
231
  ```
232
 
233
+ ### Batch Prediction
234
+
235
+ ```python
236
+ from torch.utils.data import DataLoader, Dataset
237
+
238
+ class ImageDataset(Dataset):
239
+     def __init__(self, image_paths, transform):
240
+         self.image_paths = image_paths
241
+         self.transform = transform
242
+
243
+     def __len__(self):
244
+         return len(self.image_paths)
245
+
246
+     def __getitem__(self, idx):
247
+         image = Image.open(self.image_paths[idx]).convert('RGB')
248
+         return self.transform(image)
249
+
250
+ # Create dataset and loader
251
+ image_paths = ['image1.jpg', 'image2.jpg', ...]
252
+ dataset = ImageDataset(image_paths, transform)
253
+ loader = DataLoader(dataset, batch_size=32, num_workers=4)
254
+
255
+ # Batch inference
256
+ predictions = []
257
+ confidences = []
258
+
259
+ model.eval()
260
+ with torch.no_grad():
261
+     for batch in loader:
262
+         outputs = model(batch)
263
+         probs = torch.softmax(outputs, dim=1)
264
+         preds = torch.argmax(probs, dim=1)
265
+         confs = torch.max(probs, dim=1)[0]
266
+
267
+         predictions.extend(preds.cpu().numpy())
268
+         confidences.extend(confs.cpu().numpy())
269
+ ```
270
+
271
+ ## Limitations
272
+
273
+ 1. **Generator-Specific:** Only trained on SDXL 1.0. Will not reliably detect:
274
+ - Other Stable Diffusion versions (1.5, 2.1, 3.0)
275
+ - Midjourney, DALL-E, Flux
276
+ - Other generative models
277
 
278
+ 2. **Resolution-Specific:** Optimized for 1024×1024 SDXL images. Performance may degrade on:
279
+ - Lower resolutions
280
+ - Higher resolutions
281
+ - Non-square aspect ratios
 
282
 
283
+ 3. **Dataset Bias:** Trained on specific real image categories (food, animals, vehicles, etc.). May perform differently on:
284
+ - Artistic images
285
+ - Abstract images
286
+ - Specialized domains (medical, satellite, etc.)
287
 
288
+ 4. **Adversarial Attacks:** Not hardened against adversarial perturbations
 
 
 
 
289
 
290
+ ## Ethical Considerations
291
+
292
+ ### Intended Applications
293
+ ✅ Content moderation
294
+ ✅ Academic research
295
+ ✅ Digital forensics
296
+ ✅ Media verification
297
+
298
+ ### Prohibited Uses
299
+ ❌ Surveillance without consent
300
+ ❌ Discrimination or profiling
301
+ ❌ Bypassing content policies
302
+
303
+ ### False Positives/Negatives
304
+ - **False Positives (0.45%):** Real images misclassified as SDXL-generated
305
+ - May unfairly flag authentic content
306
+ - Always provide human review for high-stakes decisions
307
+
308
+ - **False Negatives (0.07%):** SDXL images misclassified as real
309
+ - SDXL-generated content may slip through
310
+ - Use as part of multi-layer verification
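
One way to act on the human-review recommendation is to route low-confidence predictions to a reviewer rather than accepting the argmax label. The sketch below is hypothetical: the `triage` function and the 0.95 threshold are illustrative assumptions, not part of the released model, and a real threshold would need tuning on validation data.

```python
def triage(p_fake: float, threshold: float = 0.95) -> str:
    """Map a softmax probability for the 'SDXL-generated' class to a decision.

    Assumption: predictions whose confidence falls below `threshold`
    (a hypothetical value) are deferred to a human reviewer.
    """
    if not 0.0 <= p_fake <= 1.0:
        raise ValueError("p_fake must be a probability in [0, 1]")
    confidence = max(p_fake, 1.0 - p_fake)
    if confidence < threshold:
        return "needs human review"
    return "SDXL-generated" if p_fake >= 0.5 else "Real"

for p in (0.99, 0.70, 0.01):
    print(f"p_fake={p:.2f} -> {triage(p)}")
```

Here `p_fake` would correspond to `probs[0][1]` from the Quick Start snippet, since index 1 is the SDXL-generated label.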
311
+
312
+ ### Transparency
313
+ This model should be deployed with clear communication to users about:
314
+ - Its specific purpose (SDXL detection only)
315
+ - Its limitations (not for other generators)
316
+ - Confidence scores for each prediction
317
+ - The possibility of errors
318
+
319
+ ## Citation
320
+
321
+ If you use this model in your research, please cite:
322
+
323
+ ```bibtex
324
+ @misc{sdxl_detector_2025,
325
+   author = {ash12321},
326
+   title = {SDXL Detector: ResNet-50 Fine-tuned for SDXL Detection},
327
+   year = {2025},
328
+   publisher = {HuggingFace},
329
+   howpublished = {\url{https://huggingface.co/ash12321/sdxl-detector-resnet50}},
330
+ }
331
+ ```
332
+
333
+ ## Model Card Authors
334
+
335
+ ash12321
336
+
337
+ ## Model Card Contact
338
+
339
+ For questions or issues, please open an issue on the model repository.
340
+
341
+ ## License
342
+
343
+ MIT License
344
+
345
+ ## Changelog
346
+
347
+ ### Version 1.0 (2025-12-30)
348
+ - Initial release
349
+ - 99.75% test accuracy on SDXL detection
350
+ - ResNet-50 architecture
351
+ - Trained on 19,034 images (9,034 real + 10,000 SDXL)
352
+
353
+ ---
354
 
355
+ **Keywords:** SDXL detection, AI image detection, fake image detection, deepfake detection, ResNet-50, image classification, computer vision