lucvantien1211 committed · verified
Commit 5e4221e · Parent(s): d0183e5

Update README.md

Files changed (1): README.md (+49 −3)

README.md CHANGED
@@ -1,8 +1,8 @@
 ---
 title: Dinosaur Project
-emoji: 🏃
 colorFrom: green
-colorTo: red
 sdk: gradio
 sdk_version: 5.44.1
 app_file: app.py
@@ -11,4 +11,50 @@ license: apache-2.0
 short_description: A simple dinosaur species classifier, fine-tuned on ConvNext
 ---

-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 ---
 title: Dinosaur Project
+emoji: 👀
 colorFrom: green
+colorTo: pink
 sdk: gradio
 sdk_version: 5.44.1
 app_file: app.py

 short_description: A simple dinosaur species classifier, fine-tuned on ConvNext
 ---

<p align="center">
<h1 align="center">Jurassic Park Dinosaur Species Classifier</h1>
</p>

## Introduction
This Hugging Face Space contains the source code for my project of building a classification model that can distinguish the dinosaur species that appear in the Jurassic Park franchise.
The model powers a simple web app (built with Gradio). With my code, you can:
* Upload an image of a dinosaur and get a prediction of its species (the top-5 predictions are shown, along with their probabilities).
* Retrain the model with a different configuration (see the `src/train.py` script).
* Use this as a baseline and develop your own model.

## About the data
The original dataset can be found on Kaggle: https://www.kaggle.com/datasets/antaresl/jurassic-park-dinosaurs-dataset.
However, this dataset was crawled from the web, so it is very diverse in style, size, and quality, and of course there is a lot of noise in the data
(some species that look very similar to each other are labeled incorrectly, and some images are children's drawings or fossil photographs).
So I wrote the `src/filter_and_split.py` script to:
* Filter out children's drawings and fossil photographs (most of them), which I believe hurt the model's ability to learn features the most.
* Split the dataset into train/validation/test sets with a ratio of 0.8/0.1/0.1 respectively.

I use a CLIP model to quickly filter out invalid images, and I created a custom dataset class (see `src/dataset.py`) to utilize batch processing and speed up the filtering.
One caveat: I did not set a random seed in the script, which reduces its reproducibility.
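The idea behind the script can be sketched roughly as follows. The CLIP checkpoint name, prompts, and threshold are illustrative guesses of mine, not the actual values in `src/filter_and_split.py`, and the fixed seed shows the reproducibility fix mentioned above:

```python
import random
from functools import lru_cache

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

random.seed(42)  # fixing the seed makes the split reproducible

@lru_cache(maxsize=1)
def load_clip():
    # Loaded lazily so the split step can run without downloading the model.
    name = "openai/clip-vit-base-patch32"
    return CLIPModel.from_pretrained(name), CLIPProcessor.from_pretrained(name)

PROMPTS = [
    "a realistic photo or render of a dinosaur",  # keep
    "a child's drawing of a dinosaur",            # drop
    "a photo of a fossil skeleton",               # drop
]

def keep_image(image: Image.Image, threshold: float = 0.5) -> bool:
    """Zero-shot check: keep the image only if CLIP favors the first prompt."""
    model, processor = load_clip()
    inputs = processor(text=PROMPTS, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
    return probs[0].item() >= threshold

def split(paths: list[str]) -> tuple[list[str], list[str], list[str]]:
    """Shuffle and split into train/validation/test with ratio 0.8/0.1/0.1."""
    paths = list(paths)
    random.shuffle(paths)
    n_train = int(0.8 * len(paths))
    n_val = int(0.1 * len(paths))
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]
```

Batching the images through the processor (as the custom dataset class in `src/dataset.py` does) is what makes the filtering fast; the per-image version above is just easier to read.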

## Training
After trying different models (ResNet, EfficientNet, ...), I found that ConvNeXt Tiny yields the best and most stable results. The model was trained on a Kaggle P100 GPU.
The `src/train.py` script trains the model. The detailed process of exploring the data, setting up training, examining the training run, and evaluating the model on the test data
can be found in the `dinosaur_species_classification.ipynb` notebook.

## Trained model
You can find my final model for this project in the `model` directory.

## Final results
The model achieved an accuracy of 0.75 and a weighted F1 score of roughly 0.76, which to me is not too shabby, considering the quality of the data and the limited resources available.
You can inspect the confusion matrix and a table of the top-10 misclassified pairs at the end of the `dinosaur_species_classification.ipynb` notebook for a closer look at the model's performance.
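For reference, metrics like these and the misclassified-pairs table can be computed from model predictions along the following lines (a scikit-learn sketch; `y_true`/`y_pred` stand in for the notebook's test-set outputs):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

def evaluate(y_true, y_pred, class_names):
    """Accuracy, weighted F1, and the most-confused class pairs."""
    acc = accuracy_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred, average="weighted")
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))
    # Off-diagonal cells of the confusion matrix are misclassifications;
    # rank them to get the top confused (true, predicted) pairs.
    pairs = [(class_names[i], class_names[j], int(cm[i, j]))
             for i in range(len(class_names))
             for j in range(len(class_names))
             if i != j and cm[i, j] > 0]
    pairs.sort(key=lambda p: p[2], reverse=True)
    return acc, f1, pairs[:10]
```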

## Suggestions for improvement
There are several ideas that I did not try, but that you could, in order to improve this model:
* Neural Style Transfer: Our data spans a variety of styles (2D hand drawings, 2D digital drawings, 3D renders, ...), so the model might be prone to learning an image's style rather than biological features.
To address this, you can use a Neural Style Transfer model to generate stylized versions of the original dataset and train the model on this new dataset to see if there is any improvement.
* Bilinear Pooling: This is a very interesting method: it uses two CNN backbones to generate two feature maps, then computes their outer product to create a new feature map that
captures how the original features interact. This has proved effective in fine-grained classification tasks like ours, where some classes differ only slightly.
However, this method comes at a computational cost, so you should look into Compact Bilinear Pooling, a method that estimates the outer product of the feature maps instead of
computing it exactly.
* Synthetic Data Generation: Our dataset is relatively small, only ~70 images per class, so you could consider using generative AI models to create new samples.
However, note that you will need to generate samples in a way that preserves the features of each class. With a larger dataset, you can consider unfreezing more layers of ConvNeXt Tiny
for fine-tuning, or using Vision Transformer models.
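As a concrete reference for the bilinear-pooling idea above, here is a minimal sketch of the pooling step itself (the full outer product, not the compact approximation; shapes are illustrative):

```python
import torch
from torch.nn import functional as F

def bilinear_pool(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Pool two feature maps (B, C1, H, W) and (B, C2, H, W) into a
    (B, C1*C2) descriptor via their location-wise outer product."""
    b, c1, h, w = feat_a.shape
    c2 = feat_b.shape[1]
    fa = feat_a.reshape(b, c1, h * w)
    fb = feat_b.reshape(b, c2, h * w)
    # Sum the outer products of the two feature vectors over all locations.
    x = torch.bmm(fa, fb.transpose(1, 2)) / (h * w)   # (B, C1, C2)
    x = x.reshape(b, c1 * c2)
    # Signed square root + L2 normalisation, the usual post-processing.
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-10)
    return F.normalize(x, dim=1)
```

The C1×C2 output dimension is exactly the cost the compact variant avoids by sketching the outer product into a much lower dimension.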

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference