In this project, the goal is to train a model that segments the various parts of a face.
Dataset: link
Number of Images: Train (19,535) and Validation (2,653)
The original dataset has 18 classes, which can be referred to here. However, I merged paired classes such as left_ear and right_ear into a single ear class, and dropped unnecessary classes such as hair, hat, etc.
Classes: 8 (the original label IDs are retained, so they are not sequential)
- 0: Background
- 1: Face Skin
- 2: Eyebrows
- 4: Eyes
- 6: Nose
- 7: Mouth
- 13: Ears
- 16: Glasses
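The merging step above can be sketched as a simple label lookup table. The original-dataset IDs below are assumptions for illustration (e.g. that left/right eyebrow are IDs 2 and 3); the real mapping should follow the dataset's own label list:

```python
import numpy as np

# Hypothetical original -> merged label mapping (assumed source IDs;
# adjust to the actual dataset's label list). Paired left/right classes
# collapse to one ID; unneeded classes (hair, hat, ...) go to background.
MERGE_MAP = {
    0: 0,              # background
    1: 1,              # face skin
    2: 2, 3: 2,        # left/right eyebrow -> eyebrows
    4: 4, 5: 4,        # left/right eye -> eyes
    6: 6,              # nose
    7: 7, 8: 7, 9: 7,  # lips / inner mouth -> mouth
    13: 13, 14: 13,    # left/right ear -> ears
    16: 16,            # glasses
    # everything else (hair, hat, necklace, ...) -> background (0)
}

def merge_classes(mask: np.ndarray, num_original: int = 18) -> np.ndarray:
    """Remap an integer mask through a lookup table; unmapped IDs become 0."""
    lut = np.zeros(num_original, dtype=mask.dtype)
    for src, dst in MERGE_MAP.items():
        lut[src] = dst
    return lut[mask]
```

A lookup table is preferable to per-pixel dictionary lookups because the remap then runs as a single vectorized indexing operation over the whole mask.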
Evaluation Metrics: Precision, Recall, Dice Score (F1 Score), Jaccard Score (mIoU Score)
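All four metrics can be derived from the per-class confusion counts of the predicted and ground-truth masks. A minimal NumPy sketch (the training code itself would typically use the framework's metric implementations):

```python
import numpy as np

def per_class_metrics(y_true, y_pred, cls, eps=1e-7):
    """Precision, recall, Dice (F1) and Jaccard (IoU) for one class label,
    computed from integer masks of identical shape."""
    t = (y_true == cls)
    p = (y_pred == cls)
    tp = np.sum(t & p)          # pixels correctly labelled as `cls`
    fp = np.sum(~t & p)         # pixels wrongly labelled as `cls`
    fn = np.sum(t & ~p)         # `cls` pixels the model missed
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, dice, iou
```

Averaging the per-class IoU over all 8 classes gives the mIoU reported above.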
Loss: Dice Loss + Categorical Cross-Entropy Loss + Jaccard Loss
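The combined loss sums the three terms over one-hot targets and softmax outputs. The sketch below is a NumPy illustration of the idea under the assumption of equal (unit) weights for the three terms; the actual training code would express the same formulas in the framework's tensor ops:

```python
import numpy as np

def combined_loss(y_true, y_pred, eps=1e-7):
    """Dice + categorical cross-entropy + Jaccard loss for tensors of
    shape (N, H, W, C): y_true one-hot, y_pred softmax probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    inter = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred)
    dice_loss = 1.0 - 2.0 * inter / (union + eps)
    jaccard_loss = 1.0 - inter / (union - inter + eps)
    cce_loss = -np.mean(np.sum(y_true * np.log(y_pred), axis=-1))
    return dice_loss + cce_loss + jaccard_loss
```

Dice and Jaccard terms directly optimize the overlap metrics reported above, while cross-entropy provides smooth per-pixel gradients early in training.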
The images are very large, with varying orientation (portrait as well as landscape). So, for this project, I resized all images and masks to 256 x 256.
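One detail worth noting when resizing: masks must use nearest-neighbour sampling so that class IDs stay exact (bilinear interpolation would blend label values into meaningless intermediates). A minimal NumPy sketch of nearest-neighbour mask resizing; the images themselves would be resized with a library call such as PIL's `Image.resize`:

```python
import numpy as np

def resize_mask_nearest(mask: np.ndarray, size=(256, 256)) -> np.ndarray:
    """Nearest-neighbour resize for a 2-D integer mask: pick, for each
    output pixel, the source pixel its coordinates map back onto."""
    h, w = mask.shape
    rows = (np.arange(size[0]) * h // size[0]).clip(0, h - 1)
    cols = (np.arange(size[1]) * w // size[1]).clip(0, w - 1)
    return mask[rows[:, None], cols]
```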
Pixel-wise distribution of classes per image:
- I used PSPNet with a ResNet50 backbone and DeepLabv3+ with an EfficientNetB0 backbone
- I used only 4000 training images and 800 validation images because of limited compute infrastructure
- I trained the models for 39 epochs with an initial learning rate of 0.1, using the SGD (Stochastic Gradient Descent) optimizer and Reduce Learning Rate on Plateau (patience of 4, factor of 0.5)
- I also used an Early Stopping callback with a patience of 10 epochs to prevent overfitting
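The scheduling logic of those two callbacks can be sketched in a few lines. In practice the Keras callbacks of the same names (`ReduceLROnPlateau`, `EarlyStopping`) handle this during training; this standalone class just mirrors their behaviour for illustration:

```python
class PlateauScheduler:
    """Minimal sketch of Reduce-LR-on-Plateau (patience 4, factor 0.5)
    combined with Early Stopping (patience 10), tracking validation loss."""

    def __init__(self, lr=0.1, factor=0.5, lr_patience=4, stop_patience=10):
        self.lr = lr
        self.factor = factor
        self.lr_patience = lr_patience
        self.stop_patience = stop_patience
        self.best = float("inf")
        self.bad_epochs = 0      # epochs since the last improvement

    def step(self, val_loss):
        """Record this epoch's validation loss; return False to stop training."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            # halve the learning rate every `lr_patience` stagnant epochs
            if self.bad_epochs % self.lr_patience == 0:
                self.lr *= self.factor
        return self.bad_epochs < self.stop_patience
```

(The real Keras callbacks keep separate counters and reset them after each LR reduction; this sketch simplifies that detail.)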
- PSPNet
Learning Rate | Loss | F1 | IoU |
---|---|---|---|
- DeepLabv3+
Learning Rate | Loss | F1 | IoU |
---|---|---|---|
- PSPNet
Image | Ground Truth | Prediction |
---|---|---|
- DeepLabv3+
Image | Ground Truth | Prediction |
---|---|---|
- On the Complete Dataset (Train: 4000 and Val: 800 images):
- On the Test Dataset (7766 images):
- Try training the models with the complete dataset
- Try training the models with different loss functions
- Try training DeepLabv3+ and PSPNet with different backbones
- Try different models such as U-Net, LinkNet, and ICNet with different backbones
- Try training the models for more epochs
- Try gathering more training data through new annotations and augmentations
- Try training the models further with a smaller learning rate, using Adam or SGD with momentum