(Day 42) Creating a UNet with PyTorch

Ivan Ivanov · February 12, 2024

Hello :) Today is Day 42!

A quick summary of today:

  • Road segmentation with UNet
  • Forest area segmentation with UNet
  • Human segmentation with UNet

But why build one myself? I could easily load a well-made (even pretrained) UNet and it would work great on the images I found. Well, I wanted to follow the model from yesterday, which was written in TensorFlow, translate it to PyTorch, and use it on new data.

Was my attempt successful? Not exactly haha. Why? Training takes a lot of time. I ran out of free GPU resources and started using the CPU (super slow ㅜㅜ). Adjusting the learning rate helped, and adjusting other hyperparameters would probably help too, but each run takes too long without free GPU hours ㅜㅜ

I made 3 attempts on 3 different datasets.

image

This is the format that I tried to follow.

Model structure:

image image image image image image image

Since I followed a TF model and translated it, one potential issue is that I might have missed something or done something in the wrong order. But given that the results are not terrible, that possibility is low (though it still exists).
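For readers who cannot see the structure images, here is a minimal sketch of what a small UNet looks like in PyTorch. It is not the exact model from this post: the class names `DoubleConv` and `TinyUNet`, the depth (two encoder levels), and the `base=16` filter count are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class DoubleConv(nn.Module):
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class TinyUNet(nn.Module):
    """Hypothetical 2-level UNet: encoder, bottleneck, decoder with skips."""

    def __init__(self, in_ch=3, out_ch=1, base=16):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        # Encoder: filters double at each level.
        self.enc1 = DoubleConv(in_ch, base)
        self.enc2 = DoubleConv(base, base * 2)
        self.bottleneck = DoubleConv(base * 2, base * 4)
        # Decoder: upsample, concatenate the skip, then convolve.
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = DoubleConv(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = DoubleConv(base * 2, base)
        # 1x1 convolution producing one logit per pixel.
        self.head = nn.Conv2d(base, out_ch, kernel_size=1)

    def forward(self, x):
        s1 = self.enc1(x)                    # full resolution
        s2 = self.enc2(self.pool(s1))        # 1/2 resolution
        b = self.bottleneck(self.pool(s2))   # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)                 # raw logits, no sigmoid
```

One thing worth checking when translating from TensorFlow: Keras layers default to channels-last tensors (N, H, W, C) and concatenate skips on the last axis, while PyTorch convolutions expect (N, C, H, W), so the skip concatenation has to happen on `dim=1`.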

For the loss, I used DiceLoss (which I also learned about yesterday) together with binary cross-entropy. For data augmentation, I used albumentations with horizontal and vertical flips, each at 50% probability. And of course resizing to 128 (or 256).
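As a rough illustration of how Dice loss and binary cross-entropy can be combined, here is a minimal sketch. The post does not say how the two terms were weighted, so the equal-weight sum, the `smooth` constant, and the class name `DiceBCELoss` are assumptions. It expects raw logits and float masks of shape (N, 1, H, W).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiceBCELoss(nn.Module):
    """Hypothetical combined loss: BCE + (1 - Dice), equally weighted."""

    def __init__(self, smooth=1.0):
        super().__init__()
        self.smooth = smooth  # avoids division by zero on empty masks

    def forward(self, logits, targets):
        # BCE on raw logits (numerically safer than sigmoid + BCE).
        bce = F.binary_cross_entropy_with_logits(logits, targets)

        # Soft Dice per sample, computed on probabilities.
        probs = torch.sigmoid(logits)
        dims = (1, 2, 3)
        intersection = (probs * targets).sum(dim=dims)
        total = probs.sum(dim=dims) + targets.sum(dim=dims)
        dice = (2.0 * intersection + self.smooth) / (total + self.smooth)

        # Dice is 1 for a perfect overlap, so the loss term is 1 - Dice.
        return bce + (1.0 - dice).mean()
```

With a loss like this, a value around 0.5 to 0.65 is the sum of both terms, so it is not directly comparable to a plain Dice loss on the same data.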

Below are summaries and info about the 3 models.

1) Road segmentation with UNet

image

This dataset was actually a bit problematic: some of the pictures had these white spots, which ruined the model's performance, even though the ground truth shows that there are roads there. Some results were not too bad, at around 0.5 loss.

image

And here is an example of a picture that is not very good and just hurts the model (I could probably just delete those from the training set to improve performance).

image
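One possible way to drop such pictures automatically is to measure how much of each image is near-white and skip the ones above a cutoff. This is only a sketch: the helper names `white_fraction` and `keep_image`, the threshold of 250, and the 5% cutoff are all assumptions that would need tuning on the real dataset.

```python
import numpy as np


def white_fraction(image, threshold=250):
    """Fraction of pixels whose channels are all >= threshold.

    Expects a uint8 array of shape (H, W, 3).
    """
    near_white = (image >= threshold).all(axis=-1)
    return float(near_white.mean())


def keep_image(image, max_white=0.05):
    """Return True if the image has few enough near-white pixels to keep."""
    return white_fraction(image) <= max_white


# Example: a clean image versus one whose top half is blanked out.
clean = np.zeros((64, 64, 3), dtype=np.uint8)
spotted = clean.copy()
spotted[:32] = 255
```

Here `keep_image(clean)` passes while `keep_image(spotted)` does not, so the second image would be filtered out before training.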

2) Forest area segmentation with UNet

image

An example from this dataset ^ The validation loss after 25 epochs was 0.65, and below is a sample result. Definitely not great. The model needs more tuning for these images.

image

3) Human segmentation with UNet

Sample image (flipped in this case haha)

image
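The important detail with flips in segmentation is that the image and its mask must be flipped together, otherwise the labels no longer line up. The post used albumentations for this; the sketch below swaps in plain PyTorch ops just to show the idea, and the function name `random_flip_pair` is hypothetical.

```python
import torch


def random_flip_pair(image, mask, p=0.5, generator=None):
    """Flip image (C, H, W) and mask (1, H, W) together.

    Each flip direction is applied independently with probability p.
    """
    if torch.rand(1, generator=generator).item() < p:
        # Horizontal flip: reverse the width axis of both tensors.
        image, mask = image.flip(-1), mask.flip(-1)
    if torch.rand(1, generator=generator).item() < p:
        # Vertical flip: reverse the height axis of both tensors.
        image, mask = image.flip(-2), mask.flip(-2)
    return image, mask
```

Because both tensors go through the same branch, a pixel labelled as "human" in the mask still sits on top of the same pixel in the flipped image.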

After 100 epochs with a learning rate of 0.0001 (my last attempt), the loss was 0.6. And a sample result:

image

Not great. :(

Because the training was taking so long, I ran notebooks at the same time on Kaggle, Google Colab, and my laptop. I tried adjusting the learning rate, image size, and batch size. I avoided changing the filters in the UNet's layers to see whether the same structure could work for different images, but unfortunately today I could not get good performance. I am thinking of concentrating on just one of them and letting it run in the background while I learn or try something else tomorrow.

That is all for today!

See you tomorrow :)