Hello :) Today is Day 61!
A quick summary of today:
- Covered Lecture 7: machine translation, seq2seq, attention from Stanford CS224N
- Tried to make a conditional DCGAN to generate MNIST numbers (colab) (kaggle)
I will first cover the GAN story (then I will share my notes from the lecture).
So… while watching and taking notes today, I started thinking: what if I used my notes as training data for a model, so that later I could give it raw text and it would output that text in the format of my notes (in my handwriting)?
Well, I started looking around, and the first model architecture that came to mind was the GAN (specifically the conditional GAN). I remembered there was a GAN variant where, alongside the images, we can feed in the labels, and then generate on demand. In retrospect there are of course other options, but I decided to go with a GAN.
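If you are curious what "feeding in the labels" looks like in code, here is a minimal sketch of label conditioning - not my actual model, and fully-connected for brevity rather than the conv layers a DCGAN uses; the layer sizes are hypothetical:

```python
import torch
import torch.nn as nn

NUM_CLASSES = 10   # MNIST digits 0-9
NOISE_DIM = 100
EMBED_DIM = 10     # label embedding size (hypothetical choice)

class ConditionalGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        # Each class label gets a learned embedding vector.
        self.label_embed = nn.Embedding(NUM_CLASSES, EMBED_DIM)
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + EMBED_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, 28 * 28),
            nn.Tanh(),  # outputs in [-1, 1], matching normalized MNIST
        )

    def forward(self, noise, labels):
        # Concatenate the noise vector with the label embedding,
        # so the generator knows which digit to draw.
        x = torch.cat([noise, self.label_embed(labels)], dim=1)
        return self.net(x).view(-1, 1, 28, 28)

# On-demand generation: ask for a batch of "7"s.
gen = ConditionalGenerator()
noise = torch.randn(4, NOISE_DIM)
labels = torch.full((4,), 7, dtype=torch.long)
images = gen(noise, labels)   # shape: (4, 1, 28, 28)
```

The discriminator gets the same treatment: it embeds the label and concatenates it with the (flattened) image before classifying real vs. fake.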
For maybe 2 hours I busted my head trying to make a simple model with the EMNIST dataset (English characters), and I kept getting weird input-size issues. After a bit I read online that the number of classes for EMNIST in PyTorch is a bit weird (i.e.). What is more, I saw this:
This image with handwritten text was generated with a conditional deep convolutional GAN (repo link). My mind was hooked on the idea -> I want to do it too. But first I wanted to do it with just numbers, something a bit simpler (and without having to struggle with loading the EMNIST dataset).
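(A side note on the EMNIST class-count weirdness above: one quirk people commonly report - I am not certain it is the exact one I hit - is that torchvision's "letters" split labels the letters 1..26 instead of 0..25, so a model built for 26 classes sees out-of-range targets. A tiny hypothetical remap helper fixes that:)

```python
def remap_letters_labels(labels):
    """Shift 1-indexed EMNIST 'letters' labels (1..26) down to 0..25.

    Hypothetical helper: apply to the targets before feeding them to a
    26-class embedding or a cross-entropy loss.
    """
    return [label - 1 for label in labels]

print(remap_letters_labels([1, 26]))  # -> [0, 25]
```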
Thanks to Aladdin Persson's YouTube channel, I got reminded of how a GAN's architecture works, and soon I had a simple model and started training in Google Colab. After a few hours of adjusting parameters to optimize training and lower the loss, and also running inference to produce new number images, I had a working model. I was so happy. This felt soo long: first the EMNIST problems, then just getting a conditional DCGAN to work. Aaaand… my free Colab GPU time ran out in the middle of a run, and everything was lost because I had not saved any weights :/ I did not even take a screenshot of the generated output numbers - and they were amazing, human-like. Right now, as I am writing this, there is no output in the Colab, but hopefully by the time someone reads this you will be able to see sample results (I will rerun it when I get my free GPU hours back).
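Lesson learned the hard way: checkpoint during training so a runtime disconnect does not wipe everything. A minimal sketch of what I should have had in my loop (the `generator`, `discriminator`, and optimizer names are hypothetical placeholders, not my actual code):

```python
import torch

def save_checkpoint(path, epoch, generator, discriminator, opt_g, opt_d):
    # Bundle everything needed to resume training into one file:
    # model weights AND optimizer state (Adam keeps running averages).
    torch.save({
        "epoch": epoch,
        "gen_state": generator.state_dict(),
        "disc_state": discriminator.state_dict(),
        "opt_g_state": opt_g.state_dict(),
        "opt_d_state": opt_d.state_dict(),
    }, path)

# Inside the training loop, e.g. every few epochs:
# if epoch % 5 == 0:
#     save_checkpoint("ckpt.pt", epoch, generator, discriminator, opt_g, opt_d)
```

Resuming is the mirror image: `torch.load` the file, then `load_state_dict` on each model and optimizer. Saving to Google Drive (mounted in Colab) instead of the ephemeral runtime disk is what actually survives a disconnect.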
But I am also running it on Kaggle right now; I will grab a generated output picture and go to sleep, because it is almost 2am.
So, this is the result. Not too bad. But in Colab I could use TensorBoard (there is a way in Kaggle too, but it gives me an error :/ ). (Link to the Kaggle notebook)
As for my notes from Lecture 7:
That is all for today, tomorrow we continue with CS224N! :)
See you tomorrow :)