(Day 65) Stanford CS224N (NLP with DL) - Multimodal DL and Model analysis and explanation

Ivan Ivanov · March 6, 2024

Hello :) Today is Day 65!

A quick summary of today:

  • Lecture 16: Multimodal deep learning
  • Lecture 17: Model analysis and explanation
  • (just watched) Lectures 18 and 19: Future of NLP, and Model Interpretability & Editing

First I will share my notes, and then just a quick summary and thoughts on the course.

Lecture 16: Multimodal deep learning

[Images: handwritten notes for Lecture 16]

This lecture gave me a lot of research papers to read. Two of them are "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (the Vision Transformer paper) and "Learning Transferable Visual Models From Natural Language Supervision" (the CLIP paper).

Lecture 17: Model analysis and explanation

[Images: handwritten notes for Lecture 17]

Quick summary and thoughts on the course

I wrote a lot of notes, and I am glad I did - writing things down by hand definitely helps me remember them better.

Definitely a very comprehensive, high-quality course, and I am glad I decided to commit to doing it. It went back to the start of NLP, through human languages > word vectors > word2vec > seq2seq > RNNs > LSTMs > Transformers, and also covered code generation, natural language generation, and how LLMs work - pretraining, finetuning, evaluation metrics. Amazing.

Maybe someday I will pay for XCS224N if I have the money, so that I can see lectures about current advances in the field. But this course (half of which was from 2021, half from 2023) definitely helped me build a sturdy base for my future NLP journey. Thank you to Professor Chris Manning and head TA John Hewitt - they were top-notch ^^

That is all for today!

See you tomorrow :)