(Day 307) Covering sklearn's MOOC on stream

Ivan Ivanov · November 3, 2024

Hello :) Today is Day 307!

A quick summary of today:

  • streamed twice and covered modules 2 and 3 of sklearn’s MOOC

Stream

Today’s stream was longer than expected, at about 3 hours.

I covered a dbt course:

However, I only watched the videos :/ Once again, that is because I do not want to deal with GCP/AWS. Nevertheless, it was a decent course.

After that I covered module 2 of sklearn’s MOOC:

Here is the overview:

  • understand the concept of overfitting and underfitting
  • understand the concept of generalization
  • understand the general cross-validation framework used to evaluate a model
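
To make these three ideas concrete for my own review, here is a minimal sketch (my own toy example, not taken from the MOOC). It assumes a synthetic dataset from make_classification and a decision tree whose max_depth I vary; the dataset, the depths, and the variable names are all my own choices for illustration. Comparing the mean train score with the mean test score across folds is one common way to spot underfitting (both low) and overfitting (train much higher than test).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, chosen only for illustration (not the MOOC's dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

results = {}
for max_depth in (1, 5, None):  # None lets the tree grow until leaves are pure
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    # 5-fold cross-validation; return_train_score=True also reports
    # the score on the training part of each fold.
    cv = cross_validate(model, X, y, cv=5, return_train_score=True)
    train_mean = cv["train_score"].mean()
    test_mean = cv["test_score"].mean()
    results[max_depth] = (train_mean, test_mean)
    print(f"max_depth={max_depth}: train={train_mean:.3f}, test={test_mean:.3f}")
```

A large gap between the train and test scores for the unrestricted tree would point to overfitting, while a very shallow tree scoring low on both would point to underfitting. The test score is the estimate of how well the model generalizes.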

These are all basic concepts, but they are very nicely explained, and I want to do a comprehensive review before taking sklearn’s official exam.

I decided to do a late-night stream as well and continue with Module 3: Hyperparameter tuning

Here are its main takeaways:

  • Hyperparameters have an impact on the models’ performance and should be wisely chosen
  • The search for the best hyperparameters can be automated with a grid-search approach or a randomized search approach
  • A grid-search is expensive and does not scale when the number of hyperparameters to optimize increases. Besides, the combinations are sampled only on a regular grid
  • A randomized-search allows a search with a fixed budget even with an increasing number of hyperparameters. Besides, the combinations are sampled on a non-regular grid
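
The difference between the two searches can be sketched roughly like this. This is my own small example, not the MOOC's notebook: the pipeline, the two tuned hyperparameters (C and tol of a logistic regression), the value ranges, and the budget of 8 iterations are all assumptions picked for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, chosen only for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Grid search: every combination on a regular grid is tried (4 * 3 = 12 here),
# so the cost multiplies with each extra hyperparameter.
grid = GridSearchCV(
    model,
    param_grid={
        "logisticregression__C": [0.01, 0.1, 1, 10],
        "logisticregression__tol": [1e-4, 1e-3, 1e-2],
    },
    cv=3,
)
grid.fit(X, y)

# Randomized search: n_iter fixes the budget no matter how many
# hyperparameters are listed; values are drawn from distributions
# instead of a regular grid.
rand = RandomizedSearchCV(
    model,
    param_distributions={
        "logisticregression__C": loguniform(1e-3, 1e2),
        "logisticregression__tol": loguniform(1e-5, 1e-1),
    },
    n_iter=8,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

n_grid = len(grid.cv_results_["params"])
n_rand = len(rand.cv_results_["params"])
print(f"grid search tried {n_grid} combinations, best: {grid.best_params_}")
print(f"randomized search tried {n_rand} combinations, best: {rand.best_params_}")
```

Adding a third hyperparameter with 5 values would push the grid to 60 combinations, while the randomized search would still stop at n_iter.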

Here is the link to the late-night stream (no mic/cam, about 1.5 hours)

I am glad it was shorter because I got really sleepy towards the end.


That is all for today!

See you tomorrow :)