Hello :) Today is Day 307!
A quick summary of today:
- streamed twice and covered modules 2 and 3 of sklearn’s MOOC
Stream
Today’s stream ran longer than expected, at about 3 hours.
I covered a dbt course:
However, once again I only watched the videos :/ because I do not want to deal with GCP/AWS. Nevertheless, it was a decent course.
After that I covered module 2 of sklearn’s MOOC:
Here is the overview:
- understand the concept of overfitting and underfitting
- understand the concept of generalization
- understand the general cross-validation framework used to evaluate a model
These are all basic concepts, but they are explained very nicely, and I want to do a comprehensive review before taking sklearn’s official exam.
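To make the module 2 ideas concrete, here is a minimal sketch of how cross-validation can expose underfitting and overfitting. It is not taken from the MOOC: the synthetic dataset, the decision tree model, and the `max_depth` values are assumptions chosen purely for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import ShuffleSplit, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (an assumption for illustration, not the MOOC dataset).
X, y = make_regression(n_samples=500, n_features=10, noise=20.0, random_state=0)

# Cross-validation: repeat the train/test split several times so we get a
# distribution of scores instead of a single, possibly lucky, number.
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

results = {}
for max_depth in (1, 5, None):  # None lets the tree grow until it memorizes the training set
    model = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
    scores = cross_validate(
        model,
        X,
        y,
        cv=cv,
        scoring="neg_mean_absolute_error",
        return_train_score=True,
    )
    # scikit-learn negates error metrics so that "higher is better"; flip the sign back.
    train_mae = -scores["train_score"].mean()
    test_mae = -scores["test_score"].mean()
    results[max_depth] = (train_mae, test_mae)
    print(f"max_depth={max_depth}: train MAE={train_mae:.1f}, test MAE={test_mae:.1f}")
```

Reading the output: when both the train and test errors are high, the model is likely underfitting; when the train error is close to zero but the test error is much larger, the model is overfitting and does not generalize.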
I decided to do a late night stream as well, and continue with Module 3: Hyperparameter tuning
Here are its main takeaways:
- Hyperparameters have an impact on the model’s performance and should be chosen wisely
- The search for the best hyperparameters can be automated with a grid-search approach or a randomized-search approach
- A grid-search is expensive and does not scale when the number of hyperparameters to optimize increases. Besides, the combinations are sampled only on a regular grid
- A randomized-search allows a search with a fixed budget even with an increasing number of hyperparameters. Besides, the combinations are sampled on a non-regular grid
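The difference between the two search strategies can be sketched roughly as follows. This is not the MOOC's own example: the synthetic dataset, the `SVC` model, and the parameter ranges are assumptions for illustration only.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic classification data (an assumption, not the MOOC dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = make_pipeline(StandardScaler(), SVC())

# Grid search: every combination on a regular grid is evaluated.
# 4 values of C x 4 values of gamma = 16 candidates; the count multiplies
# with each extra hyperparameter.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}
grid_search = GridSearchCV(model, param_grid=param_grid, cv=5)
grid_search.fit(X, y)

# Randomized search: a fixed budget of n_iter candidates, each drawn from
# continuous distributions, so the sampled points do not sit on a regular grid.
param_distributions = {
    "svc__C": loguniform(1e-1, 1e2),
    "svc__gamma": loguniform(1e-3, 1e0),
}
random_search = RandomizedSearchCV(
    model,
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    random_state=0,
)
random_search.fit(X, y)

n_grid = len(grid_search.cv_results_["params"])
n_random = len(random_search.cv_results_["params"])
print(f"grid search:       {n_grid} candidates, best params {grid_search.best_params_}")
print(f"randomized search: {n_random} candidates, best params {random_search.best_params_}")
```

Adding a third hyperparameter with 4 values would push the grid to 64 candidates, while the randomized search would still evaluate only `n_iter` candidates.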
Here is the link to the late night stream with no mic/cam ~1.5hrs
I am glad it was shorter, because I got really sleepy towards the end.
That is all for today!
See you tomorrow :)