Hello :) Today is Day 153!
A quick summary of today:
- started Module 3: Orchestration and ML pipelines from the MLOps zoomcamp
It is my 26th birthday today, so my study time was pretty limited; I studied during my 2-hour bus ride home from Seoul.
Below is the full outline:
I only got to complete 3.1 Data preparation, and below are some pics/notes I took. Before this, the MLOps zoomcamp covered MLflow, experiment tracking, and model management. I guess orchestration is the next step, which automates the whole data prep, training, and testing process (note: I am not sure what else it covers so far).
The 2024 cohort of the camp uses mage.ai as a free-to-use platform, so today’s and the following steps will be done on it.
First, I set it up using Docker:
git clone https://github.com/mage-ai/mlops.git
cd mlops
./scripts/start.sh
And a local mage.ai web app was started.
In part 3.1 we created a data preparation pipeline, and here is the final version:
It is not hard to use and fairly easy to set up. Each block is a piece of code. For example, the ingest block is:
It downloads taxi data. Then, when we create the next block after it, it seems to automatically chain the previous block's outputs to the current block's inputs. Below is the second ‘prepare’ block:
And the final ‘build’ block is where the dataset is split and the data is vectorized (using util functions).
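To make the chaining idea concrete for myself, here is a minimal sketch in plain Python. It is not the actual code from the mage.ai blocks and it does not use Mage's API. The toy rows, the 12-minute filter, the `run_pipeline` helper, and the hand-rolled one-hot encoding are all assumptions made up for illustration, standing in for the real download, cleaning, and util functions.

```python
import random
from typing import Callable, Dict, List

Row = Dict[str, object]


def ingest() -> List[Row]:
    """Stand-in for the download step: returns hard-coded toy trips."""
    zones = ["A", "B", "C"]
    return [
        {"pickup": zones[i % 3], "dropoff": zones[(i + 1) % 3], "minutes": 5.0 + i}
        for i in range(10)
    ]


def prepare(rows: List[Row]) -> List[Row]:
    """Hypothetical cleaning rule: keep trips of at most 12 minutes."""
    return [r for r in rows if r["minutes"] <= 12.0]


def build(rows: List[Row], train_fraction: float = 0.75, seed: int = 0) -> dict:
    """Shuffle, split into train/validation, and one-hot encode the zones."""
    rows = rows[:]  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    train, val = rows[:cut], rows[cut:]

    # The vocabulary is built from the training rows only.
    keys = ("pickup", "dropoff")
    columns = sorted({f"{k}={r[k]}" for r in train for k in keys})

    def encode(row: Row) -> List[int]:
        present = {f"{k}={row[k]}" for k in keys}
        return [1 if col in present else 0 for col in columns]

    return {
        "columns": columns,
        "X_train": [encode(r) for r in train],
        "y_train": [r["minutes"] for r in train],
        "X_val": [encode(r) for r in val],
        "y_val": [r["minutes"] for r in val],
    }


def run_pipeline(blocks: List[Callable]):
    """Call the first block, then feed each output into the next block."""
    output = blocks[0]()
    for block in blocks[1:]:
        output = block(output)
    return output


dataset = run_pipeline([ingest, prepare, build])
print(len(dataset["X_train"]), len(dataset["X_val"]), dataset["columns"])
```

The point is just that each block only needs to know about its own input and output, and the runner (here my toy `run_pipeline`, on the platform presumably the orchestrator itself) takes care of passing data along.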
I will try to complete the rest of the module in the coming week ^^
That is all for today!
See you tomorrow :)