Hello :) Today is Day 189!
A quick summary of today:
- completed the project and wrote project description
The whole project and all the info are in my repo
Here is a project diagram I created using lucid.app
Well ~ today I added some final things to my project.
First, I added Terraform code to create GCP buckets for the MLflow artifacts and the raw data, and also to start a VM.
The VM part is kind of cool: when the VM starts, MLflow starts automatically as well, because the command to start MLflow is included in the metadata_startup_script
which is:
I also added the whole terraform init, plan and apply sequence to the project’s Makefile for easy setup and startup. So now the make help looks like:
Next, I added pylint
Following the suggestions from MLOps zoomcamp, I added pylint and fixed a lot of whitespace and import-order issues that it flagged.
Next, I tried to add some tests using pytest
I added 5 simple tests to some of the prefect flows.
The exact tests are:
- test_read_new_data
- test_predict
- test_save_predictions
- test_batch_model_predict
- test_local_to_gcs
But there is still work to be done for more thorough testing. One thing that was nice about Mage.ai is that every block of code ends with a dedicated place for tests, so adding tests was very easy (which is not the case for Prefect).
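To give an idea of what one of these simple tests can look like, here is a minimal sketch. It is not the code from my repo: `MeanModel` and this version of `predict` are hypothetical stand-ins, written in plain Python so the example runs on its own. With Prefect 2, I believe the undecorated function behind a task is available as `.fn`, so a similar test could call that instead of starting a flow run.

```python
# Hypothetical stand-ins for illustration only; the real flows in the repo differ.
class MeanModel:
    """Toy model that always predicts the same value."""

    def __init__(self, mean: float) -> None:
        self.mean = mean

    def predict(self, rows: list) -> list:
        # One prediction per input row.
        return [self.mean for _ in rows]


def predict(model, rows: list) -> list:
    """Return one prediction per input row, or an empty list for no rows."""
    if not rows:
        return []
    return model.predict(rows)


def test_predict():
    rows = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
    preds = predict(MeanModel(mean=0.5), rows)
    # The number of predictions must match the number of rows.
    assert len(preds) == len(rows)
    assert all(isinstance(p, float) for p in preds)


def test_predict_empty_input():
    # An empty batch should not crash and should return no predictions.
    assert predict(MeanModel(mean=0.5), []) == []
```

Running `pytest` in the folder containing this file would collect both functions because their names start with `test_`.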
Next, I added python documentation
I used Sphinx back in my placement year with Lloyds Banking Group, and I have been adding small docstrings to my code, so adding Sphinx-generated Python docs seemed like a nice little addition. Setting it up was easy, but I could not get GitHub Actions to automate the build, so I used an alternative:
I upload the _build folder (normally not pushed to GitHub because it contains many files) to a separate branch, and Netlify watches that branch and updates the docs whenever it changes
Here is the link to the netlify hosted python docs for the project.
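For anyone wondering what kind of docstring Sphinx can pick up, here is a small sketch using the reStructuredText field style (`:param:`, `:returns:`, `:raises:`). The function `build_gcs_uri` is a hypothetical helper made up for this example, not something from my project.

```python
def build_gcs_uri(bucket: str, blob_path: str) -> str:
    """Build a ``gs://`` URI for an object in a GCS bucket.

    :param bucket: Name of the bucket, without the ``gs://`` prefix.
    :param blob_path: Path of the object inside the bucket.
    :returns: The full ``gs://bucket/path`` URI.
    :raises ValueError: If ``bucket`` is empty.
    """
    if not bucket:
        raise ValueError("bucket must not be empty")
    # Drop any leading slash so the URI never contains a double slash.
    return f"gs://{bucket}/{blob_path.lstrip('/')}"
```

With the autodoc extension enabled, Sphinx can render the parameter, return and exception fields of a docstring like this as a formatted API page.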
Finally, I added some basic git pre-commit hooks
- trailing-whitespace
- end-of-file-fixer
- check-yaml
- check-added-large-files
- pytest-check
- pylint
These are run before every commit, and the commit is stopped if any of them fails.
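To show the idea behind the first two hooks and the "stop the commit" behaviour, here is a simplified sketch. It is my own illustration, not the code of the actual hooks (as far as I know the real ones also fix the files for you), and `run_checks` is a made-up name. The key point is the exit code: Git aborts the commit when its pre-commit hook exits with a non-zero status.

```python
def has_trailing_whitespace(text: str) -> bool:
    """True if any line ends with spaces or tabs."""
    return any(line != line.rstrip() for line in text.splitlines())


def ends_with_single_newline(text: str) -> bool:
    """True if the text is empty or ends with exactly one newline."""
    return text == "" or (text.endswith("\n") and not text.endswith("\n\n"))


def run_checks(text: str) -> int:
    """Run both checks and return an exit code: 0 means pass, 1 means fail."""
    failures = []
    if has_trailing_whitespace(text):
        failures.append("trailing-whitespace")
    if not ends_with_single_newline(text):
        failures.append("end-of-file-fixer")
    for name in failures:
        print(f"{name}: Failed")
    return 1 if failures else 0
```

For example, `run_checks("x = 1\n")` returns 0, while a line with a trailing space or a file without a final newline makes it return 1.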
After this, I started putting things together and writing the README of the project, and creating the project diagram (top of this blog).
There are things to improve, but nevertheless DataTalksClub - THANK YOU SO MUCH! MLOps zoomcamp is such an amazing course, and it taught me amazing tools to build amazing things.
From now on, I will continue with the ongoing LLM zoomcamp - hopefully I will build an amazing project from there as well!
That is all for today!
See you tomorrow :)