Hello :) Today is Day 187!
A quick summary of today:
- creating a docker container with postgres db, pgadmin, grafana and fastapi
- a small Glaswegian project update
Docker and I are becoming closer and closer friends ^^
Before going into that, I will go in order of things I did today.
All files and code from today, and everything up to this point, are on my repo.
Firstly, monitoring with Evidently
I learned about Evidently from the mlops zoomcamp, and it looked very easy to use, so I gave it a try.
To generate the reports, I decided to use my data from before creating dummies. The balanced random forest classifier model uses a dataset made up entirely of dummies (all of the data is categorical), so to make the monitoring a bit more understandable, I used the ‘pre-dummy’ dataset. Below are some parts of the reports generated by Evidently.
There is variable info:
Missing values:
Drift:
Model performance:
The reports (as html) are on my repo in the monitoring/evidently_reports folder.
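To illustrate why the ‘pre-dummy’ data is easier to read in a report, here is a minimal sketch in plain Python. The rows and the column names (`age`, `housing`) are made up for illustration and are not from the real dataset; `one_hot` is a hypothetical helper, not the encoding code used in the project.

```python
def one_hot(rows, column):
    """Replace one categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column as it is.
        new_row = {k: v for k, v in row.items() if k != column}
        # Add one dummy column per category value.
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded


# Hypothetical rows; the real dataset and its columns are not shown here.
pre_dummy = [
    {"age": 34, "housing": "rent"},
    {"age": 51, "housing": "own"},
    {"age": 27, "housing": "other"},
]
dummies = one_hot(pre_dummy, "housing")

print(list(pre_dummy[0]))  # column names before creating dummies
print(list(dummies[0]))    # column names after creating dummies
```

A per-column report on the second version would show one entry per category instead of one entry per feature, which gets hard to read quickly when there are many categorical features.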
Next ~ Grafana
In the zoomcamp we were taught both Evidently and Grafana, and I see that Grafana is a more flexible monitoring tool that makes it easier to integrate data from multiple sources.
I did start a grafana server, however I started thinking about PostgreSQL.
I started to think that I do not want to use GCS for all my files - it might be better practice and a good exercise to think of GCS as a storage for keeping just some raw files, while I can use a postgres server for intermediate data.
So I started to set up Grafana and a postgres server, and then I added pgAdmin using Docker. Then I decided to combine these services with the FastAPI docker service I established yesterday, so everything runs together. And this is where I spent most of my time today.
I had a few small folder mapping problems, but the biggest issue was that I kept getting ‘anonymous credentials cannot be refreshed’ from Google. I use GCS to download a model for FastAPI. It worked completely fine locally, but when I ran the FastAPI server from Docker the error kept coming up. I kept just changing paths to try to point the env to the credentials file, but nothing worked. In the end… I went back to what I had originally (this was after ~2 hours of trying to fix it) and it ran… haha
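The exact cause of the error was never pinned down here, but one check that can help narrow down this kind of problem is confirming, inside the container, that the credentials path is actually visible. Below is a minimal sketch. It assumes the credentials file is passed through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable (the one Google's auth libraries look for); `find_credentials` is a hypothetical helper name, not something from the project.

```python
import os
from pathlib import Path


def find_credentials(env=None):
    """Return the credentials path from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(raw)
    if not path.is_file():
        # Typical when the file was not mounted or copied into the container.
        raise RuntimeError(f"credentials file not found at {path}")
    return path


if __name__ == "__main__":
    try:
        print(f"Using credentials file: {find_credentials()}")
    except RuntimeError as err:
        print(f"Credentials problem: {err}")
```

Running this at container start (before the FastAPI app tries to download the model) would at least separate "the file is not there" from "the file is there but something else is wrong".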
So now, with the command make start-all, I run postgres, pgAdmin, grafana, and FastAPI.
Since I have postgres now, I left some TODOs for tomorrow: remove all reads of local file data and read from postgres instead (rather than relying on GCS only). I think I have 4 such TODOs.
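As a starting point for those TODOs, here is a small sketch of building the postgres connection URL from environment variables, so the same code can work locally and inside Docker. `POSTGRES_USER`, `POSTGRES_PASSWORD` and `POSTGRES_DB` are the variables the official postgres image reads; `POSTGRES_HOST`, `POSTGRES_PORT`, the default host name `postgres` and the helper `postgres_url` are assumptions for illustration only.

```python
import os
from urllib.parse import quote


def postgres_url(env=None):
    """Build a postgresql:// URL from environment variables."""
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "postgres")
    password = env.get("POSTGRES_PASSWORD", "")
    # Assumed: the database is reachable under its compose service name.
    host = env.get("POSTGRES_HOST", "postgres")
    port = env.get("POSTGRES_PORT", "5432")
    database = env.get("POSTGRES_DB", user)
    # Percent-encode user and password so special characters do not break the URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


# Example values only; not the real settings.
example_env = {
    "POSTGRES_USER": "app",
    "POSTGRES_PASSWORD": "p@ss",
    "POSTGRES_DB": "project",
}
print(postgres_url(example_env))
```

The resulting URL can then be handed to whichever postgres client ends up being used, so each TODO only needs to swap a local file read for a query.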
On another note, a small Glaswegian dataset update
We are up to 86 minutes of audio. And our fine-tuned Whisper is performing very well (according to my Scottish project partner), so I will try to help a bit with the transcription and send just some notes on what he might need to double-check to make sure Whisper and I transcribed a clip correctly.
That is all for today!
See you tomorrow :)