(Day 234) Improving the Grafana dashboard and writing a final script for the KB project presentation

Ivan Ivanov · August 22, 2024

Hello :) Today is Day 234!

A quick summary of today:

  • worked on improving the Grafana dashboard
  • wrote a final script for the technical part of the presentation

I fell asleep earlier, woke up, and was about to go to bed for good, but I am writing this blog right before bed as I am a bit exhausted from today.

Mainly I played around with Grafana - building slightly more interesting dashboards so that the Grafana dashboard looks better in the real-time pipeline demo.

In the pictures below, the visualisations are mostly empty because I cleaned the database so that I get a clean start for tomorrow’s recording.

image

In the above picture, there are currently no fraud transactions, but if there were, the panel that says ‘No data in response’ would show the transaction amount distribution for fraud cases.

In addition, in the table at the top, if a transaction is labeled as fraud by our system, the row turns red. I also wanted to add a hyperlink to the Grafana alert dashboard, but I could not figure out how to do conditional formatting so that hyperlinks appear only when a row has a specific value. What I could figure out is adding a hyperlink to every value in the table, but that is not really useful to us.

image

The bottom part is similar: if there are fraud cases, one of the empty graphs will show a pie chart of the transactions’ categories. And the bottom-left (currently empty) table will show how many times there is a consensus between the models, and how many times there is only a majority vote (2 out of 3).

As for the script

This is just the technical part; it does not include the background/intro part or the going forward/improvements part.

Feature implementation architecture diagram
This is our overall architecture.
We use Docker to containerize everything so that it can easily be used across different systems.
Mage is used as an orchestrator that allows the training and real-time pipelines to be developed and easily run. Mage’s interface allows the pipelines to be run by both technical engineers and non-technical analysts.
We use Neo4j as our graph database as it is one of the most popular free, open-source, and widely used graph databases.

We develop two main pipelines: one for model training and one for real-time inference.

For the model training pipeline:
We developed the models using a few libraries: pytorch-geometric (PyG, for graph neural networks), scikit-learn, catboost, and xgboost.
We use MLflow for seamless model tracking and registry.
Using the models saved in MLflow, we provide a model dictionary that helps to better understand the decisions of the developed models and provides AI explainability features for both traditional machine learning methods and graph neural networks.
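As a rough illustration of the tracking/registry step, here is a minimal sketch of how a model could be logged and registered in MLflow; the tracking URI, experiment name, registered model name, and toy data are placeholders, not our actual values.

```python
# Minimal sketch: train a model, then log and register it in MLflow.
# The URI, experiment name, and model name below are placeholders.
import mlflow
import mlflow.xgboost
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Toy data standing in for the preprocessed transaction features
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

mlflow.set_tracking_uri("http://localhost:5000")   # assumed local tracking server
mlflow.set_experiment("fraud-detection")           # hypothetical experiment name

with mlflow.start_run():
    model = XGBClassifier(n_estimators=200, max_depth=6).fit(X_train, y_train)
    mlflow.log_metric("recall", recall_score(y_test, model.predict(X_test)))
    # Log and register the model so the real-time pipeline can later pull it from the registry
    mlflow.xgboost.log_model(model, artifact_path="model",
                             registered_model_name="fraud_xgboost")
```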

Next, about our real-time pipeline: 
To imitate real-time data, we set up a script that sends transactions to Kafka every second, and Kafka ensures their efficient processing (a rough sketch of such a producer is included after this overview).
For real-time inference, we run a real-time pipeline in Mage that takes transaction data from Kafka, pulls the selected models saved in MLflow, processes the transactions by adding predictions from our models, and inserts them into Neo4j.
For insights and analysis, we connect Neo4j to Grafana to create a dashboard that updates in real time. In Grafana, we also provide an alert dashboard that keeps track of suspicious transactions.
Finally, if our models predict that a transaction is suspicious, we immediately send a KakaoTalk message to the customer.
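Here is the producer sketch mentioned above, using the kafka-python client; the broker address, topic name, and input file name are illustrative assumptions, not the project’s actual values.

```python
# Hedged sketch of a "real-time" transaction producer: one message per second to Kafka.
import json
import time

import pandas as pd
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",                               # assumed broker address
    value_serializer=lambda v: json.dumps(v, default=str).encode("utf-8"),
)

transactions = pd.read_csv("realtime_holdout.csv")                    # hypothetical 30% hold-out file

for _, row in transactions.iterrows():
    producer.send("transactions", row.to_dict())                      # hypothetical topic name
    time.sleep(1)                                                     # one transaction per second
```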

Next, we will look into each part of this architecture in a bit more detail. 

Dataset (1)
We first needed to find real credit card data to detect fraudulent transactions. We looked into Kaggle, one of the most popular platforms for open-source data science, and found a dataset that we believe reflects real-life transaction data as it consists of user, merchant, and transaction information. The transactions in the dataset are from the US, cover the period 2019-2020, and total around 1.8 million. Just like in reality, the majority of transactions are not fraudulent: 99.4% are not fraud and only 0.6% are fraud cases. As mentioned before, we are developing two main pipelines - for model training and for real-time inference - so we decided to split our data 70% for model development and 30% to simulate real-time data.
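For illustration, the 70/30 split could look roughly like the sketch below; the file name, the timestamp column name, and the choice to split chronologically are assumptions on my part.

```python
# Illustrative 70/30 split of the Kaggle transactions; names and ordering are assumptions.
import pandas as pd

df = pd.read_csv("credit_card_transactions.csv")        # hypothetical file name
df = df.sort_values("trans_date_trans_time")            # assumed timestamp column, kept chronological

cutoff = int(len(df) * 0.7)
train_df = df.iloc[:cutoff]      # 70% for model development
realtime_df = df.iloc[cutoff:]   # 30% reserved to simulate real-time traffic
```
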
First, we will talk about the model training pipeline and what happens inside of it.

Dataset (2)
Given the complex and interconnected nature of transactions, utilising a graph database and a graph neural network becomes particularly advantageous for managing transaction data and improving fraud detection models. By capturing the relationships between transactions, these tools allow for more sophisticated analysis, uncovering patterns and anomalies that might be missed with traditional methods.

To use Neo4j, we first convert our table data into a graph format. Since, in real life, customers use credit cards at various merchants, we modelled our data similarly. We created nodes to represent CreditCards and Merchants, with Transactions as edges connecting CreditCards to Merchants.
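A hedged sketch of that node/edge model using the official Neo4j Python driver is below; the connection details and property names are illustrative, not the actual schema.

```python
# Sketch: insert one transaction as a CreditCard-[:TRANSACTION]->Merchant edge in Neo4j.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))  # placeholder credentials

CYPHER = """
MERGE (c:CreditCard {number: $cc_num})
MERGE (m:Merchant {name: $merchant})
CREATE (c)-[:TRANSACTION {amount: $amount, category: $category, time: $time}]->(m)
"""

def insert_transaction(tx_data):
    with driver.session() as session:
        session.run(CYPHER, **tx_data)

insert_transaction({"cc_num": "1234", "merchant": "fraud_store_42",
                    "amount": 42.5, "category": "grocery_pos", "time": "2020-06-01T12:00:00"})
```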

For training our models, during our feature selection process, we took all features into account and considered their implications for AI ethics, so we did not want to include features based on name, gender, or job. We ended up using only location information, like the longitude and latitude of the customer and merchant, and transaction information, like the amount, category, and date and time of occurrence.

After selecting our features, we did some preprocessing.

Dataset (3)

From our model development process, we found that downsampling the majority class fits our case the best, and this resulted in around 7,500 non-fraud and 7,500 fraud transactions left to train and test the models. Next, since we have text data such as credit card state, transaction category, date, and time, we convert these features into dummy variables to make them usable in our models. Our train/test split ratio is 80/20.
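A rough reconstruction of this preprocessing, under assumed column and file names, could look like the following sketch.

```python
# Downsample the majority class, one-hot encode categoricals, then split 80/20.
# Column names ("is_fraud", "state", "category") and the file name are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split

train_df = pd.read_csv("training_split.csv")             # the 70% model-development split (hypothetical file)

fraud = train_df[train_df["is_fraud"] == 1]
not_fraud = train_df[train_df["is_fraud"] == 0].sample(n=len(fraud), random_state=42)
balanced = pd.concat([fraud, not_fraud])                 # roughly 7,500 of each class

balanced = pd.get_dummies(balanced, columns=["state", "category"])   # text features -> dummy variables

X = balanced.drop(columns=["is_fraud"])
y = balanced["is_fraud"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)    # 80/20 train/test split
```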

Modelling (1) - Model information

Because fraud detection is inherently a graph problem, we wanted to construct a Graph Neural Network as our main model. A Graph Convolutional Network (GCN) is great for detecting fraud in transactions because it can handle large amounts of data efficiently - this is important because millions of transactions occur every second and we need a system that can handle this volume. In addition, GCNs understand and analyse complex connections between different credit cards and transactions. They look at how transactions are related to each other and can spot suspicious patterns that might be hard to see with other methods. A GCN updates each node’s features by aggregating and transforming features from its neighbours. It repeats this process across multiple layers to capture information from nodes farther away, and refines its representations to predict whether a transaction is fraud or not fraud.
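To make the message-passing idea concrete, here is a minimal two-layer GCN in pytorch-geometric; the layer sizes and the node-level output head are my assumptions for illustration rather than our exact architecture (in our graph, transactions are edges, so the real model may be set up differently).

```python
# Minimal two-layer GCN sketch in pytorch-geometric (layer sizes are illustrative).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class FraudGCN(torch.nn.Module):
    def __init__(self, num_features, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)   # aggregate features from 1-hop neighbours
        self.conv2 = GCNConv(hidden, 2)              # a second layer reaches 2-hop neighbourhoods

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=1)                # fraud / not-fraud scores
```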

We wanted to give extra help to our decision-making system, so we looked into more traditional machine learning models like Logistic Regression, XGBoost, RandomForest, CatBoost and LGBMClassifier. Among those, after hyperparameter tuning, XGBoost and CatBoost ended up performing the best on our test dataset. And the final hyperparameters can be seen. 
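For reference, the kind of search that could produce such hyperparameters might look like the sketch below; the parameter grid is made up for demonstration and is not the final set of values shown on the slide.

```python
# Illustrative hyperparameter search for XGBoost; the grid below is invented, not our final values.
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, random_state=0)   # toy stand-in for our features

param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [4, 6, 8],
    "learning_rate": [0.01, 0.05, 0.1],
}

search = RandomizedSearchCV(XGBClassifier(), param_grid, n_iter=10,
                            scoring="recall", cv=3, random_state=42)
search.fit(X, y)
print(search.best_params_)
```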

We looked into non-graph neural network models because while the GCN focuses on relationships between transactions, CatBoost and XGBoost enhance the system by providing high accuracy by analysing individual transactions. By providing a system with multiple models, we hope that all types of fraud are effectively detected, and our fraud detection system is more robust and reliable.

If our models are outdated, or if we detect data drift or some other anomaly that signifies our models need to be updated, we can easily do that in Mage - our orchestrator. And next, we will see a short demo on how to do that.

Modelling (2) - Model training demo / model training pipeline demo

- n/a

Modelling (3) - Model Dictionary

The model dictionary automatically picks up newly trained models and displays them in our UI hosted on Streamlit. The dictionary includes information related to model metrics, confusion matrices, feature importance, and SHAP values. The CatBoost and XGBoost models have SHAP values on their pages, as SHAP values help explain the predictions of machine learning models by attributing the contribution of each feature to the prediction. As for explaining our Graph Convolutional Network, neural networks are known for being black-box models, and as users we are not able to understand how they make decisions. However, the graph neural network library that we use (pytorch-geometric) provides explainability features which, just like for a non-neural-network model, can show us the feature importance.
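As an example of where the SHAP values come from, a minimal sketch for a tree-based model might look like this (toy data; the real dictionary uses our trained models pulled from MLflow):

```python
# Sketch: SHAP values for a tree-based classifier via TreeExplainer.
import shap
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)
model = XGBClassifier().fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # per-feature contribution to each prediction
shap.summary_plot(shap_values, X)        # overall feature-importance view, like in the dictionary
```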

Modelling (4) - Model test metrics summary

In our real-time pipeline, which we will talk about next, we currently use 3 different models - a Graph Convolutional Network, CatBoost, and XGBoost. This table shows the results of those models on our test dataset. For evaluation metrics, we chose Recall (TP / (TP + FN)) as the main one, as it measures how well the system identifies all the fraudulent transactions, which is crucial for minimising missed fraud cases. And as you can see, they all complement each other when it comes to both recall and accuracy.
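As a quick sanity check of the recall formula:

```python
# Recall = TP / (TP + FN): with 3 actual fraud cases and 2 caught, recall is 2/3.
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0]   # three actual fraud cases
y_pred = [1, 1, 0, 0, 0]   # the model catches two of them
print(recall_score(y_true, y_pred))   # 0.666...
```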

Real-time pipeline

Our real-time pipeline is designed for continuous and efficient processing of transaction data, ensuring timely predictions for fraud detection. The pipeline seamlessly integrates multiple components, from data ingestion to inference and storage, using industry-standard tools and our machine learning models.

(Suggestion: for completeness, we can include MLflow and KakaoTalk in the flow. Also, under Mage, it is better to write “real-time inference pipeline” (실시간 추론 파이프라인).)


Of the total dataset from Kaggle, we set aside 30% for our real-time pipeline development and execution.
We simulate real-world transactions using a custom script that generates and sends transaction data to Kafka every second. Kafka, known for its high-throughput and fault-tolerant messaging system, ensures that all incoming transaction data is captured in real time. As each transaction enters Kafka, it immediately goes through our real-time pipeline in Mage, ensuring minimal latency in data flow.
Mage pulls our 3 main models from MLflow (our model registry) and applies them to the incoming transactions.
For each transaction, predictions are generated using our models, and then the transaction data is sent to Neo4j.
We have connected Neo4j with Grafana, a widely used visualisation platform, where we provide a real-time dashboard for analysts to explore, as well as an alert dashboard that keeps track of fraudulent transactions and updates whenever one is found.

Finally, if a transaction is deemed fraudulent by our system, a KakaoTalk message is sent to the associated customer as a warning.
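Putting those steps together, a heavily simplified, hedged sketch of the inference loop might look like the code below; in the project this logic lives in Mage pipeline blocks, and the model name/stage, topic, connection details, and property names are placeholders.

```python
# Simplified sketch of the real-time inference step: Kafka -> model from MLflow -> Neo4j.
import json

import mlflow.pyfunc
import pandas as pd
from kafka import KafkaConsumer
from neo4j import GraphDatabase

model = mlflow.pyfunc.load_model("models:/fraud_xgboost/Production")   # assumed registry entry
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
consumer = KafkaConsumer("transactions", bootstrap_servers="localhost:9092",
                         value_deserializer=lambda v: json.loads(v.decode("utf-8")))

for message in consumer:
    tx = message.value
    features = pd.DataFrame([tx])                     # would mirror the training-time feature set
    tx["fraud_pred"] = int(model.predict(features)[0])
    with driver.session() as session:
        session.run(
            "MERGE (c:CreditCard {number: $cc_num}) "
            "MERGE (m:Merchant {name: $merchant}) "
            "CREATE (c)-[:TRANSACTION {amount: $amount, fraud_pred: $fraud_pred}]->(m)",
            **tx,
        )
```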

Sending messages with Kakao

Today, my lab mate gave me some suggestions for the real-time pipeline video that I shared yesterday. I wanted to implement them (like going slower, explaining things, just making the video clearer) and record a new video, but apparently my free account has an API limit on sending messages. It resets at midnight, so tomorrow morning, after I wake up, I will re-record the final video and edit it.
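For reference, this is roughly the call involved, using Kakao’s “send to me” message API; the access token is assumed to have been issued beforehand, the message and link are placeholders, and this is the kind of request the free-tier daily limit applies to.

```python
# Rough sketch of sending a KakaoTalk text message to yourself via Kakao's REST API.
import json
import requests

ACCESS_TOKEN = "YOUR_KAKAO_ACCESS_TOKEN"   # placeholder; issued via Kakao OAuth beforehand

def send_fraud_warning(text):
    url = "https://kapi.kakao.com/v2/api/talk/memo/default/send"
    headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
    template = {
        "object_type": "text",
        "text": text,
        "link": {"web_url": "https://example.com"},   # placeholder link required by the template
    }
    resp = requests.post(url, headers=headers,
                         data={"template_object": json.dumps(template)})
    return resp.status_code

print(send_fraud_warning("Suspicious transaction detected on your card."))
```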


That is all for today!

See you tomorrow :)