(Day 202) Setting up a Graph Convolutional Network model to detect fraudulent credit card transactions

Ivan Ivanov · July 21, 2024

Hello :) Today is Day 202!

A quick summary of today:

  • finally got a GNN to work
  • learned how to use Mage for data streaming pipelines

I picked up today where I left off yesterday: trying to build some kind of graph neural network to predict whether a transaction is fraudulent or not.

TL;DR (as it is ~3:20 am, another late night)

I ended up using PyTorch Geometric's homogeneous Data class, and the resulting data object looks something like this:

Data(x=[4290, 18], edge_index=[2, 23278], y=[4290], train_mask=[4290], test_mask=[4290])
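To make the shapes above a bit more concrete: x holds one row of 18 features per node, edge_index stores the edges as two parallel rows (sources and targets), and the masks are one boolean per node. Below is a tiny library-free sketch of that layout. The toy graph, the helper names (to_edge_index, make_masks), the 80/20 split and the choice to store each edge in both directions are all my assumptions for illustration, not the actual preprocessing.

```python
import random

# Toy graph: 4 nodes, edges given as (source, target) pairs.
# All values are made up for illustration.
edges = [(0, 1), (1, 2), (2, 3)]


def to_edge_index(pairs):
    """Return edges in the [2, num_edges] layout, storing both directions."""
    sources, targets = [], []
    for u, v in pairs:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]


def make_masks(num_nodes, train_fraction=0.8, seed=0):
    """Return boolean train/test masks that partition the nodes."""
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)
    cutoff = int(train_fraction * num_nodes)
    train_ids = set(order[:cutoff])
    train_mask = [i in train_ids for i in range(num_nodes)]
    test_mask = [not flag for flag in train_mask]
    return train_mask, test_mask


edge_index = to_edge_index(edges)
train_mask, test_mask = make_masks(num_nodes=4)
print(edge_index)
print(train_mask, test_mask)
```

In PyTorch Geometric these lists would be tensors passed to Data(x=..., edge_index=..., y=..., train_mask=..., test_mask=...).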

The preprocessing undersamples the majority class, so we end up with a balanced dataset.
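As a rough idea of what random undersampling means here, this is a minimal sketch, assuming each transaction is a dict with a binary label. The is_fraud key, the amount field and the sample rows are hypothetical, and the real preprocessing may differ.

```python
import random


def undersample(rows, label_key="is_fraud", seed=42):
    """Randomly drop majority-class rows until both classes have equal size."""
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    # Whichever class is smaller sets the target size for both classes.
    minority, majority = sorted([fraud, legit], key=len)
    rng = random.Random(seed)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Made-up data: 10 fraud rows out of 100.
rows = [{"amount": float(i), "is_fraud": int(i % 10 == 0)} for i in range(100)]
balanced = undersample(rows)
print(len(balanced), sum(r["is_fraud"] for r in balanced))
```

The trade-off is that most of the legitimate transactions are thrown away, which keeps training simple but means the test metrics describe a balanced sample rather than the real class distribution.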

The dataset has the following number of nodes and edges:

image

Neo4j is nice.

The model that works (at least for now, version 0.1) is:

image

After splitting data into train and test, the best model so far achieved the following results:

image

Accuracy: 0.8833, Precision: 0.8151, Recall: 0.9909, F1: 0.8945

image
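For reference, here is how these four metrics relate to the confusion matrix counts. This is a small sketch with hypothetical counts (not the real ones from my test set), and the function name is my own.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to show the calculation.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```

Read this way, a recall of 0.9909 with a precision of 0.8151 means the model catches almost every fraudulent transaction in the test set but also flags a fair number of legitimate ones. Plugging the reported precision and recall into the F1 formula gives about 0.894, which agrees with the reported F1 up to rounding.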

Today I experimented with creating the training pipeline, but nothing is final yet. These are just experiments.

On another note, after getting a model to work, I decided to check out Mage's streaming pipelines.

I set up Kafka services in Docker Compose, and with a Python script:

image

I started sending sample data to see whether I could access it through Mage. On the Mage side it is quite simple: a Kafka data_loader block and a Python transformer block read the messages from the Kafka stream.
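The sending side can be sketched roughly like this. To keep it runnable without a broker, the sketch takes the send function as a parameter and uses a stand-in that just collects messages. The topic name, the record fields and the helper names are hypothetical, not what my script actually uses.

```python
import json
import time


def encode(record):
    """Serialize a record to UTF-8 JSON bytes, a common Kafka payload format."""
    return json.dumps(record).encode("utf-8")


def stream_records(send, topic, records, delay_seconds=0.0):
    """Send each record to `topic` through `send`; return how many were sent."""
    count = 0
    for record in records:
        send(topic, encode(record))
        count += 1
        if delay_seconds:
            time.sleep(delay_seconds)  # optional pacing to mimic a live stream
    return count


# Stand-in for a real producer so the sketch runs without Kafka.
sent = []


def fake_send(topic, value):
    sent.append((topic, value))


sample = [{"transaction_id": i, "amount": 10.0 * i} for i in range(3)]
n_sent = stream_records(fake_send, "transactions", sample)
print(n_sent, sent[0])
```

With the kafka-python package, fake_send could be swapped for the send method of a KafkaProducer created with the broker address from the Docker Compose setup, followed by a flush() call at the end.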

image
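In text form, a transformer block for a streaming pipeline looks roughly like the sketch below. It follows the shape of Mage's block template (a transform function that receives a batch of messages), but the JSON decoding step is my assumption, since whether messages arrive as dicts, strings or bytes depends on how the Kafka source is configured. The fallback decorator is only there so the snippet also runs outside Mage.

```python
import json
from typing import Dict, List

try:
    # Available inside a Mage project.
    from mage_ai.data_preparation.decorators import transformer
except ImportError:
    # Fallback so the sketch runs without Mage installed.
    def transformer(func):
        return func


@transformer
def transform(messages: List[Dict], *args, **kwargs):
    """Decode each message in the batch if needed, print it, and pass it on."""
    parsed = []
    for message in messages:
        if isinstance(message, (bytes, bytearray)):
            message = json.loads(message.decode("utf-8"))
        elif isinstance(message, str):
            message = json.loads(message)
        print(message)
        parsed.append(message)
    return parsed
```

Later on, the print call is the natural place to plug in the model's prediction step for real-time inference.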

When I run this pipeline, we see:

image

Nice ^^ At least I know we can use Mage for the real-time inference pipeline.

That is all for today!

See you tomorrow :)