(Day 239) 7th place at the KB future finance competition 🥳

Ivan Ivanov · August 27, 2024

Hello :) Today is Day 239!

A quick summary of today:

  • the KB project is finally over
  • finishing Machine Learning Algorithms in Depth

[image: entering the hall]

We ended up in 7th place in the competition 🥳. I did not take many pictures, just the one above from when we were entering the hall.

To close out the project, I un-archived the repo, added the latest code, and updated some of the pictures, like:

the architecture, which includes KakaoTalk customer messaging as the last step:

[image: updated architecture diagram]

And the monitoring dashboard:

[image: monitoring dashboard]

Chapter 11 (the last chapter): Advanced deep learning algorithms, from the Machine Learning Algorithms in Depth book

It was similar to the previous chapter: it briefly introduced some of the popular DL architectures over the years, ending with the Transformer and GNNs.

From the chapter, I got the idea to try using a VAE for time-series analysis.

  • Autoencoders are unsupervised neural networks trained to reconstruct the input. The bottleneck layer of the autoencoder can be used for dimensionality reduction or feature extraction.

  • A variational autoencoder consists of an encoder that takes in data as input, transforming it into a latent representation, and a decoder that takes a latent representation, returning a reconstruction.
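
To make the encoder/decoder idea concrete, here is a minimal VAE sketch in PyTorch. This is my own illustration, not code from the book; the class name, layer sizes, and latent dimension are all arbitrary choices.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal VAE: encoder -> (mu, log_var), sample z, decoder -> reconstruction."""

    def __init__(self, input_dim=784, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)       # mean of q(z|x)
        self.to_log_var = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, log_var = self.to_mu(h), self.to_log_var(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        return self.decoder(z), mu, log_var


def vae_loss(x, x_hat, mu, log_var):
    # Reconstruction term plus KL divergence pushing q(z|x) toward N(0, 1)
    recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl
```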

VAE Anomaly Detection in Time Series

This section discussed using a VAE for anomaly detection in time series data. The model architecture uses LSTM layers to process time series signals. The input consists of n signals, and the output is the log probability of observing a signal under normal (non-anomalous) conditions, defined by parameters μ (mean) and σ (standard deviation).

  • Training: The model is trained unsupervised on non-anomalous data.
  • Anomaly Detection: When an anomaly occurs, the log likelihood of the signal drops, indicating the anomaly.
  • Assumptions: Gaussian likelihood is assumed, where each sensor has two degrees of freedom (μ, σ) for anomaly representation.
  • Embedding: Independent input signals are embedded into a joint latent space using the VAE’s sampling layer, structured as a Gaussian approximating N(0,1).
  • Objective: The objective function maximizes the log likelihood across sensors and ensures the embedding space approximates N(0,1).
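
As a rough picture of the above, here is a PyTorch sketch of an LSTM-based VAE that scores windows of n signals by their Gaussian log-likelihood. The class names, layer sizes, and thresholding step are my assumptions, not the book's exact model.

```python
import torch
import torch.nn as nn

class LSTMVAE(nn.Module):
    """LSTM encoder/decoder VAE that outputs per-sensor (mu, log_sigma) at each time step."""

    def __init__(self, n_signals=3, hidden_dim=64, latent_dim=8):
        super().__init__()
        self.encoder = nn.LSTM(n_signals, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_log_var = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.LSTM(latent_dim, hidden_dim, batch_first=True)
        self.out_mu = nn.Linear(hidden_dim, n_signals)         # per-sensor mean
        self.out_log_sigma = nn.Linear(hidden_dim, n_signals)  # per-sensor log std dev

    def forward(self, x):                                      # x: (batch, time, n_signals)
        h, _ = self.encoder(x)
        z_mu, z_log_var = self.to_mu(h[:, -1]), self.to_log_var(h[:, -1])
        z = z_mu + torch.exp(0.5 * z_log_var) * torch.randn_like(z_mu)  # sampling layer
        z_seq = z.unsqueeze(1).repeat(1, x.size(1), 1)         # feed z at every time step
        d, _ = self.decoder(z_seq)
        return self.out_mu(d), self.out_log_sigma(d), z_mu, z_log_var


def log_likelihood(x, mu, log_sigma):
    """Gaussian log-likelihood of a window, summed over time steps and sensors."""
    return torch.distributions.Normal(mu, log_sigma.exp()).log_prob(x).sum(dim=(1, 2))

# After training on non-anomalous windows only, flag a window as anomalous
# when its log-likelihood drops below a threshold chosen on the training data.
```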

  • Amortized variational inference is the idea that instead of optimizing a set of free parameters, we can introduce a parameterized function that maps from observation space to the parameters of the approximate posterior distribution.

  • Mixture density networks are mixture models in which the parameters, such as means, covariances, and mixture proportions, are learned by a neural network (see the MDN sketch after this list).

  • Attention allows a model to adaptively focus on different parts of the input by adjusting the attention weights.

  • A transformer is a Seq2Seq model that uses attention in the encoder as well as the decoder. Self-attention is the key component of the transformer architecture (see the self-attention sketch after this list).

  • Graph neural networks are neural models that capture dependence relationships by passing messages between the nodes of a graph.
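
For the MDN bullet above, a minimal sketch of the idea, assuming a PyTorch implementation with 1D targets; the names and hyperparameters are mine:

```python
import torch
import torch.nn as nn

class MixtureDensityNetwork(nn.Module):
    """A network that predicts the parameters of a Gaussian mixture over y given x."""

    def __init__(self, in_dim=1, hidden_dim=64, n_components=5):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.pi = nn.Linear(hidden_dim, n_components)          # mixture proportions (logits)
        self.mu = nn.Linear(hidden_dim, n_components)          # component means
        self.log_sigma = nn.Linear(hidden_dim, n_components)   # component log std devs

    def forward(self, x):
        h = self.body(x)
        return self.pi(h), self.mu(h), self.log_sigma(h)


def mdn_nll(pi_logits, mu, log_sigma, y):
    """Negative log-likelihood of targets y under the predicted mixture."""
    log_pi = torch.log_softmax(pi_logits, dim=-1)
    comp = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = comp.log_prob(y.unsqueeze(-1))   # log N(y | mu_k, sigma_k) per component
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()
```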
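And a tiny sketch of scaled dot-product self-attention, which is what the transformer bullet refers to. The weight matrices here are random placeholders just to show the shapes, not trained parameters.

```python
import torch

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a single sequence x of shape (seq_len, dim)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project input to queries, keys, values
    scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5
    weights = torch.softmax(scores, dim=-1)          # attention weights over input positions
    return weights @ v                               # each output is a weighted mix of values

# Usage: a sequence of 10 tokens, each a 16-dimensional vector
x = torch.randn(10, 16)
w_q, w_k, w_v = (torch.randn(16, 16) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)               # shape: (10, 16)
```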

Overall

Great book overall. Great for beginners if they have math and stats knowledge; otherwise, some of the formulas and derivations might be scary. Nonetheless, it is a great source of more in-depth information about each of the algorithms presented, covering supervised, unsupervised, semi-supervised, and deep learning methods.

What’s next for me?

Well, tomorrow (Wednesday) I will go to a career fair here in Seoul, so I will share some info on that as well.

But in terms of things to study, at the moment I think I will start with one of the below courses on Coursera by Duke University professors:


That is all for today!

See you tomorrow :)