(Day 154) Diving deeper into Graph Neural Networks used in taxi demand prediction

Ivan Ivanov · June 3, 2024

Hello :) Today is Day 154!

A quick summary of today:

  • STGCN - Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
  • CACRNN - Predicting taxi demands via an attention-based convolutional recurrent neural network

Thanks to a lab mate, I found a way to sync my Obsidian notes using Google Drive, so I don't need to use GitHub anymore! I still have to upload text with math notation as pictures, though.

image

First paper, STGCN

Introduction

The paper presents methods to effectively capture the temporal and spatial patterns in traffic flow. Instead of viewing the traffic network as separate grids or segments, it represents it as a general graph to better leverage spatial information. To address the shortcomings of recurrent networks, a fully convolutional structure along the time axis is used. The key contribution is a novel deep learning model, spatio-temporal graph convolutional networks, designed specifically for traffic forecasting. It consists of spatio-temporal convolutional blocks that integrate graph convolutional layers with convolutional sequence learning layers, enabling the simultaneous modelling of spatial and temporal dependencies. According to the authors, this is the first approach to apply purely convolutional structures for extracting spatio-temporal features from graph-based time series in traffic analysis.

Preliminaries

image image

Proposed model

Network architecture

image

Graph CNNs for extracting spatial features

The Theta kernel from the previous section can be expensive to compute, costing O(n^2), but two approximation strategies are applied to overcome this issue.

Chebyshev polynomials approximation

This reduces the cost from O(n^2) to O(K|E|).
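To make the cost reduction concrete, here is a minimal NumPy sketch of a Chebyshev graph convolution (my own illustration, not the paper's code). `L_tilde` is assumed to be the rescaled Laplacian 2L/lambda_max - I; with a sparse `L_tilde`, each recurrence step is roughly O(|E|) work per channel, which is where the O(K|E|) total comes from.

```python
import numpy as np

def cheb_graph_conv(X, L_tilde, theta):
    """Chebyshev approximation of a graph convolution.

    X:       (n, c_in) signal on the n nodes
    L_tilde: (n, n) rescaled Laplacian, 2L/lambda_max - I (sparse in practice)
    theta:   (K, c_in, c_out) coefficients, one matrix per polynomial order
    """
    K = theta.shape[0]
    Tx = [X]                                        # T_0(L~) x = x
    if K > 1:
        Tx.append(L_tilde @ X)                      # T_1(L~) x = L~ x
    for k in range(2, K):
        Tx.append(2 * (L_tilde @ Tx[-1]) - Tx[-2])  # T_k = 2 L~ T_{k-1} - T_{k-2}
    # sum over orders: each term is (n, c_in) @ (c_in, c_out) -> (n, c_out)
    return sum(Tx[k] @ theta[k] for k in range(K))
```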

1st-order approximation

This is a simplified way to perform graph convolutions, making them efficient for large-scale graphs. It simplifies the convolution operation by merging two parameters into one and normalizing the graph's adjacency matrix. The result is that a deep architecture can be built to capture spatial information efficiently, with each layer using information from neighbouring nodes up to a certain depth (defined by the number of layers, K).
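As a rough sketch of what that merging and normalization look like (my notation, not the paper's code): the two first-order coefficients collapse into a single weight matrix `theta`, and the adjacency matrix gets self-loops plus symmetric degree normalization.

```python
import numpy as np

def first_order_graph_conv(X, A, theta):
    """First-order (GCN-style) approximation of the graph convolution.

    X:     (n, c_in) node signals
    A:     (n, n) adjacency matrix
    theta: (c_in, c_out) single weight matrix (the two coefficients merged)
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    W = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 (A+I) D^-1/2
    return W @ X @ theta                           # one-hop aggregation + channel mix
```

Stacking K of these layers lets each node aggregate information from its K-hop neighbourhood, which is the depth mentioned above.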

Generalisation of graph convolutions

image

Graph convolutions can handle multi-channel and multi-frame data, making them suitable for complex inputs like traffic prediction data.
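As a small illustration of the multi-channel, multi-frame case (again my own sketch, reusing `W` and `theta` from the previous snippet), the same spatial kernel can simply be broadcast over every time frame:

```python
import numpy as np

def graph_conv_all_frames(X, W, theta):
    """Apply one first-order graph convolution to M time frames at once.

    X:     (M, n, c_in) node signals for each frame
    W:     (n, n) renormalized adjacency (as in the previous sketch)
    theta: (c_in, c_out) weights shared by every frame
    """
    # sum over neighbours j and input channels c in a single einsum
    return np.einsum('ij,mjc,co->mio', W, X, theta)
```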

Gated CNNs for extracting temporal features

RNNs, though popular for time-series analysis, struggle with traffic prediction due to their time-consuming training and slow response to dynamic changes. CNNs, on the other hand, offer fast training, a simple structure, and no dependency constraints on previous steps.
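Here is a hedged PyTorch sketch of a gated temporal convolution in the spirit of STGCN's GLU layer; the class name and the kernel size `Kt=3` are my choices, and the paper's residual connection inside the gate is omitted for brevity.

```python
import torch
import torch.nn as nn

class GatedTemporalConv(nn.Module):
    """Gated 1-D convolution along the time axis (a GLU).

    Input:  (batch, c_in, n_nodes, M) with M time steps per node.
    Output: (batch, c_out, n_nodes, M - Kt + 1); no padding, so the
            temporal length shrinks by Kt - 1.
    """
    def __init__(self, c_in, c_out, Kt=3):
        super().__init__()
        # produce 2*c_out channels: one half is the content, the other the gate
        self.conv = nn.Conv2d(c_in, 2 * c_out, kernel_size=(1, Kt))

    def forward(self, x):
        p, q = self.conv(x).chunk(2, dim=1)
        return p * torch.sigmoid(q)                # GLU gate: p * sigmoid(q)
```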

Spatio-temporal convolutional block

This block fuses spatial and temporal features. Its sandwich structure (middle picture in the network architecture) lets the network apply a bottleneck strategy: channels C are downscaled and then upscaled through the graph convolutional layer to achieve scale compression and feature squeezing.
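Putting the pieces together, a rough sketch of the sandwich block could look like this (it reuses `GatedTemporalConv` from the previous snippet; the channel sizes and the per-step first-order graph convolution are my simplifications, not the paper's exact implementation):

```python
import torch
import torch.nn as nn

class STConvBlock(nn.Module):
    """Sandwich block: temporal conv -> graph conv -> temporal conv.

    Channels go c_in -> c_mid -> c_mid -> c_out; choosing c_mid < c_in, c_out
    gives the bottleneck (squeeze) described above.
    """
    def __init__(self, c_in, c_mid, c_out, Kt=3):
        super().__init__()
        self.temp1 = GatedTemporalConv(c_in, c_mid, Kt)      # from previous sketch
        self.theta = nn.Parameter(torch.randn(c_mid, c_mid) * 0.01)  # spatial weights
        self.temp2 = GatedTemporalConv(c_mid, c_out, Kt)

    def forward(self, x, W):                       # x: (b, c_in, n, M), W: (n, n)
        x = self.temp1(x)
        x = torch.einsum('ij,bcjt->bcit', W, x)               # aggregate neighbours
        x = torch.relu(torch.einsum('bcit,cd->bdit', x, self.theta))
        return self.temp2(x)
```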

Model summary

  • STGCN is a universal network that can handle any type of spatio-temporal sequence learning task
  • The spatio-temporal block combines graph convolutions and gated temporal convolutions, which extract spatial and temporal features, respectively
  • The model is entirely composed of convolutional structures and therefore achieves parallelisation over input with fewer parameters and faster training speed. Also, due to the approximations included, large-scale networks can be handled as well.

Experiment

Dataset

Using two real-world traffic datasets - BJER4 (Beijing) and PeMSD7 (California).

image

Settings

Only workday traffic data is used to eliminate atypical traffic (Li et al., 2015). Grid search is used to locate the best parameters on the validation set. All tests use 60 minutes as the historical time window, i.e. 12 observed data points (M=12) are used to forecast traffic conditions in the next 15, 30, and 45 minutes.
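As a quick illustration of that setup (my own helper, assuming the 5-minute observation interval implied by 12 points per hour), the series can be sliced into history/target pairs like this:

```python
import numpy as np

def make_windows(series, M=12, horizons=(3, 6, 9)):
    """Slice a traffic series of shape (T, n_nodes) into history/target pairs.

    With 5-minute observations, M=12 covers the 60-minute history window and
    3/6/9 steps ahead correspond to the 15/30/45-minute forecasts.
    """
    X, Y = [], []
    last = len(series) - max(horizons)
    for t in range(M, last + 1):
        X.append(series[t - M:t])                        # the past hour
        Y.append([series[t + h - 1] for h in horizons])  # 15/30/45 min ahead
    return np.array(X), np.array(Y)
```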

Evaluation metrics

MAE, MAPE, RMSE
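For reference, the three metrics are simple to compute; here is a small NumPy helper (the epsilon guard in MAPE is my addition to avoid dividing by zero-demand slots):

```python
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred, eps=1e-8):
    # eps guards against zero-valued ground truth
    return np.mean(np.abs((y_true - y_pred) / (y_true + eps))) * 100
```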

Baseline models

HA, Linear SVR, ARIMA, FNN, FC-LSTM, GCGRU

Results

image image image

Second paper, CACRNN

Introduction

This paper proposes a context-aware attention-based convolutional recurrent neural network (CACRNN) for predicting fine-grained taxi demand in an urban area.

Key features include urban area partitioning: using morphology-based map segmentation, the urban area is divided into fine-grained regions; and for each region, three taxi demand predictions are made, considering various spatio-temporal dependencies and external factors. The proposed model combines local convolutional layers and GRUs to capture spatial and temporal data characteristics. It also uses RNNs to identify short- and long-term periodic taxi demand patterns.
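To make the conv-plus-GRU idea concrete, here is a heavily simplified toy sketch (my own, not the paper's architecture: it drops the attention module, the three separate predictors, and the external factors), showing per-slot spatial features feeding a GRU:

```python
import torch
import torch.nn as nn

class ConvGRUSketch(nn.Module):
    """Toy conv + GRU combination for region-level demand prediction."""
    def __init__(self, n_regions, c_hidden=32):
        super().__init__()
        # 1-D convolution over region indices as a stand-in for the paper's
        # local convolutional layers
        self.spatial = nn.Conv1d(1, c_hidden, kernel_size=3, padding=1)
        self.temporal = nn.GRU(c_hidden * n_regions, c_hidden, batch_first=True)
        self.head = nn.Linear(c_hidden, n_regions)

    def forward(self, x):                          # x: (batch, T, n_regions)
        b, t, n = x.shape
        s = self.spatial(x.reshape(b * t, 1, n))   # per-slot spatial features
        s = s.reshape(b, t, -1)
        out, _ = self.temporal(s)                  # temporal dependencies
        return self.head(out[:, -1])               # next-slot demand per region
```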

Problem formulation

Definition 1: road network

A road network of an urban area is composed of a set of road segments. Each road segment is associated with two terminal points (i.e. intersections of crossroads), and connects with other road segments by sharing the same terminals. All road segments compose the road network in the format of a graph.
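A tiny sketch of how such a segment graph could be built (my own illustration, not from the paper): segments become nodes, and two segments are connected whenever they share a terminal intersection.

```python
from collections import defaultdict

def build_segment_graph(segments):
    """segments: dict mapping segment_id -> (terminal_a, terminal_b)."""
    by_terminal = defaultdict(set)
    for seg, (a, b) in segments.items():
        by_terminal[a].add(seg)
        by_terminal[b].add(seg)
    adjacency = defaultdict(set)                   # segment -> neighbouring segments
    for segs in by_terminal.values():
        for s in segs:
            adjacency[s] |= segs - {s}
    return adjacency
```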

Time is split into equal interval time slots, and the whole urban area is divided into disjoint regions based on its road network. Each region is an irregular polygon, encompassed by several road segments.

image

Data

Road network data

image

Taxi trip data

image image image

POI data

image image

Meteorological and holiday data

image

Methodology

image image

Definition 6: Functional similarity

Two regions are judged functionally similar based on the cosine similarity of their embedding vectors, where each entry holds the significance of POIs in a specific category c, measured by TF-IDF.
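A minimal sketch of that computation (the function name is mine): each region is represented by its TF-IDF vector over POI categories and compared by cosine similarity.

```python
import numpy as np

def functional_similarity(tfidf_a, tfidf_b):
    """Cosine similarity between two regions' TF-IDF vectors over POI categories."""
    denom = np.linalg.norm(tfidf_a) * np.linalg.norm(tfidf_b)
    return float(tfidf_a @ tfidf_b / denom) if denom > 0 else 0.0
```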

image image

Experiments

Baseline models

HA, ARIMA, LSTM, DCRNN, STGCN

Results

Also compared with different variations of the CACRNN

image

Finally, thanks to my lab professor and lab mates for the birthday cake

image

That is all for today!

See you tomorrow :)