Scaling Deep Learning Models for Large Spatial Time-Series Forecasting

Zainab Abbas1, Jon Reginbald Ivarsson1, Ahmad Al-Shishtawy2 and Vladimir Vlassov1
1 KTH Royal Institute of Technology
2 RISE SICS
Stockholm, Sweden
IEEE BIGDATA 2019, LOS ANGELES, DEC 9-12, 2019

Challenge of Scale
● Deep neural networks (NNs) are used for different machine learning tasks, such as
spatial time-series forecasting.
● At scale, training deep NNs is computationally and memory intensive.
● Partitioning and distribution is a general approach to the challenge of scale in
NN-based modelling
○ The problem is divided into smaller tasks.
○ These tasks consist of smaller models working on subsets of the data.

Traffic Data
● Sensor ID
● GPS coordinates
● Time
● Flow (number of cars per minute)
● Average speed (km/h)
Density (cars per km) = Flow / Speed
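
The density relation can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and it assumes that flow is first converted from cars per minute to cars per hour so that the time units cancel against a speed in km/h. The slides do not state how the conversion was done.

```python
def density_cars_per_km(flow_cars_per_min, avg_speed_km_per_h):
    """Traffic density (cars/km) from flow (cars/min) and average speed (km/h)."""
    if avg_speed_km_per_h <= 0:
        raise ValueError("average speed must be positive")
    # Assumption: convert flow to cars per hour so it matches the speed's time unit.
    flow_cars_per_h = flow_cars_per_min * 60.0
    return flow_cars_per_h / avg_speed_km_per_h


# 30 cars per minute at 90 km/h
example_density = density_cars_per_km(30, 90)
```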

Large amount of sensor data
● Radar sensors deployed on
Stockholm highways
● More than 88 million data
points collected by 2058
sensors
● Number of sensors
increasing every year
● Sensor values recorded per minute

Research Questions
● How to partition spatial time-series while preserving dependencies among
them?
● Which and how many spatial time-series do we take into account for a fast and
accurate forecast?

Graph Representation
● The traffic sensors are represented in the
form of a directed weighted graph
● Sensors that are at the same location but
on different lanes are represented as a
single vertex
● The paths between sensors are
represented as edges
● An edge is weighted with the travel time
between the corresponding sensors
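
A minimal sketch of this representation, using plain dictionaries. The record layouts (sensor ID with a location ID, and paths as source, destination, travel time in minutes) and all names are assumptions made for illustration; they are not taken from the authors' code.

```python
from collections import defaultdict


def build_sensor_graph(sensors, paths):
    """Build a directed weighted graph of sensor locations.

    sensors: iterable of (sensor_id, location_id) pairs (hypothetical layout)
    paths:   iterable of (src_location, dst_location, travel_time_min) triples
    """
    # Sensors at the same location (different lanes) collapse into one vertex.
    vertices = defaultdict(list)
    for sensor_id, location_id in sensors:
        vertices[location_id].append(sensor_id)

    # Each path becomes a directed edge weighted with its travel time.
    edges = defaultdict(dict)
    for src, dst, travel_time in paths:
        edges[src][dst] = travel_time

    return dict(vertices), {src: dict(nbrs) for src, nbrs in edges.items()}


sensors = [("s1", "A"), ("s2", "A"), ("s3", "B"), ("s4", "C")]
paths = [("A", "B", 1.5), ("B", "C", 2.0)]
vertices, edges = build_sensor_graph(sensors, paths)
```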

Graph Partitioning
1) Directed Weighted Graph
2) Creation of Base Partitions
○ Do a backward traversal from the starting vertex in the graph until the
threshold is met.
3) Creation of Base Partitions Graph
4) Addition of Partitions from Front and Behind
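
The backward traversal used to create a base partition might look like the sketch below. It assumes the threshold is a cumulative travel time (which would fit the B3, B5, ... naming but is not stated on this slide) and uses a Dijkstra-style search over reversed edges. That search is an illustrative choice, not necessarily the authors' algorithm, and the function and variable names are hypothetical.

```python
import heapq


def backward_partition(edges, start, threshold):
    """Collect vertices that can reach `start` within `threshold` travel time.

    edges: {src: {dst: travel_time}} for a directed weighted graph
    Returns {vertex: travel time from that vertex to `start`}.
    """
    # Reverse the edges so the search walks upstream from the start vertex.
    reverse = {}
    for src, nbrs in edges.items():
        for dst, weight in nbrs.items():
            reverse.setdefault(dst, []).append((src, weight))

    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, vertex = heapq.heappop(heap)
        if dist > best.get(vertex, float("inf")):
            continue  # stale heap entry
        for upstream, weight in reverse.get(vertex, []):
            new_dist = dist + weight
            # Stop expanding once the travel-time threshold would be exceeded.
            if new_dist <= threshold and new_dist < best.get(upstream, float("inf")):
                best[upstream] = new_dist
                heapq.heappush(heap, (new_dist, upstream))
    return best


edges = {"A": {"B": 1.5}, "B": {"C": 2.0}, "C": {"D": 4.0}}
partition = backward_partition(edges, start="D", threshold=6.0)
```

With a threshold of 6.0, vertex A is left out because its cumulative travel time to D (7.5) exceeds the threshold.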

Data Representation
● Spatio-temporal dependencies of sensor data
● Space-time window representation of sensor data
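
One way to build a space-time window representation is to slide a fixed-length window over a (time steps × sensors) matrix. The sketch below assumes that layout; the window length, the forecast horizon and the function name are illustrative assumptions rather than values from the slides.

```python
import numpy as np


def make_space_time_windows(readings, window, horizon=1):
    """Turn a (T, S) matrix of sensor readings into supervised samples.

    Returns X with shape (N, window, S) and y with shape (N, S), where y holds
    the readings `horizon` steps after the end of each window.
    """
    readings = np.asarray(readings, dtype=float)
    n_samples = readings.shape[0] - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("not enough time steps for the requested window and horizon")
    X = np.stack([readings[i:i + window] for i in range(n_samples)])
    y = np.stack([readings[i + window + horizon - 1] for i in range(n_samples)])
    return X, y


readings = np.arange(12).reshape(6, 2)  # 6 minutes of data from 2 sensors
X, y = make_space_time_windows(readings, window=3)
```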

Stacked LSTMs
● Stacked LSTMs are built from multiple LSTM layers
placed on top of each other
● A deeper and more powerful neural network than
the conventional architecture
● Capable of learning non-linear dependencies in the data
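
To illustrate what stacking means, here is a forward-pass-only sketch in NumPy: the full sequence of hidden states from one LSTM layer becomes the input sequence of the next. The weights are random and untrained, the layer sizes are arbitrary, and all names are hypothetical. It is not the TensorFlow model used in this work.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_layer(inputs, W, U, b):
    """Run one LSTM layer over a sequence.

    inputs: (T, D); W: (D, 4H); U: (H, 4H); b: (4H,). Returns hidden states (T, H).
    """
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outputs = []
    for x_t in inputs:
        gates = x_t @ W + h @ U + b
        i, f, o, g = np.split(gates, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


def init_layer(rng, input_dim, hidden):
    """Small random weights for one layer (illustration only, no training)."""
    return (rng.normal(scale=0.1, size=(input_dim, 4 * hidden)),
            rng.normal(scale=0.1, size=(hidden, 4 * hidden)),
            np.zeros(4 * hidden))


def stacked_lstm(inputs, layers):
    """Feed the hidden-state sequence of each layer into the next layer."""
    sequence = inputs
    for W, U, b in layers:
        sequence = lstm_layer(sequence, W, U, b)
    return sequence


rng = np.random.default_rng(0)
layers = [init_layer(rng, 5, 8), init_layer(rng, 8, 8)]  # two stacked layers
window = rng.normal(size=(10, 5))  # 10 time steps, 5 sensors
states = stacked_lstm(window, layers)
```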

Technology
● Apache Spark 2.4.0
● Python 2.7.15
● TensorFlow 1.11.0

Prediction Models

1-1 Single Sensor Model (SSM)

n-n Entire Sensor Infrastructure Model (ESIM)

Partition-Based Models (B-t)
● 3 min partitions (B3)
● 5 min partitions (B5)
● 10 min partitions (B10)
● 20 min partitions (B20)

Results
The partition-based model B15 achieves roughly half the RMSE of SSM and ESIM.
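
For reference, RMSE (root mean squared error) can be computed as in the sketch below. The function name and sample values are illustrative only and are not results from this work.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


example_error = rmse([0.0, 0.0], [3.0, 3.0])
```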

Results
● Sequential training time (sec)
● Parallel training time (sec)

Results
* SSMs have very low prediction time and are therefore omitted.

Conclusion
● Scalability
● Accuracy
● Performance

Thank you :)
Zainab Abbas
zainabab@kth.se
Vladimir Vlassov
vladv@kth.se
Ahmad Al-Shishtawy
ahmad.al-shishtawy@ri.se
Jon R. Ivarsson
mail@reginbald.com
