Problem Definition Exploratory Data Analysis Statistical Methods Deep Learning Conclusions References
Spatio-Temporal Data Analysis using
Statistical Methods and Deep Learning
Diego Ercoli
Supervisors: Prof. Roberta Sirovich
Dr. Lorenzo Bongiovanni
Master’s Degree in Computer Science (LM-18), University of Turin
Turin, 26 October 2023
Diego Ercoli Spatio-Temporal Data Analysis using Statistical Methods and Deep Learning 0 / 27
Table of Contents
1 Problem Definition
2 Exploratory Data Analysis
3 Statistical Methods
4 Deep Learning
5 Conclusions
Motivation
• Criminal activities have a direct impact on quality of life as well as
on the socio-economic development of a nation [1]
• Governments increasingly turn to advanced technologies to tackle and
prevent crime before it happens
• Methods for the analysis and prediction of criminal events are the
subject of this dissertation
Objectives
• Most of the work was devoted to analyzing the Boston crime dataset
• The Boston Police Department reports all crime events that occurred
from 2015 to 2022
• Two kinds of approach are used for their analysis:
  • Classical Statistical Methods
  • Advanced Deep Learning Architectures: the Transformer Model
• This thesis work has been developed in collaboration with LINKS
Foundation¹
¹ LINKS Foundation operates at the heart of the Turin research and innovation
ecosystem, in a solid international network. Its goal is to contribute to technological
and socio-economic progress through advanced processes of applied research.
Dataset
For each crime event we have taken into account only the following
features:
• the pair <latitude, longitude>, which gives the exact position where the
event happened.
• the timestamp of the event.
• the category of the crime (e.g. larceny, assault, etc.)
This is a clear example of spatio-temporal data:
a process evolving both in space and in time
Discretization
A grid strategy has been adopted: the whole area of Boston is partitioned
into a regular grid of rectangular regions².
A count is taken in each region on a daily basis.
In other words, crimes have been aggregated:
• temporally, at daily granularity
• spatially, over cells of 1 km × 1 km
² From now on, we call them cells
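The grid strategy can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code: the names `to_cell` and `daily_counts`, the north-west origin `(LAT0, LON0)`, the sample events, and the flat-earth approximation (about 111.32 km per degree of latitude) are all assumptions made for the example.

```python
import math
from collections import Counter
from datetime import datetime

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def to_cell(lat, lon, lat0, lon0, cell_km=1.0):
    """Map a point to a (row, col) cell.

    (lat0, lon0) is the assumed north-west corner of the grid; rows grow
    southwards and columns grow eastwards.
    """
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(lat0))
    row = math.floor((lat0 - lat) * KM_PER_DEG_LAT / cell_km)
    col = math.floor((lon - lon0) * km_per_deg_lon / cell_km)
    return row, col


def daily_counts(events, lat0, lon0):
    """Count events per (day, row, col); events are (timestamp, lat, lon)."""
    counts = Counter()
    for ts, lat, lon in events:
        row, col = to_cell(lat, lon, lat0, lon0)
        counts[(ts.date(), row, col)] += 1
    return counts


# Hypothetical origin and events, for illustration only.
LAT0, LON0 = 42.40, -71.19
events = [
    (datetime(2022, 5, 23, 10, 49), 42.3995, -71.1895),
    (datetime(2022, 5, 23, 18, 35), 42.3990, -71.1890),
    (datetime(2022, 5, 24, 11, 40), 42.3800, -71.1500),
]
counts = daily_counts(events, LAT0, LON0)
```

The first two events fall in the same cell on the same day, so they are aggregated into a single count of 2.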
2 Exploratory Data Analysis
The Grid
0 1 2 3 4 5 6 7 8 9 10
0 404 1033 140 616 976
1 1 2486 6294 3017 5817
2 184 445 52 1656 9026 4843 2553
3 1919 3168 4690 196 2 620 6317 26295 974 1
4 6688 5091 4754 1146 1560 7421 16327 14737 7484 1986 354
5 2560 2000 3164 4932 12434 15645 8314 4047 1923
6 56 1 93 6297 7109 18755 10347 8412 2994 1693
7 7 1192 6764 7818 9332 9051 6312 1338 24
8 143 2001 6463 6842 12354 10354 7588 267
9 81 64 496 1679 1516 415 11472 11396 15559 624
10 785 522 975 3230 274 4328 8235 9078 6102 1683
11 1370 4333 2810 3184 867 13556 7158 6748 1973 1461
12 374 1381 1292 2453 2159 6649 2427 2664 142
13 1422 346 628 2918 3095 2225
14 469 966 1662 6721 94
Table 1: Total number of events in each cell
The city of Boston does not fill the whole grid: some cells contain no
events because they lie outside the city (or are crossed by a river).
Heatmap of the grid
(a) City map of Boston (b) Heatmap of the grid
Figure 1: Comparison between heatmap and city map
Univariate Timeseries
Focusing on a particular region (the total area of Boston or a specific cell) and
filtering on one or more crime categories, we get a univariate time series:
$X_t = x_t$, for $t = 1, 2, \dots, n$
where:
• $x_t \ge 0$, since there could be days in which no event happens,
• $t$ refers to a day (daily granularity),
• $n$ is the total number of days.
Average Intertime
For each time series, to get a general idea of how frequently events occur,
the average intertime has been computed in the following way:
1 First, record the days on which at least one event occurred and sort
them chronologically: $d_1, d_2, \dots, d_m$, where $m \le n$.
2 Calculate the time differences between consecutive days to get the
intertime values.
3 Average these intertime values.
Mathematically, this can be written as:
$$\mu = \frac{1}{m-1} \sum_{i=1}^{m-1} \left(d_{i+1} - d_i\right) \tag{1}$$
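The three steps above can be sketched as follows. The function name `average_intertime` and the sample dates are assumptions for illustration. Note that the sum telescopes, so the result equals the span between the first and last event day divided by the number of gaps.

```python
from datetime import date


def average_intertime(event_days):
    """Mean gap, in days, between consecutive distinct days with events."""
    days = sorted(set(event_days))           # step 1: distinct days, in order
    if len(days) < 2:
        raise ValueError("need at least two distinct days with events")
    gaps = [(b - a).days for a, b in zip(days, days[1:])]   # step 2
    return sum(gaps) / len(gaps)             # step 3


# Hypothetical event days; the repeated day counts once.
days = [date(2022, 1, 1), date(2022, 1, 2), date(2022, 1, 5), date(2022, 1, 5)]
mu = average_intertime(days)  # gaps are 1 and 3 days
```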
Categories
Figure 2: Barchart for categories
Comparison of average intertime
(a) Larceny: the most frequent one (b) Robbery: the 1st in the middle
(c) Violation: the 2nd in the middle (d) Kidnapping: the least frequent one
Figure 3: Comparison of heatmaps at daily resolution between 4 categories. Color
refers to the number of crime events, while the label refers to the average
intertime.
3 Statistical Methods
Modeling assumptions
As a first attempt, we model the problem assuming spatial
independence. Notes:
• Intuitively, this means that there is no interaction or interdependence
among events that happen in different cells. In other words, an event
at cell x neither encourages nor inhibits the occurrence of other events in
the neighborhood of x.
• This allows us to employ classical statistical methods, which have long
dominated the forecasting domain (e.g. the ARIMA model).
• The purpose is to create an initial baseline, against which the
effectiveness of more sophisticated models can be evaluated.
Core ingredients
Our methodology relies on well-established techniques that are taught in
statistics and econometrics courses worldwide:
• STL Decomposition based on LOESS Algorithm (see next slide)
• Cubic spline: fitting different third-degree polynomials in different
intervals of the input space
• ARIMA
STL Decomposition-Example
Figure 4: STL Decomposition-Example
Methodology
Once a splitting point has been identified, all the past observations are
analyzed in order to forecast the n subsequent days.
1 We perform the STL decomposition of the training set, in order to identify
the three components of a time series: seasonal, trend and remainder.
2 We fit an ARIMA model on the remainder in order to
forecast this component.
3 We use the cubic spline method to forecast the trend component.
4 We extract the part of the seasonal component that corresponds in time
to the forecasting interval shifted one year back.
5 We sum the three quantities computed in steps 2 to 4 in order to get the
forecast.
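Steps 4 and 5 can be sketched as below, assuming the trend and remainder forecasts are already available as lists. The helper names `seasonal_for` and `combine`, the toy numbers, and the fixed 365-day shift (which ignores leap years) are assumptions for illustration.

```python
from datetime import date, timedelta


def seasonal_for(forecast_days, seasonal, shift_days=365):
    """Step 4: reuse the seasonal value observed one year earlier."""
    return [seasonal[d - timedelta(days=shift_days)] for d in forecast_days]


def combine(trend_fc, seasonal_fc, remainder_fc):
    """Step 5: the forecast is the sum of the three component forecasts."""
    return [t + s + r for t, s, r in zip(trend_fc, seasonal_fc, remainder_fc)]


# Toy numbers, for illustration only.
forecast_days = [date(2022, 1, 1), date(2022, 1, 2)]
seasonal = {date(2021, 1, 1): 1.5, date(2021, 1, 2): -0.5}
forecast = combine(
    [20.0, 20.0],                           # trend forecast (step 3)
    seasonal_for(forecast_days, seasonal),  # seasonal part (step 4)
    [0.25, 0.0],                            # remainder forecast (step 2)
)
```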
Design Pattern-Template Method
Figure 5: Template Method is a behavioral design pattern that defines the
skeleton of an algorithm in the superclass but lets subclasses override specific
steps of the algorithm without changing its structure.
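A minimal sketch of how the pattern could apply to the forecasting methodology: the superclass fixes the order of the steps, and a subclass supplies each step. The class and method names are hypothetical, and the subclass uses deliberately trivial steps rather than STL, ARIMA or splines.

```python
from abc import ABC, abstractmethod


class Forecaster(ABC):
    """Template Method: forecast() is the fixed skeleton of the algorithm."""

    def forecast(self, series, horizon):
        trend, seasonal, remainder = self.decompose(series)
        parts = (
            self.forecast_trend(trend, horizon),
            self.forecast_seasonal(seasonal, horizon),
            self.forecast_remainder(remainder, horizon),
        )
        # Sum the component forecasts day by day.
        return [sum(values) for values in zip(*parts)]

    @abstractmethod
    def decompose(self, series): ...

    @abstractmethod
    def forecast_trend(self, trend, horizon): ...

    @abstractmethod
    def forecast_seasonal(self, seasonal, horizon): ...

    @abstractmethod
    def forecast_remainder(self, remainder, horizon): ...


class NaiveForecaster(Forecaster):
    """Hypothetical subclass: constant trend, no seasonality, zero remainder."""

    def decompose(self, series):
        mean = sum(series) / len(series)
        trend = [mean] * len(series)
        seasonal = [0.0] * len(series)
        remainder = [x - mean for x in series]
        return trend, seasonal, remainder

    def forecast_trend(self, trend, horizon):
        return [trend[-1]] * horizon

    def forecast_seasonal(self, seasonal, horizon):
        return [0.0] * horizon

    def forecast_remainder(self, remainder, horizon):
        return [0.0] * horizon


prediction = NaiveForecaster().forecast([2, 4, 6], horizon=3)
```

Swapping one step (for example a different trend forecaster) only requires overriding one method; `forecast()` itself stays unchanged.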
Methodology in Action
Long-term-forecasting of Larceny Crime (year 2022)
(a) STL Decomposition of training set
(b) Forecasting of residual with ARIMA
(c) Trend forecast using cubic spline
(d) Seasonal component with post-smoothing
(e) Training set, test set, forecasting
(Post-smoothing the seasonal component)
(f) Performances: RMSE = 6.19
Sliding Window
train: 2015-06-15 -> 2021-12-31   test: 2022-01-01 -> 2022-01-07
train: 2015-06-15 -> 2022-01-07   test: 2022-01-08 -> 2022-01-14
train: 2015-06-15 -> 2022-01-14   test: 2022-01-15 -> 2022-01-21
train: 2015-06-15 -> 2022-01-21   test: 2022-01-22 -> 2022-01-28
train: 2015-06-15 -> 2022-01-28   test: 2022-01-29 -> 2022-02-04
... continues ...
train: 2015-06-15 -> 2022-12-23   test: 2022-12-24 -> 2022-12-31
Listing 1: Log of sliding window
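The log above can be reproduced with a short generator. Since the training start stays fixed, the training window grows by one week at each step. The function name `expanding_windows` is an assumption, and folding the leftover final day into the last test window is inferred from the last line of the log, not from a stated rule.

```python
from datetime import date, timedelta


def expanding_windows(train_start, first_test, last_day, horizon=7):
    """Yield (train_start, train_end, test_start, test_end) tuples."""
    windows = []
    test_start = first_test
    while test_start <= last_day:
        test_end = test_start + timedelta(days=horizon - 1)
        # Fold a trailing remainder shorter than one horizon into this window.
        if (last_day - test_end).days < horizon:
            test_end = last_day
        train_end = test_start - timedelta(days=1)
        windows.append((train_start, train_end, test_start, test_end))
        test_start = test_end + timedelta(days=1)
    return windows


windows = expanding_windows(
    train_start=date(2015, 6, 15),
    first_test=date(2022, 1, 1),
    last_day=date(2022, 12, 31),
)
for a, b, c, d in windows[:2]:
    print(f"train: {a} -> {b}   test: {c} -> {d}")
```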
Results
RMSE = 1.55
Figure 6: Color: number of crimes, label: RMSE value
4 Deep Learning
Transformer
• The Transformer model, introduced by Vaswani et al. in 2017 [5],
revolutionized the field of natural language processing (NLP) by
proposing an architecture built entirely on the attention mechanism
• Beyond NLP tasks, it has also been applied successfully in other
application fields, such as time series forecasting.
Feature Embedding
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$
$$v = [a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}]$$
• When the Transformer model is used for non-text data, forming the input
still involves converting the raw input into a sequence of numerical
vectors.
• In particular, for each day, we populate a matrix (grid) with the number
of crimes that happened in each cell.
• The embedding is generated by flattening the matrix by rows.
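Flattening by rows is a one-liner. The sketch below uses a hypothetical 2 x 3 grid of daily counts; if the 15 x 11 grid of Table 1 is used, the resulting vector has 165 entries.

```python
def flatten_by_rows(grid):
    """Row-major flattening of a 2-D grid of counts."""
    return [value for row in grid for value in row]


# Hypothetical daily grid, for illustration only.
grid = [
    [0, 3, 1],
    [2, 0, 5],
]
v = flatten_by_rows(grid)
```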
The Model
Figure 7: The Encoder receives the feature embedding vectors of the previous 30
days. The Decoder outputs the forecast for the following 7 days.
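One way to build the (encoder input, decoder target) pairs described in the caption is to slide over the daily sequence. The function name `make_samples` and the stride of one day are assumptions; the integers stand in for the daily embedding vectors.

```python
def make_samples(sequence, n_in=30, n_out=7):
    """Pair each run of n_in days with the n_out days that follow it."""
    samples = []
    for start in range(len(sequence) - n_in - n_out + 1):
        source = sequence[start:start + n_in]                    # encoder input
        target = sequence[start + n_in:start + n_in + n_out]     # decoder target
        samples.append((source, target))
    return samples


sequence = list(range(40))   # stand-in for 40 daily embedding vectors
samples = make_samples(sequence)
```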
Splitting the dataset
Table 2: Size of training, validation and test sets with related percentages
              training set   validation set   test set
size          2191           213              325
percentage    80.29%         7.81%            11.91%
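The percentages in Table 2 follow directly from the three sizes. The sketch below recomputes them and shows a split that keeps the time order; whether the thesis split chronologically is an assumption, as is the helper name `chronological_split`.

```python
sizes = {"training": 2191, "validation": 213, "test": 325}
total = sum(sizes.values())
shares = {name: round(100 * n / total, 2) for name, n in sizes.items()}


def chronological_split(samples, n_train, n_val):
    """Split without shuffling, so the test set holds the most recent data."""
    return (
        samples[:n_train],
        samples[n_train:n_train + n_val],
        samples[n_train + n_val:],
    )


train, val, test = chronological_split(list(range(total)), 2191, 213)
```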
Training process
• Each epoch corresponds to one complete pass through the samples
of the training dataset using mini-batch gradient descent.
• At the end of each epoch, performance is evaluated both on the
training set and on the validation set.
• The model with the best performance on the validation set is kept.
• The best model is evaluated on the test set (unbiased estimate of
generalization).
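The selection logic can be sketched without any deep learning framework. Here `fit`, `train_one_epoch` and `evaluate` are hypothetical names, the "model" is a plain dictionary, and the error values follow a fixed script so the example is reproducible.

```python
import copy


def fit(model, train_one_epoch, evaluate, n_epochs):
    """Keep the model state with the lowest validation error seen so far."""
    best_error, best_model, history = float("inf"), None, []
    for epoch in range(n_epochs):
        train_one_epoch(model)                  # one pass over the mini-batches
        train_error = evaluate(model, "train")
        val_error = evaluate(model, "val")
        history.append((epoch, train_error, val_error))
        if val_error < best_error:
            best_error, best_model = val_error, copy.deepcopy(model)
    return best_model, best_error, history


# Toy stand-ins: scripted validation errors, one per epoch.
val_errors = [1.9, 1.4, 1.3, 1.5]


def train_one_epoch(model):
    model["epoch"] += 1


def evaluate(model, split):
    error = val_errors[model["epoch"] - 1]
    return error + 0.1 if split == "train" else error


best_model, best_error, history = fit({"epoch": 0}, train_one_epoch, evaluate, 4)
```

The snapshot from the third epoch is retained even though training continues for one more epoch; only that snapshot would then be scored on the test set.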
Results and comparison
Table 3: Comparison of performances between the different models
Model RMSE (Test set)
Statistical Approach 1.55
Transformer 1.264
Spatial correlation is a key factor!
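For reference, the RMSE used in Table 3 is the square root of the mean squared difference between observed and predicted counts. The sketch below computes it on made-up numbers and derives the relative gap between the two reported values (roughly 18%). The function name `rmse` and the sample sequences are assumptions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


example = rmse([3, 0, 2, 5], [2, 1, 2, 3])   # sqrt((1 + 1 + 0 + 4) / 4)
relative_gain = (1.55 - 1.264) / 1.55        # RMSE values taken from Table 3
```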
5 Conclusions
Statistical Methodology
Pros
• Explainability: how the model makes its decisions can be presented
in terms understandable to a human.
• Good enough performance despite its simplicity.
• Easy to implement.
Downsides
• Spatial independence assumption.
• STL decomposition does not work for all cells.
Deep Learning
Pros
• Learns interactions between space and time through the attention
mechanism.
• Learns in an automatic way.
Downsides
• Black box.
• Requires a large amount of training data.
• Can demand significant computational power.
References
[1] S. Hossain, A. Abtahee, I. Kashem, M. M. Hoque, and I. H. Sarker,
“Crime prediction using spatio-temporal data,” in Computing Science,
Communication and Security: First International Conference, COMS2
2020, Gujarat, India, March 26–27, 2020, Revised Selected Papers 1,
Springer, 2020, pp. 277–289.
[2] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to
Statistical Learning: with Applications in R. Springer, 2013. [Online].
Available: https://faculty.marshall.usc.edu/gareth-james/ISL/.
[3] R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning,
“STL: A seasonal-trend decomposition procedure based on loess (with
discussion),” Journal of Official Statistics, vol. 6, pp. 3–73, 1990.
[4] R. Hyndman and G. Athanasopoulos, Forecasting: Principles and
Practice, 3rd ed. Australia: OTexts, 2021.
[5] A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention is all you need,”
in Advances in Neural Information Processing Systems, I. Guyon,
U. V. Luxburg, S. Bengio, et al., Eds., vol. 30, Curran Associates, Inc.,
2017. [Online]. Available:
https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
SLIDE EXTRA
Data Analysis Results
category        #days  start_date  end_date    inter_time  count_crimes  avg_day  stdv   %days   avg_global
LARCENY         2717   2015-06-15  2022-11-21  1.0         72623         26.729   7.37   99.963  26.719
M/V ACCIDENT    2718   2015-06-15  2022-11-22  1.0         68564         25.226   6.752  100.0   25.226
INVESTIGATE     2718   2015-06-15  2022-11-22  1.0         66020         24.29    7.068  100.0   24.29
ASSAULT         2717   2015-06-15  2022-11-21  1.0         42837         15.766   6.007  99.963  15.76
VAL             2714   2015-06-15  2022-11-21  1.001       37190         13.703   5.244  99.853  13.683
DRUGS           2680   2015-06-15  2022-11-21  1.014       29095         10.856   6.63   98.602  10.705
VANDALISM       2716   2015-06-15  2022-11-21  1.0         28105         10.348   4.377  99.926  10.34
OTHER           2713   2015-06-15  2022-11-21  1.001       24630         9.079    3.897  99.816  9.062
VERBAL DISPUTE  2692   2015-06-15  2022-11-21  1.009       21728         8.071    4.552  99.043  7.994
FRAUD           2696   2015-06-15  2022-11-21  1.008       19896         7.38     4.46   99.191  7.32
Table 4: Overall statistics at daily resolution for top 10 crimes
x  y  category      #days  start_date  end_date    inter_time  count_crimes  mean   stdv   %days
8  3  LARCENY       2469   2015-06-15  2022-11-21  1.1         6868          2.782  1.61   90.839
6  4  LARCENY       2383   2015-06-15  2022-11-21  1.14        6719          2.82   1.636  87.675
7  4  LARCENY       1902   2015-06-16  2022-11-17  1.426       3631          1.909  1.085  69.978
7  5  LARCENY       1714   2015-06-15  2022-11-19  1.584       3139          1.831  1.127  63.061
8  3  ASSAULT       1693   2015-06-16  2022-11-20  1.604       2942          1.738  1.015  62.288
8  9  M/V ACCIDENT  1582   2015-06-16  2022-11-21  1.717       2835          1.792  1.104  58.205
8  3  INVESTIGATE   1572   2015-06-18  2022-11-21  1.727       2429          1.545  0.86   57.837
6  6  LARCENY       1473   2015-06-19  2022-11-20  1.842       2296          1.559  0.847  54.194
7  6  DRUGS         1197   2015-06-16  2022-11-21  2.27        2215          1.85   1.236  44.04
6  6  M/V ACCIDENT  1376   2015-06-15  2022-11-21  1.975       2181          1.585  0.913  50.625
Table 5: Statistics at daily resolution for top 10 records considering grid of cells
Detection of events registered erroneously
index   district  timestamp         latitude   longitude   category                  x   y
494498  A7        23/05/2022 10:49  42.361055  -71.028312  POLICE SERVICE INCIDENTS  10  3
495163  External  26/05/2022 18:35  42.330420  -71.126040  M/V ACCIDENT              2   6
497227  D4        06/06/2022 11:40  42.379410  -71.101430  DRUGS                     4   1
Table 6: Noisy events: errors related to event recording
After a quick examination, it turned out that:
• <42.21441, −71.00939> refers to a point in the sea
• <42.330420, −71.126040> is in Brookline (outside Boston)
• <42.379410, −71.101430> is in Somerville (outside Boston)
STL Decomposition-Algorithm
The algorithm uses an iterative approach that heavily relies on LOESS
smoothing (see next slide) in order to estimate the components. In
particular, at each iteration $k$:
1 Set $D^{(k+1)} = Y - T^{(k)}$, by removing the current trend component $T^{(k)}$
from the original time series $Y$.
2 Get $C^{(k+1)}$ by applying LOESS smoothing to the cycle-subseries of $D^{(k+1)}$ and
collecting all the values.
3 Get $L^{(k+1)}$ by applying to $C^{(k+1)}$ specific transformations based on
moving averages and again LOESS smoothing.
4 Get the seasonal component: $S^{(k+1)} = C^{(k+1)} - L^{(k+1)}$.
5 Set $DS^{(k+1)} = Y - S^{(k+1)}$.
6 Get the trend component $T^{(k+1)}$ by LOESS smoothing of
$DS^{(k+1)}$.
Example of LOESS
Figure 8: Local regression illustrated on some simulated data, where the blue
curve represents $f(x)$ from which the data were generated, and the light orange
curve corresponds to the local regression estimate $\hat{f}(x)$.
Algorithm 1 Basic LOESS Algorithm
Require: $x, y$: vectors of data points
Require: $q$: number of nearest neighbors
Require: $x_0$: point of estimation
Ensure: $\hat{y}_0$: estimated value at $x_0$
1: Sort data points based on proximity to $x_0$
2: Select the $q$ data points nearest to $x_0$
3: Compute $h$ as the distance of the $q$-th neighbor from $x_0$
4: for each point $i$ in the selected $q$ points do
5: Compute weight $w_i$ using the tricube weight function:
6: $w_i = \left(1 - \left(\frac{|x_i - x_0|}{h}\right)^3\right)^3$ if $|x_i - x_0| < h$, else $w_i = 0$
7: Fit a weighted least squares polynomial (usually linear) to the selected points using weights
$w_i$; in other words, find $b_0$ and $b_1$ that minimize:
$$\sum_{i=1}^{q} w_i \left(y_i - b_0 - b_1 x_i\right)^2$$
8: Evaluate the fitted polynomial at $x_0$ to obtain $\hat{y}_0$
9: return $\hat{y}_0$
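A minimal runnable sketch of Algorithm 1 in plain Python, using the closed-form solution of the weighted linear least squares problem. The function name `loess_point` is an assumption, and so are the two guards that are not part of the algorithm above: falling back to equal weights when every tricube weight is zero, and returning the weighted mean when all selected x values coincide.

```python
def loess_point(x, y, x0, q):
    """Local linear estimate at x0 using the q nearest neighbours."""
    if q < 2 or q > len(x):
        raise ValueError("q must be between 2 and len(x)")
    # Steps 1-3: q nearest points and bandwidth h.
    nearest = sorted(zip(x, y), key=lambda p: abs(p[0] - x0))[:q]
    h = abs(nearest[-1][0] - x0)

    def tricube(distance):
        u = distance / h if h > 0 else 0.0
        return (1 - u ** 3) ** 3 if u < 1 else 0.0

    # Steps 4-6: tricube weights.
    w = [tricube(abs(xi - x0)) for xi, _ in nearest]
    if sum(w) == 0:          # guard (assumption): all neighbours at distance h
        w = [1.0] * q
    # Step 7: weighted least squares line, closed form.
    sw = sum(w)
    mx = sum(wi * xi for wi, (xi, _) in zip(w, nearest)) / sw
    my = sum(wi * yi for wi, (_, yi) in zip(w, nearest)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, (xi, _) in zip(w, nearest))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, (xi, yi) in zip(w, nearest))
    b1 = sxy / sxx if sxx > 0 else 0.0   # guard (assumption): flat fit
    b0 = my - b1 * mx
    # Step 8: evaluate the fitted line at x0.
    return b0 + b1 * x0


# Exactly linear data: a local linear fit should reproduce it.
xs = list(range(10))
ys = [2 * v + 1 for v in xs]
estimate = loess_point(xs, ys, 4.5, q=5)
smoothed = [loess_point(xs, ys, v, q=5) for v in xs]
```

Calling the function at every time index, as in the last line, is how a whole series can be smoothed point by point.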
Smoothing with LOESS
Figure 9: In our case, we use LOESS to smooth timeseries
Cubic Spline
• Divide the input space into regions by setting K knots at uniform
quantiles of the data
• A knot identifies a transition between two consecutive regions
• Fit each region with a different cubic polynomial, ensuring that the
function and its first and second derivatives are continuous at the knots
Figure 10: Cubic Spline with 1 knot
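For the one-knot case of Figure 10, the continuity constraints can be built directly into the model by using the truncated power basis described in [2]. The symbols $\beta_j$ and $\xi$ (the knot) are notation introduced here for illustration.

```latex
f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)_+^3,
\qquad
(x - \xi)_+^3 =
\begin{cases}
(x - \xi)^3 & \text{if } x > \xi, \\
0 & \text{otherwise.}
\end{cases}
```

The extra term and its first two derivatives vanish at $x = \xi$, so $f$, $f'$ and $f''$ are continuous there; only the third derivative jumps, by $6\beta_4$.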
ARIMA Model Overview
ARIMA stands for AutoRegressive Integrated Moving Average.
• AR (AutoRegressive): Involves forecasting the variable of interest
using a linear combination of past values of the variable
• I (Integrated): Represents the number of differences needed to make
the series stationary (i.e., data values are replaced by the difference
between their values and the previous values).
• MA (Moving Average): Rather than using past values of the
forecast variable in a regression, a moving average model uses past
forecast errors in a regression-like model.
It is used for forecasting time series data that has been made stationary
through differencing.
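Putting the three parts together gives the usual ARIMA(p, d, q) equation, written here in the notation of [4]. The symbols $c$, $\phi_i$, $\theta_j$ and $\varepsilon_t$ are not defined elsewhere in these slides and are introduced for illustration.

```latex
y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p}
         + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}
         + \varepsilon_t
```

Here $y'_t$ is the series after differencing $d$ times (for $d = 1$, $y'_t = y_t - y_{t-1}$), $p$ is the order of the autoregressive part, $q$ is the order of the moving average part, and $\varepsilon_t$ is the forecast error at time $t$.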
Training Transformer-1
Figure 11: At the end of each epoch, we plot the RMSE for both the Training Set
and the Validation Set.
Training Transformer-2
Figure 12: (Zoom) At the end of each epoch, we plot the RMSE for both the
Training Set and the Validation Set.
Training Transformer-3
• The validation error is lower than the training error.
• A potential technical reason is related to regularization effects:
dropout can cause the model to behave differently during training
than during evaluation.
• During training, dropout randomly sets a fraction of inputs to zero,
which might increase the training error. During validation, dropout is
turned off, and the model might perform better, resulting in a lower
validation error.
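The difference between the two modes can be illustrated with a toy dropout function. The name `dropout`, the rescaling of the surviving inputs by 1 / (1 - p) (often called inverted dropout), and the seed are assumptions; the slides only state that a fraction of inputs is set to zero during training.

```python
import random


def dropout(values, p, training, rng):
    """Zero each input with probability p while training and rescale the
    survivors by 1 / (1 - p); return the inputs unchanged at evaluation."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10
train_out = dropout(activations, p=0.5, training=True, rng=rng)
eval_out = dropout(activations, p=0.5, training=False, rng=rng)
```

In training mode the output is a noisy version of the input, which is one way the training error can end up above the validation error.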

More Related Content

PPT
Week11-EvaluationMethods.ppt
PDF
Human Behaviour Understanding using Top-View RGB-D Data
PDF
SENTIMENT ANALYSIS AND GEOGRAPHICAL ANALYSIS FOR ENHANCING SECURITY
PDF
Introduction to basic statistics 1 Don Ozisco
PPTX
Big data as a source for official statistics
PDF
Strata Big data presentation
PPT
Gabriel Rissola: "Measuring the impact of eInclusion actors"
PDF
Introduction to basic statistics 1 Don Ozisco
Week11-EvaluationMethods.ppt
Human Behaviour Understanding using Top-View RGB-D Data
SENTIMENT ANALYSIS AND GEOGRAPHICAL ANALYSIS FOR ENHANCING SECURITY
Introduction to basic statistics 1 Don Ozisco
Big data as a source for official statistics
Strata Big data presentation
Gabriel Rissola: "Measuring the impact of eInclusion actors"
Introduction to basic statistics 1 Don Ozisco

Similar to Master's Presentation: Spatio-Temporal Data Analysis using Statistical Methods and Deep Learning.pdf (20)

PPTX
Semantic-based Process Analysis
PDF
Unveiling the Journey: Data Collection, Processing, and Analysis in Geography
PDF
Week_2_Lecture.pdf
PDF
Lecture_2_Stats.pdf
PPT
MAC411(A) Analysis in Communication Researc.ppt
PDF
5-Research Methodology (Quantitative and Qualitative Approach-Data Collection...
PDF
Introduction THE ANALYSIS OF AND DATA ANALYTICS
PPTX
Chapter 16
PDF
Leicester City Covid-19 Testing Programme webinar
PDF
A Survey And Taxonomy Of Distributed Data Mining Research Studies A Systemat...
PDF
s40537-015-0030-3-data-analytics-a-survey.pdf
PDF
Understanding Public Safety Trends in Calgary: A Data Mining Perspective
PDF
Simplified planning technique
PDF
Temporal models for mining, ranking and recommendation in the Web
PDF
PWC: Data Driven Cities [2016]
PPTX
Business Statistics learning with excellent
PPTX
DS-Intro.pptx
PDF
Emotional Social Signals for Search Ranking
PDF
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Semantic-based Process Analysis
Unveiling the Journey: Data Collection, Processing, and Analysis in Geography
Week_2_Lecture.pdf
Lecture_2_Stats.pdf
MAC411(A) Analysis in Communication Researc.ppt
5-Research Methodology (Quantitative and Qualitative Approach-Data Collection...
Introduction THE ANALYSIS OF AND DATA ANALYTICS
Chapter 16
Leicester City Covid-19 Testing Programme webinar
A Survey And Taxonomy Of Distributed Data Mining Research Studies A Systemat...
s40537-015-0030-3-data-analytics-a-survey.pdf
Understanding Public Safety Trends in Calgary: A Data Mining Perspective
Simplified planning technique
Temporal models for mining, ranking and recommendation in the Web
PWC: Data Driven Cities [2016]
Business Statistics learning with excellent
DS-Intro.pptx
Emotional Social Signals for Search Ranking
Data collection, Data Integration, Data Understanding e Data Cleaning & Prepa...
Ad


Master's Presentation: Spatio-Temporal Data Analysis using Statistical Methods and Deep Learning.pdf

  • 1. Problem Definition Exploratory Data Analysis Statistical Methods Deep Learning Conclusions References Spatio-Temporal Data Analysis using Statistical Methods and Deep Learning Diego Ercoli Supervisors: Prof. Roberta Sirovich Dr. Lorenzo Bongiovanni Master’s Degree in Computer Science (LM-18), University of Turin Turin, 26 October 2023 Diego Ercoli Spatio-Temporal Data Analysis using Statistical Methods and Deep Learning 0 / 27
  • 2. Table of Contents 1 Problem Definition 2 Exploratory Data Analysis 3 Statistical Methods 4 Deep Learning 5 Conclusions
  • 4. Motivation • Criminal activities have a direct impact on the quality of life as well as on the socio-economic development of a nation [1] • Governments are inclined to use advanced technologies to tackle crime and prevent it in advance • Methods for the analysis and prediction of criminal activities are the objective of this dissertation
  • 6. Objectives • Most of the work was devoted to analyzing the Boston crimes dataset • The Boston Police Department reported all crime events that happened from 2015 to 2022 • Two kinds of approach are used for their analysis: • Classical Statistical Methods • Advanced Deep Learning Architectures: the Transformer Model • This thesis work has been developed in collaboration with LINKS Foundation (footnote 1). Footnote 1: LINKS Foundation operates at the heart of the Turin research and innovation ecosystem, in a solid international network. Its target is to contribute to technological and socio-economic progress through advanced processes of applied research.
  • 9. Dataset For each crime event we have taken into account only the following features: • the pair <latitude, longitude>, which gives the exact position where the event happened. • the timestamp of the event. • the category of the crime (e.g. larceny, assault, etc.). This is a clear example of spatio-temporal data: a process evolving both in space and in time.
  • 11. Discretization A grid strategy has been adopted: the whole area of Boston is partitioned into a regular grid of rectangular regions (footnote 2). A count is taken in each region on a daily basis. In other words, crimes have been aggregated: • temporally, at daily granularity • spatially, over cells of 1 km × 1 km. Footnote 2: From now on, we call them cells.
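The aggregation described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the grid origin, the degree-to-kilometre conversion and the function names `cell_of` and `daily_counts` are hypothetical and do not come from the thesis, and the orientation of the cell indices may differ from the one used in the slides.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical south-west corner of the grid (an assumption, not from the slides).
ORIGIN_LAT, ORIGIN_LON = 42.22, -71.20
CELL_KM = 1.0
KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude in km


def cell_of(lat, lon):
    """Map a coordinate to the (x, y) indices of an approximately 1 km x 1 km cell."""
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(ORIGIN_LAT))
    x = int((lon - ORIGIN_LON) * km_per_deg_lon // CELL_KM)
    y = int((lat - ORIGIN_LAT) * KM_PER_DEG_LAT // CELL_KM)
    return x, y


def daily_counts(events):
    """Count events per (cell, day); each event is (latitude, longitude, timestamp)."""
    counts = Counter()
    for lat, lon, ts in events:
        counts[(cell_of(lat, lon), ts.date())] += 1
    return counts


# Made-up events, used only to show the call.
events = [
    (42.3500, -71.0600, datetime(2022, 1, 1, 10, 0)),
    (42.3501, -71.0601, datetime(2022, 1, 1, 23, 0)),
    (42.3500, -71.0600, datetime(2022, 1, 2, 1, 0)),
]
counts = daily_counts(events)
```

Each key of `counts` identifies one cell on one day, which is the unit of observation used in the rest of the analysis.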
  • 13. The Grid 0 1 2 3 4 5 6 7 8 9 10 0 404 1033 140 616 976 1 1 2486 6294 3017 5817 2 184 445 52 1656 9026 4843 2553 3 1919 3168 4690 196 2 620 6317 26295 974 1 4 6688 5091 4754 1146 1560 7421 16327 14737 7484 1986 354 5 2560 2000 3164 4932 12434 15645 8314 4047 1923 6 56 1 93 6297 7109 18755 10347 8412 2994 1693 7 7 1192 6764 7818 9332 9051 6312 1338 24 8 143 2001 6463 6842 12354 10354 7588 267 9 81 64 496 1679 1516 415 11472 11396 15559 624 10 785 522 975 3230 274 4328 8235 9078 6102 1683 11 1370 4333 2810 3184 867 13556 7158 6748 1973 1461 12 374 1381 1292 2453 2159 6649 2427 2664 142 13 1422 346 628 2918 3095 2225 14 469 966 1662 6721 94 Table 1: Total number of events in each cell. The city of Boston is not a perfect square: some cells do not contain any event, since they may lie outside the city (or be crossed by a river).
  • 14. Heatmap of the grid (a) City map of Boston (b) Heatmap of the grid Figure 1: Comparison between heatmap and city map
  • 15. Univariate Timeseries Focusing on a particular region (the total area of Boston or a specific cell) and filtering on one or more crime categories, we get a univariate time series: $X_t = x_t$, for $t = 1, 2, \dots, n$, where: • $x_t \geq 0$, since there could be days in which no event happens, • $t$ refers to a day (daily granularity), • $n$ is the total number of days.
  • 16. Average Intertime For each time series, to get a general idea of how frequently events occur, the average intertime has been computed in the following way: 1 First, record the days on which at least one event occurred and sort them chronologically: $d_1, d_2, \dots, d_m$, where $m \leq n$. 2 Calculate the time differences between consecutive days to get the intertime values. 3 Average these intertime values. Mathematically, this can be written as: $\mu = \frac{1}{m-1} \sum_{i=1}^{m-1} (d_{i+1} - d_i)$ (1)
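The three steps above can be sketched in a few lines. The function name `average_intertime` and the sample dates are illustrative assumptions, not taken from the thesis code.

```python
from datetime import date


def average_intertime(event_days):
    """Mean gap, in days, between consecutive distinct days with at least one event."""
    days = sorted(set(event_days))  # step 1: distinct event days in chronological order
    if len(days) < 2:
        raise ValueError("need at least two distinct event days")
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]  # step 2
    return sum(gaps) / len(gaps)  # step 3


# Made-up example: events on 1, 2 and 5 January (two of them on the 5th).
sample = [date(2022, 1, 1), date(2022, 1, 2), date(2022, 1, 5), date(2022, 1, 5)]
mu = average_intertime(sample)  # gaps are 1 and 3 days
```

A value close to 1 means that events occur almost every day, while larger values indicate sparser series.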
  • 17. Categories Figure 2: Barchart for categories
  • 18. Comparison of average intertime (a) Larceny: the most frequent one (b) Robbery: the 1st in the middle Figure 3: Comparison of heatmaps in daily resolution between 4 categories. Color refers to the number of crime events, while the label refers to the average intertime.
  • 19. Comparison of average intertime (a) Violation: the 2nd in the middle (b) Kidnapping: the least frequent one Figure 3: Comparison of heatmaps in daily resolution between 4 categories. Color refers to the number of crime events, while the label refers to the average intertime.
  • 21. Modeling assumptions As a first attempt, we model the problem assuming spatial independence. Notes: • Intuitively, we expect that there is no interaction or interdependence among the events that happened in different cells. In other words, an event at cell x doesn’t encourage or inhibit the occurrence of other events in the neighborhood of x. • This allows us to employ classical statistical methods, which have long influenced the forecasting domain (e.g. the ARIMA model). • The purpose is to create an initial baseline, against which the effectiveness of more sophisticated models can be evaluated.
  • 22. Core ingredients Our methodology relies on well-established techniques that are taught in statistics and econometrics courses worldwide: • STL Decomposition based on the LOESS algorithm (see next slide) • Cubic spline: fitting different third-degree polynomials in different intervals of the input space • ARIMA
  • 23. STL Decomposition-Example Figure 4: STL Decomposition-Example
  • 24. Methodology Assuming a splitting point has been identified, all the past observations are analyzed in order to forecast the n subsequent days. 1 We perform the STL decomposition of the training set, in order to identify the three components of a time series: seasonal, trend and residuals. 2 We fit an ARIMA model on the remainder in order to forecast this component. 3 We use the cubic spline method to forecast the trend component. 4 We extract the part of the seasonal component corresponding in time to the forecasting interval shifted one year back. 5 We sum up the last three computed quantities in order to get the forecast.
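Steps 4 and 5 can be illustrated with a small sketch. It assumes that the trend forecast (step 3), the residual forecast (step 2) and the seasonal component of the training set (step 1) have already been computed elsewhere and are available as dictionaries keyed by date. The function name `combine_forecast`, the dictionary layout and the fixed 365-day shift are assumptions made for illustration, not the thesis implementation.

```python
from datetime import date, timedelta


def combine_forecast(forecast_days, trend_fc, resid_fc, train_seasonal, shift_days=365):
    """Add, for each forecast day, the trend forecast, the residual forecast
    and the seasonal value observed shift_days earlier in the training set."""
    forecast = {}
    for day in forecast_days:
        seasonal = train_seasonal[day - timedelta(days=shift_days)]  # step 4
        forecast[day] = trend_fc[day] + resid_fc[day] + seasonal     # step 5
    return forecast


# Made-up values, used only to show the call.
day = date(2022, 1, 1)
result = combine_forecast(
    [day],
    trend_fc={day: 20.0},
    resid_fc={day: 0.5},
    train_seasonal={date(2021, 1, 1): 2.0},
)
```

Keeping the recombination separate from the three component models makes it easy to swap any one of them without touching the others.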
  • 25. Design Pattern-Template Method Figure 5: Template Method is a behavioral design pattern that defines the skeleton of an algorithm in the superclass but lets subclasses override specific steps of the algorithm without changing its structure.
  • 26. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (a) STL Decomposition of training set
  • 27. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (b) Forecasting of residual with ARIMA
  • 28. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (c) Trend forecast using cubic spline
  • 29. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (d) Seasonal component with post-smoothing
  • 30. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (e) Training set, test set, forecasting (Post-smoothing the seasonal component)
  • 31. Methodology in Action Long-term forecasting of Larceny Crime (year 2022) (f) Performances: RMSE = 6.19
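The slides report performance as RMSE without defining it. A minimal sketch of the standard root mean squared error follows, assuming that this standard definition is the one intended. The function name `rmse` and the sample values are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up daily counts and forecasts; every forecast is off by 2.
error = rmse([1, 2, 3, 4], [3, 4, 5, 6])
```

Because the errors are squared before averaging, large misses on a few days weigh more than many small ones.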
  • 32. Sliding Window
    train: 2015-06-15 -> 2021-12-31  test: 2022-01-01 -> 2022-01-07
    train: 2015-06-15 -> 2022-01-07  test: 2022-01-08 -> 2022-01-14
    train: 2015-06-15 -> 2022-01-14  test: 2022-01-15 -> 2022-01-21
    train: 2015-06-15 -> 2022-01-21  test: 2022-01-22 -> 2022-01-28
    train: 2015-06-15 -> 2022-01-28  test: 2022-01-29 -> 2022-02-04
    ... continues ...
    train: 2015-06-15 -> 2022-12-23  test: 2022-12-24 -> 2022-12-31
    Listing 1: Log of sliding window
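The log above can be reproduced with a short sketch. The training start stays fixed while the training end advances by one week at a time, and the leftover day at the end of the year is folded into the final test window, as the last log line suggests. The function name `weekly_splits` and the folding rule are assumptions inferred from the log, not taken from the thesis code.

```python
from datetime import date, timedelta


def weekly_splits(train_start, first_test_day, last_test_day, horizon=7):
    """Return (train_start, train_end, test_start, test_end) tuples."""
    splits = []
    test_start = first_test_day
    while test_start <= last_test_day:
        test_end = test_start + timedelta(days=horizon - 1)
        # Fold a leftover shorter than one horizon into the final window.
        if (last_test_day - test_end).days < horizon:
            test_end = last_test_day
        train_end = test_start - timedelta(days=1)
        splits.append((train_start, train_end, test_start, test_end))
        test_start = test_end + timedelta(days=1)
    return splits


splits = weekly_splits(date(2015, 6, 15), date(2022, 1, 1), date(2022, 12, 31))
for train_start, train_end, test_start, test_end in splits[:2]:
    print(f"train: {train_start} -> {train_end}  test: {test_start} -> {test_end}")
```

Every test week is forecast using only data that precedes it, so no future information leaks into the model.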
  • 33. Results RMSE = 1.55 Figure 6: Color: number of crimes, label: RMSE value
  • 34. Deep Learning 1 Problem Definition 2 Exploratory Data Analysis 3 Statistical Methods 4 Deep Learning 5 Conclusions
  • 35. Transformer • The Transformer model, introduced by Vaswani et al. in 2017 [5], revolutionized the field of natural language processing (NLP) with an architecture built on the attention mechanism. • Beyond NLP tasks, it has also been applied successfully in other fields, such as time series forecasting.
  • 36. Feature Embedding $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$, $v = [a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}]$ • When the Transformer model is used for non-text data, forming the input still involves converting the raw input into a sequence of numerical vectors. • In particular, for each day, we populate a matrix (grid) with the number of crimes that happened in the various cells. • The embedding is generated by flattening the matrix by rows.
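Flattening by rows can be sketched as below. The function name `flatten_by_rows` and the sample grid are illustrative assumptions.

```python
def flatten_by_rows(grid):
    """Concatenate the rows of a 2D grid of counts into a single vector."""
    return [count for row in grid for count in row]


# A made-up 2 x 3 grid of daily counts, matching the shape of A above.
daily_grid = [
    [1, 2, 3],
    [4, 5, 6],
]
vector = flatten_by_rows(daily_grid)
```

Since the grid shape is fixed, position k of the vector always refers to the same cell, which lets the model relate counts across cells.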
  • 37. The Model Figure 7: The Encoder receives the feature embedding vectors of the previous 30 days. The Decoder outputs the forecast for the following 7 days.
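One way to build the 30-day inputs and 7-day targets from the sequence of daily vectors is sketched below. The function name `make_windows` and the stride of one day between consecutive samples are assumptions, since the slides do not say how the samples were generated.

```python
def make_windows(daily_vectors, n_in=30, n_out=7):
    """Pair each block of n_in consecutive days with the n_out days that follow it."""
    samples = []
    last_start = len(daily_vectors) - n_in - n_out
    for start in range(last_start + 1):
        encoder_input = daily_vectors[start:start + n_in]
        decoder_target = daily_vectors[start + n_in:start + n_in + n_out]
        samples.append((encoder_input, decoder_target))
    return samples


# Made-up series of 40 one-cell days, used only to show the shapes.
series = [[day] for day in range(40)]
samples = make_windows(series)
```

With 40 days of data this yields 4 samples, each holding 30 input vectors and 7 target vectors.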
  • 38. Splitting the dataset Table 2: Size of training, validation and test sets with related percentages
    training set   validation set   test set
    2191           213              325
    80.29%         7.81%            11.91%
  • 39. Training process • Each epoch corresponds to one complete pass through the samples of the training dataset using mini-batch gradient descent. • At the end of each epoch, performance is evaluated on both the training set and the validation set. • The model with the best performance on the validation set is kept. • The best model is evaluated on the test set (unbiased estimate of generalization).
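The selection procedure can be sketched independently of any deep learning framework. In this sketch `train_one_epoch` and `evaluate` are hypothetical callables supplied by the caller, the error is assumed to be a quantity to minimize (such as RMSE), and the function name `fit_with_model_selection` is illustrative.

```python
import copy


def fit_with_model_selection(model, train_one_epoch, evaluate, n_epochs):
    """Train for n_epochs and keep a copy of the model with the lowest validation error."""
    best_val, best_model, history = float("inf"), None, []
    for epoch in range(n_epochs):
        train_one_epoch(model)                # one full pass over the training set
        train_err = evaluate(model, "train")
        val_err = evaluate(model, "val")
        history.append((epoch, train_err, val_err))
        if val_err < best_val:                # new best on the validation set
            best_val = val_err
            best_model = copy.deepcopy(model)
    return best_model, best_val, history
```

The test set is touched only once, after this loop, by evaluating the returned `best_model` on it.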
  • 40. Results and comparison Table 3: Comparison of performances between the different models
    Model                  RMSE (Test set)
    Statistical Approach   1.55
    Transformer            1.264
    Spatial correlation is a key factor!
  • 42. Statistical Methodology Pros • Explainability: how the model makes its decisions can be presented in terms understandable to a human. • Good enough performance despite its simplicity. • Easy to implement. Downsides • Spatial independence assumption. • STL decomposition isn’t able to work for all cells.
  • 44. Deep Learning Pros • Learns interactions between space and time through the attention mechanism. • Learning is automatic. Downsides • Black box. • Requires a large amount of training data. • It could demand significant computational power.
  • 46. References I [1] S. Hossain, A. Abtahee, I. Kashem, M. M. Hoque, and I. H. Sarker, “Crime prediction using spatio-temporal data,” in Computing Science, Communication and Security: First International Conference, COMS2 2020, Gujarat, India, March 26–27, 2020, Revised Selected Papers 1, Springer, 2020, pp. 277–289. [2] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: with Applications in R. Springer, 2013. [Online]. Available: https://faculty.marshall.usc.edu/gareth-james/ISL/. [3] R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning, “STL: A seasonal-trend decomposition procedure based on loess (with discussion),” Journal of Official Statistics, vol. 6, pp. 3–73, 1990. [4] R. Hyndman and G. Athanasopoulos, Forecasting: Principles and Practice, 3rd ed. Australia: OTexts, 2021.
  • 47. References II [5] A. Vaswani, N. Shazeer, N. Parmar, et al., “Attention is all you need,” in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, et al., Eds., vol. 30, Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
  • 49. Data Analysis Results
    category        #days  start_date  end_date    inter_time  count_crimes  avg_day  stdv   %days   avg_global
    LARCENY         2717   2015-06-15  2022-11-21  1.0         72623         26.729   7.37   99.963  26.719
    M/V ACCIDENT    2718   2015-06-15  2022-11-22  1.0         68564         25.226   6.752  100.0   25.226
    INVESTIGATE     2718   2015-06-15  2022-11-22  1.0         66020         24.29    7.068  100.0   24.29
    ASSAULT         2717   2015-06-15  2022-11-21  1.0         42837         15.766   6.007  99.963  15.76
    VAL             2714   2015-06-15  2022-11-21  1.001       37190         13.703   5.244  99.853  13.683
    DRUGS           2680   2015-06-15  2022-11-21  1.014       29095         10.856   6.63   98.602  10.705
    VANDALISM       2716   2015-06-15  2022-11-21  1.0         28105         10.348   4.377  99.926  10.34
    OTHER           2713   2015-06-15  2022-11-21  1.001       24630         9.079    3.897  99.816  9.062
    VERBAL DISPUTE  2692   2015-06-15  2022-11-21  1.009       21728         8.071    4.552  99.043  7.994
    FRAUD           2696   2015-06-15  2022-11-21  1.008       19896         7.38     4.46   99.191  7.32
    Table 4: Overall statistics at daily resolution for top 10 crimes
    x  y  category      #days  start_date  end_date    inter_time  count_crimes  mean   stdv   %days
    8  3  LARCENY       2469   2015-06-15  2022-11-21  1.1         6868          2.782  1.61   90.839
    6  4  LARCENY       2383   2015-06-15  2022-11-21  1.14        6719          2.82   1.636  87.675
    7  4  LARCENY       1902   2015-06-16  2022-11-17  1.426       3631          1.909  1.085  69.978
    7  5  LARCENY       1714   2015-06-15  2022-11-19  1.584       3139          1.831  1.127  63.061
    8  3  ASSAULT       1693   2015-06-16  2022-11-20  1.604       2942          1.738  1.015  62.288
    8  9  M/V ACCIDENT  1582   2015-06-16  2022-11-21  1.717       2835          1.792  1.104  58.205
    8  3  INVESTIGATE   1572   2015-06-18  2022-11-21  1.727       2429          1.545  0.86   57.837
    6  6  LARCENY       1473   2015-06-19  2022-11-20  1.842       2296          1.559  0.847  54.194
    7  6  DRUGS         1197   2015-06-16  2022-11-21  2.27        2215          1.85   1.236  44.04
    6  6  M/V ACCIDENT  1376   2015-06-15  2022-11-21  1.975       2181          1.585  0.913  50.625
    Table 5: Statistics at daily resolution for top 10 records considering grid of cells
  • 50. Detection of events registered erroneously
    index   district  timestamp         latitude   longitude   category                  x   y
    494498  A7        23/05/2022 10:49  42.361055  -71.028312  POLICE SERVICE INCIDENTS  10  3
    495163  External  26/05/2022 18:35  42.330420  -71.126040  M/V ACCIDENT              2   6
    497227  D4        06/06/2022 11:40  42.379410  -71.101430  DRUGS                     4   1
    Table 6: Noisy events: errors related to event recording
    After a quick examination, it turned out that: • 42.21441, -71.00939 refers to a point in the sea • 42.330420, -71.126040 is in Brookline (outside Boston) • 42.379410, -71.101430 is in Somerville (outside Boston)
  • 51. STL Decomposition-Algorithm The algorithm uses an iterative approach that relies heavily on LOESS smoothing (see next slide) to estimate the components. In particular, at each iteration: 1 Set $D^{(k+1)} = Y - T^{(k)}$, removing the current trend component $T^{(k)}$ from the original time series $Y$. 2 Get $C^{(k+1)}$ by applying LOESS smoothing to the cycle-subseries and collecting all the values. 3 Get $L^{(k+1)}$ by applying to $C^{(k+1)}$ specific transformations based on moving averages and, again, LOESS smoothing. 4 Get the seasonal component: $S^{(k+1)} = C^{(k+1)} - L^{(k+1)}$. 5 Set $DS^{(k+1)} = Y - S^{(k+1)}$. 6 Get the trend component $T^{(k+1)}$ by applying LOESS smoothing to $DS^{(k+1)}$.
  • 52. Example of LOESS Figure 8: Local regression illustrated on some simulated data, where the blue curve represents $f(x)$ from which the data were generated, and the light orange curve corresponds to the local regression estimate $\hat{f}(x)$.
  • 53. Algorithm 1 Basic LOESS Algorithm
    Require: $x, y$: vectors of data points
    Require: $q$: number of nearest neighbors
    Require: $x_0$: point of estimation
    Ensure: $\hat{y}_0$: estimated value at $x_0$
    1: Sort data points based on proximity to $x_0$
    2: Select the $q$ data points nearest to $x_0$
    3: Compute $h$ as the distance of the $q$-th neighbor
    4: for each point $i$ in the selected $q$ points do
    5:   Compute weight $w_i$ using the tricube weight function:
    6:   $w_i = \left(1 - \left(\frac{|x_i - x_0|}{h}\right)^3\right)^3$ if $|x_i - x_0| < h$, else $w_i = 0$
    7: Fit a weighted least squares polynomial (usually linear) to the selected points using weights $w_i$; in other words, find $b_0$ and $b_1$ that minimize $\sum_{i=1}^{n} w_i (y_i - b_0 - b_1 x_i)^2$
    8: Evaluate the fitted polynomial at $x_0$ to obtain $\hat{y}_0$
    9: return $\hat{y}_0$
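The algorithm above can be sketched in plain Python for the local linear case, using the closed-form solution of weighted least squares for a straight line. The function name `loess_at` and the fallbacks for degenerate neighborhoods are assumptions added for illustration.

```python
def loess_at(x, y, x0, q):
    """Local linear LOESS estimate at x0 using the q nearest neighbours."""
    if not 2 <= q <= len(x):
        raise ValueError("q must be between 2 and len(x)")
    # Steps 1-3: the q nearest points and the distance h of the q-th one.
    nearest = sorted(range(len(x)), key=lambda i: abs(x[i] - x0))[:q]
    h = abs(x[nearest[-1]] - x0)
    # Steps 4-6: tricube weights (zero at distance h or beyond).
    weights = []
    for i in nearest:
        u = abs(x[i] - x0) / h if h > 0 else 0.0
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
    total = sum(weights)
    if total == 0:  # every neighbour sits exactly at distance h
        return sum(y[i] for i in nearest) / q
    # Step 7: closed-form weighted least squares for b0 + b1 * x.
    mean_x = sum(w * x[i] for w, i in zip(weights, nearest)) / total
    mean_y = sum(w * y[i] for w, i in zip(weights, nearest)) / total
    sxx = sum(w * (x[i] - mean_x) ** 2 for w, i in zip(weights, nearest))
    sxy = sum(w * (x[i] - mean_x) * (y[i] - mean_y) for w, i in zip(weights, nearest))
    b1 = sxy / sxx if sxx > 0 else 0.0
    b0 = mean_y - b1 * mean_x
    # Step 8: evaluate the local line at x0.
    return b0 + b1 * x0


# Made-up data lying exactly on y = 2x + 1, so the local line recovers it.
xs = [float(v) for v in range(10)]
ys = [2 * v + 1 for v in xs]
estimate = loess_at(xs, ys, 4.5, q=5)
```

Calling `loess_at` at every observed time index gives the smoothed series; a larger `q` produces a smoother curve.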
  • 54. Smoothing with LOESS Figure 9: In our case, we use LOESS to smooth timeseries
  • 55. Cubic Spline • Divide the input space into regions by setting K knots at uniform quantiles of the data • A knot identifies a transition between two consecutive regions • Fit each region with a different cubic polynomial, ensuring continuous first and second derivatives Figure 10: Cubic Spline with 1 knot
  • 56. ARIMA Model Overview ARIMA stands for AutoRegressive Integrated Moving Average. • AR (AutoRegressive): Involves forecasting the variable of interest using a linear combination of past values of the variable • I (Integrated): Represents the number of differences needed to make the series stationary (i.e., data values are replaced by the difference between their values and the previous values). • MA (Moving Average): Rather than using past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model. It is used for forecasting time series data that has been made stationary through differencing.
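The three parts can be combined in a single equation. The form below is the standard non-seasonal ARIMA(p, d, q) model as given in Hyndman and Athanasopoulos [4]. The symbols are assumptions of this sketch rather than notation from the slides: $y'_t$ is the series after differencing $d$ times, $c$ is a constant, $\phi_i$ and $\theta_j$ are the autoregressive and moving-average coefficients, and $\varepsilon_t$ is white noise.

```latex
y'_t = c
     + \underbrace{\phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p}}_{\text{AR}(p)}
     + \underbrace{\theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}}_{\text{MA}(q)}
     + \varepsilon_t
```

For $d = 1$ the differenced series is $y'_t = y_t - y_{t-1}$, which corresponds to the "I" part described above.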
  • 57. Training Transformer-1 Figure 11: At the end of each epoch, we plot the RMSE for both the Training Set and the Validation Set.
  • 58. Training Transformer-2 Figure 12: (Zoom) At the end of each epoch, we plot the RMSE for both the Training Set and the Validation Set.
  • 59. Training Transformer-3 • The validation error is lower than the training error. • A potential technical reason is related to regularization effects: dropout can cause the model to behave differently during training than during evaluation. • During training, dropout randomly sets a fraction of inputs to zero, which might increase the training error. During validation, dropout is turned off, and the model might perform better, resulting in a lower validation error.
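The different behavior of dropout in the two modes can be illustrated with a small sketch. It uses the common "inverted dropout" formulation, in which surviving values are rescaled by 1 / (1 - p) during training. Whether the thesis model uses exactly this variant is an assumption, and the function name `dropout` is illustrative.

```python
import random


def dropout(values, p, training, rng=random):
    """Zero each value with probability p while training; do nothing at evaluation time."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(values)  # evaluation mode: inputs pass through unchanged
    keep = 1.0 - p
    # Surviving values are rescaled so that the expected sum stays the same.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0] * 1000
train_out = dropout(activations, p=0.5, training=True, rng=random.Random(0))
eval_out = dropout(activations, p=0.5, training=False)
```

During training the network sees a noisy, partly zeroed version of its activations, while at validation time it sees the full signal, which is consistent with the lower validation error discussed above.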