RDataMining slides-time-series-analysis

Time Series Analysis with R
Yanchang Zhao
http://www.RDataMining.com
R and Data Mining Course
Beijing University of Posts and Telecommunications,
Beijing, China
July 2019

Contents
Introduction
Time Series Decomposition
Time Series Forecasting
Time Series Clustering
Time Series Classification
Online Resources

Time Series Analysis with R ∗
time series data in R
time series decomposition, forecasting, clustering and classification
autoregressive integrated moving average (ARIMA) model
Dynamic Time Warping (DTW)
Discrete Wavelet Transform (DWT)
k-NN classification
∗ Chapter 8: Time Series Analysis and Mining, in book R and Data Mining: Examples and Case Studies. http://www.rdatamining.com/docs/RDataMining-book.pdf

Time Series Data in R
class ts
represents data which has been sampled at equispaced points in time
frequency=7: daily observations with a weekly cycle
frequency=12: a monthly series (yearly cycle)
frequency=4: a quarterly series (yearly cycle)

Time Series Data in R
## an example of time series data
a <- ts(1:20, frequency = 12, start = c(2011, 3))
print(a)
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2011           1   2   3   4   5   6   7   8   9  10
## 2012  11  12  13  14  15  16  17  18  19  20
str(a)
##  Time-Series [1:20] from 2011 to 2013: 1 2 3 4 5 6 7 8 9 10 ...
attributes(a)
## $tsp
## [1] 2011.167 2012.750   12.000
##
## $class
## [1] "ts"
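The attributes printed above can also be read through accessor functions. The following is a minimal sketch in base R; the quarterly series `q` and its values are made up for illustration and do not come from the slides.

```r
# Hypothetical quarterly series: 8 values starting in Q2 of 2015
q <- ts(c(5, 7, 9, 6, 8, 10, 12, 9), frequency = 4, start = c(2015, 2))

frequency(q)   # observations per unit of time (4 = quarterly)
start(q)       # year and position within the year of the first value
end(q)         # year and position within the year of the last value
cycle(q)       # position of each observation within its cycle

# window() extracts a sub-series by time; here, the four quarters of 2016
q2016 <- window(q, start = c(2016, 1), end = c(2016, 4))
print(q2016)
```

Working with `window()` instead of integer indices keeps the time attributes attached to the extracted values.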

What is Time Series Decomposition
To decompose a time series into components [Brockwell and Davis, 2016]:
Trend component: long term trend
Seasonal component: seasonal variation
Cyclical component: repeated but non-periodic fluctuations
Irregular component: the residuals
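As a rough illustration of the additive case, the sketch below checks that the components returned by `decompose()` add back up to the observed series. It uses the built-in `AirPassengers` data; note that `decompose()` returns trend, seasonal and random parts only, so any cyclical behaviour is not separated out here.

```r
# AirPassengers ships with R (package 'datasets')
f <- decompose(AirPassengers, type = "additive")

# Additive model: observed = trend + seasonal + random
rebuilt <- f$trend + f$seasonal + f$random

# The moving-average trend is undefined at both ends of the series,
# so compare only the positions where all components are available
ok <- !is.na(rebuilt)
max_abs_error <- max(abs(AirPassengers[ok] - rebuilt[ok]))
print(max_abs_error)

# The 12 seasonal figures are centred, so they sum to (almost) zero
print(sum(f$figure))
```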

Data AirPassengers
Data AirPassengers: monthly totals of international airline passengers (the Box & Jenkins airline data), 1949 to 1960. It has 144 (= 12 × 12) values.
## plot the built-in time series data
plot(AirPassengers)
[Figure: AirPassengers over time; x-axis from 1950 to 1960, y-axis from 100 to 600]

Decomposition
## time series decomposition
apts <- ts(AirPassengers, frequency = 12)
f <- decompose(apts)
plot(f$figure, type = "b") # seasonal figures
[Figure: the 12 seasonal figures f$figure; x-axis from 2 to 12, y-axis from -40 to 60]

Decomposition
plot(f)
[Figure: Decomposition of additive time series; panels: observed, trend, seasonal, random; x-axis: Time]

Time Series Forecasting
To forecast future events based on known past data
For example, to predict the price of a stock based on its past
performance
Popular models
Autoregressive moving average (ARMA)
Autoregressive integrated moving average (ARIMA)
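To make the "autoregressive" and "integrated" parts of these models concrete, here is a minimal sketch in base R. The coefficient `phi`, the series length and the seed are illustrative assumptions, not values from the slides.

```r
set.seed(42)                 # fixed seed so the sketch is reproducible
n   <- 200
phi <- 0.7                   # assumed AR(1) coefficient (illustrative)
e   <- rnorm(n)              # white-noise innovations

# AR(1) by explicit recursion: x[t] = phi * x[t-1] + e[t]
x <- numeric(n)
x[1] <- e[1]
for (t in 2:n) x[t] <- phi * x[t - 1] + e[t]

# The same recursion via stats::filter()
x_filter <- as.numeric(stats::filter(e, filter = phi, method = "recursive"))

# "Integrated" (the I in ARIMA): the model is fitted to the differenced
# series; cumulative sums undo the differencing
y  <- cumsum(x)              # an integrated series
dy <- diff(y)                # differencing recovers x[2], ..., x[n]
```

An ARIMA model with one order of differencing would treat `y` by modelling `dy` as an ARMA process.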

Forecasting
## build an ARIMA model: non-seasonal order (1,0,0),
## seasonal order (2,1,0) with a period of 12 months
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))
## make forecast for the next 24 months
fore <- predict(fit, n.ahead = 24)
## approximate error bounds at 95% confidence level (+/- 2 standard errors)
upper.bound <- fore$pred + 2 * fore$se
lower.bound <- fore$pred - 2 * fore$se
## plot forecast results
ts.plot(AirPassengers, fore$pred, upper.bound, lower.bound,
        col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", col = c(1, 2, 4), lty = c(1, 1, 2),
       c("Actual", "Forecast", "Error Bounds (95% Confidence)"))
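The factor 2 used for the error bounds is a rounded version of the normal quantile. A small sketch of the same bounds with the exact quantile follows, assuming approximately normal forecast errors; the names `z`, `upper` and `lower` are introduced here for illustration.

```r
# Same model as on the slide, fitted to the built-in AirPassengers data
fit  <- arima(AirPassengers, order = c(1, 0, 0),
              seasonal = list(order = c(2, 1, 0), period = 12))
fore <- predict(fit, n.ahead = 24)

z <- qnorm(0.975)            # about 1.96 for a two-sided 95% interval
upper <- fore$pred + z * fore$se
lower <- fore$pred - z * fore$se

# Forecasts and bounds for the first few months
head(data.frame(forecast = as.numeric(fore$pred),
                lower    = as.numeric(lower),
                upper    = as.numeric(upper)))
```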

Forecasting
[Figure: AirPassengers with the 24-month forecast; legend: Actual, Forecast, Error Bounds (95% Confidence); x-axis: Time, 1950 to 1962; y-axis from 100 to 700]

Time Series Clustering
To partition time series data into groups based on similarity or
distance, so that time series in the same cluster are similar
Measure of distance/dissimilarity
Euclidean distance
Manhattan distance
Maximum norm
Hamming distance
The angle between two vectors (inner product)
Dynamic Time Warping (DTW) distance
...
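Several of the measures listed above can be computed directly in base R. The sketch below uses two short made-up vectors `x` and `y`; DTW is left out because it needs its own algorithm.

```r
x <- c(1, 2, 3)
y <- c(4, 6, 3)
m <- rbind(x, y)

euclidean <- as.numeric(dist(m, method = "euclidean"))  # sqrt(9 + 16 + 0) = 5
manhattan <- as.numeric(dist(m, method = "manhattan"))  # 3 + 4 + 0 = 7
maximum   <- as.numeric(dist(m, method = "maximum"))    # max(3, 4, 0) = 4
hamming   <- sum(x != y)                                # 2 positions differ

# Angle between the two vectors from the normalised inner product
cosine <- sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
angle  <- acos(cosine)                                  # in radians
```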

Dynamic Time Warping (DTW)
DTW finds optimal alignment between two time series [Keogh and Pazzani, 2001].
## Dynamic Time Warping (DTW)
library(dtw)
idx <- seq(0, 2 * pi, length.out = 100)
a <- sin(idx) + runif(100)/10   # noisy sine (query)
b <- cos(idx)                   # cosine (reference)
align <- dtw(a, b, step.pattern = asymmetricP1, keep.internals = TRUE)
dtwPlotTwoWay(align)
[Figure: two-way DTW alignment plot of the two series; x-axis from 0 to 100, y-axis: Query value, -1.0 to 1.0]
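The idea behind the alignment can be sketched with a small dynamic program in base R. This is a simplified illustration with an unweighted basic step pattern and absolute differences as local cost, not the `asymmetricP1` pattern used with the dtw package above, so its values will generally differ from `dtw()`. The function name `dtw_distance` is hypothetical.

```r
dtw_distance <- function(a, b) {
  n <- length(a)
  m <- length(b)
  # D[i + 1, j + 1] holds the cheapest cost of aligning a[1..i] with b[1..j]
  D <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  D[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- abs(a[i] - b[j])
      D[i + 1, j + 1] <- cost + min(D[i, j + 1],   # step in a only
                                    D[i + 1, j],   # step in b only
                                    D[i, j])       # diagonal step in both
    }
  }
  D[n + 1, m + 1]
}

# A repeated value in one series is absorbed by the warping
d1 <- dtw_distance(c(1, 2, 3), c(1, 2, 2, 3))
# Two constant series one unit apart, two points each
d2 <- dtw_distance(c(0, 0), c(1, 1))
```

The double loop costs O(n × m) time and memory, which is one reason dedicated packages are preferred for long series.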

Synthetic Control Chart Time Series
The dataset contains 600 examples of control charts synthetically generated by the process in Alcock and Manolopoulos (1999).
Each control chart is a time series with 60 values.
Six classes:
1-100 Normal
101-200 Cyclic
201-300 Increasing trend
301-400 Decreasing trend
401-500 Upward shift
501-600 Downward shift
http://kdd.ics.uci.edu/databases/synthetic_control/synthetic_control.html

Synthetic Control Chart Time Series
# read data into R
# sep="": the separator is white space, i.e., one
# or more spaces, tabs, newlines or carriage returns
sc <- read.table("../data/synthetic_control.data", header = FALSE, sep = "")
# show one sample from each class
idx <- c(1, 101, 201, 301, 401, 501)
sample1 <- t(sc[idx,])
plot.ts(sample1, main = "")

Six Classes
[Figure: one sample series from each class (rows 1, 101, 201, 301, 401 and 501), one panel per series; x-axis: Time, 0 to 60]

Hierarchical Clustering with Euclidean distance
# sample n cases from every class
n <- 10
s <- sample(1:100, n)
idx <- c(s, 100 + s, 200 + s, 300 + s, 400 + s, 500 + s)
sample2 <- sc[idx, ]
observedLabels <- rep(1:6, each = n)
## hierarchical clustering with Euclidean distance
hc <- hclust(dist(sample2), method = "average")
plot(hc, labels = observedLabels, main = "")
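The same steps can be tried without the control chart file. The sketch below uses a tiny made-up data set `toy` (three flat and three rising series); the names and values are illustrative only.

```r
# Toy data (not the control chart data): 3 flat series and 3 rising series
toy <- rbind(
  flat1 = rep(10, 8), flat2 = rep(11, 8),  flat3 = rep(9, 8),
  rise1 = 1:8 * 5,    rise2 = 1:8 * 5 + 1, rise3 = 1:8 * 5 - 1
)
trueLabels <- rep(c("flat", "rise"), each = 3)

# Average-linkage clustering on Euclidean distances between rows
hc   <- hclust(dist(toy), method = "average")

# Cut the tree into 2 clusters and compare with the known groups
memb <- cutree(hc, k = 2)
print(table(trueLabels, memb))
```

Each row of the data is treated as one time series, which matches how `sample2` is passed to `dist()` above.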

[Figure: dendrogram of hclust (*, "average") on dist(sample2); leaves labelled with the observed class labels 1 to 6; y-axis: Height, 20 to 140]

# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)
##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  3  1  1  3  2  0  0
##              3  0  0  0  0  0  0 10  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10

Hierarchical Clustering with DTW Distance
# hierarchical clustering with DTW distance
# (method "DTW" for dist() is available once package dtw is loaded)
myDist <- dist(sample2, method = "DTW")
hc <- hclust(myDist, method = "average")
plot(hc, labels = observedLabels, main = "")
# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)
##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  4  3  2  1  0  0  0
##              3  0  0  0  0  0  6  4  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10

Hierarchical Clustering with DTW Distance
[Figure: dendrogram of hclust (*, "average") on myDist; leaves labelled with the observed class labels 1 to 6; y-axis: Height, 0 to 1000]

Time Series Classification
To build a classification model based on labelled time series and then use the model to predict the label of unlabelled time series
Feature Extraction
Singular Value Decomposition (SVD)
Discrete Fourier Transform (DFT)
Discrete Wavelet Transform (DWT)
Piecewise Aggregate Approximation (PAA)
Perceptually Important Points (PIP)
Piecewise Linear Representation
Symbolic Representation
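As one concrete example of feature extraction, Piecewise Aggregate Approximation replaces a series by the means of equal-length segments. The sketch below is a minimal base R version that assumes the series length is a multiple of the number of segments; the function name `paa` and the sample values are made up for illustration.

```r
paa <- function(x, segments) {
  # Sketch only: require segments of equal size
  stopifnot(length(x) %% segments == 0)
  # Fill a matrix column by column so each column is one segment,
  # then take the mean of every segment
  colMeans(matrix(x, ncol = segments))
}

x <- c(1, 2, 3,  4, 4, 4,  10, 12, 14,  0, 3, 6)
features <- paa(x, 4)
print(features)
```

Applied to a control chart series of 60 values, `paa(x, 6)` would give 6 features, each summarising 10 consecutive values.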

Decision Tree (ctree)
ctree from package party
## build a decision tree
classId <- rep(as.character(1:6), each = 100)
# ctree needs a factor response; convert explicitly because
# data.frame() no longer turns strings into factors by default (R >= 4.0)
newSc <- data.frame(classId = factor(classId), sc)
library(party)
ct <- ctree(classId ~ ., data = newSc,
            controls = ctree_control(minsplit = 20,
                                     minbucket = 5, maxdepth = 5))

Decision Tree
pClassId <- predict(ct)
table(classId, pClassId)
##        pClassId
## classId   1   2   3   4   5   6
##       1 100   0   0   0   0   0
##       2   1  97   2   0   0   0
##       3   0   0  99   0   1   0
##       4   0   0   0 100   0   0
##       5   4   0   8   0  88   0
##       6   0   3   0  90   0   7
# accuracy
(sum(classId == pClassId))/nrow(sc)
## [1] 0.8183333
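The accuracy figure can be reproduced from the confusion matrix alone. The sketch below re-enters the table printed above by hand; the names `cm`, `accuracy` and `per_class` are introduced here for illustration. Since `predict(ct)` is called on the data used to build the tree, this is accuracy on the training data.

```r
# Confusion matrix copied from the table above (rows: true class)
cm <- matrix(c(100,  0,  0,   0,  0, 0,
                 1, 97,  2,   0,  0, 0,
                 0,  0, 99,   0,  1, 0,
                 0,  0,  0, 100,  0, 0,
                 4,  0,  8,   0, 88, 0,
                 0,  3,  0,  90,  0, 7),
             nrow = 6, byrow = TRUE,
             dimnames = list(classId = 1:6, pClassId = 1:6))

# Overall accuracy: correctly classified cases (diagonal) over all cases
accuracy <- sum(diag(cm)) / sum(cm)    # 491 / 600

# Share of each true class that was predicted correctly
per_class <- diag(cm) / rowSums(cm)
print(round(per_class, 2))
```

The per-class view shows that most of the errors come from class 6 (downward shift), which is largely predicted as class 4 (decreasing trend).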

DWT (Discrete Wavelet Transform)
Wavelet transform provides a multi-resolution representation using wavelets [Burrus et al., 1998].
Haar Wavelet Transform: the simplest DWT
http://dmr.ath.cx/gfx/haar/
DFT (Discrete Fourier Transform): another popular feature extraction technique
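One level of the Haar transform can be written in a few lines of base R, which may help to see what the coefficients mean. This is a sketch under one common normalisation; the sign and ordering conventions of a given package (including wavelets) may differ, and the function names `haar_step` and `haar_inverse` are hypothetical.

```r
# One level of the Haar transform on a series of even length
haar_step <- function(x) {
  stopifnot(length(x) %% 2 == 0)
  odd  <- x[seq(1, length(x), by = 2)]
  even <- x[seq(2, length(x), by = 2)]
  list(V = (odd + even) / sqrt(2),   # scaling (smooth) coefficients
       W = (even - odd) / sqrt(2))   # wavelet (detail) coefficients
}

# Inverse of haar_step(): rebuilds the original series
haar_inverse <- function(h) {
  x <- numeric(2 * length(h$V))
  x[seq(1, length(x), by = 2)] <- (h$V - h$W) / sqrt(2)
  x[seq(2, length(x), by = 2)] <- (h$V + h$W) / sqrt(2)
  x
}

x <- c(4, 6, 10, 12, 8, 6, 5, 5)
h <- haar_step(x)
x_back <- haar_inverse(h)
```

Applying `haar_step()` again to `h$V` gives the next, coarser level, which is where the multi-resolution representation comes from.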

DWT (Discrete Wavelet Transform)
# extract DWT (with Haar filter) coefficients
library(wavelets)
wtData <- NULL
for (i in 1:nrow(sc)) {
  a <- t(sc[i, ])
  wt <- dwt(a, filter = "haar", boundary = "periodic")
  # keep the wavelet coefficients (W) of all levels and
  # the scaling coefficients (V) of the last level
  wtData <- rbind(wtData, unlist(c(wt@W, wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
# the response must be a factor for ctree
wtSc <- data.frame(classId = factor(classId), wtData)

Decision Tree with DWT
## build a decision tree
ct <- ctree(classId ~ ., data = wtSc,
            controls = ctree_control(minsplit = 20, minbucket = 5,
                                     maxdepth = 5))
pClassId <- predict(ct)
table(classId, pClassId)
##        pClassId
## classId  1  2  3  4  5  6
##       1 98  2  0  0  0  0
##       2  1 99  0  0  0  0
##       3  0  0 81  0 19  0
##       4  0  0  0 74  0 26
##       5  0  0 16  0 84  0
##       6  0  0  0  3  0 97
(sum(classId == pClassId)) / nrow(wtSc)
## [1] 0.8883333

plot(ct, ip_args = list(pval = FALSE), ep_args = list(digits = 0))
[Figure: decision tree built on the DWT coefficients, with 21 nodes; inner nodes split on V57, W43, W5, W31 and W22; the 11 terminal nodes (n = 6 to 103) show bar charts of the class distribution over classes 1 to 6]

k-NN Classification
find the k nearest neighbours of a new instance
label it by majority voting
needs an efficient indexing structure for large datasets
## k-NN classification
k <- 20
# new series: sample 501 plus uniform noise (one value per time point)
newTS <- sc[501, ] + runif(ncol(sc)) * 15
# DTW distances from the new series to all 600 series (needs package dtw)
distances <- dist(newTS, sc, method = "DTW")
s <- sort(as.vector(distances), index.return = TRUE)
# class IDs of k nearest neighbours
table(classId[s$ix[1:k]])
##
##  4  6
##  3 17
Results of majority voting: class 6
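The voting step can be illustrated on a toy example without the dtw package. The sketch below swaps in plain Euclidean distance for DTW and uses made-up training series; `knn_vote`, `train` and `labels` are hypothetical names.

```r
knn_vote <- function(query, train, labels, k) {
  # Euclidean distance from the query to every training series (one per row)
  d <- sqrt(rowSums(sweep(train, 2, query)^2))
  # Indices of the k closest training series
  nearest <- order(d)[seq_len(k)]
  # Majority vote; on a tie the first label in table order wins
  votes <- table(labels[nearest])
  names(votes)[which.max(votes)]
}

train <- rbind(c(1, 1, 1), c(1, 2, 1), c(2, 1, 1),
               c(9, 9, 9), c(9, 8, 9), c(8, 9, 9))
labels <- rep(c("low", "high"), each = 3)

pred_high <- knn_vote(c(8, 8, 8), train, labels, k = 3)
pred_low  <- knn_vote(c(1, 1, 2), train, labels, k = 3)
```

This version computes all distances for every query, which is the cost that an indexing structure is meant to avoid on large datasets.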

The TSclust Package
TSclust: a package for time series clustering †
measures of dissimilarity between time series to perform time series clustering
metrics based on raw data, on generating models and on the forecast behavior
time series clustering algorithms and cluster evaluation metrics
† http://cran.r-project.org/web/packages/TSclust/

Online Resources
Book titled R and Data Mining: Examples and Case Studies [Zhao, 2012]
http://www.rdatamining.com/docs/RDataMining-book.pdf
R Reference Card for Data Mining
http://www.rdatamining.com/docs/RDataMining-reference-card.pdf
Free online courses and documents
http://www.rdatamining.com/resources/
RDataMining Group on LinkedIn (27,000+ members)
http://group.rdatamining.com
Twitter (3,300+ followers)
@RDataMining

The End
Thanks!
Email: yanchang(at)RDataMining.com
Twitter: @RDataMining

How to Cite This Work
Citation
Yanchang Zhao. R and Data Mining: Examples and Case Studies. ISBN 978-0-12-396963-7, December 2012. Academic Press, Elsevier. 256 pages. URL: http://www.rdatamining.com/docs/RDataMining-book.pdf.
BibTex
@BOOK{Zhao2012R,
  title = {R and Data Mining: Examples and Case Studies},
  publisher = {Academic Press, Elsevier},
  year = {2012},
  author = {Yanchang Zhao},
  pages = {256},
  month = {December},
  isbn = {978-0-12-396963-7},
  keywords = {R, data mining},
  url = {http://www.rdatamining.com/docs/RDataMining-book.pdf}
}

References
Brockwell, P. J. and Davis, R. A. (2016). Introduction to Time Series and Forecasting, ISBN 9783319298528. Springer.
Burrus, C. S., Gopinath, R. A., and Guo, H. (1998). Introduction to Wavelets and Wavelet Transforms: A Primer. Prentice-Hall, Inc.
Keogh, E. J. and Pazzani, M. J. (2001). Derivative dynamic time warping. In the 1st SIAM Int. Conf. on Data Mining (SDM-2001), Chicago, IL, USA.
Zhao, Y. (2012). R and Data Mining: Examples and Case Studies, ISBN 978-0-12-396963-7. Academic Press, Elsevier.
