Time Series Analysis with R
Yanchang Zhao
http://www.RDataMining.com
R and Data Mining Course
Beijing University of Posts and Telecommunications,
Beijing, China
July 2019
1 / 40
Contents
Introduction
Time Series Decomposition
Time Series Forecasting
Time Series Clustering
Time Series Classification
Online Resources
2 / 40
Time Series Analysis with R ∗
time series data in R
time series decomposition, forecasting, clustering and
classification
autoregressive integrated moving average (ARIMA) model
Dynamic Time Warping (DTW)
Discrete Wavelet Transform (DWT)
k-NN classification
∗
Chapter 8: Time Series Analysis and Mining, in book
R and Data Mining: Examples and Case Studies.
http://www.rdatamining.com/docs/RDataMining-book.pdf
3 / 40
Time Series Data in R
class ts
represents data sampled at equispaced points in time
frequency: the number of observations per unit of time
frequency=7: daily data with a weekly cycle
frequency=12: a monthly series
frequency=4: a quarterly series
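To illustrate how the frequency setting shapes a ts object, here is a minimal sketch that builds a quarterly series and inspects it with base R helpers. The values and the names `q` and `q2019` are made up for illustration.

```r
## a quarterly series of 8 made-up values starting in 2018 Q2
q <- ts(c(5, 7, 9, 6, 8, 10, 12, 9), frequency = 4, start = c(2018, 2))
frequency(q)   # observations per unit of time (here: per year)
cycle(q)       # position of each observation within its year (quarter)
time(q)        # time stamps as decimal years
## extract a sub-series: 2019 Q1 to 2019 Q4
q2019 <- window(q, start = c(2019, 1), end = c(2019, 4))
print(q2019)
```

window() takes start and end in the same c(year, period) form as ts(), so the sub-series keeps its time stamps.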
4 / 40
Time Series Data in R
## an example of time series data
a <- ts(1:20, frequency = 12, start = c(2011, 3))
print(a)
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2011           1   2   3   4   5   6   7   8   9  10
## 2012  11  12  13  14  15  16  17  18  19  20
str(a)
## Time-Series [1:20] from 2011 to 2013: 1 2 3 4 5 6 7 8 9 10...
attributes(a)
## $tsp
## [1] 2011.167 2012.750 12.000
##
## $class
## [1] "ts"
5 / 40
What is Time Series Decomposition
To decompose a time series into
components [Brockwell and Davis, 2016]:
Trend component: long term trend
Seasonal component: seasonal variation
Cyclical component: repeated but non-periodic fluctuations
Irregular component: the residuals
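The additive form of such a decomposition can be checked numerically. The sketch below uses the built-in AirPassengers data with decompose() from the stats package, which estimates trend, seasonal and random parts only (it does not separate a cyclical component). The names `dec`, `rebuilt` and `ok` are illustrative.

```r
## additive decomposition: observed = trend + seasonal + random
dec <- decompose(AirPassengers, type = "additive")
rebuilt <- dec$trend + dec$seasonal + dec$random
## the moving-average trend is undefined at both ends of the series,
## so compare only the positions where all components are available
ok <- !is.na(rebuilt)
all.equal(as.numeric(rebuilt[ok]), as.numeric(AirPassengers[ok]))
sum(!ok)  # number of end points without a trend estimate
```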
7 / 40
Data AirPassengers
Data AirPassengers: monthly totals of international airline
passengers (the classic Box & Jenkins airline data), 1949 to 1960.
It has 144 (= 12 × 12) values.
## plot the built-in time series data
plot(AirPassengers)
[Figure: line plot of AirPassengers against time, 1949 to 1960]
8 / 40
Decomposition
## time series decomposition
apts <- ts(AirPassengers, frequency = 12)
f <- decompose(apts)
plot(f$figure, type = "b") # seasonal figures
[Figure: seasonal figures f$figure plotted against month index 1 to 12]
9 / 40
Decomposition
plot(f)
[Figure: Decomposition of additive time series; panels: observed, trend, seasonal, random; x-axis: Time]
10 / 40
Time Series Forecasting
To forecast future events based on known past data
For example, to predict the price of a stock based on its past
performance
Popular models
Autoregressive moving average (ARMA)
Autoregressive integrated moving average (ARIMA)
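To make the autoregressive part of these models concrete, the sketch below simulates an AR(1) process with a known coefficient and recovers it with arima(). The coefficient 0.7, the series length and the seed are arbitrary choices for illustration.

```r
set.seed(42)
## simulate an AR(1) process: x[t] = 0.7 * x[t-1] + noise
x <- arima.sim(model = list(ar = 0.7), n = 500)
## fit an ARIMA(1,0,0) model, i.e. AR(1) with a mean term
fit <- arima(x, order = c(1, 0, 0))
coef(fit)   # the ar1 estimate should be close to the true value 0.7
## forecast the next 5 values
fore <- predict(fit, n.ahead = 5)
fore$pred
```

The d in ARIMA(p, d, q) is the number of times the series is differenced before an ARMA(p, q) model is fitted, so d = 0 here reduces to plain ARMA.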
12 / 40
Forecasting
## build a seasonal ARIMA model: ARIMA(1,0,0)(2,1,0) with period 12
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))
## forecast the next 24 months
fore <- predict(fit, n.ahead = 24)
## approximate 95% error bounds (+/- 2 standard errors)
upper.bound <- fore$pred + 2 * fore$se
lower.bound <- fore$pred - 2 * fore$se
## plot forecast results
ts.plot(AirPassengers, fore$pred, upper.bound, lower.bound,
        col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", col = c(1, 2, 4), lty = c(1, 1, 2),
       legend = c("Actual", "Forecast", "Error Bounds (95% Confidence)"))
13 / 40
Forecasting
[Figure: AirPassengers with 24-month forecast; x-axis: Time (1950 to 1962); legend: Actual, Forecast, Error Bounds (95% Confidence)]
14 / 40
Time Series Clustering
To partition time series data into groups based on similarity or
distance, so that time series in the same cluster are similar
Measure of distance/dissimilarity
Euclidean distance
Manhattan distance
Maximum norm
Hamming distance
The angle between two vectors (inner product)
Dynamic Time Warping (DTW) distance
...
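Several of the measures listed above are available in base R. A minimal sketch on two made-up vectors follows; `x`, `y` and `m` are illustrative names, and stats::dist is called explicitly because packages such as proxy (loaded by dtw) mask dist().

```r
x <- c(1, 2, 3, 4)
y <- c(2, 2, 5, 1)
m <- rbind(x, y)
euclid    <- as.numeric(stats::dist(m, method = "euclidean"))
manhattan <- as.numeric(stats::dist(m, method = "manhattan"))
maximum   <- as.numeric(stats::dist(m, method = "maximum"))
## Hamming distance: number of positions where the values differ
hamming <- sum(x != y)
## cosine of the angle between the two vectors (normalised inner product)
cosine <- sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
c(euclid = euclid, manhattan = manhattan, maximum = maximum,
  hamming = hamming, cosine = cosine)
```

All of these compare the two series point by point at the same time index, which is the limitation that DTW relaxes.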
16 / 40
Dynamic Time Warping (DTW)
DTW finds an optimal alignment between two time
series [Keogh and Pazzani, 2001].
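## Dynamic Time Warping (DTW)
library(dtw)
idx <- seq(0, 2 * pi, length.out = 100)
a <- sin(idx) + runif(100)/10   # noisy sine curve (query)
b <- cos(idx)                   # cosine curve (reference)
## keep.internals = TRUE preserves the inputs and cost matrices for plotting
align <- dtw(a, b, step.pattern = asymmetricP1, keep.internals = TRUE)
dtwPlotTwoWay(align)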
[Figure: two-way DTW alignment plot of a and b; y-axis: Query value; x-axis from 0 to 100]
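The alignment idea behind DTW can be sketched as a small dynamic program in base R. This is a simplified illustration, not the dtw package's implementation: it assumes absolute difference as the local cost and the basic step pattern (diagonal, vertical and horizontal steps, all with weight 1), whereas the slide above uses asymmetricP1. The function name `dtw_distance` is hypothetical.

```r
## classic DTW by dynamic programming (assumed: unit-weight steps)
dtw_distance <- function(a, b) {
  n <- length(a)
  m <- length(b)
  ## cost[i + 1, j + 1]: cheapest alignment of a[1..i] with b[1..j]
  cost <- matrix(Inf, n + 1, m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d <- abs(a[i] - b[j])                       # local cost
      cost[i + 1, j + 1] <- d + min(cost[i, j],     # diagonal step
                                    cost[i, j + 1], # advance a only
                                    cost[i + 1, j]) # advance b only
    }
  }
  cost[n + 1, m + 1]
}
d_same  <- dtw_distance(c(1, 2, 3), c(1, 2, 2, 3))  # same shape, stretched
d_shift <- dtw_distance(c(1, 2, 3), c(2, 3, 4))     # shifted in value
c(d_same, d_shift)
```

The first pair has distance 0 because the repeated value can be absorbed by warping, while a Euclidean comparison would not even be defined for series of different lengths.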
17 / 40
Synthetic Control Chart Time Series
The dataset contains 600 examples of control charts
synthetically generated by the process in Alcock and
Manolopoulos (1999).
Each control chart is a time series with 60 values.
Six classes:
1-100 Normal
101-200 Cyclic
201-300 Increasing trend
301-400 Decreasing trend
401-500 Upward shift
501-600 Downward shift
http://kdd.ics.uci.edu/databases/synthetic_control/synthetic_control.html
18 / 40
Synthetic Control Chart Time Series
# read data into R
# sep = "": the separator is white space, i.e., one
# or more spaces, tabs, newlines or carriage returns
sc <- read.table("../data/synthetic_control.data", header = FALSE, sep = "")
# show one sample from each class
idx <- c(1, 101, 201, 301, 401, 501)
sample1 <- t(sc[idx, ])
plot.ts(sample1, main = "")
19 / 40
Six Classes
[Figure: one sample series from each of the six classes (rows 1, 101, 201, 301, 401 and 501), each plotted against Time (0 to 60)]
20 / 40
Hierarchical Clustering with Euclidean distance
# sample n cases from every class
n <- 10
s <- sample(1:100, n)
idx <- c(s, 100 + s, 200 + s, 300 + s, 400 + s, 500 + s)
sample2 <- sc[idx, ]
observedLabels <- rep(1:6, each = n)
## hierarchical clustering with Euclidean distance
hc <- hclust(dist(sample2), method = "average")
plot(hc, labels = observedLabels, main = "")
21 / 40
Hierarchical Clustering with Euclidean distance
[Figure: dendrogram from hclust (*, "average") on dist(sample2); leaves labelled with observed class IDs 1 to 6; y-axis: Height]
22 / 40
Hierarchical Clustering with Euclidean distance
# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)
##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  3  1  1  3  2  0  0
##              3  0  0  0  0  0  0 10  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10
23 / 40
Hierarchical Clustering with DTW Distance
# hierarchical clustering with DTW distance
# (the dtw package loaded earlier registers method = "DTW" for dist())
myDist <- dist(sample2, method = "DTW")
hc <- hclust(myDist, method = "average")
plot(hc, labels = observedLabels, main = "")
# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)
##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  4  3  2  1  0  0  0
##              3  0  0  0  0  0  6  4  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10
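One rough way to compare the Euclidean and DTW clusterings is cluster purity: the share of series that belong to the majority class of their cluster. Purity is not used in the original slides; it is added here only as an illustrative summary, and the matrices below are copied by hand from the two tables shown for this particular random sample. The names `purity`, `euc_tab` and `dtw_tab` are hypothetical.

```r
## purity: sum of the largest class count per cluster, over all cases
purity <- function(tab) sum(apply(tab, 2, max)) / sum(tab)

## rows: observed classes 1-6; columns: clusters 1-8 (Euclidean distance)
euc_tab <- matrix(c(10, 0, 0, 0, 0, 0,  0,  0,
                     0, 3, 1, 1, 3, 2,  0,  0,
                     0, 0, 0, 0, 0, 0, 10,  0,
                     0, 0, 0, 0, 0, 0,  0, 10,
                     0, 0, 0, 0, 0, 0, 10,  0,
                     0, 0, 0, 0, 0, 0,  0, 10),
                  nrow = 6, byrow = TRUE)
## same layout for the DTW distance
dtw_tab <- matrix(c(10, 0, 0, 0, 0, 0,  0,  0,
                     0, 4, 3, 2, 1, 0,  0,  0,
                     0, 0, 0, 0, 0, 6,  4,  0,
                     0, 0, 0, 0, 0, 0,  0, 10,
                     0, 0, 0, 0, 0, 0, 10,  0,
                     0, 0, 0, 0, 0, 0,  0, 10),
                  nrow = 6, byrow = TRUE)
c(euclidean = purity(euc_tab), dtw = purity(dtw_tab))
```

In both tables classes 4 and 6 share one cluster; DTW additionally separates part of class 3 from class 5, which is what raises its purity. Results will differ for another random sample.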
24 / 40
Hierarchical Clustering with DTW Distance
[Figure: dendrogram from hclust (*, "average") on myDist; leaves labelled with observed class IDs 1 to 6; y-axis: Height]
25 / 40
Time Series Classification
To build a classification model based on labelled time series
and then use the model to predict the label of unlabelled time
series
Feature Extraction
Singular Value Decomposition (SVD)
Discrete Fourier Transform (DFT)
Discrete Wavelet Transform (DWT)
Piecewise Aggregate Approximation (PAA)
Perceptually Important Points (PIP)
Piecewise Linear Representation
Symbolic Representation
27 / 40
Decision Tree (ctree)
ctree from package party
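## build a decision tree
## ctree() needs a factor response for classification; since R 4.0,
## data.frame() no longer converts strings to factors by default
classId <- rep(as.character(1:6), each = 100)
newSc <- data.frame(classId = factor(classId), sc)
library(party)
ct <- ctree(classId ~ ., data = newSc,
            controls = ctree_control(minsplit = 20,
                                     minbucket = 5, maxdepth = 5))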
28 / 40
Decision Tree
pClassId <- predict(ct)
table(classId, pClassId)
##        pClassId
## classId   1   2   3   4   5   6
##       1 100   0   0   0   0   0
##       2   1  97   2   0   0   0
##       3   0   0  99   0   1   0
##       4   0   0   0 100   0   0
##       5   4   0   8   0  88   0
##       6   0   3   0  90   0   7
# accuracy on the training data
(sum(classId == pClassId))/nrow(sc)
## [1] 0.8183333
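The overall accuracy hides large differences between classes. The sketch below recomputes accuracy and per-class recall from the confusion matrix printed above; the matrix is copied by hand from that output, and the names `conf_mat`, `accuracy` and `recall` are illustrative.

```r
## rows: true class, columns: predicted class (from the table above)
conf_mat <- matrix(c(100,  0,  0,   0,  0, 0,
                       1, 97,  2,   0,  0, 0,
                       0,  0, 99,   0,  1, 0,
                       0,  0,  0, 100,  0, 0,
                       4,  0,  8,   0, 88, 0,
                       0,  3,  0,  90,  0, 7),
                   nrow = 6, byrow = TRUE,
                   dimnames = list(classId = as.character(1:6),
                                   pClassId = as.character(1:6)))
## accuracy: correctly classified cases (diagonal) over all cases
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
## recall per class: diagonal over row totals
recall <- diag(conf_mat) / rowSums(conf_mat)
accuracy
recall
```

Class 6 (downward shift) is almost always predicted as class 4 (decreasing trend), which accounts for most of the errors. These figures are measured on the training data, so they are likely optimistic.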
29 / 40
DWT (Discrete Wavelet Transform)
Wavelet transform provides a multi-resolution representation
using wavelets [Burrus et al., 1998].
Haar Wavelet Transform – the simplest DWT
http://dmr.ath.cx/gfx/haar/
DFT (Discrete Fourier Transform): another popular feature
extraction technique
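A single level of the Haar wavelet transform can be written in a few lines of base R: neighbouring pairs are replaced by a scaled sum (smooth part) and a scaled difference (detail part). This is a sketch for intuition, not the wavelets package implementation; the names `haar_step` and `haar_inverse` are hypothetical, and sign conventions for the detail coefficients vary between implementations.

```r
## one level of the Haar DWT for an even-length series
haar_step <- function(x) {
  stopifnot(length(x) %% 2 == 0)
  odd  <- x[seq(1, length(x), by = 2)]
  even <- x[seq(2, length(x), by = 2)]
  list(V = (odd + even) / sqrt(2),   # scaling (smooth) coefficients
       W = (even - odd) / sqrt(2))   # wavelet (detail) coefficients
}
## inverse of one level: interleave the reconstructed pairs
haar_inverse <- function(h) {
  odd  <- (h$V - h$W) / sqrt(2)
  even <- (h$V + h$W) / sqrt(2)
  as.vector(rbind(odd, even))
}
x <- c(4, 6, 10, 12)
h <- haar_step(x)
h
haar_inverse(h)   # recovers x
```

Applying haar_step() again to the V coefficients gives the next, coarser level, which is where the multi-resolution representation comes from.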
30 / 40
DWT (Discrete Wavelet Transform)
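# extract DWT (with Haar filter) coefficients
library(wavelets)
wtData <- NULL
for (i in 1:nrow(sc)) {
  a <- t(sc[i, ])   # one series as a column
  wt <- dwt(a, filter = "haar", boundary = "periodic")
  # keep all wavelet coefficients (W) plus the scaling
  # coefficients (V) of the coarsest level
  wtData <- rbind(wtData, unlist(c(wt@W, wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
# factor response, as needed by ctree() for classification
wtSc <- data.frame(classId = factor(classId), wtData)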
31 / 40
Decision Tree with DWT
## build a decision tree
ct <- ctree(classId ~ ., data = wtSc,
            controls = ctree_control(minsplit = 20, minbucket = 5,
                                     maxdepth = 5))
pClassId <- predict(ct)
table(classId, pClassId)
##        pClassId
## classId  1  2  3  4  5  6
##       1 98  2  0  0  0  0
##       2  1 99  0  0  0  0
##       3  0  0 81  0 19  0
##       4  0  0  0 74  0 26
##       5  0  0 16  0 84  0
##       6  0  0  0  3  0 97
(sum(classId==pClassId)) / nrow(wtSc)
## [1] 0.8883333
32 / 40
plot(ct, ip_args = list(pval = F), ep_args = list(digits = 0))
[Figure: decision tree built on the DWT features; inner nodes split on V57, W43, W5, W31 and W22; 11 terminal nodes (n = 6 to 103), each with a bar chart of the class distribution over classes 1 to 6]
33 / 40
k-NN Classification
find the k nearest neighbours of a new instance
label it by majority voting
needs an efficient indexing structure for large datasets
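## k-NN classification
## (method = "DTW" is registered by the dtw package loaded earlier)
k <- 20
## new series: a noisy copy of series 501 (class 6)
newTS <- sc[501, ] + runif(ncol(sc)) * 15
distances <- dist(newTS, sc, method = "DTW")
s <- sort(as.vector(distances), index.return = TRUE)
# class IDs of k nearest neighbours
table(classId[s$ix[1:k]])
##
##  4  6
##  3 17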
Results of majority voting: class 6
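The voting step can be sketched on its own with toy data. To keep the example self-contained it uses Euclidean distance instead of DTW, which is a substitution made here for simplicity; the function name `knn_vote` and the data are made up for illustration.

```r
## k-NN by majority vote (assumed: Euclidean distance, rows = series)
knn_vote <- function(train, labels, query, k) {
  ## distance from the query to every training series
  d <- sqrt(rowSums(sweep(train, 2, query)^2))
  nearest <- order(d)[seq_len(k)]     # indices of the k closest series
  votes <- table(labels[nearest])     # count labels among the neighbours
  names(votes)[which.max(votes)]      # most frequent label (first on ties)
}
train <- rbind(c(1, 1, 1), c(1, 2, 1), c(2, 1, 1),
               c(8, 9, 8), c(9, 8, 9), c(9, 9, 9))
labels <- c("low", "low", "low", "high", "high", "high")
predicted <- knn_vote(train, labels, query = c(2, 2, 2), k = 3)
predicted
```

This brute-force version computes the distance to every training series, which is why an indexing structure matters for large datasets.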
34 / 40
The TSclust Package
TSclust: a package for time series clustering †
measures of dissimilarity between time series to perform time
series clustering
metrics based on raw data, on generating models and on the
forecast behavior
time series clustering algorithms and cluster evaluation metrics
†
http://cran.r-project.org/web/packages/TSclust/
35 / 40
Online Resources
Book titled R and Data Mining: Examples and Case
Studies [Zhao, 2012]
http://www.rdatamining.com/docs/RDataMining-book.pdf
R Reference Card for Data Mining
http://www.rdatamining.com/docs/RDataMining-reference-card.pdf
Free online courses and documents
http://www.rdatamining.com/resources/
RDataMining Group on LinkedIn (27,000+ members)
http://group.rdatamining.com
Twitter (3,300+ followers)
@RDataMining
37 / 40
The End
Thanks!
Email: yanchang(at)RDataMining.com
Twitter: @RDataMining
38 / 40
How to Cite This Work
Citation
Yanchang Zhao. R and Data Mining: Examples and Case Studies. ISBN
978-0-12-396963-7, December 2012. Academic Press, Elsevier. 256
pages. URL: http://www.rdatamining.com/docs/RDataMining-book.pdf.
BibTex
@BOOK{Zhao2012R,
title = {R and Data Mining: Examples and Case Studies},
publisher = {Academic Press, Elsevier},
year = {2012},
author = {Yanchang Zhao},
pages = {256},
month = {December},
isbn = {978-0-12-396963-7},
keywords = {R, data mining},
url = {http://www.rdatamining.com/docs/RDataMining-book.pdf}
}
39 / 40
References I
Brockwell, P. J. and Davis, R. A. (2016).
Introduction to Time Series and Forecasting, ISBN 9783319298528.
Springer.
Burrus, C. S., Gopinath, R. A., and Guo, H. (1998).
Introduction to Wavelets and Wavelet Transforms: A Primer.
Prentice-Hall, Inc.
Keogh, E. J. and Pazzani, M. J. (2001).
Derivative dynamic time warping.
In the 1st SIAM Int. Conf. on Data Mining (SDM-2001), Chicago, IL, USA.
Zhao, Y. (2012).
R and Data Mining: Examples and Case Studies, ISBN 978-0-12-396963-7.
Academic Press, Elsevier.
40 / 40

