ARIMA - Statistical Analysis for Data Science
Time Series
A time series is a sequence of values recorded at successive points over a time interval, e.g. the daily closing
value of a stock exchange index, weekly sales, or the monthly profit of a company.
Typically, the value of a time series at any given point in time is assumed to depend on its historical values. This
assumption is the basis of time series analysis. The ARIMA technique exploits autocorrelation
(the correlation of an observation with its own lags) for forecasting.
Mathematically, V_t = p(V_{t-n}) + e
This means that the value (V) at time "t" is a function p of the value "n" periods earlier, plus an error term (e). The value at time "t" can depend
on a single lag or on several lags of different orders.
Example:
Suppose Mr. X started his job in 2010 with a starting salary of $5,000 per month. He is appraised every year, and his
salary reached $20,000 per month in 2014. His annual salary can be considered a time series, and
each year's salary is a function of the previous year's salary (here the function is the appraisal rating).
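The relation V_t = p(V_{t-n}) + e can be illustrated with a minimal Python sketch. It assumes the simplest case, n = 1 with a linear p, i.e. V_t = phi * V_(t-1) + e. The coefficient phi = 0.7, the normally distributed error, and the function names are illustrative assumptions, not part of the original slides.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate V_t = phi * V_(t-1) + e_t with standard normal errors e_t."""
    rng = random.Random(seed)
    values = [0.0]  # assumed starting value
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


def autocorrelation(series, lag):
    """Sample correlation between the series and itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


series = simulate_ar1(phi=0.7, n=5000)  # phi = 0.7 is an assumed coefficient
lag1 = autocorrelation(series, lag=1)
print(f"lag-1 autocorrelation: {lag1:.2f}")
```

For a long simulated series, the lag-1 autocorrelation should come out close to the assumed phi. This dependence on past values is what ARIMA exploits for forecasting.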
ARIMA (Box-Jenkins Approach)
ARIMA stands for Auto-Regressive Integrated Moving Average. It is also
known as the Box-Jenkins approach. It is one of the most popular techniques
used for time series analysis and forecasting.
As its full name indicates, ARIMA involves two components, combined with differencing of the series (the "Integrated" part):
1. Auto-regressive component
2. Moving average component
1. Auto-regressive Component
It implies a relationship between the value of a series at a point in time and its own previous values. Such a relationship can exist at
any order of lag.
Lag -
A lag is the value at a previous point in time. It can have various orders, as shown in the table below. It indicates a
point-to-point relationship: the current value is related to one specific earlier value.
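Lags of different orders can be sketched in a few lines of Python. The helper name `lag` and the monthly values are hypothetical, chosen only to show how each order shifts the series.

```python
def lag(series, order):
    """Return the series shifted back by `order` periods.

    Positions with no earlier value available are filled with None.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    padding = [None] * min(order, len(series))
    return padding + list(series[: len(series) - len(padding)])


monthly_values = [100, 110, 120, 130, 140]  # hypothetical data
lag_1 = lag(monthly_values, 1)
lag_2 = lag(monthly_values, 2)

# Print the value next to its first- and second-order lags.
for value, l1, l2 in zip(monthly_values, lag_1, lag_2):
    print(value, l1, l2)
```

Each row pairs the current value with the value one and two periods earlier, which is the pairing an auto-regressive term relies on.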
2. Moving average component
It implies that the current deviation from the mean depends on previous deviations (in an ARIMA model, these are past forecast errors). Such a relationship can exist with any
number of lags, which determines the order of the moving average.
Moving Average -
A moving average is the average of consecutive values over several time periods. It can have various orders, as shown in
the table below. It indicates a distributed relationship, since a moving average is itself derived from several lags.
The moving average is itself one of the most rudimentary methods of forecasting. If you drag the
average formula in Excel further (beyond Dec-15), it gives you the forecast for the next month.
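The spreadsheet approach described above can be sketched in Python as a trailing average whose last window serves as the next-period forecast. The function names, the window of 3, and the sales figures are illustrative assumptions.

```python
def moving_average(series, window):
    """Trailing moving average: mean of the latest `window` values at each step."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[end - window:end]) / window
        for end in range(window, len(series) + 1)
    ]


def forecast_next(series, window):
    """Naive one-step forecast: the average of the last `window` observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


sales = [100, 110, 120, 130, 140]  # hypothetical monthly values
smoothed = moving_average(sales, window=3)
next_month = forecast_next(sales, window=3)
print(smoothed)
print(next_month)
```

The forecast is simply the most recent moving-average value, which is what extending the average formula one row further in a spreadsheet computes.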
ARIMA Modeling Steps
1. Plot the time series data.
2. Check volatility: apply a Box-Cox transformation to stabilize the variance.
3. Check whether the data contain seasonality. If so, there are two options: take seasonal differences or fit a
seasonal ARIMA model.
4. If the data are non-stationary, take first differences of the data until they are stationary.
5. Identify the orders p, d and q by examining the ACF/PACF.
6. Try your chosen models, and use the AICc/BIC to search for a better model.
7. Check the residuals from your chosen model by plotting the ACF of the residuals and doing a portmanteau
test of the residuals. If they do not look like white noise, try a modified model.
8. Check whether the residuals are normally distributed with mean zero and constant variance.
9. Once steps 7 and 8 are completed, calculate forecasts.
Note: The auto.arima() function (from the R forecast package) automates steps 3 to 6.
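Three of the building blocks used in the steps above (differencing, the sample ACF, and a portmanteau statistic) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names and data are hypothetical, the Ljung-Box form of the portmanteau statistic is assumed, and only the statistic Q is computed (comparing it against a chi-square distribution is left out).

```python
def difference(series, lag=1):
    """Return y_t - y_(t-lag): lag=1 gives first differences,
    lag=season_length gives seasonal differences."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
        / denominator
        for k in range(max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    r = acf(residuals, max_lag)
    return n * (n + 2) * sum(r[k] ** 2 / (n - k) for k in range(1, max_lag + 1))


series = [3, 5, 8, 10, 13, 15, 18, 20, 23, 25]  # hypothetical trending data
diffed = difference(series)  # step 4: first differences remove the trend
print(diffed)
print([round(r, 2) for r in acf(diffed, max_lag=3)])  # step 5: inspect the ACF
# Step 7 applies to model residuals; the differenced series stands in for them here.
print(round(ljung_box_q(diffed, max_lag=3), 2))
```

A large Q relative to the chi-square reference suggests that the residuals still contain autocorrelation and do not look like white noise, so a modified model is worth trying.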
Many simple time series models are special cases of the ARIMA model:
1. Simple exponential smoothing: ARIMA(0,1,1)
2. Holt's exponential smoothing: ARIMA(0,2,2)
3. White noise: ARIMA(0,0,0)
4. Random walk: ARIMA(0,1,0) with no constant
5. Random walk with drift: ARIMA(0,1,0) with a constant
6. Autoregression: ARIMA(p,0,0)
7. Moving average: ARIMA(0,0,q)
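The first equivalence in the list can be checked numerically. The sketch below assumes the convention y_t = y_(t-1) + e_t + theta * e_(t-1) for ARIMA(0,1,1) with no constant, under which simple exponential smoothing with parameter alpha corresponds to theta = alpha - 1. The function names, data, alpha, and the choice of the first observation as the initial forecast are illustrative assumptions.

```python
def ses_forecasts(series, alpha):
    """One-step forecasts from simple exponential smoothing:
    f_(t+1) = alpha * y_t + (1 - alpha) * f_t, starting from f_1 = y_1."""
    forecasts = [series[0]]
    for y in series:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def arima_011_forecasts(series, theta):
    """One-step forecasts from ARIMA(0,1,1) with no constant:
    f_(t+1) = y_t + theta * e_t, where e_t = y_t - f_t is the latest forecast error."""
    forecasts = [series[0]]
    for y in series:
        error = y - forecasts[-1]
        forecasts.append(y + theta * error)
    return forecasts


observations = [10, 12, 11, 13, 14]  # hypothetical data
alpha = 0.3                          # assumed smoothing parameter
ses = ses_forecasts(observations, alpha)
arima = arima_011_forecasts(observations, theta=alpha - 1)

# The two recursions should agree up to floating-point rounding.
for f_ses, f_arima in zip(ses, arima):
    print(round(f_ses, 4), round(f_arima, 4))
```

Substituting theta = alpha - 1 into y_t + theta * (y_t - f_t) gives alpha * y_t + (1 - alpha) * f_t, so the two forecast sequences coincide.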

