ECONOMETRICS PROJECT PRESENTATION
Time Series Analysis in SAS
AYAPPARAJ SKS | ADITYA NATHIREDDY | VIBEESH CS | NEHA NEHRA
AGENDA
 TIME SERIES FORECASTING – INTRODUCTION
 THINGS TO BE CHECKED BEFORE APPLYING TIME
SERIES MODELLING
 BUSINESS OBJECTIVE
 ABOUT THE DATASET
 DATA PREPARATION
 MODEL IDENTIFICATION AND ESTIMATION
 GENERATING FORECAST
 FINAL FORECAST
TIME SERIES FORECASTING - INTRODUCTION
 A time series is the set of values taken by a variable over time (such as
daily sales revenue, weekly orders, monthly overheads, yearly income),
tabulated or plotted in chronological order so that valid statistical
inferences can be drawn from it.
THINGS TO BE CHECKED BEFORE
APPLYING TIME SERIES MODELLING
 VOLATILITY :
 The data should not be highly volatile for time series methods to work well.
 The methods suit series whose changes are stable over time.
 HOW IT CAN BE CHECKED :
 Plot the series on the vertical axis against time on the horizontal axis.
 A fan or inverted-fan shape (a spread that widens or narrows over time)
shows that the data is highly volatile.
 WHAT CAN BE DONE TO OVERCOME THE PROBLEM :
 Transform the data according to the shape of the plot.
 Fan shape: compress the scale of the data (log or square root).
 Inverted fan shape: expand the scale of the data (exponential or square).
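A minimal sketch of the fan-shape check, written in Python for illustration rather than in the SAS used for the project. The series is hypothetical (a seasonal swing whose size grows with the level), and yearly_ranges is an assumed helper name, not something from the deck. It shows the yearly swing growing on the raw scale and staying constant after a log transformation.

```python
import math

def yearly_ranges(series, period=12):
    """Return max - min for each consecutive block of `period` observations."""
    return [
        max(series[i:i + period]) - min(series[i:i + period])
        for i in range(0, len(series) - period + 1, period)
    ]

# Hypothetical fan-shaped series: 4 years of monthly data, 3% growth per month,
# with a seasonal swing proportional to the current level.
raw = [
    100 * 1.03 ** t * (1 + 0.2 * math.sin(2 * math.pi * t / 12))
    for t in range(48)
]
logged = [math.log(v) for v in raw]

raw_ranges = yearly_ranges(raw)      # grows year after year (fan shape)
log_ranges = yearly_ranges(logged)   # stays the same every year

print("raw yearly ranges:", [round(r, 1) for r in raw_ranges])
print("log yearly ranges:", [round(r, 3) for r in log_ranges])
```

If the yearly ranges still grow after the transformation, a stronger transformation may be needed; if they shrink, the transformation was too strong.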
 PATTERN :
 If the past has a pattern, time series methods will yield good results.
 If there is no pattern, the methods have nothing to extrapolate and will
not help.
 STATIONARITY OF THE DATA :
 If the data is stationary (its mean, variance and autocorrelation do not
change over time), the technique can be applied directly.
 If those properties drift over time, as in a random walk, the data is
non-stationary and should not be used for forecasting as it stands. This is
checked using the Augmented Dickey-Fuller (ADF) unit root test.
 HOW DO WE PERFORM ADF TEST :
 We set up a hypothesis test to determine whether the data is stationary.
 H0 : Non-stationary /// Ha : Stationary
 If p < alpha, we reject H0 and conclude that the data is stationary, so it
can be used for forecasting.
 If p > alpha, we fail to reject H0 and treat the data as non-stationary;
it can usually be made stationary by differencing.
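The decision rule and the differencing remedy can be sketched as follows. This is Python for illustration, not the PROC ARIMA output used in the project; the function names, the default alpha of 0.05 and the example p-values are assumptions. The sketch does not compute the ADF statistic itself, it only applies the rule to a p-value obtained elsewhere.

```python
def adf_decision(p_value, alpha=0.05):
    """Apply the ADF rule, where H0 = non-stationary and Ha = stationary."""
    if p_value < alpha:
        return "reject H0: treat the series as stationary"
    return "fail to reject H0: treat the series as non-stationary, try differencing"

def difference(series, lag=1):
    """Return y[t] - y[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# A series with a steady upward trend has a changing mean...
trend = [10, 12, 14, 16, 18]
# ...but its first difference is constant.
print(difference(trend))

# Hypothetical p-values, only to show both branches of the rule.
print(adf_decision(0.30))
print(adf_decision(0.001))
```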
BUSINESS OBJECTIVE
[To project the airline travel for the next 12 months]
ABOUT THE DATASET
 Sashelp.air: Airline Data (Monthly: Jan49-Dec60)
• The dataset used here is SASHELP.AIR, which holds airline data in two
variables: DATE and AIR (labelled as International Airline Travel). It
covers JAN 1949 to DEC 1960.
DATA PREPARATION
 CHECK FOR VOLATILITY :
• A plot of AIR against DATE (figure not reproduced here) showed volatility
in the series.
• So we apply a variable transformation.
 CHECK FOR VOLATILITY :
• We tried both log and square-root transformations.
• The resulting plots (not reproduced here) show that the log
transformation gives the more stable series.
 CHECK FOR STATIONARITY OF DATA (PROC ARIMA OUTPUT)
• The first result shows that the series is not stationary: not all
p-values are below alpha (0.0001, i.e. 0.01%), so we have to difference
the series.
• After differencing, all the p-values are less than alpha.
 CHECK FOR SEASONALITY :
• The autocorrelation function (ACF) captures the correlation between Y(t)
and Y(t-s), where s is the lag. If the ACF shows high values at a fixed
interval, that interval is taken as the period of seasonality.
• Differencing at that same lag will deseasonalize the data.
• The ACF output shows that the period of seasonality is 12 months.
 DESEASONALIZATION :
• We deseasonalize the data by differencing at lag 12, because the ACF
shows high correlation at that lag.
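A small sketch of the seasonality check, in Python for illustration rather than SAS. The series is a hypothetical 12-year monthly cycle with no trend, and acf and difference are assumed helper names. It picks the lag with the highest sample autocorrelation and then differences at that lag.

```python
import math

def acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom

def difference(series, lag):
    """Return y[t] - y[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly series: 144 points of a pure 12-month cycle.
series = [math.sin(2 * math.pi * t / 12) for t in range(144)]

# The lag (1 to 24) with the highest ACF is taken as the seasonal period.
best_lag = max(range(1, 25), key=lambda k: acf(series, k))
deseasonalized = difference(series, best_lag)

print("seasonal period:", best_lag)
print("largest value left after differencing:", max(abs(v) for v in deseasonalized))
```

On real data with a trend, the ACF is usually inspected after first differencing, since a trend keeps the autocorrelation high at every lag.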
MODEL IDENTIFICATION AND
ESTIMATION
 CREATION OF DEVELOPMENT AND VALIDATION SAMPLES :
• Depending on the number of observations, some of the most recent time
points are set aside as the validation sample.
• The rest of the data, the development sample, is used to generate
forecasts from multiple models, which are compared with the actuals stored
in the validation sample.
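The development/validation split can be sketched as below, in Python for illustration. The 12-month holdout is an assumption (the deck does not state the size of the validation sample), split_series is an assumed helper name, and placeholder index values stand in for the 144 monthly observations of SASHELP.AIR.

```python
def split_series(series, n_validation):
    """Hold out the most recent n_validation points as the validation sample."""
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")
    return series[:-n_validation], series[-n_validation:]

# Jan 1949 to Dec 1960 is 12 years of monthly data, i.e. 144 observations.
observations = list(range(144))  # placeholder values, not the real data

# Assumed holdout: the last 12 months.
development, validation = split_series(observations, 12)
print(len(development), len(validation))
```

The split keeps the chronological order intact; a random split would leak future values into the development sample.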
 SELECTION OF P (AUTOREGRESSIVE) AND Q (MOVING AVERAGE) :
• The model selection criteria AIC, BIC and SBC (lower values are better)
are used to select the values of P and Q.
• AIC - Akaike's information criterion
• BIC - Bayesian information criterion
• SBC - Schwarz Bayesian criterion
• BIC is evaluated over the grid p = 0 to 5, q = 0 to 5.
• The MINIC (minimum information criterion) option under PROC ARIMA finds
the minimum-BIC model after considering all combinations of P and Q from 0
to 5.
 Contd..,
• By inspection, the minimum of the BIC matrix is -6.3503, at the AR 3,
MA 0 position, i.e. (P,Q) = (3,0).
• We take all models in the neighbourhood of the minimum-BIC model,
generate AIC and SBC for each, and calculate the average of AIC and SBC.
• We then select 6 to 7 models with the lowest averages and generate
forecasts for them.
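The selection step can be sketched as follows, in Python for illustration. The AIC and SBC formulas are the standard ones; treating "neighbourhood" as every cell within one step of the minimum on the 0 to 5 grid is an assumption, as are all the function names. In the project the criteria came from PROC ARIMA, not from code like this.

```python
import math

def aic(log_likelihood, k):
    """Akaike's information criterion for a model with k estimated parameters."""
    return -2 * log_likelihood + 2 * k

def sbc(log_likelihood, k, n):
    """Schwarz Bayesian criterion for k parameters fitted to n observations."""
    return -2 * log_likelihood + k * math.log(n)

def neighbours(p, q, max_order=5, radius=1):
    """All (p, q) cells within `radius` of the given cell, clipped to the grid."""
    return [
        (i, j)
        for i in range(max(0, p - radius), min(max_order, p + radius) + 1)
        for j in range(max(0, q - radius), min(max_order, q + radius) + 1)
    ]

def rank_by_average(criteria):
    """Sort models by the mean of AIC and SBC, lowest first.

    `criteria` maps (p, q) -> (aic_value, sbc_value).
    """
    return sorted(criteria, key=lambda pq: sum(criteria[pq]) / 2)

# Candidate models around the minimum-BIC cell (3, 0) reported in the deck.
candidates = neighbours(3, 0)
print(candidates)
```

Each candidate would then be fitted, its AIC and SBC passed to rank_by_average, and the lowest-ranked models carried forward to the forecasting step.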
GENERATING FORECAST
• For each combination of p and q selected using AIC and SBC, generate a
forecast with the FORECAST statement under PROC ARIMA, where
 LEAD = number of future time points to forecast
 ID = name of the time variable
 INTERVAL = unit of the time variable
 OUT = output dataset in which the forecast is saved.
• The forecasts obtained for each combination of p and q are compared with
the actuals for the same time points stored in the validation sample, using
MAPE (Mean Absolute Percentage Error).
• The combination with the minimum MAPE is selected.
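The MAPE comparison can be sketched as below, in Python for illustration. The function names are assumptions and the numbers are made up purely to exercise the code; they are not results from the project.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("actuals and forecasts must be non-empty and equal in length")
    errors = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)

def best_model(actuals, forecasts_by_model):
    """Return the (p, q) key whose forecasts have the lowest MAPE."""
    return min(
        forecasts_by_model,
        key=lambda pq: mape(actuals, forecasts_by_model[pq]),
    )

# Made-up validation actuals and forecasts for two candidate models.
actuals = [100.0, 200.0]
forecasts_by_model = {
    (3, 0): [110.0, 180.0],
    (2, 1): [120.0, 150.0],
}

for pq, forecasts in forecasts_by_model.items():
    print(pq, round(mape(actuals, forecasts), 2))
print("selected:", best_model(actuals, forecasts_by_model))
```

MAPE is undefined when an actual value is zero, which is not a concern for passenger counts but matters for other series.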
FINAL FORECAST
• The combination of p and q with the minimum MAPE is applied to the
entire dataset to generate the final forecast.
>> Thank you <<
