Session 6: Administrative and Alternative Data Sources
Big Data and Macroeconomic Nowcasting:
From Data Access to Modelling
15th Conference of International Association for Official Statistics
Abu Dhabi, 6–8 December 2016
Dario Buono*, European Commission, Eurostat, dario.buono@ec.europa.eu
Stephan Krische*, GOPA Consultants, stephan.krische@gopa.de
Massimiliano Marcellino, Bocconi University, massimiliano.marcellino@unibocconi.it
George Kapetanios, King’s College, george.kapetanios@kcl.ac.uk
Gian Luigi Mazzi, European Commission, Eurostat, gianluigi.mazzi@ec.europa.eu
Fotis Papailias, Queen's University Management School, f.papailias@qub.ac.uk
*The views expressed are the authors' alone and do not necessarily correspond to those of their organisations of affiliation
Eurostat, the Statistical Office of EU
• About 700 people with 28 different nationalities
• Statistical Office of European Union, part of EC
• Core business:
• Euro-zone (19) & EU (28) aggregates
• Harmonisation, best practices, guidelines, training &
international cooperation
• Methodology team: Time Series, Econometrics, SDC,
Research & EA
Why interested in Big Data for nowcasting?
• Big Data are complementary information to standard
data, being based on different information sets
• More granular perspective on the indicator of interest,
both in the temporal and cross-sectional dimensions
• Big Data are available in a timely manner and are generally not subject to revisions
European research project: April 2015 to July 2016
Research questions and findings
Can Big Data help for Macroeconomic Nowcasting?
What are the potential Big Data sources?
1. Literature review
2. Models/methods to be used for Big data
3. Recommendations on how to handle Big Data
4. Case study: IPI, inflation and unemployment in some EU
countries
Big Data types & dimensionality
• When the dimensionality increases, the volume of the space
increases so fast that the available data become sparse.
• For a statistically significant result, the amount of data needed
often grows exponentially with the dimensionality.
• Use of a typology based on Doornik and Hendry (2015):
• Tall data: many observations, few variables
• Fat data: many variables, few observations
• Huge data: many variables, many observations
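The sparsity argument and the typology above can be illustrated with a minimal sketch. It assumes data spread over a regular grid on the unit hypercube; the helper names `points_needed` and `data_shape` and the cut-off `many=1000` are hypothetical choices for illustration, not part of the Doornik and Hendry typology.

```python
def points_needed(per_axis: int, dims: int) -> int:
    """Grid points needed to keep `per_axis` points along each of `dims` axes."""
    return per_axis ** dims


def data_shape(n_obs: int, n_vars: int, many: int = 1000) -> str:
    """Label a dataset as tall, fat or huge; `many` is an arbitrary cut-off."""
    if n_obs >= many and n_vars >= many:
        return "huge"
    if n_obs >= many:
        return "tall"
    if n_vars >= many:
        return "fat"
    return "small"


# Keeping 10 points per axis requires 10**d points in d dimensions.
for d in (1, 2, 5, 10):
    print(d, points_needed(10, d))
```

With a fixed sample size, the same number of observations covers an exponentially shrinking share of the space as `dims` grows, which is why fat and huge datasets call for shrinkage or dimensionality reduction.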
Models race
• Dynamic Factor Analysis
• Partial Least Squares
• Bayesian Regression
• LASSO regression
• U-MIDAS models
• Model averaging
255 models tested on macro-financial & Google Trends data
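As an illustration of one entry in this list, the following is a minimal sketch of LASSO regression fitted by cyclic coordinate descent. It assumes the objective is scaled as (1/2n)·||y − Xβ||² + λ·||β||₁, one common convention; the function names, the simulated data and the value of `lam` are hypothetical and do not reproduce the models tested in the project.

```python
import random


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes variable j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


if __name__ == "__main__":
    random.seed(0)
    n, p = 20, 50  # fat data: more variables than observations
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 3.0 * row[1] + random.gauss(0, 0.1) for row in X]
    beta = lasso_cd(X, y, lam=0.3)
    print("non-zero coefficients:", [j for j, b in enumerate(beta) if b != 0.0])
```

The soft-thresholding step is what sets many coefficients exactly to zero, so the method stays usable when the number of candidate indicators exceeds the number of observations.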
Statistical Methods: findings
• Sparse regression (LASSO) works for fat, huge data
• Data reduction techniques (PLS) are helpful with a large number of variables
• (U)-MIDAS or bridge modelling for mixed frequency
• Dimensionality reduction improves nowcasting
• Forecast combination: Data-driven automated strategy with
model rotation based on forecasting performance in the past
works well
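The forecast combination finding can be sketched as follows. This is a minimal illustration, assuming past forecast errors are available for each model; the rolling window length, the inverse-MSE weighting and all names are hypothetical and are not taken from the project's actual strategy.

```python
from typing import Dict, List


def rolling_mse(errors: List[float], window: int) -> float:
    """Mean squared error over the most recent `window` forecast errors."""
    recent = errors[-window:]
    return sum(e * e for e in recent) / len(recent)


def rotate_model(past_errors: Dict[str, List[float]], window: int = 4) -> str:
    """Model rotation: pick the model with the lowest recent MSE."""
    return min(past_errors, key=lambda name: rolling_mse(past_errors[name], window))


def inverse_mse_weights(past_errors: Dict[str, List[float]], window: int = 4) -> Dict[str, float]:
    """Combination weights proportional to 1/MSE (small constant avoids division by zero)."""
    inv = {name: 1.0 / (rolling_mse(errs, window) + 1e-12)
           for name, errs in past_errors.items()}
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


# Hypothetical past errors of three competing nowcasting models
errors = {
    "factor": [0.5, 0.4, 0.3, 0.2],
    "lasso": [0.1, -0.1, 0.1, -0.1],
    "pls": [1.0, -1.0, 1.0, -1.0],
}
print(rotate_model(errors))
print(inverse_mse_weights(errors))
```

Re-running the selection each period with the latest errors lets the preferred model change over time without manual intervention.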
From Data Access to Modelling
A step-by-step approach, accompanied by specific
recommendations for the use of Big Data in
macroeconomic nowcasting, covering
• the identification and choice of Big Data
• pre-treatment and econometric modelling
• the comparative evaluation of results, which supports the
decision on whether or not to use Big Data
Step 1: Big Data usefulness within
a nowcasting exercise
Recommendations
1. Evaluate the quality of the existing nowcasts
and identify issues (bias, inefficiency or large
errors in specific periods) that can be fixed by
adding information from Big Data based indicators
2. Use Big Data only when they are expected to improve
the timeliness and/or the quality of the nowcasts
3. Do not consider Big Data sources with spurious
correlations with the target variable
Step 2: Big Data search
Recommendations
1. Starting point for an assessment of the potential
benefits/costs of the use of Big Data for macroeconomic
nowcasting: identification of their source
• Social Networks (human-sourced information)
• Traditional Business Systems (process-mediated data)
• Internet of Things (machine-generated data)
2. Choice is heavily dependent on the target indicator of
the nowcasting exercise
Step 3: Assessment of big-data
accessibility and quality
Recommendations
1. Prefer data providers that guarantee continuity and the
availability of good metadata associated with the Big Data
2. Prefer Big Data sources that ensure sufficient time and
cross-sectional coverage
3. If a bias is observed, a bias correction can be included in the
nowcasting strategy.
4. To deal with possible instabilities of the relationships between the
Big Data and the target variables, nowcasting models should be
re-specified on a regular basis (e.g. yearly) and occasionally in the
presence of unexpected events.
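Recommendation 3 can be illustrated with a minimal sketch of an additive bias correction. It assumes the bias is estimated as the mean of recent nowcast errors; the function name, the window length and the sample figures are hypothetical.

```python
def bias_corrected(nowcast, past_nowcasts, past_actuals, window=8):
    """Subtract the mean of the most recent nowcast errors from a new nowcast."""
    errors = [f - a for f, a in zip(past_nowcasts, past_actuals)][-window:]
    bias = sum(errors) / len(errors) if errors else 0.0
    return nowcast - bias


# Hypothetical history in which the nowcasts overshoot by about 0.5 points
history_nowcasts = [1.5, 1.7, 1.6]
history_actuals = [1.0, 1.2, 1.1]
print(bias_corrected(2.0, history_nowcasts, history_actuals))
```

Keeping the window short lets the correction adapt if the bias itself drifts, which fits the advice to re-specify models on a regular basis.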
Step 4: Big data preparation
Recommendations
1. Big Data are often unstructured: map them properly into structured data
2. Pre-treatment to remove deterministic patterns
• Outliers, calendar effects, missing observations
• Seasonal and non-seasonal short-term movements should
be dealt with according to the characteristics of the target
variable
3. Create a specific IT environment where the original
data are collected and stored with associated routines
4. Ensure the availability of an exhaustive
documentation of the Big Data conversion process
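Two of the pre-treatment tasks listed above, missing observations and outliers, can be sketched as follows. This is a minimal illustration, assuming linear interpolation for gaps and a median-absolute-deviation rule for outliers; both choices, the threshold of 3.5 and the function names are assumptions, and calendar or seasonal adjustment is not covered.

```python
from statistics import median


def fill_missing(series):
    """Linearly interpolate interior None values; edge gaps copy the nearest observation."""
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    for i in range(len(out)):
        if out[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = out[right]
        elif right is None:
            out[i] = out[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = out[left] + weight * (out[right] - out[left])
    return out


def flag_outliers(series, threshold=3.5):
    """Flag points whose robust z-score (based on the MAD) exceeds `threshold`."""
    med = median(series)
    mad = median(abs(v - med) for v in series)
    if mad == 0:
        return [False] * len(series)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in series]


raw = [10.0, 11.0, None, 10.0, 50.0, 10.0, 11.0]
clean = fill_missing(raw)
print(clean)
print(flag_outliers(clean))
```

Storing such routines alongside the original data, as recommended in point 3, keeps the conversion process reproducible and easy to document.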
Step 5: Big Data modelling strategy
Recommendations
1. Identification of appropriate econometric techniques
2. First dimension: decide whether to use methods suited to
large but not huge datasets, which are then applied to summaries of the
Big Data (e.g. Google Trends)
• nowcasting with large datasets can be based on factor models,
large BVARs, or shrinkage regressions
3. Huge datasets can be handled by sparse principal components,
linear models combined with heuristic optimization, or a variety of
machine learning methods such as LASSO & LARS regression
4. For mixed frequency data, methods such as U-MIDAS and,
as a second best, bridge models should be preferred.
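The mixed frequency recommendation can be sketched with a minimal unrestricted MIDAS (U-MIDAS) regression: the quarterly target is regressed by OLS on each monthly observation of the indicator within the quarter, with one free coefficient per month. The data, the single indicator and the helper names are hypothetical, and the small linear solver is only there to keep the example free of external libraries.

```python
def stack_months(monthly, per_quarter=3):
    """One row per quarter: a constant followed by that quarter's monthly values."""
    n_quarters = len(monthly) // per_quarter
    return [[1.0] + list(monthly[q * per_quarter:(q + 1) * per_quarter])
            for q in range(n_quarters)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)


# Hypothetical monthly indicator (8 quarters) and a quarterly target built from it
monthly = [0.0, 10.0, 1.0, 6.0, 3.0, 3.0, 6.0, 1.0, 10.0, 0.0, 4.0, 0.0,
           10.0, 1.0, 6.0, 3.0, 3.0, 6.0, 1.0, 10.0, 0.0, 4.0, 0.0, 10.0]
X = stack_months(monthly)
quarterly = [1.0 + 0.5 * r[1] + 0.2 * r[2] + 0.3 * r[3] for r in X]
beta = ols(X, quarterly)
print([round(b, 6) for b in beta])
```

Because every monthly lag gets its own coefficient, the approach is practical only when the frequency mismatch is small, as with monthly indicators and quarterly targets.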
Step 6: Results evaluation of Big Data
based nowcasting
Recommendations
1. Run a critical and comprehensive assessment of the
contribution of Big Data for nowcasting the indicator of interest
based, e.g., on standard criteria such as MSE or MAE.
2. In order to reduce the extent of data and model snooping, a
cross-validation approach should be followed:
• various models and indicators, with and without Big Data,
estimated over a first sample and selected and/or pooled
according to their performance
• then the performance of the preferred approaches re-evaluated
over a second sample
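The two-sample evaluation described above can be sketched as follows. This is a minimal illustration, assuming forecasts from each candidate are already available for the whole period; the model names, figures and split point are hypothetical.

```python
def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def select_then_validate(actual, forecasts, split):
    """Choose the best model on the first sample, then re-evaluate it on the second."""
    best = min(forecasts, key=lambda m: mse(actual[:split], forecasts[m][:split]))
    second_sample_mse = mse(actual[split:], forecasts[best][split:])
    return best, second_sample_mse


# Hypothetical target and two competing nowcasts
actual = [1.0, 2.0, 3.0, 4.0]
forecasts = {
    "with_big_data": [1.0, 2.0, 3.0, 5.0],
    "benchmark": [2.0, 3.0, 3.0, 4.0],
}
best, score = select_then_validate(actual, forecasts, split=2)
print(best, score)
```

In this toy case the model chosen on the first sample performs worse on the second, which is the kind of snooping effect the second-sample check is meant to expose.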
Case study
- Implementation of all these steps for nowcasting IP growth, inflation
and unemployment in several EU countries in a pseudo out-of-sample
context, using Google Trends for specific and carefully
selected keywords for each country and variable
- Big Data specific features: transforming unstructured into structured data,
time series decompositions, handling mixed frequency data
- Overall, the results are mixed, but there are several cases where
Google Trends, when combined with rather sophisticated econometric
techniques, yield forecasting gains, though generally small.
- Gains in terms of timeliness or revisions have not been considered
Literature contribution
Eurostat Statistical Working Paper
"Big Data and Macroeconomic Nowcasting:
From data access to modelling"
• Methodological findings will be included in 2 chapters of the
Eurostat/UNECE Handbook on Rapid Estimates, currently
under 2nd peer review (forthcoming in 2017)
What's next? Big Data Econometrics
• 2017, a new project focusing on:
• Econometrics, Filtering issues, advanced Bayesian
estimation and forecasting methods
• Real time empirical evaluations (including a direct
comparison with Eurostat flash estimates),
• New ways and new metrics to present nowcasts
• Possible data timeliness/accuracy gains
• Big data handling tool developed as R package
• Scientific summary for Big Data Econometric strategy
Thank you for your attention!!
Some References:
- Eurostat, Big data and macroeconomic nowcasting, preliminary results presented at the
ESS methodological working group (7 April 2016, Luxembourg)
http://ec.europa.eu/eurostat/cros/content/item21bigdataandmacroeconomicnowcastingslides_en
- Big data CROS portal, http://ec.europa.eu/eurostat/cros/content/big-data_en
- Marcellino, M. (2016), "Nowcasting with Big Data", Keynote Speech at the 33rd CIRET
conference.
- Harford, T. (2014, April). "Big data: Are we making a big mistake?", Financial Times.
Available at http://www.ft.com/cms/s/2/21a6e7d8-b479-11e3-a09a-00144feabdc0.html#ixzz2xcdlP1zZ
- Lazer, D., Kennedy, R., King, G., Vespignani, A. (2014). "The Parable of Google Flu:
Traps in Big Data Analysis", Science, 343, 1203-1205.
- Tibshirani, R. (1996). "Regression Shrinkage and Selection via the Lasso", Journal of
the Royal Statistical Society B, 58, 267-288.
