Dawn of the Data Age Lecture Series
Interpreting Data Like a Pro
Hi. I’m Luciano Pesci…
Co-Founder & CEO, EMPERITAS
● A Services as a Subscription team of economists and data scientists delivering bi-weekly Customer
Lifetime Value intelligence so our clients can beat their competitors for the most profitable customers.
Founder & Director, Utah Community Research Group, Univ. of Utah
● Teach microeconomics, statistics, applied research & data analytics, & American economic history.
● Teach data science for Westminster and developed their 3-class MBA emphasis in data science.
Today’s Lecture Outline
● Teach you how to identify data types & context.
● Explain the right way to select analysis methods.
● Show you the core data interpretation skills.
DATA TYPES & CONTEXT
Defining Data Differently
● There are many ways to define data, and each
requires a different approach when utilizing it:
○ Origin - How it was created.
○ Totality - If it’s a sample or a census.
○ Scope - Whether it’s been captured over time.
○ Measurement - How it was quantified.
What’s The Origin Story?
● Understanding the origin of your data is key
to grasping its context:
○ Experiments produce data with strong causal
patterns but it’s costly to collect & analyze.
○ Survey data is easy to get, but it shows intent or
attitude, not necessarily actual outcomes.
○ Observational data is mostly captured by machines
and shows actual outcomes, but it’s very rigid.
What’s the Totality?
● If you have data on every possible unit in a
population of interest, then it’s Census data.
● In most cases you’ll only have a Sample which
can be used to infer patterns about the larger
(unknowable) population.
Scoping Time?
● If your data contains different variables, all
measured at the same time, it’s Cross-Sectional.
○ Most data that you encounter will be cross-sectional.
● If your data contains multiple measurements of
the same variable over time, it’s Time Series.
Data Measurement?
● All data fits into 4 basic types
based on how it was measured:
○ Nominal & Ordinal = CATEGORICAL.
○ Interval & Ratio = CONTINUOUS.
● Identifying data types is a
critical skill to develop.
○ Analysis selection &
interpretation depend on it.
SELECTING THE RIGHT ANALYSIS
Categorical or Continuous?
● The biggest difference when selecting analysis is
based on whether the data is categorical
(nominal, ordinal) or continuous (interval, ratio).
○ So much of what you can or can’t do is determined by
the data’s measurement type.
○ Time Series vs Cross-Sectional is another important
distinction that radically changes your approach.
Looking for Differences
● Tests of difference, like comparing medians
or means, are a good way to find unique
subgroups within the data.
○ This should only be done with continuous data,
though you can use categorical variables for
subgrouping when testing for differences.
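As a rough illustration of the idea above, the sketch below splits a continuous variable by a categorical one and compares the mean and median of each subgroup. The rows, group labels, and the helper name summarize_by_group are made up for illustration and are not the festival data. It is a descriptive comparison only; a formal test of difference would add a significance check on top of it.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows: (categorical subgroup, continuous value in dollars).
rows = [
    ("promoter", 400), ("promoter", 520), ("promoter", 610),
    ("passive", 250), ("passive", 310),
    ("detractor", 90), ("detractor", 150), ("detractor", 120),
]

def summarize_by_group(rows):
    """Return {group: (n, mean, median)} for a continuous value split by a category."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {g: (len(v), mean(v), median(v)) for g, v in groups.items()}

summary = summarize_by_group(rows)
for group, (n, avg, mid) in summary.items():
    print(f"{group:10s} n={n} mean={avg:.1f} median={mid}")
```

Large gaps between subgroup centers are the candidates worth testing formally.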
Looking for Similarities
● Measures of association (like correlation)
are a good way to find patterns that move
together in the data.
● While correlation doesn’t equal causation,
theory can help you understand when
correlations are likely to be real or not.*
*Source: www.tylervigen.com/spurious-correlations
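One common measure of association is Pearson's correlation coefficient. The sketch below computes it from scratch so each step is visible; the function name pearson_r and the two short lists are assumptions for illustration, not data from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return cov / (spread_x * spread_y)

# Made-up example: two variables that move together perfectly.
years_attended = [1, 2, 3, 4, 5]
tickets_bought = [2, 4, 6, 8, 10]
print(pearson_r(years_attended, tickets_bought))
```

A value near +1 or -1 only says the variables move together; whether the link is real is still a question for theory.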
CORE DATA INTERPRETATION SKILLS
Looking At It Both Ways
● You should look at tables & visualizations.
○ Each tells a unique part of the data’s story.
● 3 very specific things to find (when possible
based on your data type):
○ Shape of the distribution
○ Center of the distribution
○ Spread of the distribution
Understanding Shape
● The shape of any ordinal, interval or ratio
data is important to its interpretation.
○ Can show multimodality and/or outliers.
● This is much easier to see through a
visualization than from a table of numbers.
Understanding Center
● The central value of interval or ratio data
tells you what value to expect (it may
also be the most frequent value).
○ You should calculate both the median & mean.
■ If they differ this is a sign you have skew
in the data, possibly from outliers.
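A small sketch of the median-versus-mean check, using Python's standard statistics module. The values are invented for illustration: seven ordinary purchases and one extreme one.

```python
from statistics import mean, median

def center_report(values):
    """Return the mean, the median, and the gap between them."""
    avg = mean(values)
    mid = median(values)
    return {"mean": avg, "median": mid, "mean_minus_median": avg - mid}

# Made-up data with a single extreme value.
values = [2, 2, 4, 4, 4, 6, 6, 400]
report = center_report(values)
print(report)
```

Here the single value of 400 pulls the mean far above the median, which is the kind of gap that signals skew.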
Understanding Spread
● The spread of interval and ratio data tells an
important story about the precision of your predictions.
○ Calculate the Interquartile Range (IQR).
■ This is found by subtracting the 1st quartile from the 3rd
quartile, and spans the middle 50% of the data.
○ You can also calculate the variance and standard deviation.
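The spread measures above can be sketched with the standard statistics module. The data is made up, and the helper name spread_report is an assumption. Note that software packages use slightly different quartile conventions, so IQR values can differ a little between tools.

```python
from statistics import quantiles, stdev, variance

def spread_report(values):
    """Return the IQR, sample variance, and sample standard deviation."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _q2, q3 = quantiles(values, n=4)
    return {
        "iqr": q3 - q1,                # width of the middle 50% of the data
        "variance": variance(values),  # sample variance (divides by n - 1)
        "stdev": stdev(values),        # square root of the sample variance
    }

values = [1, 2, 3, 4, 5, 6, 7]  # made-up data
report = spread_report(values)
print(report)
```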
5-Number Summary
● Between a visualization and the 5-Number
Summary you get most of the information
you need to interpret what’s going on with
your variable.
○ This will show you the min value, 1st & 3rd quartiles,
median (reported here alongside the mean), and max value.
○ The only thing that’s missing is the number of
observations (n-count).
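A minimal sketch of a summary function that also reports the n-count and the mean, so nothing is missing. The function name five_number_summary and the sample values are illustrative assumptions.

```python
from statistics import mean, median, quantiles

def five_number_summary(values):
    """Min, Q1, median, Q3, max, plus the n-count and the mean."""
    q1, _q2, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "mean": mean(values),
        "q3": q3,
        "max": max(values),
    }

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7])  # made-up data
for name, value in summary.items():
    print(f"{name:7s}{value}")
```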
WORKED EXAMPLE
The Example Data’s Origin
● The example data comes from a survey of festival goers (aka customers)
and was linked to observational data about their
lifetime ticket sales.
● It’s a cross-sectional sample (n=3,834) since we
don’t have every festival customer’s feedback and
the data was captured at a single moment in time.
Inspecting Your Data File
● Before you start summarizing and visualizing
your data, open the raw file and look around.
● Make sure you can identify what the rows
are, and what each column measures.
○ When in doubt, ask for a data map or data dictionary.
Ordinal Data: Years Attended
68% of festival customers have been attending for less than 10 years.
1 in 10 have been attending for more than 20 years.
Interval Data: Likelihood to Recommend
5-Number Summary
Min            0
1st Quartile   9
Median         10
Mean           9.2
3rd Quartile   10
Max            10
~80% of festival customers are likely
to recommend (9’s & 10’s).
Making It Ordinal: Net Promoter Groups*
● It’s always possible to transform
data from continuous to categorical,
but not the other way around.
○ Likelihood to recommend can be
transformed into categorical
groups to create a simpler metric:
■ Net Promoter Score (NPS).
*Source: https://hbr.org/2003/12/the-one-number-you-need-to-grow
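The transformation can be sketched as below. It assumes the standard Net Promoter cutoffs, which the slides do not spell out: 0-6 are detractors, 7-8 are passives, and 9-10 are promoters, with NPS equal to the percentage of promoters minus the percentage of detractors. The scores are invented, not the festival responses.

```python
def nps_group(score):
    """Map a 0-10 likelihood-to-recommend score to its Net Promoter group."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"

def net_promoter_score(scores):
    """NPS = % promoters minus % detractors, on a -100 to +100 scale."""
    groups = [nps_group(s) for s in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100 * (promoters - detractors) / len(scores)

scores = [10, 10, 9, 9, 10, 9, 10, 10, 8, 3]  # made-up responses
print(net_promoter_score(scores))
```

With 8 promoters and 1 detractor out of 10 responses, the sketch gives an NPS of 70.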
From Interval to Ordinal Data: NPS
● The Festival’s NPS is: 75%
● We could use these groups for
testing differences, like in their
Customer Lifetime Value.
○ This is often why you want to create
categorical data from continuous data.
Ratio Data: # Tickets Purchased Per Visit
5-Number Summary
Min            0
1st Quartile   2
Median         4
Mean           5.8
3rd Quartile   6
Max            400
The presence of outliers hides an important pattern
in this data. To see it, we will drop outliers who
purchase more than 13 tickets per visit.*
*You should ALWAYS note when you drop outliers from analysis.
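One way to keep that note honest is to have the filtering step report how many observations it removed. The sketch below is an illustration only: the helper name drop_above, the cutoff, and the values are all assumptions.

```python
from statistics import mean

def drop_above(values, cutoff):
    """Keep values at or below the cutoff and report how many were dropped."""
    kept = [v for v in values if v <= cutoff]
    return kept, len(values) - len(kept)

tickets = [0, 2, 2, 4, 4, 6, 8, 12, 40, 400]  # made-up purchases per visit
kept, n_dropped = drop_above(tickets, cutoff=12)

print(f"mean before: {mean(tickets)}")
print(f"mean after:  {mean(kept)}  ({n_dropped} outliers dropped above 12)")
```

Printing the count and the cutoff next to the result means the note travels with the number.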
Ratio Data: # Tickets Purchased Per Visit
5-Number Summary
Min            0
1st Quartile   2
Median         4
Mean           4.4
3rd Quartile   6
Max            12
With outliers removed the mean falls to ~4 tickets, and we
can see multimodality for even-numbered purchases.
People don’t like to go to the festival alone.
Ratio Data: Customer Lifetime Value
5-Number Summary
Min            $0
1st Quartile   $124
Median         $336
Mean           $1,510
3rd Quartile   $1,125
Max            $479,878
As with tickets purchased, the presence of outliers is
obscuring any detail in the visualization.
The maximum value of $479,878 is suspiciously high
(though it turns out to be an accurate value, despite
being 55 standard deviations above the mean*).
*Values more than 3 Standard Deviations from the mean are considered outliers.
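The 3-standard-deviation rule of thumb can be sketched as a simple filter. The helper name flag_outliers and the data are assumptions for illustration. The example uses 20 points because in a very small sample no single value can sit 3 sample standard deviations from the mean.

```python
from statistics import mean, stdev

def flag_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - center) / spread > threshold]

# Made-up data: 19 ordinary values and one extreme one.
values = [2, 4, 4, 6] * 5
values[-1] = 400
flagged = flag_outliers(values)
print(flagged)
```

As the festival example shows, a flagged value is a prompt to investigate, not proof of an error.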
Ratio Data: Customer Lifetime Value
5-Number Summary
Min            $0
1st Quartile   $112
Median         $249
Mean           $486
3rd Quartile   $642
Max            $2,624
Dropping outliers above $2,624 causes the mean
to fall significantly from its previous level.
This shows EXTREME leverage in the data.
Ratio Data: Customer Lifetime Value
Like most data, the festival’s Customer Lifetime Value
exhibits the Pareto Principle (aka the 80/20 rule).
This means 80% of all CLV comes from 20% of customers.
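To check a claim like this on your own data, you can compute the share of total value that comes from the top 20% of customers. The sketch below uses invented CLV figures chosen to land exactly on 80/20; the function name top_share is an assumption.

```python
def top_share(values, top_fraction=0.2):
    """Share of the total contributed by the highest `top_fraction` of values."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("total value is zero, so shares are undefined")
    k = max(1, round(len(ordered) * top_fraction))  # number of top customers
    return sum(ordered[:k]) / total

clv = [5000, 3000, 400, 300, 300, 250, 250, 200, 200, 100]  # made-up dollars
print(f"Top 20% of customers hold {top_share(clv):.0%} of total CLV")
```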
What We Learned About CLV?
● Most festival customers have been attending for less than 10 years, but
there’s a small group that’s been coming for more than 20.
● Festival customers are unlikely to come alone: they typically buy 4 tickets,
and about 80% are likely to recommend the festival.
● The average CLV is $486 once outliers are dropped, and 80% of all CLV
comes from just 20% of festival customers.
Next Step: Analytics & Predictive Modeling
● The next step for this data would be
multivariate analytics.
○ Tests of difference & measures of association.
○ Present discounted value of future ticket sales.
● After that, we could use all of the data
to build a predictive model for CLV.
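For the present discounted value step, a minimal sketch of the standard formula follows: each future cash flow is divided by (1 + rate) raised to the number of periods until it arrives. The cash flows and the 10% discount rate are assumptions for illustration, not figures from the festival data.

```python
def present_value(cash_flows, rate):
    """Discount a list of future cash flows (one per period, starting next period)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Hypothetical expected ticket sales for the next two years, discounted at 10%.
future_sales = [110, 121]
print(round(present_value(future_sales, rate=0.10), 2))
```

In this example each year's sales are worth 100 today, so the present value is 200.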
JOIN US FOR THE NEXT LECTURE
Turning Analytics into Actionable Insights, Thursday October 19th 2017
emperitas.com/lecture
