EVOLVING YOUR ANALYTICS
STACK WITH YOUR BUSINESS
SNOWPLOW - LONDON MEETUP #4
BUSINESSES ARE CONSTANTLY EVOLVING…
▸ Your products (apps & platforms) change
▸ Your questions should change too
▸ It’s critical that the analytics stack can evolve with your
business
HOW?
SELF-DESCRIBING DATA + EVENT DATA MODELING = EVOLVING EVENT DATA PIPELINE
SELF-DESCRIBING DATA
PART 1
NO TWO COMPANIES ARE ALIKE
DEFINE YOUR OWN EVENTS AND ENTITIES
Example 1
Events
‣ Build castle
‣ Form alliance
‣ Declare war
Entities
‣ Player
‣ Game
‣ Level
‣ Castle
Example 2
Events
‣ View product
‣ Buy product
‣ Deliver product
Entities
‣ Product
‣ Customer
‣ Basket
‣ Vehicle
YOU THEN DEFINE A SCHEMA FOR EACH EVENT AND ENTITY
{
  "description": "Schema for a fighter context",
  "vendor": "com.ufc",
  "name": "fighter",
  "version": "1-0-2",
  "properties": {
    "FirstName": {"type": "string"},
    "LastName": {"type": "string"},
    "Nickname": {"type": "string"},
    "FacebookProfile": {"type": "string"},
    "WeightLbs": {"type": ["integer", "null"]},
    "Record": {"type": "string", "pattern": "^[0-9]+-[0-9]+-[0-9]+$"}
  }
}
I DON’T DO EVENTS THAT AREN’T SCHEMA’ED
YOU THEN DEFINE A SCHEMA FOR EACH EVENT AND ENTITY
{
  "schema": "iglu:com.ufc/fighter/jsonschema/1-0-2",
  "data": {
    "FirstName": "Daniel",
    "LastName": "Cormier",
    "Nickname": "DC",
    "FacebookProfile": "Daniel-Cormier",
    "TwitterName": "dc_mma",
    "WeightLbs": 205
  }
}
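The schema reference in the example above has the shape iglu:vendor/name/format/version. As a rough sketch of how a consumer might split that reference into its parts, the snippet below uses a regular expression. The names `SchemaKey` and `parse_schema_key` are made up for illustration and are not part of any Snowplow library. The permitted characters in each segment are an assumption, and the vendor `com.ufc` is taken from the schema definition.

```python
import re
from typing import NamedTuple


class SchemaKey(NamedTuple):
    """The four parts of a schema reference (hypothetical helper type)."""
    vendor: str
    name: str
    format: str
    version: str


# iglu:<vendor>/<name>/<format>/<version>; the character classes are an assumption.
_SCHEMA_URI = re.compile(
    r"^iglu:(?P<vendor>[A-Za-z0-9_.-]+)/(?P<name>[A-Za-z0-9_-]+)/"
    r"(?P<format>[A-Za-z0-9_-]+)/(?P<version>[0-9]+-[0-9]+-[0-9]+)$"
)


def parse_schema_key(uri: str) -> SchemaKey:
    """Split a schema reference into vendor, name, format and version."""
    match = _SCHEMA_URI.match(uri)
    if match is None:
        raise ValueError(f"not a valid schema reference: {uri!r}")
    return SchemaKey(**match.groupdict())


event = {
    "schema": "iglu:com.ufc/fighter/jsonschema/1-0-2",
    "data": {"FirstName": "Daniel", "LastName": "Cormier", "WeightLbs": 205},
}
key = parse_schema_key(event["schema"])
print(key.vendor, key.name, key.format, key.version)
```

Because the reference travels with the data, a consumer can look up the right schema for each event without any out-of-band knowledge.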
THE SCHEMAS CAN THEN BE USED IN A NUMBER OF WAYS
▸ Validate the data (important for data quality)
▸ Load the data into tidy tables in your data warehouse
▸ Make it easy and safe to write downstream data processing applications (e.g. for real-time users)
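To make the validation use case concrete, here is a minimal sketch that checks a data payload against the fighter properties. It is a hand-rolled check covering only the `type` and `pattern` keywords used in that schema, not a full JSON Schema validator. The function name `validate` is hypothetical, and tolerating undeclared fields is an assumption of this sketch.

```python
import re

# JSON Schema type names mapped to Python checks (only the types used here).
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "null": lambda v: v is None,
}

FIGHTER_PROPERTIES = {
    "FirstName": {"type": "string"},
    "LastName": {"type": "string"},
    "Nickname": {"type": "string"},
    "FacebookProfile": {"type": "string"},
    "WeightLbs": {"type": ["integer", "null"]},
    "Record": {"type": "string", "pattern": "^[0-9]+-[0-9]+-[0-9]+$"},
}


def validate(data, properties):
    """Return a list of error messages; an empty list means the data passed."""
    errors = []
    for field, value in data.items():
        rules = properties.get(field)
        if rules is None:
            continue  # assumption: undeclared fields are tolerated
        allowed = rules["type"]
        allowed = [allowed] if isinstance(allowed, str) else allowed
        if not any(_TYPE_CHECKS[t](value) for t in allowed):
            errors.append(f"{field}: expected {allowed}, got {value!r}")
        elif "pattern" in rules and isinstance(value, str):
            if not re.search(rules["pattern"], value):
                errors.append(f"{field}: {value!r} does not match {rules['pattern']}")
    return errors


print(validate({"FirstName": "Daniel", "WeightLbs": 205}, FIGHTER_PROPERTIES))
print(validate({"WeightLbs": "205", "Record": "19-1"}, FIGHTER_PROPERTIES))
```

The first call returns no errors. The second reports both the string weight and the malformed record, which is the kind of bad data a pipeline would want to set aside rather than load.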
EVENT DATA MODELING
PART 2
WHAT IS EVENT DATA MODELING?
▸ Event data modeling is the process of using business logic
to aggregate over event-level data to produce 'modeled'
data that is simpler for querying.
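One common instance of this is sessionization: applying a business rule to raw events to produce one row per session. The sketch below assumes events are dictionaries with a `user_id` and a numeric `ts` in seconds, and that 30 minutes of inactivity ends a session. Both the field names and the timeout are illustrative assumptions, not something the slides specify.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumption: 30 minutes of inactivity ends a session


def _session_row(user_id, index, start, end, count):
    return {"user_id": user_id, "session_index": index,
            "start": start, "end": end, "event_count": count}


def sessionize(events):
    """Aggregate event-level data into one row per (user, session)."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event["user_id"]].append(event["ts"])

    sessions = []
    for user_id, timestamps in by_user.items():
        timestamps.sort()
        index, count = 0, 0
        start = previous = timestamps[0]
        for ts in timestamps:
            if ts - previous > SESSION_TIMEOUT_SECONDS:
                # The gap is too long: close the current session and open a new one.
                sessions.append(_session_row(user_id, index, start, previous, count))
                index, start, count = index + 1, ts, 0
            previous = ts
            count += 1
        sessions.append(_session_row(user_id, index, start, previous, count))
    return sessions


events = [
    {"user_id": "a", "ts": 0},
    {"user_id": "a", "ts": 600},
    {"user_id": "a", "ts": 4000},
    {"user_id": "b", "ts": 100},
]
for row in sessionize(events):
    print(row)
```

The timeout is the opinionated part: change it and the same events yield a different sessions table.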
MODELED VS UNMODELED DATA
Unmodeled data: event 1 … event n
IMMUTABLE. UNOPINIONATED. HARD TO CONSUME. NOT …
Modeled data: Users, Sessions, Funnels, …
MUTABLE AND OPINIONATED. EASY TO CONSUME.
IN GENERAL, EVENT DATA MODELING IS PERFORMED ON THE COMPLETE EVENT STREAM
▸ Late arriving events can change the way you understand
earlier arriving events
▸ If we change our data models, working from the complete stream gives us the flexibility to recompute historical data based on the new model
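A small sketch of the late-arriving-event point: the function below rebuilds an events-per-user table from the full stream each time it runs. The field names (`cookie`, `user_id`, `name`) and the rule that a login ties a cookie to a known user are assumptions made for illustration.

```python
from collections import Counter


def events_per_user(events):
    """Recompute the modeled table from the complete event stream."""
    # Pass 1: learn which cookie belongs to which known user.
    cookie_to_user = {e["cookie"]: e["user_id"] for e in events if e.get("user_id")}
    # Pass 2: attribute every event, falling back to the cookie for anonymous visitors.
    return Counter(
        cookie_to_user.get(e["cookie"], "anon:" + e["cookie"]) for e in events
    )


stream = [
    {"cookie": "c1", "name": "view_product"},
    {"cookie": "c1", "name": "view_product"},
]
before = events_per_user(stream)

# A login event arrives late and identifies the visitor behind cookie c1.
stream.append({"cookie": "c1", "name": "login", "user_id": "u42"})
after = events_per_user(stream)

print(before)  # Counter({'anon:c1': 2})
print(after)   # Counter({'u42': 3})
```

The two earlier page views are re-attributed to `u42` only because the model is recomputed over all events rather than updated incrementally.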
EVOLVING THE DATA PIPELINE
PART 3
HOW DO WE HANDLE PIPELINE EVOLUTION?
▸ Businesses change over time
▸ The events that occur are going to change
▸ Use of the data will change
▸ Insight -> more questions -> more insight -> more
questions
▸ Two types of evolution: push and pull
BUSINESSES ARE NOT STATIC, SO EVENT PIPELINES SHOULD NOT BE EITHER
PUSH EXAMPLE:
▸ If data is self-describing, it is easy to add additional sources
▸ Self-describing data is good for managing bad data and pipeline evolution
I’M AN EMAIL SEND EVENT AND I HAVE INFORMATION ABOUT THE RECIPIENT (EMAIL
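As an illustration of why self-describing data makes new sources cheap to add, the sketch below routes each event by its schema reference. Adding a source means registering one more handler, and events with an unrecognised schema are set aside as bad data instead of breaking the pipeline. The schema references, handler functions and field names are all hypothetical.

```python
def route(events, handlers):
    """Dispatch each event to the handler registered for its schema.

    Returns (processed, bad): handler outputs, and events with no known schema.
    """
    processed, bad = [], []
    for event in events:
        handler = handlers.get(event.get("schema"))
        if handler is None:
            bad.append(event)  # unknown or missing schema: keep it for inspection
        else:
            processed.append(handler(event["data"]))
    return processed, bad


# Hypothetical schema references and handlers.
handlers = {
    "iglu:com.acme/page_view/jsonschema/1-0-0": lambda d: ("page_view", d["url"]),
}

# Adding the new email source is one extra registration.
handlers["iglu:com.acme/email_send/jsonschema/1-0-0"] = (
    lambda d: ("email_send", d["recipient"])
)

events = [
    {"schema": "iglu:com.acme/page_view/jsonschema/1-0-0", "data": {"url": "/home"}},
    {"schema": "iglu:com.acme/email_send/jsonschema/1-0-0",
     "data": {"recipient": "someone@example.com"}},
    {"schema": "iglu:com.acme/unknown/jsonschema/1-0-0", "data": {}},
]
processed, bad = route(events, handlers)
print(processed)
print(len(bad))
```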
ANSWERING THE QUESTION:
1. EXISTING DATA MODEL SUPPORTS ANSWER
2. NEED TO UPDATE DATA MODEL
3. NEED TO UPDATE DATA MODEL AND DATA COLLECTION
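The three cases can be read as a simple decision rule. In the sketch below, a question is described by the fields it needs. These are compared with the columns already in the modeled tables and with the fields present in the collected events. The function name and the idea of reducing a question to a set of fields are simplifying assumptions.

```python
def classify_question(needed_fields, modeled_columns, collected_fields):
    """Return 1, 2 or 3, matching the three cases listed above."""
    needed = set(needed_fields)
    modeled = set(modeled_columns)
    collected = set(collected_fields)
    if needed <= modeled:
        return 1  # existing data model supports the answer
    if needed <= modeled | collected:
        return 2  # data is collected, but the data model must be updated
    return 3      # data model and data collection must both be updated


modeled = {"user_id", "session_count"}
collected = {"user_id", "page_url", "referrer"}

print(classify_question({"session_count"}, modeled, collected))         # 1
print(classify_question({"user_id", "referrer"}, modeled, collected))   # 2
print(classify_question({"user_id", "email_opened"}, modeled, collected))  # 3
```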
SELF-DESCRIBING DATA AND THE ABILITY TO RECOMPUTE DATA MODELS ARE ESSENTIAL TO ENABLE PIPELINE EVOLUTION
SELF-DESCRIBING DATA
‣ Update existing events and entities in a backward-compatible way, e.g. add optional new fields
‣ Update existing events and entities in a backward-incompatible way, e.g. change field types, remove fields, add compulsory fields
‣ Add new event and entity types
RECOMPUTE DATA MODELS ON ENTIRE DATA SET
‣ Add new columns to existing derived tables, e.g. add new audience segmentation
‣ Change the way existing derived tables are generated, e.g. change sessionization logic
‣ Create new derived tables
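The distinction between compatible and incompatible schema updates can be checked mechanically. The sketch below compares two versions of a schema, each given as a dictionary with `properties` and an optional `required` list, and applies the rules listed above: removing a field, changing a field's definition, or adding a compulsory field is incompatible, while adding optional fields is compatible. The function name is hypothetical, and treating any change to a field's definition as incompatible is a deliberately conservative assumption.

```python
def classify_schema_change(old, new):
    """Classify a schema update as 'unchanged', 'compatible' or 'incompatible'."""
    old_props, new_props = old["properties"], new["properties"]
    removed = old_props.keys() - new_props.keys()
    added = new_props.keys() - old_props.keys()
    # Conservative: any difference in a shared field's definition counts as a change.
    changed = {f for f in old_props.keys() & new_props.keys()
               if old_props[f] != new_props[f]}
    newly_required = set(new.get("required", [])) - set(old.get("required", []))

    if removed or changed or newly_required:
        return "incompatible"
    if added:
        return "compatible"
    return "unchanged"


v1 = {"properties": {"FirstName": {"type": "string"},
                     "WeightLbs": {"type": ["integer", "null"]}}}

# Adding an optional field is backward compatible.
v2 = {"properties": {**v1["properties"], "TwitterName": {"type": "string"}}}

# Changing a field type is not.
v3 = {"properties": {"FirstName": {"type": "string"},
                     "WeightLbs": {"type": "string"}}}

print(classify_schema_change(v1, v2))  # compatible
print(classify_schema_change(v1, v3))  # incompatible
```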
QUESTIONS?
How to evolve your analytics stack with your business using Snowplow
