Real-Time Applications At
Terabyte Scale
Isaac Mosquera
VP Engineering, Data & Insights
You’ve probably seen our sharing
tools...
But that’s not all we do...
WE MAKE SOCIAL DATA ACTIONABLE
Over 1B social signals
are processed monthly by
the ShareThis Social
Intelligence Platform™ to
generate insights about
your brand, industry and
events.
ENGAGEMENT
Users consume and
share content across
web and mobile
TARGETING
Desktop and mobile
targeting at scale
INSIGHTS
Actionable cross-device
insights
DATA
1B+ first party
Social Actions
Monthly
ENGAGEMENT
TARGETING INSIGHTS
DATA
• Lookalike Audiences
• Audience Segments
“Wow small SUVs are fuel efficient!”
User #12345
• Automotive Study
• Car Buying Infographic
Why Is Real-Time Important?
Time
Sharing Interest Decays With Time
The Previous
Architecture
Previous Architecture Problems
Duplicated Data
[Diagram: multiple Query Engine boxes alongside Share Data, Insights, Ad Tech, Consumer Engagement, and Data Science]
Fragmented & Siloed Data Sources
[Diagram: multiple Query Engine boxes alongside Share Data, Insights, Ad Tech, Consumer Engagement, and Data Science, with data labeled Campaign, RTB, Conversion, Summarization, 3rd Party, Trends, and Studies]
Generating Reports From Old Platform
Raw Data
Pre
Aggregation
Staged Data
Results
Consumers
Query
Rest API
New Report Type
Why Focus On These Problems?
Faster Iterations
Data Science
New Applications
Business Value
Targeting
The Birth of a New Team
Data Team’s Mission
Making our data easily accessible
Our Data Vision
Centralize Data Sources
Data Quality & Trust
Reliable Infrastructure
Real Time All The Things
Raw Social
Data
DLX Geo Device
Mappings
Sentiment
Social Keywords
Downstream
Applications
Kafka Architecture
Application
Logs
Producers
Application
Logs
Producers
Brokers
Consumers
Data Loaders
Annotations
Filters
Destinations
BigQuery
Social Ad Tech
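
The flow named on this slide (producers, brokers, consumers, data loaders with annotations and filters, destinations) can be sketched roughly as follows. This is a minimal illustration, not the ShareThis implementation: an in-memory `deque` stands in for a Kafka topic, and the names `annotate_geo`, `is_share`, `load`, and the event fields are all hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List

Event = dict


def annotate_geo(event: Event) -> Event:
    """Hypothetical annotation: attach a region looked up from the IP address."""
    region_by_ip = {"10.0.0.1": "north east"}  # assumed lookup table
    return {**event, "region": region_by_ip.get(event.get("ip"), "unknown")}


def is_share(event: Event) -> bool:
    """Hypothetical filter: keep only share actions."""
    return event.get("action") == "share"


def load(
    events: Iterable[Event],
    annotations: List[Callable[[Event], Event]],
    filters: List[Callable[[Event], bool]],
) -> Iterator[Event]:
    """Data loader stage: annotate each consumed event, then apply the filters."""
    for event in events:
        for annotate in annotations:
            event = annotate(event)
        if all(keep(event) for keep in filters):
            yield event


# The deque plays the role of a topic that producers have written to.
topic = deque([
    {"user_id": 12345, "action": "share", "ip": "10.0.0.1"},
    {"user_id": 67890, "action": "view", "ip": "10.0.0.2"},
])

# The list plays the role of a destination such as a warehouse table.
destination = list(load(topic, [annotate_geo], [is_share]))
```

Keeping annotations and filters as plain functions means a new enrichment or a new destination-specific filter can be added without touching the loader loop.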
Integrate Campaign
Social Data
DLX Geo Device
Mappings
Sentiment
Social Keywords
RTB Bid Data
Campaign Data
Downstream
Applications
Build An Active Warehouse
3 Trillion Row Interactive
Query Engine
Share Data
Data
Science
Ad Tech
Consumer
Engagement
Sales
Strategy
Insights
RTB
Impressions &
Clicks + RT
External
Data
Science
Data
Science
ATDs
Data Science
DMPs DSPs
Internal
Google BigQuery
Reliability
Add redundancy and robustness to our data pipeline to protect us against data loss.
Unified Monitoring
Centralizing monitoring gives us a single definition of “data quality”.
Monitoring Infrastructure
Consumer App
Metrics Library
Producer App
Metrics Library
Graphite
Slack
Dev Team
Seyren
Dashboards
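
As a rough sketch of what a shared metrics library in the producer and consumer apps might emit, the snippet below formats a data point in Graphite's plaintext protocol (`<path> <value> <timestamp>`, one metric per line). The function name `graphite_line` and the metric path are assumptions for illustration; the deck does not show the library's interface.

```python
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"


# Hypothetical metric name; a fixed timestamp keeps the example reproducible.
line = graphite_line("pipeline.consumer.events_loaded", 1250, 1430000000)
```

In a real deployment the line would be written to the Carbon listener over a socket; using the same helper in every app is what gives all teams consistent metric names to alert and build dashboards on.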
Defining Data Quality
Expected Field Distribution
Data Loss
Business KPIs
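
Two of these checks can be illustrated with a short sketch. It assumes data loss is measured by comparing produced and consumed counts, and that field distribution is compared using total variation distance; both choices, and the names `data_loss_ratio` and `distribution_drift`, are assumptions rather than anything stated in the deck.

```python
from collections import Counter
from typing import Dict, List


def data_loss_ratio(produced: int, consumed: int) -> float:
    """Fraction of produced events that never reached the consumer."""
    if produced == 0:
        return 0.0
    return max(produced - consumed, 0) / produced


def distribution_drift(expected: Dict[str, float], values: List[str]) -> float:
    """Total variation distance between expected and observed value shares."""
    counts = Counter(values)
    total = sum(counts.values())
    keys = set(expected) | set(counts)
    return 0.5 * sum(
        abs(expected.get(k, 0.0) - (counts[k] / total if total else 0.0))
        for k in keys
    )


# Hypothetical numbers: 1000 events produced, 990 loaded.
loss = data_loss_ratio(produced=1000, consumed=990)

# Hypothetical expectation: 60% desktop, 40% mobile; observed is 50/50.
drift = distribution_drift(
    {"desktop": 0.6, "mobile": 0.4},
    ["desktop"] * 5 + ["mobile"] * 5,
)
```

Either number can be reported as a metric and alerted on once it crosses a threshold the team agrees on.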
What’s Next?
Dynamic Stream Filter
You Want This But You Get This
Stream Sources
Filter Application
Data Filter UI
Filter
Definitions
Data Stream Filter Prototype
Real Time Pipeline
• shares from top 100 domains
• user actions in north east region
• users who recently bought car
• user likely to buy a car soon
• actions from user ids in (1234, 5432, 9999)
Data Science
External Customers
Internal Teams
Predictive Algorithms
Dynamically create filters based on customers’ needs. Filters can be
created instantly, on demand.
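
One way to picture on-demand filters is to store each filter definition as data and compile it into a predicate when it is needed. The sketch below assumes a definition is a list of `(field, operator, expected)` triples; that format, the `OPS` table, and `compile_filter` are hypothetical and not taken from the prototype.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

Condition = Tuple[str, str, Any]

# Assumed operator vocabulary for filter definitions.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "in": lambda value, allowed: value in allowed,
}


def compile_filter(definition: List[Condition]) -> Callable[[dict], bool]:
    """Turn a stored filter definition into a predicate over events."""
    def matches(event: dict) -> bool:
        return all(
            field in event and OPS[op](event[field], expected)
            for field, op, expected in definition
        )
    return matches


# Two of the example filters from the slide, written as definitions.
by_user = compile_filter([("user_id", "in", {1234, 5432, 9999})])
by_region = compile_filter([("region", "eq", "north east")])

events = [
    {"user_id": 1234, "region": "north east", "action": "share"},
    {"user_id": 42, "region": "south west", "action": "share"},
]
selected = [e for e in events if by_user(e) and by_region(e)]
```

Because a definition is plain data, a UI can create or change one without redeploying the filter application.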
Questions?
Isaac Mosquera
twitter: imosquera
e-mail: isaac@sharethis.com
