Building Real-Time Search
at Mailchimp
Kevin Xu
Software Engineer
Building Real-Time Search at MailChimp
2018
JAN ‘17 OCT ‘17
MC Technical Overview
US1 - US19
SEARCH APP
(downstream)
search queries
indexing requests
(every 5 min)
SQLite Search
SEARCH APP
Problem: Querying
search queries
SEARCH APP
Problem: Indexing
indexing requests
(every ???)
SEARCH APP
search queries
indexing requests
(every ???)
SQLite Search
Considering a Replacement
- Use cases: full-text search, logging/log analysis, events and metrics
- Fast queries
- Scale horizontally
Elasticsearch Docs
Querying ES
Capturing Events from MC
US1
US19
Possible Solution: Direct Indexing?
Adding a Message Queue
Real-Time Streaming
Proprietary & Confidential
Kafka: Topics
Kafka: Partitions
Connecting the Dots
Tracking
Events
All Changes
Searchable
Changes
App-Layer Filtering
config.searchable_model: "Contact";
config.searchable_model: "Campaign";
...
$contact = new Contact("Ben");
$contact->save(); // onSave()
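The filtering idea on this slide can be sketched roughly as follows. This is a minimal illustration, not Mailchimp's actual code: the `SearchableConfig` class, the `Model` base class, the in-memory `$changelog` array, and the `AuditNote` model are all hypothetical names introduced here. It assumes that `save()` triggers an `onSave()` hook and that only models listed as searchable emit a change event.

```php
<?php
// Hypothetical sketch: only models listed as searchable emit a change event on save.

final class SearchableConfig
{
    /** Model names treated as searchable (assumed list, mirroring the slide). */
    public const SEARCHABLE_MODELS = ['Contact', 'Campaign'];

    public static function isSearchable(string $model): bool
    {
        return in_array($model, self::SEARCHABLE_MODELS, true);
    }
}

abstract class Model
{
    /** In-memory stand-in for the change log that would be shipped downstream. */
    public static array $changelog = [];

    public function save(): void
    {
        // ... persisting the record is omitted in this sketch ...
        $this->onSave();
    }

    protected function onSave(): void
    {
        $model = static::class;
        // App-layer filter: skip models that are not configured as searchable.
        if (SearchableConfig::isSearchable($model)) {
            self::$changelog[] = ['model' => $model, 'payload' => $this->toArray()];
        }
    }

    abstract public function toArray(): array;
}

final class Contact extends Model
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function toArray(): array
    {
        return ['name' => $this->name];
    }
}

// A model that is deliberately absent from the searchable list.
final class AuditNote extends Model
{
    private string $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function toArray(): array
    {
        return ['text' => $this->text];
    }
}

(new Contact('Ben'))->save();        // recorded: Contact is searchable
(new AuditNote('internal'))->save(); // ignored: not in the searchable list
```

Filtering at the application layer like this keeps non-searchable changes out of the downstream pipeline entirely, at the cost of having to keep the searchable list in sync with the models.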
Write to File, Ship to Kafka
Indexing to ES
PHP Consumers
?
Generating Documents
No Ordering Guarantees!
No Order, No Problem
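The slide text does not say how the missing ordering guarantee is handled. One common way to make ordering irrelevant, shown here purely as an assumption, is to treat each event as a "something changed" signal and rebuild the document from the current source-of-truth state. The `applyEvent` function, the `$database` array, and the event fields are hypothetical stand-ins.

```php
<?php
// Hypothetical sketch: events only say which record changed, never what it changed to.

/**
 * Rebuild the search document for one record from the current database state,
 * ignoring the operation named in the event itself.
 */
function applyEvent(array $event, array $database, array &$index): void
{
    $key = $event['model'] . ':' . $event['id'];
    if (isset($database[$key])) {
        $index[$key] = $database[$key]; // upsert the current state
    } else {
        unset($index[$key]);            // record no longer exists
    }
}

// Stand-in for the primary database after all writes have landed.
$database = ['Contact:1' => ['name' => 'Benjamin']];

$events = [
    ['model' => 'Contact', 'id' => 1, 'op' => 'create'],
    ['model' => 'Contact', 'id' => 1, 'op' => 'update'],
    ['model' => 'Contact', 'id' => 2, 'op' => 'delete'],
];

// Apply the events in their original order.
$inOrder = [];
foreach ($events as $event) {
    applyEvent($event, $database, $inOrder);
}

// Apply the same events in reverse order.
$reversed = [];
foreach (array_reverse($events) as $event) {
    applyEvent($event, $database, $reversed);
}
```

Because every event results in an idempotent upsert or delete of the latest state, both orderings in this sketch end with the same index contents.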
PHP Consumers
[Chart: number of queries taking > 1s, 9/26/17 to 10/10/17, with the release marked]
[Chart: number of queries taking > 2s, 9/26/17 to 10/10/17, with the release marked]
250ms
Median (p50) Query Response Time
400ms
p95 Query Response Time
19
Number of Elasticsearch clusters
373 billion
Total docs across all ES clusters
93,000
Total Changelog events per second (peak)
3.7 minutes
Average Time to Index
0
Support tickets post-launch
What Now?
- Explore other applications of this infrastructure
- Ongoing technical challenges
  - Data Drift, Consumer Lag
Thanks!
kevin.xu@mailchimp.com
