Mesos at OpenTable
Pablo Delgado
Senior Data Engineer
OpenTable
@pablete
MesosCon 2015, Seattle, WA
• Over 32,000 restaurants worldwide
• More than 760 million diners seated since 1998, representing more than $30 billion spent at partner restaurants
• Over 16 million diners seated every month
• OpenTable has seated over 190 million diners via a mobile device. Almost 50% of our reservations are made via a mobile device
• OpenTable currently has a presence in the US, Canada, Mexico, the UK, Germany, and Japan
• OpenTable has nearly 600 partners, including Facebook, Google, TripAdvisor, Urbanspoon, Yahoo, and Zagat
OpenTable
the world’s leading provider of online restaurant
reservations
At OpenTable we aim to power the best dining experiences!
Service Oriented Architecture
From monolith to microservices
• Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
PAPER: http://mesos.berkeley.edu/mesos_tech_report.pdf
• Omega: flexible, scalable schedulers for large compute clusters
PAPER: http://research.google.com/pubs/pub41684.html
Apache Mesos
Apache Mesos
• Mesos slaves connect to masters and advertise resources like CPU, disk, and memory.
• Masters package those resources into offers and send them to frameworks like Singularity, which make the scheduling decisions.
• Frameworks in turn choose which resource offers to use, and run tasks on slaves.
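The offer cycle described above can be sketched as a toy simulation. This is not the Mesos API: the `Offer` and `Task` classes, the first-fit policy, and all names and sizes are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources a slave advertises, as forwarded by the master (hypothetical shape)."""
    slave: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """A unit of work the framework wants to run (hypothetical shape)."""
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, pending):
    """Framework-side decision: place pending tasks into offers, first fit.

    Returns (launched, unplaced), where launched is a list of
    (task name, slave) pairs and unplaced is the list of tasks left over.
    """
    launched = []
    remaining = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        still_pending = []
        for task in remaining:
            if task.cpus <= cpus and task.mem_mb <= mem:
                launched.append((task.name, offer.slave))
                cpus -= task.cpus
                mem -= task.mem_mb
            else:
                still_pending.append(task)
        remaining = still_pending
    return launched, remaining


offers = [Offer("slave-1", 2.0, 4096), Offer("slave-2", 1.0, 1024)]
pending = [Task("web", 1.5, 2048), Task("worker", 1.0, 1024), Task("cron", 0.5, 512)]
launched, unplaced = schedule(offers, pending)
print(launched, unplaced)
```

The point of the sketch is the division of labor: slaves and masters only describe what is available, while the placement policy lives entirely in the framework.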
[Diagram: three availability zones (2a, 2b, 2c). Each zone runs Zookeeper, managed by Netflix's Exhibitor, alongside a Mesos master (one active master and two standby masters). Six Mesos slaves running Docker are spread across the zones.]
Apache Mesos
Hubspot’s Singularity
Scheduler
• Native Docker Support
• JSON REST API and Java Client
• Fully featured web application (replaces and improves Mesos Master UI)
• Deployments, automatic rollbacks, and healthchecks
• Configurable email alerts to service owners
Singularity Features
Hubspot’s Singularity
Process types:
• Web Services
• Workers
• Scheduled (CRON-type) Jobs
• On-Demand Processes
Slave placement:
• GREEDY
• SEPARATE_BY_DEPLOY
• SEPARATE_BY_REQUEST
• OPTIMISTIC
Executors:
• Mesos executor
• Singularity executor
• Docker executor
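As a rough sketch of how a process type and a placement strategy might be combined into a request body for Singularity's JSON REST API: the field names (`id`, `requestType`, `slavePlacement`, `instances`, `schedule`) and the type identifiers (`SERVICE`, `WORKER`, `SCHEDULED`, `ON_DEMAND`) are assumptions here and should be checked against the Singularity API reference for the version in use. The placement values are the ones listed above. No HTTP call is made.

```python
import json

# Assumed identifiers for the four process types listed above.
REQUEST_TYPES = {"SERVICE", "WORKER", "SCHEDULED", "ON_DEMAND"}
# Placement strategies as named on the slide.
SLAVE_PLACEMENTS = {"GREEDY", "SEPARATE_BY_DEPLOY", "SEPARATE_BY_REQUEST", "OPTIMISTIC"}


def build_request(request_id, request_type, slave_placement, instances=1, schedule=None):
    """Build a request body as a plain dict (field names are assumptions)."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {request_type}")
    if slave_placement not in SLAVE_PLACEMENTS:
        raise ValueError(f"unknown slave placement: {slave_placement}")
    # In this sketch a cron schedule is required for, and only for, SCHEDULED requests.
    if (request_type == "SCHEDULED") != (schedule is not None):
        raise ValueError("a schedule is required for, and only for, SCHEDULED requests")
    body = {
        "id": request_id,
        "requestType": request_type,
        "slavePlacement": slave_placement,
        "instances": instances,
    }
    if schedule is not None:
        body["schedule"] = schedule
    return body


# "timezone" is a hypothetical service name.
payload = build_request("timezone", "SERVICE", "SEPARATE_BY_REQUEST", instances=3)
print(json.dumps(payload, indent=2))
```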
Linux Containers
Docker
• Immutability
• Portability
• Isolation
Service Discovery
Services no longer live at a well-known address/port, so we needed a registry or other dynamic way to find them. It also had to be Mesos-agnostic.
• Services announce their presence to the Discovery Server
• Services subscribe to changes in their dependencies' announcements
• Services un-announce on termination, or time out after a crash
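The announce / subscribe / un-announce lifecycle above can be sketched with a small in-memory registry. This is not OpenTable's Discovery Server: the class, its method names, the TTL value, and the addresses are all hypothetical, and a real server would receive heartbeats over the network.

```python
import time


class DiscoveryRegistry:
    """Toy registry: services announce, subscribers hear about changes."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._announcements = {}  # (service, address) -> time of last announcement
        self._subscribers = {}    # service -> list of callbacks

    def subscribe(self, service, callback):
        """Call callback(addresses) whenever the set of instances changes."""
        self._subscribers.setdefault(service, []).append(callback)

    def announce(self, service, address):
        """Register an instance, or refresh its heartbeat if already known."""
        is_new = (service, address) not in self._announcements
        self._announcements[(service, address)] = self._clock()
        if is_new:
            self._notify(service)

    def unannounce(self, service, address):
        """Clean shutdown: remove the instance immediately."""
        if self._announcements.pop((service, address), None) is not None:
            self._notify(service)

    def expire(self):
        """Crash handling: drop instances whose last announcement is older than the TTL."""
        now = self._clock()
        stale = [key for key, seen in self._announcements.items() if now - seen > self._ttl]
        for key in stale:
            del self._announcements[key]
        for service in sorted({service for service, _ in stale}):
            self._notify(service)

    def lookup(self, service):
        return sorted(addr for svc, addr in self._announcements if svc == service)

    def _notify(self, service):
        addresses = self.lookup(service)
        for callback in self._subscribers.get(service, []):
            callback(addresses)


# A fake clock makes the timeout path deterministic.
now = [0.0]
registry = DiscoveryRegistry(ttl_seconds=10.0, clock=lambda: now[0])
seen = []
registry.subscribe("timezone", seen.append)
registry.announce("timezone", "10.0.0.1:31001")
registry.announce("timezone", "10.0.0.2:31002")
registry.unannounce("timezone", "10.0.0.1:31001")  # clean termination
now[0] = 11.0
registry.expire()                                   # the other instance "crashed"
print(seen)
```

Because the registry only stores service names and addresses, nothing in it depends on Mesos, which matches the requirement that discovery be Mesos-agnostic.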
Service Discovery
[Diagram: one Discovery Server and one Zookeeper node in each availability zone (2a, 2b, 2c); instances of services A and B announce, discover, and subscribe through the Discovery Servers.]
Service Discovery API
FrontDoor
FrontDoor
• Route external traffic to internal services
• Simple Discovery-aware proxy
• Dynamic configuration
• Developer-friendly configuration via Git repo
REQUEST_URI=/api/timezone* passthru timezone
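To illustrate how a rule like the one above could be evaluated, here is a minimal sketch. The grammar (`VARIABLE=glob action target`) is inferred from that single example, and the parser, the glob matching, and the first-match-wins policy are assumptions rather than FrontDoor's actual behavior.

```python
from fnmatch import fnmatchcase


def parse_rule(line):
    """Split 'REQUEST_URI=/api/timezone* passthru timezone' into its parts."""
    condition, action, target = line.split()
    variable, pattern = condition.split("=", 1)
    return {"variable": variable, "pattern": pattern, "action": action, "target": target}


def route(rules, request):
    """Return the target service of the first rule whose glob matches, else None."""
    for rule in rules:
        value = request.get(rule["variable"], "")
        if fnmatchcase(value, rule["pattern"]):
            return rule["target"]
    return None


rules = [parse_rule("REQUEST_URI=/api/timezone* passthru timezone")]
print(route(rules, {"REQUEST_URI": "/api/timezone/v1/lookup"}))
print(route(rules, {"REQUEST_URI": "/api/users"}))
```

In a Discovery-aware proxy, the returned service name would then be resolved to live instances through the discovery registry instead of a fixed host and port.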
Monitoring
Monitoring
https://github.com/opentable/mesos_stats
• Finds your service name by parsing the task names
• Includes a Grafana dashboard
• Runs inside Mesos
All together
[Diagram: Github, Continuous Integration, Docker Registry, Singularity, three Discovery servers, FrontDoor, three Mesos masters each paired with Zookeeper, and six Mesos slaves running Docker.]
Overview
[Diagram: Github, Continuous Integration, Singularity, Docker Registry.]
Developer’s Concerns
• Initialize projects with a Continuous Integration template
• Enable monitoring/logging of application-level errors
• Build the project as an immutable Docker image
• Deploy to Mesos through Singularity using a REST API
Operational Concerns
• Provide Mesos with resources
• Monitor and maintain external traffic routing
• Monitor and replace failing resources
Stateless Simplicity
Stateless Mesos Cluster
• Datastores: MySQL, PostgreSQL, MongoDB
• Caches: Redis, Memcached
• Other: Zookeeper, Amazon S3
[Diagram: five locations (US Data Center, EU Data Center, AWS us-west-2, AWS eu-west-1, and a second AWS us-west-2 group) with environment labels PROD (four times), QA, and DATA PROCESSING.]
[Diagram: five locations (US Data Center, EU Data Center, AWS us-west-2, AWS eu-west-1, and a second AWS us-west-2 group) with environment labels PROD (four times), QA, and DATA PROCESSING, plus five Kafka labels.]
Data Processing
Distributed Multitenant Data Processing
Spark’s Approach
• Generalize MapReduce to support new apps in the same engine
• General DAGs and data sharing
• Unification benefits the engine, which is more efficient, and users, for whom it is simpler
• Handles batch, interactive, and online processing
• APIs available for Java, Scala, Python, SQL, and R
Spark RDDs
Resilient Distributed Datasets (RDDs) are fault-tolerant distributed collections.
They exist in the form of:
• Parallelized collections
• External datasets: distributed datasets from any storage source supported by Hadoop, including your local file system, HDFS, Cassandra, HBase, Amazon S3, etc.
RDD Graph
Dataset-level view:
  file: HadoopRDD (path = hdfs://...)
  errors: FilteredRDD (func = _.contains(…), shouldCache = true)
Partition-level view:
  Task 1, Task 2, Task 3, ..., Task n (one task per partition)
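The two views of the RDD graph can be illustrated with a toy model in plain Python. This is not Spark's API: `MiniRDD` and its methods are hypothetical, and the log lines are made up. It shows a `file` dataset, a derived `errors` dataset that only records its parent and a function, and one task per partition that walks that lineage.

```python
class MiniRDD:
    """Toy dataset split into partitions, computed lazily from its lineage."""

    def __init__(self, partitions=None, parent=None, func=None, name="rdd"):
        self._partitions = partitions   # only set on source datasets
        self.parent = parent            # dataset-level lineage pointer
        self._func = func               # how to derive a partition from the parent's
        self.name = name
        self.should_cache = False
        self._cache = {}

    @property
    def num_partitions(self):
        if self.parent is None:
            return len(self._partitions)
        return self.parent.num_partitions

    def filter(self, predicate, name="filtered"):
        """Record a transformation; nothing is computed yet."""
        return MiniRDD(
            parent=self,
            func=lambda rows: [row for row in rows if predicate(row)],
            name=name,
        )

    def cache(self):
        self.should_cache = True
        return self

    def compute(self, index):
        """One task: compute a single partition by walking up the lineage."""
        if index in self._cache:
            return self._cache[index]
        if self.parent is None:
            rows = list(self._partitions[index])
        else:
            rows = self._func(self.parent.compute(index))
        if self.should_cache:
            self._cache[index] = rows
        return rows

    def collect(self):
        return [row for i in range(self.num_partitions) for row in self.compute(i)]

    def lineage(self):
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


file = MiniRDD(
    partitions=[["INFO ok", "ERROR disk"], ["ERROR net", "INFO done"]],
    name="file",
)
errors = file.filter(lambda line: "ERROR" in line, name="errors").cache()
print(errors.lineage())
print(errors.collect())
```

Because a lost partition can be rebuilt by re-running `compute` for that index alone, the lineage is what makes the collection fault-tolerant in this model.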
Scheduling Process: lifetime of a job
rdd1.join(rdd2)
.groupBy(…)
.filter(…)
• RDD Objects: build operator DAG
• DAGScheduler: split graph into stages of tasks; submit each stage as ready; agnostic to operators
• TaskScheduler: launch tasks via cluster manager; retry failed or straggling tasks; doesn't know about stages
• Worker: execute tasks; store and serve blocks (threads, block manager)
Handoffs: RDD Objects pass a DAG to the DAGScheduler, which passes TaskSets to the TaskScheduler, which launches Tasks on Workers through the cluster manager. A failed stage is reported back to the DAGScheduler.
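The "split graph into stages of tasks" step can be sketched on a simplified, linear chain of operators. This is a toy, not Spark's DAGScheduler: it ignores that `join` has two parents, and the `needs_shuffle` flags are assumptions for the example. A new stage starts at every operator that needs a shuffle; the operators in between are pipelined into the same stage.

```python
def split_into_stages(operators):
    """Group a linear operator chain into stages, cutting at shuffle boundaries.

    operators: list of (name, needs_shuffle) pairs in execution order.
    Returns a list of stages, each a list of operator names.
    """
    stages = []
    current = []
    for name, needs_shuffle in operators:
        # An operator that needs shuffled input cannot be pipelined with
        # what came before it, so it opens a new stage.
        if needs_shuffle and current:
            stages.append(current)
            current = []
        current.append(name)
    if current:
        stages.append(current)
    return stages


# Roughly rdd1.join(rdd2).groupBy(...).filter(...), flattened to one chain.
pipeline = [("read", False), ("join", True), ("groupBy", True), ("filter", False)]
stages = split_into_stages(pipeline)
print(stages)
```

Each resulting stage would then be handed to the task scheduler as a set of tasks, one per partition.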
Alternating Least Squares (ALS) in MLlib
Running Spark
[Diagram: the driver program (SparkContext) connects to a cluster manager; each of two worker nodes runs an executor with tasks and a cache.]
Mesos Coarse Grained
[Diagram: the driver program (SparkContext) is the Mesos framework and the Mesos master is the cluster manager; each worker node runs a Mesos executor wrapping one Spark executor with its tasks and cache.]
Mesos Fine Grained
[Diagram: the driver program (SparkContext) is the Mesos framework and the Mesos master is the cluster manager; each worker node runs a Mesos executor, with four executor/task pairs shown across the two nodes.]
Pull Requests (maybe merged)
[SPARK-7373] Add Docker support for launching drivers in Mesos cluster mode
[SPARK-5338] Add cluster mode support for Mesos
[SPARK-5095] Support capping cores and launch multiple executors in coarse mode
[SPARK-6707] Mesos Scheduler should allow the user to specify constraints based on slave attributes
[SPARK-6287] Add dynamic allocation to the coarse-grained Mesos scheduler
Ideal data processing stack
Memory-centric distributed storage system (cache)
Distributed file system
General engine for large-scale data processing
Kernel for the datacenter
Other frameworks
• KAFKA on mesos https://github.com/mesos/kafka
• SAMZA on mesos https://github.com/banno/samza-mesos
• PHOENIX (secor on mesos) https://github.com/stealthly/phoenix
• CASSANDRA on mesos https://github.com/mesosphere/cassandra-mesos
We are also using:
We are considering:
• CHRONOS https://github.com/mesos/chronos
• MARATHON https://github.com/mesosphere/marathon
[Diagram labels: User Activity, Kafka, backups, JSON, Query/Processing Layer (Spark SQL, Spark MLlib, Spark Streaming), ETL, Data Products.]
{"userId":"xxxxxxxx","event":"personalizer_search","query_longitude":-77.16816,"latitude":38.918159,
"req_attribute_tag_ids":["pizza"],"req_geo_query":"Current Location","sort_by":"best","longitude":-77.168156,
"query_latitude":38.91816,"req_forward_minutes":30,"req_party_size":2,"req_backward_minutes":30,
"req_datetime":"2015-06-02T12:00","req_time":"12:00","res_num_results":784,
"calculated_radius":5.466253405962307,"req_date":"2015-06-02"},"type":"track",
"messageId":"b4f2fafc-dd4a-45e3-99ed-4b83d1e42dcd","timestamp":"2015-06-02T10:02:34.323Z"}
ETL with Spark/ SparkSQL
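As a rough illustration of the ETL step on events like the one above, here is a plain-Python sketch using only the standard library rather than Spark. The sample events are hypothetical: they reuse a few field names from the slide but assume a flat structure, since the nesting of the original event is not recoverable from the slide text.

```python
import json
from collections import Counter

# Hypothetical raw lines as they might arrive from a log or queue.
raw_events = [
    '{"userId": "u1", "event": "personalizer_search", "req_attribute_tag_ids": ["pizza"],'
    ' "req_party_size": 2, "timestamp": "2015-06-02T10:02:34.323Z"}',
    '{"userId": "u2", "event": "personalizer_search", "req_attribute_tag_ids": ["sushi", "pizza"],'
    ' "req_party_size": 4, "timestamp": "2015-06-02T11:15:00.000Z"}',
    "not json",
]


def parse(line):
    """Extract: decode one line, returning None for malformed input."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None


def transform(event):
    """Transform: keep only the columns the downstream table needs."""
    return {
        "user": event["userId"],
        "day": event["timestamp"][:10],
        "party_size": event["req_party_size"],
        "tags": event.get("req_attribute_tag_ids", []),
    }


rows = [transform(event) for event in map(parse, raw_events) if event is not None]
tag_counts = Counter(tag for row in rows for tag in row["tags"])
print(rows)
print(tag_counts)
```

The same parse, project, and aggregate shape carries over to a distributed engine, where each step runs per partition instead of over a single list.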
Matrix Factorization. Spark MLlib
• Collaborative Filtering
• Topic Modeling
• Restaurant Demand Analysis
Early explorations with Word2vec: find synonyms for "Sushi"
We use Apache Spark's implementation of Word2Vec (skip-gram model)
nigiri, sashimi, gari, maki, roku, rolls, roll, godzilla, chirashi, robata, zushi, omakase, yellowtail, unagi, samba, toro, gyoza, aburi, spider, starburst, nakazawa, shabu, sasa, katana, sake, hapa, maguro, tsunami, raku, kappo, yasuda, otoro, seki, tamari, ra, teppanyaki, caterpillar, japan, shashimi, hamasaku
A restaurant like your favorite one but in a different city.
Find the "synonyms" of the restaurant in question, then filter by location!
• San Francisco: Akiko's, SF
• Maui: Sansei Seafood Restaurant & Sushi Bar, Maui
• Chicago: Masaki Sushi, Chicago
• New York: Sushi of Gari, Gari Columbus, NYC
Downtown upscale sushi experience with sushi bar
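The "find synonyms, then filter by location" idea can be sketched with cosine similarity over embedding vectors. The three-dimensional vectors below are made up for illustration (real Word2Vec vectors have many more dimensions and come from a trained model), and "Hypothetical Steakhouse" is an invented entry added so the ranking has something to reject.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# name -> (city, embedding); the vectors are toy values, not model output.
restaurants = {
    "Akiko's": ("San Francisco", [0.9, 0.8, 0.1]),
    "Masaki Sushi": ("Chicago", [0.85, 0.75, 0.2]),
    "Hypothetical Steakhouse": ("Chicago", [0.1, 0.3, 0.9]),
    "Sushi of Gari": ("New York", [0.8, 0.9, 0.1]),
}


def similar_in_city(name, city, top_n=1):
    """Rank restaurants in `city` by similarity to `name`, most similar first."""
    _, query = restaurants[name]
    scored = [
        (cosine(query, vector), other)
        for other, (other_city, vector) in restaurants.items()
        if other != name and other_city == city
    ]
    return [other for _, other in sorted(scored, reverse=True)[:top_n]]


print(similar_in_city("Akiko's", "Chicago"))
print(similar_in_city("Akiko's", "New York"))
```

Filtering by city before ranking keeps the candidate set small; ranking first and filtering afterwards would give the same answer at higher cost.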
keep in touch
@pablete
