Don't Break the Camel's Back: Running MongoDB as Hard as Possible
MongoDB World, June 2019
Presenting Today
Jon Hyman

CTO and Co-Founder, Braze
@jon_hyman
Time is money.
The value of data to your business starts
deteriorating as soon as it's generated.
The new digital economy is on-demand and the connected consumer is always-on.
[Timeline chart (source: "JavaScript: The Machine Language of the Ambient Computing Era," Allen Wirfs-Brock): societal impact of computing, 1980–2030, across three eras. Corporate Computing: computers empower and enhance enterprise tasks. Personal Computing: computers empower and enhance individuals' tasks (e.g., Google Analytics, Silverpop, Eloqua, ExactTarget, Neolane, Responsys, Omniture). Ambient Computing: computers empower and enhance our environment (e.g., iOS, App Store, Chromecast, Alexa, TensorFlow, tvOS, FB Messenger).]

WALT MOSSBERG, AMERICAN JOURNALIST & FORMER RECODE EDITOR AT LARGE:
"I expect that one end result of all this work will be that the technology, the computer inside all these things, will fade into the background. In some cases, it may entirely disappear... This is ambient computing, the transformation of the environment all around us with intelligence and capabilities that don't seem to be there at all."
Braze empowers you to humanize your
brand-customer relationships at scale.
• Tens of billions of messages sent monthly
• Global customer presence on six continents
• More than 1 billion MAU
Today (TOC)
• MongoDB at Braze
• Message Sending Pipeline
• Monitoring Application Impact
• Summary and Q&A

MongoDB at Braze
MongoDB at Braze
•Main database at Braze, used for most
application models
•Most documents at Braze are user profiles
• End users of mobile apps, websites, or mailing lists
• Nearly 11 billion user profiles
•Over 1,200 shards across more than 65
different clusters
• Scaling is entirely for read and write throughput,
storage size is only tens of terabytes
•Across clusters, performing over 350,000
MongoDB operations per second
Challenges at Braze
[Diagram: load from the Braze APIs and message sending]
Users Collection Example

{
  _id: 123,
  first_name: "Jon",                                    // demographic data
  email: "jon@braze.com",
  custom: {                                             // custom data
    favorite_color: "blue",
    shoe_size: 11,
    linked_credit_card: true
  },
  campaign_summaries: {                                 // aggregated interaction data
    CampaignA: {
      last_received: Date('2019-06-01T12:00:03Z'),
      last_opened_email: Date('2019-06-01T12:03:19Z')
    }
  }
}
UserCampaignInteractionData Collection Example
{                                                       // longer history of interaction data
  _id: 123,
  emails_received: [                                    // message receipt data
    {
      date: Date('2019-06-01T12:00:03Z'),
      campaign: "CampaignA",
      dispatch_id: "identifier-for-send"
    },
    …
  ],
  android_push_received: [ … ],
  emails_opened: [
    {
      date: Date('2019-06-01T12:03:19Z'),
      campaign: "CampaignA",
      dispatch_id: "identifier-for-send"
    },
    …
  ]
}
Event processing and message sending
Analytics
•Braze provides real-time analytics on
interactions from campaigns
•Conversion processing to attribute
revenue and actions to campaign receipt
•Determining influenced interaction rates
Message Sending
•Frequency capping: Limit customers to
only receive certain types of campaigns
a fixed number of times
•Intelligent send-time optimization: feed interaction data into a model that picks
the best time of day to send to someone
This is possible with accurate summarized event documents per user (a sketch of such a check follows below).
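To make the frequency-capping check concrete, here is a minimal sketch against the summarized interaction document shown above. It is illustrative only: the local connection string, the user_campaign_interaction_data collection name, and the cap of 3 emails per 7 days are assumptions, not Braze's implementation.

require 'mongo'

client = Mongo::Client.new('mongodb://localhost:27017/braze_example')

# Returns true if the user may still receive an email under a hypothetical
# "at most `cap` emails per `window_days` days" rule, using the summarized
# emails_received array from the interaction document.
def under_email_frequency_cap?(client, user_id, cap: 3, window_days: 7)
  since = Time.now.utc - window_days * 24 * 60 * 60
  doc = client[:user_campaign_interaction_data]
          .find(_id: user_id)
          .projection(emails_received: 1)
          .first
  return true if doc.nil?

  recent = (doc['emails_received'] || []).count { |e| e['date'] >= since }
  recent < cap
end

puts under_email_frequency_cap?(client, 123) ? 'OK to send' : 'Cap reached, skip'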
We summarize event data in other collections as well
{
  _id: 123,
  sessions: [                                           // recent history of usage data
    Date('2019-06-01T11:53:03Z'),
    Date('2019-05-31T08:14:44Z'),
    Date('2019-05-31T08:02:11Z')
  ],
  custom_events: {                                      // recent history of behavioral data
    watched_video: [
      Date('2019-06-01T11:53:03Z'),
      Date('2019-05-31T08:14:44Z'),
      Date('2019-05-31T08:02:11Z')
    ]
  },
  purchases: [ … ]
}
Message Sending Pipeline
Message Sending Pipeline at Braze
Audience Segmentation → Business Logic Application → Integrity Check & Frequency Cap → Render Payloads → MongoDB Write → Deliver Messages
How can Braze send messages as fast
as possible without breaking the bank?
Money is money.
MongoDB Deployment Model
Three tiers of databases:
• Small clients: shared databases
• Medium clients: dedicated databases, shared cluster
• Large clients: dedicated databases, dedicated cluster
MongoDB Deployment Model
Benefits:
• Isolation for security and compliance
• Scalability, read and write throughput
• Maintenance improvements: issues or maintenance affecting one database may not affect other customers
Worker servers are associated with specific database(s), which lets us take down individual databases and pause processing (see the routing sketch below).
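A minimal sketch of the routing idea behind this deployment model: map each customer to the cluster and database that hosts it, and memoize one client per customer so workers only touch their assigned clusters. The URI map, hostnames, and helper name are illustrative assumptions, not Braze's code.

require 'mongo'

# Hypothetical mapping from customer to the cluster/database hosting it:
# small customers share a cluster, large customers get a dedicated one.
CLUSTER_URI_FOR_COMPANY = {
  'company_a' => 'mongodb://shared-cluster-1.example.com/company_a',
  'company_b' => 'mongodb://shared-cluster-1.example.com/company_b',
  'company_c' => 'mongodb://dedicated-cluster-7.example.com/company_c'
}.freeze

# Memoize one client per company so workers reuse connections.
MONGO_CLIENTS = Hash.new do |cache, company_id|
  cache[company_id] = Mongo::Client.new(CLUSTER_URI_FOR_COMPANY.fetch(company_id))
end

def users_collection(company_id)
  MONGO_CLIENTS[company_id][:users]
end

# A worker processing company_c only ever touches company_c's cluster, so pausing
# or maintaining that cluster does not affect other customers.
users_collection('company_c').find(email: 'jon@braze.com').first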
Improving Read Throughput
•Limit concurrency of unindexed queries with large
expected totalDocsExamined
•Take advantage of multi-cluster deployment model
•Statistical analysis on customer campaigns to create
new indexes with partialFilterExpression
•Find slow queries in system.profile focused around
campaign sending
•Do frequency analysis of those queries
•Use aggregations to determine the selectivity of those fields (see the index and selectivity sketch below)
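As a rough illustration of the last two points, the sketch below creates a hypothetical partialFilterExpression index and estimates its selectivity with a simple count. The field names follow the earlier users document; the connection string and campaign name are assumptions, not Braze's actual tooling.

require 'mongo'

client = Mongo::Client.new('mongodb://localhost:27017/braze_example')
users  = client[:users]

# Hypothetical partial index: only index users who have ever received CampaignA,
# so segmentation queries filtering on that campaign scan a much smaller index.
users.indexes.create_one(
  { 'campaign_summaries.CampaignA.last_received' => 1 },
  partial_filter_expression: {
    'campaign_summaries.CampaignA' => { '$exists' => true }
  }
)

# Estimate selectivity: what fraction of user documents the partial index covers.
matching = users.count_documents(
  'campaign_summaries.CampaignA' => { '$exists' => true }
)
total = users.estimated_document_count
puts format('CampaignA index covers %.2f%% of users', 100.0 * matching / total)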
Improving Write Throughput
•Add a lot of small shards to increase write scopes
• Many instances are 30-50 shards with small
amounts of disk
•Spread writes out to multiple collections with summary documents, and prune old data from existing documents (see the pruning sketch below)
•Use the cluster distribution model: if a customer’s write throughput is affecting other clients, move them to a new cluster
•Limit concurrency as necessary
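The pruning idea above can be expressed as a single update. Here is a minimal sketch with assumed collection and field names and an invented 500-entry cap, appending a new receipt event while trimming the array in the same write.

require 'mongo'

client = Mongo::Client.new('mongodb://localhost:27017/braze_example')
interactions = client[:user_campaign_interaction_data]

new_event = {
  date: Time.now.utc,
  campaign: 'CampaignA',
  dispatch_id: 'identifier-for-send'
}

# $push with $each + $slice appends the event and keeps only the newest 500
# entries, pruning old data as part of the same write.
interactions.update_one(
  { _id: 123 },
  { '$push' => {
      'emails_received' => { '$each' => [new_event], '$slice' => -500 }
  } },
  upsert: true
)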
Monitoring Application Impact
Monitoring Application Throughput
•We care about the application impact of load on MongoDB: some queueing may be okay, depending on whether the application is affected
•We want to prevent the application from being blocked indefinitely by queued operations
•We want to understand if certain MongoDB
clusters or databases are causing issues
•We patched the MongoDB driver to do both
these things
Driver Patches
Useful for returning control to the application
Patch Mongo::Socket#read_from_socket to allow custom, per-operation timeouts
Unedited Ruby Driver Code
deadline = (Time.now + timeout) if timeout
begin
  while retrieved < length
    retrieve = length - retrieved
    if retrieve > buf_size
      retrieve = buf_size
    end
    chunk = @socket.read_nonblock(retrieve, buf)
    # If we read the entire wanted length in one operation,
    # return the data as is which saves one memory allocation and
    # one copy per read
    if retrieved == 0 && chunk.length == length
      return chunk
    end
    # If we are here, we are reading the wanted length in
    # multiple operations. Allocate the total buffer here rather
    # than up front so that the special case above won't be
    # allocating twice
    if data.nil?
      data = allocate_string(length)
    end
    # ... and we need to copy the chunks at this point
    data[retrieved, chunk.length] = chunk
    retrieved += chunk.length
  end
rescue IO::WaitReadable
  select_timeout = (deadline - Time.now) if deadline
  if (select_timeout && select_timeout <= 0) || !Kernel::select([@socket], nil, [@socket], select_timeout)
    raise Timeout::Error.new("Took more than #{timeout} seconds to receive data.")
  end
  retry
end
Per-server deadline
Selects on this deadline
First, support a custom timeout
deadline = (Time.now + timeout) if timeout
begin
  while retrieved < length
    retrieve = length - retrieved
    if retrieve > buf_size
      retrieve = buf_size
    end
    chunk = @socket.read_nonblock(retrieve, buf)
    # If we read the entire wanted length in one operation,
    # return the data as is which saves one memory allocation and
    # one copy per read
    if retrieved == 0 && chunk.length == length
      return chunk
    end
    # If we are here, we are reading the wanted length in
    # multiple operations. Allocate the total buffer here rather
    # than up front so that the special case above won't be
    # allocating twice
    if data.nil?
      data = allocate_string(length)
    end
    # ... and we need to copy the chunks at this point
    data[retrieved, chunk.length] = chunk
    retrieved += chunk.length
  end
rescue IO::WaitReadable
  select_timeout = (deadline - Time.now) if deadline
  if (select_timeout && select_timeout <= 0) || !Kernel::select([@socket], nil, [@socket], select_timeout)
    raise Timeout::Error.new("Took more than #{timeout} seconds to receive data.")
  end
  retry
end
Set deadline per operation
First, support a custom timeout
Mongo::Socket.with_read_timeout(5) do
  # any read or write in here can only take 5 seconds
end
class ReadMaxTimeoutError < ::Timeout::Error
end

# Allows you to create a block in which you can set a maximum time for a
# query to wait for data on the socket
# @param timeout Integer timeout for the socket read
# @param exception_class_to_raise The exception class that will be raised if the
#   timeout is hit; the default will be Mongo::Socket::ReadMaxTimeoutError
def self.with_read_timeout(timeout, exception_class_to_raise = nil)
  existing_value = Thread.current[:mongo_query_read_timeout]
  existing_exception_value = Thread.current[:mongo_query_read_timeout_exception]
  begin
    Thread.current[:mongo_query_read_timeout] = timeout
    Thread.current[:mongo_query_read_timeout_exception] = exception_class_to_raise
    yield
  ensure
    Thread.current[:mongo_query_read_timeout] = existing_value
    Thread.current[:mongo_query_read_timeout_exception] = existing_exception_value
  end
end
modified_timeout = Thread.current[:mongo_query_read_timeout] || timeout
deadline = (Time.now + modified_timeout) if modified_timeout
begin
  while retrieved < length
    retrieve = length - retrieved
    if retrieve > buf_size
      retrieve = buf_size
    end
    chunk = @socket.read_nonblock(retrieve, buf)
    # If we read the entire wanted length in one operation,
    # return the data as is which saves one memory allocation and
    # one copy per read
    if retrieved == 0 && chunk.length == length
      return chunk
    end
    ...
Driver Patches
Get metrics on application issues
Patch Mongo::Socket#read_from_socket to log to StatsD as it is waiting
Modify IO::WaitReadable
LOG_TO_STATS_D_AFTER_EACH_OF_THESE_SECONDS = [
  2, 5, 10, 30, 60, 120, 180, 240, 300, 360, 420, 480
].freeze

rescue IO::WaitReadable
  ...
  # Log duration to StatsD here
  seconds_since_start_time = Time.now - start_time
  LOG_TO_STATS_D_AFTER_EACH_OF_THESE_SECONDS.each do |seconds|
    begin
      # We want to log once after 5 seconds, then after 10 seconds, then after 30 seconds, etc.
      # so keep track of if we have already logged for a given value
      if !already_logged_to_stats_d_at_seconds[seconds] && seconds_since_start_time > seconds
        already_logged_to_stats_d_at_seconds[seconds] = true
        mongo_port = Thread.current[:current_mongo_port]
        company_id = Thread.current[:current_company_id]
        StatsDAdapter.increment(
          "platform.all.mongo_slow_query".freeze,
          1,
          { :port => mongo_port, :company_id => company_id, :seconds => seconds }
        )
      end
    rescue => e
      Mongo::Logger.logger.info { "Caught error logging to StatsD: #{e.inspect}" }
    end
  end
  ...
end
Long-running queries by MongoDB cluster & company
[Graph: during a hardware incident that affected a handful of MongoDB clusters]
[Graph: a regular day-to-day example]
How does Braze use this information?
Braze response to long-running operations
•Graphs can inform whether or not to add more shards to a cluster or move a customer off
•Graphs can point toward certain campaigns, at certain times of day, to evaluate for new partialFilterExpression indexes
•Used as input to alerts and incident handling tools
Braze incident handling tools
• Built throttling system for campaign sending (i.e., database writes)
• Can control rate at which messages are picked up to be sent
• Alerts set up for high load on MongoDB, slow application
queries
• Ideal long-term solution is to build a Governor
• Use machine learning to throttle database writes based on application response time (a toy sketch follows below)
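As a loose illustration of the throttling idea (and far simpler than the proposed Governor), the sketch below maps an observed read-latency signal, such as the StatsD metrics from the driver patch, to a message pickup rate. The class, thresholds, and rates are invented for this example.

# Toy throttle: slow down how fast campaign send jobs are picked up when
# MongoDB reads (as reported by the patched driver's metrics) get slow.
class SendThrottle
  def initialize(base_rate_per_sec: 1_000)
    @base_rate_per_sec = base_rate_per_sec
  end

  # p95_read_ms: recent 95th-percentile socket read time, in milliseconds.
  def allowed_rate(p95_read_ms)
    case p95_read_ms
    when 0...50     then @base_rate_per_sec        # healthy: full speed
    when 50...250   then @base_rate_per_sec / 2    # queueing: back off
    when 250...1000 then @base_rate_per_sec / 10   # struggling: trickle
    else 0                                         # incident: pause pickup
    end
  end
end

throttle = SendThrottle.new
puts throttle.allowed_rate(180) # => 500 messages per second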
Summary
• MongoDB’s flexible schemas with non-scalar types enable Braze to quickly fetch information about end users
• Braze’s MongoDB deployment model allows Braze to match
customer resources with the appropriate amount of hardware
and shards
• Metrics from the driver for how long reads and writes are taking give Braze visibility into how its application is performing
Q&A
Thank you! We are hiring!
braze.com/careers