How Capital Markets Firms Use
 MongoDB as a Tick Database

    Matt Kalan, Sr. Solution Architect
         Email: Matt.kalan@10gen.com
           Twitter: @matthewkalan
Agenda

• MongoDB Introduction
• FS Use Cases
• Writing/Capturing Market Data
• Reading/Analyzing Market Data
• Performance, Scalability, & High Availability
• Q&A




                          2
Introduction

10gen is the company behind MongoDB –
the leading next generation database




• Document-Oriented
• General Purpose
• Open-Source



                      3
10gen Overview




• 200+ employees
• 500+ customers
• Over $81 million in funding
• Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney



                                4
Database Landscape
                     • No Automatic Joins
                     • Document Transactions
                     • Fast, Scalable Read/Writes




                5
MongoDB Business Benefits




• Increased Developer Productivity
• Better Customer Experience
• Faster Time to Market
• Lower TCO


                                   6
MongoDB Technical Benefits

[Diagram: the application surrounded by four technical benefits]

• Agile & Flexible
    { author: "roger",
      date: new Date(),
      text: "Spirited Away",
      tags: ["Tezuka", "Manga"] }
• High Performance
  – Indexes
  – RAM
• Highly Available
  – Replica Sets
• Horizontally Scalable
  – Sharding


                                7
Most Common FS Use Cases

1. Tick Data Capture & Analysis
2. Reference Data Management
3. Risk Analysis & Reporting
4. Trade Repository
5. Portfolio Reporting




                         8
Tick Data Capture & Analysis -
Requirements
• Capture real-time market data (multi-asset, top of
  book, depth of book, even news)
• Load historical data
• Aggregate data into bars, daily, monthly intervals
• Enable queries & analysis on raw ticks or
  aggregates
• Drive backtesting or automated signals



                          9
Tick Data Capture & Analysis –
Why MongoDB?
• High throughput => can capture real-time feeds for all
  products/asset classes needed
• High scalability => all data and depth for all historical time periods
  can be captured
• Flexible & Range-based indexing => fast querying on time ranges
  and any fields
• Aggregation Framework => can shape raw data into aggregates
  (e.g. ticks to bars)
• Map-reduce capability (Native MR or Hadoop Connector) => batch
  analysis looking for patterns and opportunities
• Easy to use => native language drivers and JSON expressions that
  you can apply for most operational database needs as well
• Low TCO => Low software license cost and commodity hardware
                                    10
Writing/Capturing Tick Data
High Level Trading Architecture

[Architecture diagram. Labeled components:]
• Exchanges/Markets/Brokers
• Feed Handler
• Capturing Application
• News & social networking sources
• Cached Static & Aggregated Data
• Low Latency Applications
• Higher Latency Trading Applications
• Backtesting and Analysis Applications

[Labeled flows: Market Data, Orders, Trades/metrics]




                                              12
High Level Trading Architecture

[Same architecture diagram as the previous slide, with the captured data types called out:]

Data Types
• Top of book
• Depth of book
• Multi-asset
• Derivatives (e.g. strips)
• News (text, video)
• Social Networking




                                            13
Top of book [e.g. equities]
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    timestamp: ISODate("2013-02-15 10:00"),
    bidPrice: 55.37,
    offerPrice: 55.58,
    bidQuantity: 500,
    offerQuantity: 700
}


> db.ticks.find( {symbol: "DIS",
                   bidPrice: {$gt: 55.36} } )
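
A capturing application would typically reshape each feed update into this document form before writing it. The sketch below is only an illustration in plain JavaScript. The `toTopOfBookTick` helper, the `TickBatcher` class, and the shape of the incoming `update` object are assumptions, and the `flush` callback stands in for whatever driver write call the application actually uses.

```javascript
// Hypothetical feed update -> top-of-book tick document.
// Output field names follow the slide; input field names are assumed.
function toTopOfBookTick(update) {
  return {
    symbol: update.symbol,
    timestamp: new Date(update.epochMillis), // assumed: feed supplies epoch milliseconds
    bidPrice: update.bid,
    offerPrice: update.ask,
    bidQuantity: update.bidSize,
    offerQuantity: update.askSize,
  };
}

// Buffers ticks and hands them to `flush` in batches.
// `flush` is a stand-in for the real write, not a MongoDB API.
class TickBatcher {
  constructor(batchSize, flush) {
    this.batchSize = batchSize;
    this.flush = flush;
    this.buffer = [];
  }
  add(tick) {
    this.buffer.push(tick);
    if (this.buffer.length >= this.batchSize) this.drain();
  }
  drain() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flush(batch);
  }
}

const written = [];
const batcher = new TickBatcher(2, (batch) => written.push(batch));
const t0 = Date.UTC(2013, 1, 15, 10, 0);
batcher.add(toTopOfBookTick({ symbol: "DIS", epochMillis: t0, bid: 55.37, ask: 55.58, bidSize: 500, askSize: 700 }));
batcher.add(toTopOfBookTick({ symbol: "DIS", epochMillis: t0 + 1000, bid: 55.38, ask: 55.58, bidSize: 400, askSize: 700 }));
batcher.add(toTopOfBookTick({ symbol: "DIS", epochMillis: t0 + 2000, bid: 55.38, ask: 55.59, bidSize: 400, askSize: 900 }));
batcher.drain(); // write out the partial final batch
```

Batching trades a little latency for fewer round trips. The right batch size depends on the feed rate and is not something the slides specify.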




                            14
Depth of book
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    timestamp: ISODate("2013-02-15 10:00"),
    bidPrices: [55.37, 55.36, 55.35],
    offerPrices: [55.58, 55.59, 55.60],
    bidQuantities: [500, 1000, 2000],
    offerQuantities: [1000, 2000, 3000]
}


> db.ticks.find( {bidPrices: {$gt: 55.36} } )



                           15
or any way your app uses it
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    timestamp: ISODate("2013-02-15 10:00"),
    bids: [
          {price: 55.37, amount: 500},
          {price: 55.36, amount: 1000},
          {price: 55.35, amount: 2000} ],
    offers: [
          {price: 55.58, amount: 1000},
          {price: 55.59, amount: 2000},
          {price: 55.60, amount: 3000} ]
}
> db.ticks.find( {"bids.price": {$gt: 55.36} } )
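
Both array queries above rely on the same rule: a condition on an array field matches the document when at least one element satisfies it. The plain-JavaScript predicates below only mimic that rule for the two document shapes shown. `matchesBidPrices` and `matchesBidsPrice` are hypothetical names, not part of MongoDB, and the sample documents are made up for the illustration.

```javascript
// Sample documents in the two depth-of-book shapes (illustrative values).
const depthDoc = { symbol: "DIS", bidPrices: [55.37, 55.36, 55.35] };
const nestedDoc = {
  symbol: "DIS",
  bids: [
    { price: 55.37, amount: 500 },
    { price: 55.36, amount: 1000 },
    { price: 55.35, amount: 2000 },
  ],
};

// Mimics {bidPrices: {$gt: threshold}}: true if ANY array element qualifies.
function matchesBidPrices(doc, threshold) {
  return Array.isArray(doc.bidPrices) && doc.bidPrices.some((p) => p > threshold);
}

// Mimics {"bids.price": {$gt: threshold}}: true if ANY embedded document qualifies.
function matchesBidsPrice(doc, threshold) {
  return Array.isArray(doc.bids) && doc.bids.some((b) => b.price > threshold);
}
```

Because one qualifying level is enough, these queries answer "does any level exceed the threshold", not "does the best bid exceed it".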

                            16
Synthetic spreads
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    timestamp: ISODate("2013-02-15 10:00"),
    spreadPrice: 0.58,
    leg1: {symbol: "CLM13", price: 97.34},
    leg2: {symbol: "CLK13", price: 96.92}
}
> db.ticks.find( { "leg1.symbol" : "CLM13",
                   "leg2.symbol" : "CLK13",
                   spreadPrice : {$gt: 0.50} } )



                           17
News
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    timestamp: ISODate("2013-02-15 10:00"),
    title: "Disney Earnings…",
    body: "Walt Disney Company reported…",
    tags: ["earnings", "media", "walt disney"]
}




                           18
Social networking
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    timestamp: ISODate("2013-02-15 10:00"),
    twitterHandle: "jdoe",
    tweet: "Heard @DisneyPictures is releasing…",
    usernamesIncluded: ["DisneyPictures"],
    hashTags: ["movierumors", "disney"]
}
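
Fields such as `usernamesIncluded` and `hashTags` would normally be derived from the tweet text at capture time so they can be indexed and queried directly. The following is a minimal sketch of that step, assuming simple `@name` and `#tag` patterns. The `toSocialDocument` helper and the sample tweet text are hypothetical and are not taken from the slide.

```javascript
// Builds a social-networking document from a raw tweet.
// The regular expressions are a simplification of real mention/hashtag rules.
function toSocialDocument(handle, text, timestamp) {
  return {
    timestamp,
    twitterHandle: handle,
    tweet: text,
    usernamesIncluded: Array.from(text.matchAll(/@(\w+)/g), (m) => m[1]),
    hashTags: Array.from(text.matchAll(/#(\w+)/g), (m) => m[1]),
  };
}

// Made-up sample text for illustration only.
const socialDoc = toSocialDocument(
  "jdoe",
  "Heard @DisneyPictures has news coming #movierumors #disney",
  new Date("2013-02-15T10:00:00Z")
);
```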




                           19
Aggregates (bars, daily, etc.)
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    symbol : "DIS",
    openTS: ISODate("2013-02-15 10:00"),
    closeTS: ISODate("2013-02-15 10:05"),
    open: 55.36,
    high: 55.80,
    low: 55.20,
    close: 55.70
}




                           20
Querying/Analyzing Tick Data
Architecture for Querying Data

[Diagram: applications that query the stored data]
• Data queried: Ticks, Bars, Other analysis
• Research & Analysis Applications
• Backtesting Applications
• Higher Latency Trading Applications




                       22
Index any fields: arrays, nested, etc

 // Compound indexes
 > db.ticks.ensureIndex( {symbol: 1, timestamp: 1} )

 // Index on arrays
 > db.ticks.ensureIndex( {bidPrices: -1} )

 // Index on any depth
 > db.ticks.ensureIndex( {"bids.price": 1} )

 // Full text search
 > db.ticks.ensureIndex( {tweet: "text"} )
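
A compound index on symbol and then timestamp keeps its entries ordered by symbol first and by timestamp within each symbol, so all ticks for one symbol over a time range sit next to each other. The sketch below imitates that ordering with a sorted array. It is a simplified model with made-up entries, not how MongoDB stores an index internally.

```javascript
// Made-up index keys, inserted in arbitrary order.
const entries = [
  { symbol: "DIS", timestamp: new Date("2013-02-15T10:01:00Z") },
  { symbol: "CBS", timestamp: new Date("2013-02-15T10:00:00Z") },
  { symbol: "DIS", timestamp: new Date("2013-02-15T10:00:00Z") },
  { symbol: "VIA", timestamp: new Date("2013-02-15T10:00:00Z") },
  { symbol: "DIS", timestamp: new Date("2013-02-15T10:02:00Z") },
];

// Orders keys the way {symbol: 1, timestamp: 1} would: symbol, then time.
function compareKeys(a, b) {
  if (a.symbol !== b.symbol) return a.symbol < b.symbol ? -1 : 1;
  return a.timestamp - b.timestamp;
}

const ordered = [...entries].sort(compareKeys);

// Positions of the DIS entries in the ordered list form one contiguous run.
const disPositions = ordered
  .map((e, i) => (e.symbol === "DIS" ? i : -1))
  .filter((i) => i >= 0);
```

This is why putting the equality field (symbol) before the range field (timestamp) suits queries that ask for one symbol over a time window.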

                             23
Query for ticks by time; price
threshold
 // Ticks for last month for media companies
 > db.ticks.find({
            symbol: {$in: ["DIS", "VIA", "CBS"]},
            timestamp: {$gte: ISODate("2013-01-01"),
                        $lt: ISODate("2013-02-01")}})

 // Ticks when Disney’s bid breached 55.50 this month
 > db.ticks.find({
            symbol: "DIS",
            bidPrice: {$gt: 55.50},
            timestamp: {$gte: ISODate("2013-02-01")}})
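
When a time-range filter is built in application code, both bounds belong in one condition object under a single `timestamp` key, because a repeated key in a JavaScript object literal keeps only its last value. The helper below is a sketch of one way to build a whole-month range. `monthRange` is a hypothetical name, and the code assumes timestamps are stored in UTC.

```javascript
// Returns a half-open UTC range: [first instant of the month, first instant of the next month).
function monthRange(year, month /* 1-12 */) {
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    $lt: new Date(Date.UTC(year, month, 1)), // Date.UTC rolls December over into next January
  };
}

// Filter object equivalent to "ticks for January 2013 for media companies".
const filter = {
  symbol: { $in: ["DIS", "VIA", "CBS"] },
  timestamp: monthRange(2013, 1),
};
```

A half-open range avoids having to know the last day of the month and does not drop ticks that arrive during the final day.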


                           24
Analyzing/Aggregating Options

• Custom application code
  – Run your queries, compute your results
• Aggregation framework
  – Declarative, pipeline-based approach
• Native Map/Reduce in MongoDB
  – Javascript functions distributed across cluster
• Hadoop Connector
  – Offline batch processing/computation



                            25
Aggregate into min bars
// Aggregate minute bars for Disney for this month

db.ticks.aggregate(
 { $match: {symbol: "DIS", timestamp: {$gte: ISODate("2013-02-01")}}},
 { $project: {
         year:      {$year: "$timestamp"},
         month:     {$month: "$timestamp"},
         day:       {$dayOfMonth: "$timestamp"},
         hour:      {$hour: "$timestamp"},
         minute:    {$minute: "$timestamp"},
         second:    {$second: "$timestamp"},
         timestamp: 1,
         price: 1}},
 // Sort by time so that $first/$last pick the open and close of each minute
 { $sort: { timestamp: 1}},
 { $group :
      { _id : {year: "$year", month: "$month", day: "$day", hour: "$hour", minute: "$minute"},
        open: {$first: "$price"},
        high: {$max: "$price"},
        low: {$min: "$price"},
        close: {$last: "$price"} }} )
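
For comparison, the same roll-up can be done in custom application code, the first of the options listed earlier. The sketch below mirrors the sort and group stages in plain JavaScript. `toMinuteBars` is a hypothetical helper, the sample ticks are made up, and each tick is assumed to carry a `timestamp` and a `price` as in the pipeline.

```javascript
// Groups ticks into one-minute OHLC bars, mirroring the $sort + $group stages.
function toMinuteBars(ticks) {
  const sorted = [...ticks].sort((a, b) => a.timestamp - b.timestamp);
  const bars = new Map(); // insertion order keeps bars in time order
  for (const { timestamp, price } of sorted) {
    // Truncate to the start of the minute to form the bar key.
    const key = Math.floor(timestamp.getTime() / 60000) * 60000;
    const bar = bars.get(key);
    if (!bar) {
      bars.set(key, { openTS: new Date(key), open: price, high: price, low: price, close: price });
    } else {
      bar.high = Math.max(bar.high, price);
      bar.low = Math.min(bar.low, price);
      bar.close = price; // last tick seen so far in this minute
    }
  }
  return [...bars.values()];
}

// Made-up ticks, deliberately out of order.
const sampleTicks = [
  { timestamp: new Date("2013-02-15T10:00:05Z"), price: 55.36 },
  { timestamp: new Date("2013-02-15T10:00:40Z"), price: 55.8 },
  { timestamp: new Date("2013-02-15T10:00:20Z"), price: 55.2 },
  { timestamp: new Date("2013-02-15T10:01:10Z"), price: 55.7 },
];
const minuteBars = toMinuteBars(sampleTicks);
```

Doing this in the application moves every raw tick over the network, which is the main reason to prefer the aggregation framework when only the bars are needed.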

                                              26
Add analysis on the bars

…
// then count the number of down bars
{ $project: {
       downBar: {$lt: ["$close", "$open"] },
       timestamp: 1,
       open: 1, high: 1, low: 1, close: 1}},
{ $group: {
       _id: "$downBar",
       sum: {$sum: 1}}} )
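
The two stages above flag each bar as down when its close is below its open, then count the bars per flag value. A plain-JavaScript sketch of the same logic follows. `countDownBars` is a hypothetical helper and the bars are made-up values.

```javascript
// Counts bars whose close is below their open, and all other bars,
// matching the true/false groups produced by the $group stage.
function countDownBars(bars) {
  const counts = { down: 0, notDown: 0 };
  for (const bar of bars) {
    if (bar.close < bar.open) counts.down += 1;
    else counts.notDown += 1;
  }
  return counts;
}

const exampleBars = [
  { open: 55.36, close: 55.7 },
  { open: 55.7, close: 55.5 },
  { open: 55.5, close: 55.5 }, // unchanged bar is not a down bar
  { open: 55.5, close: 55.1 },
];
const downBarCounts = countDownBars(exampleBars);
```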

                          27
Map-Reduce Example: Sum

var mapFunction = function () {
    emit(this.symbol, this.bidPrice);
};
var reduceFunction = function (symbol, priceList) {
    return Array.sum(priceList);
};
> db.ticks.mapReduce(
       mapFunction, reduceFunction, {out: "tickSums"})
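
To make the data flow concrete, the sketch below simulates the map and reduce phases in plain JavaScript outside the database. `runMapReduce` is a hypothetical stand-in, not MongoDB's implementation. It passes `emit` as a parameter instead of relying on the shell's global, and it uses `reduce` on arrays because `Array.sum` is a mongo shell helper. The tick values are made up.

```javascript
// Simplified single-process simulation of map-reduce.
function runMapReduce(docs, map, reduce) {
  const emitted = new Map();
  // `emit` collects values per key, as the map phase does.
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);
  const out = {};
  for (const [key, values] of emitted) {
    // A key with a single emitted value skips reduce, as in MongoDB.
    out[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return out;
}

const mapTick = (doc, emit) => emit(doc.symbol, doc.bidPrice);
const sumPrices = (symbol, priceList) => priceList.reduce((a, b) => a + b, 0);

const tickDocs = [
  { symbol: "DIS", bidPrice: 55.5 },
  { symbol: "CBS", bidPrice: 43.5 },
  { symbol: "DIS", bidPrice: 55.25 },
];
const tickSums = runMapReduce(tickDocs, mapTick, sumPrices);
```

MongoDB may call the reduce function more than once for the same key, feeding earlier results back in, so the function has to return the same type it receives. A sum satisfies that requirement.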



                                28
Process Data on Hadoop

• MongoDB’s Hadoop Connector
• Supports Map/Reduce, Streaming, Pig
• MongoDB as input/output storage for Hadoop
  jobs
  – No need to go through HDFS
• Leverage power of Hadoop ecosystem against
  operational data in MongoDB




                         29
Performance, Scalability, and
            High Availability
Why MongoDB is fast and scalable

• Better data locality (Relational vs. MongoDB)
• In-Memory Caching
• Auto-Sharding (Read/write scaling)




                             31
Auto-Sharding for Horizontal Scale


              Key Range
              Symbol: A…Z




     mongod




   Read/Write Scalability



                            32
Auto-Sharding for Horizontal Scale


     Key Range      Key Range
     Symbol: A…J    Symbol: K…Z




     mongod         mongod




   Read/Write Scalability



                              33
Sharding

    Key Range     Key Range      Key Range     Key Range
    Symbol: A…F   Symbol: G…J    Symbol: K…O   Symbol: P…Z

    mongod        mongod         mongod        mongod




  Read/Write Scalability
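
The key ranges in the diagram can be read as a lookup table: each operation is routed to the shard whose range contains the document's shard key. The sketch below models that lookup on the first letter of the symbol only. It is a deliberate simplification. The shard names and the `shardForSymbol` helper are hypothetical, and real MongoDB chunks cover ranges of full shard-key values.

```javascript
// Hypothetical range table matching the four key ranges on the slide.
const keyRanges = [
  { shard: "shard0", from: "A", to: "F" },
  { shard: "shard1", from: "G", to: "J" },
  { shard: "shard2", from: "K", to: "O" },
  { shard: "shard3", from: "P", to: "Z" },
];

// Returns the shard whose key range covers the symbol's first letter.
function shardForSymbol(symbol) {
  const first = symbol[0].toUpperCase();
  const range = keyRanges.find((r) => first >= r.from && first <= r.to);
  if (!range) throw new Error(`no key range covers symbol ${symbol}`);
  return range.shard;
}
```

Usage: `shardForSymbol("DIS")` selects the A…F range, so reads and writes for different symbols spread across the shards.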



                            34
Application



          MongoS       MongoS           MongoS




Key Range          Key Range           Key Range      Key Range
Symbol: A…F,       Symbol: G…J,        Symbol: K…O,   Symbol: P…Z,
Time               Time                Time           Time

Primary            Primary             Primary        Primary


Secondary          Secondary           Secondary      Secondary


Secondary          Secondary           Secondary      Secondary


                                  35
10gen Products and Services

     Subscriptions
     Professional Support, Enterprise Edition and Commercial License



     Consulting
     Expert Resources for All Phases of MongoDB Implementations



     Training
     Online and In-Person, for Developers and Administrators




                             36
Summary

• MongoDB delivers high performance for tick data
• Scales horizontally through auto-sharding
• Fast, flexible querying, analysis, & aggregation
• Dynamic schema can handle any data type
• MongoDB offers all these features with low TCO
• 10gen can support you with anything discussed



                         37
For More Information

 Resource                     Location

 MongoDB Downloads            www.mongodb.org/download

 Free Online Training         education.10gen.com

 Webinars and Events          www.10gen.com/events

 White Papers                 www.10gen.com/white-papers

 Customer Case Studies        www.10gen.com/customers

 Presentations                www.10gen.com/presentations

 Documentation                docs.mongodb.org

 Additional Info              info@10gen.com



                         38

Editor's Notes

  • #6: Mention tick databases
  • #15: JSON document – contains key value pairs, different types, values can also be arrays and other documents
  • #16: because of the way MongoDB lets you update documents atomically we can be sure totals and list of voters will stay in sync
  • #17: because of the way MongoDB lets you update documents atomically we can be sure totals and list of voters will stay in sync
  • #18: because of the way MongoDB lets you update documents atomically we can be sure totals and list of voters will stay in sync
  • #19: comments is an array of JSON documents; we can query by fields inside embedded documents as well as array members.
  • #20: secondary indexes, compound indexes, multikey indexes. Why is it important to have all of a document together? Data locality.
  • #21: secondary indexes, compound indexes, multikey indexes. Why is it important to have all of a document together? Data locality.
  • #32: Fewer reads, data is together, memory mapped files, caching handled by OS, naturally leaves most frequently accessed data in RAM (have enough RAM to fit indexes and working data set into RAM for best performance), horizontal scaling is "built-in" to the product by design from the start.
  • #36: Full deployment. As many mongoS processes as you have app servers (for example); Config DBs are small but hold the critical information about where ranges of data are located on disk/shards.