Roger Bodamer
roger@10gen.com
@rogerb
Thoughts on Transaction and Consistency Models
RDBMS (Oracle, MySQL)
New Gen. OLAP (vertica, aster, greenplum)
Non-relational Operational Stores (“NoSQL”)
The database world is changing
• Document Datastores, Key Value, Graph databases
• Transactional model
• Full ACID
[Figure: memcached, key/value, and RDBMS positioned by scalability & performance against depth of functionality]
CAP
  It is impossible in the asynchronous network model to
  implement a read/write data object that guarantees the
  following properties:
  • Availability
  • Atomic consistency in all fair executions (including those
  in which messages are lost).

Or: If the network is broken, your database won’t work

But we get to define “won’t work”
Consistency Models - CAP

• Choices are Available-P (AP) or Consistent-P (CP)
• Write Availability, not Read Availability, is the Main Question

• It’s not all about CAP
   Scale, Reduced latency:
       •Multi data center
       •Speed
       •Even load distribution
Examples of Eventually Consistent Systems

• Eventual Consistency:
   •“The storage system guarantees that if no new
   updates are made to the object, eventually all
   accesses will return the last updated value”

•Examples:
   • DNS
   • Async replication (RDBMS, MongoDB)
   • Memcached (TTL cache)
Eventual Consistency

Read(x) : 1, 2, 2, 4, 4, 4, 4 …
Could we get this?

Read(x) : 1, 2, 1, 4, 2, 4, 4, 4 …
Monotonic read consistency
Prevent seeing writes out of order

• Appserver and slave on same box
• Appserver only reads from Slave

• Eventually consistent
• Reads are guaranteed to see writes in order

• Failover ?
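
The difference between the two Read(x) sequences shown earlier can be checked mechanically. The sketch below is illustrative only: the function name `is_monotonic` is made up for this example, and it assumes that x is written with increasing values (1, 2, 4), so a smaller value always means an older version.

```python
def is_monotonic(reads):
    """Return True if no read observes an older value than an earlier read.

    Assumption: writes to x use increasing values, so a smaller value
    always corresponds to an older version.
    """
    latest_seen = None
    for value in reads:
        if latest_seen is not None and value < latest_seen:
            return False  # the reader moved backwards in time
        latest_seen = value
    return True


# Sequence from the "Eventual Consistency" slide: stale at times, never backwards.
eventual = [1, 2, 2, 4, 4, 4, 4]
# Sequence from the "Could we get this?" slide: 2 -> 1 and 4 -> 2 go backwards.
out_of_order = [1, 2, 1, 4, 2, 4, 4, 4]

print(is_monotonic(eventual))      # True
print(is_monotonic(out_of_order))  # False
```

Reading from a single slave gives the first behaviour; the second can appear when successive reads are served by replicas that lag by different amounts.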
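RYOW Consistency
• Read Your Own Writes
• Eventually consistent for readers
• Immediately consistent for writers
[Diagram: a client that has written reads from and writes to the Primary; the Primary replicates to the Secondary; other clients read from the Secondary]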
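
A minimal sketch of the routing rule behind read-your-own-writes, using plain dictionaries as stand-ins for the primary and the secondary. The `Session` class and the `replicate` helper are hypothetical names invented for this illustration and do not correspond to any driver API.

```python
class Session:
    """Routes reads for one client (hypothetical helper, not a driver API)."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.has_written = False

    def write(self, key, value):
        # All writes go to the primary.
        self.primary[key] = value
        self.has_written = True

    def read(self, key):
        # A client that has written reads from the primary, so it always sees
        # its own writes; other clients read the possibly stale secondary.
        source = self.primary if self.has_written else self.secondary
        return source.get(key)


def replicate(primary, secondary):
    """Stand-in for asynchronous replication catching up."""
    secondary.update(primary)


primary, secondary = {}, {}
writer = Session(primary, secondary)
reader = Session(primary, secondary)

writer.write("x", 1)
print(writer.read("x"))  # 1: immediately consistent for the writer
print(reader.read("x"))  # None: the secondary has not caught up yet
replicate(primary, secondary)
print(reader.read("x"))  # 1: eventually consistent for readers
```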
Amazon Dynamo

• R: number of servers to read from
• W: number of servers that must acknowledge a write
• N: replication factor
   • R+W>N has nice properties: every read set overlaps every write set
Example

Example 1: R + W <= N
   R = 1, W = 1, N = 5
   Possibly stale data
   Higher availability

Example 2: R + W > N
   R = 2, W = 1, N = 2
   R = 1, W = 2, N = 2
   Consistent data
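
Why R + W > N yields consistent data can be verified by brute force: with that inequality, every possible set of R replicas read shares at least one member with every possible set of W replicas written. The sketch below enumerates all such sets for the configurations in the examples; the function name is an assumption made for this illustration.

```python
from itertools import combinations


def quorums_always_overlap(r, w, n):
    """Enumerate all read and write sets and check that each pair shares a replica."""
    replicas = range(n)
    for write_set in combinations(replicas, w):
        for read_set in combinations(replicas, r):
            if not set(write_set) & set(read_set):
                return False  # this read could miss the latest write entirely
    return True


for r, w, n in [(1, 1, 5), (2, 1, 2), (1, 2, 2)]:
    print(f"R={r} W={w} N={n}: overlap guaranteed = {quorums_always_overlap(r, w, n)}")
```

Only the first configuration (R + W <= N) reports that overlap is not guaranteed, which matches the "possibly stale data" label.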
R+W>N
If R+W > N, we can’t have both fast local reads and writes at the same time if all the data centers are equal peers
Consistency levels
• Strong: W + R > N
• Weak / Eventual : W + R <= N
• Optimized Read: R=1, W=N
• Optimized Write: W=1, R=N
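
The four levels above can be read as a small labelling rule over (R, W, N). The function below is a sketch of that rule only; its name and the label strings are chosen for this illustration.

```python
def consistency_levels(r, w, n):
    """Return the labels from the list above that apply to one configuration."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    labels = ["strong" if r + w > n else "weak/eventual"]
    if r == 1 and w == n:
        labels.append("optimized read")   # reads touch one replica, writes touch all
    if w == 1 and r == n:
        labels.append("optimized write")  # writes touch one replica, reads touch all
    return labels


print(consistency_levels(1, 1, 5))  # ['weak/eventual']
print(consistency_levels(1, 5, 5))  # ['strong', 'optimized read']
print(consistency_levels(5, 1, 5))  # ['strong', 'optimized write']
```

Both optimized settings satisfy R + W = N + 1 > N, so they are special cases of the strong level.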
Multi Datacenter Strategies
• Disaster recovery (DR)
   • Failover
• Single Region, Multi Datacenter
   • Useful for Strong or Eventual Consistency
• Local Reads, Remote Writes
   • All writes go to the master over the WAN; slaves are eventually consistent (EC)
• Intelligent Homing
   • Master copy close to user location
Network Partitions
Network Write Possibilities
•Deny all writes
   • Still Read fully consistent data
   • Give up write Availability

•Allow writes on one side
   • Failover
   • Possible to allow reads on other side

•Allow writes on both sides
   • Available
   • Give up consistency
Multiple Writer Strategies
• Last one wins
   • Use Vector clocks to decide latest

• Insert
   • An insert often really means
       if ( !exists(x)) then set(x)
   • exists is hard to implement in eventually consistent systems
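
A minimal sketch of how vector clocks order two writes, with each clock represented as a dictionary from node name to counter (a representation assumed for this example). Note that the comparison can also report that two writes are concurrent, in which case "latest" is undefined and some additional tie-break or merge is still required.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts of node -> counter."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(node, 0) <= b.get(node, 0) for node in nodes)
    b_le_a = all(b.get(node, 0) <= a.get(node, 0) for node in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b, so b is the later write
    if b_le_a:
        return "after"       # b happened before a, so a is the later write
    return "concurrent"      # neither write has seen the other


print(compare({"A": 1}, {"A": 2}))                  # before
print(compare({"A": 2, "B": 1}, {"A": 1, "B": 1}))  # after
print(compare({"A": 2, "B": 1}, {"A": 1, "B": 2}))  # concurrent
```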
Multiple Writer Strategies
• Delete
    op1: set( { _id : 'joe', age : 40 } )
    op2: delete( { _id : 'joe' } )
    op3: set( { _id : 'joe', age : 33 } )
• Consider switching op2 and op3
• Tombstone: Remember delete, and apply last-operation-wins

• Update
    update users set age=40 where _id='joe'
However, two writers may update different fields of the same document:
    op1: update users set age=40 where _id='joe'
    op2: update users set state='ca' where _id='joe'
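
The tombstone idea from the delete example can be sketched as follows. The delete is stored as a marker with its own timestamp instead of removing the entry, so last-operation-wins gives the same result whatever order the three operations arrive in. The logical timestamps 1, 2, 3 and all names in the code are assumptions made for this illustration.

```python
from itertools import permutations

TOMBSTONE = object()  # marker remembering that a delete happened


def apply_op(store, op):
    """Apply one (timestamp, key, value) operation with last-operation-wins."""
    ts, key, value = op
    current = store.get(key)
    if current is None or ts > current[0]:
        store[key] = (ts, value)


def read(store, key):
    """Return the visible value, treating a tombstone as 'not present'."""
    entry = store.get(key)
    if entry is None or entry[1] is TOMBSTONE:
        return None
    return entry[1]


ops = [
    (1, "joe", {"age": 40}),  # op1: set
    (2, "joe", TOMBSTONE),    # op2: delete, kept as a tombstone
    (3, "joe", {"age": 33}),  # op3: set
]

results = []
for order in permutations(ops):
    store = {}
    for op in order:
        apply_op(store, op)
    results.append(read(store, "joe"))

print(all(result == {"age": 33} for result in results))  # True
```

Without the tombstone, a replica that receives op3 before op2 would delete the newer document.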
Multiple Writer Strategies
• Programmatic Merge
   • Store operations, instead of state
   • Replay operations
   • Did you get the last operation?
   • Not immediate

• Commutative operations
   • Conflict free?
   • Fold-able
   • Example: add, increment, decrement
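
A small sketch of why commutative operations are conflict free and fold-able: replaying the same stored operations in every possible order yields a single result. The operation encoding as (name, argument) tuples is an assumption made for this example.

```python
from functools import reduce
from itertools import permutations


def apply_op(value, op):
    """Apply one stored operation to the current value."""
    name, arg = op
    if name == "add":
        return value + arg
    if name == "increment":
        return value + 1
    if name == "decrement":
        return value - 1
    raise ValueError(f"unknown operation: {name}")


ops = [("add", 5), ("increment", None), ("decrement", None), ("add", -2)]

# Fold the operations over a starting value of 0 in every possible order.
outcomes = {reduce(apply_op, order, 0) for order in permutations(ops)}
print(outcomes)  # {3}

# By contrast, plain overwrites do not commute: the result depends on order.
overwrites = {reduce(lambda _, new: new, order, 0) for order in permutations([40, 33])}
print(len(overwrites))  # 2
```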
Trivial Network Partitions
MongoDB Options
MongoDB Transaction Support
• Single Master / Sharded

• MongoDB Supports Atomic Operations on Single Documents
   • But not Rollback
• $ operators
   • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll
• Update if current
   • find followed by update
• Find and modify
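
The "update if current" pattern can be sketched without a database: read the document, then apply the update only if the stored document still matches what was read. The code below uses an in-memory dictionary as a stand-in for a collection; it is not the MongoDB driver API, and the function name is hypothetical. In a real deployment the check and the write would need to happen as one atomic single-document operation.

```python
def update_if_current(collection, doc_id, expected, changes):
    """Apply changes only if the stored document still equals `expected`.

    Returns True when the update was applied, False when the document
    changed after it was read.
    """
    current = collection.get(doc_id)
    if current != expected:
        return False  # someone else changed the document since we read it
    collection[doc_id] = {**current, **changes}
    return True


users = {"joe": {"age": 40}}

snapshot_a = dict(users["joe"])  # writer A reads
snapshot_b = dict(users["joe"])  # writer B reads the same version

print(update_if_current(users, "joe", snapshot_a, {"age": 41}))  # True
print(update_if_current(users, "joe", snapshot_b, {"age": 33}))  # False: stale read
print(users["joe"])  # {'age': 41}
```

A writer whose update is rejected would typically re-read the document and retry.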
MongoDB
[Diagram: a Replicaset. Clients read from and write to the Primary, which replicates to a Secondary and to a Secondary for Backup; additional reads are served from a secondary]
Write Scalability: Sharding
• key range 0 .. 30: ReplicaSet 1 (Primary, Secondary, Secondary)
• key range 31 .. 60: ReplicaSet 2 (Primary, Secondary, Secondary)
• key range 61 .. 100: ReplicaSet 3 (Primary, Secondary, Secondary)
[Diagram: reads and writes are routed by key range to the owning replica set]
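
Routing by key range can be sketched with a sorted list of upper bounds and a binary search. The ranges are the ones on the slide; the function and constant names are assumptions for this illustration, and the code does not describe how MongoDB itself routes requests.

```python
from bisect import bisect_left

# Inclusive upper bound of each key range, taken from the slide.
UPPER_BOUNDS = [30, 60, 100]
SHARDS = ["ReplicaSet 1", "ReplicaSet 2", "ReplicaSet 3"]


def shard_for(key):
    """Return the replica set that owns `key` (an integer in 0..100)."""
    if not 0 <= key <= UPPER_BOUNDS[-1]:
        raise ValueError(f"key {key} is outside 0..{UPPER_BOUNDS[-1]}")
    # bisect_left finds the first range whose upper bound is >= key.
    return SHARDS[bisect_left(UPPER_BOUNDS, key)]


for key in (0, 30, 31, 60, 61, 100):
    print(key, "->", shard_for(key))
```

Because each key belongs to exactly one replica set, writes for different ranges land on different primaries, which is where the write scalability comes from.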
Sometimes we need global state / more consistency
• Unique key constraints
   • User registration
• ACL changes
Could it be the case that…


uptime( CP + average developer )
 >=
uptime( AP + average developer )

where uptime:= system is up and non-buggy?
Thank You :-)
  @rogerb
download at mongodb.org
conferences, appearances, and meetups: http://www.10gen.com/events
Facebook: http://bit.ly/mongoU | Twitter: @mongodb | LinkedIn: http://linkd.in/joinmongo