@ljacomet#DevoxxPL
Data Consistency
Analyze, understand and decide
Louis Jacomet
@ljacomet
Principal Software Engineer
Software AG / Terracotta
• Louis Jacomet / @ljacomet
• Principal Software Engineer at Software AG / Terracotta since 2013
• A developer close to his forties who did not fully manage to
dodge all things management
• Interests range from concurrency to API design, with learning new
things as a driving factor
• Part of the Devoxx family as program committee for Belgium
Who is that guy?
• Been presenting on caching for a while now
• Focus usually on
• performance gains,
• ease of use,
• integration
• Mostly silent on consistency issues
• Distributed systems, with or without microservices, are really trendy
Why this talk?
• Some tools sound like magic
• Makes for hard wake-up calls when production disasters happen
• Building on the shoulders of giants does not mean you should
not look at the giant!
Why this talk?
• Consistency as in ACID
• Consistency as in CAP
• And what about your application?
Agenda
• Defines a model with a set of rules, and being consistent
means the rules are respected
• Examples:
• serial execution in a program thread
• or the Java memory model
Consistency?
• From the database world:
“Consistency in database systems refers to the requirement
that any given database transaction must change affected
data only in allowed ways. Any data written to the database
must be valid according to all defined rules, including
constraints, cascades, triggers, and any combination thereof.”
https://en.wikipedia.org/wiki/Consistency_(database_systems)
Data Consistency and ACID
• Isolation
• Concurrent transactions result in the system state that would
be obtained if they were executed serially
• 4 levels of isolation, 3 read phenomena in ANSI SQL
• Consistency and Isolation are related properties
• Usually configurable
C and I
Isolation levels vs Read phenomena (X = phenomenon can occur)
Isolation level    | Dirty reads | Non-repeatable reads | Phantom reads
Read uncommitted   | X           | X                    | X
Read committed     |             | X                    | X
Repeatable read    |             |                      | X
Serialisable       |             |                      |
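As a rough illustration, the table can be written down as a small Java enum. The enum name `AnsiIsolation` and its methods are hypothetical; only the `java.sql.Connection` constants come from JDBC.

```java
import java.sql.Connection;

/** Hypothetical enum mirroring the table: which read phenomena each level still allows. */
enum AnsiIsolation {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true, true, true),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED, false, true, true),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ, false, false, true),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE, false, false, false);

    private final int jdbcLevel;
    private final boolean dirtyReads;
    private final boolean nonRepeatableReads;
    private final boolean phantomReads;

    AnsiIsolation(int jdbcLevel, boolean dirtyReads, boolean nonRepeatableReads, boolean phantomReads) {
        this.jdbcLevel = jdbcLevel;
        this.dirtyReads = dirtyReads;
        this.nonRepeatableReads = nonRepeatableReads;
        this.phantomReads = phantomReads;
    }

    /** The matching constant for Connection.setTransactionIsolation. */
    int jdbcLevel() {
        return jdbcLevel;
    }

    /** Weakest level that rules out every phenomenon the caller cannot tolerate. */
    static AnsiIsolation weakestPreventing(boolean dirty, boolean nonRepeatable, boolean phantom) {
        for (AnsiIsolation level : values()) { // declared from weakest to strongest
            boolean acceptable = !(dirty && level.dirtyReads)
                    && !(nonRepeatable && level.nonRepeatableReads)
                    && !(phantom && level.phantomReads);
            if (acceptable) {
                return level;
            }
        }
        throw new AssertionError("SERIALIZABLE allows none of the three phenomena");
    }
}
```

For example, `AnsiIsolation.weakestPreventing(true, false, false)` picks `READ_COMMITTED`, while asking to prevent phantom reads leaves only `SERIALIZABLE`.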
• Configurable in your data source
• Frameworks may offer configuration
• When pooling connections, most often the option is one
isolation level for all
• Use multiple pools for multiple levels
• See Spring support for example
Isolation levels in Java
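With plain JDBC, the level is set per connection through `Connection.setTransactionIsolation`. A minimal sketch, assuming a pooled connection whose original level should be restored before it goes back to the pool; `IsolationScope` and `SqlWork` are hypothetical names, not part of any framework.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper: run some work at a given isolation level, then restore the old one. */
final class IsolationScope {

    @FunctionalInterface
    interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    static <T> T callWithIsolation(Connection connection, int level, SqlWork<T> work)
            throws SQLException {
        int previous = connection.getTransactionIsolation();
        // Set the level before any statement runs: JDBC leaves the effect of
        // changing it in the middle of a transaction implementation-defined.
        connection.setTransactionIsolation(level);
        try {
            return work.apply(connection);
        } finally {
            // Restore the level so the next borrower of a pooled connection is unaffected.
            connection.setTransactionIsolation(previous);
        }
    }
}
```

A call could look like `IsolationScope.callWithIsolation(connection, Connection.TRANSACTION_SERIALIZABLE, c -> ...)`. Whether a given driver or database honours every level is something to check against its documentation.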
• 4 different strategies
• Read only
• Non strict read write
• Read write
• Transactional
Hibernate Caching Strategies
• Opens a window of inconsistency by using invalidation
• Cache entries are invalidated before and after transaction
completion
• Means that a concurrent transaction could end up loading an
outdated value into the cache during that time
Non strict read write
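The window can be replayed deterministically in a few lines. This is only a single-threaded sketch of one possible interleaving, with two plain maps standing in for the database and the cache; the class name, the key and the values are made up for the illustration, and none of this is Hibernate code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical replay of an invalidation-based cache racing with a writer. */
final class NonStrictReadWriteWindow {
    private final Map<String, String> database = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();

    /** Read-through lookup: on a miss, load the committed value and cache it. */
    private String read(String key) {
        return cache.computeIfAbsent(key, database::get);
    }

    /** Returns what a concurrent reader observes at three points in time. */
    static List<String> replay() {
        NonStrictReadWriteWindow s = new NonStrictReadWriteWindow();
        List<String> observed = new ArrayList<>();
        s.database.put("price", "old");

        // Writer: first invalidation, before its transaction completes.
        s.cache.remove("price");
        // Concurrent reader: cache miss, so it loads the still-committed "old" value.
        observed.add(s.read("price"));
        // Writer commits the new value to the database.
        s.database.put("price", "new");
        // Window of inconsistency: the database says "new", the cache still answers "old".
        observed.add(s.read("price"));
        // Writer: second invalidation, after transaction completion.
        s.cache.remove("price");
        // The next read misses and reloads the committed "new" value.
        observed.add(s.read("price"));
        return observed;
    }
}
```

`NonStrictReadWriteWindow.replay()` returns `[old, old, new]`: the middle read is the stale one served during the window.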
• Resolves inconsistencies by using soft locks
• Cached items can only be read by transactions started after
the item’s creation
• Invalidated entries can only be replaced by a transaction with
a timestamp after the transaction that invalidated the mapping
Read write
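The two timestamp rules can be sketched as predicates on cache entries. The classes below are hypothetical and only loosely modelled on the bullets; they are not Hibernate's implementation, and timestamps are simplified to plain `long` values where larger means later.

```java
/** Hypothetical sketch of the two read-write rules; these are not Hibernate's classes. */
final class ReadWriteEntries {

    /** A cached value stamped with the moment it entered the cache. */
    static final class Item {
        final String value;
        final long createdAt;

        Item(String value, long createdAt) {
            this.value = value;
            this.createdAt = createdAt;
        }

        /** Only transactions that started after the item was created may read it. */
        boolean readableBy(long txStart) {
            return txStart > createdAt;
        }
    }

    /** Placeholder left in the cache by a transaction that invalidated the mapping. */
    static final class SoftLock {
        final long invalidatedAt;

        SoftLock(long invalidatedAt) {
            this.invalidatedAt = invalidatedAt;
        }

        /** Assumption of this sketch: a soft-locked entry is never served from the cache. */
        boolean readableBy(long txStart) {
            return false;
        }

        /** Only a transaction newer than the invalidating one may put a value back. */
        boolean replaceableBy(long txStart) {
            return txStart > invalidatedAt;
        }
    }
}
```

The point of the design is that an older transaction, which may hold an outdated view of the data, can neither read a newer item nor overwrite an invalidated mapping.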
• Researchers have since identified more phenomena and
thus defined more isolation levels
• Examples:
• read skew or write skew phenomena
• Snapshot or cursor stability isolation levels
Not the whole story …
Isolation anomalies: Read skew
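[Figure: timeline of transactions T1 and T2 over items x and y, both initially 50. T1 reads x = 50; T2 writes x = 25 and y = 75, then commits; T1 reads y = 75. T1 ends up with x = 50 and y = 75.]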
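The read skew interleaving can be replayed step by step. The sketch below uses the values as far as they can be read from the figure and assumes, for the sake of the example, an invariant of x + y = 100; the class name is hypothetical.

```java
/** Hypothetical single-threaded replay of the read skew interleaving. */
final class ReadSkewDemo {

    /** Returns the pair {x, y} as observed by T1. */
    static int[] replay() {
        int x = 50;
        int y = 50; // committed state, x + y == 100 (assumed invariant)

        // T1 reads x.
        int t1x = x;

        // T2 moves 25 from x to y and commits; the invariant still holds afterwards.
        x = 25;
        y = 75;

        // T1 reads y, now seeing T2's committed write.
        int t1y = y;

        return new int[] { t1x, t1y };
    }
}
```

T1 observes x = 50 and y = 75, a combination summing to 125 that never existed in any committed state of the database.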
Isolation anomalies: Write skew
[Figure: timeline of transactions T1 and T2 over items x and y. Values shown: reads of x = 30 and y = 10, then writes of 50 and 60, each followed by a commit.]
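Write skew is easiest to see with two transactions that each check a shared rule against their own snapshot and then write different items. In the sketch below the starting values 30 and 10 echo the figure, but the withdrawals of 40 and the rule x + y >= 0 are invented for the illustration, as is the class name.

```java
/** Hypothetical single-threaded replay of a write skew under snapshot reads. */
final class WriteSkewDemo {

    /** Returns the final committed pair {x, y}. */
    static int[] replay() {
        int x = 30;
        int y = 10; // committed state, rule to preserve: x + y >= 0

        // Both transactions take their snapshot before either one writes.
        int t1x = x, t1y = y;
        int t2x = x, t2y = y;

        // T1 wants to withdraw 40 from x; its snapshot says 30 + 10 - 40 >= 0.
        boolean t1Allowed = t1x + t1y - 40 >= 0;
        // T2 wants to withdraw 40 from y; its snapshot says the same.
        boolean t2Allowed = t2x + t2y - 40 >= 0;

        // The write sets are disjoint, so neither commit conflicts with the other.
        if (t1Allowed) {
            x = t1x - 40;
        }
        if (t2Allowed) {
            y = t2y - 40;
        }
        return new int[] { x, y };
    }
}
```

Each transaction is correct in isolation, yet the result is x = -10 and y = -30, which breaks the rule both of them checked. A serial execution would have rejected the second withdrawal.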
http://learnyousomeerlang.com/distribunomicon
• Availability
“Availability means that every request to a non-failing
node must complete successfully. Since network
partitions are allowed to last arbitrarily long, this means that
nodes cannot simply defer responding until after the partition
heals.”
https://aphyr.com/posts/313-strong-consistency-models
CAP definitions
• Partition (tolerance)
“Partition tolerance means that partitions can happen.
Providing consistency and availability when the network is reliable
is easy. Providing both when the network is not reliable is provably
impossible. If your network is not perfectly reliable–and it isn’t–you
cannot choose CA. This means that all practical distributed systems
on commodity hardware can guarantee, at maximum, either AP or
CP.”
https://aphyr.com/posts/313-strong-consistency-models
CAP definitions
• (Atomic) Consistency
“Consistency means linearizability, and in particular, a
linearizable register. Registers are equivalent to other systems,
including sets, lists, maps, relational databases, and so on, so the
theorem can be extended to cover all kinds of linearizable
systems.”
https://aphyr.com/posts/313-strong-consistency-models
CAP definitions
• Back to consistency - the term, not the definition
• Defines a model with a set of rules, and being consistent
means the rules are respected
Defining Linearizability
• Operations span time
• Luckily, this time is finite
• From the beginning to the end of the operation
• Effect could be visible at any time during that span
• Let’s call that the linearisation point
Defining Linearizability
• If there is a valid sequential history of operations using the
linearisation point, then linearizability is achieved
• Knowing that a response preceding an invocation must still
precede it in the reordering.
So what is Linearizability?
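For small histories the definition can be checked by brute force: try every ordering that keeps completed operations ahead of later invocations, and see whether one of them is a valid sequential history. The sketch below does this for a single integer register; `LinearizabilityCheck`, `Op` and the integer timestamps are all hypothetical, and the search is exponential, so it is only meant for toy examples.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical brute-force linearizability check for one integer register. */
final class LinearizabilityCheck {

    /** One operation: a write of a value, or a read that returned a value. */
    static final class Op {
        final boolean write;
        final int value;
        final long invoked;
        final long responded;

        private Op(boolean write, int value, long invoked, long responded) {
            this.write = write;
            this.value = value;
            this.invoked = invoked;
            this.responded = responded;
        }

        static Op write(int value, long invoked, long responded) {
            return new Op(true, value, invoked, responded);
        }

        static Op read(int observed, long invoked, long responded) {
            return new Op(false, observed, invoked, responded);
        }
    }

    static boolean isLinearizable(List<Op> history, int initialValue) {
        return search(new ArrayList<>(history), initialValue);
    }

    /** Backtracking search for a valid sequential order of the remaining operations. */
    private static boolean search(List<Op> remaining, int current) {
        if (remaining.isEmpty()) {
            return true;
        }
        for (int i = 0; i < remaining.size(); i++) {
            Op candidate = remaining.get(i);
            if (!mayGoFirst(candidate, remaining)) {
                continue; // would overtake an operation that had already responded
            }
            if (!candidate.write && candidate.value != current) {
                continue; // a read must return the register's current value
            }
            int next = candidate.write ? candidate.value : current;
            List<Op> rest = new ArrayList<>(remaining);
            rest.remove(i); // removes by index
            if (search(rest, next)) {
                return true;
            }
        }
        return false;
    }

    /** A response preceding an invocation must still precede it in the reordering. */
    private static boolean mayGoFirst(Op candidate, List<Op> remaining) {
        for (Op other : remaining) {
            if (other != candidate && other.responded < candidate.invoked) {
                return false;
            }
        }
        return true;
    }
}
```

A read of 0 that overlaps a write of 1 is accepted, because the read's linearisation point can fall before the write's. The same read invoked after the write has responded is rejected: that is a stale read.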
[Figure: timeline of two clients. A invokes lock, B invokes lock, A “fails” to lock, B “gets” lock.]
• Powerful consequences:
• Completed operations must be visible
• Stale and non-monotonic reads are prohibited
• Stackable model
• You can build higher level linearizability on top of
linearizability
So in practice?
• Attracted (public) attention to weaker consistency models
• By relaxing constraints, you can be C’A’P
Consequences of CAP
https://aphyr.com/posts/313-strong-consistency-models
[Figure: a cache, one Terracotta server and three Terracotta clients.]
http://hackingdistributed.com/2013/03/23/consistency-alphabet-soup/
• Your application may never trigger these issues
• Not enough concurrency
• Higher consistency provided by the application logic
• Repair of inconsistencies is part of the business process
But why does it work then?
• It probably cares about neither ACID nor CAP consistency
• Instead it defines its own set of rules and must be consistent
with regard to those
What about your application?
• An application is built of multiple pieces
• Storage, eventing, messaging
• Services, distributed or not
• UIs on different platforms with different partition
characteristics
Composing systems
• Proposition:
“A cache should never be the cause of an application error”
Ehcache resilience strategy
• For in-memory, the cache should always be consistent
• With write through, a failure to write means the entry is not
in the cache
• With write-behind, a failure to write will invalidate the cache
entry
Ehcache resilience strategy
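The write-through rule can be sketched with a tiny cache wrapper. This is not the Ehcache API: `WriteThroughCache` and its `Writer` interface are hypothetical names, and swallowing the exception is only one way to honour the idea that the cache should not be the cause of an application error.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical write-through cache: an entry is cached only if the write succeeded. */
final class WriteThroughCache<K, V> {

    /** Stand-in for the system of record behind the cache. */
    @FunctionalInterface
    interface Writer<K, V> {
        void write(K key, V value) throws Exception;
    }

    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Writer<K, V> writer;

    WriteThroughCache(Writer<K, V> writer) {
        this.writer = writer;
    }

    /** Returns true if the value was written and cached, false if the write failed. */
    boolean put(K key, V value) {
        try {
            writer.write(key, value);
            entries.put(key, value);
            return true;
        } catch (Exception e) {
            // The write failed, so whatever the cache held for this key may be wrong:
            // drop it rather than serve it, and do not propagate the failure.
            entries.remove(key);
            return false;
        }
    }

    /** Returns the cached value, or null when the key is not cached. */
    V get(K key) {
        return entries.get(key);
    }
}
```

After a failed `put`, `get` returns `null` for that key, so the next reader has to go back to the underlying store instead of trusting the cache.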
• What about distributed caches?
• Idea is to require users to provide their conflict resolution
strategy
Ehcache resilience strategy
• Analyse the properties of the system
• Your application
• The tools it is built upon
• Understand where things can go wrong and what the
consequences are
• Then decide what to do and how to minimise impacts!
Conclusion
• Aphyr and all things Jepsen
• https://aphyr.com/posts
• Work from Peter Bailis
• http://www.bailis.org/blog/
• Adrian Colyer’s The Morning Paper
• https://blog.acolyer.org/
• And more … shoulders of giants, remember?
References
Q & A