3. Concurrency Control
for Transactions
Part One
CSEP 545 Transaction Processing
Philip A. Bernstein
Copyright ©2003 Philip A. Bernstein
Outline
1. A Simple System Model
2. Serializability Theory
3. Synchronization Requirements
for Recoverability
4. Two-Phase Locking
5. Preserving Transaction Handshakes
6. Implementing Two-Phase Locking
7. Deadlocks
3.1 A Simple System Model
• Goal - Ensure serializable (SR) executions
• Implementation technique - Delay operations
that would lead to non-SR results (e.g. set locks
on shared data)
• For good performance minimize overhead and
delay from synchronization operations
• First, we’ll study how to get correct (SR) results
• Then, we’ll study performance implications
(mostly in Part Two)
Assumption - Atomic Operations
• We will synchronize Reads and Writes.
• We must therefore assume they’re atomic
– else we’d have to synchronize the finer-grained
operations that implement Read and Write
• Read(x) - returns the current value of x in the DB
• Write(x, val) overwrites all of x (the whole page)
• This assumption of atomic operations is what
allows us to abstract executions as sequences of
reads and writes (without loss of information).
– Otherwise, what would wk[x] ri[x] mean?
• Also, commit (ci) and abort (ai) are atomic
System Model
[Diagram: Transactions 1 … N issue Start, Commit, Abort and
Read(x), Write(x) calls to a Data Manager, which reads and
writes the Database]
3.2 Serializability Theory
• The theory is based on modeling executions as
histories, such as
H1 = r1[x] r2[x] w1[x] c1 w2[y] c2
• First, characterize a concurrency control
algorithm by the properties of histories it
allows.
• Then prove that any history having these
properties is SR
• Why bother? It helps you understand why
concurrency control algorithms work.
Equivalence of Histories
• Two operations conflict if their execution order
affects their return values or the DB state.
– a read and write on the same data item conflict
– two writes on the same data item conflict
– two reads (on the same data item) do not conflict
• Two histories are equivalent if they have the
same operations and conflicting operations are
in the same order in both histories
– because only the relative order of conflicting
operations can affect the result of the histories
Examples of Equivalence
• The following histories are equivalent
H1 = r1[x] r2[x] w1[x] c1 w2[y] c2
H2 = r2[x] r1[x] w1[x] c1 w2[y] c2
H3 = r2[x] r1[x] w2[y] c2 w1[x] c1
H4 = r2[x] w2[y] c2 r1[x] w1[x] c1
• But none of them are equivalent to
H5 = r1[x] w1[x] r2[x] c1 w2[y] c2
because r2[x] and w1[x] conflict and
r2[x] precedes w1[x] in H1 - H4, but
w1[x] precedes r2[x] in H5.
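The equivalence test on this slide can be sketched in Python. The encoding is illustrative (not from the slides): a history is a list of (op, txn, item) tuples, e.g. ('r', 1, 'x') for r1[x], with item None for commits.

```python
# Conflict-equivalence check: same operations, and every pair of
# conflicting operations in the same order in both histories.

def conflicts(a, b):
    """Two ops conflict if they touch the same data item, come from
    different transactions, and at least one is a write."""
    return (a[2] == b[2] and a[2] is not None and
            a[1] != b[1] and 'w' in (a[0], b[0]))

def conflict_order(h):
    """The set of ordered pairs of conflicting operations in h."""
    return {(h[i], h[j])
            for i in range(len(h)) for j in range(i + 1, len(h))
            if conflicts(h[i], h[j])}

def equivalent(h1, h2):
    return sorted(h1) == sorted(h2) and conflict_order(h1) == conflict_order(h2)

H1 = [('r',1,'x'), ('r',2,'x'), ('w',1,'x'), ('c',1,None),
      ('w',2,'y'), ('c',2,None)]
H5 = [('r',1,'x'), ('w',1,'x'), ('r',2,'x'), ('c',1,None),
      ('w',2,'y'), ('c',2,None)]
print(equivalent(H1, H5))  # False: r2[x] and w1[x] occur in opposite orders
```

Running it on H1 and H2–H4 from the slide instead of H5 returns True, matching the slide's claim.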
Serializable Histories
• A history is serializable if it is equivalent to a serial
history
• For example,
H1 = r1[x] r2[x] w1[x] c1 w2[y] c2
is equivalent to
H4 = r2[x] w2[y] c2 r1[x] w1[x] c1
(r2[x] and w1[x] are in the same order in H1 and H4.)
• Therefore, H1 is serializable.
Another Example
• H6= r1[x] r2[x] w1[x] r3[x] w2[y] w3[x] c3 w1[y] c1 c2
is equivalent to a serial execution of T2 T1 T3,
H7 = r2[x] w2[y] c2 r1[x] w1[x] w1[y] c1 r3[x] w3[x] c3
• Each conflict implies a constraint on any equivalent
serial history. For
H6 = r1[x] r2[x] w1[x] r3[x] w2[y] w3[x] c3 w1[y] c1 c2:
– r2[x] precedes w1[x], and w2[y] precedes w1[y]: T2→T1
– w1[x] precedes r3[x] and w3[x]: T1→T3
– r2[x] precedes w3[x]: T2→T3
Serialization Graphs
• A serialization graph, SG(H), for history H tells the
effective execution order of transactions in H.
• Given history H, SG(H) is a directed graph whose
nodes are the committed transactions and whose
edges are all Ti→ Tk such that at least one of Ti’s
operations precedes and conflicts with at least one
of Tk’s operations
H6 = r1[x] r2[x] w1[x] r3[x] w2[y] w3[x] c3 w1[y] c1 c2
SG(H6) = T2 →T1 →T3
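The definition above is mechanical enough to code directly. A sketch, reusing the illustrative (op, txn, item) tuple encoding for histories (names like `serialization_graph` are my own, not the text's):

```python
# Build SG(H) over the committed transactions of a history, and test
# it for cycles (Kahn's algorithm: repeatedly peel off source nodes).

def serialization_graph(h):
    committed = {t for (op, t, _) in h if op == 'c'}
    edges = set()
    for i, (op1, t1, x1) in enumerate(h):
        for (op2, t2, x2) in h[i + 1:]:
            if (t1 != t2 and t1 in committed and t2 in committed
                    and x1 == x2 and x1 is not None
                    and 'w' in (op1, op2)):
                edges.add((t1, t2))
    return committed, edges

def is_acyclic(nodes, edges):
    nodes, edges = set(nodes), set(edges)
    while nodes:
        sources = {n for n in nodes if not any(e[1] == n for e in edges)}
        if not sources:
            return False          # every remaining node lies on a cycle
        nodes -= sources
        edges = {e for e in edges if e[0] in nodes and e[1] in nodes}
    return True

H6 = [('r',1,'x'), ('r',2,'x'), ('w',1,'x'), ('r',3,'x'), ('w',2,'y'),
      ('w',3,'x'), ('c',3,None), ('w',1,'y'), ('c',1,None), ('c',2,None)]
nodes, edges = serialization_graph(H6)
print(edges)                      # {(2, 1), (1, 3), (2, 3)}, i.e. T2→T1→T3
print(is_acyclic(nodes, edges))   # True: H6 is serializable
```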
The Serializability Theorem
A history H is SR if and only if SG(H) is acyclic.
Proof: (if) SG(H) is acyclic. So let Hs be a serial
history consistent with SG(H). Each pair of
conflicting ops in H induces an edge in SG(H).
Since conflicting ops in Hs and H are in the same
order, Hs≡H, so H is SR.
(only if) H is SR. Let Hs be a serial history equivalent
to H. Claim that if Ti → Tk in SG(H), then Ti
precedes Tk in Hs (else Hs would not be equivalent to H).
So if SG(H) had a cycle T1→T2→…→Tn→T1, then T1 would
precede T1 in Hs, a contradiction. Hence SG(H) is acyclic.
How to Use
the Serializability Theorem
• Characterize the set of histories that a
concurrency control algorithm allows
• Prove that any such history must have an
acyclic serialization graph.
• Therefore, the algorithm guarantees SR
executions.
• We’ll use this soon to prove that locking
produces serializable executions.
3.3 Synchronization Requirements
for Recoverability
• In addition to guaranteeing serializability,
synchronization is needed to implement abort easily.
• When a transaction T aborts, the data manager
wipes out all of T’s effects, including
– undoing T’s writes that were applied to the DB, and
– aborting transactions that read values written by T
(these are called cascading aborts)
• Example - w1[x] r2[x] w2[y]
– to abort T1, we must undo w1[x] and abort T2
(a cascading abort)
Recoverability
• If Tk reads from Ti and Ti aborts, then Tk must abort
– Example - w1[x] r2[x] a1 implies T2 must abort
• But what if Tk already committed? We’d be stuck.
– Example - w1[x] r2[x] c2 a1
– T2 can’t abort after it commits
• Executions must be recoverable:
A transaction T’s commit operation must follow the
commit of every transaction from which T read.
– Recoverable - w1[x] r2[x] c1 c2
– Not recoverable - w1[x] r2[x] c2 a1
• Recoverability requires synchronizing operations.
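The recoverability rule above can be checked mechanically over a complete history. A minimal sketch, assuming the same illustrative (op, txn, item) tuple encoding, with 'a' for abort:

```python
# A history is recoverable if, whenever Tk reads x from Ti, Tk's
# commit follows Ti's commit. Simplified: tracks only the latest
# writer of each item, and assumes the history is complete.

def is_recoverable(h):
    last_writer = {}          # item -> txn of the most recent write
    reads_from = {}           # reader txn -> set of writer txns
    outcome = {}              # txn -> ('c' or 'a', position in h)
    for i, (op, t, x) in enumerate(h):
        if op == 'w':
            last_writer[x] = t
        elif op == 'r' and x in last_writer and last_writer[x] != t:
            reads_from.setdefault(t, set()).add(last_writer[x])
        elif op in ('c', 'a'):
            outcome[t] = (op, i)
    for reader, writers in reads_from.items():
        if outcome.get(reader, ('',))[0] == 'c':
            for w in writers:
                # each writer must have committed before the reader did
                if (w not in outcome or outcome[w][0] != 'c'
                        or outcome[w][1] > outcome[reader][1]):
                    return False
    return True

ok  = [('w',1,'x'), ('r',2,'x'), ('c',1,None), ('c',2,None)]
bad = [('w',1,'x'), ('r',2,'x'), ('c',2,None), ('a',1,None)]
print(is_recoverable(ok), is_recoverable(bad))  # True False
```

The two test histories are exactly the slide's recoverable and non-recoverable examples.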
Avoiding Cascading Aborts
• Cascading aborts are worth avoiding to
– avoid complex bookkeeping, and
– avoid an uncontrolled number of forced aborts
• To avoid cascading aborts, a data manager should
ensure transactions only read committed data
• Example
– avoids cascading aborts: w1[x] c1 r2[x]
– allows cascading aborts: w1[x] r2[x] a1
• A system that avoids cascading aborts also
guarantees recoverability.
Strictness
• It’s convenient to undo a write, w[x], by restoring
its before image (=the value of x before w[x]
executed)
• Example - w1[x,1] writes the value “1” into x.
– w1[x,1] w1[y,3] c1 w2[y,1] r2[x] a2
– abort T2 by restoring the before image of w2[y,1], i.e., y = 3
• But this isn’t always possible.
– For example, consider w1[x,2] w2[x,3] a1 a2
– a1 & a2 can’t be implemented by restoring before images
– notice that w1[x,2] w2[x,3] a2 a1 would be OK
Strictness (cont’d)
• More precisely, a system is strict if it only executes
ri[x] or wi[x] if all previous transactions that wrote x
committed or aborted.
• Examples (“…” marks a non-strict prefix)
– strict: w1[x] c1 w2[x] a2
– not strict: w1[x] w2[x] … a1 a2
– strict: w1[x] w1[y] c1 w2[y] r2[x] a2
– not strict: w1[x] w1[y] w2[y] a1 r2[x] a2
• “Strict” implies “avoids cascading aborts.”
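The strictness definition also translates directly into a checker. A sketch under the same assumed tuple encoding (the function name is mine):

```python
# Strict: ri[x] or wi[x] may execute only if every earlier transaction
# that wrote x has already committed or aborted.

def is_strict(h):
    pending = {}              # item -> set of txns with unresolved writes
    for (op, t, x) in h:
        if op in ('r', 'w'):
            # any *other* txn with an outstanding write on x is a violation
            if pending.get(x, set()) - {t}:
                return False
            if op == 'w':
                pending.setdefault(x, set()).add(t)
        elif op in ('c', 'a'):
            for writers in pending.values():
                writers.discard(t)
    return True

# the slide's first strict and first non-strict examples
print(is_strict([('w',1,'x'), ('c',1,None), ('w',2,'x'), ('a',2,None)]))  # True
print(is_strict([('w',1,'x'), ('w',2,'x'), ('a',1,None), ('a',2,None)]))  # False
```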
3.4 Two-Phase Locking
• Basic locking - Each transaction sets a lock on each
data item before accessing the data
– the lock is a reservation
– there are read locks and write locks
– if one transaction has a write lock on x, then no other
transaction can have any lock on x
• Example
– rli[x], rui[x], wli[x], wui[x] denote lock/unlock operations
– wl1[x] w1[x] rl2[x] r2[x] is impossible
– wl1[x] w1[x] wu1[x] rl2[x] r2[x] is OK
Basic Locking Isn’t Enough
• Basic locking doesn’t guarantee serializability
• rl1[x] r1[x] ru1[x] wl1[y] w1[y] wu1[y] c1
rl2[y] r2[y] wl2[x] w2[x] ru2[y] wu2[x] c2
• Eliminating the lock operations, we have
r1[x] r2[y] w2[x] c2 w1[y] c1, which isn’t SR
• The problem is that locks aren’t being released
properly.
Two-Phase Locking (2PL) Protocol
• A transaction is two-phase locked if:
– before reading x, it sets a read lock on x
– before writing x, it sets a write lock on x
– it holds each lock until after it executes the
corresponding operation
– after its first unlock operation, it requests no new locks
• Each transaction sets locks during a growing phase
and releases them during a shrinking phase.
• Example - on the previous page T2 is two-phase
locked, but not T1 since ru1[x] < wl1[y]
– use “<” for “precedes”
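The "no new locks after the first unlock" rule can be checked per transaction. A sketch that parses the slides' own lock-operation notation as strings (the parsing scheme is an assumption for illustration):

```python
# Two-phase check for one transaction: once it has released any lock,
# it must request no new locks. Ops are strings like 'rl1[x]' (read-
# lock), 'wl1[x]' (write-lock), 'ru1[x]'/'wu1[x]' (unlocks), 'c1'.

def is_two_phase(ops, txn):
    unlocked = False
    for op in ops:
        kind, rest = op[:2], op[2:]
        if rest.split('[')[0] != str(txn):
            continue                      # another txn's op, or r/w/c
        if kind in ('ru', 'wu'):
            unlocked = True
        elif kind in ('rl', 'wl') and unlocked:
            return False                  # new lock after an unlock
    return True

# the execution from "Basic Locking Isn't Enough"
h = ['rl1[x]', 'r1[x]', 'ru1[x]', 'wl1[y]', 'w1[y]', 'wu1[y]', 'c1',
     'rl2[y]', 'r2[y]', 'wl2[x]', 'w2[x]', 'ru2[y]', 'wu2[x]', 'c2']
print(is_two_phase(h, 1), is_two_phase(h, 2))  # False True
```

It confirms the slide's diagnosis: T2 is two-phase locked, T1 is not, since ru1[x] < wl1[y].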
2PL Theorem: If all transactions in an execution are
two-phase locked, then the execution is SR.
Proof: Define Ti ⇒ Tk if either
– Ti read x and Tk later wrote x, or
– Ti wrote x and Tk later read or wrote x
• If Ti ⇒ Tk, then Ti released a lock before Tk
obtained some lock.
• If Ti ⇒ Tk ⇒ Tm, then Ti released a lock before Tm
obtained some lock (because Tk is two-phase).
• If Ti ⇒...⇒ Ti, then Ti released a lock before Ti
obtained some lock, breaking the 2-phase rule.
• So there cannot be a cycle. By the Serializability
Theorem, the execution is SR.
2PL and Recoverability
• 2PL does not guarantee recoverability
• This non-recoverable execution is 2-phase locked
wl1[x] w1[x] wu1[x] rl2[x] r2[x] c2 … c1
– hence, it is not strict and allows cascading aborts
• However, holding write locks until after commit or
abort guarantees strictness
– and hence avoids cascading aborts and is recoverable
– In the above example, T1 must commit before its first
unlock-write (wu1): wl1[x] w1[x] c1 wu1[x] rl2[x] r2[x] c2
Automating Locking
• 2PL can be hidden from the application
• When a data manager gets a Read or Write
operation from a transaction, it sets a read or write
lock.
• How does the data manager know it’s safe to
release locks (and be two-phase)?
• Ordinarily, the data manager holds a transaction’s
locks until it commits or aborts. A data manager
– can release read locks after it receives commit
– releases write locks only after processing commit,
to ensure strictness
3.5 Preserving Transaction Handshakes
• Read and Write are the only operations the
system will control to attain serializability.
• So, if transactions communicate via messages,
then implement SendMsg as Write, and
ReceiveMsg as Read.
• Else, you could have the following:
w1[x] r2[x] send2[M] receive1[M]
– data manager didn’t know about send/receive and
thought the execution was SR.
• Also watch out for brain transport
Transactions Can Communicate via Brain
Transport
[Diagram: T1: Start … Display output … Commit. The user reads the
displayed output and later enters it as input (“brain transport”).
T2: Start … Get input from display … Commit]
Brain Transport (cont’d)
• For practical purposes, if user waits for T1 to
commit before starting T2, then the data manager
can ignore brain transport.
• This is called a transaction handshake
(T1 commits before T2 starts)
• Reason - Locking preserves the order imposed by
transaction handshakes
– e.g., it serializes T1 before T2.
2PL Preserves Transaction Handshakes
• Recall the definition: Ti commits before Tk starts
• 2PL serializes txns consistent with all transaction
handshakes. I.e. there’s an equivalent serial
execution that preserves the transaction order of
transaction handshakes
• This isn’t true for arbitrary SR executions. E.g.
– r1[x] w2[x] c2 r3[y] c3 w1[y] c1
– T2 commits before T3 starts, but the only equivalent
serial execution is T3 T1 T2
– rl1[x] r1[x] wl1[y] ru1[x] wl2[x] w2[x] wu2[x] c2
2PL Preserves Transaction
Handshakes (cont’d)
• Stating this more formally …
• Theorem:
For any 2PL execution H,
there is an equivalent serial execution Hs,
such that for all Ti, Tk,
if Ti committed before Tk started in H,
then Ti precedes Tk in Hs.
Brain Transport - One Last Time
• If a user reads committed displayed output of Ti
and uses that displayed output as input to
transaction Tk, then he/she should wait for
Ti to commit before starting Tk.
• The user can then rely on transaction handshake
preservation to ensure Ti is serialized before Tk.
3.6 Implementing Two-Phase Locking
• Even if you never implement a DB system, it’s
valuable to understand locking implementation,
because it can have a big effect on performance.
• A data manager implements locking by
– implementing a lock manager
– setting a lock for each Read and Write
– handling deadlocks
System Model
[Diagram: Transactions 1 … N issue Start, SQL Ops, Commit, Abort to
the Database System, which is layered as Query Optimizer → Query
Executor → Access Method (record-oriented files) → Page-oriented
Files → Database]
How to Implement SQL
• Query Optimizer - translates SQL into an ordered
expression of relational DB operators (Select,
Project, Join)
• Query Executor - executes the ordered expression
by running a program for each operator, which in
turn accesses records of files
• Access methods - provides indexed record-at-a-
time access to files (OpenScan, GetNext, …)
• Page-oriented files - Read or Write (page address)
Which Operations Get Synchronized?
[Diagram: each layer offers candidate operations to synchronize -
SQL operations (Query Optimizer, Query Executor), record-oriented
operations (Access Method), page-oriented operations (Page-oriented
Files)]
• It’s a tradeoff between
– amount of concurrency and
– overhead and complexity of synchronization
Lock Manager
• A lock manager services the operations
– Lock(trans-id, data-item-id, mode)
– Unlock(trans-id, data-item-id)
– Unlock(trans-id)
• It stores locks in a lock table. Lock op inserts
[trans-id, mode] in the table. Unlock deletes it.
Data Item | List of Locks | Wait List
x         | [T1,r] [T2,r] | [T3,w]
y         | [T4,w]        | [T5,w] [T6,r]
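A minimal sketch of this interface and lock table in Python. It keeps only the data structure and grant/wait logic; names are invented, and real lock managers also need latching, hashing by data-item-id, and much lower per-call cost (see below):

```python
from collections import defaultdict

class LockManager:
    """Toy lock table: 'r' locks are compatible with each other;
    'w' conflicts with everything. Waiters queue in FIFO order."""

    def __init__(self):
        self.locks = defaultdict(list)      # item -> [(txn, mode), ...]
        self.waiters = defaultdict(list)    # item -> [(txn, mode), ...]

    def lock(self, txn, item, mode):
        """Grant if compatible with all current holders and no one is
        already queued; otherwise enqueue. Returns True iff granted."""
        holders = self.locks[item]
        compatible = all(m == 'r' and mode == 'r'
                         for (t, m) in holders if t != txn)
        if compatible and not self.waiters[item]:
            holders.append((txn, mode))
            return True
        self.waiters[item].append((txn, mode))
        return False                        # caller blocks

    def unlock(self, txn, item):
        self.locks[item] = [(t, m) for (t, m) in self.locks[item] if t != txn]
        # promote compatible waiters in FIFO order
        while self.waiters[item]:
            _, m = self.waiters[item][0]
            if all(m == 'r' and m2 == 'r' for (_, m2) in self.locks[item]):
                self.locks[item].append(self.waiters[item].pop(0))
            else:
                break

lm = LockManager()
lm.lock(1, 'x', 'r'); lm.lock(2, 'x', 'r')  # both granted, as in the table
print(lm.lock(3, 'x', 'w'))                 # False: T3 joins the wait list
```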
Lock Manager (cont’d)
• Caller generates data-item-id, e.g. by hashing data
item name
• The lock table is hashed on data-item-id
• Lock and Unlock must be atomic, so access to the
lock table must be “locked”
• Lock and Unlock are called frequently. They must
be very fast. Average < 100 instructions.
– This is hard, in part due to slow compare-and-swap
operations needed for atomic access to lock table
Lock Manager (cont’d)
• In MS SQL Server
– Locks are approx 32 bytes each.
– Each lock contains a Database-ID, Object-Id, and other
resource-specific lock information such as record id
(RID) or key.
– Each lock is attached to lock resource block (64 bytes)
and lock owner block (32 bytes)
Locking Granularity
• Granularity - size of data items to lock
– e.g., files, pages, records, fields
• Coarse granularity implies
– very few locks, so little locking overhead
– must lock large chunks of data, so high chance of
conflict, so concurrency may be low
• Fine granularity implies
– many locks, so high locking overhead
– locking conflict occurs only when two transactions try to
access the exact same data concurrently
• High performance TP requires record locking
Multigranularity Locking (MGL)
• Allow different txns to lock at different granularity
– big queries should lock coarse-grained data (e.g. tables)
– short transactions lock fine-grained data (e.g. rows)
• Lock manager can’t detect these conflicts
– each data item (e.g., table or row) has a different id
• Multigranularity locking “trick”
– exploit the natural hierarchy of data containment
– before locking fine-grained data, set intention locks on coarse
grained data that contains it
– e.g., before setting a read-lock on a row, get an
intention-read-lock on the table that contains the row
– An intention-read lock conflicts with a write lock
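The intention-lock trick amounts to a compatibility matrix over modes. The table below is the standard r/w/ir/iw matrix (riw omitted for brevity); the dictionary encoding is just for illustration:

```python
# True = a new request in the row mode can coexist with a held lock
# in the column mode. 'ir'/'iw' are intention-read / intention-write.

MODES = ('r', 'w', 'ir', 'iw')
COMPAT = {
    ('r',  'r'): True,  ('r',  'w'): False, ('r',  'ir'): True,  ('r',  'iw'): False,
    ('w',  'r'): False, ('w',  'w'): False, ('w',  'ir'): False, ('w',  'iw'): False,
    ('ir', 'r'): True,  ('ir', 'w'): False, ('ir', 'ir'): True,  ('ir', 'iw'): True,
    ('iw', 'r'): False, ('iw', 'w'): False, ('iw', 'ir'): True,  ('iw', 'iw'): True,
}

print(COMPAT[('ir', 'w')])   # False: a row-reader's table-level ir blocks
                             # (and is blocked by) a table-level writer
print(COMPAT[('ir', 'iw')])  # True: two fine-grained txns only conflict
                             # at the row level, not at the table level
```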
3.7 Deadlocks
• A set of transactions is deadlocked if every
transaction in the set is blocked and will remain
blocked unless the system intervenes.
– Example rl1[x] granted
rl2[y] granted
wl2[x] blocked
wl1[y] blocked and deadlocked
• Deadlock is 2PL’s way to avoid non-SR executions
– rl1[x] r1[x] rl2[y] r2[y] … can’t run w2[x] w1[y] and be SR
• To repair a deadlock, you must abort a transaction
– if you released a transaction’s lock without aborting it,
the transaction would no longer be two-phase locked
Deadlock Prevention
• Never grant a lock that can lead to deadlock
• Often advocated in operating systems
• Useless for TP, because it would require running
transactions serially.
– Example to prevent the previous deadlock,
rl1[x] rl2[y] wl2[x] wl1[y], the system can’t grant rl2[y]
• Avoiding deadlock by resource ordering is unusable
in general, since it overly constrains applications.
– But may help for certain high frequency deadlocks
• Setting all locks when txn begins requires too much
advance knowledge and reduces concurrency.
Deadlock Detection
• Detection approach: Detect deadlocks automatically,
and abort a deadlocked transaction (the victim).
• It’s the preferred approach, because it
– allows higher resource utilization and
– uses cheaper algorithms
• Timeout-based deadlock detection - If a transaction
is blocked for too long, then abort it.
– Simple and easy to implement
– But aborts unnecessarily and
– some deadlocks persist for too long
Detection Using Waits-For
Graph
• Explicit deadlock detection - Use a Waits-For Graph
– Nodes = {transactions}
– Edges = {Ti → Tk | Ti is waiting for Tk to release a lock}
– Example (previous deadlock): T1 → T2 and T2 → T1
• Theorem: If there’s a deadlock, then the waits-for
graph has a cycle.
Detection Using Waits-For
Graph (cont’d)
• So, to find deadlocks
– when a transaction blocks, add an edge to the graph
– periodically check for cycles in the waits-for graph
• Don’t test for deadlocks too often. (A cycle won’t
disappear until you detect it and break it.)
• When a deadlock is detected, select a victim from
the cycle and abort it.
• Select a victim that hasn’t done much work
(e.g., has set the fewest locks).
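The detection-plus-victim-selection loop above can be sketched in a few lines. The function names and the lock-count map are illustrative:

```python
# Waits-for-graph deadlock detection: find a cycle, then abort the
# transaction in the cycle holding the fewest locks.

def find_cycle(edges):
    """edges: set of (waiter, holder) pairs; returns one cycle
    as a list of txns, or None if the graph is acyclic."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    for start in graph:
        stack = [(start, [start])]
        while stack:
            node, trail = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt == start:
                    return trail              # trail closes back to start
                if nxt not in trail:
                    stack.append((nxt, trail + [nxt]))
    return None

def pick_victim(cycle, locks_held):
    """Victim = cycle member with the fewest locks set."""
    return min(cycle, key=lambda t: locks_held[t])

# the deadlock from 3.7: T1 waits for T2 (wl1[y]), T2 waits for T1 (wl2[x])
cycle = find_cycle({(1, 2), (2, 1)})
print(cycle)                              # a cycle through T1 and T2
print(pick_victim(cycle, {1: 3, 2: 1}))   # 2: it has set the fewest locks
```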
Cyclic Restart
• Transactions can cause each other to abort forever.
– T1 starts running. Then T2 starts running.
– They deadlock and T1 (the oldest) is aborted.
– T1 restarts, bumps into T2 and again deadlocks
– T2 (now the oldest) is aborted ...
• Choosing the youngest in a cycle as victim avoids
cyclic restart, since the oldest transaction is never
the victim.
• Can combine with other heuristics, e.g. fewest-locks
MS SQL Server
• Aborts the transaction that is “cheapest” to roll
back.
– “Cheapest” is determined by the amount of log
generated.
– Allows transactions that you’ve invested a lot in to
complete.
• SET DEADLOCK_PRIORITY LOW
(vs. NORMAL) causes a transaction to sacrifice
itself as a victim.
Distributed Locking
• Suppose a transaction can access data at many
data managers
• Each data manager sets locks in the usual way
• When a transaction commits or aborts, it runs
two-phase commit to notify all data managers it
accessed
• The only remaining issue is distributed deadlock
Distributed Deadlock
• The deadlock spans two nodes.
Neither node alone can see it.
• Timeout-based detection is popular. Its weaknesses
are less important in the distributed case:
– aborts unnecessarily and some deadlocks persist too long
– possibly abort younger unblocked transaction to avoid
cyclic restart
Node 1: rl1[x]    wl2[x] (blocked)
Node 2: rl2[y]    wl1[y] (blocked)
Oracle Deadlock Handling
• Uses a waits-for graph for single-server
deadlock detection.
• The transaction that detects the deadlock is
the victim.
• Uses timeouts to detect distributed
deadlocks.
Fancier Dist’d Deadlock Detection
• Use waits-for graph cycle detection with a central
deadlock detection server
– more work than timeout-based detection, and no
evidence it does better, performance-wise
– phantom deadlocks? - No, because each waits-for edge
is an SG edge. So, WFG cycle => SG cycle
(modulo spontaneous aborts)
• Path pushing - Send paths Ti→ … → Tk to each
node where Tk might be blocked.
– Detects short cycles quickly
– Hard to know where to send paths.
Possibly too many messages
What’s Coming in Part Two?
• Locking Performance
• A more detailed look at multigranularity
locking
• Hot spot techniques
• Query-Update Techniques
• Phantoms
• B-Trees and Tree locking
Locking Performance
• The following is oversimplified. We’ll revisit it.
• Deadlocks are rare.
– Typically 1-2% of transactions deadlock.
• Locking performance problems are not rare.
• The problem is too much blocking.
• The solution is to reduce the “locking load”
• Good heuristic – If more than 30% of transactions
are blocked, then reduce the number of concurrent
transactions
First section of Concurrency
Control Part Two if there’s
time
11.6 Locking Performance
• Deadlocks are rare
– up to 1% - 2% of transactions deadlock
• The one exception to this is lock conversions
– r-lock a record and later upgrade to w-lock
– e.g., Ti = read(x) … write(x)
– if two txns do this concurrently, they’ll deadlock
(both get an r-lock on x before either gets a w-lock)
– To avoid lock conversion deadlocks, get a w-lock first
and down-grade to an r-lock if you don’t need to write.
– Use SQL Update statement or explicit program hints
Conversions in MS SQL Server
• Update-lock prevents lock conversion deadlock.
– Conflicts with other update and write locks, but not
with read locks.
– Only on pages and rows (not tables)
• You get an update lock by using the UPDLOCK
hint in the FROM clause
Select Foo.A
From Foo (UPDLOCK)
Where Foo.B = 7
Blocking and Lock Thrashing
[Graph: throughput (low → high) vs. number of active txns
(low → high); throughput rises, peaks, then falls past the
thrashing point]
• The locking performance problem is too much delay due to
blocking
– little delay until locks are saturated
– then major delay, due to the locking bottleneck
– thrashing - the point where throughput decreases with increasing
load
More on Thrashing
• It’s purely a blocking problem
– It happens even when the abort rate is low
• As number of transactions increase
– each additional transaction is more likely to block
– but first, it gathers some locks, increasing the
probability others will block (negative feedback)
Avoiding Thrashing
• If over 30% of active transactions are blocked,
then the system is (nearly) thrashing
so reduce the number of active transactions
• Timeout-based deadlock detection mistakes
– They happen due to long lock delays
– So the system is probably close to thrashing
– So if deadlock detection rate is too high (over 2%)
reduce the number of active transactions
Interesting Sidelights
• By getting all locks before transaction Start, you
can increase throughput at the thrashing point
because blocked transactions hold no locks
– But it assumes you get exactly the locks you need
and retries of get-all-locks are cheap
• Pure restart policy - abort when there’s a conflict
and restart when the conflict disappears
– If aborts are cheap and there’s low contention for
other resources, then this policy produces higher
throughput before thrashing than a blocking policy
– But response time is greater than a blocking policy
How to Reduce Lock Contention
• If each transaction holds a lock L for t seconds,
then the maximum throughput is 1/t txns/second
[Timeline: Start … Lock L … Commit, where t is the time L is held]
• To increase throughput, reduce t (lock holding time)
– Set the lock later in the transaction’s execution
(e.g., defer updates till commit time)
– Reduce transaction execution time (reduce path length, read from disk before setting locks)
– Split a transaction into smaller transactions
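The 1/t bound made concrete (the 10 ms figure is an assumed example, not from the text):

```python
t = 0.010            # assumed lock holding time on L, in seconds
print(1 / t)         # 100.0 txns/second ceiling through lock L
print(1 / (t / 2))   # 200.0: halving t doubles the ceiling, which is
                     # why deferring updates to commit time helps
```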
Reducing Lock Contention (cont’d)
• Reduce number of conflicts
– Use finer grained locks, e.g., by partitioning tables
vertically
Before: (Part#, Price, OnHand, PartName, CatalogPage)
After:  (Part#, Price, OnHand) and (Part#, PartName, CatalogPage)
– Use record-level locking (i.e., select a database
system that supports it)
More Related Content

PDF
3 concurrency controlone_v3
PPT
Quick Sort
PPT
Chapter18
PPT
Quick sort Algorithm Discussion And Analysis
PPTX
Algorithm - Mergesort & Quicksort
PPT
3.8 quick sort
PPTX
Analysis of Algorithm (Bubblesort and Quicksort)
PPT
Algorithm: Quick-Sort
3 concurrency controlone_v3
Quick Sort
Chapter18
Quick sort Algorithm Discussion And Analysis
Algorithm - Mergesort & Quicksort
3.8 quick sort
Analysis of Algorithm (Bubblesort and Quicksort)
Algorithm: Quick-Sort

What's hot (20)

PPTX
Merge sort and quick sort
PPT
Quicksort
PPT
3.8 quicksort
PDF
Algorithms
PDF
Quick sort algorithn
PDF
SVM (Support Vector Machine & Kernel)
PDF
Concur15slides
PPT
Presentation on binary search, quick sort, merge sort and problems
PPTX
Quick sort
PPT
Chapter 5
PDF
QX Simulator and quantum programming - 2020-04-28
PDF
PPTX
7. transaction mang
DOCX
Discrete control2
DOCX
Discrete control
PDF
HiPEAC'19 Tutorial on Quantum algorithms using QX - 2019-01-23
PPTX
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
PPTX
PPTX
Spanner osdi2012
Merge sort and quick sort
Quicksort
3.8 quicksort
Algorithms
Quick sort algorithn
SVM (Support Vector Machine & Kernel)
Concur15slides
Presentation on binary search, quick sort, merge sort and problems
Quick sort
Chapter 5
QX Simulator and quantum programming - 2020-04-28
7. transaction mang
Discrete control2
Discrete control
HiPEAC'19 Tutorial on Quantum algorithms using QX - 2019-01-23
Discretized Stream - Fault-Tolerant Streaming Computation at Scale - SOSP
Spanner osdi2012
Ad

Viewers also liked (20)

PPT
Gerbang Logika
PPTX
Artefact powerpoint
PPT
Physical Properties Lab
PPT
Let them eat fish.ver2
PPTX
Tarea de wish , if y would
PDF
Government - recommendations from AIGLIA2014
PPTX
Asheville Horizon Art
PDF
Exact MKB Cloud Barometer presentatie tijdens EuroCloud Awards
PDF
LBC: Kudavi
PDF
Modul e book wenni mts-muh 1 bjm
PPTX
1041 Sophomores LR Lecture 1
PDF
Presentacionblog
PDF
PPTX
Χριστούγεννα, η γιορτή της ενανθρώπησης του θεού
PPT
Power Notes Atomic Structure Day 3
PPTX
Materi CSS lanjut
DOCX
Tugas 2 Share data windows dengan kabel utp
DOCX
Hout plaatmateriaal
PPT
Программа Женское здоровье
Gerbang Logika
Artefact powerpoint
Physical Properties Lab
Let them eat fish.ver2
Tarea de wish , if y would
Government - recommendations from AIGLIA2014
Asheville Horizon Art
Exact MKB Cloud Barometer presentatie tijdens EuroCloud Awards
LBC: Kudavi
Modul e book wenni mts-muh 1 bjm
1041 Sophomores LR Lecture 1
Presentacionblog
Χριστούγεννα, η γιορτή της ενανθρώπησης του θεού
Power Notes Atomic Structure Day 3
Materi CSS lanjut
Tugas 2 Share data windows dengan kabel utp
Hout plaatmateriaal
Программа Женское здоровье
Ad

Similar to 3 concurrencycontrolone (20)

PPT
Transaction_processingkind of best ppt .ppt
PPTX
Chapter_NINE_Concurrency_Control_DBMS.pptx
PPTX
7. CSEN3101-MIV-TransactionProcessing-SolvedProblems-06.11.2023.pptx
PPTX
CHapter four database managementm04.pptx
PPTX
Characteristics Schedule based on Recover-ability & Serial-ability
PPT
5-Chapter Five - Concurrenc.ppt
PPTX
recoverability and serializability dbms
PPT
concurrencycontrol_databasemanagement_system
PPTX
db unit 4 dbms protocols in transaction
PDF
Concurrency control iN Advanced Database
PPTX
Concurrency control
PPTX
Transaction.pptx
PPTX
Chapter 4-Concrruncy controling techniques.pptx
PDF
concurrencycontrol from power pint pdf a
PPTX
Transaction of program execution updates
PPT
Chapter Three _Concurrency Control Techniques_ETU.ppt
PDF
Transaction & Concurrency Control
PDF
Advance_DBMS-Lecture_notesssssssssssssssss.pdf
PPT
215-Database-Recovery presentation document
PDF
UNIT 2- TRANSACTION CONCEPTS AND CONCURRENCY CONCEPTS (1).pdf
Transaction_processingkind of best ppt .ppt
Chapter_NINE_Concurrency_Control_DBMS.pptx
7. CSEN3101-MIV-TransactionProcessing-SolvedProblems-06.11.2023.pptx
CHapter four database managementm04.pptx
Characteristics Schedule based on Recover-ability & Serial-ability
5-Chapter Five - Concurrenc.ppt
recoverability and serializability dbms
concurrencycontrol_databasemanagement_system
db unit 4 dbms protocols in transaction
Concurrency control iN Advanced Database
Concurrency control
Transaction.pptx
Chapter 4-Concrruncy controling techniques.pptx
concurrencycontrol from power pint pdf a
Transaction of program execution updates
Chapter Three _Concurrency Control Techniques_ETU.ppt
Transaction & Concurrency Control
Advance_DBMS-Lecture_notesssssssssssssssss.pdf
215-Database-Recovery presentation document
UNIT 2- TRANSACTION CONCEPTS AND CONCURRENCY CONCEPTS (1).pdf

Recently uploaded (20)

PPTX
20th Century Theater, Methods, History.pptx
PPTX
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
PDF
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
PDF
LDMMIA Reiki Yoga Finals Review Spring Summer
PDF
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
PDF
David L Page_DCI Research Study Journey_how Methodology can inform one's prac...
PDF
CISA (Certified Information Systems Auditor) Domain-Wise Summary.pdf
PDF
Complications of Minimal Access-Surgery.pdf
PDF
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
PDF
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
PPTX
Introduction to pro and eukaryotes and differences.pptx
DOCX
Cambridge-Practice-Tests-for-IELTS-12.docx
PDF
Weekly quiz Compilation Jan -July 25.pdf
PDF
Trump Administration's workforce development strategy
PDF
احياء السادس العلمي - الفصل الثالث (التكاثر) منهج متميزين/كلية بغداد/موهوبين
PPTX
B.Sc. DS Unit 2 Software Engineering.pptx
PDF
Paper A Mock Exam 9_ Attempt review.pdf.
PDF
Chinmaya Tiranga quiz Grand Finale.pdf
PDF
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
PDF
Τίμαιος είναι φιλοσοφικός διάλογος του Πλάτωνα
20th Century Theater, Methods, History.pptx
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
LDMMIA Reiki Yoga Finals Review Spring Summer
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
David L Page_DCI Research Study Journey_how Methodology can inform one's prac...
CISA (Certified Information Systems Auditor) Domain-Wise Summary.pdf
Complications of Minimal Access-Surgery.pdf
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
Introduction to pro and eukaryotes and differences.pptx
Cambridge-Practice-Tests-for-IELTS-12.docx
Weekly quiz Compilation Jan -July 25.pdf
Trump Administration's workforce development strategy
احياء السادس العلمي - الفصل الثالث (التكاثر) منهج متميزين/كلية بغداد/موهوبين
B.Sc. DS Unit 2 Software Engineering.pptx
Paper A Mock Exam 9_ Attempt review.pdf.
Chinmaya Tiranga quiz Grand Finale.pdf
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
Τίμαιος είναι φιλοσοφικός διάλογος του Πλάτωνα

3 concurrencycontrolone

  • 1. 3. Concurrency Control for Transactions Part One CSEP 545 Transaction Processing Philip A. Bernstein Copyright ©2003 Philip A. Bernstein
  • 2. Outline 1. A Simple System Model 2. Serializability Theory 3. Synchronization Requirements for Recoverability 4. Two-Phase Locking 5. Preserving Transaction Handshakes 6. Implementing Two-Phase Locking 7. Deadlocks
  • 3. 3.1 A Simple System Model • Goal - Ensure serializable (SR) executions • Implementation technique - Delay operations that would lead to non-SR results (e.g. set locks on shared data) • For good performance minimize overhead and delay from synchronization operations • First, we’ll study how to get correct (SR) results • Then, we’ll study performance implications (mostly in Part Two)
  • 4. Assumption - Atomic Operations • We will synchronize Reads and Writes. • We must therefore assume they’re atomic – else we’d have to synchronize the finer-grained operations that implement Read and Write • Read(x) - returns the current value of x in the DB • Write(x, val) overwrites all of x (the whole page) • This assumption of atomic operations is what allows us to abstract executions as sequences of reads and writes (without loss of information). – Otherwise, what would wk[x] ri[x] mean? • Also, commit (ci) and abort (ai) are atomic
  • 5. System Model Transaction 1 Transaction N Start, Commit, Abort Read(x), Write(x) Data Manager Database Transaction 2
  • 6. 3.2 Serializability Theory • The theory is based on modeling executions as histories, such as H1 = r1[x] r2[x] w1[x] c1 w2[y] c2 • First, characterize a concurrency control algorithm by the properties of histories it allows. • Then prove that any history having these properties is SR • Why bother? It helps you understand why concurrency control algorithms work.
  • 7. Equivalence of Histories • Two operations conflict if their execution order affects their return values or the DB state. – a read and write on the same data item conflict – two writes on the same data item conflict – two reads (on the same data item) do not conflict • Two histories are equivalent if they have the same operations and conflicting operations are in the same order in both histories – because only the relative order of conflicting operations can affect the result of the histories
  • 8. Examples of Equivalence • The following histories are equivalent H1 = r1[x] r2[x] w1[x] c1 w2[y] c2 H2 = r2[x] r1[x] w1[x] c1 w2[y] c2 H3 = r2[x] r1[x] w2[y] c2 w1[x] c1 H4 = r2[x] w2[y] c2 r1[x] w1[x] c1 • But none of them are equivalent to H5 = r1[x] w1[x] r2[x] c1 w2[y] c2 because r2[x] and w1[x] conflict and r2[x] precedes w1[x] in H1 - H4, but w1[x] precedes r2[x] in H5.
  • 9. Serializable Histories • A history is serializable if it is equivalent to a serial history • For example, H1 = r1[x] r2[x] w1[x] c1 w2[y] c2 is equivalent to H4 = r2[x] w2[y] c2 r1[x] w1[x] c1 (r2[x] and w1[x] are in the same order in H1 and H4.) • Therefore, H1 is serializable.
  • 10. Another Example • H6 = r1[x] r2[x] w1[x] r3[x] w2[y] w3[x] c3 w1[y] c1 c2 is equivalent to a serial execution of T2 T1 T3, H7 = r2[x] w2[y] c2 r1[x] w1[x] w1[y] c1 r3[x] w3[x] c3 • Each conflict implies a constraint on any equivalent serial history. In H6 the conflicts induce the constraints T2→T1 (r2[x] before w1[x]), T1→T3 (w1[x] before r3[x] and w3[x]), T2→T1 (w2[y] before w1[y]), and T2→T3 (r2[x] before w3[x]).
  • 11. Serialization Graphs • A serialization graph, SG(H), for history H tells the effective execution order of transactions in H. • Given history H, SG(H) is a directed graph whose nodes are the committed transactions and whose edges are all Ti→ Tk such that at least one of Ti’s operations precedes and conflicts with at least one of Tk’s operations H6 = r1[x] r2[x] w1[x] r3[x] w2[y] w3[x] c3 w1[y] c1 c2 SG(H6) = T2 →T1 →T3
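The definition above is mechanical enough to sketch in code (a sketch with my own function names, not from the lecture): build SG(H) from a history of (kind, txn, item) tuples, with commit written as ('c', i, None), then test the graph for a cycle by repeatedly peeling off nodes with no incoming edge (Kahn's algorithm).

```python
# Sketch: SG(H) has an edge Ti -> Tk for each pair of conflicting operations
# of committed transactions where Ti's operation comes first.
def serialization_graph(history):
    committed = {t for k, t, _ in history if k == 'c'}
    edges = set()
    for i, (k1, t1, x1) in enumerate(history):
        for k2, t2, x2 in history[i + 1:]:
            if (t1 != t2 and t1 in committed and t2 in committed
                    and x1 is not None and x1 == x2 and 'w' in (k1, k2)):
                edges.add((t1, t2))          # a conflict induces edge t1 -> t2
    return edges

def is_serializable(history):
    edges = serialization_graph(history)
    remaining = {t for e in edges for t in e}
    while remaining:                          # peel off nodes with no incoming edge
        sources = {n for n in remaining
                   if not any(b == n and a in remaining for a, b in edges)}
        if not sources:
            return False                      # every node has a predecessor: a cycle
        remaining -= sources
    return True

H6 = [('r',1,'x'), ('r',2,'x'), ('w',1,'x'), ('r',3,'x'), ('w',2,'y'),
      ('w',3,'x'), ('c',3,None), ('w',1,'y'), ('c',1,None), ('c',2,None)]
print(serialization_graph(H6))    # edges T2->T1, T1->T3, T2->T3
print(is_serializable(H6))        # True: SG(H6) is acyclic
```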
  • 12. The Serializability Theorem A history H is SR if and only if SG(H) is acyclic. Proof: (if) SG(H) is acyclic. So let Hs be a serial history consistent with SG(H). Each pair of conflicting ops in H induces an edge in SG(H). Since conflicting ops in Hs and H are in the same order, Hs ≡ H, so H is SR. (only if) H is SR. Let Hs be a serial history equivalent to H. Claim that if Ti → Tk in SG(H), then Ti precedes Tk in Hs (else Hs would not be equivalent to H). So if SG(H) had a cycle, T1→T2→…→Tn→T1, then T1 would precede T1 in Hs, a contradiction. Hence SG(H) is acyclic.
  • 13. How to Use the Serializability Theorem • Characterize the set of histories that a concurrency control algorithm allows • Prove that any such history must have an acyclic serialization graph. • Therefore, the algorithm guarantees SR executions. • We’ll use this soon to prove that locking produces serializable executions.
  • 14. 3.3 Synchronization Requirements for Recoverability • In addition to guaranteeing serializability, synchronization is needed to implement abort easily. • When a transaction T aborts, the data manager wipes out all of T’s effects, including – undoing T’s writes that were applied to the DB, and – aborting transactions that read values written by T (these are called cascading aborts) • Example - w1[x] r2[x] w2[y] – to abort T1, we must undo w1[x] and abort T2 (a cascading abort)
  • 15. Recoverability • If Tk reads from Ti and Ti aborts, then Tk must abort – Example - w1[x] r2[x] a1 implies T2 must abort • But what if Tk already committed? We’d be stuck. – Example - w1[x] r2[x] c2 a1 – T2 can’t abort after it commits • Executions must be recoverable: A transaction T’s commit operation must follow the commit of every transaction from which T read. – Recoverable - w1[x] r2[x] c1 c2 – Not recoverable - w1[x] r2[x] c2 a1 • Recoverability requires synchronizing operations.
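The recoverability condition can be checked mechanically. Below is a sketch in the same triple representation (my own names, simplified: "Tk reads from Ti" is taken to mean "Tk reads an item last written by Ti", and the effect of aborts on the writer bookkeeping is ignored for brevity):

```python
# Sketch: each commit must follow the commits of every transaction it read from.
def is_recoverable(history):
    last_writer = {}    # item -> txn that wrote it most recently
    reads_from = {}     # txn  -> set of txns it read from
    committed = set()
    for kind, txn, item in history:
        if kind == 'w':
            last_writer[item] = txn
        elif kind == 'r' and last_writer.get(item, txn) != txn:
            reads_from.setdefault(txn, set()).add(last_writer[item])
        elif kind == 'c':
            if reads_from.get(txn, set()) - committed:
                return False      # commits before a txn it read from: not recoverable
            committed.add(txn)
    return True

# w1[x] r2[x] c1 c2 is recoverable; w1[x] r2[x] c2 a1 is not.
print(is_recoverable([('w',1,'x'), ('r',2,'x'), ('c',1,None), ('c',2,None)]))  # True
print(is_recoverable([('w',1,'x'), ('r',2,'x'), ('c',2,None), ('a',1,None)]))  # False
```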
  • 16. Avoiding Cascading Aborts • Cascading aborts are worth avoiding to – avoid complex bookkeeping, and – avoid an uncontrolled number of forced aborts • To avoid cascading aborts, a data manager should ensure transactions only read committed data • Example – avoids cascading aborts: w1[x] c1 r2[x] – allows cascading aborts: w1[x] r2[x] a1 • A system that avoids cascading aborts also guarantees recoverability.
  • 17. Strictness • It’s convenient to undo a write, w[x], by restoring its before image (=the value of x before w[x] executed) • Example - w1[x,1] writes the value “1” into x. – w1[x,1] w1[y,3] c1 w2[y,1] r2[x] a2 – abort T2 by restoring the before image of w2[y,1], = 3 • But this isn’t always possible. – For example, consider w1[x,2] w2[x,3] a1 a2 – a1 & a2 can’t be implemented by restoring before images – notice that w1[x,2] w2[x,3] a2 a1 would be OK
  • 18. Strictness (cont’d) • More precisely, a system is strict if it only executes ri[x] or wi[x] if all previous transactions that wrote x committed or aborted. • Examples (“…” marks a non-strict prefix) – strict: w1[x] c1 w2[x] a2 – not strict: w1[x] w2[x] … a1 a2 – strict: w1[x] w1[y] c1 w2[y] r2[x] a2 – not strict: w1[x] w1[y] w2[y] a1 r2[x] a2 • “Strict” implies “avoids cascading aborts.”
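The definition above translates directly into a checker (a sketch with names of my own): track which transactions have an outstanding write on each item, and reject any read or write of an item while some other transaction's write on it is still pending.

```python
# Sketch: a history is strict if no ri[x] or wi[x] executes while another
# transaction holds an uncommitted (and unaborted) write on x.
def is_strict(history):
    dirty = {}   # item -> set of txns with an outstanding write on it
    for kind, txn, item in history:
        if kind in ('r', 'w'):
            if dirty.get(item, set()) - {txn}:
                return False                 # someone else's write is still pending
            if kind == 'w':
                dirty.setdefault(item, set()).add(txn)
        elif kind in ('c', 'a'):             # commit/abort clears the txn's writes
            for pending in dirty.values():
                pending.discard(txn)
    return True

# Matches the slide's examples:
print(is_strict([('w',1,'x'), ('c',1,None), ('w',2,'x'), ('a',2,None)]))  # True
print(is_strict([('w',1,'x'), ('w',2,'x'), ('a',1,None), ('a',2,None)]))  # False
```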
  • 19. 3.4 Two-Phase Locking • Basic locking - Each transaction sets a lock on each data item before accessing the data – the lock is a reservation – there are read locks and write locks – if one transaction has a write lock on x, then no other transaction can have any lock on x • Example – rli[x], rui[x], wli[x], wui[x] denote lock/unlock operations – wl1[x] w1[x] rl2[x] r2[x] is impossible – wl1[x] w1[x] wu1[x] rl2[x] r2[x] is OK
  • 20. Basic Locking Isn’t Enough • Basic locking doesn’t guarantee serializability • rl1[x] r1[x] ru1[x] wl1[y] w1[y] wu1[y] c1 rl2[y] r2[y] wl2[x] w2[x] ru2[y] wu2[x] c2 • Eliminating the lock operations, we have r1[x] r2[y] w2[x] c2 w1[y] c1, which isn’t SR • The problem is that locks aren’t being released properly.
  • 21. Two-Phase Locking (2PL) Protocol • A transaction is two-phase locked if: – before reading x, it sets a read lock on x – before writing x, it sets a write lock on x – it holds each lock until after it executes the corresponding operation – after its first unlock operation, it requests no new locks • Each transaction sets locks during a growing phase and releases them during a shrinking phase. • Example - on the previous page T2 is two-phase locked, but not T1 since ru1[x] < wl1[y] – use “<” for “precedes”
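The "no new locks after the first unlock" rule is easy to audit over a history containing lock operations. A sketch (function name and encoding are mine), where 'rl'/'wl' set a lock and 'ru'/'wu' release one:

```python
# Sketch: report which transactions break the two-phase rule.
def two_phase_violators(history):
    shrinking = set()    # txns that have released at least one lock
    bad = set()
    for kind, txn, item in history:
        if kind in ('rl', 'wl') and txn in shrinking:
            bad.add(txn)                     # new lock after first unlock
        if kind in ('ru', 'wu'):
            shrinking.add(txn)
    return bad

# In the slide's example, T1 unlocks x (ru1[x]) before locking y (wl1[y]):
H = [('rl',1,'x'), ('r',1,'x'), ('ru',1,'x'), ('wl',1,'y'), ('w',1,'y'),
     ('wu',1,'y'), ('c',1,None), ('rl',2,'y'), ('r',2,'y'), ('wl',2,'x'),
     ('w',2,'x'), ('ru',2,'y'), ('wu',2,'x'), ('c',2,None)]
print(two_phase_violators(H))   # {1}: T1 breaks 2PL, T2 does not
```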
  • 22. 2PL Theorem: If all transactions in an execution are two-phase locked, then the execution is SR. Proof: Define Ti ⇒ Tk if either – Ti read x and Tk later wrote x, or – Ti wrote x and Tk later read or wrote x • If Ti ⇒ Tk, then Ti released a lock before Tk obtained some lock. • If Ti ⇒ Tk ⇒ Tm, then Ti released a lock before Tm obtained some lock (because Tk is two-phase). • If Ti ⇒ ... ⇒ Ti, then Ti released a lock before Ti obtained some lock, breaking the two-phase rule. • So there cannot be a cycle. By the Serializability Theorem, the execution is SR.
  • 23. 2PL and Recoverability • 2PL does not guarantee recoverability • This non-recoverable execution is 2-phase locked wl1[x] w1[x] wu1[x] rl2[x] r2[x] c2 … c1 – hence, it is not strict and allows cascading aborts • However, holding write locks until after commit or abort guarantees strictness – and hence avoids cascading aborts and is recoverable – In the above example, T1 must commit before its first unlock-write (wu1): wl1[x] w1[x] c1 wu1[x] rl2[x] r2[x] c2
  • 24. Automating Locking • 2PL can be hidden from the application • When a data manager gets a Read or Write operation from a transaction, it sets a read or write lock. • How does the data manager know it’s safe to release locks (and be two-phase)? • Ordinarily, the data manager holds a transaction’s locks until it commits or aborts. A data manager – can release read locks after it receives commit – releases write locks only after processing commit, to ensure strictness
  • 25. 3.5 Preserving Transaction Handshakes • Read and Write are the only operations the system will control to attain serializability. • So, if transactions communicate via messages, then implement SendMsg as Write, and ReceiveMsg as Read. • Else, you could have the following: w1[x] r2[x] send2[M] receive1[M] – data manager didn’t know about send/receive and thought the execution was SR. • Also watch out for brain transport
  • 26. Transactions Can Communicate via Brain Transport [diagram: T1: Start … Display output … Commit; the user reads T1’s output and enters it as input to T2: Start … Get input from display … Commit. The user’s read-then-type step is the “brain transport.”]
  • 27. Brain Transport (cont’d) • For practical purposes, if user waits for T1 to commit before starting T2, then the data manager can ignore brain transport. • This is called a transaction handshake (T1 commits before T2 starts) • Reason - Locking preserves the order imposed by transaction handshakes – e.g., it serializes T1 before T2.
  • 28. 2PL Preserves Transaction Handshakes • Recall the definition: Ti commits before Tk starts • 2PL serializes txns consistent with all transaction handshakes. I.e. there’s an equivalent serial execution that preserves the transaction order of transaction handshakes • This isn’t true for arbitrary SR executions. E.g. – r1[x] w2[x] c2 r3[y] c3 w1[y] c1 – T2 commits before T3 starts, but the only equivalent serial execution is T3 T1 T2 – rl1[x] r1[x] wl1[y] ru1[x] wl2[x] w2[x] wu2[x] c2
  • 29. 2PL Preserves Transaction Handshakes (cont’d) • Stating this more formally … • Theorem: For any 2PL execution H, there is an equivalent serial execution Hs, such that for all Ti, Tk, if Ti committed before Tk started in H, then Ti precedes Tk in Hs.
  • 30. Brain Transport, One Last Time • If a user reads committed displayed output of Ti and uses that displayed output as input to transaction Tk, then he/she should wait for Ti to commit before starting Tk. • The user can then rely on transaction handshake preservation to ensure Ti is serialized before Tk.
  • 31. 3.6 Implementing Two-Phase Locking • Even if you never implement a DB system, it’s valuable to understand locking implementation, because it can have a big effect on performance. • A data manager implements locking by – implementing a lock manager – setting a lock for each Read and Write – handling deadlocks
  • 32. System Model [diagram: Transactions 1 through N issue Start, SQL Ops, Commit, and Abort to the Database System, whose layers are Query Optimizer → Query Executor → Access Method (record-oriented files) → Page-oriented Files → Database]
  • 33. How to Implement SQL • Query Optimizer - translates SQL into an ordered expression of relational DB operators (Select, Project, Join) • Query Executor - executes the ordered expression by running a program for each operator, which in turn accesses records of files • Access methods - provides indexed record-at-a- time access to files (OpenScan, GetNext, …) • Page-oriented files - Read or Write (page address)
  • 34. Which Operations Get Synchronized? [diagram: SQL operations flow into the Query Optimizer and Query Executor, record-oriented operations flow from the Query Executor to the Access Method (record-oriented files), and page-oriented operations flow from the Access Method to Page-oriented Files] • It’s a tradeoff between – amount of concurrency and – overhead and complexity of synchronization
  • 35. Lock Manager • A lock manager services the operations – Lock(trans-id, data-item-id, mode) – Unlock(trans-id, data-item-id) – Unlock(trans-id) • It stores locks in a lock table. Lock op inserts [trans-id, mode] in the table. Unlock deletes it. [lock table: data item x has lock list [T1,r] [T2,r] and wait list [T3,w]; data item y has lock list [T4,w] and wait list [T5,w] [T6,r]]
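The Lock/Unlock interface above can be sketched as a tiny single-threaded class (a toy of my own design, not SQL Server's or any real implementation: a real lock manager latches the table for atomicity and queues blocked requests on the wait list instead of just returning False):

```python
# Sketch of the lock-table interface: grants are (trans-id, mode) pairs
# keyed by data-item-id; read locks share, write locks exclude.
class LockManager:
    def __init__(self):
        self.table = {}          # data-item-id -> list of (trans-id, mode) grants

    def lock(self, txn, item, mode):
        """Return True if the lock is granted, False if the caller must wait."""
        grants = self.table.setdefault(item, [])
        if any(t != txn and 'w' in (mode, m) for t, m in grants):
            return False         # conflicts with another transaction's grant
        grants.append((txn, mode))
        return True

    def unlock_all(self, txn):
        """Release all of txn's locks, e.g. at commit or abort."""
        for grants in self.table.values():
            grants[:] = [(t, m) for t, m in grants if t != txn]

lm = LockManager()
print(lm.lock(1, 'x', 'r'))   # True: first lock on x
print(lm.lock(2, 'x', 'r'))   # True: read locks are compatible
print(lm.lock(3, 'x', 'w'))   # False: write lock conflicts, T3 must wait
```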
  • 36. Lock Manager (cont’d) • Caller generates data-item-id, e.g. by hashing data item name • The lock table is hashed on data-item-id • Lock and Unlock must be atomic, so access to the lock table must be “locked” • Lock and Unlock are called frequently. They must be very fast. Average < 100 instructions. – This is hard, in part due to slow compare-and-swap operations needed for atomic access to lock table
  • 37. Lock Manager (cont’d) • In MS SQL Server – Locks are approx 32 bytes each. – Each lock contains a Database-ID, Object-Id, and other resource-specific lock information such as record id (RID) or key. – Each lock is attached to lock resource block (64 bytes) and lock owner block (32 bytes)
  • 38. Locking Granularity • Granularity - size of data items to lock – e.g., files, pages, records, fields • Coarse granularity implies – very few locks, so little locking overhead – must lock large chunks of data, so high chance of conflict, so concurrency may be low • Fine granularity implies – many locks, so high locking overhead – locking conflict occurs only when two transactions try to access the exact same data concurrently • High performance TP requires record locking
  • 39. Multigranularity Locking (MGL) • Allow different txns to lock at different granularity – big queries should lock coarse-grained data (e.g. tables) – short transactions lock fine-grained data (e.g. rows) • Lock manager can’t detect these conflicts – each data item (e.g., table or row) has a different id • Multigranularity locking “trick” – exploit the natural hierarchy of data containment – before locking fine-grained data, set intention locks on coarse-grained data that contains it – e.g., before setting a read-lock on a row, get an intention-read-lock on the table that contains the row – An intention-read lock conflicts with a write lock
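The trick works because intention modes have a fixed compatibility matrix. A sketch of the standard matrix (ir = intention-read, iw = intention-write; the riw mode and lock escalation are omitted for brevity, and the table below is the textbook scheme rather than anything stated on this slide):

```python
# Sketch: compatibility of multigranularity lock modes on the same item.
COMPATIBLE = {
    ('ir', 'ir'): True,  ('ir', 'iw'): True,  ('ir', 'r'): True,  ('ir', 'w'): False,
    ('iw', 'ir'): True,  ('iw', 'iw'): True,  ('iw', 'r'): False, ('iw', 'w'): False,
    ('r',  'ir'): True,  ('r',  'iw'): False, ('r',  'r'): True,  ('r',  'w'): False,
    ('w',  'ir'): False, ('w',  'iw'): False, ('w',  'r'): False, ('w',  'w'): False,
}

# A big query's read lock on a table conflicts with a row-updater's
# intention-write lock on that table, so the conflict surfaces at the table:
print(COMPATIBLE[('r', 'iw')])    # False: conflict detected
print(COMPATIBLE[('ir', 'ir')])   # True: two row-readers coexist
```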
  • 40. 3.7 Deadlocks • A set of transactions is deadlocked if every transaction in the set is blocked and will remain blocked unless the system intervenes. – Example: rl1[x] granted; rl2[y] granted; wl2[x] blocked; wl1[y] blocked and deadlocked • Deadlock is 2PL’s way to avoid non-SR executions – rl1[x] r1[x] rl2[y] r2[y] … can’t run w2[x] w1[y] and be SR • To repair a deadlock, you must abort a transaction – if you released a transaction’s lock without aborting it, you’d break its two-phase rule
  • 41. Deadlock Prevention • Never grant a lock that can lead to deadlock • Often advocated in operating systems • Useless for TP, because it would require running transactions serially. – Example to prevent the previous deadlock, rl1[x] rl2[y] wl2[x] wl1[y], the system can’t grant rl2[y] • Avoiding deadlock by resource ordering is unusable in general, since it overly constrains applications. – But may help for certain high frequency deadlocks • Setting all locks when txn begins requires too much advance knowledge and reduces concurrency.
  • 42. Deadlock Detection • Detection approach: Detect deadlocks automatically, and abort a deadlocked transaction (the victim). • It’s the preferred approach, because it – allows higher resource utilization and – uses cheaper algorithms • Timeout-based deadlock detection - If a transaction is blocked for too long, then abort it. – Simple and easy to implement – But aborts unnecessarily and – some deadlocks persist for too long
  • 43. Detection Using Waits-For Graph • Explicit deadlock detection - Use a Waits-For Graph – Nodes = {transactions} – Edges = {Ti → Tk | Ti is waiting for Tk to release a lock} – Example (previous deadlock): T1 → T2 → T1, a cycle • Theorem: If there’s a deadlock, then the waits-for graph has a cycle.
  • 44. Detection Using Waits-For Graph (cont’d) • So, to find deadlocks – when a transaction blocks, add an edge to the graph – periodically check for cycles in the waits-for graph • Don’t test for deadlocks too often. (A cycle won’t disappear until you detect it and break it.) • When a deadlock is detected, select a victim from the cycle and abort it. • Select a victim that hasn’t done much work (e.g., has set the fewest locks).
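The detection loop above amounts to cycle search plus victim selection. A sketch (function names are mine): depth-first search over the waits-for edges, then the fewest-locks heuristic to pick the victim.

```python
# Sketch: find some cycle in the waits-for graph, or None if it's acyclic.
def find_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    def dfs(node, path):
        if node in path:
            return path[path.index(node):]        # revisited a node: a cycle
        for nxt in graph.get(node, []):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None
    for start in list(graph):
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

def pick_victim(cycle, locks_held):
    # fewest-locks heuristic: abort the txn that has done the least work
    return min(cycle, key=lambda t: locks_held.get(t, 0))

cycle = find_cycle({('T1', 'T2'), ('T2', 'T1')})   # the deadlock from slide 40
print(sorted(cycle))                               # both txns are in the cycle
print(pick_victim(cycle, {'T1': 5, 'T2': 2}))      # T2 holds fewer locks
```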
  • 45. Cyclic Restart • Transactions can cause each other to abort forever. – T1 starts running. Then T2 starts running. – They deadlock and T1 (the oldest) is aborted. – T1 restarts, bumps into T2 and again deadlocks – T2 (now the oldest) is aborted ... • Choosing the youngest in a cycle as victim avoids cyclic restart, since the oldest transaction is never the victim. • Can combine with other heuristics, e.g. fewest-locks
  • 46. MS SQL Server • Aborts the transaction that is “cheapest” to roll back. – “Cheapest” is determined by the amount of log generated. – Allows transactions that you’ve invested a lot in to complete. • SET DEADLOCK_PRIORITY LOW (vs. NORMAL) causes a transaction to sacrifice itself as a victim.
  • 47. Distributed Locking • Suppose a transaction can access data at many data managers • Each data manager sets locks in the usual way • When a transaction commits or aborts, it runs two-phase commit to notify all data managers it accessed • The only remaining issue is distributed deadlock
  • 48. Distributed Deadlock • The deadlock spans two nodes. Neither node alone can see it. – Node 1: rl1[x] wl2[x] (blocked); Node 2: rl2[y] wl1[y] (blocked) • Timeout-based detection is popular. Its weaknesses are less important in the distributed case: – aborts unnecessarily and some deadlocks persist too long – possibly abort younger unblocked transaction to avoid cyclic restart
  • 49. Oracle Deadlock Handling • Uses a waits-for graph for single-server deadlock detection. • The transaction that detects the deadlock is the victim. • Uses timeouts to detect distributed deadlocks.
  • 50. Fancier Dist’d Deadlock Detection • Use waits-for graph cycle detection with a central deadlock detection server – more work than timeout-based detection, and no evidence it does better, performance-wise – phantom deadlocks? - No, because each waits-for edge is an SG edge. So, WFG cycle => SG cycle (modulo spontaneous aborts) • Path pushing - Send paths Ti→ … → Tk to each node where Tk might be blocked. – Detects short cycles quickly – Hard to know where to send paths. Possibly too many messages
  • 51. What’s Coming in Part Two? • Locking Performance • A more detailed look at multigranularity locking • Hot spot techniques • Query-Update Techniques • Phantoms • B-Trees and Tree locking
  • 52. Locking Performance • The following is oversimplified. We’ll revisit it. • Deadlocks are rare. – Typically 1-2% of transactions deadlock. • Locking performance problems are not rare. • The problem is too much blocking. • The solution is to reduce the “locking load” • Good heuristic – If more than 30% of transactions are blocked, then reduce the number of concurrent transactions
  • 53. First section of Concurrency Control Part Two, if there’s time
  • 54. 11.6 Locking Performance • Deadlocks are rare – up to 1% - 2% of transactions deadlock • The one exception to this is lock conversions – r-lock a record and later upgrade to w-lock – e.g., Ti = read(x) … write(x) – if two txns do this concurrently, they’ll deadlock (both get an r-lock on x before either gets a w-lock) – To avoid lock conversion deadlocks, get a w-lock first and down-grade to an r-lock if you don’t need to write. – Use SQL Update statement or explicit program hints
  • 55. Conversions in MS SQL Server • Update-lock prevents lock conversion deadlock. – Conflicts with other update and write locks, but not with read locks. – Only on pages and rows (not tables) • You get an update lock by using the UPDLOCK hint in the FROM clause Select Foo.A From Foo (UPDLOCK) Where Foo.B = 7
  • 56. Blocking and Lock Thrashing [graph: throughput vs. number of active txns; throughput rises with load, peaks, then falls past the thrashing point] • The locking performance problem is too much delay due to blocking – little delay until locks are saturated – then major delay, due to the locking bottleneck – thrashing - the point where throughput decreases with increasing load
  • 57. More on Thrashing • It’s purely a blocking problem – It happens even when the abort rate is low • As the number of transactions increases – each additional transaction is more likely to block – but first, it gathers some locks, increasing the probability others will block (positive feedback)
  • 58. Avoiding Thrashing • If over 30% of active transactions are blocked, then the system is (nearly) thrashing so reduce the number of active transactions • Timeout-based deadlock detection mistakes – They happen due to long lock delays – So the system is probably close to thrashing – So if deadlock detection rate is too high (over 2%) reduce the number of active transactions
  • 59. Interesting Sidelights • By getting all locks before transaction Start, you can increase throughput at the thrashing point because blocked transactions hold no locks – But it assumes you get exactly the locks you need and retries of get-all-locks are cheap • Pure restart policy - abort when there’s a conflict and restart when the conflict disappears – If aborts are cheap and there’s low contention for other resources, then this policy produces higher throughput before thrashing than a blocking policy – But response time is greater than a blocking policy
  • 60. How to Reduce Lock Contention • If each transaction holds a lock L for t seconds, then the maximum throughput is 1/t txns/second [timeline: lock L is acquired some time after Start and held for t seconds until Commit] • To increase throughput, reduce t (lock holding time) – Set the lock later in the transaction’s execution (e.g., defer updates till commit time) – Reduce transaction execution time (reduce path length, read from disk before setting locks) – Split a transaction into smaller transactions
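The 1/t bound is worth seeing with numbers (the values below are illustrative, not from the lecture): since only one transaction at a time can hold lock L, lock-holding time directly caps throughput on any workload where every transaction needs L.

```python
# Worked example of the 1/t bound: at most one txn holds L at a time,
# so txns needing L complete at a rate of at most 1/t per second.
t = 0.05                         # 50 ms lock-holding time
print(round(1 / t))              # at most 20 txns/second through L
print(round(1 / (t / 2)))        # halving the hold time doubles the bound: 40
```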
  • 61. Reducing Lock Contention (cont’d) • Reduce number of conflicts – Use finer grained locks, e.g., by partitioning tables vertically: split Part [Part#, Price, OnHand, PartName, CatalogPage] into Part-1 [Part#, Price, OnHand] and Part-2 [Part#, PartName, CatalogPage] – Use record-level locking (i.e., select a database system that supports it)