TRANSACTION MANAGEMENT AND TIME STAMP PROTOCOLS AND BACKUP RECOVERY

UNIT-4
A transaction is a set of operations that together
perform a single logical unit of work. Usually, a
transaction changes the data stored in the database.
Protecting user data from system failures is one of
the primary functions of a DBMS.
We can also define a transaction as a group of tasks,
where a single task is the minimum processing unit and
cannot be divided further. As an example of a simple
transaction, suppose a bank employee transfers Rs 1000
from X’s account to Y’s account.

X’s Account
Open_Account(X)
Old_Bank_Balance = X.balance
New_Bank_Balance = Old_Bank_Balance - 1000
X.balance = New_Bank_Balance
Close_Bank_Account(X)
Y’s Account
Open_Account(Y)
Old_Bank_Balance = Y.balance
New_Bank_Balance = Old_Bank_Balance + 1000
Y.balance = New_Bank_Balance
Close_Bank_Account(Y)

ACID Properties
The transaction refers to a small unit of
any given program that consists of various
low-level tasks. Every transaction in DBMS
must maintain ACID – A (Atomicity), C
(Consistency), I (Isolation), D
(Durability). One must maintain ACID so as
to ensure completeness, accuracy, and
integrity of data.

Atomicity: Atomicity means that an entire
transaction either takes place in full or does
not occur at all. There is no midway: a
transaction can never occur partially. Every
transaction is treated as a single unit that
either runs to completion or is not executed at
all. For this reason, atomicity is also called
the ‘all or nothing rule’. It involves the
following two operations:
•Commit: If a transaction commits, the changes
it made become visible.
•Abort: If a transaction aborts, the changes it
made to the database are not visible.
Consider a transaction T that consists of two steps,
T1 and T2, and transfers 100 from account A to
account B.
If the transaction fails after T1 completes but before
T2 completes (say, after write(A) but before write(B)),
then the amount has been deducted from A but not added
to B.
This leaves the database in an inconsistent state.
The transaction therefore has to be executed in its
entirety to ensure the correctness of the database
state.
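
The failure case above can be sketched in Python. This is only an illustration under stated assumptions, not how a DBMS implements atomicity: the `transfer` function, the `fail_after_debit` flag, and the in-memory snapshot used for rollback are hypothetical names introduced for the example. The starting balances 500 and 200 are the ones used in the consistency example of this unit.

```python
def transfer(accounts, src, dst, amount, fail_after_debit=False):
    """Move `amount` from src to dst: both writes happen, or neither does."""
    snapshot = dict(accounts)            # state before the transaction starts
    try:
        accounts[src] -= amount          # T1: write(A)
        if fail_after_debit:
            # Hypothetical failure between write(A) and write(B)
            raise RuntimeError("simulated crash")
        accounts[dst] += amount          # T2: write(B)
        return True                      # commit: changes stay visible
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)        # abort: restore the previous state
        return False


accounts = {"A": 500, "B": 200}
failed = transfer(accounts, "A", "B", 100, fail_after_debit=True)
after_failure = dict(accounts)           # unchanged: {"A": 500, "B": 200}
succeeded = transfer(accounts, "A", "B", 100)
```

After the aborted attempt the balances are unchanged, and after the successful one the total is still 700.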

Consistency
Consistency means that integrity constraints
must be maintained so that the database is
consistent both before and after a transaction.
In the example discussed above, the total amount
must be the same before and after the
transaction.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Thus, the given database is consistent.
An inconsistency would occur if T1 completes
but T2 fails, because T would then remain
incomplete.

I – Isolation
Isolation ensures that multiple transactions can
run concurrently without leaving the database in an
inconsistent state. Each transaction executes
independently, i.e. without any interference.
Changes made by one transaction are not visible to
other transactions until that transaction has
committed.
The property of isolation ensures that executing
transactions concurrently results in a state that is
equivalent to one obtained by executing them
serially in some order.
Let A = 500, B = 500
Let us consider two transactions here, T and T’’.
The sums below imply that T multiplies A by 100
(giving 50,000) and subtracts 50 from B (giving
450), while T’’ reads A and B and adds them.

Suppose that T has executed up to Read(B) when
T’’ starts, so that the operations of the two
transactions interleave. T’’ then reads the
updated value of A but the old value of B:
T’’: (A+B = 50,000 + 500 = 50,500)
This sum is not consistent with the sum that
is obtained at the end of transaction T:
T: (A+B = 50,000 + 450 = 50,450)
The database therefore appears inconsistent:
the two sums differ by 50 units.
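
The interleaving above can be replayed in a few lines of Python. This is a sketch under assumptions: the operations of T (multiply A by 100, subtract 50 from B) are inferred from the sums quoted in the text, and the function names are hypothetical.

```python
def interleaved():
    """T is interrupted after Read(B); T'' then reads A and B."""
    db = {"A": 500, "B": 500}
    db["A"] = db["A"] * 100          # T: read A, update it, write A
    b_read_by_t = db["B"]            # T: Read(B), new value not yet written
    sum_seen = db["A"] + db["B"]     # T'': sees the new A but the old B
    db["B"] = b_read_by_t - 50       # T: write(B)
    return sum_seen, db["A"] + db["B"]


def serial():
    """T runs to completion before T'' starts."""
    db = {"A": 500, "B": 500}
    db["A"] = db["A"] * 100
    db["B"] = db["B"] - 50
    return db["A"] + db["B"]         # the sum T'' computes


print(interleaved())  # (50500, 50450)
print(serial())       # 50450
```

In the serial run T'' sees the same sum that the database holds at the end, which is the behaviour isolation is meant to guarantee.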

D – Durability
The durability property states that once
the execution of a transaction is
completed, its modifications and updates to
the database are written to disk. They
persist even if a system failure occurs.
Such updates become permanent and are
stored in non-volatile memory. Thus, the
effects of the transaction are never lost.
Uses of ACID Properties
Taken together, the ACID properties give a
DBMS a mechanism for ensuring the
consistency and correctness of a database:
every transaction acts as a group of
operations treated as a single unit,
produces consistent results, operates in
isolation from all other operations, and
makes durably stored updates. Together these
ensure the integrity of data in any given
database.

Serializability
Whenever the operating system executes multiple transactions in a
multiprogramming environment, the instructions of one transaction may
interleave with those of another transaction.
Schedule
A schedule is a chronological execution sequence of a set of
transactions. A schedule can contain multiple transactions, and
each comprises a number of tasks/instructions.
•Serial Schedule − It is a schedule in which transactions are ordered
one after the other: one transaction is executed first, and only when
it completes its cycle is the next transaction executed. This type of
schedule is called a serial schedule, as transactions are executed in
a serial manner.

Equivalence Schedules
Two schedules can be equivalent in the following ways −
Result Equivalence
If two schedules produce the same result after execution, they are said to
be result equivalent. They may yield the same result for some values and
different results for another set of values. That is why this equivalence is
not generally considered significant.

View Equivalence
Two schedules are view equivalent if the
transactions in both schedules perform similar
actions in a similar manner.
For example −
•If T reads the initial data in S1, then it also reads
the initial data in S2.
•If T reads the value written by J in S1, then it also
reads the value written by J in S2.
•If T performs the final write on the data value in
S1, then it also performs the final write on the
data value in S2.

Conflict Equivalence
Two operations are said to be conflicting if they have the following
properties −
•Both belong to separate transactions.
•Both access the same data item.
•At least one of them is a "write" operation.
Two schedules having multiple transactions with conflicting
operations are said to be conflict equivalent if and only if −
•Both schedules contain the same set of transactions.
•The order of conflicting pairs of operations is the same in both
schedules.
Note − A schedule that is view equivalent to a serial schedule is view
serializable, and a schedule that is conflict equivalent to a serial
schedule is conflict serializable. All conflict serializable schedules
are view serializable too.
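
The two definitions above translate fairly directly into code. The following Python sketch is an illustration under assumptions: a schedule is modelled as a list of hypothetical `(transaction, action, item)` tuples with action "R" or "W", and each such tuple is assumed to appear at most once in a schedule.

```python
from itertools import combinations


def conflicts(op1, op2):
    """Separate transactions, same data item, at least one write."""
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)


def conflict_pairs(schedule):
    """Ordered pairs (earlier, later) of conflicting operations."""
    return {(p, q) for p, q in combinations(schedule, 2) if conflicts(p, q)}


def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


serial = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
mixed = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(conflict_equivalent(serial, mixed))  # False
```

Here `mixed` lets T2 read A before T1 writes it, which reverses one conflicting pair relative to `serial`, so the two schedules are not conflict equivalent.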

Transaction States

•Active − In this state, the transaction is being executed. This is the initial state of
every transaction.
•Partially Committed − When a transaction executes its final operation, it is said to
be in a partially committed state.
•Failed − A transaction is said to be in a failed state if any of the checks made by
the database recovery system fails. A failed transaction can no longer proceed
further.
•Aborted − If any of the checks fails and the transaction has reached a failed
state, then the recovery manager rolls back all its write operations on the
database to bring the database back to its original state where it was prior to the
execution of the transaction. Transactions in this state are called aborted. The
database recovery module can select one of the two operations after a transaction
aborts −
• Re-start the transaction
• Kill the transaction
•Committed − If a transaction executes all its operations successfully, it is said to
be committed. All its effects are now permanently established on the database
system.
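
The states above form a small state machine. The Python sketch below is one possible reading of it: the `State` enum, the `TRANSITIONS` table, and the `advance` helper are hypothetical names, and the assumption that a partially committed transaction can still fail is added here for illustration rather than stated in the list.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Allowed moves between states (assumed from the descriptions above)
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},     # recovery manager rolls back
    State.ABORTED: set(),              # terminal: restart or kill happens outside
    State.COMMITTED: set(),            # terminal: effects are permanent
}


def advance(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


state = advance(State.ACTIVE, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
```

A restarted transaction would simply begin again in the active state as a new transaction.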

Recoverability is a property of database systems that
ensures that, in the event of a failure or error, the system
can recover the database to a consistent state.
The characteristics of non-serializable schedules
are as follows −
•The transactions may or may not be consistent.
•The transactions may or may not be recoverable.
So, now let’s talk about recoverable schedules.
Non-serializable schedules are classified as either
recoverable or irrecoverable.

Irrecoverable schedules
If a transaction does a dirty
read operation from an
uncommitted transaction and
commits before the
transaction from where it has
read the value, then such a
schedule is called an
irrecoverable schedule.

Recoverable Schedules
If a transaction performs a dirty
read from an uncommitted
transaction and delays its own
commit until that uncommitted
transaction has either committed
or rolled back, then such a
schedule is called a recoverable
schedule.
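
The two definitions above can be checked mechanically. The Python sketch below is a simplified illustration, not a complete recovery algorithm: schedules are modelled as hypothetical `(transaction, action, item)` tuples where the action is "R", "W", "C" (commit) or "A" (abort), and the function name `is_recoverable` is an assumption.

```python
def is_recoverable(schedule):
    """False if some transaction commits before a transaction it dirty-read from."""
    last_writer = {}    # item -> transaction that wrote it most recently
    reads_from = {}     # reader -> set of writers it dirty-read from
    finished = set()    # transactions that have committed or rolled back
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in finished:
                reads_from.setdefault(txn, set()).add(writer)   # dirty read
        elif action in ("C", "A"):
            if action == "C" and reads_from.get(txn, set()) - finished:
                return False    # committed before the writer finished
            finished.add(txn)
    return True


irrecoverable = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]
recoverable = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
print(is_recoverable(irrecoverable), is_recoverable(recoverable))  # False True
```

In the first schedule T2 commits while T1, whose uncommitted write it read, is still running; in the second T2 delays its commit until T1 has committed.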

Isolation is a database-level
characteristic that governs how
and when modifications are
made, as well as whether they
are visible to other users,
systems, and other databases.
One of the purposes of isolation
is to allow many transactions to
run concurrently without
interfering with their execution.
Phenomena Defining Isolation Level:
•A transaction that reads data that has not yet been
committed is said to have performed a "Dirty Read".
Imagine that Transaction 1 modifies a row and leaves it
uncommitted, and Transaction 2 then reads the modified
row. If Transaction 1 rolls back the change, Transaction 2
will have read data that was never intended to exist.
•A Non-Repeatable Read occurs when a transaction reads
the same row twice and receives a different value each
time. Assume that transaction T1 reads data. Because of
concurrency, another transaction, T2, modifies and
commits the same data. Transaction T1 will get a different
value if it reads the same data a second time.
•When the same query is run twice within a transaction but
the rows returned differ, this phenomenon is known as
a "Phantom Read". Assume transaction T1 retrieves a
collection of records that meet some search criteria.
Transaction T2 now creates some new data that fit the
transaction T1 search criteria. Transaction T1 will get
a different set of rows if it re-executes the statement that
reads the rows.

Levels of Isolation:
The SQL standard defines four isolation levels based on these phenomena.
Higher isolation constrains the ability of users to access the same data
concurrently. The greater the isolation degree, the more system resources
are required, and the greater the likelihood that database transactions
will block one another.
•"Serializable," the highest level, guarantees that the outcome of concurrent
transactions is the same as if they had been executed one after another.
•Repeatable Read guarantees that a row a transaction has already read returns
the same values if it is read again. This level still permits phantom reads:
rows inserted or deleted by other transactions can become visible even though
changes to rows already read are not.
•Read Committed allows access to data only after it has been
committed to the database.
•Read Uncommitted is the lowest level of isolation, allowing access to data
modifications before they are committed.
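
The relationship between the four levels and the three phenomena can be summarised as a lookup table. The sketch below follows the SQL standard's definitions; the names `PERMITTED`, `LEVELS`, and `weakest_level_preventing` are hypothetical, and real database products may be stricter than the standard requires at a given level.

```python
# Phenomena each level still permits, from lowest to highest isolation
PERMITTED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}
LEVELS = list(PERMITTED)  # dicts keep insertion order: lowest level first


def weakest_level_preventing(*phenomena):
    """Lowest isolation level at which none of the given phenomena can occur."""
    for level in LEVELS:
        if not PERMITTED[level] & set(phenomena):
            return level


print(weakest_level_preventing("dirty read"))           # READ COMMITTED
print(weakest_level_preventing("non-repeatable read"))  # REPEATABLE READ
print(weakest_level_preventing("phantom read"))         # SERIALIZABLE
```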

Testing of Serializability
A serialization graph is used to test the serializability of a
schedule.
Assume a schedule S. For S, we construct a graph known
as a precedence graph. This graph is a pair G = (V, E),
where V is a set of vertices and E is a set of
edges. The set of vertices contains all the
transactions participating in the schedule. The set of
edges contains all edges Ti → Tj for which one of
the following three conditions holds:
1.Create an edge Ti → Tj if Ti executes write (Q) before Tj
executes read (Q).
2.Create an edge Ti → Tj if Ti executes read (Q) before Tj
executes write (Q).
3.Create an edge Ti → Tj if Ti executes write (Q) before Tj
executes write (Q).
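
The three conditions above can be applied by scanning every ordered pair of operations in the schedule. The Python sketch below assumes a schedule is a list of hypothetical `(transaction, action, item)` tuples with action "R" or "W"; the example schedule is made up for illustration and is not one of the schedules from these notes.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            # Different transactions, same item, and not read-read:
            # covers write-read, read-write and write-write.
            if ti != tj and qi == qj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges


example = [("T1", "R", "Q"), ("T2", "W", "Q"), ("T1", "W", "Q")]
print(sorted(precedence_graph(example)))  # [('T1', 'T2'), ('T2', 'T1')]
```

The example produces edges in both directions between T1 and T2, so its graph contains a cycle.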

Explanation:
Read(A): In T1, no subsequent writes to A, so no new
edges
Read(B): In T2, no subsequent writes to B, so no new
edges
Read(C): In T3, no subsequent writes to C, so no new
edges
Write(B): B is subsequently read by T3, so add edge
T2 → T3
Write(C): C is subsequently read by T1, so add edge
T3 → T1
Write(A): A is subsequently read by T2, so add edge
T1 → T2
Write(A): In T2, no subsequent reads to A, so no new
edges
Write(C): In T1, no subsequent reads to C, so no new
edges
Write(B): In T3, no subsequent reads to B, so no new
edges
The precedence graph for schedule S1 contains a
cycle (T1 → T2 → T3 → T1), which is why
schedule S1 is non-serializable.

Explanation:
Read(A): In T4, no subsequent writes to A, so no new
edges
Read(C): In T4, no subsequent writes to C, so no new
edges
Write(A): A is subsequently read by T5, so add edge T4
→ T5
Read(B): In T5, no subsequent writes to B, so no new
edges
Write(C): C is subsequently read by T6, so add edge T4
→ T6
Write(B): B is subsequently read by T6, so add edge T5
→ T6
Write(C): In T6, no subsequent reads to C, so no new
edges
Write(A): In T5, no subsequent reads to A, so no new
edges
Write(B): In T6, no subsequent reads to B, so no new
edges
The precedence graph for schedule S2 contains no
cycle, which is why schedule S2 is serializable.
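
The cycle test for both schedules can be reproduced with a depth-first search over the edges listed in the two explanations. This is a minimal sketch; the function name `has_cycle` and the edge-list representation are assumptions made for the example.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, done
    colour = dict.fromkeys(graph, WHITE)

    def visit(u):
        colour[u] = GREY
        for v in graph[u]:
            if colour[v] == GREY or (colour[v] == WHITE and visit(v)):
                return True               # reached a node on the current path
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and visit(u) for u in graph)


s1_edges = [("T2", "T3"), ("T3", "T1"), ("T1", "T2")]
s2_edges = [("T4", "T5"), ("T4", "T6"), ("T5", "T6")]
print(has_cycle(s1_edges))  # True:  S1 is not serializable
print(has_cycle(s2_edges))  # False: S2 is serializable
```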

Lock-Based Protocol
In this type of protocol, a transaction cannot read or write data until it
acquires an appropriate lock on it. There are two types of lock:
1. Shared lock:
•It is also known as a read-only lock. Under a shared lock, the data item can
only be read by the transaction.
•It can be shared between transactions because a transaction that holds a
shared lock cannot update the data item.
2. Exclusive lock:
•Under an exclusive lock, the data item can be both read and written
by the transaction.
•This lock is exclusive: it prevents multiple transactions from modifying
the same data simultaneously.
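
The sharing rule for the two lock types can be sketched as a small lock table. This is an illustrative toy under assumptions, not a real lock manager: the `LockTable` class and its methods are hypothetical, and a refused request simply returns False instead of queueing the transaction.

```python
class LockTable:
    def __init__(self):
        self.held = {}   # item -> (mode, set of owning transactions)

    def acquire(self, txn, item, mode):
        """Grant an 'S' or 'X' lock if compatible; return False if txn must wait."""
        if item not in self.held:
            self.held[item] = (mode, {txn})
            return True
        held_mode, owners = self.held[item]
        if mode == "S" and held_mode == "S":
            owners.add(txn)               # shared locks can coexist
            return True
        if owners == {txn}:
            if mode == "X":               # sole owner may upgrade S to X
                self.held[item] = ("X", owners)
            return True
        return False                      # any combination involving X conflicts

    def release(self, txn, item):
        _, owners = self.held[item]
        owners.discard(txn)
        if not owners:
            del self.held[item]


locks = LockTable()
print(locks.acquire("T1", "A", "S"))  # True
print(locks.acquire("T2", "A", "S"))  # True, shared with T1
print(locks.acquire("T3", "A", "X"))  # False, readers still hold A
```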

There are four types of lock protocols available:
1. Simplistic lock protocol
It is the simplest way of locking data during a transaction. Simplistic
lock-based protocols require every transaction to obtain a lock on the data
before inserting, deleting, or updating it. The data item is unlocked after
the transaction completes.
2. Pre-claiming Lock Protocol
•Pre-claiming lock protocols evaluate the transaction to list all the data
items on which it needs locks.
•Before the transaction begins executing, it requests from the DBMS all
the locks on those data items.
•If all the locks are granted, then this protocol allows the transaction to
begin. When the transaction completes, it releases all the locks.
•If not all the locks are granted, then the transaction rolls back and
waits until all the locks are granted.

3. Two-phase locking (2PL)
•The two-phase locking protocol divides the execution
of the transaction into three parts.
•In the first part, when the execution of the transaction
starts, it seeks permission for the locks it requires.
•In the second part, the transaction acquires all the locks.
The third part starts as soon as the transaction
releases its first lock.
•In the third part, the transaction cannot request any
new locks. It only releases the acquired locks.

There are two phases of 2PL:
Growing phase: In the growing phase, a new lock on the
data item may be acquired by the transaction, but none
can be released.
Shrinking phase: In the shrinking phase, existing locks
held by the transaction may be released, but no new locks
can be acquired.
If lock conversion is allowed, then the following can
happen:
1.Upgrading of lock (from S(a) to X (a)) is allowed in
growing phase.
2.Downgrading of lock (from X(a) to S(a)) must be done in
shrinking phase.

The following shows how locking and
unlocking work with 2PL.
Transaction T1:
•Growing phase: from step 1-3
•Shrinking phase: from step 5-7
•Lock point: at 3
Transaction T2:
•Growing phase: from step 2-6
•Shrinking phase: from step 8-9
•Lock point: at 6
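
The growing and shrinking phases can be verified for a single transaction by scanning its lock operations in order. The Python sketch below is an illustration under assumptions: operations are hypothetical `(action, item)` pairs, the step numbers are positions in that list rather than the step numbers of the example above, and the lock point is taken to be the step at which the last lock is acquired.

```python
def check_two_phase(ops):
    """Return (obeys_2pl, lock_point) for one transaction's lock operations."""
    shrinking = False
    lock_point = None
    for step, (action, item) in enumerate(ops, start=1):
        if action == "lock":
            if shrinking:
                return False, None    # a lock after an unlock violates 2PL
            lock_point = step         # the last acquisition marks the lock point
        elif action == "unlock":
            shrinking = True          # the first release starts the shrinking phase
    return True, lock_point


good = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(check_two_phase(good))  # (True, 2)
print(check_two_phase(bad))   # (False, None)
```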

4. Strict Two-phase locking (Strict-2PL)
•The first phase of Strict-2PL is similar to
2PL. In the first phase, after acquiring all the
locks, the transaction continues to execute
normally.
•The only difference between 2PL and Strict-2PL
is that Strict-2PL does not release a lock
immediately after using it.
•Strict-2PL waits until the whole transaction
commits, and then it releases all the locks at
once.
•Strict-2PL does not have a gradual shrinking
phase of lock release. Unlike 2PL, it does not suffer from cascading aborts.

Timestamp Ordering Protocol
•The Timestamp Ordering Protocol is used to order the transactions based on their
timestamps. The order of the transactions is simply the ascending order of their
creation.
•The older transaction has higher priority, which is why it executes first. To determine
the timestamp of a transaction, this protocol uses the system time or a logical counter.
•Lock-based protocols manage the order between conflicting pairs among transactions
at execution time, whereas timestamp-based protocols start working as soon as a
transaction is created.
•Let's assume there are two transactions T1 and T2. Suppose transaction T1 entered
the system at time 007 and transaction T2 entered the system at time 009. T1 has
the higher priority, so it executes first, as it entered the system first.
•The timestamp ordering protocol also maintains the timestamps of the last 'read' and
'write' operations on each data item.

Basic Timestamp ordering protocol works as follows:
1. Check the following condition whenever a transaction Ti
issues a Read(X) operation:
•If W_TS(X) > TS(Ti), then the operation is rejected.
•If W_TS(X) <= TS(Ti), then the operation is executed.
•After a successful read, R_TS(X) is updated to the larger
of R_TS(X) and TS(Ti).
2. Check the following condition whenever a transaction Ti
issues a Write(X) operation:
•If TS(Ti) < R_TS(X), then the operation is rejected.
•If TS(Ti) < W_TS(X), then the operation is rejected and Ti
is rolled back; otherwise the operation is executed and
W_TS(X) is set to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the Read time-stamp of data-item X.
W_TS(X) denotes the Write time-stamp of data-item X.
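
The read and write checks above can be written out as two small functions. This is a sketch under assumptions: the `Item` class, the `Rejected` exception, and the initial timestamps of 0 are hypothetical, and a rejected operation only raises an exception rather than restarting the transaction with a new timestamp. The timestamps 7 and 9 echo the T1 and T2 example earlier in this section.

```python
class Rejected(Exception):
    """The operation violates timestamp order; the transaction must roll back."""


class Item:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0    # R_TS(X): largest timestamp that has read X
        self.w_ts = 0    # W_TS(X): timestamp of the last write to X


def read(item, ts):
    if item.w_ts > ts:                     # X was written by a younger transaction
        raise Rejected("read too late")
    item.r_ts = max(item.r_ts, ts)
    return item.value


def write(item, ts, value):
    if ts < item.r_ts or ts < item.w_ts:   # a younger transaction already used X
        raise Rejected("write too late")
    item.value = value
    item.w_ts = ts


x = Item(10)
write(x, 9, 20)          # T2 (timestamp 9) writes X
try:
    read(x, 7)           # T1 (timestamp 7) now reads X too late
    outcome = "executed"
except Rejected:
    outcome = "rejected"
print(outcome)           # rejected
```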

Validation Based Protocol
The validation-based protocol is also known as the optimistic concurrency
control technique. In the validation-based protocol, a transaction is executed
in the following three phases:
1.Read phase: In this phase, transaction T reads the values of various data
items and stores them in temporary local variables. It performs all its write
operations on these temporary variables without updating the actual database.
2.Validation phase: In this phase, the temporary variable values are
validated against the actual data to check whether serializability is violated.
3.Write phase: If the transaction passes validation, then the
temporary results are written to the database; otherwise the
transaction is rolled back.
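
The three phases can be sketched with a per-item version counter. This is a simplification chosen for illustration, not the timestamp-based validation test used in textbook formulations: the `OptimisticTxn` class, the `versions` dictionary, and the rule "validation fails if any item read has changed version" are all assumptions.

```python
class OptimisticTxn:
    def __init__(self, db, versions):
        self.db, self.versions = db, versions
        self.seen = {}       # item -> version observed during the read phase
        self.local = {}      # temporary local copies
        self.dirty = set()   # items written locally

    def read(self, key):
        if key not in self.local:
            self.seen[key] = self.versions[key]
            self.local[key] = self.db[key]
        return self.local[key]

    def write(self, key, value):
        self.local[key] = value          # the database is not touched yet
        self.dirty.add(key)

    def commit(self):
        # Validation phase: every item we read must be unchanged
        if any(self.versions[k] != v for k, v in self.seen.items()):
            return False                 # roll back: local copies are dropped
        # Write phase: publish the local results
        for k in self.dirty:
            self.db[k] = self.local[k]
            self.versions[k] += 1
        return True


db, versions = {"X": 1}, {"X": 0}
t1, t2 = OptimisticTxn(db, versions), OptimisticTxn(db, versions)
t1.write("X", t1.read("X") + 1)
t2.write("X", t2.read("X") + 10)
print(t1.commit(), t2.commit(), db["X"])  # True False 2
```

Both transactions read X in their read phase, so the second one to validate finds that X has changed and is rolled back.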

Thomas write Rule
The Thomas write rule guarantees a serializable
order for the protocol. It improves the Basic
Timestamp Ordering Algorithm.
The basic Thomas write rules are as follows:
•If TS(T) < R_TS(X), then transaction T is aborted and
rolled back, and the operation is rejected.
•If TS(T) < W_TS(X), then the W_item(X) operation of
the transaction is not executed, and processing continues.
•If neither condition 1 nor condition 2 holds, then
transaction T is allowed to execute the WRITE operation,
and W_TS(X) is set to TS(T).
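
The three rules above reduce to a short decision function. In this sketch the function name `thomas_write` and the string return values are hypothetical; the caller is assumed to carry out the chosen action and, on "write", to set W_TS(X) to TS(T).

```python
def thomas_write(ts, r_ts, w_ts):
    """Decide what to do with a write by a transaction with timestamp ts."""
    if ts < r_ts:
        return "abort"    # rule 1: a younger transaction has already read X
    if ts < w_ts:
        return "ignore"   # rule 2: obsolete write, skip it and continue
    return "write"        # rule 3: perform the write


print(thomas_write(ts=5, r_ts=8, w_ts=3))   # abort
print(thomas_write(ts=5, r_ts=4, w_ts=9))   # ignore
print(thomas_write(ts=5, r_ts=4, w_ts=3))   # write
```

The only difference from the basic timestamp ordering protocol is the second case, where the outdated write is skipped instead of causing a rollback.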

If we use the Thomas write rule, then some serializable
schedules can be permitted that are not conflict
serializable, as illustrated by the schedule in the given figure:

Thomas write rule checks that T2's write is never seen by
any transaction. If we delete the write operation in
transaction T2, then a conflict serializable schedule is
obtained, as shown in the figure below.

Multiple Granularity
Let's start by understanding the meaning of
granularity.
Granularity: It is the size of the data item that
is allowed to be locked.
Multiple Granularity:
•It can be defined as hierarchically breaking up
the database into blocks which can be locked.
•The Multiple Granularity protocol enhances
concurrency and reduces lock overhead.
•It keeps track of what to lock and how to
lock.
•It makes it easy to decide whether to lock or
unlock a data item. This type of hierarchy can
be graphically represented as a tree.
For example: Consider a tree which has four
levels of nodes.
•The first, or highest, level represents the entire
database.
•The second level represents nodes of type area.
The database consists of exactly these areas.
•Each area consists of child nodes which are
known as files. No file can be present in more
than one area.
•Finally, each file contains child nodes known as
records. The file has exactly those records that
are its child nodes. No record is present in more
than one file.
•Hence, the levels of the tree starting from the top
level are as follows:
1. Database
2. Area
3. File
4. Record

In this example, the highest level represents the entire database. The levels below are
area, file, and record.
There are three additional lock modes with multiple granularity:
Intention Mode Lock
Intention-shared (IS): It indicates explicit locking at a lower level of the tree, but
only with shared locks.
Intention-Exclusive (IX): It indicates explicit locking at a lower level with exclusive
or shared locks.
Shared & Intention-Exclusive (SIX): In this mode, the node is locked in shared
mode, and some lower-level node is locked in exclusive mode by the same transaction.
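
Whether a lock can be granted on a node depends on the modes other transactions already hold on that node. The table below encodes the standard compatibility matrix for the five modes (IS, IX, S, SIX, X); the names `COMPATIBLE` and `can_grant` are hypothetical, and the sketch covers only the check on a single node, not the top-down locking of the whole tree.

```python
# For each requested mode, the modes other transactions may already hold
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}


def can_grant(requested, held_modes):
    """True if `requested` is compatible with every mode already held on the node."""
    return all(held in COMPATIBLE[requested] for held in held_modes)


print(can_grant("IX", ["IS", "IX"]))  # True
print(can_grant("S", ["IX"]))         # False
print(can_grant("X", ["IS"]))         # False
```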

Recovery with Concurrent Transaction
•Whenever more than one transaction is being executed,
their log records become interleaved. During recovery, it
would become difficult for the recovery system to
backtrack through all the logs and then start recovering.
