2. When multiple transactions execute concurrently in an
uncontrolled or unrestricted manner, several problems can
arise.
Such problems are called concurrency problems.
For example, if ATMs did not support concurrent access,
several people could not withdraw money at the same time
from different locations. This is why concurrency is needed.
The main objective of concurrency control is to allow many
users to perform different operations at the same time while
keeping the data correct.
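One such concurrency problem, the lost update, can be sketched as follows. This is a minimal illustration only: the balance, the two ATM sessions and the withdrawal amounts are hypothetical values, and the interleaving is written out by hand rather than produced by real threads.

```python
# Hypothetical shared account balance (assumed value for illustration).
balance = 100

# Uncontrolled interleaving: both ATM sessions read before either writes.
read_by_atm_a = balance
read_by_atm_b = balance

# Each session writes back a value computed from its own stale copy.
balance = read_by_atm_a - 30   # ATM A withdraws 30 and writes 70
balance = read_by_atm_b - 50   # ATM B withdraws 50 and overwrites with 50

uncontrolled_result = balance

# A controlled (serial) execution applies both withdrawals in turn.
serial_result = 100 - 30 - 50

print(uncontrolled_result, serial_result)
```

The withdrawal made at ATM A is lost in the uncontrolled run, which is the kind of error concurrency control is meant to prevent.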
-Manisha Kapila
3. The concurrency control protocols ensure the atomicity,
consistency, isolation, durability and serializability of the
concurrent execution of the database transactions.
Concurrency control protocols fall into two main groups:
Lock-based protocols
Timestamp-based protocols
4. Lock-based protocol
A lock is an important mechanism in concurrency
control.
It controls concurrent access to a data item.
It ensures that one process cannot retrieve or update a
record while another process is updating it.
5. For example, at a road junction, signals indicate stop or go.
When traffic from one direction is allowed to pass, the other
directions are held. Similarly, in a database, while one
transaction holds a lock on a data item, other transactions that
need conflicting access to that item must wait.
6. Types of Locks
The types of locks are as follows −
Shared Lock [The transaction can only read the data item
values]
Exclusive Lock [The transaction can both read and write the data item
values]
7. Shared Lock(S)
A shared lock is also called a read-only lock. With a shared lock, the
data item can be shared between transactions, because a transaction that
holds only a shared lock never has permission to update the data item.
For example, consider a case where two transactions are reading the
account balance of a person. The database lets both of them read by placing a
shared lock. However, if another transaction wants to update that
account’s balance, the shared lock prevents it until the reading is over.
8. Exclusive Lock(X)
With an exclusive lock, the data item can be both read and
written by the transaction.
This lock is exclusive: while it is held, no other transaction
can read or modify the same data.
For example, when a transaction needs to update the account
balance of a person, the database allows this by placing an X
lock on it. Then, when a second transaction wants to
read or write, the exclusive lock prevents this operation.
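The compatibility of shared and exclusive locks can be sketched with a small lock table. This is an illustrative sketch, not a real DBMS lock manager: the class name `LockTable`, its methods and the transaction names are assumptions, and a refused request simply returns False instead of queueing the transaction.

```python
class LockTable:
    """Tracks which transactions hold S or X locks on each data item."""

    def __init__(self):
        self.holders = {}  # item -> {transaction name: "S" or "X"}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        held = self.holders.setdefault(item, {})
        others = {t: m for t, m in held.items() if t != txn}
        if mode == "S":
            # S is compatible only with other S locks.
            compatible = all(m == "S" for m in others.values())
        else:
            # X is compatible with no lock held by another transaction.
            compatible = not others
        if compatible and held.get(txn) != "X":
            held[txn] = mode  # grant, or upgrade S to X
        return compatible

    def release(self, txn, item):
        self.holders.get(item, {}).pop(txn, None)


locks = LockTable()
first_reader = locks.request("T1", "balance", "S")    # granted
second_reader = locks.request("T2", "balance", "S")   # granted, S is shared
writer = locks.request("T3", "balance", "X")          # refused while readers hold S
locks.release("T1", "balance")
locks.release("T2", "balance")
writer_retry = locks.request("T3", "balance", "X")    # granted once readers finish
late_reader = locks.request("T1", "balance", "S")     # refused while X is held
```

Two readers share the item, the writer waits for them, and once the writer holds the X lock even a reader is refused.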
9. One-phase Locking Protocol
In this method, each transaction locks an item before use and
releases the lock as soon as it has finished using it.
This locking method provides for maximum concurrency but
does not always enforce serializability.
10. Two-phase Locking Protocol(2PL)
The two-phase locking protocol is often described by dividing the execution of a
transaction into three parts.
In the first part, when the transaction starts executing, it requests
the locks it requires.
In the second part, the transaction acquires all the locks. The third
part starts as soon as the transaction releases its first lock.
In the third part, the transaction cannot request any new locks. It only
releases the locks it has acquired.
The first two parts together form the growing phase, and the third part is
the shrinking phase.
11. Two-phase Locking Protocol(2PL)
Growing Phase: New locks on data items may be acquired but
none can be released.
Shrinking Phase: Existing locks may be released but no new
locks can be acquired.
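The two phases can be sketched as a check inside a single transaction. This is a minimal sketch under stated assumptions: the class `TwoPhaseTransaction` and the item names are hypothetical, and conflicts with other transactions are not modelled, only the rule that no lock may be acquired after the first release.

```python
class TwoPhaseTransaction:
    """Enforces the growing/shrinking rule for one transaction."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item!r} in the shrinking phase"
            )
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True  # the growing phase is over


t1 = TwoPhaseTransaction("T1")
t1.lock("A")      # growing phase
t1.lock("B")      # growing phase
t1.unlock("A")    # shrinking phase begins here
try:
    t1.lock("C")  # violates two-phase locking
    violation_detected = False
except RuntimeError:
    violation_detected = True
```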
12. Timestamp-Based Protocol
A widely used concurrency control protocol is the
timestamp-based protocol.
The timestamp ordering protocol orders
transactions by their timestamps, in ascending
order.
An older transaction has higher priority, so it
executes first. To determine the timestamp of a transaction,
this protocol uses the system time or a logical counter.
13. Let's assume there are two transactions T1 and T2. Suppose
transaction T1 entered the system at time 007 and
transaction T2 entered the system at time 009. T1 has
the higher priority, so it executes first because it entered the
system first.
The timestamp ordering protocol also maintains the
timestamps of the last 'read' and 'write' operations on each data item.
14. Timestamp-based ordering follows three rules to enforce
serializability −
Access Rule − When two transactions try to access the same data
item simultaneously, for conflicting operations, priority is given to
the older transaction. This causes the younger transaction to wait
for the older transaction to commit first.
Late Transaction Rule − If a younger transaction has written a
data item, then an older transaction is not allowed to read or
write that data item. This rule prevents the older transaction from
committing after the younger transaction has already committed.
15. Younger Transaction Rule − A younger transaction can read
or write a data item that has already been written by an older
transaction.
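The Late Transaction Rule and the Younger Transaction Rule can be sketched with per-item read and write timestamps. This is a hedged sketch of the usual basic timestamp-ordering checks, not a definitive implementation: the names `Txn`, `Item`, `read` and `write` are hypothetical, a logical counter stands in for the system time, and a rejected operation raises an error instead of restarting the transaction.

```python
import itertools

_clock = itertools.count(1)  # assumed logical clock; smaller means older


class Txn:
    def __init__(self, name):
        self.name = name
        self.ts = next(_clock)


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far


def read(txn, item):
    # Late Transaction Rule: an older transaction may not read
    # what a younger transaction has already written.
    if txn.ts < item.write_ts:
        raise RuntimeError(f"{txn.name} is too late to read")
    item.read_ts = max(item.read_ts, txn.ts)
    return item.value


def write(txn, item, value):
    # An older transaction may not overwrite data that a younger
    # transaction has already read or written.
    if txn.ts < item.write_ts or txn.ts < item.read_ts:
        raise RuntimeError(f"{txn.name} is too late to write")
    item.write_ts = txn.ts
    item.value = value


t1, t2 = Txn("T1"), Txn("T2")   # T1 is older than T2
x = Item(10)
write(t2, x, 20)                # the younger transaction writes first
try:
    read(t1, x)                 # the older transaction arrives late
    older_rejected = False
except RuntimeError:
    older_rejected = True

t3 = Txn("T3")                  # younger than T2
younger_value = read(t3, x)     # Younger Transaction Rule: allowed
```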
16. Distributed Databases
A distributed database is a database that is not limited to one
system. It is spread over different sites, i.e., on multiple computers or
over a network of computers.
A distributed database system is located on various sites that don’t
share physical components. This may be required when a particular
database needs to be accessed by various users globally.
It needs to be managed so that to the users it looks like one single
database.
18. Features of Distributed Databases
Location independence
Distributed query processing
Distributed transaction management
Seamless integration
Transaction processing
19. Types of Distributed Databases
There are two types of distributed databases:
Homogeneous
Heterogeneous
20. Homogeneous
A homogeneous distributed database is a
network of identical databases stored
on multiple sites. The sites have the same
operating system, DDBMS, and data
structure, making them easy to manage.
Homogeneous databases allow users to
access data from each of the databases
seamlessly.
21. Heterogeneous
A heterogeneous distributed database
system is a network of two or more
databases with different types of DBMS
software, which can be stored on one or
more machines.
In this system, data can be made accessible to
several databases in the network with the
help of generic connectivity (ODBC and
JDBC).
23. Replication
In database replication, the systems
store copies of data on different
sites. If an entire database is available on
multiple sites, it is a fully redundant
database.
The advantage of database replication is
that it increases data availability on
different sites and allows for parallel
query requests to be processed.
24. Advantages of Data Replication
Reliability − In case of failure of any site, the database system continues to work
since a copy is available at another site(s).
Reduction in Network Load − Since local copies of data are available, query
processing can be done with reduced network usage.
Quicker Response − Availability of local copies of data ensures quick query
processing and consequently quick response time.
Simpler Transactions − Transactions require fewer joins of tables
located at different sites and minimal coordination across the network. Thus, they
become simpler in nature.
25. Disadvantages of Data Replication
Increased Storage Requirements − Maintaining multiple
copies of data is associated with increased storage costs. The
storage space required is in multiples of the storage required
for a centralized system.
Increased Cost and Complexity of Data Updating −
Each time a data item is updated, the update needs to be
reflected in all the copies of the data at the different sites. This
requires complex synchronization techniques and protocols.
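The trade-off between availability and update cost can be sketched with an in-memory store. This is only an illustration: the class `ReplicatedStore`, the site names and the `failed` parameter are assumptions, and every write is applied to all copies in one step, which sidesteps the synchronization protocols a real system needs.

```python
class ReplicatedStore:
    """Keeps one full copy of the data at every site."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}

    def write(self, key, value):
        # Cost of replication: every copy has to be updated.
        for copy in self.sites.values():
            copy[key] = value

    def read(self, key, site, failed=()):
        # Benefit of replication: read locally if possible,
        # otherwise fall back to any site that is still up.
        candidates = [site] + [s for s in self.sites if s != site]
        for name in candidates:
            if name not in failed:
                return self.sites[name][key]
        raise RuntimeError("no live replica holds the data")


store = ReplicatedStore(["site_a", "site_b"])
store.write("balance", 500)
local_read = store.read("balance", "site_a")
read_after_failure = store.read("balance", "site_a", failed={"site_a"})
copies_updated = sum("balance" in copy for copy in store.sites.values())
```

The read still succeeds when site_a is marked as failed, while the single write had to touch both copies.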
26. Fragmentation
In fragmentation, the relations of a distributed database
are split into smaller parts called fragments. Each fragment
is stored at the site where it is required.
The advantage of fragmentation is that no copies of the data
are kept, which avoids inconsistency between copies.
27. Two Types of Data Fragmentation
Vertical Fragmentation
A vertical fragment is a subset of the attributes (columns) of a relation.
Vertical fragmentation splits tables by columns.
Horizontal Fragmentation
A horizontal fragment is a subset of the tuples (rows) of a relation.
Horizontal fragmentation splits tables by rows.
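Both kinds of fragmentation can be sketched on a small table held as a list of rows. The `accounts` data, the column names and the two helper functions are hypothetical; the vertical helper keeps the key column in every fragment, which is an assumption made so that the rows can be matched up again later.

```python
# Hypothetical relation: one dict per tuple (row).
accounts = [
    {"id": 1, "name": "Asha", "branch": "North", "balance": 500},
    {"id": 2, "name": "Ravi", "branch": "South", "balance": 300},
    {"id": 3, "name": "Meera", "branch": "North", "balance": 900},
]


def horizontal_fragment(rows, predicate):
    """Subset of rows: keep every column of the rows that match."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Subset of columns: keep the key plus the chosen columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


north_rows = horizontal_fragment(accounts, lambda r: r["branch"] == "North")
name_columns = vertical_fragment(accounts, ["name"])

print(north_rows)
print(name_columns)
```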
29. Advantages of Fragmentation
Since data is stored close to the site of usage, efficiency of
the database system is increased.
Local query optimization techniques are sufficient for most
queries since data is locally available.
Since irrelevant data is not available at the sites, security and
privacy of the database system can be maintained.
30. Disadvantages of Fragmentation
When data from different fragments are required, access
may be very slow.
In the case of recursive fragmentation, reconstruction
requires expensive techniques.
Lack of back-up copies of data in different sites may render
the database ineffective in case of failure of a site.
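The extra work needed when a query spans fragments can be sketched as a reconstruction step. The fragment contents and the helper `join_on_key` are hypothetical; in a real system each fragment would first have to be fetched from its own site, which is where the cost comes from.

```python
# Hypothetical horizontal fragments stored at two sites.
north = [{"id": 1, "branch": "North", "balance": 500}]
south = [{"id": 2, "branch": "South", "balance": 300}]

# Horizontal fragments are reconstructed with a union of the rows.
all_rows = sorted(north + south, key=lambda r: r["id"])

# Hypothetical vertical fragments that share the key column "id".
names = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
balances = [{"id": 1, "balance": 500}, {"id": 2, "balance": 300}]


def join_on_key(left, right, key="id"):
    """Vertical fragments are reconstructed with a join on the key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


rebuilt = join_on_key(names, balances)
print(all_rows)
print(rebuilt)
```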
31. Distributed DBMS - Commit Protocols
In a local database system, to commit a transaction, the
transaction manager only has to convey the decision to
commit to the recovery manager.
However, in a distributed system, the transaction manager
should convey the decision to commit to all the servers in the
various sites where the transaction is being executed and
uniformly enforce the decision.
32. When processing is complete at a site, the transaction there reaches the partially committed
state and waits for the transactions at all other sites to reach their partially
committed states.
When a site receives the message that all the sites are ready to commit, it starts to
commit. In a distributed system, either all sites commit or none of them does.
The different distributed commit protocols are −
One-phase commit
Two-phase commit
Three-phase commit
33. Distributed One-phase Commit
Distributed one-phase commit is the simplest commit
protocol.
Let us consider a controlling site and a number of
slave sites where the transaction is being executed. The steps
in distributed one-phase commit are −
After each slave has locally completed its transaction, it sends
a “DONE” message to the controlling site.
34. The slaves wait for a “Commit” or “Abort” message from the
controlling site. This waiting time is called the window of vulnerability.
When the controlling site receives a “DONE” message from each
slave, it makes a decision to commit or abort. This is called the
commit point. Then, it sends this message to all the slaves.
On receiving this message, a slave either commits or aborts and
then sends an acknowledgement message to the controlling site.
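The message flow of one-phase commit can be sketched as a trace of (sender, receiver, message) tuples. This is a simplified sketch: the function name, the slave names and the `decision` parameter are assumptions, all messages are delivered reliably, and how the controlling site reaches its decision is left out.

```python
def one_phase_commit(slaves, decision="Commit"):
    """Return the ordered messages exchanged in one-phase commit."""
    trace = []
    # Each slave reports that it has completed locally.
    for slave in slaves:
        trace.append((slave, "controller", "DONE"))
    # Commit point: the controller has heard from every slave and
    # sends its decision. Until it arrives, the slaves are waiting
    # (the window of vulnerability).
    for slave in slaves:
        trace.append(("controller", slave, decision))
    # Each slave applies the decision and acknowledges it.
    for slave in slaves:
        trace.append((slave, "controller", "ACK"))
    return trace


commit_trace = one_phase_commit(["S1", "S2"])
abort_trace = one_phase_commit(["S1", "S2"], decision="Abort")
for message in commit_trace:
    print(message)
```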
35. Distributed Two-phase Commit
Distributed two-phase commit reduces the vulnerability of
one-phase commit protocols. The steps performed in the two
phases are as follows −
Phase 1: Prepare Phase
Phase 2: Commit/Abort Phase
36. Phase 1: Prepare Phase
After each slave has locally completed its transaction, it sends a
“DONE” message to the controlling site. When the controlling site
has received “DONE” message from all slaves, it sends a “Prepare”
message to the slaves.
The slaves vote on whether they still want to commit or not. If a
slave wants to commit, it sends a “Ready” message.
A slave that does not want to commit sends a “Not Ready”
message. This may happen when the slave has conflicting
concurrent transactions or there is a timeout.
37. Phase 2: Commit/Abort Phase
After the controlling site has received “Ready” message from
all the slaves −
◦ The controlling site sends a “Global Commit” message to the slaves.
◦ The slaves apply the transaction and send a “Commit ACK” message
to the controlling site.
◦ When the controlling site receives “Commit ACK” message from all
the slaves, it considers the transaction as committed.
38. After the controlling site has received the first “Not Ready”
message from any slave −
◦ The controlling site sends a “Global Abort” message to the slaves.
◦ The slaves abort the transaction and send an “Abort ACK” message to
the controlling site.
◦ When the controlling site receives “Abort ACK” message from all
the slaves, it considers the transaction as aborted.
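The two phases can be sketched as a function that takes each slave's vote and returns the controller's decision together with the messages exchanged. This is a sketch under stated assumptions: the function name and the slave names are hypothetical, messages are never lost, and failures or timeouts of the controlling site are not modelled.

```python
def two_phase_commit(votes):
    """votes maps each slave name to "Ready" or "Not Ready"."""
    messages = []

    # Phase 1: Prepare Phase
    for slave in votes:
        messages.append((slave, "controller", "DONE"))
    for slave in votes:
        messages.append(("controller", slave, "Prepare"))
    decision = "Global Commit"
    for slave, vote in votes.items():
        messages.append((slave, "controller", vote))
        if vote == "Not Ready":
            # The first "Not Ready" vote is enough to abort.
            decision = "Global Abort"
            break

    # Phase 2: Commit/Abort Phase
    ack = "Commit ACK" if decision == "Global Commit" else "Abort ACK"
    for slave in votes:
        messages.append(("controller", slave, decision))
    for slave in votes:
        messages.append((slave, "controller", ack))
    return decision, messages


commit_decision, commit_messages = two_phase_commit(
    {"S1": "Ready", "S2": "Ready"}
)
abort_decision, abort_messages = two_phase_commit(
    {"S1": "Ready", "S2": "Not Ready"}
)
print(commit_decision, abort_decision)
```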
39. Distributed Three-phase Commit
The steps in distributed three-phase commit are as follows −
Phase 1: Prepare Phase: The steps are the same as in distributed
two-phase commit.
Phase 2: Prepare to Commit Phase: The controlling site
issues an “Enter Prepared State” broadcast message.
The slave sites vote “OK” in response.
Phase 3: Commit / Abort Phase: The steps are the same as in
two-phase commit except that the “Commit ACK”/“Abort ACK”
message is not required.
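The three phases can be sketched in the same style. This is a rough sketch, not a complete protocol: the function name and the slave names are hypothetical, and it assumes that a “Not Ready” vote in phase 1 leads straight to a global abort, a detail the steps above do not spell out.

```python
def three_phase_commit(votes):
    """votes maps each slave name to "Ready" or "Not Ready"."""
    phases = []

    # Phase 1: Prepare Phase, same voting as in two-phase commit.
    phases.append(("Prepare", dict(votes)))
    if any(vote == "Not Ready" for vote in votes.values()):
        # Assumption: skip phase 2 and abort globally.
        phases.append(("Commit/Abort", "Global Abort"))
        return "Global Abort", phases

    # Phase 2: Prepare to Commit Phase. The controller broadcasts
    # "Enter Prepared State" and every slave votes "OK".
    phases.append(("Prepare to Commit", {slave: "OK" for slave in votes}))

    # Phase 3: Commit/Abort Phase, with no ACK messages collected.
    phases.append(("Commit/Abort", "Global Commit"))
    return "Global Commit", phases


outcome, steps = three_phase_commit({"S1": "Ready", "S2": "Ready"})
aborted_outcome, aborted_steps = three_phase_commit(
    {"S1": "Not Ready", "S2": "Ready"}
)
print(outcome, aborted_outcome)
```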