Advance Database Management Systems :30
Data Placement
Prof Neeraj Bhargava
Vaibhav Khanna
Department of Computer Science
School of Engineering and Systems Sciences
Maharshi Dayanand Saraswati University Ajmer
Data Placement
• One of the most important decisions a distributed
database designer has to make is data placement.
Proper data placement is a crucial factor in determining
the success of a distributed database system.
• There are four basic alternatives:
– centralized,
– replicated,
– partitioned, and
– hybrid.
• Some of these require additional analysis to fine-tune the
placement of data.
Locality of Data Reference
• In deciding among data placement alternatives, the
following factors need to be considered:
• Locality of Data Reference. The data should be placed
at the site where it is used most often. The designer
studies the applications to identify the sites where they
are performed, and attempts to place the data in such a
way that most accesses are local.
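The locality rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the access log, the site names (site_a, site_b) and the item names are hypothetical, and a real design study would also weigh the other factors discussed in these slides.

```python
from collections import Counter

# Hypothetical access log gathered by studying the applications:
# each entry is (site where the access was issued, data item accessed).
ACCESS_LOG = [
    ("site_a", "students"), ("site_a", "students"), ("site_b", "students"),
    ("site_b", "invoices"), ("site_b", "invoices"), ("site_a", "invoices"),
    ("site_b", "invoices"),
]


def place_by_locality(access_log):
    """Place each data item at the site that accesses it most often."""
    counts = {}
    for site, item in access_log:
        counts.setdefault(item, Counter())[site] += 1
    return {item: per_site.most_common(1)[0][0] for item, per_site in counts.items()}


def local_fraction(access_log, placement):
    """Fraction of accesses that are served at the issuing site."""
    local = sum(1 for site, item in access_log if placement[item] == site)
    return local / len(access_log)


placement = place_by_locality(ACCESS_LOG)
print(placement)
print(local_fraction(ACCESS_LOG, placement))
```

With this log, "students" is placed at site_a and "invoices" at site_b, so 5 of the 7 accesses are local.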
Reliability and Availability of the Data
• Reliability of the Data. By storing multiple
copies of the data in geographically remote
sites, the designer increases the probability
that the data will be recoverable in case of
physical damage to any site.
• Data Availability. As with reliability, storing
multiple copies assures users that data items
will be available to them, even if the site from
which the items are normally accessed is
unavailable due to failure of the node or its
only link.
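The effect of extra copies on availability can be estimated with a small calculation. The sketch below assumes that sites fail independently and that each site is unreachable with the same probability; both are simplifying assumptions, not statements from the slides, and the function name is hypothetical.

```python
def availability(p_site_down, copies):
    """Probability that at least one copy is reachable.

    Assumes independent failures: all copies are unreachable at once
    with probability p_site_down ** copies.
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    if not 0.0 <= p_site_down <= 1.0:
        raise ValueError("p_site_down must be a probability")
    return 1.0 - p_site_down ** copies


for n in (1, 2, 3):
    print(n, availability(0.1, n))
```

Under these assumptions, each additional copy multiplies the chance of total unavailability by the per-site failure probability.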
Storage Capacities and Costs
• Storage Capacities and Costs. Nodes can have
different storage capacities and storage costs that must
be considered in deciding where data should be kept.
Storage costs are minimized when a single copy of each
data item is kept, but the plunging costs of data storage
make this consideration less important.
Distribution of Processing Load
• Distribution of Processing Load. One of
the reasons for choosing a distributed
system is to distribute the workload so that
processing power will be used most
effectively.
• This objective must be balanced against
locality of data reference.
Communications Costs
• Communications Costs. The designer must
consider the cost of using the
communications network to retrieve data.
• Retrieval costs and retrieval time are
minimized when each site has its own copy of
all the data.
• However, when the data is updated, the
changes must then be sent to all sites.
• If the data is very volatile, this results in high
communications costs for update
synchronization.
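The trade-off between cheap retrieval and expensive update synchronization can be made concrete with a rough message-count model. Everything here is an assumption for illustration: one message per remote read or update, a uniform message cost, and hypothetical workload figures for three sites A, B and C.

```python
def single_copy_cost(reads_by_site, updates_by_site, home_site, msg_cost=1.0):
    """Cost when one copy lives at home_site: every access from another site is remote."""
    remote_reads = sum(n for s, n in reads_by_site.items() if s != home_site)
    remote_updates = sum(n for s, n in updates_by_site.items() if s != home_site)
    return msg_cost * (remote_reads + remote_updates)


def full_replication_cost(reads_by_site, updates_by_site, msg_cost=1.0):
    """Cost when every site holds a copy: reads are local, each update goes to all other sites."""
    sites = set(reads_by_site) | set(updates_by_site)
    total_updates = sum(updates_by_site.values())
    return msg_cost * total_updates * (len(sites) - 1)


# Hypothetical workloads: same reads, different update volatility.
READS = {"A": 100, "B": 100, "C": 100}
STABLE = {"A": 5, "B": 0, "C": 0}
VOLATILE = {"A": 200, "B": 0, "C": 0}

for name, updates in (("stable", STABLE), ("volatile", VOLATILE)):
    print(name,
          single_copy_cost(READS, updates, home_site="A"),
          full_replication_cost(READS, updates))
```

In this toy model replication is far cheaper for the stable data, while for the volatile data the update traffic makes replication the more expensive choice.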
Centralized
• Centralized. This alternative consists of a single database and DBMS
stored in one location, with users distributed. There is no need for a DDBMS
or global data dictionary, because there is no real distribution of data, only
of processing.
• Retrieval costs are high, because all users, except those at the central site,
use the network for all accesses.
• Storage costs are low, since only one copy of each item is kept.
• There is no need for update synchronization, and the standard concurrency
control mechanism is sufficient.
• Reliability is low and availability is poor, because a failure at the central
node results in the loss of the entire system.
• The workload can be distributed, but remote nodes need to access the
database to perform applications, so locality of data reference is low.
• This alternative is not a true distributed database system.
Replicated
• Replicated. With this alternative, a complete copy of the database is kept at
each node.
• Advantages are maximum locality of reference, reliability, data availability,
and processing load distribution.
• Storage costs are highest in this alternative.
• Communications costs for retrievals are low, but the cost of updates is high,
since every site must receive every update.
• If updates are very infrequent, this alternative is a good one.
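A toy model of full replication shows why reads are cheap and updates are costly. This is a minimal sketch, assuming a read-one, write-all scheme with no failures and no concurrency control; the class name and the message counter are hypothetical.

```python
class ReplicatedStore:
    """Toy full replication: every node holds a complete copy of the data."""

    def __init__(self, node_names):
        self.copies = {name: {} for name in node_names}
        self.messages_sent = 0  # update messages sent to remote nodes

    def read(self, node, key):
        # Reads are served from the local copy, so no network traffic is needed.
        return self.copies[node].get(key)

    def write(self, node, key, value):
        # The update is applied locally and then sent to every other node.
        for name, copy in self.copies.items():
            copy[key] = value
            if name != node:
                self.messages_sent += 1


store = ReplicatedStore(["A", "B", "C"])
store.write("A", "x", 1)
print(store.read("C", "x"), store.messages_sent)
```

A single write at node A reaches all three copies but costs two remote messages, one per other node, whereas the read at node C costs none.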
Partitioned
• Partitioned. In this alternative there is only one copy of each data item, but
the data is distributed across nodes.
• To allow this, the database is split into disjoint fragments or parts. If the
database is a relational one, fragments can be vertical table subsets
(formed by projection) or horizontal subsets (formed by selection) of
global relations.
• In any horizontal fragmentation scheme, each tuple of every relation
must be assigned to one or more fragments such that taking the union of
the fragments results in the original relation; for the horizontally partitioned
case, a tuple is assigned to exactly one fragment.
• In a vertical fragmentation scheme, the projections must be lossless, so
that the original relations can be reconstructed by taking the join of the
fragments.
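The two fragmentation styles and their reconstruction rules can be illustrated on a small relation. The EMPLOYEE relation, its attributes and its rows are hypothetical, and relations are modelled simply as lists of dictionaries rather than with a real DBMS.

```python
# Hypothetical global relation EMPLOYEE(emp_id, name, city, salary).
EMPLOYEE = [
    {"emp_id": 1, "name": "Asha", "city": "Ajmer", "salary": 50000},
    {"emp_id": 2, "name": "Ravi", "city": "Jaipur", "salary": 62000},
    {"emp_id": 3, "name": "Meena", "city": "Ajmer", "salary": 58000},
]


def select(rows, predicate):
    """Horizontal subset: keep whole tuples that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Vertical subset: keep only the named columns of every tuple."""
    return [{c: row[c] for c in columns} for row in rows]


def natural_join(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


def as_set(rows):
    """Order-independent view of a relation, for comparing results."""
    return {tuple(sorted(row.items())) for row in rows}


# Horizontal partitioning: each tuple lands in exactly one fragment.
ajmer = select(EMPLOYEE, lambda r: r["city"] == "Ajmer")
others = select(EMPLOYEE, lambda r: r["city"] != "Ajmer")

# Vertical fragmentation: both fragments keep the key so the join is lossless.
public = project(EMPLOYEE, ["emp_id", "name", "city"])
payroll = project(EMPLOYEE, ["emp_id", "salary"])

print(as_set(ajmer + others) == as_set(EMPLOYEE))
print(as_set(natural_join(public, payroll, "emp_id")) == as_set(EMPLOYEE))
```

Keeping the key emp_id in both vertical fragments is what makes the join reconstruct the original relation; dropping it from either fragment would make the decomposition lossy.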
Hybrid
• Hybrid. In this alternative, different portions of the database are distributed
differently.
• For example, those records with high locality of reference are partitioned,
while those commonly used by all nodes are replicated, if updates are
infrequent.
• Those that are needed by all nodes, but updated so frequently that
synchronization would be a problem, might be centralized.
• This alternative is designed to optimize data placement, combining the
advantages of the other methods while avoiding as many of their
disadvantages as possible.
• However, very careful analysis of data and processing is required with this
plan.
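The hybrid rules of thumb above can be written as a simple decision function. This is a sketch, not a design method: the update-rate threshold, the catalog entries and the function name are all hypothetical, and a real analysis would use measured workloads.

```python
def choose_placement(used_by_all_sites, updates_per_hour, high_update_threshold=100):
    """Pick a placement for one portion of the database.

    used_by_all_sites: True if every node commonly needs the data,
        False if access is concentrated at one site (high locality).
    updates_per_hour: hypothetical measure of how volatile the data is.
    """
    if not used_by_all_sites:
        return "partitioned"   # high locality of reference
    if updates_per_hour < high_update_threshold:
        return "replicated"    # widely read, rarely updated
    return "centralized"       # widely used but too volatile to synchronize


# Hypothetical catalog of database portions and their access profiles.
CATALOG = {
    "branch_orders": (False, 500),
    "product_list": (True, 2),
    "stock_levels": (True, 5000),
}

plan = {name: choose_placement(*profile) for name, profile in CATALOG.items()}
print(plan)
```

The single threshold stands in for the careful analysis the slide calls for; in practice the cut-off depends on network cost, number of sites and consistency requirements.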
Assignment
• Compare and contrast the various alternatives for data placement.
