Venkat Java Projects
Mobile:+91 9966499110 Visit:www.venkatjavaprojects.com
Email:venkatjavaprojects@gmail.com
Scalable and Adaptive Data Replica Placement for Geo-Distributed
Cloud Storages
Abstract:
In geo-distributed cloud storage systems, data replication is widely used to provide high data
reliability and availability to a growing number of users around the world. Optimizing data
replica placement has become a fundamental problem, since placement determines both the
inter-node traffic and the system overhead of accessing associated data items. In the big data
era, traditional solutions face long running times and large overheads as the number of data
items grows and user requests vary over time. Therefore, novel offline community discovery and
online community adjustment schemes are proposed to solve the replica placement problem in a
scalable and adaptive way. The offline scheme finds a replica placement solution based on the
average read/write rates over a given period of time. Scalability is achieved because 1) the
computational complexity is linear in the number of data items and 2) the data-node communities
can evolve in parallel, yielding a distributed replica placement. Furthermore, the online scheme
adapts to bursty data requests without completely overriding the existing replica placement.
Driven by real-world data traces, extensive performance evaluations demonstrate the
effectiveness of our design in handling large-scale datasets.
Index Terms—Geo-distributed storage system, data replica placement, scalability, adaptivity,
community discovery
Existing System:
Apart from the inter-node traffic, the storage locations of data replicas may also affect the system
overhead of accessing associated data items [4], [5]. It is worth noting that users may request
multiple data items in one transaction. For example, in online analytical processing (OLAP)
systems, a query may be executed by accessing multiple data blocks [6]. The system overhead
can be reduced if fewer storage nodes are involved in handling such a request, because each
storage node to which a read request is dispatched introduces a certain overhead, e.g., the
establishment of a TCP connection. In short, a good data replica placement reduces the system
overhead by placing associated data items together in the same storage location. With the
increasing number of data items, how to choose the proper number and storage locations of data
replicas becomes a critical issue.
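
The effect of co-locating associated data items can be illustrated with a minimal Java sketch. It is not taken from the paper: the class name TransactionOverhead, the method nodesContacted, and the item and node names are hypothetical, and it assumes each data item is read from exactly one storage node.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: counts how many storage nodes one transaction touches.
final class TransactionOverhead {

    /**
     * Returns the number of distinct storage nodes that must be contacted
     * to read all items of a transaction, assuming each item is served by
     * the single node recorded for it in the placement map.
     */
    static int nodesContacted(List<String> items, Map<String, String> placement) {
        Set<String> nodes = new HashSet<>();
        for (String item : items) {
            String node = placement.get(item);
            if (node == null) {
                throw new IllegalArgumentException("No placement for item: " + item);
            }
            nodes.add(node);
        }
        return nodes.size();
    }

    public static void main(String[] args) {
        List<String> query = List.of("blockA", "blockB", "blockC");
        // Assumed placements: one scatters the blocks, one co-locates them.
        Map<String, String> scattered =
                Map.of("blockA", "node1", "blockB", "node2", "blockC", "node3");
        Map<String, String> colocated =
                Map.of("blockA", "node1", "blockB", "node1", "blockC", "node1");
        System.out.println(nodesContacted(query, scattered)); // prints 3
        System.out.println(nodesContacted(query, colocated)); // prints 1
    }
}
```

Under these assumptions, the co-located placement needs one connection instead of three for the same query, which is the per-node overhead the text refers to.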
Various data replica placement schemes have been proposed to seek optimal data storage
locations, which are typically implemented in a centralized/offline way: At every distributed
storage node handling the user requests, the data access logs are captured. Then, a central
controller is deployed to collect all logs and analyze the request frequency of each data item. The
extracted information is fed into the replica placement algorithms, e.g., mathematical
programming [8] and graph partitioning [5], [7], [9], which finally output the storage locations of
data replicas. These centralized/offline schemes can iteratively approximate the optimal solutions
with high accuracy.
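
The log-collection step of such a centralized scheme might look like the following Java sketch. It is an illustration only: the class CentralLogAnalyzer, the method requestFrequency, and the representation of an access log as a list of requested item identifiers per node are assumptions, and the placement algorithm that would consume the frequencies is not shown.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical central controller step: merge per-node access logs
// into one request-frequency table keyed by data item.
final class CentralLogAnalyzer {

    /**
     * @param logsByNode maps a storage node id to the item ids requested
     *                   at that node (one list entry per request)
     * @return the total number of requests observed for each data item
     */
    static Map<String, Integer> requestFrequency(Map<String, List<String>> logsByNode) {
        Map<String, Integer> frequency = new HashMap<>();
        for (List<String> log : logsByNode.values()) {
            for (String item : log) {
                // Add 1 to the running count, starting from 1 if absent.
                frequency.merge(item, 1, Integer::sum);
            }
        }
        return frequency;
    }
}
```

Because every log entry passes through the single controller, the cost of this step grows with the total number of requests across all nodes.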
Proposed System:
In this paper, based on overlapping community discovery and adjustment, we design scalable and
adaptive data replica placement schemes for geo-distributed cloud storage systems. A data-node
community is defined as the group consisting of a storage node and all data items placed at it,
which should have more internal data access requests than external ones. Therefore, a more
compact community structure means that more data requests are served locally, with lower system
overhead and less inter-node traffic. Unlike traditional centralized placement schemes,
communities can evolve in a parallel and adaptive way to decide whether each data replica should
be placed at the node. The scalability of our design is achieved by this distributed
implementation together with a computational complexity that is linear in the number of data
items. Our major contributions in this paper include:
• A novel distributed overlapping community discovery scheme is proposed to solve the data
replica placement problem in a scalable way. This offline scheme can find a replica placement
solution based on the average read/write rates for a certain time period.
• Guided by the offline scheme, an online community adjustment scheme is proposed to
adaptively handle the bursty requests.
• The worst-case performance guarantees of the proposed schemes are provided via theoretical
analysis.
• Extensive evaluation results driven by real-world data traces show the superiority of our design
over the state-of-the-art replica placement methods.
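
The data-node community criterion described above (more internal than external data access requests) can be sketched in Java as follows. This is a simplified reading, not the paper's algorithm: the class DataNodeCommunity and its methods are hypothetical, and it assumes a request counts as internal when the requested item is already placed at the node that receives it.

```java
import java.util.List;
import java.util.Set;

// Hypothetical check of the "more internal than external requests" criterion
// for a single storage node and the data items placed at it.
final class DataNodeCommunity {

    /** Counts requests whose item is stored at this node (served locally). */
    static long internalRequests(Set<String> placedItems, List<String> requestedItems) {
        return requestedItems.stream().filter(placedItems::contains).count();
    }

    /**
     * Returns true when the node serves strictly more requests locally than
     * it must forward to other nodes, under the assumption stated above.
     */
    static boolean isCompact(Set<String> placedItems, List<String> requestedItems) {
        long internal = internalRequests(placedItems, requestedItems);
        long external = requestedItems.size() - internal;
        return internal > external;
    }
}
```

Each node can evaluate this check using only its own placed items and its own incoming requests, which is what allows communities to be evaluated in parallel without a central controller.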
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
• Processor : i3.
• Hard Disk : 40 GB.
• RAM : 2 GB.
SOFTWARE REQUIREMENTS:
• Operating System : Windows.
• Coding Language : Java/J2EE.
• Database : MySQL.
• IDE : NetBeans 8.1.
