Data Replication and 
Synchronization Tool 
Ashish Sharma 
Pradeeban Kathiravelu 
Introduction 
• Data is huge 
• Consumers often share a subset of
data with others.
– Pointers to data, actually. 
• Medical data is structured in 
hierarchies. 
Motivation 
• Creating and sharing pointers to
interesting subsets of data.
• Data Sharing Synchronization 
System 
– Fault-tolerant. 
– In-Memory. 
• Generic, while targeting
medical images and metadata.
– The Cancer Imaging Archive (TCIA)
Solution Architecture 
• Users create, share, and update 
replica sets from a data source. 
• Infinispan In-Memory Data Grid 
(version 6.0.2) to store the replica 
sets. 
Fig 1. Deployment Architecture 
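A minimal Java sketch of how a replica set could be held in the grid, under stated assumptions: the replica set stores only identifiers (pointers) at each level of the data hierarchy, and a `ConcurrentHashMap` stands in for the Infinispan cache (Infinispan's `Cache` interface extends `ConcurrentMap`, so the `put`/`get` calls look the same). The names `Level`, `ReplicaSet`, `ReplicaSetDemo`, the four hierarchy levels, and the identifier strings are hypothetical and not taken from the tool.

```java
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hierarchy levels; these four names are an assumption for illustration. */
enum Level { COLLECTION, PATIENT, STUDY, SERIES }

/** Hypothetical replica set: identifiers per hierarchy level, not the data itself. */
final class ReplicaSet {
    private final Map<Level, Set<String>> pointers = new EnumMap<>(Level.class);

    /** Adds a pointer to one node of the hierarchy; returns this for chaining. */
    ReplicaSet add(Level level, String id) {
        pointers.computeIfAbsent(level, k -> new LinkedHashSet<String>()).add(id);
        return this;
    }

    /** Returns a copy of the pointers held at one level (empty if none). */
    Set<String> at(Level level) {
        Set<String> ids = pointers.get(level);
        return ids == null ? new LinkedHashSet<String>() : new LinkedHashSet<String>(ids);
    }
}

final class ReplicaSetDemo {
    /** Stand-in for the data grid: replica set ID -> replica set. */
    static final ConcurrentMap<String, ReplicaSet> GRID = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        ReplicaSet rs = new ReplicaSet()
                .add(Level.COLLECTION, "collection-A") // made-up identifiers
                .add(Level.SERIES, "series-42");
        GRID.put("rs-1", rs);                          // share by handing out "rs-1"
        System.out.println(GRID.get("rs-1").at(Level.SERIES));
    }
}
```

Sharing a replica set then amounts to sharing its ID; the images themselves stay at the data source.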
Execution Flow 
• Publisher-Consumer API to consume 
the replica sets and Data Provider API 
to communicate with the data 
source. 
Fig 2. Execution Flow
Design 
Fig 3. Back-end Class Hierarchy 
• DataProSpecs API
– createReplicaSet
– getReplicaSet
– updateReplicaSet
– duplicateReplicaSet
– deleteReplicaSet
– getRawData
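The six operations above can be sketched as a Java interface with a small in-memory implementation. Only the method names come from the slide; the parameter and return types, the class `InMemoryDataPro`, and the `dataSource` function that resolves a pointer to raw data are assumptions made for illustration, and a `ConcurrentHashMap` again stands in for the data grid.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Method names come from the slide; signatures are assumptions. */
interface DataProSpecs {
    String createReplicaSet(Set<String> pointers);
    Set<String> getReplicaSet(String id);
    boolean updateReplicaSet(String id, Set<String> pointers);
    String duplicateReplicaSet(String id);
    boolean deleteReplicaSet(String id);
    List<String> getRawData(String id);
}

/** Hypothetical in-memory implementation; the map stands in for the data grid. */
final class InMemoryDataPro implements DataProSpecs {
    private final ConcurrentMap<String, Set<String>> grid = new ConcurrentHashMap<>();
    private final Function<String, String> dataSource; // resolves one pointer to raw data

    InMemoryDataPro(Function<String, String> dataSource) {
        this.dataSource = dataSource;
    }

    public String createReplicaSet(Set<String> pointers) {
        String id = UUID.randomUUID().toString();
        grid.put(id, new LinkedHashSet<String>(pointers));
        return id;
    }

    public Set<String> getReplicaSet(String id) {
        Set<String> p = grid.get(id);
        return p == null ? null : new LinkedHashSet<String>(p); // defensive copy
    }

    public boolean updateReplicaSet(String id, Set<String> pointers) {
        // replace() only succeeds when the ID already exists
        return grid.replace(id, new LinkedHashSet<String>(pointers)) != null;
    }

    public String duplicateReplicaSet(String id) {
        Set<String> p = grid.get(id);
        return p == null ? null : createReplicaSet(p); // same pointers, fresh ID
    }

    public boolean deleteReplicaSet(String id) {
        return grid.remove(id) != null;
    }

    public List<String> getRawData(String id) {
        List<String> raw = new ArrayList<>();
        Set<String> p = grid.get(id);
        if (p != null) {
            for (String pointer : p) {
                raw.add(dataSource.apply(pointer)); // fetched from the source on demand
            }
        }
        return raw;
    }
}
```

In this sketch a duplicate is an independent copy: updating the original afterwards does not change the duplicate, which is one plausible reading of duplicateReplicaSet rather than a confirmed behavior of the tool.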
Extensibility 
• Not tightly coupled to the technology. 
– Other data-grids 
• Hazelcast, Terracotta BigMemory,
Oracle Coherence
– Persistence
• Integration with SQL or NoSQL
solutions such as MongoDB.
– Data sources other than TCIA.
What does Infinispan offer?
• High performance and scalability.
• Fault tolerance
– Multiple nodes with TCP/IP- or
multicast-based JGroups clustering
configurations.
• Distributed execution.
– Optimized for a single node as a local
cache as well as for multi-node
execution.
• MapReduce Framework. 
Thank you!
