Software-Defined Data Services:
Interoperable and Network-Aware Big Data Executions
Pradeeban Kathiravelu, Peter Van Roy, Luís Veiga
5th IEEE International Conference on Software Defined Systems (SDS 2018).
Barcelona, Spain. 24/04/2018.
Introduction
● Big data with increasing volume and variety.
– Volume requires scalability.
– Variety requires interoperability.
● Data Services
– Services that access and process big data.
– Unified web service interface to data → Interoperability!
● Chaining of data services.
– Composing chains of numerous data services.
– Data Access → Data cleaning → Data Integration.
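A minimal sketch of the chaining idea, assuming each data service can be modelled as a function from a list of records to a list of records. The service names (`access`, `clean`, `integrate`), the record fields, and the `chain` helper are hypothetical stand-ins for web service calls and are not taken from the talk.

```python
from functools import reduce
from typing import Callable

Record = dict
DataService = Callable[[list], list]


def access(_: list) -> list:
    # Hypothetical data access service: returns raw records from two stores.
    return [
        {"id": 1, "name": " Alice ", "source": "sql"},
        {"id": 1, "name": "alice", "source": "nosql"},
        {"id": 2, "name": None, "source": "sql"},
    ]


def clean(records: list) -> list:
    # Hypothetical cleaning service: drop records without a name, normalize the rest.
    return [{**r, "name": r["name"].strip().lower()} for r in records if r["name"]]


def integrate(records: list) -> list:
    # Hypothetical integration service: merge records that share an id.
    merged = {}
    for r in records:
        entry = merged.setdefault(
            r["id"], {"id": r["id"], "name": r["name"], "sources": []}
        )
        entry["sources"].append(r["source"])
    return list(merged.values())


def chain(*services: DataService) -> DataService:
    # Compose services left to right: the output of one feeds the next.
    def run(records: list) -> list:
        return reduce(lambda acc, service: service(acc), services, records)
    return run


pipeline = chain(access, clean, integrate)
result = pipeline([])
```

Because every stage shares the same interface, stages can be reordered or replaced without touching the others, which is the interoperability argument in miniature.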
Problem Statement
● Data services offer interoperability.
● But when related data and services are distributed far from each other → poor performance at scale.
– How to scale out efficiently?
● How to minimize communication overheads?
Software-Defined Data Services (SDDS)
Motivation
● Software-Defined Networking (SDN).
– A unified controller to the data plane devices.
– Brings network awareness to the applications.
● To make big data executions
– Interoperable.
– Network-aware.
Our Proposal
● Can we bring SDN to the data services?
● Software-Defined Data Services (SDDS).
Contributions
● SDDS as a generic approach for data services.
– Extending and leveraging SDN in the data centers.
● A software-defined framework for data services.
– Efficient performance and management of data services.
– Interoperability and scalability.
Solution Architecture
● A bottom-up approach, extending SDN.
– Data Plane (SDN OpenFlow Switches)
– Storage Plane (SQL and NoSQL data stores)
– Control Plane (SDN Controller, In-Memory Data Grids (IMDGs), ..)
– Execution Plane (Orchestrator and Web Service Engines)
Network-Aware Service Executions with SDN
SDDS Planes and Layered Architecture
SDDS Approach
● Define all the data operations as interoperable services.
● SDN for distributing data and service executions
– Inside a data center (e.g. Software-Defined Data Centers).
– Beyond data centers (extend SDN with Message-Oriented Middleware).
● Optimal placement of data and service execution.
– Minimize communication overhead and data movements.
● Keep the related data and executions closer.
● Send the execution to data, rather than data to execution.
– Execute data service on the best-fit server, until interrupted.
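The "send the execution to data" rule can be illustrated with a small sketch. It assumes the sizes and current locations of the related data objects are known, and it picks the server that already holds the most bytes, so the least data has to move. The function name `best_fit_server` and the sample sizes are hypothetical; the talk does not specify the selection rule at this level of detail.

```python
def best_fit_server(object_sizes, object_location, servers):
    """Return (server, bytes_to_move) for the server holding the most related data.

    object_sizes:    {object_id: size_in_bytes}
    object_location: {object_id: server currently storing the object}
    servers:         candidate execution servers (must include every location)
    """
    total = sum(object_sizes.values())
    local_bytes = {server: 0 for server in servers}
    for obj, size in object_sizes.items():
        local_bytes[object_location[obj]] += size
    # The server with the most local bytes needs the fewest bytes shipped to it.
    best = max(servers, key=lambda server: local_bytes[server])
    return best, total - local_bytes[best]


sizes = {"a": 500, "b": 300, "c": 200}
location = {"a": "s1", "b": "s1", "c": "s2"}
server, moved = best_fit_server(sizes, location, ["s1", "s2", "s3"])
```

Here executing on `s1` moves only object `c`, whereas executing on `s3` would move all three objects.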
Efficient Data and Execution Placement
{i, j} – related data objects
D – datasets of interest
n – execution node
Σ – spread of the related data objects
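The placement formula itself did not survive in this transcript, so the following is only one plausible reading of the symbols above, not the authors' definition. It assumes the spread Σ for a candidate execution node n is the sum of hop counts from n to the nodes holding the related data objects, and it picks the n with the smallest spread. The topology dictionary is a hand-written stand-in for what an SDN controller could expose; all names are hypothetical.

```python
from collections import deque


def hop_counts(topology, start):
    # Breadth-first search: number of hops from start to every reachable node.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in topology[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def spread(topology, n, holders):
    # Assumed definition: total hop distance from node n to each data holder.
    dist = hop_counts(topology, n)
    return sum(dist[holder] for holder in holders)


def pick_execution_node(topology, candidates, holders):
    # Choose the candidate with the smallest spread (first one wins on ties).
    return min(candidates, key=lambda n: spread(topology, n, holders))


# Two switches (sw1, sw2) and three hosts (h1, h2, h3); links are bidirectional.
topology = {
    "sw1": ["h1", "h2", "sw2"],
    "sw2": ["h3", "sw1"],
    "h1": ["sw1"],
    "h2": ["sw1"],
    "h3": ["sw2"],
}
holders = ["h1", "h2"]  # hosts storing the related data objects i and j
node = pick_execution_node(topology, ["h1", "h2", "h3"], holders)
```

With this topology, `h1` and `h2` both have a spread of 2 while `h3` has a spread of 6, so the execution stays on the switch where the related data lives.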
Prototype Implementation
● Data services implemented with web service engines.
– Apache Axis2 1.7.0 and Apache CXF 3.2.1.
● IMDG clusters – Hazelcast 3.9.2 and Infinispan 9.1.5.
● Persistent storage – MySQL Server and MongoDB.
● Core SDN Controller – OpenDaylight Beryllium.
Evaluation Environment
● A cluster of 6 servers.
– AMD A10-8700P Radeon R6, 10 Compute Cores 4C+6G × 4.
– 8 GB of memory.
– Ubuntu 16.04 LTS 64 bit operating system.
– 1 TB disk space.
Evaluation
● How does SDDS perform as a network-aware big data execution, compared to a network-agnostic execution?
– SDDS vs data services on top of Infinispan IMDG.
– A data storage and update service
● with an increasing volume of persistent data across the cluster
● up to a total of 6 TB data.
● Measured the throughput from the service plane
– by the total amount of data processed through the data services per unit time.
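The throughput metric described above amounts to a simple ratio. The sketch below assumes the result is reported in megabytes per second; the unit and the function name are illustrative choices, not taken from the talk.

```python
def throughput_mb_per_s(bytes_processed: int, elapsed_seconds: float) -> float:
    # Throughput = total data processed through the data services / elapsed time.
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_processed / 1_000_000 / elapsed_seconds


# Illustrative numbers only: 500 MB processed in 10 seconds.
example = throughput_mb_per_s(500_000_000, 10)
```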
Evaluation
● SDDS outperforms the baseline.
– Better data locality
● by distributing data in adherence to the network topology.
– Better resource efficiency
● by avoiding scaling out prematurely.
– Better throughput with minimal distribution when there is no need to use all 6 servers.
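To make "avoiding scaling out prematurely" concrete, here is a deliberately simple capacity-based rule. It is an assumption for illustration, not the policy used by SDDS: use only as many servers as the data volume requires, given a per-server capacity. The defaults mirror the evaluation cluster (6 servers, 1 TB of disk each).

```python
import math


def servers_needed(volume_tb: float, capacity_tb: float = 1.0, cluster_size: int = 6) -> int:
    # Smallest number of servers whose combined capacity holds the data volume.
    needed = max(1, math.ceil(volume_tb / capacity_tb))
    if needed > cluster_size:
        raise ValueError("data volume exceeds cluster capacity")
    return needed


small = servers_needed(0.4)   # fits on a single server
medium = servers_needed(2.5)  # spills onto a third server
full = servers_needed(6.0)    # uses the whole cluster
```

A network-agnostic placement would spread even the small case across all six servers and pay communication costs that this rule avoids.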
Related Work
● Software-Defined Systems.
– Software-Defined Service Composition.
– Software-Defined Cyber-Physical Systems and SDIoT.
● Industrial SDDS offerings.
– Many of them storage focused.
● PureStorage, PrimaryIO, HPE, RedHat, ..
– Many focus on specific data services.
● Containers and devops – Atlantix and Portworx.
● Data copying and sharing – IBM Spectrum Copy Data Management and Catalogic ECX.
● We are the first to propose a generic SDDS framework.
Conclusion
Summary
● Software-Defined Data Services (SDDS) offer both interoperability and scalability to big data executions.
● SDDS leverages SDN in building a software-defined framework for network-aware executions.
● SDDS caters to data services and compositions of data services for an efficient execution.
Future Work
● Extend SDDS for edge and IoT/CPS environments.
Thank you! Questions?

Software-Defined Data Services: Interoperable and Network-Aware Big Data Executions (Best Paper Award: SDS-2018)
