D4Science Scientific Data Infrastructure: promoting interoperability by embracing the value of the differences
Pasquale Pagano [email_address]
Networking session, September 2010, Brussels (Belgium)
www.d4science.eu

Assumptions

Consolidated facts:
  • Very rich applications and data collections are currently maintained by a multitude of authoritative providers
  • Different problems require different execution paradigms: batch, map-reduce, synchronous call, message-queue, …
  • Key distributed computation technologies exist: grid (gLite and Globus), distributed resource management (Condor), clusters (Hadoop), …
  • Several standards are adopted in the same domain

Societal observations: a rich variety of protocols, models, and formats
  • creates barriers in the usage of resources
  • dramatically delays new exploitation patterns

Technical observations:
  • Heterogeneity of protocols, models, and formats increases load
  • Load increases failures

D4Science Vision

D4Science objectives:
  • hide heterogeneity, i.e. abstract over differences in location, protocol, and model;
  • embrace heterogeneity, i.e. allow for multiple locations, protocols, and models.

Technical goals:
  • no bottlenecks: scale no less than the interfaced resources
  • no outages: keep failures partial and temporary
  • autonomicity: the system reacts and recovers

Hiding Heterogeneity [1/2]

D4Science is an ecosystem of e-infrastructures where:
  • various communities cohabit while maintaining their peculiarities and policies,
  • sharing resources and reusing services from other domains is feasible and affordable.

Hiding Heterogeneity [2/2]

D4Science approach:
  • Heterogeneous resources are virtually accessible in a common ecosystem of resources, regardless of their locations, technologies, and protocols
  • Different communities have access to different views, according to the conditions under which the sharing can occur
  • Each community can define its own virtual research environment to satisfy specific needs for a limited timeframe and at no cost for the providers of the resources
  • Several virtual research environments can coexist without interfering with each other, even when competing for the same resources

Embracing Heterogeneity: Interoperability Approaches

Approaches and solutions to achieve interoperability:
  • Blackboard-based: asynchronous communication between components in a system; one protocol to read/write and one language to specify messages
  • Wrapper/Mediator-based: translates one interface for a component into a compatible interface
  • Proxy-based: exposes the same interface but allows additional operations over received calls
  • Adaptor-based: provides a unified interface to a set of other components' interfaces and encapsulates how this set of objects interacts
  • Broker-based: specialises an Adaptor by coordinating communication
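
The wrapper/mediator, proxy, and adaptor approaches listed above can be sketched in a few lines of Python. This is only an illustration of the general patterns, not gCube code: every class and method name here (RecordSource, LegacyCatalogue, CatalogueWrapper, CountingProxy, FederatedAdaptor) is hypothetical, and the "<source>:<local id>" identifier scheme is an assumption made for the example.

```python
from typing import Dict, Protocol


class RecordSource(Protocol):
    """Interface the client code expects (hypothetical)."""

    def fetch(self, record_id: str) -> dict: ...


class LegacyCatalogue:
    """A third-party component with an incompatible interface (hypothetical)."""

    def lookup(self, key: int) -> tuple:
        return (key, f"title-{key}")


class CatalogueWrapper:
    """Wrapper/mediator: translates the LegacyCatalogue interface into RecordSource."""

    def __init__(self, legacy: LegacyCatalogue) -> None:
        self._legacy = legacy

    def fetch(self, record_id: str) -> dict:
        key, title = self._legacy.lookup(int(record_id))
        return {"id": str(key), "title": title}


class CountingProxy:
    """Proxy: exposes the same interface and adds an operation on each received call."""

    def __init__(self, target: RecordSource) -> None:
        self._target = target
        self.calls = 0

    def fetch(self, record_id: str) -> dict:
        self.calls += 1  # the additional operation: count the call
        return self._target.fetch(record_id)


class FederatedAdaptor:
    """Adaptor: one unified interface over a set of other components."""

    def __init__(self, sources: Dict[str, RecordSource]) -> None:
        self._sources = sources

    def fetch(self, record_id: str) -> dict:
        # Assumption: record ids look like "<source>:<local id>".
        source_name, local_id = record_id.split(":", 1)
        return self._sources[source_name].fetch(local_id)


proxy = CountingProxy(CatalogueWrapper(LegacyCatalogue()))
federation = FederatedAdaptor({"legacy": proxy})
record = federation.fetch("legacy:7")
```

A broker would extend FederatedAdaptor by also coordinating which source is contacted and in what order; the blackboard approach is different in kind, since components exchange messages through a shared store rather than calling each other.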

Embracing Heterogeneity: Data Representation, Discovery, and Access

D4Science offers:
  • Open transformation service framework
      • Extendible with specific source-target mediators
      • To use for metadata and data crosswalk transformations
      • Tailored for statistical, geospatial, temporal, and textual data
  • Rich set of reference data
      • Extendible with domain-specific reference data
      • To reuse in services for data curation and harmonization
  • Support for geospatial services
      • To capture, manage, analyze, and display all forms of data that can be geographically referenced
  • Integrated resources registry
      • Format agnostic
      • To support discovery and access
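
To make the idea of a transformation framework "extendible with specific source-target mediators" concrete, here is a minimal sketch, assuming a registry keyed by (source format, target format). The TransformationRegistry class, the "local" schema, and the sample record are all invented for illustration; only the Dublin Core element names (title, creator, date) come from a real standard.

```python
from typing import Callable, Dict, Tuple

# A mediator converts one record from a source format to a target format.
Mediator = Callable[[dict], dict]


class TransformationRegistry:
    """Hypothetical registry of source-target mediators."""

    def __init__(self) -> None:
        self._mediators: Dict[Tuple[str, str], Mediator] = {}

    def register(self, source: str, target: str, mediator: Mediator) -> None:
        self._mediators[(source, target)] = mediator

    def transform(self, record: dict, source: str, target: str) -> dict:
        if source == target:
            return dict(record)
        try:
            mediator = self._mediators[(source, target)]
        except KeyError:
            raise LookupError(f"no mediator registered for {source} -> {target}") from None
        return mediator(record)


def local_to_dc(record: dict) -> dict:
    """Crosswalk from a made-up 'local' schema to Dublin Core element names."""
    return {
        "title": record["name"],
        "creator": record["author"],
        "date": record["year"],
    }


registry = TransformationRegistry()
registry.register("local", "dc", local_to_dc)

dc_record = registry.transform(
    {"name": "Example dataset", "author": "A. Author", "year": "2010"},
    source="local",
    target="dc",
)
```

Adding support for a new format then means registering one more mediator, without touching the services that call transform.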

Embracing Heterogeneity: Process Execution [1/2]

D4Science offers solutions to:
  • Decouple the business-domain and infrastructure-specific logic from the core “execution” functionality
  • Invoke a wide range of logic components: SOAP and REST Web Services, shell scripts, executable binaries, POJOs, …
  • Support most of the execution paradigms: batch, map-reduce, synchronous call
  • Bridge key distributed computation technologies: grid (gLite and Globus), Condor, Hadoop
  • Control and monitor the execution of a processing flow
  • Stage data among different storage providers
  • Stream data among computation elements
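
One way to picture the decoupling of the "execution" core from the logic components it invokes is a single invocation interface with one implementation per component kind. The sketch below is an assumption-laden illustration, not the D4Science execution engine: Invocable, FunctionComponent, CommandComponent, and run_flow are hypothetical names, and the external executable is simulated by launching the current Python interpreter.

```python
import subprocess
import sys
from typing import Callable, List, Protocol


class Invocable(Protocol):
    """Uniform interface seen by the execution core (hypothetical)."""

    def invoke(self, payload: str) -> str: ...


class FunctionComponent:
    """An in-process object, the Python analogue of a POJO."""

    def __init__(self, func: Callable[[str], str]) -> None:
        self._func = func

    def invoke(self, payload: str) -> str:
        return self._func(payload)


class CommandComponent:
    """An external executable that reads stdin and writes stdout."""

    def __init__(self, argv: List[str]) -> None:
        self._argv = argv

    def invoke(self, payload: str) -> str:
        done = subprocess.run(
            self._argv, input=payload, capture_output=True, text=True, check=True
        )
        return done.stdout.strip()


def run_flow(steps: List[Invocable], payload: str) -> str:
    """Core execution logic: knows nothing about how each step is implemented."""
    for step in steps:
        payload = step.invoke(payload)
    return payload


# Stand-in for a shell script or binary: a one-line Python program.
upper_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
flow = [FunctionComponent(str.strip), CommandComponent(upper_cmd)]
result = run_flow(flow, "  tuna  ")
```

A SOAP or REST component would be one more class with the same invoke method, which is what keeps run_flow independent of the technology behind each step.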

Embracing Heterogeneity: Process Execution [2/2]

Adaptors that operate on a specific third-party language and translate its constructs into native constructs allow for the creation of complex workflows that exploit several diverse technologies deployed on different infrastructures.
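
As a rough sketch of what translating a third-party workflow language into native constructs can look like, the example below adapts a made-up task description into native step objects and runs them in dependency order. The input format, NativeStep, adapt_foreign_workflow, and execute are all hypothetical; real languages such as JDL, JSDL, or WS-BPEL are far richer than this toy dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class NativeStep:
    """Native construct understood by the (hypothetical) execution engine."""

    name: str
    action: Callable[[Dict[str, int]], int]
    depends_on: List[str] = field(default_factory=list)


def adapt_foreign_workflow(foreign: dict) -> List[NativeStep]:
    """Adaptor: translate the made-up third-party description into native steps."""
    operations = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    steps = []
    for task in foreign["tasks"]:
        op = operations[task["op"]]
        inputs = task["inputs"]

        # String inputs name earlier tasks; other inputs are literal values.
        def action(results, op=op, inputs=inputs):
            values = [results[i] if isinstance(i, str) else i for i in inputs]
            return op(*values)

        deps = [i for i in inputs if isinstance(i, str)]
        steps.append(NativeStep(task["id"], action, deps))
    return steps


def execute(steps: List[NativeStep]) -> Dict[str, int]:
    """Run steps whose dependencies are satisfied until none are left."""
    results: Dict[str, int] = {}
    pending = list(steps)
    while pending:
        ready = [s for s in pending if all(d in results for d in s.depends_on)]
        if not ready:
            raise ValueError("cyclic or unsatisfied dependencies")
        for step in ready:
            results[step.name] = step.action(results)
            pending.remove(step)
    return results


foreign_workflow = {
    "tasks": [
        {"id": "total", "op": "mul", "inputs": ["sum", 10]},
        {"id": "sum", "op": "add", "inputs": [2, 3]},
    ]
}
results = execute(adapt_foreign_workflow(foreign_workflow))
```

Because the engine only ever sees NativeStep objects, supporting another workflow language would mean writing another adaptor function rather than changing execute.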

Conclusions

Facts:
  • Very rich services and data collections are currently maintained by a multitude of authoritative providers
  • Several standards are adopted in the same domain
  • Interoperability approaches are key to exploiting such richness

D4Science offers a variety of patterns, tools, and solutions to deliver interoperability solutions and interconnect:
  • heterogeneous digital content
  • heterogeneous repository systems
  • heterogeneous computation platforms

in order to:
  • decrease the cost of adoption
  • reduce the time to market of new ideas
  • deal with the plethora of standards

Supported Standards

  • WS-*
  • WSRF
  • WS-BPEL
  • JDL
  • JSDL
  • Glue Schema (part)
  • X-*
  • DC, TEI, ISO, etc.
  • JSR (several)
  • GSI-Security
  • XACML
  • SAML
  • OpenSearch
  • OGC related

https://quality.wiki.d4science.research-infrastructures.eu/quality/index.php/Standards

Comply with: OAI-PMH, OAI-ORE

Supported Standards

  • WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)
  • JSR: 168 (Simple Portlets), 286 (168 update), 160 (JMX)
  • WSN Specifications: WS-BaseNotification, WS-Topics (WS-BrokeredNotification)
  • WS-* Standards: SOAP, WSDL, WS-Addressing
  • ISO: ISO 3166 (countries), ISO 4217 (currencies), ISO 19115 (geo-location)
  • X-*: XML, XSD, XSL, XSLT, XPath, XQuery
  • OGC: Web Coverage Processing Service, Web Coverage Service, Web Feature Service, Web Map Context, Web Map Service, Web Map Tile Service, Web Processing Service, Web Service Common
  • OGF Standard: Glue Schema (2)
  • …

Comply with: OAI-PMH, OAI-ORE
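
The standards lists above mention compliance with OAI-PMH. As a small illustration of what a harvesting client sends under that protocol, the sketch below builds a ListRecords request URL. The verb and the metadataPrefix, set, and resumptionToken arguments are part of OAI-PMH; the base URL, the set name, and the helper function list_records_url are placeholders invented for the example and do not refer to an actual D4Science endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent alone with the verb to fetch the next page.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, for illustration only.
first_page = list_records_url("https://repository.example.org/oai", set_spec="fisheries")
next_page = list_records_url("https://repository.example.org/oai", resumption_token="abc123")
```

The response to such a request is an XML document, which is where the X-* standards from the same list (XML, XSD, XSLT, XPath) come into play on the client side.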

Thanks

www.gcube-system.org
www.d4science.eu

Pasquale Pagano, D4Science-II Technical Director [email_address]
Donatella Castelli, D4Science-II Project Director [email_address]
Jessica Michel Assoumou, D4Science-II Administrative and Financial Director [email_address]

D4Science is powered by the open-source gCube framework


D4 science scientific data infrastructure promoting interoperability by embracing the value of the differences (ICT2010 Networking Session)


Editor's Notes

  • #12:
      • WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)
      • JSR: 168 (Simple Portlets), 286 (168 update), 160 (JMX)
      • WSN Specifications: WS-BaseNotification, WS-Topics (WS-BrokeredNotification)
      • WS-* Standards: SOAP, WSDL, WS-Addressing
      • ISO: ISO 3166 (countries), ISO 4217 (currencies), ISO 19115 (geo-location)
      • X-*: XML, XSD, XSL, XSLT, XPath, XQuery
      • Other: WSRP, OpenGIS KML
      • OGF Standard: Glue Schema (2)
    eXtensible Access Control Markup Language (XACML) is an XML specification for writing access control policies and for how to interpret them.
    Security Assertion Markup Language (SAML) is an XML specification defining the syntax and processing semantics of security assertions.