D4Science Scientific Data Infrastructure: promoting interoperability by embracing the value of the differences
Pasquale Pagano [email_address]
Networking session, September 2010, Brussels (Belgium)
www.d4science.eu

Assumptions
  Consolidated facts:
    • Very rich applications and data collections are currently maintained by a multitude of authoritative providers
    • Different problems require different execution paradigms: batch, map-reduce, synchronous call, message queue, …
    • Key distributed computation technologies exist: grid (gLite and Globus), distributed resource management (Condor), clusters (Hadoop), …
    • Several standards are adopted in the same domain
  Societal observations: a rich variety of protocols, models, and formats
    • creates barriers to the use of resources
    • dramatically delays new exploitation patterns
  Technical observations:
    • heterogeneity of protocols, models, and formats increases load
    • load increases failures

D4Science Vision
  D4Science objectives:
    • hide heterogeneity, i.e. abstract over differences in location, protocol, and model
    • embrace heterogeneity, i.e. allow for multiple locations, protocols, and models
  Technical goals:
    • no bottlenecks: scale no less than the interfaced resources
    • no outages: keep failures partial and temporary
    • autonomicity: the system reacts and recovers

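The "no outages" and "autonomicity" goals can be illustrated with a minimal sketch. It is not taken from gCube: the function query_all, the provider names, and the retry policy are all assumptions made for illustration. It queries several providers, retries a provider that fails, and returns whatever results were collected together with the names of the providers that stayed unavailable, so that one failing source does not take the whole request down.

```python
from typing import Callable, Dict, List, Tuple


def query_all(providers: Dict[str, Callable[[], List[str]]],
              retries: int = 1) -> Tuple[List[str], List[str]]:
    """Query every provider; return (results, names of providers that failed)."""
    results: List[str] = []
    failed: List[str] = []
    for name, fetch in providers.items():
        for attempt in range(retries + 1):
            try:
                results.extend(fetch())
                break  # this provider answered, move on to the next one
            except Exception:
                if attempt == retries:
                    failed.append(name)  # failure stays local to this provider
    return results, failed


def healthy() -> List[str]:
    return ["record-a", "record-b"]


def broken() -> List[str]:
    raise ConnectionError("provider unreachable")


class Flaky:
    """Hypothetical provider that fails once and then recovers."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self) -> List[str]:
        self.calls += 1
        if self.calls == 1:
            raise TimeoutError("temporary failure")
        return ["record-c"]


results, failed = query_all({"repo-1": healthy, "repo-2": broken, "repo-3": Flaky()})
print(results)  # ['record-a', 'record-b', 'record-c']
print(failed)   # ['repo-2']
```

The failure of repo-2 is partial (the other two providers still answer) and the failure of repo-3 is temporary (it succeeds on the retry).
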
Hiding Heterogeneity [1/2]
  D4Science is an ecosystem of e-infrastructures where:
    • various communities cohabit while maintaining their peculiarities and policies
    • sharing resources and reusing services from other domains is feasible and affordable

Hiding Heterogeneity [2/2]
  D4Science approach:
    • Heterogeneous resources are virtually accessible in a common ecosystem of resources despite their locations, technologies, and protocols
    • Different communities have access to different views according to the conditions under which the sharing can occur
    • Each community can define its own virtual research environment to satisfy specific needs for a limited timeframe and at no cost for the providers of the resources
    • Several virtual research environments can coexist without interfering with each other, even when competing for the same resources

Embracing Heterogeneity: Interoperability Approaches
  Approaches and solutions to achieve interoperability:
    • Blackboard-based: asynchronous communication between components in a system; one protocol to read/write and one language to specify messages
    • Wrapper/Mediator-based: translates one interface for a component into a compatible interface
    • Proxy-based: exposes the same interface but allows additional operations over received calls
    • Adaptor-based: provides a unified interface to a set of other components' interfaces and encapsulates how this set of objects interacts
    • Broker-based: specialises an Adaptor by coordinating communication

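Three of the approaches listed above (wrapper/mediator, proxy, adaptor) can be sketched in a few lines. This is only an illustration under stated assumptions: every class name here (LegacyCatalogue, CatalogueWrapper, LoggingProxy, InMemoryCatalogue, CatalogueAdaptor) and the get_record interface are hypothetical and are not part of D4Science or gCube.

```python
class LegacyCatalogue:
    """Hypothetical third-party component with its own interface."""

    def lookup(self, key: str) -> dict:
        return {"id": key, "title": f"Dataset {key}"}


class CatalogueWrapper:
    """Wrapper/mediator: translates LegacyCatalogue.lookup into get_record."""

    def __init__(self, legacy: LegacyCatalogue) -> None:
        self._legacy = legacy

    def get_record(self, record_id: str) -> dict:
        return self._legacy.lookup(record_id)


class LoggingProxy:
    """Proxy: same get_record interface, plus an extra operation (a call log)."""

    def __init__(self, target) -> None:
        self._target = target
        self.calls = []

    def get_record(self, record_id: str) -> dict:
        self.calls.append(record_id)
        return self._target.get_record(record_id)


class InMemoryCatalogue:
    """Hypothetical local component that already speaks get_record."""

    def __init__(self, records: dict) -> None:
        self._records = records

    def get_record(self, record_id: str) -> dict:
        return self._records[record_id]  # raises KeyError when missing


class CatalogueAdaptor:
    """Adaptor: one interface over a set of catalogues; hides how they are consulted."""

    def __init__(self, catalogues: list) -> None:
        self._catalogues = catalogues

    def get_record(self, record_id: str) -> dict:
        for catalogue in self._catalogues:
            try:
                return catalogue.get_record(record_id)
            except KeyError:
                continue  # try the next catalogue
        raise KeyError(record_id)


local = InMemoryCatalogue({"x1": {"id": "x1", "title": "Local dataset"}})
proxy = LoggingProxy(CatalogueWrapper(LegacyCatalogue()))
unified = CatalogueAdaptor([local, proxy])

print(unified.get_record("x1"))  # served by the local catalogue
print(unified.get_record("z9"))  # falls through to the wrapped legacy catalogue
print(proxy.calls)               # ['z9']
```

A broker would extend the adaptor by also deciding which component should receive each call and coordinating the exchange between them.
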
Embracing Heterogeneity: Data Representation, Discovery, and Access
  D4Science offers:
    • Open transformation service framework
        - extendible with specific source-target mediators
        - used for metadata and data crosswalk transformations
        - tailored for statistical, geospatial, temporal, and textual data
    • Rich set of reference data
        - extendible with domain-specific reference data
        - reusable in services for data curation and harmonization
    • Support for geospatial services
        - to capture, manage, analyze, and display all forms of data that can be geographically referenced
    • Integrated resources registry
        - format agnostic
        - supports discovery and access

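The idea of a transformation framework that is extended with source-target mediators can be sketched as a small registry. This is an assumption-laden illustration, not the D4Science transformation service: TransformationRegistry, simple_to_dc, and the "simple" input schema are invented for the example; only the Dublin Core element names (title, creator, date) are real.

```python
from typing import Callable, Dict, Tuple

Mediator = Callable[[dict], dict]


class TransformationRegistry:
    """Maps (source format, target format) pairs to mediator functions."""

    def __init__(self) -> None:
        self._mediators: Dict[Tuple[str, str], Mediator] = {}

    def register(self, source: str, target: str, mediator: Mediator) -> None:
        self._mediators[(source, target)] = mediator

    def transform(self, record: dict, source: str, target: str) -> dict:
        try:
            mediator = self._mediators[(source, target)]
        except KeyError:
            raise LookupError(f"no mediator registered for {source} -> {target}") from None
        return mediator(record)


def simple_to_dc(record: dict) -> dict:
    """Hypothetical crosswalk from a made-up 'simple' schema to Dublin Core elements."""
    return {
        "dc:title": record["name"],
        "dc:creator": record["owner"],
        "dc:date": record["year"],
    }


registry = TransformationRegistry()
registry.register("simple", "dc", simple_to_dc)

dc_record = registry.transform(
    {"name": "Sample dataset", "owner": "Example Org", "year": "2010"},
    source="simple",
    target="dc",
)
print(dc_record)
```

New formats are supported by registering another mediator; the registry itself does not need to know anything about the formats involved.
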
Embracing Heterogeneity: Process Execution [1/2]
  D4Science offers solutions to:
    • decouple the business-domain and infrastructure-specific logic from the core "execution" functionality
    • invoke a wide range of logic components: SOAP and REST web services, shell scripts, executable binaries, POJOs, …
    • support most execution paradigms: batch, map-reduce, synchronous call
    • bridge key distributed computation technologies: grid (gLite and Globus), Condor, Hadoop
    • control and monitor the execution of a processing flow
    • stage data among different storage providers
    • stream data among computation elements

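Decoupling the "execution" core from the kind of component being invoked can be sketched as follows. The names ShellComponent, CallableComponent, and execute_flow are hypothetical, and the sketch covers only two component kinds (an external executable and an in-process function) run as a synchronous sequence; it does not represent the actual gCube execution engine.

```python
import subprocess
import sys
from typing import Callable, List


class ShellComponent:
    """Runs an external executable, passing data on stdin and reading stdout."""

    def __init__(self, argv: List[str]) -> None:
        self.argv = argv

    def run(self, data: str) -> str:
        done = subprocess.run(self.argv, input=data, capture_output=True,
                              text=True, check=True)
        return done.stdout


class CallableComponent:
    """Runs an in-process function behind the same run() interface."""

    def __init__(self, func: Callable[[str], str]) -> None:
        self.func = func

    def run(self, data: str) -> str:
        return self.func(data)


def execute_flow(components: list, data: str, log: List[str]) -> str:
    """Core execution loop: it knows only run(), not what each component is."""
    for step, component in enumerate(components, start=1):
        data = component.run(data)
        log.append(f"step {step}: {type(component).__name__} ok")  # monitoring hook
    return data


# The external step uses the current Python interpreter as a portable stand-in
# for an arbitrary executable binary or shell script.
upper = ShellComponent(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
)
exclaim = CallableComponent(lambda text: text + "!")

log: List[str] = []
result = execute_flow([upper, exclaim], "hello", log)
print(result)  # HELLO!
print(log)
```
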
Embracing Heterogeneity: Process Execution [2/2]
  Adaptors that operate on a specific third-party language and translate it into native constructs allow for the creation of complex workflows that exploit several diverse technologies deployed on different infrastructures.

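As a rough sketch of what such an adaptor does, the example below translates a made-up workflow notation into a native structure (a dependency graph) and derives an execution order from it. The "task <- dependency" notation and the function names are assumptions for illustration only; real adaptors would target languages such as those listed under Supported Standards. The ordering uses graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parse_toy_workflow(text: str) -> Dict[str, Set[str]]:
    """Translate the made-up 'task <- dep1, dep2' notation into {task: {deps}}."""
    graph: Dict[str, Set[str]] = {}
    for line in text.strip().splitlines():
        task, _, deps = line.partition("<-")
        graph[task.strip()] = {d.strip() for d in deps.split(",") if d.strip()}
    return graph


def execution_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Native construct: an ordering in which every task follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


workflow = """
fetch
convert <- fetch
index <- convert
"""

plan = parse_toy_workflow(workflow)
print(execution_order(plan))  # ['fetch', 'convert', 'index']
```

Once the workflow is in the native form, each task can be dispatched to whichever infrastructure is able to run it.
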
Conclusions
  Facts:
    • Very rich services and data collections are currently maintained by a multitude of authoritative providers
    • Several standards are adopted in the same domain
    • Interoperability approaches are key to exploiting such richness
  D4Science offers a variety of patterns, tools, and solutions to deliver interoperability and interconnect:
    • heterogeneous digital content
    • heterogeneous repository systems
    • heterogeneous computation platforms
  in order to:
    • decrease the cost of adoption
    • reduce the time to market of new ideas
    • deal with the plethora of standards

Supported Standards
    • WS-*
    • WSRF
    • WS-BPEL
    • JDL
    • JSDL
    • Glue Schema (part)
    • X-*
    • DC, TEI, ISO, etc.
    • JSR (several)
    • GSI-Security
    • XACML
    • SAML
    • OpenSearch
    • OGC related
  https://quality.wiki.d4science.research-infrastructures.eu/quality/index.php/Standards
  Comply with: OAI-PMH, OAI-ORE

Supported Standards (details)
    • WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)
    • JSR: 168 (Simple Portlets), 286 (168 update), 160 (JMX)
    • WSN Specifications: WS-BaseNotification, WS-Topics (WS-BrokeredNotification)
    • WS-* Standards: SOAP, WSDL, WS-Addressing
    • ISO: ISO 3166 (countries), ISO 4217 (currencies), ISO 19115 (geo-location)
    • X-*: XML, XSD, XSL, XSLT, XPath, XQuery
    • OGC: Web Coverage Processing Service, Web Coverage Service, Web Feature Service, Web Map Context, Web Map Service, Web Map Tile Service, Web Processing Service, Web Service Common
    • OGF Standard: Glue Schema (2)
    • …
  Comply with: OAI-PMH, OAI-ORE

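For readers unfamiliar with OAI-PMH, one of the protocols listed above, a harvesting request is an HTTP request whose query string carries a verb and its arguments. The sketch below only builds such a URL; the helper name list_records_url and the base URL repository.example.org are placeholders and not D4Science endpoints. The verb ListRecords, the metadataPrefix and set arguments, and the oai_dc prefix come from the OAI-PMH 2.0 specification.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


url = list_records_url("https://repository.example.org/oai")
print(url)
# https://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```
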
Thanks
  www.gcube-system.org
  www.d4science.eu
    • Pasquale Pagano, D4Science-II Technical Director [email_address]
    • Donatella Castelli, D4Science-II Project Director [email_address]
    • Jessica Michel Assoumou, D4Science-II Administrative and Financial Director [email_address]
  D4Science is powered by the open-source gCube framework


Editor's Notes

  • #12 (Supported Standards, details):
      - WSRF Specifications: WS-ResourceProperties (WSRF-RP), WS-ResourceLifetime (WSRF-RL), WS-ServiceGroup (WSRF-SG), WS-BaseFaults (WSRF-BF)
      - JSR: 168 (Simple Portlets), 286 (168 update), 160 (JMX)
      - WSN Specifications: WS-BaseNotification, WS-Topics (WS-BrokeredNotification)
      - WS-* Standards: SOAP, WSDL, WS-Addressing
      - ISO: ISO 3166 (countries), ISO 4217 (currencies), ISO 19115 (geo-location)
      - X-*: XML, XSD, XSL, XSLT, XPath, XQuery
      - Other: WSRP, OpenGIS KML
      - OGF Standard: Glue Schema (2)
    eXtensible Access Control Markup Language (XACML) is an XML specification for writing access control policies and for how to interpret them.
    Security Assertion Markup Language (SAML) is an XML specification that defines the syntax and processing semantics of security assertions.