ECCB10
TT11
29 September 2010
Rafael Jimenez
rafael@ebi.ac.uk
EnCORE
presentation
Data integration in Proteomics through
EnVision and EnCore Web Services
ENFIN Network of Excellence
• Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology
• Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’
• 20 partners in 13 countries
• www.enfin.org
EnCore
• ENFIN Platform to enable mining data across various domains,
sources, formats and types
• Integrates database resources and analysis tools across different
disciplines
EnXML
EnCORE services
EnVISION pages
EnXML
User
input output
SOAP
Web Services
Web Client
Interface
Standard
exchange format
Diverse service world
SOAP, REST,
Java API, Perl
API, FTP,
GUI, …
External data sources
Results
Access interfaces
User
Integration?
• Multiple manual connections
• Multiple technologies
• Multiple result files which have to be combined manually
• Much work to reproduce
XML, CSV,
Plain Text,
JSON, …
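The manual combination step described on this slide can be sketched as follows. This is a minimal illustration, assuming two hypothetical services: one returning CSV and one returning JSON. The column names, keys and payload values are placeholders, not real service output.

```python
import csv
import io
import json


def from_csv(text):
    """Parse CSV with hypothetical 'accession' and 'pathway' columns."""
    return [
        {"accession": row["accession"], "annotation": row["pathway"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Parse a JSON list of hypothetical {'id', 'location'} objects."""
    return [
        {"accession": item["id"], "annotation": item["location"]}
        for item in json.loads(text)
    ]


# Placeholder payloads, invented only to show the two formats.
csv_payload = "accession,pathway\nP37173,placeholder pathway\n"
json_payload = '[{"id": "P37173", "location": "placeholder location"}]'

# Each source needs its own parser before the results can be combined.
records = from_csv(csv_payload) + from_json(json_payload)
for record in records:
    print(record)
```

Every additional source format needs another hand-written parser of this kind, which is the effort a single exchange format is meant to remove.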
Standardized EnCORE world
Heterogeneous
external world
Standardised
EnCORE world
EnXML
External data sources
EnCORE services
EnVISION pages
API, WS access
Standard EnXML format
User
input output
EnCORE services
From Inputs to Outputs
Positive Negative
Results / EnXML
Web Services
Protein identifications
• pride
Microarray probe mapping to Uniprot
• probe2uniprot
Protein Identifier Cross Reference Service
• picr
Biological Pathways
• reactome
• kegg
Microarray experiments
• arrayExpress
Protein sequence information
• uniprot
• uniprot2proteinAnnotations
Biological models
• biomodels
Gene ontology enrichment
• gGost
Cellular location
• cellmint
Protein domain analysis
• domaination
Protein function prediction
• funcnet
Molecular interactions
• uniprot2molecularInteractions
Protein
• Database IDs
• Sequences
• Experiment: Identifies the result
• Sets: Contains the structure of the result
• Molecules: Includes the results
• Features: Describe details of the result
Input / EnXML
EnXML structure
• Experiment: Identifies the result
• Set: Contains the structure of the result
• Molecule: Includes the results
• Feature: Describes details of the result
EnXML Schema
http://www.enfin.org/encore/schema/v1_2_5/doc/
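The four EnXML building blocks can be pictured as plain data classes. This is only a sketch: the class and field names are assumptions made for illustration, and the schema documentation at the URL above remains the authoritative definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Molecule:
    """One result item, e.g. a protein accession (hypothetical fields)."""
    id: int
    accession: str


@dataclass
class MoleculeSet:
    """Structure of the result: groups molecules under one or more names."""
    id: int
    names: List[str]
    molecule_ids: List[int]


@dataclass
class Feature:
    """Extra detail attached to a molecule."""
    molecule_id: int
    description: str


@dataclass
class EnxmlResult:
    """Container identified by an experiment ID."""
    experiment: str
    sets: List[MoleculeSet] = field(default_factory=list)
    molecules: List[Molecule] = field(default_factory=list)
    features: List[Feature] = field(default_factory=list)


# One set and one molecule, using identifiers quoted in this deck.
result = EnxmlResult(
    experiment="ID4",
    sets=[MoleculeSet(id=1, names=["EBI-296235"], molecule_ids=[1])],
    molecules=[Molecule(id=1, accession="O35613")],
)
print(result)
```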
EnCORE services
Example
Positive Negative
Input/Query
Output/Results
Program/Service
EnCORE dataset
EnCORE
results
EnCORE webservice
• Encore webservice
uniprot2molecularInteractions
• Database ID (Uniprot ID)
P37173
• Experiment: ID4
• Sets: (1) EBI-296235, (2) EBI-1033040, (3) EBI-902913, EBI-902937, (4) EBI-296166, EBI-296246, (5) EBI-902913
• Molecules: (1) O35613, (2) P10600, (3) P07200, (4) Q9UER7, (5) Q99K41
• Features: No features
EnCore services
Example (Result on a table)
#   Interactor A   Interactor B   Interaction IDs
1   P37173         O35613         EBI-296235
2   P37173         P10600         EBI-1033040
3   P37173         P07200         EBI-902913, EBI-902937
4   P37173         Q9UER7         EBI-296166, EBI-296246
5   P37173         Q99K41         EBI-902913
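The step from the EnXML-style result (sets and molecules) to the table view can be sketched like this. The identifiers are the ones shown in the example; the dictionary layout and the function name `to_rows` are assumptions made for illustration, not the EnCORE data model.

```python
# Query accession and result values taken from the example in this deck.
query = "P37173"
sets = {
    1: ["EBI-296235"],
    2: ["EBI-1033040"],
    3: ["EBI-902913", "EBI-902937"],
    4: ["EBI-296166", "EBI-296246"],
    5: ["EBI-902913"],
}
molecules = {1: "O35613", 2: "P10600", 3: "P07200", 4: "Q9UER7", 5: "Q99K41"}


def to_rows(query, sets, molecules):
    """Pair each numbered set with the molecule carrying the same number."""
    return [
        (number, query, molecules[number], ", ".join(sets[number]))
        for number in sorted(sets)
    ]


rows = to_rows(query, sets, molecules)
for row in rows:
    print("{:<4}{:<15}{:<15}{}".format(*row))
```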
Input/Query
Output/Results
Program/Service
Enfin-IntAct
P37173
EnCore services
Building workflows
Legend: Input, Result, Positive result, Negative result, Webservice, Input selection
EnCore Services
Taverna
…
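A workflow of chained services can be sketched as below. It assumes that a "positive" result means the service returned something for an input and a "negative" result means it returned nothing, and that positive results feed the next step. Both services are local stubs with placeholder behaviour, not calls to real EnCORE web services or to Taverna.

```python
from typing import Callable, Dict, List, Tuple

Service = Callable[[str], List[str]]


def run_service(service: Service, inputs: List[str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split inputs into positive (with hits) and negative (no hits)."""
    positive: Dict[str, List[str]] = {}
    negative: List[str] = []
    for item in inputs:
        hits = service(item)
        if hits:
            positive[item] = hits
        else:
            negative.append(item)
    return positive, negative


def run_workflow(steps: List[Tuple[str, Service]], inputs: List[str]):
    """Run the steps in order; hits of one step become inputs of the next."""
    report = []
    current = list(inputs)
    for name, service in steps:
        positive, negative = run_service(service, current)
        report.append((name, positive, negative))
        # Keep distinct hits in first-seen order as the next input selection.
        current = list(dict.fromkeys(h for hits in positive.values() for h in hits))
    return report


def stub_interactions(accession: str) -> List[str]:
    """Stub: interactors for one accession quoted in this deck."""
    return {"P37173": ["O35613", "P10600"]}.get(accession, [])


def stub_annotations(accession: str) -> List[str]:
    """Stub: placeholder annotation for one accession only."""
    return {"O35613": ["placeholder-annotation"]}.get(accession, [])


report = run_workflow(
    [("interactions", stub_interactions), ("annotations", stub_annotations)],
    ["P37173", "UNKNOWN_ID"],
)
for name, positive, negative in report:
    print(name, positive, negative)
```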
EnVision interface
Input Form
Default
workflow
BioModels
CellMint
IntAct
Reactome
PICR
Pride
Query
P07200, Q99K41, P37173, P37023, Q13131, A3QNQ0, Q9Y6C2, P98170, A2AI38, Q8CGZ0, Q13287, Q8WTW2, P61812
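Before a query list like the one above is sent to any service, it is useful to split and check it. The sketch below uses the accession pattern published in the UniProt help pages; the function name `parse_query` and the split on commas and whitespace are assumptions, not a description of what the EnVision input form does.

```python
import re

# UniProt accession number pattern as given in the UniProt help pages.
UNIPROT_ACC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def parse_query(text):
    """Split on commas/whitespace and separate valid from invalid tokens."""
    tokens = [t.upper() for t in re.split(r"[,\s]+", text) if t]
    valid = [t for t in tokens if UNIPROT_ACC.fullmatch(t)]
    invalid = [t for t in tokens if not UNIPROT_ACC.fullmatch(t)]
    return valid, invalid


query = (
    "P07200,Q99K41,P37173,P37023,Q13131,A3QNQ0,Q9Y6C2,P98170,A2AI38,"
    "Q8CGZ0,Q13287,Q8WTW2,P61812"
)
valid, invalid = parse_query(query)
print(len(valid), "valid;", len(invalid), "invalid")
```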
EnVision2 interface
• Results for Pride, Uniprot, Intact, Reactome, CellMint, PICR, Biomodels, …
Results per service
EnVision2 Pathways result
Positive results
Negative results
EnVision2, molecular interactions
http://www.enfin.org/
Download results
EnCore
Adapting EnCORE to Standards and Federation
Molecular Biology Database resources
[Pie chart: ~1440 resources by category]
• Human Genes and Diseases: 14%
• Proteomics Resources (20): 0%
• Other Molecular Biology Databases: 3%
• Immunological databases: 2%
• Plant databases: 8%
• Organelle databases: 2%
• Human and other Vertebrate Genomes: 8%
• Nucleotide Sequence Databases: 9%
• RNA sequence databases
• Protein sequence databases
• Structure Databases: 9%
• Genomics Databases (non-vertebrate)
• Metabolic and Signaling Pathways: 9%
Source: MY Galperin, GR Cochrane. Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009. Nucleic Acids Research, 2008.
Traditional EnCore approach
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
New EnCore approach
Standards and Federation
Domain 1
External data sources
Federated systems / Standards
EnVISION pages
WS
WS
Web interface
EnCORE wrapper
New EnCore approach
Standards and Federation
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
New EnCore approach
Standards and Federation
• Less development
• More sources
• Data integration per domain
• Comparable results
• Automatic inclusion of new data sources
• More stable formats
• Validation
• Extra value to the original data
New role for EnCore and EnVision
Extra value to the original data
• Integration of sources
– Clustering results
– Data analysis
• Interconnect results
• More visualization
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
Centralization VS Federation
DB
GUI
API
WS
DB DB DB
SP SP SP SP
Federation
Legend: DB = Database, GUI = Graphical User Interface, SP = Standard protocol, User
DB
GUI
API
WS
Centralized database
A AA A
A A A A
Standards and Federation in EnCORE
Service
broker
Service
consumer
Service
provider
Service
Contract
...
...
Interact
Publish
Find
Service Oriented Architecture
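The publish, find and interact roles in this diagram can be sketched with an in-memory broker. Everything here is hypothetical: the class `ServiceBroker`, the provider functions and the domain label stand in for a real registry and real remote services.

```python
from typing import Callable, Dict, List

Endpoint = Callable[[str], List[str]]


class ServiceBroker:
    """Minimal registry: providers publish, consumers find."""

    def __init__(self) -> None:
        self._registry: Dict[str, Dict[str, Endpoint]] = {}

    def publish(self, domain: str, name: str, endpoint: Endpoint) -> None:
        self._registry.setdefault(domain, {})[name] = endpoint

    def find(self, domain: str) -> Dict[str, Endpoint]:
        return dict(self._registry.get(domain, {}))


def provider_a(accession: str) -> List[str]:
    """Stub provider that knows one accession from this deck."""
    return ["EBI-296235"] if accession == "P37173" else []


def provider_b(accession: str) -> List[str]:
    """Stub provider with no data."""
    return []


# Providers publish their services under a domain.
broker = ServiceBroker()
broker.publish("molecular interactions", "provider_a", provider_a)
broker.publish("molecular interactions", "provider_b", provider_b)

# The consumer finds every service in the domain, then interacts with each.
results = {
    name: endpoint("P37173")
    for name, endpoint in broker.find("molecular interactions").items()
}
print(results)
```

Because the consumer only talks to the broker and to a shared call signature, a newly published provider is picked up without any change on the consumer side.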
DAS & PSICQUIC
Implementation
...
...
...
Registry
DAS Clients
Annotation
sources
Annotation
sources
Annotation
sources
DAS Clients
Clients
Protocol
Dowell et al. The Distributed Annotation System. BMC Bioinformatics. 2001; 2: 7. Published online 2001 October 10.
DAS, how it works
illustration
DAS Registry - list of protein annotation sources
EnCore DAS service
for protein sequence annotations
Protein DAS
annotation sources
Protein DAS
annotation sources
Experiment
Set
Molecule
Feature
Uniprot DAS
reference source
Uniprot DAS
annotation source
Protein
information
Protein feature
information
Protein DAS
annotation sources
Protein DAS
annotation sources
Protein DAS
annotation sources
• Service:
• Name: uniprot2proteinannotations
• URL: http://www.ebi.ac.uk/enfin-srv/encore/uniprot2proteinannotations/service
• Input: List of Uniprot Acc numbers
• Options: DAS Sources to query
• Direct input (DAS feature URL) [0,*]
• Registry LABEL [0,1]
• Registry source URI (DS_XXX) [0,*]
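The "Direct input (DAS feature URL)" option refers to the DAS `features` command, which takes the sequence identifier as a `segment` parameter. A sketch of how such a URL is assembled follows; the server address and source name are hypothetical, and no request is sent.

```python
from urllib.parse import quote, urlencode


def das_features_url(server, source, accession):
    """Build a DAS 'features' request URL for one protein accession.

    server: base URL ending in the DAS prefix (hypothetical here).
    source: data source name on that server (hypothetical here).
    """
    base = server.rstrip("/")
    return f"{base}/{quote(source)}/features?{urlencode({'segment': accession})}"


url = das_features_url("http://example.org/das", "uniprot", "P37173")
print(url)
```

A list of UniProt accessions, as accepted by the service above, would translate into one such URL per accession and per selected DAS source.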
PSICQUIC, how it works
PSICQUIC PSICQUIC PSICQUIC
Sample
Observation error
Interaction databases
Publications
PSICQUIC services
Annotation error
User
PSICQUIC
Registry
PSICQUIC client
PSICQUIC
Registry
• 13 sources
• 14,665,530 interactions
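Each PSICQUIC service answers MIQL queries over REST and can return tab-delimited MITAB 2.5 lines. The sketch below builds a query URL and parses one line. The base URL is a placeholder, the `/search/query/` path is the REST pattern as I understand it from the PSICQUIC documentation, and the sample line is assembled from identifiers in this deck with every other column left as "-".

```python
from urllib.parse import quote

# The 15 columns of MITAB 2.5, in order.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def psicquic_query_url(service_base, miql):
    """Build a REST query URL; service_base is a hypothetical endpoint."""
    return f"{service_base.rstrip('/')}/search/query/{quote(miql, safe='')}"


def parse_mitab25(line):
    """Map one MITAB 2.5 line onto the column names above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError("expected at least 15 tab-separated columns")
    return dict(zip(MITAB25_COLUMNS, fields))


url = psicquic_query_url(
    "http://example.org/psicquic/webservices/current", "identifier:P37173"
)

# Sample line: only the interactor and interaction IDs are filled in.
sample = "\t".join(
    ["uniprotkb:P37173", "uniprotkb:O35613"] + ["-"] * 11 + ["intact:EBI-296235", "-"]
)
record = parse_mitab25(sample)
print(url)
print(record["id_a"], record["id_b"], record["interaction_id"])
```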
EnVision2, molecular interactions
PSICQUIC client
http://www.enfin.org/
PSICQUIC
registry
Query
Clustering
Scores
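The clustering step can be pictured as grouping records from several sources by their unordered interactor pair. The sketch below counts supporting sources as a naive score. This is an assumption made for illustration, not the scoring used by EnVision2; the source names and the second interaction ID are placeholders.

```python
from collections import defaultdict


def cluster_interactions(records):
    """Group (source, a, b, interaction_id) records by unordered pair."""
    clusters = defaultdict(lambda: {"sources": set(), "ids": set()})
    for source, a, b, interaction_id in records:
        key = tuple(sorted((a, b)))  # A-B and B-A fall into the same cluster
        clusters[key]["sources"].add(source)
        clusters[key]["ids"].add(interaction_id)
    return {
        key: {
            "sources": sorted(value["sources"]),
            "ids": sorted(value["ids"]),
            "score": len(value["sources"]),  # naive: number of sources
        }
        for key, value in clusters.items()
    }


records = [
    ("source_a", "P37173", "O35613", "EBI-296235"),
    ("source_b", "O35613", "P37173", "placeholder-id-1"),
    ("source_a", "P37173", "P10600", "EBI-1033040"),
]
clustered = cluster_interactions(records)
for pair, summary in clustered.items():
    print(pair, summary)
```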
Data integration in Proteomics through EnVision and EnCore Web Services
Documentation
http://code.google.com/p/enfin-core
EnCore APIs
Example: Score distribution across several databases
http://code.google.com/p/enfin-core
http://www.enfin.org
Thank you!
Questions?
ENFIN partners:
• Pascal Kahlem (project coordinator)
• Bernd Brandt (IBIVU)
• Christine Orengo (UCL)
• Andrew Clegg (UCL)
• Ioannis Xenarios (SIB)
• Heinz Stockinger (SIB)
• Jaak Vilo (QURETEC)
• Jüri Reimand (QURETEC)
• Gianni Cesareni (UNITOR)
• Arnaud Ceol (UNITOR)
• James Procter (UNIVDUN)
• Ana Rojas Mendoza (CNIO)

Editor's Notes

  • #5: We are exposed to a very diverse service world
  • #7: This is a generic example of how an EnCORE service works.
  • #10: A specific example. The query is a protein accession (Acc). We run the IntAct service and get the interaction results, expressed in EnXML terminology.
  • #11: The same results in a table
  • #12: EnCORE facilitates building workflows
  • #15: EnVISION is an EnCORE interface. With just one click, the user can run different services and get a quick overview of a dataset. This example shows results for …
  • #16: Here is an example of the potential of EnVISION. In this example we used a dataset of more than 300 protein accessions. In this screenshot, EnVISION found more than 500 pathways for this dataset. EnVISION can link and display positive results in a pathway map.
  • #20: EnVISION results are nice, but do not forget our initial integration problem. For one domain (protein interactions, pathways, protein sequences …) we might have several databases providing data.
  • #21: EnCORE provides a great solution, but it is not complete if it cannot include more resources. It is not feasible for EnCORE to develop and maintain so many wrappers. Nonetheless, EnCORE can overcome this problem by using standards and federated systems.
  • #25: Data producers have good reasons to keep their own databases. However, together we have to think about ways to share our data and make it easily available to users. Federation provides an easy way to integrate data resources, and it is fully compatible with database providers continuing to work with their own database structure, GUI, ...
  • #40: Integration of biological data of various types and development of adapted bioinformatics tools are critical objectives for enabling research at the systems level. The European Network of Excellence ENFIN is developing an adapted infrastructure to connect databases and platforms, enabling both the generation of new bioinformatics tools and the experimental validation of computational predictions. Beyond the use of common standards to format individual datasets, there is a need for sophisticated informatics platforms that enable mining data across various domains, sources, formats and types. The aim of the EnCORE project is to integrate, across different disciplines, an extensive list of database resources and analysis tools in a computationally accessible and extensible manner, facilitating automated data retrieval and processing with a special focus on systems biology. The EnCORE platform is available as a collection of web services with a common standard format that is easy to integrate into workflow management software such as Taverna. EnCORE services are also accessible through EnVISION, a web graphical user interface providing detailed information such as molecular interactions, biological pathways and computational models of pathways.