Wf4Ever: Scientific Workflows
and Research Objects as tools
for scientific insight and
methodology curation
Juande Santander-Vela	 jdsant@iaa.es
Instituto de Astrofísica de Andalucía-CSIC
Talk Outline
Introduction
Current challenges for radio astronomy and science
Potential e-Science solutions: Workflows and
Research Objects
Final points
Introduction
Who am I?
Member of the AMIGA international collaboration,
based at IAA-CSIC
Ph.D. on bringing Radio Astronomical data archives and
tools into the VO
Applied Scientist at ESO VLT archive, Software
Engineer/Astronomy Specialist at ALMA archive
(May 2009-Dec 2011)
Back to IAA-CSIC as VIA-SKA Project Manager,
Radio Astroinformatician
GROUP INTEREST IN TECH
DEVELOPMENTS FOR BETTER SCIENCE
Why am I here?
Collaboration with Stephane Leon and the ALMA
Data Management Group
Helping bring the ALMA Science Archive to the VO
Modelling radio data cubes
Finding use cases for workflow technology (see later)
AMIGA
Analysis of the interstellar Medium of Isolated GAlaxies
Multi-wavelength, multi-object study on isolated galaxies
with strict isolation criteria
Careful curation of data
Very careful processing of new parameters from
Group’s own observation programs and data reduction
Literature table scanning
Virtual Observatory table harvesting and parsing
Emphasis on marrying astronomy and computer science,
and buy-in of the VO
E-SCIENCE USERS
E-SCIENCE DEVELOPERS!
AMIGA
Project goal: providing a baseline for galaxy
properties to compare with other environments
Interaction-free sample, ideal for tracing HI infall:
we can use CIG galaxies to detect the cosmic web
Need for very sensitive telescopes able to resolve
faint HI ➡ Square Kilometre Array & pathfinders
PARTICIPATING IN SKA.TEL.SDP CONSORTIUM
WE NEED TOOLS FOR OUR OWN SCIENCE ANALYSIS
⤷
Current challenges for
radio astronomy
and science
Data over-abundance
Moore’s Law for Detectors ➡ Exponential increase
of individual and accumulated data sets
We have more data than ever… but we can’t FULLY use it:
Because we can’t. Data are:
Difficult to set up (for sharing)
Difficult to find (for using)
Difficult to document (both using and sharing)
Difficult to deal with (because of size, formatting, purpose…)
Because it is not in our best interest
Courtesy J.E. Ruiz (AMIGA,Wf4Ever)
Tools!
Data sharing
DATA SHARING
Sharing data is good. But sharing your own data? That can get complicated. As two research
communities who held meetings in May on the issue report their proposals to promote data sharing
in biology, a special issue of Nature examines the cultural and technical hurdles that can get in the
way of good intentions.
DATA FLIRTING
DATA HOARDING
IRREPRODUCIBLE
RESEARCH
?
CHALLENGES IN IRREPRODUCIBLE RESEARCH
No research paper can ever be considered to be the final word, and the replication and
corroboration of research results is key to the scientific process. In studying complex entities,
especially animals and human beings, the complexity of the system and of the techniques can all
too easily lead to results that seem robust in the lab, and valid to editors and referees of journals,
but which do not stand the test of further studies. Nature has published a series of articles about
the worrying extent to which research results have been found wanting in this respect. The editors
of Nature and the Nature life sciences research journals have also taken substantive steps to put
our own houses in order, in improving the transparency and robustness of what we publish.
Journals, research laboratories and institutions and funders all have an interest in tackling issues
of irreproducibility. We hope that the articles contained in this collection will help.
Tool over-abundance
[Screenshot: The Engineering Deck: Astrophysics Source Code Library, on the Starship Asterisk* APOD and General Astronomy Discussion Forum; 671 topics, including 21cmFAST, 2LPTIC, 2MASS Kit, 3DEX, AAOGlimpse, ACORNS-ADI and ACS: ALMA Common Software]
Services too!
How to deal with all this?
All of this compounds the
problems of
reproducibility,
methodology
assessment, result
dissemination…
How to deal with all this?
AND THE
CODE?
WHAT
SOFTWARE
DOES IT
DEPEND ON?
WHICH
CODE DID
WHAT?
NOT
A GOOD
SOLUTION
TRADITIONALLY…
How to deal with all this?
ORCHESTRATION,
ENCAPSULATION,
DATA ACCESS,
PROVENANCE,
ANNOTATION…
Why Workflows?
SCIENTIFIC
Workflows define
computations
Events & Processes
Dependencies
Resources
Local & Remote Processes
Sequences
Concurrences
Triggers
FORMALLY,
OR AT LEAST
MACHINE READABLE
➡
WORKFLOW
DEFINITION
LANGUAGES
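As a minimal sketch of what "machine readable" buys, the toy workflow below is plain data (step names mapped to the steps they depend on), and a generic scheduler derives both a valid sequence and the steps that could run concurrently. The step names are hypothetical, and Python's standard graphlib module stands in here for a real workflow definition language; it is not how Taverna stores workflows.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
WORKFLOW = {
    "fetch_catalog": set(),
    "fetch_images": set(),
    "crossmatch": {"fetch_catalog"},
    "photometry": {"fetch_images", "crossmatch"},
    "report": {"photometry"},
}


def execution_stages(workflow):
    """Group steps into stages; steps within one stage do not depend on each other."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose dependencies are all done
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(execution_stages(WORKFLOW), start=1):
        print(number, stage)
```

Because the definition is data rather than code, the same description can be stored, searched, drawn as a diagram, or handed to a different execution engine.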
Workflows enable
distributed computing
Distributed computing paradigm
Move computation to the data
Computing services
Collaborative environments
Linked data }
FOR SCIENTIFIC
DISCUSSION &
SCIENCE EXTRACTION
➡ Science-computing
Workflows enable
distributed computing
Data can be anywhere
Workflows can be constructed hierarchically
Each workflow does useful work on its own
The data flow can be easily followed
Workflows enable
interactive computing
Each workflow run records its inputs, outputs,
and intermediate results
You can build and run workflows incrementally
You can get (almost) immediate feedback on
changes
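The points above can be illustrated with a small sketch, assuming a workflow is just an ordered list of named steps. Every run keeps its inputs, each intermediate result and the final output, and running only the first steps gives the incremental, quick-feedback style described. RunRecord, run_workflow and the two steps are hypothetical names for illustration, not part of Taverna.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Inputs, per-step intermediate results and final output of one run."""
    inputs: dict
    intermediates: dict = field(default_factory=dict)
    output: object = None


def run_workflow(steps, inputs):
    """Run (name, function) steps in order, feeding each result to the next step."""
    record = RunRecord(inputs=dict(inputs))
    value = inputs
    for name, function in steps:
        value = function(value)
        record.intermediates[name] = value  # keep every intermediate result
    record.output = value
    return record


# Hypothetical two-step workflow on a list of fluxes.
STEPS = [
    ("select_positive", lambda data: [f for f in data["fluxes"] if f > 0]),
    ("mean_flux", lambda fluxes: sum(fluxes) / len(fluxes)),
]

if __name__ == "__main__":
    partial = run_workflow(STEPS[:1], {"fluxes": [1.0, -2.0, 3.0]})  # incremental run
    full = run_workflow(STEPS, {"fluxes": [1.0, -2.0, 3.0]})
    print(partial.output)
    print(full.intermediates, full.output)
```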
Tools for workflow
storage and discovery
[Screenshot: myExperiment workflow listing (Home > Workflows), showing 2207 results; workflow types include Taverna 2, Taverna 1, RapidMiner, Kepler, LONI Pipeline, GWorkflowDL, KNIME and Galaxy. Top-ranked entry: "Pathways and Gene annotations for QTL region" (Taverna 2, credits: Paul Fisher).]
[Screenshot: myExperiment search for "Astrotaverna", showing 44 results, all Taverna 2; tags include astronomy, astrotaverna, votable, galfit and sextractor; groups: AMIGA, Wf4Ever.]
[Screenshot: myExperiment workflow entry "Cocatenates several VOTables into one" (Taverna 2, version 3 of 3, created 30/08/12, last updated 22/04/13, credits: Julian Garrido, license: Creative Commons Attribution-Share Alike 3.0 Unported, shared with group AMIGA, featured in the AstroTaverna Starter Pack). Description: snippet showing how to use the AstroTaverna tool for concatenating several VOTables; the input is four VOTables with the same number of columns, and with the provided sample values the result is a VOTable duplicated vertically four times. Components: 4 inputs, 1 processor, 1 output, 5 datalinks. Download: http://www.myexperiment.org/workflows/3130/download?version=3]
That’s not enough!
FOR ASTRONOMERS
FOR REPRODUCIBILITY
AND REUSE
1. Intelligent Software Components (iSOCO, Spain)
2. University of Manchester (UNIMAN, UK)
3. Universidad Politécnica de Madrid (UPM, Spain)
4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
5. University of Oxford (OXF, UK)
6. Instituto de Astrofísica de Andalucía (IAA, Spain)
7. Leiden University Medical Centre (LUMC, NL)
EU FUNDED FP7 STREP PROJECT
DECEMBER 2010 – DECEMBER 2013
Technological infrastructure for the preservation and efficient
retrieval and reuse of scientific workflows in a range of
disciplines
Case Studies
• Astronomy (IAA-CSIC)
• Genome-wide Analysis and Biobanking
Goals
• Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
• Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
Core Competencies (Tech)
• Digital Libraries
• Workflow Management
• Semantic Web
• Integrity & Authenticity
• Provenance
• Information Quality
Partners
• One SME
• Six public organisations
TARGETING ALREADY ESTABLISHED
COMMUNITIES: MYEXPERIMENT,
VIRTUAL OBSERVATORY
What is a Scientific Workflow?
Workflows to Access and Massage VO Data
»  A mechanism for coordinating the execution of
services and codes, and linking together resources.
»  The combination of data and processes into a
configurable, modular, structured set of steps that
implement semi-automated computational solutions
in scientific problem-solving.
»  The implementation of a scientific method.
COURTESY J.E. RUIZ
NOT A PIPELINE!
AMIGA4GAS
3D KINEMATICAL MODELING
INPUT FILES: 1 DATACUBE, 1 VELOCITY MAP, 1 CONFIG FILE ROTCUR, 1 CONFIG FILE GALMOD
VARIABLE PARAMS: INSET; RADII, WIDTHS; WEIGHT; TOLERANCE; DENS; NV; Z0; VDISP
ROTCUR: 12 RUNS (POSSIBLE COMBINATIONS IN INPUT PARAMETERS) ➡ 12 ASCII FILES
GALMOD ➡ 12 CUBES (4 APPROACHING, 4 RECEDING, 4 BOTH)
COPY ➡ 8 CUBES (4 APPROACHING + RECEDING, 4 BOTH)
MOMENTS ➡ 8 VELOCITY MAPS
SUB ➡ 8 RESIDUAL CUBES
SUB ➡ 8 RESIDUAL MAPS
MNMX ➡ 8 VALUES FOR PEAKS IN CUBES, 8 VALUES FOR PEAKS IN MAPS
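One possible reading of the counts in this workflow (an assumption, since the slide does not spell it out) is that the 12 runs are 3 sides of the galaxy (approaching, receding, both) times 4 sets of variable parameters, and that the COPY step merges each approaching/receding pair into one cube, leaving 8. The sketch below only reproduces that bookkeeping; the parameter set names are hypothetical placeholders and no modelling tool is actually called.

```python
from itertools import product

SIDES = ("approaching", "receding", "both")
# Hypothetical parameter sets: the slide names the variable parameters
# (INSET, RADII, WIDTHS, WEIGHT, TOLERANCE, ...) but not their values.
PARAMETER_SETS = ("set_a", "set_b", "set_c", "set_d")


def model_runs():
    """One model run per (side, parameter set) combination."""
    return [{"side": side, "params": params}
            for side, params in product(SIDES, PARAMETER_SETS)]


def merged_cubes(runs):
    """Mimic the COPY step: one merged cube per parameter set, plus the 'both' cubes."""
    merged = [{"side": "approaching+receding", "params": params}
              for params in PARAMETER_SETS]
    both = [run for run in runs if run["side"] == "both"]
    return merged + both


if __name__ == "__main__":
    runs = model_runs()
    print(len(runs), "runs,", len(merged_cubes(runs)), "cubes after merging")
```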
How do we build
workflows?
AstroTaverna
Taverna plugin for retrieving and manipulating
VO Data + Catalogs on HTML Pages
VO Services: ConeSearch, SIA, SSA; TAP coming soon
Tabular Data (VOTables, converters from other formats)
Crossmatching, filtering, name resolving, coordinate and reference
system transformation, data massaging… (STILTS)
Source catalog overplotting on images and filtering; overplotting circles,
ellipses, etc. as a function of physical magnitude. Resampling, crops,
blinks, mosaics, movies, RGBs, fusion, diff… (through Aladin)
VOTable rendering, SAMP for final inspection
Image support; spectra not yet
PLUS ADDITIONAL
ANALYSIS USING
SCRIPTS
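To make the ConeSearch item concrete, here is a minimal sketch of the request a workflow step would send. It relies only on the IVOA Simple Cone Search convention of passing RA, DEC and SR (search radius) in decimal degrees as query parameters; the function name and the service URL are hypothetical, and no request is actually made.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL (all angles in decimal degrees)."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within [-90, 90] degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must not be negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    # Append to an existing query string if the service URL already has one.
    if base_url.endswith(("?", "&")):
        separator = ""
    else:
        separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"


if __name__ == "__main__":
    # Hypothetical service endpoint, for illustration only.
    print(cone_search_url("http://example.org/scs", 10.5, -3.25, 0.1))
```

A compliant service answers such a request with a VOTable, which is what the tabular-data tools listed above then operate on.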
Service discovery
Data massaging
X-Matching
Calculation
Additions
Filtering
Access
Data curation
Aladin scripting
Interactive data inspection
Learning examples
Not yet enough!
FOR REPRODUCIBILITY
AND REUSE
Research Objects
Content
Process (workflows), data, external resources and bibliography
Execution environment set-up and local software dependencies
Experimental protocol followed
Roles, types and relationships among all digital components
Provenance of intermediate and final results
Decomposable attribution and authoring
Fine-grained access control and permissions
Example datasets for demonstration, reproducibility, monitoring, etc
Templates
Placeholders to ease the aggregation process
Completeness checking/quality assessment
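A rough sketch of the template and completeness-checking idea, under the assumption that a Research Object can be reduced to a mapping from aggregated resources to the role each one plays. The checklist of required roles and the file names are hypothetical; they are not the Wf4Ever checklist vocabulary.

```python
# Hypothetical template: roles a Research Object is expected to fill.
REQUIRED_ROLES = {"workflow", "input_data", "provenance", "bibliography"}


def missing_roles(aggregated, required=REQUIRED_ROLES):
    """Return the required roles that no aggregated resource fulfils, sorted."""
    present = set(aggregated.values())
    return sorted(required - present)


def is_complete(aggregated, required=REQUIRED_ROLES):
    """A Research Object is complete when every required role is filled."""
    return not missing_roles(aggregated, required)


if __name__ == "__main__":
    # Resource -> role; the file names are made up for illustration.
    research_object = {
        "concatenate.t2flow": "workflow",
        "sample.vot": "input_data",
        "references.bib": "bibliography",
    }
    print(missing_roles(research_object))
```

The unfilled roles act as the placeholders of the template: they tell the author what is still to be aggregated before the Research Object can be considered reusable.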
Research Objects
Target Audiences
Scientists [producers] who want to share their research
outcomes so that they are more reusable and
reproducible – ease of sharing and citation.
Scientists [consumers] who want to understand, reuse,
validate and further extend existing ROs.
Publishers can adopt the concept and principles of
Research Object to enable the sharing of and access to
the actual data and methods.
Librarians who want to support research preservation.
Semantic annotations
Author of an annotation
Author and co-authors of a workflow; reference link to a re-used workflow and its
author
Who has performed the execution of a workflow leading to the results provided in the RO
Computing execution environment of the RO and local software dependencies
Special access requirements to web services
Dataset provider: person, webpage, survey, data release, etc.
How long it takes to run a workflow using the full data and the provided
subsample
The number of elements of the sample dataset over which a workflow and/or RO iterates
Previous and subsequent workflows to be executed, as in the experimental
protocol
Research institution, country, and scientific domain of the RO
The actual size of the RO and/or a folder
Semantic model
DataLink
MULTI
DISCIPLINARY
RO data organisation
Recommended
organisation provides
automatic semantics
for some items
It makes it easier for
both people and
machines to
understand the RO
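How a recommended folder layout can yield "automatic semantics" may be sketched as follows: the role of a resource is inferred from the top-level folder it sits in, so no manual annotation is needed for those items. The folder names and roles below are hypothetical and do not reproduce the layout recommended by Wf4Ever.

```python
from pathlib import PurePosixPath

# Hypothetical convention: top-level folder name -> role of the files inside it.
FOLDER_ROLES = {
    "workflows": "workflow",
    "data": "dataset",
    "results": "result",
    "docs": "documentation",
}


def infer_role(path):
    """Infer a resource's role from its top-level folder; None if unknown."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:  # a file at the top level carries no folder semantics
        return None
    return FOLDER_ROLES.get(parts[0])


if __name__ == "__main__":
    for resource in ("workflows/concatenate.t2flow", "data/sample.vot", "README.txt"):
        print(resource, "->", infer_role(resource))
```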
ROs in Astronomy
ADSLabs Research Objects
Authors
Publications
Journals
Objects SIMBAD
Tabular data behind the plots CDS
ASCL reference of used software
Observing time Proposals
Used facilities, surveys or missions
NOT JUST FROM WORKFLOWS
POTENTIAL FOR
RESEARCH OBJECT
INDEXING IN ADS
RO Incentive
PAPERS WITH DATA LINKS ARE CITED MORE THAN THOSE WITHOUT
Effect of E-printing on Citation Rates in Astronomy and Physics
2006. Edwin A. Henneken et al.
NOW YOU CAN
CITE DATA AND
PROCESSES, TOO
Roadmap
AstroTaverna, mostly ready: you can publish
workflows and packs to myExperiment from
Taverna
myExperiment, building support for ROs
ADS will populate myExperiment with literature-
ROs
Taverna will be able to publish ROs to
myExperiment
Final points
We need something like workflows to describe
computations in a distributed environment
Workflows are not enough for supporting reuse
and methodology preservation
Research Objects are meaningful associations of
data, operations, provenance, which can also be
cited
CAN EMBED
COMPUTATIONS IN
SCIENCE ARCHIVES
Useful Links
http://www.wf4ever-project.org
http://www.myexperiment.org
http://www.researchobject.org
http://wf4ever.github.io/astrotaverna/
http://amiga.iaa.es
Thank you!

VO Course 11: Spatial indexing
VO Course 10: Big data challenges in astronomy
Curso VO 07: Sistemas gestores de bases de datos
VO Course 06: VO Data-models
VO Course 05: VOTable, VO Protocols, and UCDs
VO Course 04: VO architecture
VO Course 03: IVOA, the International Virtual Observatory Alliance
VO Course 02: Astronomy & Standards
VO Course 12: Workflows & the Wf4Ever project

Recently uploaded (20)

PDF
1 - Historical Antecedents, Social Consideration.pdf
PDF
WOOl fibre morphology and structure.pdf for textiles
PPTX
Final SEM Unit 1 for mit wpu at pune .pptx
PDF
Taming the Chaos: How to Turn Unstructured Data into Decisions
PDF
Assigned Numbers - 2025 - Bluetooth® Document
PDF
A novel scalable deep ensemble learning framework for big data classification...
PPTX
Modernising the Digital Integration Hub
PDF
Architecture types and enterprise applications.pdf
PDF
CloudStack 4.21: First Look Webinar slides
PPTX
Chapter 5: Probability Theory and Statistics
PPTX
O2C Customer Invoices to Receipt V15A.pptx
PDF
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
PDF
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
PDF
August Patch Tuesday
PPTX
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
PDF
Getting Started with Data Integration: FME Form 101
PPT
Geologic Time for studying geology for geologist
PPT
Module 1.ppt Iot fundamentals and Architecture
PDF
Enhancing emotion recognition model for a student engagement use case through...
PDF
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
1 - Historical Antecedents, Social Consideration.pdf
WOOl fibre morphology and structure.pdf for textiles
Final SEM Unit 1 for mit wpu at pune .pptx
Taming the Chaos: How to Turn Unstructured Data into Decisions
Assigned Numbers - 2025 - Bluetooth® Document
A novel scalable deep ensemble learning framework for big data classification...
Modernising the Digital Integration Hub
Architecture types and enterprise applications.pdf
CloudStack 4.21: First Look Webinar slides
Chapter 5: Probability Theory and Statistics
O2C Customer Invoices to Receipt V15A.pptx
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
August Patch Tuesday
MicrosoftCybserSecurityReferenceArchitecture-April-2025.pptx
Getting Started with Data Integration: FME Form 101
Geologic Time for studying geology for geologist
Module 1.ppt Iot fundamentals and Architecture
Enhancing emotion recognition model for a student engagement use case through...
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor

Wf4Ever: Scientific Workflows and Research Objects as tools for scientific insight and methodology curation

  • 1. Wf4Ever: Scientific Workflows and Research Objects as tools for scientific insight and methodology curation Juande Santander-Vela jdsant@iaa.es Instituto de Astrofísica de Andalucía-CSIC
  • 2. Talk Outline Introduction Current challenges for radio astronomy and science Potential e-Science solutions: Workflows and Research Objects Final points
  • 4. Who am I? Member of the AMIGA international collaboration, based at IAA-CSIC Ph.D. on bringing Radio Astronomical data archives and tools into the VO Applied Scientist at ESO VLT archive, Software Engineer/Astronomy Specialist at ALMA archive (May 2009-Dec 2011) Back to IAA-CSIC as VIA-SKA Project Manager, Radio Astroinformatician GROUP INTEREST IN TECH DEVELOPMENTS FOR BETTER SCIENCE
  • 5. Why am I here? Collaboration with Stephane Leon and the ALMA Data Management Group Helping bring the ALMA Science Archive to the VO Modelling radio data cubes Finding use cases for workflow technology (see later)
  • 6. AMIGA Analysis of the interstellar Medium of Isolated GAlaxies Multi-wavelength, multi-object study on isolated galaxies with strict isolation criteria Careful curation of data Very careful processing of new parameters from Group’s own observation programs and data reduction Literature table scanning Virtual Observatory table harvesting and parsing Emphasis on marrying astronomy and computer science, and buy-in of the VO E-SCIENCE USERS
  • 7. AMIGA Analysis of the interstellar Medium of Isolated GAlaxies Multi-wavelength, multi-object study on isolated galaxies with strict isolation criteria Careful curation of data Very careful processing of new parameters from Group’s own observation programs and data reduction Literature table scanning Virtual Observatory table harvesting and parsing Emphasis on marrying astronomy and computer science, and buy-in of the VO E-SCIENCE DEVELOPERS!
  • 8. AMIGA Project goal: providing a baseline for galaxy properties to compare with other environments Interaction-free sample, ideal for tracing HI infall: we can use CIG galaxies to detect the cosmic web Need for very sensitive telescopes able to resolve faint HI ➡ Square Kilometre Array & pathfinders PARTICIPATING IN SKA.TEL.SDP CONSORTIUM WE NEED TOOLS FOR OUR OWN SCIENCE ANALYSIS ⤷
  • 9. Current challenges for radio astronomy and science
  • 10. Data over-abundance Moore’s Law for Detectors ➡ Exponential increase of individual and accumulated data sets We have more data than ever… but we can’t use it: Because we can’t: Difficult to set up (for sharing) Difficult to find (for using) Difficult to document (both using and sharing) Difficult to deal with (because of size, formatting, purpose…) Because it is not in our best interest FULLY ?
  • 11. Courtesy J.E. Ruiz (AMIGA,Wf4Ever)
  • 12. Courtesy J.E. Ruiz (AMIGA,Wf4Ever) Tools!
  • 13. Data sharing DATA SHARING Sharing data is good. But sharing your own data? That can get complicated. As two research communities who held meetings in May on the issue report their proposals to promote data sharing in biology, a special issue of Nature examines the cultural and technical hurdles that can get in the way of good intentions. DATA FLIRTING DATA HOARDING IRREPRODUCIBLE RESEARCH ?
  • 14. Irreproducible research CHALLENGES IN IRREPRODUCIBLE RESEARCH No research paper can ever be considered to be the final word, and the replication and corroboration of research results is key to the scientific process. In studying complex entities, especially animals and human beings, the complexity of the system and of the techniques can all too easily lead to results that seem robust in the lab, and valid to editors and referees of journals, but which do not stand the test of further studies. Nature has published a series of articles about the worrying extent to which research results have been found wanting in this respect. The editors of Nature and the Nature life sciences research journals have also taken substantive steps to put our own houses in order, in improving the transparency and robustness of what we publish. Journals, research laboratories and institutions and funders all have an interest in tackling issues of irreproducibility. We hope that the articles contained in this collection will help.
  • 17. Starship Asterisk* APOD and General Astronomy Discussion Forum. The Engineering Deck: Astrophysics Source Code Library (671 topics). Topics include: Guide to the Astrophysics Source Code Library; Papers of Possible Interest to Astronomical Software Users; The Astrophysics Source Code Library: New codes welcome; *Web Resources and Tools for Astrophysicists/Astronomers*; 2011 and 2012 Additions to the ASCL; 21cmFAST: Simulation of the High-Redshift 21-cm Signal; 2LPTIC: 2nd-order Lagrangian Perturbation Theory Initial Con; 2MASS Kit: 2MASS Catalog Server Kit; 3DEX: Fast Fourier-Bessel Decomposition of Spherical 3D Surv; AAOGlimpse: Three-dimensional Data Viewer; ACORNS-ADI: Calibration, Registration and Nulling in Imaging; ACS: ALMA Common Software
  • 19. How to deal with all this? ++ All of this compounds the problems of reproducibility, methodology assessment, result dissemination…
  • 20. How to deal with all this? AND THE CODE? WHAT SOFTWARE DOES IT DEPEND ON? WHICH CODE DID WHAT? NOT A GOOD SOLUTION TRADITIONALLY…
  • 21. How to deal with all this? ++ ORCHESTRATION, ENCAPSULATION, DATA ACCESS, PROVENANCE, ANNOTATION…
  • 23. Workflows define computations Events & Processes Dependencies Resources Local & Remote Processes Sequences Concurrences Triggers FORMALLY, OR AT LEAST MACHINE READABLE ➡ WORKFLOW DEFINITION LANGUAGES
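The elements listed on this slide (processes, dependencies, sequences, concurrences) can be illustrated with a minimal sketch. It is not Taverna's format or any particular workflow definition language: the step names are hypothetical, and Python's standard `graphlib` module stands in for a workflow engine. Each stage groups the processes that could run concurrently; the stages themselves run in sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each process maps to the set of processes it depends on.
workflow = {
    "fetch_catalog": set(),
    "fetch_images": set(),
    "crossmatch": {"fetch_catalog"},
    "measure": {"crossmatch", "fetch_images"},
    "publish": {"measure"},
}

def execution_stages(deps):
    """Group processes into stages; processes in one stage may run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages

stages = execution_stages(workflow)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Because the definition is plain data, it is machine readable: the same dictionary could be stored, shared, or checked for cycles without running anything.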
  • 24. Workflows enable distributed computing Distributed computing paradigm Move computation to the data Computing services Collaborative environments Linked data } FOR SCIENTIFIC DISCUSSION & SCIENCE EXTRACTION ➡ Science-computing
  • 25. Workflows enable distributed computing Data can be anywhere Workflows can be constructed hierarchically Each workflow does useful work on its own The data flow can be easily followed
  • 26. Workflows enable interactive computing Each workflow run records its inputs, outputs, and intermediate results You can build and run workflows incrementally You can get (almost) immediate feedback on changes
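A rough sketch of what "each run records its inputs, outputs, and intermediate results" could look like in practice. This is an assumption-laden illustration, not how Taverna stores provenance: the `run_step` helper, the record layout, and the two example steps are all hypothetical.

```python
import hashlib
import json

def run_step(name, func, inputs, log):
    """Run one step and append a record of its inputs and outputs to the log."""
    outputs = func(**inputs)
    record = {"step": name, "inputs": inputs, "outputs": outputs}
    # A content hash gives a cheap way to notice when a re-run changes a result.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["digest"] = hashlib.sha256(payload).hexdigest()
    log.append(record)
    return outputs

log = []
# Hypothetical two-step run: rescale some values, then average them.
scaled = run_step("scale", lambda values, factor: [v * factor for v in values],
                  {"values": [1.0, 2.0, 3.0], "factor": 2.0}, log)
mean = run_step("mean", lambda values: sum(values) / len(values),
                {"values": scaled}, log)

for record in log:
    print(record["step"], record["outputs"])
```

Since the intermediate result of `scale` is kept in the log, a changed `mean` step can be re-run on it directly, which is the incremental, quick-feedback style the slide describes.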
  • 27. Tools for workflow storage and discovery Workflows: showing 2207 results. Filter by type: Taverna 2, Taverna 1, RapidMiner, Kepler, Bioclipse Scri…, LONI Pipeline, GWorkflowDL, KNIME, BioExtract Ser…, Galaxy. Filter by tag: example. Example entry: Pathways and Gene annotations for QTL region (7) Created: 19/11/09 @ 18:18:52 | Last updated: 07/09/12 @ 18:23:36 Credits: Paul Fisher License: Creative Commons Attribution-Share Alike 3.0 Unported License This workflow searches for genes which reside in a QTL (Quantitative Trait Loci) region in the mouse, Mus musculus. The workflow requires an input of: a chromosome name or number; a QTL start base pair position; QTL end base pair position. Data is then extracted from BioMart to annotate each of the genes found in this region. The Entrez and UniProt identifiers are then sent to KEGG to obtain KEGG gene identifiers. The KEGG gene identifiers are then used to search for pathways in the KEGG path...
  • 28. Tools for workflow storage and discovery Search: Astrotaverna. Workflows: showing 44 results. Filter by type: Taverna 2. Filter by tag: astronomy, astrotaverna, votable, virtual observ…, starter pack, local processes, taverna, workflow, galfit, sextractor. First result: Cocatenates several VOTables into one (3) Created: 30/08/12 @ 10:05:29 | Last updated: 22/04/13 @ 16:52:00 Credits: Julian Garrido License: Creative Commons Attribution-Share Alike 3.0 Unported License Snippet showing how to use AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. The result if using sample values provided will be a four times vertically duplicated VOTable. Rating: 0.0 / 5 (0 ratings) | Versions: 3 | Reviews: 0 | Comments: 0 | Citations: 0
  • 29. Search: Astrotaverna. Showing 44 results. Filter by type: Taverna 2. Filter by tag: astronomy, astrotaverna, votable, virtual observ…, starter pack, local processes, taverna, workflow, galfit, sextractor. Filter by user: Jose Enrique …, Julian Garrido. Filter by licence: by-sa, BSD. Filter by group: AMIGA, Wf4Ever. Cocatenates several VOTables into one (3) Created: 30/08/12 @ 10:05:29 | Last updated: 22/04/13 @ 16:52:00 Credits: Julian Garrido License: Creative Commons Attribution-Share Alike 3.0 Unported License Snippet showing how to use AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. The result if using sample values provided will be a four times vertically duplicated VOTable. Rating: 0.0 / 5 (0 ratings) | Versions: 3 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 26 times | Downloaded: 12 times Tags (4): astronomy | astrotaverna | cat | votable Create configuration files from a template... (1) Created: 26/07/12 @ 10:56:46 | Last updated: 04/09/12 @ 07:30:55 Credits: Julian Garrido License: Creative Commons Attribution-Share Alike 3.0 Unported License This workflow uses astrotaverna artifacts. It creates files by using a template whose keys are replaced by data from a votable. A configuration file is created for every row in the votable. The keys must appear also in the vocabulary file and match column names in the votable. A column in the votable must contain the name of the result configuration file. Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 Viewed: 14 times | Downloaded: 15 times Tags (4): astronomy | astrotaverna | local processes | votable Simulates the physical, dynamical, and che... (1) Created: 17/05/13 @ 08:03:13 Credits: Julian Garrido
  • 31. Workflow Entry: Cocatenates several VOTables into one Created at: 30/08/12 @ 10:05:29 Last updated: 22/04/13 @ 16:52:00 Version 3 (latest) (of 3) Version created on: 22/04/13 @ 16:52:00 by: Julian Garrido Type: Taverna 2 Original Uploader: Julian Garrido Credits (1): Julian Garrido Attributions (0): None
  • 32. Version 3 (latest) (of 3) Version created on: 22/04/13 @ 16:52:00 by: Julian Garrido Title: Cocatenates several VOTables into one Type: Taverna 2 Description: Snippet showing how to use AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. The result if using sample values provided will be a four times vertically duplicated VOTable. Download: Scalable Diagram (SVG); Workflow File/Package (T2FLOW) Original Uploader: Julian Garrido Credits (1): Julian Garrido Attributions (0): None Tags (4): astronomy | astrotaverna | cat | votable Shared with Groups (1): AMIGA Featured In Packs (1): AstroTaverna Starter Pack Ratings (0)
  • 33. Download: Workflow File/Package (T2FLOW); Workflow as a Galaxy tool. Run this Workflow in the Taverna Workbench: copy and paste this link into File > 'Open workflow location...' http://www.myexperiment.org/workflows/3130/download?version=3 Workflow Components: Authors (1), Titles (1), Descriptions (1), Dependencies (0), Inputs (4), Processors (1), Beanshells (0), Outputs (1), Datalinks (5), Coordinations (0). Featured In Packs (1): AstroTaverna Starter Pack. Ratings (0). Attributed By (0). Favourited By (0). Statistics: 53 viewings, 75 downloads
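The workflow entry above describes concatenating four VOTables with matching columns into one vertically stacked table. The operation itself can be sketched in plain Python, with the caveat that this is only a stand-in: tables are modelled as hypothetical dictionaries rather than real VOTables, the check on identical column names is an assumption (the entry only says "the same number of columns"), and nothing here uses the AstroTaverna tool.

```python
def concatenate(tables):
    """Stack tables vertically; all tables must share the same column names."""
    columns = tables[0]["columns"]
    for table in tables[1:]:
        if table["columns"] != columns:
            raise ValueError("tables have different columns")
    rows = [row for table in tables for row in table["rows"]]
    return {"columns": columns, "rows": rows}

# Hypothetical sample table, used four times as in the workflow description.
sample = {"columns": ["name", "ra", "dec"],
          "rows": [("CIG 1", 0.8, -1.9), ("CIG 2", 0.9, 29.8)]}
result = concatenate([sample, sample, sample, sample])
print(len(result["rows"]))
```

Feeding the same sample table in four times gives a result with four times as many rows, matching the "four times vertically duplicated" outcome the entry mentions.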
  • 34. That’s not enough! FOR ASTRONOMERS FOR REPRODUCIBILITY AND REUSE
  • 35. 1. Intelligent Software Components (iSOCO, Spain) 2. University of Manchester (UNIMAN, UK) 3. Universidad Politécnica de Madrid (UPM, Spain) 4. Poznan Supercomputing and Networking Centre (PSNC, Poland) 5. University of Oxford (OXF, UK) 6. Instituto de Astrofísica de Andalucía (IAA, Spain) 7. Leiden University Medical Centre (LUMC, NL) EU FUNDED FP7 STREP PROJECT DECEMBER 2010 – DECEMBER 2013
  • 36. Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines. Goals: Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities; creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge. Case Studies: Astronomy (IAA-CSIC); Genome-wide Analysis and Biobanking. Core Competencies (Tech): Digital Libraries; Workflow Management; Semantic Web; Integrity & Authenticity; Provenance; Information Quality. Partners: One SME; Six public organisations. TARGETING ALREADY ESTABLISHED COMMUNITIES: MYEXPERIMENT, VIRTUAL OBSERVATORY
  • 37. What is a Scientific Workflow? Workflows to Access and Massage VO Data » A mechanism for coordinating the execution of services and codes, and linking together resources. » The combination of data and processes into a configurable, modular, structured set of steps that implement semi-automated computational solutions in scientific problem-solving. » The implementation of a scientific method. COURTESY J.E. RUIZ NOT A PIPELINE!
  • 38. AMIGA4GAS 3D KINEMATICAL MODELING. INPUT FILES: 1 DATACUBE, 1 VELOCITY MAP, 1 CONFIG FILE ROTCUR, 1 CONFIG FILE GALMOD. ROTCUR: 12 RUNS (POSSIBLE COMBINATIONS IN INPUT PARAMETERS), 12 ASCII FILES. GALMOD: 12 CUBES (4 APPROACHING, 4 RECEDING, 4 BOTH). COPY: 8 CUBES (4 APPROACHING + RECEDING, 4 BOTH). MOMENTS: 8 VELOCITY MAPS. SUB, SUB: 8 RESIDUAL CUBES, 8 RESIDUAL MAPS. MNMX: 8 VALUES FOR PEAKS IN CUBES, 8 VALUES FOR PEAKS IN MAPS. VARIABLE PARAMS: INSET, RADII, WIDTHS, WEIGHT, TOLERANCE, DENS, NV, Z0, VDISP
  • 39. How do we build workflows?
  • 40. AstroTaverna Taverna plugin for retrieving and manipulating VO Data + Catalogs on HTML Pages VO Services: ConeSearch, SIA, SSA, TAP coming soon Tabular Data (VOTables, converters from other formats) Crossmatching, Filtering, NameResolving, Coordinates and reference system transformation, Data massage… (STILTS) Source catalog overplotting on Images and filtering, overplot circles, ellipses, etc. as a function of physical magnitude. Resampling, crops, blinks, mosaics, movies, RGBs, fusion, diff… (through Aladin) VO Table rendering, SAMP for final inspection Image support, Spectra not yet PLUS ADDITIONAL ANALYSIS USING SCRIPTS
  • 51. Not yet enough! FOR REPRODUCIBILITY AND REUSE
  • 52. Research Objects
  • 53. Research Objects Content Process (workflows), data, external resources and bibliography Execution environment set-up and local software dependencies Experimental protocol followed Roles, types and relationships among all digital components Provenance of intermediate and final results Decomposable attribution and authoring Fine-grained access control and permissions Example datasets for demonstration, reproducibility, monitoring, etc Templates Placeholders to ease the aggregation process Completeness checking/quality assessment
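The "completeness checking" item on this slide can be pictured as a checklist run against the resources a Research Object aggregates. The sketch below is only an illustration under stated assumptions: the `kind` labels, the required set, and the file paths are hypothetical and are not taken from the Wf4Ever Research Object ontologies or tools.

```python
# Hypothetical checklist: kinds of content a Research Object is expected to aggregate.
REQUIRED_KINDS = {"workflow", "data", "bibliography", "environment", "provenance"}

def missing_kinds(research_object):
    """Return the required content kinds with no aggregated resource, sorted."""
    present = {item["kind"] for item in research_object["resources"]}
    return sorted(REQUIRED_KINDS - present)

# Hypothetical Research Object with three aggregated resources.
ro = {
    "title": "Example RO",
    "resources": [
        {"path": "workflows/concatenate.t2flow", "kind": "workflow"},
        {"path": "data/sample.vot", "kind": "data"},
        {"path": "docs/references.bib", "kind": "bibliography"},
    ],
}
gaps = missing_kinds(ro)
print(gaps)
```

A template in the slide's sense would then amount to a Research Object pre-filled with placeholders for each required kind, so that the check reports exactly what is still to be supplied.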
  • 54. Research Objects Target Audiences Scientists [producers] who want to share their research outcomes so that they are more reusable and reproducible – ease of sharing and citation. Scientists [consumers] who want to understand, reuse, validate and further extend existing ROs. Publishers can adopt the concept and principles of Research Object to enable the sharing of and access to the actual data and methods. Librarians who want to support research preservation.
  • 55. Semantic annotations Author of an annotation Author and co-authors of a workflow; reference link to a re-used workflow and its author Who has performed the execution of a workflow leading to the results provided in the RO Computing execution environment of the RO and local software dependencies Special access requirements to web services Datasets provider: person, webpage, survey, data release, etc. How much time does it take to run a workflow using the full data and the provided subsample The number of elements of the sample dataset where one workflow and/or RO iterates Previous and subsequent workflows to be executed, as in the experimental protocol Research institution, country, and scientific domain of the RO The actual size of the RO and/or a folder
  • 57. RO data organisation Recommended organisation provides automatic semantics for some items It makes it easier for both people and machines to understand the RO
  • 58. ROs in Astronomy ADSLabs Research Objects Authors Publications Journals Objects SIMBAD Tabular data behind the plots CDS ASCL reference of used software Observing time Proposals Used facilities, surveys or missions NOT JUST FROM WORKFLOWS POTENTIAL FOR RESEARCH OBJECT INDEXING IN ADS
  • 59. RO Incentive PAPERS WITH DATA LINKS ARE CITED MORE THAN THOSE WITHOUT Effect of E-printing on Citation Rates in Astronomy and Physics 2006. Edwin A. Henneken et al.
  • 60. RO Incentive PAPERS WITH DATA LINKS ARE CITED MORE THAN THOSE WITHOUT Effect of E-printing on Citation Rates in Astronomy and Physics 2006. Edwin A. Henneken et al. NOW YOU CAN CITE DATA AND PROCESSES, TOO
  • 61. Roadmap AstroTaverna, mostly ready: you can publish workflows and packs to myExperiment from Taverna. myExperiment, building support for ROs. ADS will populate myExperiment with literature ROs. Taverna will be able to publish ROs to myExperiment.
  • 62. Final points We need something like workflows to describe computations in a distributed environment Workflows are not enough for supporting reuse and methodology preservation Research Objects are meaningful associations of data, operations, provenance, which can also be cited CAN EMBED COMPUTATIONS IN SCIENCE ARCHIVES