The Reality of Reproducibility of in silico Science

Prof Carole Goble FREng FBCS CITP
JCDL Washington DC, June 2012
10th Scientific Meeting of the Sociedad Española de Astronomía (Spanish Astronomical Society)
Valencia, 9-13 July

Digital Science
Reproducibility and Visibility in Astronomy
José Enrique Ruiz, Lourdes Verdes-Montenegro, Susana Sánchez,
Julian Garrido, Juan de Dios Santander and the Wf4Ever Team

Instrumentation and Computing Session
Valencia, Friday 13 July 2012




Digital Science - Reproducibility and Visibility in Astronomy
                                           Astronomy Research Lifecycle

The astronomy research lifecycle is entirely digital

»    Observation proposals
»    Data reduction pipelines
»    Analysis of science-ready data
»    Catalogs of objects and data
»    Publishing process
      ›  Final data results
      ›  Experiment in digital libraries (ADS/arXiv)

Reproducible research is still not possible in a digital world
»    A normalized preservation of methodology is needed

A rich infrastructure of data (VO) is not efficiently used
»    Tools
                                           The next generation of archives

    Much wider FoV and spectral coverage
    »  Large volumes for an observed datacube
    »  Subproducts are Virtual Data generated on-the-fly




     ASKAP Cubes
Prof. Kevin Vinsen


Automated surveys
»  Huge amounts of tabular data
»  Services for Knowledge Discovery in Databases (KDD)




                         Extraction of scientifically relevant information from
                         a multidimensional parameter space

                         »    Exploration services
                         »    Anomaly detection
                         »    Cross-matching data
                         »    Dimensionality reduction
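
As an illustration of the cross-matching service listed above, here is a minimal sketch in plain Python. It pairs each source of one catalog with its nearest neighbour in another catalog within a search radius. The function names, the 2 arcsec radius and the source positions are assumptions made up for this example; a real archive service would typically use spatial indexing rather than this brute-force loop.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which behaves well for small separations
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(cat_a, cat_b, radius_arcsec=2.0):
    """For each cat_a source, return its nearest cat_b source within the radius.

    Catalogs are lists of (name, ra_deg, dec_deg) tuples.
    The result is a list of (name_a, name_b, separation_arcsec) tuples.
    """
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[2]):
                best = (name_a, name_b, sep)
        if best is not None:
            matches.append(best)
    return matches

# Made-up positions, for illustration only
optical = [("opt-1", 10.00000, 20.00000), ("opt-2", 150.50000, -5.25000)]
radio = [("rad-1", 10.00020, 20.00010), ("rad-2", 200.00000, 45.00000)]
matches = cross_match(optical, radio)
```

With these made-up positions only `opt-1` and `rad-1` fall within the search radius of each other, so `matches` holds a single pair.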


»  A cloud of Web Services
           Archives should evolve from data providers into
           »  Virtual data providers
           »  Software task providers
»  Archives speaking Web Services
           Multi-archive, multi-facility, multi-wavelength astronomy
           Interconnected and interoperable archives

Preservation of
»  Software tasks
»  Data

Processes should benefit from the same privileges already granted to data
Preserving the method ensures that final results can be replicated at any time



                                                        Efficiency and Reuse


Optimize the return on investment in big facilities
»  Avoid duplication of effort and reinvention
»  How to discover and not duplicate?
»  How to re-use and not duplicate?
»  How to make use of best practices?
»  How to use the rich infrastructure of data?
»  Intellectual contributions are encoded in software

More data in archives does not imply more knowledge
»  The time has come to go beyond the PDF
»  Expose the complete scientific record, not the story
»  Allow easy discovery of methods and tools




                                 Reproducibility and The Scientific Method
Benefits
»  Publishing knowledge, not advertising
»  The author, the referee and the re-user
»  Reputation, prestige and respect
»  Higher quality of publications
    ›  Authors will be more careful
    ›  Many eyes to check results
Challenges
»  Hard and time consuming
»  Need incentives – not rewarded now
Initiatives
»  Elsevier Executable Papers Challenge
»  Open Data / Open Science

http://xkcd.com/242/





          I don’t know how




                       Discovery, Visibility and Credit





Exploring and understanding scientific metrics in citation




2010 Krapivin et al.

Paper discovery: the social dimension




                                                        #SEA2012




                                                           The Wf4Ever Project
EU funded FP7 STREP Project
December 2010 – December 2013


1.  Intelligent Software Components (ISOCO, Spain)
2.  University of Manchester (UNIMAN, UK)
3.  Universidad Politécnica de Madrid (UPM, Spain)
4.  Poznan Supercomputing and Networking Centre (PSNC, Poland)
5.  University of Oxford (OXF, UK)
6.  Instituto de Astrofísica de Andalucía (IAA, Spain)
7.  Leiden University Medical Centre (LUMC, NL)




                                             Scientific Workflows


»  Living tutorials
»  Templates for re-use
»  Expedite training
»  Reduce time to insight
»  Avoid reinvention

Digital libraries of workflows may boost the use of the existing infrastructure of data (VO)
Survey in the domain of astrophysical workflows
    ›  Personal script-based recipes
        •  Python, IDL, Software..
    ›  Multi-archive VO recipes
        •  Euro-VO, IVOA..
    ›  Internal group developments
        •  GRID, Clusters, Specific knowledge..
    ›  Processing pipelines
        •  Facilities provide data, computing infrastructure, tools..

Scientific Insight
Accessible, Shareable, Reusable, Adaptable, Understandable

»  Clarity (workflows) for re-use and re-purposing vs. automation (pipelines)
»  A black box is not re-usable: it cannot be broken into parts
»  Reproducibility vs. industrial paper publishing
                                             Research Objects
Organization is more sexy than automation




           Assistive building
         Completeness evaluation
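
One way to picture "completeness evaluation" is a checklist run over the resources that a Research Object aggregates. The sketch below is only an illustration under assumptions: the class names, the resource kinds, the checklist rules and the file names are hypothetical and are not the Wf4Ever data model or API.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    uri: str
    kind: str                       # e.g. "workflow", "data", "document"
    annotations: dict = field(default_factory=dict)

@dataclass
class ResearchObject:
    title: str
    resources: list = field(default_factory=list)

    def add(self, uri, kind, **annotations):
        """Aggregate a resource together with its annotations."""
        self.resources.append(Resource(uri, kind, dict(annotations)))

def missing_items(ro, required_kinds=("workflow", "data", "document")):
    """Return the problems a (hypothetical) completeness checklist would flag."""
    problems = []
    present = {r.kind for r in ro.resources}
    for kind in required_kinds:
        if kind not in present:
            problems.append(f"no resource of kind '{kind}'")
    for r in ro.resources:
        if "description" not in r.annotations:
            problems.append(f"{r.uri} has no description")
    return problems

# Hypothetical content, for illustration only
ro = ResearchObject("Example experiment")
ro.add("workflows/example.t2flow", "workflow", description="Main workflow")
ro.add("data/input_catalogue.csv", "data")
report = missing_items(ro)
```

Here `report` flags the missing document and the undescribed data file, which is the kind of feedback an assistive building tool could give while the Research Object is being populated.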




Expose experiment in a structured way in order to be understood




[Figure labels: Technical Objects, Social Objects, Distributed]

Similar initiatives in Astronomy
»  Semantic curation of digital objects
    ›  CDS (Centre de Données astronomiques de Strasbourg)
    ›  US Virtual Astronomical Observatory
    ›  SAO/NASA ADSLabs

»  Workflow user platforms
    ›  Cyber-SKA
    ›  IceCore
    ›  Montage
    ›  Astro-WISE
    ›  Helio-VO

»  Semantically self-describing Web Services
    ›  Workflows VO-France



ADSLabs Initiative

ADO Linked Components
»    Authors
»    Publications
»    Journals
»    Objects (SIMBAD)
»    Tabular data behind the plots (CDS)
»    ASCL reference of the software used
»    Observing time proposals
»    Facilities, surveys or missions used

Incentives

http://labs.adsabs.harvard.edu/

The Incentive
Papers with data links are cited more than those without




 Effect of E-printing on Citation Rates in Astronomy and Physics
 2006. Edwin A. Henneken et al.
                                                         The Wf4Ever Project

»  Development of AstroTaverna plugins to access and manage VO data
»  Development of Golden Exemplars of astrophysical Workflows and
   Research Objects that use the Wf4Ever technological support

   ›  Curation of physical quantities in 1D catalogues
       •  Data retrieved from external repositories and stored locally
       •  Only local processes for calculations

   ›  Environment and Modelling from 1D catalogues and 2D images
       •  Data retrieved from external repositories (SDSS DR7)
       •  Local software and external web services as processes

   ›  Modelling and Analysis of 3D formatted data
       •  Only external data and processes
Astronomical Research Objects in Action
Curation by inspecting propagation of changes in quantities




Credit: Zsolt Frei and James E. Gunn. The Galaxy Catalog

AMIGA Catalog
Panchromatic properties for a sample of the most isolated nearby galaxies

How is the User DB affected?
  - Changes in External DB
  - Modifications in Calculations

Evaluate variations with time
  - Modified External Quantities
  - Affected User DB Quantities

[Diagram labels: External DB, Calculations, User DB, Update]
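
The question of how the User DB is affected can be sketched as two steps: detect which external quantities changed between two snapshots, then follow the dependencies of the calculations to find the derived quantities that need updating. This is a minimal sketch under assumptions, not the AMIGA workflow itself: the quantity names, values and dependency map are made up for the example.

```python
def changed_quantities(old, new, rel_tol=1e-6):
    """Names of external quantities whose value differs between two snapshots."""
    changed = set()
    for name in old.keys() | new.keys():
        a, b = old.get(name), new.get(name)
        if a is None or b is None:
            changed.add(name)          # quantity added or removed
        elif abs(a - b) > rel_tol * max(abs(a), abs(b)):
            changed.add(name)          # value changed beyond the tolerance
    return changed

def affected_quantities(changed, depends_on):
    """Derived quantities that depend, directly or indirectly, on a changed one.

    depends_on maps each derived quantity to the list of its inputs.
    """
    affected = set()
    frontier = set(changed)
    while frontier:
        # Next wave: derived quantities with at least one input in the current wave
        frontier = {q for q, inputs in depends_on.items()
                    if q not in affected and frontier & set(inputs)}
        affected |= frontier
    return affected

# Hypothetical snapshots of an external database at two dates
old_snapshot = {"distance_mpc": 20.0, "apparent_mag": 12.5}
new_snapshot = {"distance_mpc": 22.0, "apparent_mag": 12.5}

# Hypothetical calculations: derived quantity -> inputs
depends_on = {
    "absolute_mag": ["apparent_mag", "distance_mpc"],
    "luminosity": ["absolute_mag"],
    "corrected_mag": ["apparent_mag"],
}

changed = changed_quantities(old_snapshot, new_snapshot)
affected = affected_quantities(changed, depends_on)
```

In this example only the distance changed, so `absolute_mag` and, through it, `luminosity` would be flagged for an update, while `corrected_mag` is left alone.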

Multi-workflow Research Object




Create, annotate and run a workflow




Populate the Research Object and annotate




                                 Extract !




Add documents and references




Create and explore relations among components




Add schema of the experiment




Publication for later discovery
Import and re-use!




Curation by inspecting propagation of changes in quantities
»  Taverna 2.3
»  MyExperiment Pack
   ›  http://www.myexperiment.org/packs/231




Related Publication
The AMIGA sample of isolated galaxies XI.
A First Look at Isolated Galaxy Colors
A&A 540, A47 (2012)




                                                                           Conclusions
How NOT to be a good e-astronomer
»     Chase the beautiful, high-impact plot instead of real scientific results
»     Write an obscure paper; do not say clearly how to reproduce the results
»     Do things quickly and forget about them once you’ve submitted the paper
»     Be untidy: spread your code and data across a variety of formats, folders and disks
»     Practise “data mine-ing”: the input data are mine
»     Practise “data flirting”: call me if you would like to have more
»     Do not provide the data behind your results; including the plots is just fine
»     Always cite the same authors and papers, or those that cite you
»     Do not cite resources other than papers, nor provide their URLs
»     Do not search for information on the Internet with tools other than ADS or arXiv
»     Work alone, and email or phone one friend if you have any doubts

     http://amiga.iaa.es/p/212-workflows.htm
     http://www.wf4ever-project.org
     jer@iaa.es
     bultako

More Related Content

PDF
0812 blanche ieee4
PDF
KAUST Vis labbrochure
PDF
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
PDF
Portable Retinal Imaging and Medical Diagnostics
PDF
Shallow introduction for Deep Learning Retinal Image Analysis
KEY
Data-Intensive Research
PDF
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
PDF
Light Treatment Glasses
0812 blanche ieee4
KAUST Vis labbrochure
Video Analysis with Convolutional Neural Networks (Master Computer Vision Bar...
Portable Retinal Imaging and Medical Diagnostics
Shallow introduction for Deep Learning Retinal Image Analysis
Data-Intensive Research
Wf4Ever: Advanced Workflow Preservation Technologies for Enhanced Science i
Light Treatment Glasses

What's hot (14)

PDF
Ophthalmology & Optometry 2.0
PDF
Notes on "Artificial Intelligence in Bioscience Symposium 2017"
PDF
Deep Learning for Computer Vision (3/4): Video Analytics @ laSalle 2016
PPT
Building a Global Collaboration System for Data-Intensive Discovery
PDF
Ec Nsf Workshop June99
PDF
Coupling Australia’s Researchers to the Global Innovation Economy
PDF
Precision Physiotherapy & Sports Training: Part 1
PDF
Coupling Australia’s Researchers to the Global Innovation Economy
PDF
High Performance Cyberinfrastructure Discovery Tools for Data Intensive Research
PDF
Emc 2013 Big Data in Astronomy
PPT
Uses of the OptIPortal
PDF
AI in Ophthalmology | Startup Landscape
PDF
Shrinking the Planet—How Dedicated Optical Networks are Transforming Computat...
PDF
Conferencia Web semantica Mihai Datcu
Ophthalmology & Optometry 2.0
Notes on "Artificial Intelligence in Bioscience Symposium 2017"
Deep Learning for Computer Vision (3/4): Video Analytics @ laSalle 2016
Building a Global Collaboration System for Data-Intensive Discovery
Ec Nsf Workshop June99
Coupling Australia’s Researchers to the Global Innovation Economy
Precision Physiotherapy & Sports Training: Part 1
Coupling Australia’s Researchers to the Global Innovation Economy
High Performance Cyberinfrastructure Discovery Tools for Data Intensive Research
Emc 2013 Big Data in Astronomy
Uses of the OptIPortal
AI in Ophthalmology | Startup Landscape
Shrinking the Planet—How Dedicated Optical Networks are Transforming Computat...
Conferencia Web semantica Mihai Datcu
Ad

Viewers also liked (8)

PDF
Collaborative Digital Experiments
PDF
VO Course 04: VO architecture
PDF
LEDAS, AstroGrid and the Virtual Observatory
PDF
Multidimensional Data in the VO
PDF
The selection of guide stars for giant telescopes using Virtual Observatory t...
PPTX
PDF
Virtual observatory water_final
PDF
IPython Notebooks - Hacia los papers ejecutables
Collaborative Digital Experiments
VO Course 04: VO architecture
LEDAS, AstroGrid and the Virtual Observatory
Multidimensional Data in the VO
The selection of guide stars for giant telescopes using Virtual Observatory t...
Virtual observatory water_final
IPython Notebooks - Hacia los papers ejecutables
Ad

Similar to Digital Science (20)

PDF
Digital Science: Reproducibility and Visibility in Astronomy
PDF
Research Objects in Wf4Ever
PDF
Open Science and Executable Papers
PDF
RDFC2012 Open Access to Research Data
PPTX
The Web in Science and Research: A tour through four topics
PDF
Digital Science: Towards the executable paper
PPTX
Michener Plenary PPSR2012
PPTX
International Perspectives: Visualization in Science and Education
PPTX
Visualization: ACS Sp 2010 CINF Keynote
PPTX
Integrated Technology for Archaeological Imaging in the Field and in the Lab
PDF
VO Course 12: Workflows & the Wf4Ever project
PDF
Usability, Reusability and Reproducibility of Bioinformatic Applications
PPTX
Keynote speech - Carole Goble - Jisc Digital Festival 2015
PPTX
RARE and FAIR Science: Reproducibility and Research Objects
KEY
Wf4Ever: Work!ows for Methodology and Science Preservation
PPT
30_Eden.ppt
PDF
CODATA International Training Workshop in Big Data for Science for Researcher...
PPTX
Reasons to select research data and where to start
PDF
Keynote IEEE International Workshop on Cloud Analytics. Dennis Gannon
PPTX
Davis aas 2012 goddard 1
Digital Science: Reproducibility and Visibility in Astronomy
Research Objects in Wf4Ever
Open Science and Executable Papers
RDFC2012 Open Access to Research Data
The Web in Science and Research: A tour through four topics
Digital Science: Towards the executable paper
Michener Plenary PPSR2012
International Perspectives: Visualization in Science and Education
Visualization: ACS Sp 2010 CINF Keynote
Integrated Technology for Archaeological Imaging in the Field and in the Lab
VO Course 12: Workflows & the Wf4Ever project
Usability, Reusability and Reproducibility of Bioinformatic Applications
Keynote speech - Carole Goble - Jisc Digital Festival 2015
RARE and FAIR Science: Reproducibility and Research Objects
Wf4Ever: Work!ows for Methodology and Science Preservation
30_Eden.ppt
CODATA International Training Workshop in Big Data for Science for Researcher...
Reasons to select research data and where to start
Keynote IEEE International Workshop on Cloud Analytics. Dennis Gannon
Davis aas 2012 goddard 1

More from Jose Enrique Ruiz (15)

PDF
Jupyter notebooks on steroids
PDF
Velocity cubes of galaxies
PDF
Implementing a VO archive for datacubes of galaxies
PDF
Workflows to access and massage VOData
PDF
Curation and Characterization of Web Services
PDF
Wf4Ever: Workflow Preservation
PDF
Workflows in the Virtual Observatory
PDF
Use of CharDM in an archive of velocity cubes
PDF
Workflow Preservation
PDF
VO web-services-based astronomy workflows
PDF
Web services based workflows to deal with 3D data
PDF
Curating and Preserving Collaborative Digital Experiments
PDF
SVO Activities - SEA 2008
PDF
El Observatorio Virtual - eCA
PDF
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop
Jupyter notebooks on steroids
Velocity cubes of galaxies
Implementing a VO archive for datacubes of galaxies
Workflows to access and massage VOData
Curation and Characterization of Web Services
Wf4Ever: Workflow Preservation
Workflows in the Virtual Observatory
Use of CharDM in an archive of velocity cubes
Workflow Preservation
VO web-services-based astronomy workflows
Web services based workflows to deal with 3D data
Curating and Preserving Collaborative Digital Experiments
SVO Activities - SEA 2008
El Observatorio Virtual - eCA
B0DEGA 3D VO Archive - IVOA 2010 Fall Interop

Recently uploaded (20)

PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
PPTX
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
Origin of periodic table-Mendeleev’s Periodic-Modern Periodic table
PDF
Pre independence Education in Inndia.pdf
PDF
O7-L3 Supply Chain Operations - ICLT Program
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
Basic Mud Logging Guide for educational purpose
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
Insiders guide to clinical Medicine.pdf
PDF
01-Introduction-to-Information-Management.pdf
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
2.FourierTransform-ShortQuestionswithAnswers.pdf
Pharmacology of Heart Failure /Pharmacotherapy of CHF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Origin of periodic table-Mendeleev’s Periodic-Modern Periodic table
Pre independence Education in Inndia.pdf
O7-L3 Supply Chain Operations - ICLT Program
Abdominal Access Techniques with Prof. Dr. R K Mishra
Microbial diseases, their pathogenesis and prophylaxis
Basic Mud Logging Guide for educational purpose
Final Presentation General Medicine 03-08-2024.pptx
Module 4: Burden of Disease Tutorial Slides S2 2025
human mycosis Human fungal infections are called human mycosis..pptx
Insiders guide to clinical Medicine.pdf
01-Introduction-to-Information-Management.pdf

Digital Science

  • 1. The Reality of Reproducibility of in silico Science Prof Carole Goble FREng FBCS CITP JCDL Washington DC, June 2012
  • 2. X REUNIÓN CIENTÍFICA DE LA SOCIEDAD ESPAÑOLA DE ASTRONOMÍA VALENCIA 9/13 JULIO Digital Science Reproducibility and Visibility in Astronomy José Enrique Ruiz, Lourdes Verdes-Montenegro, Susana Sánchez, Julian Garrido, Juan de Dios Santander and the Wf4Ever Team SESIÓN INSTRUMENTACIÓN Y COMPUTACIÓN VALENCIA, VIERNES 13 JULIO 2012 2
  • 3. Digital Science - Reproducibility and Visibility in Astronomy Astronomy Research Lifecycle Astronomy research lifecycle is entirely digital »  Observation proposals »  Data reduction pipelines »  Analysis of science ready data »  Catalogs of objects and data »  Publish process ›  Final data results ›  Experiment in DL ADS/arXiv Reproducible research is still not A normalized preservation of possible in a digital world methodology is needed A rich infrastructure of data (VO) Tools is not efficiently used 3
  • 4. Digital Science - Reproducibility and Visibility in Astronomy The next generation of archives Much wider FoV and spectral coverage »  Large volumes for an observed datacube »  Subproducts are Virtual Data generated on-the-fly ASKAP Cubes Prof. Kevin Vinsen 4
  • 5. Digital Science - Reproducibility and Visibility in Astronomy The next generation of archives Automated surveys »  Huge amounts of tabular data »  Services for KDD Extraction of scientifically relevant information from a multidimensional parameter space »  Exploration services »  Anomaly detection »  Cross-matching data »  Dimensionality reduction 5
  • 6. Digital Science - Reproducibility and Visibility in Astronomy The next generation of archives »  A cloud of Web Services Archives should evolve from data providers into »  Virtual data providers »  Software tasks providers »  Archives speaking Web Services Astronomy of multi archives/facilities/wavelength Interconnected and interoperable archives »  Software Tasks Preservation »  Data Process should benefit of the same privileges acquired by data Preserving the method ensures replication of final results at any moment 6
  • 7. Digital Science - Reproducibility and Visibility in Astronomy Efficiency and Reuse Optimize return on investments made on big facilities »  Avoid duplication of efforts and reinvention »  How to discover and not duplicate ? »  How to re-use and not duplicate ? »  How to make use of best practices ? »  How to use the rich infrastructure of data ? »  Intellectual contributions are encoded in softw More data in archives does not imply more knowledge »  Time has come to go beyond the PDF »  Expose complete scientific record, not the story »  Allow easy discovery of methods and tools 7
  • 8. Digital Science - Reproducibility and Visibility in Astronomy Reproducibility and The Scientific Method Benefits »  Publishing knowledge, not advertising »  The author, the referee and the re-user »  Reputation, prestige and respect »  Higher quality of publications ›  Authors will be more careful ›  Many eyes to check results Challenges »  Hard and time consuming »  Need incentives – not rewarded now Initiatives »  Elsevier Executable Papers Challenge http://xkcd.com/242/ »  Open Data / Open Science 8
  • 9. Digital Science - Reproducibility and Visibility in Astronomy Reproducibility and The Scientific Method I don’t know how 9
  • 10. Digital Science - Reproducibility and Visibility in Astronomy Discovery, Visibility and Credit 10
  • 11. Digital Science - Reproducibility and Visibility in Astronomy Discovery, Visibility and Credit 11
  • 12. Digital Science - Reproducibility and Visibility in Astronomy Discovery, Visibility and Credit 12
  • 13. Digital Science - Reproducibility and Visibility in Astronomy Discovery, Visibility and Credit Exploring and understanding scientific metrics in citation 2010 Krapivin et al. 13
  • 14. Digital Science - Reproducibility and Visibility in Astronomy Discovery, Visibility and Credit Paper discovery: the social dimension #SEA2012 14
  • 15. Digital Science - Reproducibility and Visibility in Astronomy The Wf4Ever Project EU funded FP7 STREP Project December 2010 – December 2013 1.  Intelligent Software Components (ISOCO, Spain) 2.  University of Manchester (UNIMAN, UK) 2 3.  Universidad Politécnica de Madrid (UPM, Spain) 7 5 4 4.  Poznan Supercomputing and Networking Centre (PSNC, Poland) 5.  University of Oxford (OXF, UK) 6.  Instituto de Astrofísica de Andalucía (IAA, Spain) 31 7.  Leiden University Medical Centre (LUMC, NL) 6 15
  • 16. Digital Science - Reproducibility and Visibility in Astronomy Scientific Workflows Living Tutorials Templates for Re-use Expedites Training Reduce time to insight Avoids reinvention Digital Libraries of workflows may boost the use of the existing infrastructure of data (VO) 16
  • 17. Digital Science - Reproducibility and Visibility in Astronomy Scientific Workflows ! Survey in the domain of astrophysical workflows ! Scientific ›  Personal script-based recipes Insight •  Python, IDL, Software.. ›  Multi-archive VO recipes •  Euro-VO, IVOA.. ›  Internal group developments Accessible •  GRID, Clusters, Specific knowledge.. Shareable ›  Processing pipelines Reusable •  Facilities provide data, computing infrastructure, tools.. Adaptable Understandable »  Clarity (workflows) for re-use and re-porpuse vs. automation (pipelines) »  A black box is not re-usable, cannot be broken into parts »  Reproducibility vs. industrial paper publishing 17
  • 18. Digital Science - Reproducibility and Visibility in Astronomy Research Objects Organization is more sexy than automation Assistive building Completeness evaluation 18
  • 19. Digital Science - Reproducibility and Visibility in Astronomy Research Objects Expose experiment in a structured way in order to be understood Technical Objects Social Objects Distributed 19
  • 20. Digital Science - Reproducibility and Visibility in Astronomy Research Objects ! Similar initiatives in Astronomy ! »  Semantic curation of digital objects ›  CDS Centre Données Strasbourg ›  US Virtual Astronomical Observatory ›  SAO/NASA ADSLabs »  Workflow users platforms ›  Cyber-SKA ›  IceCore ›  Montage ›  Astro-WISE ›  Helio-VO »  Semantically auto descriptive WS ›  Workflows VO-France 20
  • 21. Research Objects: ADSLabs Initiative. ADO Linked Components:
      »  Authors
      »  Publications
      »  Journals
      »  Objects (SIMBAD)
      »  Tabular data behind the plots (CDS)
      »  ASCL reference of used software
      »  Observing time proposals
      »  Used facilities, surveys or missions
      Incentives
      http://labs.adsabs.harvard.edu/
  • 22. Research Objects: the incentive
      Papers with data links are cited more than those without.
      Reference: Edwin A. Henneken et al., "Effect of E-printing on Citation Rates in Astronomy and Physics", 2006.
  • 23. The Wf4Ever Project
      »  Development of AstroTaverna plugins to access and manage VO data
      »  Development of Golden Exemplars of astrophysical Workflows and Research Objects that use the Wf4Ever technological support
         ›  Curation of physical quantities in 1D catalogues
            •  Data retrieved from external repositories and stored locally
            •  Only local processes for calculations
         ›  Environment and Modelling from 1D catalogues and 2D images
            •  Data retrieved from external repositories (SDSS DR7)
            •  Local software and external web services as processes
         ›  Modelling and Analysis of 3D formatted data
            •  Only external data and processes
  • 24. Astronomical Research Objects in Action: curation by inspecting propagation of changes in quantities
      Image credit: Zsolt Frei and James E. Gunn, The Galaxy Catalog
  • 25. Astronomical Research Objects in Action
      AMIGA Catalog: panchromatic properties for a sample of the most isolated nearby galaxies
      How is the User DB affected?
         -  Changes in External DB
         -  Modifications in Calculations
      Evaluate variations with time
         -  Modified External Quantities
         -  Affected User DB Quantities
      [Diagram labels: Update, External DB, Calculations, User DB]
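The question on this slide (which User DB quantities are affected when the External DB changes) can be illustrated with a minimal Python sketch. It is not the AMIGA workflow itself: the quantity names, the numeric values and the helper functions `changed_quantities`, `affected_outputs` and `recompute` are all hypothetical, and the only physics assumed is the standard distance-modulus relation M = m - 5 log10(d / 10 pc).

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Calculation:
    """One derived User DB quantity and the external quantities it reads."""
    output: str
    inputs: Tuple[str, ...]
    func: Callable[..., float]


def absolute_mag(apparent_mag: float, distance_mpc: float) -> float:
    # Distance modulus with d in Mpc: M = m - 5*log10(d_Mpc) - 25
    return apparent_mag - 5.0 * math.log10(distance_mpc) - 25.0


# Hypothetical calculations linking the External DB to the User DB.
CALCULATIONS: List[Calculation] = [
    Calculation("absolute_mag", ("apparent_mag", "distance_mpc"), absolute_mag),
    Calculation("g_minus_r", ("g_mag", "r_mag"), lambda g, r: g - r),
]


def changed_quantities(old: Dict[str, float], new: Dict[str, float]) -> Set[str]:
    """External quantities whose value differs between two snapshots."""
    return {
        name for name in old.keys() & new.keys()
        if not math.isclose(old[name], new[name], rel_tol=1e-9)
    }


def affected_outputs(calcs: List[Calculation], changed: Set[str]) -> List[str]:
    """User DB quantities that depend on at least one changed input."""
    return sorted(c.output for c in calcs if set(c.inputs) & changed)


def recompute(calcs: List[Calculation], external: Dict[str, float]) -> Dict[str, float]:
    """Re-run every calculation against one External DB snapshot."""
    return {c.output: c.func(*(external[n] for n in c.inputs)) for c in calcs}


if __name__ == "__main__":
    # Made-up snapshots: only the distance was revised upstream.
    old = {"apparent_mag": 12.0, "distance_mpc": 10.0, "g_mag": 13.0, "r_mag": 12.4}
    new = dict(old, distance_mpc=20.0)
    changed = changed_quantities(old, new)
    print("Modified external quantities:", sorted(changed))
    print("Affected User DB quantities:", affected_outputs(CALCULATIONS, changed))
```

Comparing `recompute(CALCULATIONS, old)` with `recompute(CALCULATIONS, new)` then gives the variation with time for each affected quantity.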
  • 26. Astronomical Research Objects in Action: curation by inspecting propagation of changes in quantities. Multi-workflow Research Object
  • 27. Astronomical Research Objects in Action: create, annotate and run a workflow
  • 28. Astronomical Research Objects in Action: populate the Research Object and annotate; extract
  • 29. Astronomical Research Objects in Action: add documents and references
  • 30. Astronomical Research Objects in Action: create and explore relations among components
  • 31. Astronomical Research Objects in Action: add schema of the experiment
  • 32. Astronomical Research Objects in Action: publication for later discovery; import and re-use
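The steps walked through in these slides (populate, annotate, relate components, check what is still missing) can be sketched as a small Python data structure. This is only an illustration under stated assumptions: it is not the Wf4Ever Research Object model or the myExperiment API, and the class `ResearchObject`, its methods and the resource kinds used below are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass
class ResearchObject:
    """Hypothetical container aggregating the parts of one experiment."""
    title: str
    resources: Dict[str, str] = field(default_factory=dict)          # name -> kind
    annotations: Dict[str, List[str]] = field(default_factory=dict)  # name -> notes
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, name: str, kind: str, note: Optional[str] = None) -> None:
        """Populate the object with a workflow, dataset, document, etc."""
        self.resources[name] = kind
        if note:
            self.annotate(name, note)

    def annotate(self, name: str, note: str) -> None:
        """Attach a free-text annotation to an existing resource."""
        if name not in self.resources:
            raise KeyError(name)
        self.annotations.setdefault(name, []).append(note)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        """Record a relation between two aggregated resources."""
        for name in (subject, obj):
            if name not in self.resources:
                raise KeyError(name)
        self.relations.append((subject, predicate, obj))

    def missing_kinds(self, required: Iterable[str]) -> Set[str]:
        """Simple completeness check: required kinds not yet present."""
        return set(required) - set(self.resources.values())


if __name__ == "__main__":
    ro = ResearchObject("Propagation of changes in quantities")
    ro.add("propagation.t2flow", "workflow", note="Taverna workflow")
    ro.add("paper.pdf", "document")
    ro.relate("paper.pdf", "describes", "propagation.t2flow")
    # Resource kinds a reader might expect before publication (assumed list).
    print(ro.missing_kinds({"workflow", "document", "sketch"}))
```

Keeping relations as plain (subject, predicate, object) triples is a design choice made here for brevity; it loosely mirrors how linked components can be explored, without claiming any particular vocabulary.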
  • 33. Astronomical Research Objects in Action: curation by inspecting propagation of changes in quantities
      »  Taverna 2.3
      »  MyExperiment Pack
         ›  http://www.myexperiment.org/packs/231
      Related publication: "The AMIGA sample of isolated galaxies XI. A First Look at Isolated Galaxy Colors", 2012, A&A 540, A47
  • 34. Conclusions: how NOT to be a good e-astronomer
      »  Search for the beautiful plot for high impact instead of real scientific results
      »  Write an obscure paper; do not say clearly how to reproduce the results
      »  Do things quickly and forget about them once you’ve submitted the paper
      »  Be untidy; spread your code and data across a variety of formats, folders and disks
      »  Practise “data mine-ing” – input data are mine
      »  Practise “data flirting” – call me if you would like to have more
      »  Do not provide data results; including the plots is just fine
      »  Always cite the same authors and papers, or those that cite you
      »  Do not cite resources other than papers, nor provide their URLs
      »  Do not search for information on the Internet with tools other than ADS or arXiv
      »  Work alone, and email/phone one friend if you have any doubt
      http://amiga.iaa.es/p/212-workflows.htm
      http://www.wf4ever-project.org
      jer@iaa.es
      bultako