This document discusses the creation of abstractions in scientific workflows. Its hypothesis is that reusable patterns and abstractions can be extracted automatically from scientific workflow repositories, and that these would be useful to workflow developers. The document outlines challenges in workflow representation, abstraction, reuse, and annotation. It then describes an approach for defining vocabularies and methodologies to publish workflows as linked data. The approach includes a catalog of common workflow abstractions, along with techniques for finding and evaluating these abstractions across different workflow corpora. The evaluation shows that the extracted patterns are similar to those defined by users and are considered useful.