This document discusses the need to study data science as a discipline by examining its processes, techniques, and outputs. It presents data science as a set of iterative steps, such as forming hypotheses, collecting and analyzing data, and extracting results. It proposes ontologies and platforms as tools for systematically describing datasets, licenses, models, and tasks. Case studies examine how to model data flows and how to understand patterns in large data science systems. The document argues for an interdisciplinary approach, and for techniques such as science fiction that bring social and ethical implications into view, so that data science is developed and applied responsibly.