Association of College and Research Libraries
A division of the American Library Association
Chicago, Illinois 2017
Curating
Research Data
Volume One: Practical
Strategies for Your Digital
Repository
edited by
Lisa R. Johnston
The paper used in this publication meets the minimum requirements of Ameri-
can National Standard for Information Sciences–Permanence of Paper for Printed
Library Materials, ANSI Z39.48-1992. ∞
Cataloging-in-Publication data is on file with the Library of Congress
Copyright ©2017 by the Association of College and Research Libraries.
All rights reserved except those which may be granted by Sections 107 and 108 of
the Copyright Revision Act of 1976.
Printed in the United States of America.
21 20 19 18 17 5 4 3 2 1
Cover image
Copyright: kentoh / 123RF Stock Photo (http://www.123rf.com/profile_kentoh)
Table of Contents
Introduction to Data Curation
Lisa R. Johnston
Data, Data Repositories, and Data Curation: Our Terminology
Why We Curate Research Data
The Challenge of Providing Data Curation Services
Reuse: the Ultimate Goal of Data Curation?
Conclusion
Notes
Bibliography
Part I. Setting the Stage for Data Curation. Policies,
Culture, and Collaboration
Chapter 1. Research and the Changing Nature of Data
Repositories
Karen S. Baker and Ruth E. Duerr
Introduction
Background
Changing Support for Data
Expanding Support for Data in Natural and Social Sciences
Data Repository Diversity
Three Concepts at Work
Data Ecosystem: Growing Interdependence
Liaison Work and Mediation
Continuing Design: Standards, Systems, and Models
Changing Research Needs and New Initiatives
Final Thoughts
Notes
Bibliography
Chapter 2. Institutional, Funder, and Journal Data
Policies
Kristin Briney, Abigail Goben, and Lisa Zilinski
Funding Agency Data Policies
Institutional Data Policies
Journal Data Policies
Navigating the Data Policy Landscape for Curation
Summary
Notes
Bibliography
Chapter 3. Collaborative Research Data Curation
Services: A View from Canada
Eugene Barsky, Larry Laliberté, Amber Leahey, and Leanne Trimble
Canadian Academic Library Involvement in Research Data
Management
Overview of Case Studies
Local Services: University of Alberta Libraries
Informal Regional Consortia: University of British Columbia Library
Formal Regional Consortia: The Ontario Council of University
Libraries
Data Repository Services in Canadian Libraries
Discovery and Access Platforms
Long-Term Preservation
Operational Costs of Data Repository Services
National Collaboration: Portage
Goal 1: Portage National Data Preservation Infrastructure
Goal 2: Portage Network of Expertise
Future Directions
Conclusions
Notes
Bibliography
Chapter 4. Practices Do Not Make Perfect: Disciplinary
Data Sharing and Reuse Practices and Their
Implications for Repository Data Curation
Ixchel M. Faniel and Elizabeth Yakel
Introduction
Overview and Methodology for the DIPIR Project
Disciplinary Traditions for Data Sharing and Reuse
Social Scientists
Archaeologists
Zoologists
Data Reuse and Trust
Trust Marker: Data Producer
Trust Marker: Documentation
Trust Marker: Publications and Prior Reuse Indicators
Trust Marker: Repository Reputation
Sources of Additional Support for Data Reuse
Social Scientists
Archaeologists
Zoologists
Implications for Repository Practice
Conclusion
Acknowledgments
Notes
Bibliography
Chapter 5. Overlooked and Overrated Data Sharing:
Why Some Scientists Are Confused and/or Dismissive
Heidi J. Imker
Data Sharing in Context
Overlooked Data Sharing: Article Publication
Overlooked Data Sharing: Supplemental Material
Overrated Data Sharing: Unsustained Community Resources
Overrated Data Sharing: Hyperbolic Arguments
Conclusions
Acknowledgments
Notes
Bibliography
Part II. Data Curation Services in Action
Chapter 6. Research Data Services Maturity in
Academic Libraries
Inna Kouper, Kathleen Fear, Mayu Ishida, Christine Kollen, and
Sarah C. Williams
Introduction
Research Data and Libraries
The Current Landscape
RDS Maturity
Looking into the Future
Appendix 6A: Typology of Services and Their Descriptions on Websites
Notes
Bibliography
Chapter 7. Extending Data Curation Service Models for
Academic Library and Institutional Repositories
Jon Wheeler
Introduction
Conceptual Models and Rationale
Alignment with Existing Roles and Capabilities
Applications: Requirements and Example Use Cases
Defining Stakeholder Interactions and Requirements
Harvesting and Metadata Processing
Content Curation and Packaging
Conclusion
Acknowledgments
Notes
Bibliography
Chapter 8. Beyond Cost Recovery: Revenue Models and
Practices for Data Repositories in Academia
Karl Nilsen
Introduction
From Costs to Revenue
Data Repository Revenue Models
Model 1: Public or Consortium
Model 2: Freemium
Model 3: Pay-to-Play
Model 4: Pay-if-You-Can or Pay-if-You-Want
Model 5: Grants
Model 6: Outside-Data
Common Challenges Associated with Revenue Practices
Conclusion
Notes
Bibliography
Chapter 9. Current Outreach and Marketing Practices
for Research Data Repositories
Katherine J. Gerwig
The Survey
The Interviews
Measuring the Success of Repository Promotions
Successful Promotional Techniques
Unsuccessful Promotional Techniques
Target Audiences
Challenges to Increasing Awareness
Differences in Promoting the Institutional Repository and the Data
Repository
Looking for Inspiration
Discussion
Conclusion
Promotional Examples for Inspiration
Acknowledgments
Appendix 9A: Data Repository Promotional Practices—Initial Google
Survey
Notes
Bibliography
Part III. Preparing Data for the Future. Ethical and
Appropriate Reuse of Data
Chapter 10. Open Exit: Reaching the End of the Data
Life Cycle
Andrea Ogier, Natsuko Nicholls, and Ryan Speer
Introduction
Comparative Exploration
“End of Life Cycle” Terminology
Scope
Authority
Appraisal Criteria
Resources (Human, Financial, and Spatial)
Discussion
University Records and Information Management
Library Collections
Data Curation
Conclusion
Notes
Bibliography
Chapter 11. The Current State of Meta-Repositories for
Data
Cynthia R. Hudson Vitale
Introduction
Community Initiatives and Solutions to Support Meta-Repositories of
Data
Methods
Results
Content
Functionality
Metadata
Discussion
Conclusion
Notes
Bibliography
Chapter 12. Curation of Scientific Data at Risk of Loss:
Data Rescue and Dissemination
Robert R. Downs and Robert S. Chen
Benefits of Data Rescue
Challenges of Data Rescue for Repositories
Repository Considerations for Data Rescue
Rescue of the Millennium Ecosystem Assessment (MA) Data
Dissemination of the Millennium Ecosystem Assessment (MA) Data
Lessons Learned
Discussion and Conclusion
Acknowledgments
Notes
Bibliography
Contributor Biographies
Editor Biography
Author Biographies
INTRODUCTION TO VOLUME ONE
Introduction to Data
Curation
Lisa R. Johnston
As varied as they can be rare and precious, data are becoming the proverbial coin
of the digital realm: a research commodity that might purchase reputation credit
in a disciplinary culture of data sharing or buy transparency when faced with
funding agency mandates or publisher scrutiny. Unlike most monetary systems,
however, digital data can flow in all too great abundance. Not only does this cur-
rency actually “grow” on trees, but it comes from animals, books, thoughts, and
each of us! And that is what makes data curation so essential. The abundance of
digital research data challenges library and information science professionals to
harness this flow of information streaming from research discovery and scholarly
pursuit and preserve the unique evidence for future use. Our expertise as curators
can help ensure the resiliency of digital data, and the information it represents, by
addressing how the meaning, integrity, and provenance of digital data generated
by researchers today will be captured and conveyed to future researchers over time.
The focus of Curating Research Data, Volume One: Practical Strategies for Your
Digital Repository and the companion Volume Two: A Handbook of Current Prac-
tice is to present those tasked with long-term stewardship of digital research data
a blueprint for how to curate data for eventual reuse. There are many motivations
for storing and preserving data, but the ultimate goal of reuse by others will be
a theme for all that follows. Following a brief overview to the terminology used
in the two volumes, this introduction will explore the external motivations that
impact why we develop data curation services and the driving forces behind why
researchers share their data, including federal data management requirements,
publisher policies for data sharing, and an overall sea change of disciplinary expectations
for digital data exchange. Next, this chapter will dive into some of the
challenges that practitioners in the library and archival fields face when curating
digital research data as well as some emerging solutions. In closing we will explore
the sea change stemming from data reuse, from the disruptive effects that
data transparency and the reproducibility movement have had on the scholarly
communication life cycle to the potentially democratizing effect of digital data
availability worldwide.
Data, Data Repositories, and Data
Curation: Our Terminology
Data is an evolving term. At its core, data can be any information that is factual
and can be analyzed. Data is “information in numerical form that can be digitally
transmitted or processed.” But in the research setting, data can be more abstract
and consist of any information object (numerical or otherwise).1 For information
science professionals, the term ‘research data’ has recently been defined as:
“data that are used as primary sources to support technical or
scientific enquiry, research, scholarship, or artistic activity, and
that are used as evidence in the research process and/or are
commonly accepted in the research community as necessary
to validate research findings and results…. Research data may
be experimental, observational, operational, data from a third
party, from the public sector, monitoring data, processed data,
or repurposed data.”
Data are defined in the Digital Curation Center (DCC) Curation Lifecycle
Model as “any information in the binary digital form” and are treated there in the
broad sense of any digital information.3 Harvey
describes the breadth of data as encompassing all things digital, based on
UNESCO’s Guidelines for the Preservation of Digital Heritage, and takes into
account the more subtle nuances of NSF’s description of “scientific data” to create
a list of data objects that includes:
• Data sets: Observational, computational, simulated, or otherwise recorded
output
• Digital collections: A grouping of digital objects, such as a photo archive
or a vast text-based library of digitized books, can be interpreted as one
data set
• Learning objects: Videos, digital online tutorials
• Multimedia: Recordings of film, music, and performance art
• Software: Applications including the code and documentation files4
Sometimes primarily associated with the sciences, data can be found in any
discipline and in many forms.5
Data may be raw (e.g., numbers collected by
an instrument), aggregated from multiple sources, or the product of a model,
simulation, or visualization (e.g., a graphic or video). Digital humanities data
might include digitized or born-digital texts and monographs, digital image li-
braries, and 3D models, such as those used for historic reconstruction of ancient
or mythological sites.6
Social scientists produce large quantities of data, including
survey data and observational data, such as complex human activity and interac-
tions captured via sensors or video.7
Outside of research, the business, industry,
and commerce sectors produce “big data” that is used to better understand re-
search questions about human behavior, and as a result a growing (and some-
times nefarious) economy of selling the transactional data derived from business
has emerged.8
With the explosion of digital data produced by modern research or recorded
through our general day-to-day activity, digital data repositories are storing vast
amounts of information. Data repositories preserve information “by taking own-
ership of the records, ensuring that they are understandable to the accessing com-
munity, and managing them so as to preserve their information content and Au-
thenticity.”9
The co-authors of the “Key Components of Data Publishing” report
use the practitioner-based Research Data Alliance (RDA) definitions developed
by the Data Foundations and Terminology Working Group and the Research
Data Canada’s Glossary of Terms and Definitions to define digital repositories as:
A repository (also referred to as a data repository or digital
data repository) is a searchable and queryable interfacing entity
that is able to store, manage, maintain and curate Data/Dig-
ital Objects. A repository is a managed location (destination,
directory or ‘bucket’) where digital data objects are registered,
permanently stored, made accessible and retrievable, and curat-
ed. Repositories preserve, manage, and provide access to many
types of digital material in a variety of formats. Materials in
online repositories are curated to enable search, discovery, and
reuse. There must be sufficient control for the digital material
to be authentic, reliable, accessible and usable on a continuing
basis.10
Additionally, the 2005 National Science Board anticipated the need for data
repositories, stating that:
It is exceedingly rare that fundamentally new approaches to
research and education arise. Information technology has ushered
in such a fundamental change. Digital data collections are
at the heart of this change. They enable analysis at unprece-
dented levels of accuracy and sophistication and provide novel
insights through innovative information integration. Through
their very size and complexity, such digital collections provide
new phenomena for study. At the same time, such collections
are a powerful force for inclusion, removing barriers to partici-
pation at all ages and levels of education.11
Simply put: data includes a wide range of information, and data repositories
retain this information for reuse. Therefore our challenge as data curators is to
apply the archival principles of library and information sciences to a wide variety
of complex data objects from all disciplines and prepare them for ingest, access,
and long-term preservation within an environment (such as a data repository)
that facilitates discovery and access while not diminishing their context, authen-
ticity, and value. No short order. As data curators we effectively become the first
users of the data. In doing so we may review the various aspects of the data (such
as arrangement, completeness, clarity, and quality), identify any reuse issues early
on, and work with the data author to correct these issues. This concept is very
important considering the long-term burden of ingesting and storing research
data in our repositories. We need to first verify that those data can be understood
and do our best to optimize them for reuse. Otherwise, our data repository can
still do all of the things listed in the RDA definition above, the only difference
being that the data might not be usable.
It is the variety and complexity of data, and its context, that make it much
more difficult to preserve so that others might make use of it. Therefore our
definition of data curation must also include verifying that all of the essential
metadata and supplementary information, describing what the data is and how
to understand it, are curated as well. For example, ensuring that supplementary
files to the dataset, like codebooks, data dictionaries, schemas, and readme files
provide the additional documentation needed to understand the file contents is a
key step in the data curation process.
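As a minimal sketch of this documentation check, assuming a curator has a plain list of the file names in a submission, the following Python flags submissions that arrive without any codebook, data dictionary, schema, or readme file. The naming hints and function names are hypothetical and only illustrative; a real repository would define its own conventions.

```python
from pathlib import PurePath

# Hypothetical naming hints for documentation files (an assumption, not a standard).
DOC_HINTS = ("readme", "codebook", "dictionary", "schema")


def find_documentation(filenames):
    """Return the file names whose stem contains one of the documentation hints."""
    return [
        name
        for name in filenames
        if any(hint in PurePath(name).stem.lower() for hint in DOC_HINTS)
    ]


def missing_documentation(filenames):
    """Return True when no documentation-like file accompanies the data files."""
    return len(find_documentation(filenames)) == 0


if __name__ == "__main__":
    submission = ["survey.csv", "README.txt", "survey_codebook.pdf"]
    print(find_documentation(submission))
    print(missing_documentation(["survey.csv"]))
```

A check like this only confirms that documentation files are present; judging whether they actually explain the data remains the curator's task.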
The optimization aspect can be found in the “adds value” statement of the
University of Illinois’ School of Information Sciences Data Curation Specialization
definition for data curation as
the active and ongoing management of data through its life-
cycle of interest and usefulness to scholarship, science, and
education. Data curation enables data discovery and retrieval,
maintains data quality, adds value, and provides for re-use over
time through activities including authentication, archiving,
management, preservation, and representation.12
However these concepts also apply to any digital object (for example, a book
or an article), not necessarily just data, and therefore data curation is understood
as a subset of digital curation, which covers all types of digital information.13 In
short, the goal of data curation is to prepare research outputs in ways that make them
useful beyond their original purpose, ensure completeness, and facilitate long-term
citability.
Volume One of Curating Research Data explores the variety of reasons, mo-
tivations, and drivers for why data curation services are needed in the context of
academic and disciplinary data repository efforts. The following twelve chapters,
divided into three parts, take an in-depth look at the complex practice of data cu-
ration as it emerges around us. Part I sets the stage for data curation by describing
current policies, data sharing cultures, and collaborative efforts underway that
impact potential services. Part II brings several key issues, such as cost recovery
and marketing strategy, into focus for practitioners when considering how to put
data curation services into action. Finally, Part III describes the full life cycle of
data by examining the ethical and practical reuse issues that data curation practi-
tioners must consider as we strive to prepare data for the future.
Why We Curate Research Data
In Part I, Setting the Stage for Data Curation: Policies, Culture and Collaboration,
we explore the factors that influence our actions to provide data curation services
for research data. Some factors include incentives, both positive and
negative, from funding bodies and scholarly publishing entities. Other
factors come directly from the research communities themselves, some of which
are demanding greater transparency in research. These motivations can sometimes
be indirect or even at odds with a researcher’s goals.14 Overall the policies,
culture, and collaborations involved with data curation provide us with an
interesting canvas with which to begin our work.
One driving force that leads library and information science practitioners to
provide data curation services is the inherent fact that digital data are more easily
shared. Data have always held value beyond their original purpose, and today,
digital data can travel and reach worldwide audiences at unprecedented speeds
with incremental costs. A 1989 National Academies of Sciences panel described
the impact of information technology on research in the sciences, engineering,
and clinical research as improving collaboration among researchers “more widely
and efficiently” by reducing “the constraints of speed, cost, and distance from
the researcher.”15
And incentives to collaborate across institutional or disciplinary
boundaries have boomed. Rates of co-authorship are increasing not only in the
sciences but across disciplines that were traditionally solo-researcher focused such
as the social sciences.16
In short, digital data presents researchers with many new
ways of working collaboratively across institutional and geographic boundaries.
In Chapter 1, “Research and the Changing Nature of Data Repositories,”
Karen S. Baker and Ruth E. Duerr draw from their experiences working at
large scientific data repositories to explore data management and curation
in the broader landscape of disciplinary research. They describe how reposito-
ries, which initially were designed for highly structured data housed at key disci-
plinary repositories, have now emerged at the center of a modern ‘data ecosystem’
proliferated by the emerging requirements to openly, and ethically, disseminate
research data. Their examples of early data registries and international data orga-
nizations—and the various stakeholders involved—paint a complex picture and
provide excellent food for thought as our authors ask us to ponder how library
data professionals contribute to and coordinate with the broader ecosystem of
data repositories.
Another significant, and more opaque, driver for data curation services
is the emergence of funding requirements for data sharing. Over the last several
years, national funding agencies and political administrations worldwide have
developed a growing awareness of the need for public access to the results
of government-funded research and for the long-term preservation of these
unique digital research data sets.17
For example, a key turning point in the
US was the February 22, 2013, memorandum18 by the White House Office
of Science and Technology Policy (OSTP) directing federal agencies to develop
plans to ensure all resulting publications and research data are publicly
accessible. The memo’s requirements for sharing digital research data in ways
that make the data “publicly accessible to search, retrieve, and analyze” suggested
that federally funded researchers will soon be faced with many new
requirements that:
• Ensure that the data are richly described with machine-actionable metadata
• Ensure that data are complete, self-explanatory, and accurate (quality)
• Protect confidentiality and privacy when making data available (e.g.,
remove identifiers, virtual data enclaves)
• Account for the long-term access and preservation needs that go beyond
the life of a grant
• Identify and/or create trusted digital repositories to steward data over
time19
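To illustrate what “machine-actionable metadata” in the first requirement might look like in practice, here is a minimal Python sketch, assuming a dataset description is held as a simple key-value record. The field names and example values are hypothetical and are not drawn from the OSTP memo or from any particular metadata standard.

```python
import json

# Hypothetical minimum set of descriptive fields (an assumption for illustration).
REQUIRED_FIELDS = ("title", "creator", "description", "license", "identifier")


def missing_fields(record):
    """Return the required fields that are absent or blank in a metadata record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


# Illustrative record with placeholder values only.
example_record = {
    "title": "Example survey responses",
    "creator": "Example Researcher",
    "description": "Placeholder description of the dataset.",
    "license": "CC0-1.0",
    "identifier": "example-dataset-001",
}

if __name__ == "__main__":
    # Serializing to JSON yields a form that software can parse and index.
    print(json.dumps(example_record, indent=2))
    print(missing_fields({"title": "Untitled", "creator": ""}))
```

Storing the description in a structured, parseable form is what allows harvesters and search services to act on it without human interpretation.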
Three years after the OSTP directive, “policies to make data and publica-
tions resulting from federally funded research publicly accessible are becoming
the norm.”20
Interestingly these efforts for sharing nationally funded research
data run parallel to an open data movement for government-authored data. This
movement is characterized by the G8 adoption of the “Open Data Charter” in
June 2013 and demonstrated by the principles set forth in the US Open Data
Action Plan released in 2014.21 And it is not only federal funders that have moved the
needle towards open. Private funders of research, such as the Ford Foundation,
the Alfred P. Sloan Foundation, and the Bill & Melinda Gates Foundation, now
require that their funded projects release underlying data with some degree of openness.22
For a detailed listing of the current policies of federal agency responses to
the OSTP memo, see SPARC Open Data’s resource for Research Funder Data
Sharing Policies.23
Complex? Absolutely. Fortunately, Chapter 2, titled “Institutional,
Funder, and Journal Data Policies” by Kristin Briney, Abigail Goben, and
Lisa D. Zilinski, does an excellent job of describing the current landscape
of funder mandates for data as well as other top-down drivers for curation
services. For example, in 2009 the National Academies of Sciences put out a call
for better standards for data sharing in ways that support reproducibility through
the ethical sharing of data along with published research results. Authors of this
report included editors of scientific journals that cited the emerging problem
of “misguided efforts to clarify results” by distorting, falsifying, or even faking
data.24
This trend continues today and sources such as Retraction Watch regularly
report examples of publishers responding to data-related issues in publications.25
As a result, many journals have implemented policies to make the underlying
data for an article more open to replication and validation. According to several
studies such as Fear, Piwowar & Chapman, and Naughton & Kernohan of the
Jisc-funded Journal of Research Data policy bank (JoRD) project, journal data
sharing requirements come in many forms.26
The latter in particular, after review-
ing the data policies of nearly 400 journals, found that half did not have a data
sharing policy and of those that did, 76 percent were found to be weakly worded
and vague. In response the JoRD project developed a model data sharing policy
that could be implemented by any organization.27
Some prominent examples of
journal data sharing policies include Nature, where “authors are required to make
materials, data, code, and associated protocols promptly available to readers with-
out undue qualifications.” The PLOS data sharing policy goes one step further to
say “Refusal to share data and related metadata and methods in accordance with
this policy will be grounds for rejection.”28
Indeed, one such retraction occurred
in 2015, albeit in a different journal (Frontiers in Neuroscience), due to an author
refusing to share their data.29
Going beyond publisher requirements to simply make data accessible and
linked to the article (see for example Elsevier’s platform for linking data in
data repositories such as PANGAEA), some publishers have created new journals
that provide a venue for “data papers” or the long-form description of a
dataset in conjunction with the data release.30 Examples include Springer-Nature’s
Scientific Data and Elsevier’s Data in Brief, both of which launched in 2014.
The latter reports “an exponential rise in data articles over the six quarters
since the journal came into existence, with approximately 300 publications ex-
pected in 2016 Q1.”31
An independent survey of 116 data journals found that
the growth in data papers nearly doubled from 2012 to 2013 and continues
to rise at an incredible rate.32
Yet, one of the curious aspects of data journals is
that the data are often not provided by the journal but rather “[the publisher
does] not consider the publication of data as part of their own mission.”33
For
example, Scientific Data suggests a list of recommended data repositories for
deposit since “we do not ourselves host data. Instead, we ask authors to submit
datasets to an appropriate public data repository.”34
It seems that scholarly
communication is still rapidly adjusting to the new norm of data sharing and
our data curation services will directly provide authors with the much-needed
support.
International collaborations providing incentives for data curation services
might be key. In 2004, many countries from Europe and others such as
Australia, the US, and Canada signed the “Declaration on Access to Research
Data from Public Funding” by the Organisation for Economic Co-operation
and Development’s (OECD’s) Committee for Scientific and Technological
Policy, which set the stage for open access to digital research data resulting
from public funding.35 The results stemming from this Declaration have
been substantial. In the United Kingdom, the seven councils of Research
Councils UK (RCUK) and the private funder, the Wellcome Trust, have each
established a policy on access to data in the years following the RCUK’s 2011
report on “Common Principles on Data Policy.”36
The European Commis-
sion has established a pilot program for data sharing through its Horizon
2020 granting arm.37
And Canada’s three federal granting agencies are mov-
ing toward policies for research data such as those explored by Shearer in the
comprehensive 2011 “Brief on Open Access to Publications and Research
Data.”38
In Chapter 3, “Collaborative Research Data Curation Services: A
View from Canada,” Eugene Barsky, Larry Laliberté, Amber Leahey, and
Leanne Trimble provide in-depth case studies from their respective in-
stitutions, the University of British Columbia, the University of Alberta,
and the Scholars Portal for the Ontario Council of University Libraries.
The three case studies are presented in the context of Canada’s overarching
national infrastructure initiative, the ambitious Portage network developed
by the Canadian Association of Research Libraries (CARL).39
An exciting
collaborative project, Portage aims to integrate existing research data reposi-
tories within a robust national discovery and preservation infrastructure net-
work for all Canadian research data. Moreover the project will bring together
library-based experts in order to share data management consultation services
across a broader network. This national effort appears similar to the role that
the JISC has played in the UK with its Research Data Management Shared
Service Project and, on a much smaller scale for sharing curation staff exper-
tise across institutions, the Data Curation Network project that your editor
recently helped launch in the US in 2016.40
In Chapter 4, different disciplinary and cultural norms of data
reuse are explored by Ixchel M. Faniel and Elizabeth Yakel, who draw
from ethnographic research with archaeologists, quantitative social sci-
entists, and zoologists in “Practices Do Not Make Perfect: Disciplinary
Data Sharing and Reuse Practices and Their Implications for Repository
Data Curation.” To synthesize disciplinary data sharing and reuse findings,
the authors partner with three repositories: the Inter-university Consortium
for Political and Social Research (ICPSR), Open Context, and the University
of Michigan Museum of Zoology (UMMZ). From these they obtain data reuse stories and
even download statistics. Their study reveals the dependencies between how
data are shared and how data are reused, with emphasis on the differences among
disciplines, and explores the interesting elements of “trust” in the data
exchanged.
In Chapter 5, “Overlooked and Overrated Data Sharing: Why So Many
Scientists are Confused and/or Dismissive,” Heidi J. Imker aptly shifts our
attention away from scientists who do not share their data, or who share it
incorrectly, and toward how often scientists do share their data and have
historically done so, long before public access requirements. This chapter presents the idea that traditional
methods of data sharing, though not generally meant for preservation purposes,
are still valid forms of sharing within the discipline. For example, sharing data via
publication in the traditional journal article is still very common, though much
of this data is often fixed in graphs or charts found in the body of the article and
therefore impractical or labor-intensive to reuse.41
As one blogger quips, “‘Send
me your data—pdf is fine,’ said no one ever.”42
Similarly, lengthy data tables historically
induced costly page fees, and data supplements to journal articles have
been criticized as unstable and “far harder to locate than [data] in public
repositories.”43
Other widespread data sharing approaches, such as posting data to a
project website or sharing data upon request, may not be sustainable for the long
term. For example, research has shown that ‘available by request’ does not work
and, furthermore, that the availability of data declines rapidly with age.44
Yet, data
sharing is still happening, and data curation efforts may help mitigate these error-prone
approaches. Imker’s exploration of these “overlooked” methods will help
data curators and librarians providing data services become better educated in the
larger picture of scholarly data exchange.
The Challenge of Providing Data Curation Services
In Part II, Data Curation Services in Action, we explore several examples of institutions
already providing data curation services, review their service offerings,
understand their technology infrastructure, and explore some of their challenging
constraints, such as identifying appropriate cost-recovery models and rolling out
promotion and marketing strategies that resonate with end users.
In addition to the chapters described here, there are many practical
examples to be found in this book’s companion volume, Curating
Research Data, Volume Two: A Handbook of Current Practice, which
collects 30 practitioner case studies from institutional, disciplinary,
and national data repositories in an eight-step workflow for data curation,
from receiving to reuse.
Putting data curation into context within the broader range of research
data management services is essential as libraries shift toward progressively more
responsible data stewardship roles at their institutions (see Figure Intro.1). For
example, Witt describes the “information bottleneck” as a place where libraries
can use data curation to help push valuable data sets beyond the laboratory
and out to the broader research community.45
Choudhury paints a rather bleak
picture of the state of institutional repositories in 2008 and recommends data
curation as a place of redemption for libraries in the larger scholarly communi-
cation landscape.46
In Chapter 6, authors Inna Kouper, Kathleen Fear, Mayu
Ishida, Christine Kollen, and Sarah C. Williams address how far we have
come with an empirical analysis of research data services provided by the
Association of Research Libraries (ARL) in “Research Data Services Matu-
rity in Academic Libraries.” As the title suggests, the results of their study of
current ARL service offerings are categorized by frequency into topographical
levels and present a vocabulary for describing research data services (RDS). They
find that basic services, such as data management plan consultations and data
management workshops, were practiced in over 50 percent of their sample, while intermediate
services, such as data deposit into repositories and data preservation,
were only found in 15 percent to 50 percent of the group. Finally, the concept
of data curation is found in less than 15 percent of the sample and labeled as
an advanced service, which includes other services such as data and researcher
IDs and data analysis. Their discussion of how these RDS concepts relate
to one another provides an excellent snapshot of the evolving vernacular, if not
actual nature, of our field. For example, the concept of data curation was still
an emerging topic within the library science, archival, and information sciences
disciplines just a few years ago and in fact very few academic libraries were suc-
cessfully offering data curation services at all according to a study in 2011.47
The
RDS maturity model presents an opportunity to self-measure the actions our
library takes in the broad arena of data services and allows us to strive to expand
them to the next level.
FIGURE INTRO.1
Data curation as a subset of research data services. Note that data curation
services may support or overlap with local data repository services, or
curation services may be provided for data that are deposited elsewhere,
such as disciplinary repositories or non-accessible (dark) storage.
[Figure labels: Data Curation, Data Repositories, Research Data Services]
The next chapter in this volume provides an excellent case study in one ac-
ademic library’s ascendance from basic to advanced data services. In Chapter 7,
Jon Wheeler describes how academic library-run institutional repositories
might be adapted to provide complementary platforms for data publication
alongside disciplinary repositories in “Extending Data Curation Service
Models for Academic Library and Institutional Repositories.” Here the conflation
between data sharing and data preservation comes to a head. While academic
researchers may deposit their data into disciplinary repositories to achieve
one, they may not always be gaining the other. Wheeler presents data repository
mirroring as one way for academic libraries to complement successful disciplinary
data repository efforts and goes on to provide several illustrative examples of
“data mirroring” efforts underway with the University of New Mexico (UNM)
Libraries. This example is unique in connecting an institutional repository to
established disciplinary data repositories and coordinating their efforts. Disciplinary
repositories such as FlyBase, PLEXdb, and the Cambridge Structural
Database present the collective data outputs of a sub-topic in publicly accessi-
ble platforms designed to allow for widespread reuse of the data.48
Within the
context of disciplinary data repositories, several repository best practices for data
curation emerge. For example, DataONE continues to educate the field by hosting
workshops and publishing guides on research data management and software
tools.49
Their in-depth resources help researchers better prepare their data for
eventual deposit into the DataONE-connected archives.50
Similarly detailed data
curation instructions for oceanographic researchers are presented in the Ocean
Data Publication Cookbook, which describes step-by-step instructions for cu-
rating disciplinary data from their field and applying digital object identifiers
(DOIs) as a central component to the curation approach.51
Greater collaboration between the stakeholders of disciplinary and institution-
al data repositories would enhance our collective understanding of data curation
best practices. In one area in particular there are several lessons to be learned: finan-
cial cost models for sustaining data repositories. Disciplinary data repositories have
been grappling with how to maintain financial support beyond their initial start-
up phase (often provided in the form of seed or grant funding) for decades.52
For
example, Ember and colleagues note the dichotomy between the long-term pres-
ervation costs of maintaining digital data, often indefinitely, with the periodic and
uncertain grant support on which these repositories must rely.53
Their white paper,
resulting from a 2013 summit with representatives from twenty-two disciplinary
data repositories, evaluated several funding models and found both advantages and
disadvantages in each. No single approach met their goals of long-term
sustainability, open access, and potential equity for all depositors. For example,
charging user fees to access data in the repository would limit open access, while de-
positor-incurred submission fees would lower equity for individual depositors not
backed by generous grants or institutional open access funds. Only one approach
(not currently in place in the US but found in other nations) appeared to provide
a good balance: the infrastructure model. This was described as, “Funding agencies
pay for archives directly as a necessary aspect of research infrastructure. The funding
model is structured for long-term investment, rather than being tied to three-year
grant cycles.”54
Chapter 8 draws from these cost models and many more in “Be-
yond Cost Recovery: Entrepreneurial Business Models for Data Curation in
Academia,” in which Karl Nilsen reviews and compares the popular models for
financing data curation efforts and reports on a new business model emerging
at the University of Maryland Libraries.
One potentially effective way to secure funding for your data repository may
be to demonstrate positive use trends, both in data curation activities and in
reuse of the data your repository maintains. But the challenge here is determining
how best to market and promote services to our intended audiences. In Chapter 9,
“Current Outreach and Marketing Practices for Research Data Repositories,”
Katherine J. Gerwig from Metropolitan State University provides a mixed
methods approach to understanding the current data repository marketing
and outreach strategies employed by over a dozen academic institutions. Based
on survey and interview results, Gerwig makes recommendations for those strug-
gling to get the word out about their data curation services. For example, providing
library liaisons, who are often embedded within their departmental cultures, with
targeted messaging about the services in the form of presentation slides or an elevator
speech was shown to be one means of successful outreach. The lessons
learned from current outreach efforts also demonstrate how libraries should reframe
their data repository and curation efforts around the positive incentives for
sharing data, such as advancing knowledge in their field or facilitating reproduction
and verification, rather than around the sharing requirements themselves.
Reuse: the Ultimate Goal of Data Curation?
Part III, Preparing Data for the Future, explores the outcomes of data curation
efforts in numerous ways. If the ultimate goal of data curation is reuse, then
how data are reused will inform the development of our services and best prac-
tices. But perhaps this is a thankless task? One illustrative quote comes from the
introduction to a 2002 technical report, written by astronomer and Microsoft
researcher Jim Gray, that aptly demonstrates the potentially uphill battle we face:
Once published, scientific data should remain available forever
so that other scientists can reproduce the results and do new
science with the data. Data may be used long after the project
that gathered it ends. Later users will not implicitly know the
details of how the data was gathered and prepared. To under-
stand the data, those later users need the metadata: (1) how
the instruments were designed and built; (2) when, where, and
how the data was gathered; and (3) a careful description of the
processing steps that led to the derived data products that are
typically used for scientific data analysis. It’s fine to say that
scientists should record and preserve all this information, but it
is far too laborious and expensive to document everything. The
scientist wants to do science, not be a clerk. And besides, who
cares? Most data is never looked at again anyway.55
The clarity of this passage, and its examples of the types of “metadata” needed for
successful data reuse, are impressive. Yet the sentiment that most data would not
be looked at again does not hold up just over a decade later.
Instead, we are experiencing a dramatic shift in how data are reused, not only
to “do new science,” but also because data reuse may increase a paper’s potential
research impact, provide greater transparency to the results, and in some cases,
can even make or break an individual’s career.56
The research disciplines are often
the driving force in the reproducibility (or replicability) movement, using data
sharing to build greater expectations for rerunning experiments, providing in-
dependent confirmations or validation of the research results, and more quickly
identifying false findings.57
Again, given that digital data are more easily
shared, it is not surprising that researchers are asked to provide the digital evidence
of their findings for validation purposes. Some disciplines have embraced data
transparency and provide portals and virtual hubs to share data and discuss re-
sults.58
In one instance, national policy has embraced this idea of validation and
Irish researchers are subject to external scrutiny when it comes to data presented
in papers or captured in lab notebooks.59
Not everyone agrees that data transparency to the extreme is a positive trend.
One 2016 editorial in Nature explains: “The progress of research demands transparency.
But as scientists work to boost rigor, they risk making science more
vulnerable to attacks. Awareness of tactics is paramount.”60
They go on to provide
10 ways to “distinguish scrutiny from harassment.”61
Another controversial take
on data reuse issues erupted when the editor-in-chief of The New England Journal
of Medicine (NEJM) published a sharply worded editorial casting data
reusers as
…people who had nothing to do with the design and execution
of the study but use another group’s data for their own ends,
possibly stealing from the research productivity planned by the
data gatherers, or even use the data to try to disprove what the
original investigators had posited. There is concern among some
front-line researchers that the system will be taken over by what
some researchers have characterized as ‘research parasites.’62
A journalist from Forbes magazine drew an interesting comparison by
suggesting, “In just four years, it seems, data science has devolved from
the ‘sexiest job of the 21st century’ to a community of ‘research parasites,’” where
the former linked to the widely cited Harvard Business Review report describing
informatics-based jobs as exciting and lucrative career choices.63
But the NEJM
editorial, though sensational in some respects, does go on to make the point
that researchers don’t want to be scooped, they don’t want to be proven wrong
or taken out of context, and they are worried about not getting credit. Another
researcher from a completely different field has a similar story. As co-author on
a huge data sharing success story, the Snapshot Serengeti project hosted on the
community-science-driven platform Zooniverse, Kosmala describes some of the
pressures faced by early career researchers to publish their results (in the form of
traditional publications) and get scholarly credit for their work.64
Data sharing,
she argues, though admirable, removes overarching control over the data so that
anyone else could use it, with your permission or not. On the other hand, when
data are shared with conditions of co-authorship, the loss of control converts
itself into an opportunity (even expectation) of collaboration. As data curators
we must be keenly aware of these disincentives. Data sharing may be great for
end users of data, but it can be not-so-great for the data creators. In addition to
researcher fears, there are costs involved with data sharing in terms of time (and
occasionally monetary investments), muddy ownership claims at stake, and well,
data sharing can just be a “pain in the ass…”65
In short, there is a lack of incen-
tives for researchers to share: few carrots but many sticks.
Therefore, an additional role for data curators may be to understand and
assist as much as possible in the ethical and appropriate reuse of data.
Library and information science professionals so often deal with the
end-product in the scholarly communication pipeline, collecting the published
finale of research: the papers, monographs, maps, and other well-formatted re-
cords of scholarship. Archives and special collections, on the other hand, cover a
larger swath of the research process by also collecting the creation and evolution
of a work in the form of an edited manuscript, unlabeled photos, and the order
in which press clippings were arranged.66
Research data curation may fall some-
where in between and be viewed as one way to bridge that gap of creation and
final product by working with data creators to prepare their data for eventual
publication, context and all. In Chapter 10, “Open Exit: Reaching the End of
the Data Lifecycle,” Andrea Ogier, Natsuko Nicholls, and Ryan Speer argue
that data retention should be considered iteratively throughout the data life
cycle and that knowledge gained from university records and information
management, and library collection management can be applied to data cu-
ration efforts in order to assist with planned data obsolescence. Rather than
assume reuse potential for all data, our authors appropriately ask us to define
better appraisal criteria to make critical selections about which data to retain and
which data to dispose of, for reasons that incorporate the assessment of liability, risk,
or resource cost over potential value.
But what happens once data have fallen into obsolescence? Looking in the opposite
direction, Chapter 12 by Robert R. Downs and Robert S. Chen asks:
when should data be resurrected? They describe the data curation actions
that might be taken in order to protect data that are experiencing less than
ideal conditions in “Curation of Scientific Data at Risk of Loss: Data Res-
cue and Dissemination.” Their data rescue examples involve a data set that was
originally housed in the National Biological Information Infrastructure (NBII)
program of the United States Geological Survey (USGS). This repository is a
favorite among instructors of data information literacy due to its abrupt closure
in response to federal budget cuts.67
The digital archive was permanently taken
offline in January 2012. Here our authors provide not only practical experiences
from a data rescue effort but general advice on the benefits and challenges of
these attempts. Their balanced recommendations to identify critical and timely
documentation rather than strive for completeness are underscored by the rel-
evant case study presented with the NBII dataset. Particularly notable are the
intellectual property and ownership issues encountered with orphaned data as
time passes, and their recommendation for data curators to apply metadata now,
even at the most basic level, in order to help future curators pull out the details
of the dataset in the possibly all-too-near future.
Finally, I’ll close this introduction to Volume One with a focus on issues of
worldwide access and discovery of data. This is an essential component of data
curation, and data discovery can be a key factor in promoting worldwide inclusivity
in research. The 2005 NSB report projects that “Long-lived digital data
collections are powerful catalysts for progress and for democratization of science
and education.”68
Yet in 2015, Soranno et al. argued that the inclusivity of data
sharing is neither well discussed nor yet fully realized:
…a critical shift that is happening in both society and the envi-
ronmental science community that makes data sharing not just
good but ethically obligatory. This is a shift toward the ethical
value of promoting inclusivity within and beyond science. An es-
sential element of a truly inclusionary and democratic approach
to science is to share data through publicly accessible data sets.69
Why? Because open data benefits science, enhances social and economic
development, and, according to one Australian study, can even be significantly
profitable.70
In Chapter 11, “The Current State of Linked Data Repositories: A Com-
parative Analysis,” Cynthia R. Hudson Vitale assesses the impact of the com-
plexity of data sharing options available to researchers and observes that as
a result data may be scattered across various institutional, disciplinary, or
general repositories. One possible solution is open and federated “meta-repos-
itories” that search across the collective holdings of disparate data repositories.
Lynch described this transition of data sharing practices as going from “journals
[that] offer to accept it as ‘supplementary materials’ that accompany the arti-
cle” to a future of repositories of machine-readable digital data that can be “data
mined” for the generation of new knowledge.71
Hudson Vitale explores how this far end of the spectrum is emerging and
compares thirteen linked data repositories, their underlying missions, and their
technical approaches to federating data search and discovery using a website analysis
across fifteen variables. The future of data reuse rests on the discoverability
of data to potential reusers, and this chapter demonstrates that we have much to
accomplish to make data repositories more interoperable.
Conclusion
Digital data is ubiquitous and rapidly reshaping how scholarship progresses now
and into the future. The abundant (and sometimes chaotic) flow of data worldwide
enables a new form of collaborative exploration and discovery that minimizes
international and interdisciplinary barriers, connects researchers with shared
goals, and accelerates the rate of scientific understanding. Just take a moment to
consider the vast body of digital information housed in openly accessible data
repositories across the world representing unique information products such as
the mysterious and brief flashes of high-energy gamma-ray bursts originating
from the far outer reaches of our universe, the Alexandrian feat that is HathiTrust
bringing together into a single corpus of searchable text everything from
Shakespearean plays to song lyrics by The Beatles, the echoes of evolutionary
history surfacing from the endless strings of human genetic DNA, and the daily
snapshot of social norms and human values which can emerge from the deluge of
human-machine interactions generated across the social web.72
In 2003, Hey and
Trefethen anticipated that “new types of digital libraries for scientific data with
the same sort of management services as conventional digital libraries” would
emerge in response to our changing world.73
That time is now. These are extraordinary
times for data curators, and how we rise to the challenge of providing new
services and respond to the shifting patterns of data sharing and data reuse has
the potential to shape and define our profession into the future.
Notes
1. Merriam-Webster’s Learner’s Dictionary, “Data,” accessed August 6, 2016, http://www.
merriam-webster.com/dictionary/data.
2. Definition from footnote 1 on page 2 in the article by Claire C. Austin, Theodora
Bloom, Sünje Dallmeier-Tiessen, Varsha K. Khodiyar, Fiona Murphy, Amy Nurnberg-
er, Lisa Raymond, Martina Stockhause, Jonathan Tedds, Mary Vardigan, and Angus
Whyte, “Key components of data publishing: Using current best practices to develop
a reference model for data publishing,” International Journal on Digital Libraries, June
2016, doi:10.1007/s00799-016-0178-2.
3. See the Digital Curation Centre (DCC), “DCC Curation Lifecycle Model,” accessed
August 6, 2016, http://www.dcc.ac.uk/resources/curation-lifecycle-model; for the history
and development of this model see Sarah Higgins, “The DCC Curation Lifecycle
Model,” International Journal of Digital Curation 3, no. 1 (2008): 134–40, doi:10.2218/
ijdc.v3i1.48, where data are defined on p137.
4. Ross Harvey, “Chapter 4. Defining Data,” Digital Curation: A How-To-Do-It Manual,
No. 025.06. (Chicago: Neal-Schuman Publishers, 2010), http://www.alastore.ala.org/pdf/digital_curation.pdf.
5. The US federal government, for example, defines research data in OMB Circular
A-110 as “recorded factual material commonly accepted in the scientific community as
necessary to validate research findings”; see the full notice at Office of Management and
Budget, “CIRCULAR A-110,” revised November 19, 1993, further amended September
20, 1999, https://www.whitehouse.gov/omb/circulars_a110.
6. See for example the PublicVR project, accessed August 6, 2016, http://publicvr.org/index.html,
which provides virtual reality 3D environments for places such as the Grand
Theater in the Roman city of Pompeii as it may have looked prior to the devastating
volcanic eruption in 79 AD.
7. See for example the eMotion lab at the University of Notre Dame that uses “advanced
video capture equipment to track posture, gesture, and facial expression during a variety
of experimental tasks” at the University of Notre Dame, “About the eMotion and eCognition
Lab,” accessed August 6, 2016, http://www3.nd.edu/~emotecog/about.html.
8. The 2015 report by McAfee Labs warns of the cyber security challenges that are abundant
such as identity theft, data breaches, and national security risks in Intel Security Group
McAfee Labs, “The Hidden Data Economy,” October 15, 2015, http://www.mcafee.com/us/resources/reports/rp-hidden-data-economy.pdf;
this Technology Watch report
describes techniques to preserve large-scale transactional data derived from business and
industry: Sara Day Thomson, “Technology Watch Report 16: Preserving Transactional
Data,” Digital Preservation Coalition, May 2, 2016, doi:10.7207/twr16-02.
9. This quote is from page 2-1 of the OAIS Reference Model found in Consultative
Committee for Space Data Systems, Audit and Certification of Trustworthy Digital
Repositories, Recommended Practice, CCSDS 652.0-M-1, Magenta Book, Issue 1
(Washington, DC: CCSDS Secretariat, September 2011), http://public.ccsds.org/publications/archive/652x0m1.pdf.
10. Footnote 2 on page 2 of Austin et al., “Key components of data publishing: Using
current best practices to develop a reference model for data publishing.” Reference in
the quote is to CASRAI, “Category:Research Data Domain,” The CASRAI Dictionary,
last modified August 18, 2015, http://dictionary.casrai.org/Category:Research_Data_Domain;
the RDA Data Foundations and Terminology working group has a growing
dictionary of data related terms that is searchable at Research Data Alliance Data Foundation
and Terminology Interest Group, “Term Definition Tool (TeD-T),” last modified
March 1, 2016, http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page.
11. National Science Board, “NSB-05-40, Long-Lived Digital Data Collections Enabling
Research and Education in the 21st Century,” Summer 2005, National Science Foundation,
http://www.nsf.gov/pubs/2005/nsb0540, p1.
12. University of Illinois Urbana-Champaign School of Information Science, “Specialization
in Data Curation,” accessed August 4, 2016, http://www.lis.illinois.edu/academics/programs/specializations/data_curation.
13. Committee on Future Career Opportunities and Educational Requirements for
Digital Curation; Board on Research Data and Information; Policy and Global Affairs;
National Research Council, Preparing the Workforce for Digital Curation (Washington,
DC: National Academies Press, April 22, 2015), http://www.nap.edu/catalog.php?record_id=18590.
14. For more in-depth coverage of this topic, read a systematic review of data sharing
studies in academia. See: Fecher, Benedikt, Sascha Friesike, and Marcel Hebing, “What
drives academic data sharing?,” PLoS One 10, no. 2 (2015), doi:10.1371/journal.
pone.0118053.
15. National Academy of Sciences, National Academy of Engineering, and Institute of
Medicine, Information Technology and the Conduct of Research: The User’s View (Washing-
ton, DC: The National Academies Press, 1989), doi:10.17226/763, p1.
16. Gary King, “Ensuring the Data-Rich Future of the Social Sciences,” Science 331(6018):
719–721 (2011), doi:10.1126/science.1197872.
17. An overview of these policies is found in Kathleen Shearer, “Comprehensive Brief on
Research Data Management Policies,” released April 2015, http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
18. The memo from the White House’s Office of Science and Technology Policy (OSTP) was
released as John P. Holdren, “Increasing Access to the Results of Federally Funded Scientific
Research,” Memorandum for the Heads of Executive Departments and Agencies,
Office of Science and Technology Policy, Executive Office of the President, February 22,
2013, http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
19. Adapted from Inter-university Consortium for Political and Social Research (ICPSR),
“Guidelines for OSTP Data Access Plan,” accessed August 6, 2016, http://www.icpsr.
umich.edu/icpsrweb/content/datamanagement/ostp.html.
20. Jerry Sheehan, “Increasing Access to the Results of Federally Funded Science,”
The White House Blog, posted February 22, 2016, https://www.whitehouse.gov/blog/2016/02/22/increasing-access-results-federally-funded-science.
21. United States Government, “US Open Data Action Plan,” May 9, 2014, https://www.
whitehouse.gov/sites/default/files/microsites/ostp/us_open_data_action_plan.pdf.
22. Ford Foundation, “Ford Foundation expands Creative Commons licensing for all
grant-funded projects,” February 3, 2015, https://www.fordfoundation.org/the-latest/news/ford-foundation-expands-creative-commons-licensing-for-all-grant-funded-projects;
Alfred P. Sloan Foundation, “Grant Application Guidelines,” last modified January
6, 2014, http://www.sloan.org/fileadmin/media/files/application_documents/proposal_guidelines_research_officer_grants.pdf;
Bill & Melinda Gates Foundation, “Bill &
Melinda Gates Foundation Open Access Policy,” accessed August 6, 2016, http://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy.
23. SPARC Open Data, “Research Funder Data Sharing Policies,” accessed August 5, 2016,
http://sparcopen.org/our-work/research-data-sharing-policy-initiative/funder-policies.
24. Institute of Medicine and National Academy of Sciences, Ensuring the Integrity, Ac-
cessibility, and Stewardship of Research Data in the Digital Age (Washington, DC: The
National Academies Press, 2009), doi:10.17226/12615, 34.
25. Retraction Watch, “Archive for the ‘data issues’ Category,” accessed August 6, 2016,
http://retractionwatch.com/category/by-reason-for-retraction/data-issues.
26. Kathleen Fear, “Building Outreach on Assessment: Researcher Compliance with Journal
Policies for Data Sharing,” Bulletin of the American Society for Information Science and
Technology 41, no. 6 (2015): 18–21, doi:10.1002/bult.2015.1720410609; Heather A.
Piwowar and Wendy W. Chapman, “A Review of Journal Policies for Sharing Research
Data,” Nature Precedings, March 20, 2008, hdl:10101/npre.2008.1700.1; Linda Naughton
and David Kernohan, “Making Sense of Journal Research Data Policies,” Insights
29, no. 1 (2016), http://doi.org/10.1629/uksg.284.
27. The model is published in Paul Sturges, Marianne Bamkin, Jane H.S. Anders, Bill
Hubbard, Azhar Hussain, and Melanie Heeley, “Research Data Sharing: Developing a
Stakeholder-Driven Model for Journal Policies,” Journal of the Association for Informa-
tion Science and Technology, doi:10.1002/asi.23336.
28. Nature, “Availability of Data, Material and Methods,” accessed August 6, 2016, http://
www.nature.com/authors/policies/availability.html; PLOS One, “Data Availability,”
accessed August 6, 2016, http://journals.plos.org/plosone/s/data-availability.
29. Chelsey Coombs, “Neuroscience Paper Retracted After Colleagues Object to
Data Publication,” Retraction Watch, December 31, 2015, http://retractionwatch.
com/2015/12/31/neuroscience-paper-retracted-after-colleagues-object-to-data-publica-
tion.
30. Elsevier, “Elsevier and the Inter-University Consortium for Political and Social Research
(ICPSR) Announce Data Linking,” February 8, 2016, http://www.prnewswire.com/news-releases/elsevier-and-the-inter-university-consortium-for-political-and-social-research-icpsr-announce-data-linking-568022141.html;
see the list of data repositories at Elsevier, “Supported Data Repositories,” accessed August 6, 2016, https://www.elsevier.com/?a=57755.
31. Scientific Data homepage, accessed August 6, 2016, http://www.nature.com/sdata; Data
in Brief homepage, accessed August 6, 2016, http://www.journals.elsevier.com/data-in-brief;
as reported in Tim Austin, “Towards a Digital Infrastructure for Engineering
Materials Data,” Materials Discovery (2016), doi:10.1016/j.md.2015.12.003, 2.
32. Leonardo Candela, Donatella Castelli, Paolo Manghi, and Alice Tani, “Data Journals:
A Survey,” Journal of the Association for Information Science and Technology 66, no. 9
(2015): 1747–1762, doi: 10.1002/asi.23358.
33. Ibid., 1756.
34. Scientific Data, “Recommended Data Repositories,” accessed July 18, 2016, http://www.
nature.com/sdata/policies/repositories.
35. The declaration signifies that each country will “Work towards the establishment of
access regimes for digital research data from public funding” in accordance with shared
objectives and principles. Available as Organisation for Economic Co-operation and
Development, “Declaration on Access to Research Data from Public Funding,” January 30,
2004, http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
36. The UK funding council policies are each summarized and linked to from the Digital
Curation Center, “Funders’ Data Policies,” accessed August 6, 2016, http://www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies;
the Wellcome Trust, “Policy on data management and sharing,” accessed August 6, 2016,
https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-sharing;
Research Councils UK, “RCUK Common Principles on Data Policy,” published April 2011,
http://www.rcuk.ac.uk/research/datapolicy.
37. European Commission, “Guidelines on Open Access to Scientific Publications and
Research Data in Horizon 2020, version 3.0,” July 26, 2016, http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf.
38. Kathleen Shearer, “Comprehensive Brief on Research Data Management Policies.” In
2015 Canada also released a federal policy on open access to publications resulting
from research funded by its three primary federal funding agencies (see Government of Canada,
“Tri-Agency Open Access Policy on Publications,” February 27, 2015,
http://www.science.gc.ca/default.asp?lang=En&n=F6765465-1), but this requirement applies
only to research articles, not data.
39. Portage network homepage, accessed August 6, 2016, https://portagenetwork.ca.
40. JISC-funded Research Data Management Shared Service Project, accessed August 4,
2016, https://www.jisc.ac.uk/rd/projects/research-data-shared-service; Data Curation
Network Project homepage, accessed August 4, 2016, https://sites.google.com/site/datacurationnetwork.
41. For example, findings from reviewing a sample of 182 Data Management Plans of suc-
cessful National Science Foundation grant proposals showed this to be the case for 74%
of the sample in Carolyn Bishoff and Lisa R. Johnston, “Approaches to Data Sharing:
An Analysis of NSF Data Management Plans from a Large Research University,” Journal
of Librarianship and Scholarly Communication 3, no. 2 (2015), doi:10.7710/2162-3309.1231.
42. Caitlin Rivers, “‘Send Me Your Data—PDF is Fine,’ Said No One Ever (How to Share
Your Data Effectively),” April 8, 2013, http://www.caitlinrivers.com/blog/send-me-your-data-pdf-is-fine-said-no-one-ever-how-to-share-your-data-effectively.
43. Carlos Santos, Judith Blake, and David J. States, “Supplementary Data Need to be Kept
in Public Repositories,” Nature 438, no. 7069 (2005): 738-738, doi: 10.1038/438738a.
44. Caroline J. Savage and Andrew J. Vickers, “Empirical Study of Data Sharing by
Authors Publishing in PLoS Journals,” PloS One 4, no. 9 (2009): e7078, doi:10.1371/
journal.pone.0007078; Timothy H. Vines, Arianne YK Albert, Rose L. Andrew,
Florence Débarre, Dan G. Bock, Michelle T. Franklin, Kimberly J. Gilbert, Jean-Sébas-
tien Moore, Sébastien Renaut, and Diana J. Rennison, “The Availability of Research
Data Declines Rapidly with Article Age,” Current Biology 24, no. 1 (2014): 94–97,
doi:10.1016/j.cub.2013.11.014.
45. Michael Witt, “Institutional Repositories and Research Data Curation in a Distributed
Environment,” Library Trends 57, no. 2 (2008): 191–201, doi:10.1353/lib.0.0029.
46. G. Sayeed Choudhury, “Case Study in Data Curation at Johns Hopkins University,”
Library Trends 57, no. 2 (2008): 211–220, doi:10.1353/lib.0.0028.
47. Carol Tenopir, Ben Birch, and Suzie Allard, Academic Libraries and Research Data
Services: Current Practices and Plans for the Future, An ACRL White Paper, Association
of College and Research Libraries, a division of the American Library Association,
2012, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf.
48. Further examples of disciplinary repositories are found in re3data.org homepage,
accessed August 6, 2016, http://www.re3data.org.
49. DataOne, “Best Practices,” accessed August 5, 2016, http://www.dataone.org/best-practices;
DataOne, “Software Tools Catalog,” accessed August 5, 2016, https://www.dataone.org/software_tools_catalog.
50. DataOne, “ESA 2011: How to Manage Ecological Data for Effective Use and Re-use,”
August 7, 2011, http://www.dataone.org/esa-2011-how-manage-ecological-data-effective-use-and-re-use.
51. A. Leadbetter, L. Raymond, C. Chandler, L. Pikula, P. Pissierssens, and E. Urban, Ocean
Data Publication Cookbook (Paris: UNESCO, 2013), http://www.iode.org/mg64; for
further context see the slides by Lisa Raymond, “Publishing and Citing Ocean Data,”
OneNOAA Science Seminar, National Oceanographic Data Center, May 22, 2013,
http://www.nodc.noaa.gov/seminars/2013/support/Lisa_Raymond_OneNOAASeminar_slides.pdf.
52. Jared Lyle, George Alter, and Mary Vardigan, “‘The Price of Keeping Knowledge’ Workshop:
ICPSR Position Paper,” (2013), http://www.knowledge-ex-change.info/Admin/Public/DWSDownload.aspx?File=%2FFiles%2FFiler%2Fdownloads%2FPrimary+Research+Data%2FWorkshop+Price+of+Keeping+Knowledge%2FJared+Lyle+ICPSR_Position+Paper_Price+workshop_public.pdf.
53. Carol Ember, Robert Hanisch, George Alter, Helen Berman, Margaret Hedstrom, and
Mary Vardigan, “Sustaining Domain Repositories for Digital Data: A White Paper,”
December 11, 2013, 10–11, http://datacommunity.icpsr.umich.edu/sites/default/files/WhitePaper_ICPSR_SDRDD_121113.pdf.
54. Ibid., 10.
55. Jim Gray, Alexander S. Szalay, Ani R. Thakar, Christopher Stoughton, and Jan vandenBerg,
“Online Scientific Data Curation, Publication, and Archiving,” submitted August
7, 2002, http://arxiv.org/abs/cs.DL/0208012.
56. According to a 2007 study, openly sharing data was linked to higher citation rates for
the publications associated with that data. See Heather A. Piwowar, Roger S. Day, and
Douglas B. Fridsma, “Sharing Detailed Research Data is Associated with Increased
Citation Rate,” PloS One 2, no. 3 (2007): e308, doi:10.1371/journal.pone.0000308.
Cases of unreplicable or faulty data have been the subject of several studies, such as the
Reproducibility Studies by the Center for Open Science in the fields of psychology
(Alexander A. Aarts, Christopher J. Anderson, Joanna Anderson, Marcel A.L.M van Assen,
Peter R. Attridge, Angela S. Attwood, Jordan Axt, et al., 2016, “Reproducibility Project:
Psychology,” Open Science Framework, July 23, https://osf.io/EZcUj/) and cancer
biology (Timothy M. Errington, Fraser E. Tan, Joelle Lomax, Nicole Perfito, Elizabeth
Iorns, William Gunn, Brian A. Nosek, et al., 2016, “Reproducibility Project: Cancer
Biology,” Open Science Framework, July 22, https://osf.io/e81xl/). In addition, the
high-profile case of scientist Dong-Pyou Han, charged with falsifying HIV data, led
to jail time and $7.2 million in fines, as reported in Sara Reardon, “US Vaccine
Researcher Sentenced to Prison for Fraud,” Nature News, July 1, 2015,
http://www.nature.com/news/us-vaccine-researcher-sentenced-to-prison-for-fraud-1.17660.
57. Victoria Stodden provides an entertaining slide presentation on “A Brief History of the
Reproducibility Movement,” December 10, 2012, http://hdl.handle.net/10022/AC:P:15396;
Prasad Patil, Roger D. Peng, and Jeffrey Leek, “A Statistical Definition for
Reproducibility and Replicability,” BioRxiv, July 29, 2016, doi:10.1101/066803.
58. Disciplinary repositories such as the iPlant Collaborative (homepage, accessed August 6,
2016, http://www.iplantcollaborative.org), nanoHUB.org (homepage, accessed August
6, 2016, https://nanohub.org), EarthCube (homepage, accessed August 6, 2016,
http://earthcube.org), and CUAHSI (Hydrologic Information System homepage, accessed
August 6, 2016, http://his.cuahsi.org) represent the collective outputs of the discipline
to allow for widespread reuse of the data.
59. Richard Van Noorden, “Irish University Labs Face External Audits,” Nature News,
June 17, 2014, http://www.nature.com/news/irish-university-labs-face-external-audits-1.15422.
60. Stephan Lewandowsky and Dorothy Bishop, “Research Integrity: Don’t Let Transparency
Damage Science,” Nature, January 25, 2016, http://www.nature.com/news/research-integrity-don-t-let-transparency-damage-science-1.19219.
61. Ibid.
62. Dan L. Longo and Jeffrey M. Drazen, “Data Sharing,” New England Journal of Medicine
374, no. 3 (2016): 276–277, doi:10.1056/NEJMe1516564.
63. David Shaywitz, “Data Scientists = Research Parasites?,” Forbes, January 21, 2016,
http://www.forbes.com/sites/davidshaywitz/2016/01/21/data-scientists-research-parasites/#3ddef3453d1c;
Thomas H. Davenport and D.J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,”
Harvard Business Review, October 2012, https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.
64. Margaret Kosmala, “Open Data, Authorship, and the Early Career Scientist,” Ecology
Bits, posted June 15, 2016, http://ecologybits.com/index.php/2016/06/15/open-data-authorship-and-the-early-career-scientist/;
Snapshot Serengeti dataset available as Alexandra Swanson, Margaret Kosmala, Chris
Lintott, Robert Simpson, Arfon Smith, and Craig Packer, “Snapshot Serengeti,
High-Frequency Annotated Camera Trap Images of 40 Mammalian Species in an African
Savanna,” Dryad Digital Repository, http://dx.doi.org/10.5061/dryad.5pt92, and the
paper describing the data available as Alexandra Swanson, Margaret Kosmala, Chris
Lintott, Robert Simpson, Arfon Smith, and Craig Packer, “Snapshot Serengeti,
High-Frequency Annotated Camera Trap Images of 40 Mammalian Species in an African
Savanna,” Scientific Data 2 (2015), doi:10.1038/sdata.2015.26.
65. Terry McGlynn, “I Own My Data, Until I Don’t,” Small Pond Science, March 3, 2014,
http://smallpondscience.com/2014/03/03/i-own-my-data-until-i-dont; Emilio M. Bruna,
“The Opportunity Cost of My #OpenScience was 36 Hours + $690,” The Bruna
Lab, September 4, 2014, http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690.
66. The archival community has dealt with curation issues in print and analog formats for
centuries, and the lessons learned translate well into the digital realm but are often
overlooked by developers of new data curation services in academic and disciplinary
settings, according to Helen R. Tibbo and Christopher A. Lee, “Closing the Digital
Curation Gap: A Grounded Framework for Providing Guidance and Education in
Digital Curation,” Archiving Conference, vol. 2012, no. 1, pp. 57–62, Society for Imaging
Science and Technology, 2012, http://www.ils.unc.edu/callee/p57-tibbo.pdf. Some
example archival workflows that translate well to data curation include Julianna
Barrera-Gomez and Ricky Erway, Walk This Way: Detailed Steps for Transferring Born-Digital
Content from Media You Can Read In-House (Dublin, OH: OCLC Online Computer
Library Center, 2013), http://www.oclc.org/content/dam/research/publications/library/2013/2013-02.pdf,
and the AIMS Work Group, “AIMS Born-Digital Collections:
An Inter-Institutional Model for Stewardship,” January 2012, http://dcs.library.virginia.edu/files/2013/02/AIMS_final.pdf.
67. US Geological Survey, “NBII to Be Taken Offline Permanently in January,” USGS Access
Newsletter 14, no. 3 (Fall 2011), https://www2.usgs.gov/core_science_systems/Access/p1111-1.html.
68. National Science Board, “NSB-05-40, Long-Lived Digital Data Collections Enabling
Research and Education in the 21st Century,” https://www.nsf.gov/pubs/2005/nsb0540/.
69. Patricia A. Soranno, Kendra S. Cheruvelil, Kevin C. Elliott, and Georgina M. Mont-
gomery, “It’s Good to Share: Why Environmental Scientists’ Ethics are Out of Date,”
BioScience 65, no. 1 (2015): 69–73, doi: 10.1093/biosci/biu169.
70. Australian National Data Service, “Open Research Data,” November 2014,
http://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/open-research-data-report.
71. Clifford Lynch, “The Shape of the Scientific Article in the Developing Cyberinfrastructure,”
CTWatch Quarterly 3, no. 3 (2007), http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/index.html.
72. Real-time observational data of the quickly dimming objects known as gamma-ray
bursts (GRBs) are available to researchers through the Goddard Space Flight Center,
“GCN: The Gamma-ray Coordinates Network (TAN: Transient Astronomy Network),”
accessed August 6, 2016, http://gcn.gsfc.nasa.gov, and public download access to GRB
recordings that predate the SWIFT satellite mission launched in 2003 is also available at
Goddard Space Flight Center, “The Gamma Ray Burst Catalog,” accessed August 6,
2016, http://heasarc.gsfc.nasa.gov/grbcat/grbcat.html. HathiTrust is a searchable database
of millions of digitized texts, available at HathiTrust homepage, accessed August
6, 2016, http://babel.hathitrust.org. Public access to download the human genome and
tools to analyze and compare DNA is available at NCBI, “Human Genome Resources,”
accessed August 6, 2016, http://www.ncbi.nlm.nih.gov/genome/guide/human. Big
data generated by human-computer interaction can be derived from many social web
services, though some do not release their data to the public (e.g., Amazon, Facebook).
Sources of public data are available via APIs that contain real-time, and sometimes
historical, information. For example, Twitter interaction data can be found at the Gnip
homepage, accessed August 6, 2016, https://gnip.com, and in 2016 Yahoo released a
News Feed dataset of 110 billion anonymized user interactions with its home page and
news sites, available as Yahoo, “R10—Yahoo News Feed dataset, version
1.0 (1.5TB),” accessed August 6, 2016, http://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=75.
73. Anthony J.G. Hey and Anne E. Trefethen, “The Data Deluge: An E-Science Perspective,”
in Grid Computing: Making the Global Infrastructure a Reality (Chichester: Wiley,
2003), 809–24, http://eprints.soton.ac.uk/id/eprint/257648.
Bibliography
Aarts, Alexander A., Christopher J. Anderson, Joanna Anderson, Marcel A.L.M van Assen,
Peter R. Attridge, Angela S. Attwood, Jordan Axt, et al. 2016. “Reproducibility Proj-
ect: Psychology.” Open Science Framework. July 23. osf.io/ezcuj.
AIMS Work Group. “AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship.”
January 2012. http://dcs.library.virginia.edu/files/2013/02/AIMS_final.pdf.
Alfred P. Sloan Foundation. “Grant Application Guidelines.” Last modified January 6, 2014.
http://www.sloan.org/fileadmin/media/files/application_documents/proposal_guidelines_research_officer_grants.pdf.
Austin, Claire C., Theodora Bloom, Sünje Dallmeier-Tiessen, Varsha K. Khodiyar, Fiona
Murphy, Amy Nurnberger, Lisa Raymond, Martina Stockhause, Jonathan Tedds,
Mary Vardigan, and Angus Whyte. “Key components of data publishing: Using
current best practices to develop a reference model for data publishing.” International
Journal on Digital Libraries, 20 June 2016. doi:10.1007/s00799-016-0178-2.
Austin, Tim. “Towards a Digital Infrastructure for Engineering Materials Data.” Materials
Discovery (2016). doi:10.1016/j.md.2015.12.003.
Australian National Data Service. “Open Research Data.” November 2014.
http://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/open-research-data-report.
Barrera-Gomez, Julianna, and Ricky Erway. Walk This Way: Detailed Steps for Transferring
Born-Digital Content from Media You Can Read In-House. Dublin, OH: OCLC
Online Computer Library Center, Inc., 2013. http://www.oclc.org/content/dam/research/publications/library/2013/2013-02.pdf.
Bill & Melinda Gates Foundation. “Bill & Melinda Gates Foundation Open Access Policy.”
Accessed August 6, 2016. http://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy.
Bishoff, Carolyn, and Lisa R. Johnston. “Approaches to Data Sharing: An Analysis of NSF
Data Management Plans from a Large Research University.” Journal of Librarianship
and Scholarly Communication 3, no. 2 (2015). doi:10.7710/2162-3309.1231.
Bruna, Emilio M. “The Opportunity Cost of My #OpenScience was 36 Hours + $690.” The
Bruna Lab. September 4, 2014. http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690/.
Candela, Leonardo, Donatella Castelli, Paolo Manghi, and Alice Tani. “Data Journals: A Sur-
vey.” Journal of the Association for Information Science and Technology 66, no. 9 (2015):
1747-1762. doi: 10.1002/asi.23358.
CASRAI. “Category:Research Data Domain.” The CASRAI Dictionary. Last modified
August 18, 2015. http://dictionary.casrai.org/Category:Research_Data_Domain.
Choudhury, G. Sayeed. “Case Study in Data Curation at Johns Hopkins University.” Library
Trends 57, no. 2 (2008): 211–220. doi:10.1353/lib.0.0028.
Committee on Future Career Opportunities and Educational Requirements for Digital Curation;
Board on Research Data and Information; Policy and Global Affairs; National
Research Council. Preparing the Workforce for Digital Curation. Washington, DC:
National Academies Press, April 22, 2015. http://www.nap.edu/catalog.php?record_id=18590.
Consultative Committee for Space Data Systems. Audit and Certification of Trustworthy
Digital Repositories. Recommended Practice, CCSDS 652.0-M-1, Magenta Book,
Issue 1. Washington, DC: CCSDS Secretariat, September 2011. http://public.ccsds.org/publications/archive/652x0m1.pdf.
Coombs, Chelsey. “Neuroscience Paper Retracted After Colleagues Object to Data
Publication.” Retraction Watch. December 31, 2015. http://retractionwatch.
com/2015/12/31/neuroscience-paper-retracted-after-colleagues-object-to-data-publi-
cation/.
CUAHSI Hydrologic Information System homepage. Accessed August 6, 2016. http://his.
cuahsi.org/.
Data Curation Network Project homepage. Accessed August 4, 2016. https://sites.google.
com/site/datacurationnetwork/.
Data in Brief homepage. Accessed August 6, 2016. http://www.journals.elsevier.com/data-in-brief.
DataOne. “Best Practices.” Accessed August 5, 2016. http://www.dataone.org/best-practices.
DataOne. “ESA 2011: How to Manage Ecological Data for Effective Use and Re-use.”
August 7, 2011. http://www.dataone.org/esa-2011-how-manage-ecological-data-effective-use-and-re-use.
DataOne. “Software Tools Catalog.” Accessed August 5, 2016. https://www.dataone.org/software_tools_catalog.
Davenport, Thomas H., and D.J. Patil. “Data Scientist: The Sexiest Job of the 21st Century.”
Harvard Business Review. October 2012. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.
Digital Curation Center. “Funders’ Data Policies.” Accessed August 6, 2016. http://www.dcc.
ac.uk/resources/policy-and-legal/funders-data-policies.
Digital Curation Center (DCC). “DCC Curation Lifecycle Model.” Accessed August 6,
2016. http://www.dcc.ac.uk/resources/curation-lifecycle-model.
EarthCube homepage. Accessed August 6, 2016. http://earthcube.org/.
Elsevier. “Elsevier and the Inter-University Consortium for Political and Social Research
(ICPSR) Announce Data Linking.” February 8, 2016. http://www.prnewswire.com/news-releases/elsevier-and-the-inter-university-consortium-for-political-and-social-research-icpsr-announce-data-linking-568022141.html.
———. “Supported Data Repositories.” Accessed August 6, 2016. https://www.elsevier.
com/?a=57755.
Ember, Carol, Robert Hanisch, George Alter, Helen Berman, Margaret Hedstrom, and Mary
Vardigan. “Sustaining Domain Repositories for Digital Data: A White Paper.” De-
cember 11, 2013, 10–11. http://datacommunity.icpsr.umich.edu/sites/default/files/
WhitePaper_ICPSR_SDRDD_121113.pdf.
Errington, Timothy M, Fraser E. Tan, Joelle Lomax, Nicole Perfito, Elizabeth Iorns, William
Gunn, Brian A. Nosek, et al. 2016. “Reproducibility Project: Cancer Biology.” Open
Science Framework. July 22. osf.io/e81xl.
European Commission. “Guidelines on Open Access to Scientific Publications and Research
Data in Horizon 2020. Version 3.0.” July 26, 2016. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf.
Fear, Kathleen. “Building Outreach on Assessment: Researcher Compliance with Journal
Policies for Data Sharing.” Bulletin of the American Society for Information Science and
Technology 41, no. 6 (2015): 18-21. doi:10.1002/bult.2015.1720410609.
Fecher, Benedikt, Sascha Friesike, and Marcel Hebing. “What Drives Academic Data Shar-
ing?” PLoS One 10, no. 2 (2015): doi:10.1371/journal.pone.0118053.
Ford Foundation. “Ford Foundation expands Creative Commons licensing for all
grant-funded projects.” February 3, 2015. https://www.fordfoundation.org/the-latest/news/ford-foundation-expands-creative-commons-licensing-for-all-grant-funded-projects/.
Gnip homepage. Accessed August 6, 2016. https://gnip.com/.
Goddard Space Flight Center. “GCN: The Gamma-ray Coordinates Network (TAN: Transient
Astronomy Network).” Accessed August 6, 2016. http://gcn.gsfc.nasa.gov.
Goddard Space Flight Center. “The Gamma Ray Burst Catalog.” Accessed August 6, 2016.
http://heasarc.gsfc.nasa.gov/grbcat/grbcat.html.
Government of Canada. “Tri-Agency Open Access Policy on Publications.” February 27,
2015. http://www.science.gc.ca/default.asp?lang=En&n=F6765465-1.
Gray, Jim, Alexander S. Szalay, Ani R. Thakar, Christopher Stoughton, and Jan vandenBerg.
“Online Scientific Data Curation, Publication, and Archiving.” Submitted August 7,
2002. http://arxiv.org/abs/cs.DL/0208012.
Harvey, Ross. “Chapter 4. Defining Data.” Digital Curation: A How-To-Do-It Manual. No.
025.06. Chicago: Neal-Schuman Publishers, 2010.
HathiTrust homepage. Accessed August 6, 2016. http://babel.hathitrust.org.
Hey, Anthony J.G., and Anne E. Trefethen. “The Data Deluge: An E-Science Perspective.”
In Grid Computing: Making the Global Infrastructure a Reality, edited by F. Berman,
G. Fox, and A.J.G. Hey, 809–24. Chichester: Wiley, 2003. http://eprints.soton.ac.uk/id/eprint/257648.
Higgins, Sarah. “The DCC Curation Lifecycle Model.” International Journal of Digital Cura-
tion 3, no. 1 (2008): 134–40. doi:10.2218/ijdc.v3i1.48, p137.
Holdren, John P. “Increasing Access to the Results of Federally Funded Scientific Research.”
Memorandum for the Heads of Executive Departments and Agencies, Office of
Science and Technology Policy, Executive Office of the President, February 22, 2013.
http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
Institute of Medicine and National Academy of Sciences. Ensuring the Integrity, Accessibility,
and Stewardship of Research Data in the Digital Age. Washington, DC: The National
Academies Press, 2009. doi:10.17226/12615, 34.
Intel Security Group McAfee Labs. “The Hidden Data Economy.” October 15, 2015. http://
www.mcafee.com/us/resources/reports/rp-hidden-data-economy.pdf.
Inter-university Consortium for Political and Social Research (ICPSR). “Guidelines for
OSTP Data Access Plan.” Accessed August 6, 2016. http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/ostp.html.
iPlant Collaborative homepage. Accessed August 6, 2016. http://www.iplantcollaborative.
org.
King, Gary. “Ensuring the Data-rich Future of the Social Sciences.” Science 331, no. 6018
(2011): 719–721. doi:10.1126/science.1197872.
Kosmala, Margaret. “Open Data, Authorship, and the Early Career Scientist.” Ecology Bits,
posted June 15, 2016. http://ecologybits.com/index.php/2016/06/15/open-data-authorship-and-the-early-career-scientist.
Leadbetter, A., L. Raymond, C. Chandler, L. Pikula, P. Pissierssens, and E. Urban. Ocean Data
Publication Cookbook. Paris: UNESCO, 2013. http://www.iode.org/mg64.
Lewandowsky, Stephan, and Dorothy Bishop. “Research Integrity: Don’t Let Transparency
Damage Science.” Nature. January 25, 2016. http://www.nature.com/news/research-integrity-don-t-let-transparency-damage-science-1.19219.
Longo, Dan L. and Jeffrey M. Drazen. “Data Sharing.” New England Journal of Medicine
374, no. 3 (2016): 276-277. doi:10.1056/NEJMe1516564.
Lyle, Jared, George Alter, and Mary Vardigan. “The Price of Keeping Knowledge Workshop:
ICPSR Position Paper.” (2013) http://www.knowledge-ex-change.info/Admin/Public/DWSDownload.aspx?File=%2FFiles%2FFiler%2Fdownloads%2FPrimary+Research+Data%2FWorkshop+Price+of+Keeping+Knowledge%2FJared+Lyle+ICPSR_Position+Paper_Price+workshop_public.pdf.
Lynch, Clifford. “The Shape of the Scientific Article in the Developing Cyberinfrastructure.”
CTWatch Quarterly 3, no. 3 (2007). http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/index.html.
McGlynn, Terry. “I Own My Data, Until I Don’t.” Small Pond Science. March 3, 2014.
http://smallpondscience.com/2014/03/03/i-own-my-data-until-i-dont/.
Merriam-Webster’s Learner’s Dictionary. “Data.” Web version. Accessed August 6, 2016.
http://www.merriam-webster.com/dictionary/data.
nanoHUB.org homepage. Accessed August 6, 2016. https://nanohub.org/.
National Academy of Sciences, National Academy of Engineering, and Institute of Medicine.
Information Technology and the Conduct of Research: The User’s View. Washington, DC:
The National Academies Press, 1989. doi:10.17226/763.
National Science Board. “NSB-05-40, Long-Lived Digital Data Collections Enabling
Research and Education in the 21st Century.” Summer 2005. National Science Foundation.
http://www.nsf.gov/pubs/2005/nsb0540.
Nature. “Availability of Data, Material and Methods.” Accessed August 6, 2016.
http://www.nature.com/authors/policies/availability.html.
Naughton, Linda, and David Kernohan. “Making Sense of Journal Research Data Policies.”
Insights 29, no. 1 (2016). doi:10.1629/uksg.284.
NCBI. “Human Genome Resources.” Accessed August 6, 2016. http://www.ncbi.nlm.nih.gov/genome/guide/human.
Office of Management and Budget. “CIRCULAR A-110.” Revised November 19, 1993, as
further amended September 20, 1999. https://www.whitehouse.gov/omb/circulars_a110.
Organisation for Economic Co-operation and Development. “Declaration on Access to
Research Data from Public Funding.” January 30, 2004. http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
Patil, Prasad, Roger D. Peng, and Jeffrey Leek. “A Statistical Definition for Reproducibility
and Replicability.” BioRxiv. July 29, 2016. doi:10.1101/066803.
Piwowar, Heather A., Roger S. Day, and Douglas B. Fridsma. “Sharing Detailed Research
Data is Associated with Increased Citation Rate.” PloS One 2, no. 3 (2007): e308.
doi:10.1371/journal.pone.0000308.
Piwowar, Heather A. and Wendy W. Chapman. “A Review of Journal Policies for Sharing
Research Data.” Nature Precedings. March 20, 2008. hdl:10101/npre.2008.1700.1.
PLOS One. “Data Availability.” Accessed August 6, 2016. http://journals.plos.org/plosone/s/data-availability.
Portage network homepage. Accessed August 6, 2016. https://portagenetwork.ca/.
PublicVR project homepage. Accessed August 6, 2016. http://publicvr.org/index.html.
Raymond, Lisa. “Publishing and Citing Ocean Data.” OneNOAA Science Seminar, National
Oceanographic Data Center. May 22, 2013. http://www.nodc.noaa.gov/seminars/2013/support/Lisa_Raymond_OneNOAASeminar_slides.pdf.
re3data.org homepage. Accessed August 6, 2016. http://www.re3data.org/.
Reardon, Sara. “US Vaccine Researcher Sentenced to Prison for Fraud.” Nature News, July 1,
2015. http://www.nature.com/news/us-vaccine-researcher-sentenced-to-prison-for-fraud-1.17660.
Research Councils UK. “RCUK Common Principles on Data Policy.” April 2011.
http://www.rcuk.ac.uk/research/datapolicy/.
Research Data Alliance Data Foundation and Terminology Interest Group. “Term Definition
Tool (TeD-T).” Last modified March 1, 2016. http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page.
Research Data Management Shared Service Project homepage. Accessed August 4, 2016.
https://www.jisc.ac.uk/rd/projects/research-data-shared-service.
Introduction to Data Curation 29
Retraction Watch. “Archive for the ‘Data Issues’ Category.” Accessed August 6, 2016. http://
retractionwatch.com/category/by-reason-for-retraction/data-issues/.
Rivers, Caitlin. “‘Send Me Your Data - PDF is Fine,’ Said No One Ever (How to Share Your
Data Effectively).” April 8, 2013. http://www.caitlinrivers.com/blog/send-me-your-
data-pdf-is-fine-said-no-one-ever-how-to-share-your-data-effectively.
Santos, Carlos, Judith Blake and David J. States. “Supplementary Data Need to be Kept in
Public Repositories.” Nature 438, no. 7069 (2005): 738-738. doi: 10.1038/438738a.
Savage, Caroline J. and Andrew J. Vickers. “Empirical Study of Data Sharing by Authors
Publishing in PLoS Journals.” PloS One 4, no. 9 (2009): e7078. doi:10.1371/journal.
pone.0007078.
Scientific Data homepage. Accessed August 6, 2016. http://www.nature.com/sdata.
Scientific Data. “Recommended Data Repositories.” Accessed July 18, 2016. http://www.
nature.com/sdata/policies/repositories.
Shaywitz, David. “Data Scientists = Research Parasites?” Forbes, January 21, 2016. http://
www.forbes.com/sites/davidshaywitz/2016/01/21/data-scientists-research-para-
sites/#3ddef3453d1c.
Shearer, Kathleen. “Comprehensive Brief on Research Data Management Policies.” Released
April 2015. http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?Instrumen-
tID=157.
Sheehan, Jerry. “Increasing Access to the Results of Federally Funded Science.” The White
House Blog. Feburary 22, 2016. https://www.whitehouse.gov/blog/2016/02/22/in-
creasing-access-results-federally-funded-science.
Sodden, Victoria. “A Brief History of the Reproducibility Movement.” December 10, 2012.
http://hdl.handle.net/10022/AC:P:15396.
Soranno, Patricia A., Kendra S. Cheruvelil, Kevin C. Elliott, and Georgina M. Montgomery.
“It’s Good to Share: Why Environmental Scientists’ Ethics are Out of Date.” BioSci-
ence 65, no. 1 (2015): 69-73. doi: 10.1093/biosci/biu169.
SPARC Open Data. “Research Funder Data Sharing Policies.” Accessed August 5, 2016.
http://sparcopen.org/our-work/research-data-sharing-policy-initiative/funder-poli-
cies/.
Sturges, Paul, Marianne Bamkin, Jane H.S. Anders, Bill Hubbard, Azhar Hussain and Mel-
anie Heeley. “Research Data Sharing: Developing a Stakeholder-Driven Model for
Journal Policies.” Journal of the Association for Information Science and Technology. doi:
10.1002/asi.23336.
Swanson, Alexandra, Margaret Kosmala, Chris Lintott, Robert Simpson, Arfon Smith,
and Craig Packer. “Snapshot Serengeti, High-frequency Annotated Camera Trap
Images of 40 Mammalian Species in an African Savanna.” Dryad Digital Repository.
doi:10.5061/dryad.5pt92.
Tenopir, Carol, Ben Birch, and Suzie Allard. Academic Libraries and Research Data Services:
Current Practices and Plans for the Future. An ACRL White Paper. Association of
College and Research Libraries, a division of the American Library Association, 2012.
http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/
Tenopir_Birch_Allard.pdf.
The Wellcome Trust. “Policy on Data Management and Sharing.” Accessed August 6, 2016.
https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-shar-
ing.
30 Introduction to Volume One
Thomson, Sara Day. “Technology Watch Report 16: Preserving Transactional Data.” Digital
Preservation Coalition. May 2, 2016. doi:10.7207/twr16-02.
Tibbo, Helen R., and Christopher A. Lee. “Closing the Digital Curation Gap: A Grounded
Framework for Providing Guidance and Education in Digital Curation.” In Archiving
Conference, vol. 2012, no. 1, pp. 57-62. Society for Imaging Science and Technology,
2012. http://www.ils.unc.edu/callee/p57-tibbo.pdf.
United States Government. “US Open Data Action Plan.” May 9, 2014. https://www.white-
house.gov/sites/default/files/microsites/ostp/us_open_data_action_plan.pdf.
University of Illinois Urbana-Champaign School of Information Science. “Specialization in
Data Curation.” Accessed August 4, 2016. http://www.lis.illinois.edu/academics/pro-
grams/specializations/data_curation.
University of Notre Dame. “About the eMotion and eCognition Lab.” Accessed August 6,
2016. http://www3.nd.edu/~emotecog/about.html.
US Geological Survey. “NBII to Be Taken Offline Permanently in January.” USGS Access
Newsletter 14, no. 3 (Fall 2011), https://www2.usgs.gov/core_science_systems/Access/
p1111-1.html.
Van Noorden, Richard. “Irish University Labs Face External Audits.” Nature News, June
17, 2014. http://www.nature.com/news/irish-university-labs-face-external-au-
dits-1.15422.
Vines, Timothy H., Arianne YK Albert, Rose L. Andrew, Florence Débarre, Dan G. Bock,
Michelle T. Franklin, Kimberly J. Gilbert, Jean-Sébastien Moore, Sébastien Renaut,
and Diana J. Rennison. “The Availability of Research Data Declines Rapidly with Ar-
ticle Age.” Current Biology 24, no. 1 (2014): 94-97. doi:10.1016/j.cub.2013.11.014.
Witt, Michael. “Institutional Repositories and Research Data Curation in a Distributed
Environment.” Library Trends 57, no. 2 (2008): 191-201. doi:10.1353/lib.0.0029.
Yahoo. “R10—Yahoo News Feed dataset, version 1.0 (1.5TB).”Accessed August 6, 2016.
http://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=75.
PART I
Setting the Stage for
Data Curation
Policies, Culture, and
Collaboration
Curating Research Data Volume One Practical Strategies for Your Digital Repository 1st Edition Lisa R Johnston
CHAPTER 1*
Research and the
Changing Nature of
Data Repositories
Karen S. Baker and Ruth E. Duerr
Introduction
This chapter explores the changing nature of research and data repositories. Trends
in open data, big data, and long-tail data are ongoing,[1] following shifts from analog
devices and documentation to digital instrumentation and digital data. Further, recent
mandates about increasing access to data in the United States come at a time when
digital capabilities are increasing though digital infrastructure is in flux.[2]
Attention to and funding for data sharing have propelled data repository
activities in both new and established digital settings. As the number and kind
of repositories accepting research-generated data increase, their effectiveness de-
pends upon developing widespread understanding of data concepts as well as the
knowledge accumulated about successes and failures in the digital realm.
The full reality of managing research data and data repositories in a Dig-
ital Age is informed and shaped by past efforts carried out in many sectors. It
is impacted by new participants, new roles, and changes in the distribution of
responsibilities associated with data management. In addition, evolving technol-
ogies result in changing support mechanisms for documentation, preservation,
and access of data. Contemporary data management efforts have more than fifty
years’ experience to draw upon given early large-scale assemblies of digital data
in scientific research fields such as remote sensing and weather as well as social
science research fields such as survey and census methods.[3] Only a portion of the
insights gained from past experience with data management and data systems are
readily available given the combination of emphasis on scientific findings and
of succinctness required in writing for the scholarly literature. Incentives and
rewards for writing about work with data have been lacking.[4] New forums and
journals are emerging that provide venues for discussions about past and present
work with data so that past experience is available to new communities of data
workers (see section “Changing Research Needs and New Initiatives” below).
* This work is licensed under a Creative Commons Attribution 4.0 License, CC BY
(https://creativecommons.org/licenses/by/4.0/).
This paper considers both conceptual and historical underpinnings in the
story of data repositories. From work with data repositories in a variety of research
fields, three concepts—data ecosystem, liaison work, and continuing design—
help in understanding how work with digital data can contribute to the viability
and well-being of the research process. These concepts, together with related issues
and recommendations, are presented below as projects, communities, consortia,
alliances, centers, programs, agencies, universities, publishers, libraries, and orga-
nizations of all kinds grapple with managing and preserving data in repositories.
Background
A few early data efforts in the sciences are presented as examples of past activities
that inform today’s work.
Changing Support for Data
Work with data is embedded in the processes, methods, and goals of research.
Rigor in documenting thought processes, evidence collection, and data is integral
to ensuring a robust research process. There is a long history of research data
recorded in station books and laboratory notebooks.[5] In addition, white papers
and project newsletters as well as expedition and technical reports full of tables of
numbers were, and continue to be, published outside formal academic and com-
mercial channels by a variety of organizations. Such materials, known as “the gray
literature,” are authoritative as primary sources. As the name suggests, however,
they may be limited in terms of discoverability, access, and vetting. Nevertheless,
these outlets have played a significant role in providing researchers access to data.
While research findings traditionally appear in formal publication venues, the
original, full data record was often in the gray literature as well as file cabinets.[6]
With the development of technologies such as cameras and strip chart
recorders, a variety of organizational subunits such as photo labs emerged to
handle these analog materials and to support conversion to forms that could
be published. Although they did not consider themselves data publishers, they
or their counterparts routinely created reports with primary data in the form
of tables, photos, maps, and graphs. Many of these offices have since closed
or have been transformed, such as the photo lab that becomes a digital service
group. Closing often occurred before infrastructure was in place to handle
documentation and data in new ways beyond the capability of an individual’s
desktop. Eventually, with Internet availability, researchers and research groups
developed new practices such as delivery of content including field data under
a Data tab on a research website. In a sense, the current attention to data ac-
cess and new forms of data citation is a return to the norm of retrieving and
citing data that appeared in the print-based gray literature. With orders of
magnitude more digital data generated, however, new kinds of digital tools,
capabilities, and arrangements are required to support widespread access to
digital data.
Expanding Support for Data in Natural
and Social Sciences
With the development of large-scale international research initiatives, support
for data took a variety of forms. Spurred by twentieth-century post–World War
II planning, a number of data facilities were established. For instance, World
Data Centers and the Federation of Astronomical and Geophysical Data Anal-
ysis Services evolved, starting with the International Geophysical Year (IGY) in
1957–1958 with its focus on international science. From the IGY, a revolution-
ary vision of the earth as a whole emerged, focusing the attention of geoscientists
collectively on scientific methods, measurements, and data. The International
Council of Scientific Unions (now International Council for Science) established
a system of World Data Centers to serve the IGY and developed data management
plans for each IGY scientific discipline.[7] The World Data Centers focused
on replicating data across the centers and sharing data across the globe. The
ICSU Committee on Data for Science and Technology (CODATA) continues to
develop and share knowledge about data today.[8] With their beginnings as centers
full of the books and reports containing data for IGY and other initiatives, early
data efforts grew to include magnetic tapes and punch cards at designated locations.
Today management in data centers has grown to include digital data and
physical samples as well as to accommodate many stakeholders and audiences.[9]
The transition and renaming of the World Data Center system in 2009 to be the
World Data System represents another shift in perspective with data envisioned
within an interoperable set of systems.
In the United States, federal centers developed and took many forms.
Federally Funded Research and Development Centers (FFRDC) were created as
public-private partnerships to support research community projects by making
available large-scale resources such as the aircraft required for atmospheric
science fieldwork.[10] Research support includes project coordination,
instrumentation, field support, and work with data. National Data Centers such
as the National Climatic Data Center and the National Oceanographic Data
Center were created in order to support management of data from platforms
with large data streams such as from satellites. Supercomputer centers were
developed as national resources to provide computational power to researchers
across the nation.[11] These centers have developed repositories for data of
many kinds existing alongside other preservation institutions such as archives
with collections of photos and manuscripts, museums with physical artifacts,
and libraries with books and journals. Tape racks proliferated as recordings
on seven- and nine-track tapes replaced everything from strip chart recorders
to images. Tapes were replaced in turn by new storage technologies. Many
other, less visible changes were occurring in data centers, driven by changes
in applications, configurations, budgets, institutions, and careers.[12] As the
number of data centers grew, coordination activities started taking place. For
instance, the National Archives and Records Administration (NARA) joined
in 1992 with the scientific community and with federal and nonfederal entities
that collect data about the earth to consider collectively data management
and archiving procedures.[13] This interaction resulted in
recommendations that NARA collaborate with other agencies that maintain
long-term custody of data.
In the social sciences, early national-level repository development was
spurred by an initial need for community access to data from election studies
and from the US Census.[14] The Inter-university Consortium for Political and
Social Research (ICPSR), which dates its origin to 1962, provides an example of
responding to change over time. ICPSR began with a membership model to fund
its data management costs but is now leading a call for change in support
mechanisms for domain repositories.[15] This consortium has responded to community
interests by participating in an alliance to widely distribute backup copies of data
across several repositories. ICPSR has also responded to recent mandates for public
data access by creating a new level of service. This service, called OpenICPSR,
supports public availability of data free of cost.[16]
Data Repository Diversity
Setting aside the issue of data presentation, we consider two categories of data
repositories depending upon whether they ingest homogeneous or heterogeneous
tomb of Archimedes.[792] On his return home[793] he resumed his
forensic practice: and in B. C. 70 was the champion of his old friends,
the Sicilians, and impeached Verres, who had been prætor of
Syracuse, for oppression and maladministration. In the following year[794]
he was elected curule ædile by a triumphant majority. In the
celebration of the games which belonged to the province of this
magistrate, he exhibited great prudence by avoiding the lavish
expenditure in which so many were accustomed to indulge, whilst, at
the same time, no one could accuse him of meanness and illiberality.
In the year B. C. 67, he obtained the prætorship, and
notwithstanding the judicial duties of his office, defended Cluentius.
Hitherto his speeches had been entirely of the judicial kind. He now
for the first time distinguished himself as a deliberative orator, and
supported the Manilian law which conferred upon Pompey, to the
discomfiture of the aristocratic party, the command in chief of the
Mithridatic war.
The great object of his ambition now was the consulship, which
seemed almost inaccessible to a new man. As all difficulties and
prejudices were on the side of the aristocratic party, his only hope of
surmounting them was by warmly espousing the cause of the people.
Catiline and C. Antonius, who were his principal competitors,
formed a coalition, and were supported by Cæsar and Crassus, but
the influence of Pompey and the popular party prevailed; and Cicero
and Antony were elected. He entered upon his office January 1, B. C.
63. At this period, perhaps, the moral qualities of his character are
the highest, and his genius shines forth with the brightest splendour.
The conspiracy of Catiline was the great event of his consulship; a
plot which its historian does not hesitate to dignify with the title of a
war. Yet this war was crushed in an unparalleled short space of time;
and a splendid triumph was gained over so formidable an enemy, by
one who wore the peaceful toga, not the habiliments of a general.
The prudence and tact of the civilian did as good service as the
courage and decision of the soldier. The applause and gratitude of his
fellow citizens were unbounded, and all united in hailing him the
father of his country. One act alone laid him open to attack, and in
fact eventually caused his ruin. There is no doubt that it was
unconstitutional, although under the circumstances it was
defensible, perhaps scarcely to be avoided. This act was the execution
of Lentulus, Cethegus, and the other ringleaders, without sentence
being passed upon them by the comitia. The senate, seeing that the
danger was imminent, had invested Cicero and his colleague with
power to do all that the exigencies of the state might require (videre
ne quid res publica detrimenti caperet;) and although it was Cicero
who recommended the measure and argued in its favour, it was the
senate who pronounced the sentence, and assumed that, as traitors,
the conspirators had forfeited their rights as citizens.
The grateful people saw this clearly; and when Metellus Celer, one
of the tribunes, would have prevented Cicero from giving an account
of his administration at the close of the consular year, he swore that
he saved his country, and his oath was confirmed by the
acclamations of the multitude. This was a great triumph; and in
sadder times he looked back to it with a justifiable self-complacency.[795]
He now, as though his mission was accomplished, refused all
public dignities except that of a senator: but he did not thus escape
peril; he soon exposed himself to the implacable vengeance of a
powerful and unscrupulous enemy. The infamous P. Clodius Pulcher
intruded himself in female attire into the rites of the Bona Dea,
which were celebrated in the house of Cæsar. Suspicion fell upon
Cæsar’s wife, and a divorce was the consequence.[796] Clodius was
brought to trial on the charge of sacrilege, and pleaded an alibi.
Cicero, however, proved his presence in Rome on the very day on
which the accused asserted that he was at Interamnum.
Although the guilt of Clodius was fully established, his influence
over the corrupt Roman judices was powerful enough to procure an
acquittal. Henceforward he never could forgive Cicero, and
determined to work his ruin. He caused himself to be adopted in a
plebeian family; and thus becoming qualified for the tribunate was
elected to that magistracy, B. C. 59. No sooner was he appointed,
than he proposed a bill for the outlawry of any one who had caused
the execution of a citizen without trial. Cicero at once saw that this
blow was aimed against himself. He had disgusted Cæsar by his
political coquetry; the false and selfish Pompey refused to aid him in
his trouble; and spirit-broken, he fled to Brundisium,[797] and thence
to Thessalonica. He had an interview with Pompey before his flight,
but it led to no results.[798] Pompey had sworn to help him as long as he
felt that there was danger, lest Cicero should join Cæsar’s party; but
when he saw that Cicero’s foes were successful, he deserted him.
In his absence his exile was decreed, and his town and country
houses were given up to plunder. It cannot be denied that during his
banishment he exhibited weakness and pusillanimity: his reverses
had such an effect upon his mind that he was even supposed to be
mad.[799] His great fault was vanity, of which defect he was himself
conscious, and confessed it;[800] and disappointed vanity was the
cause of his affliction. He could bear anything better than loss of
popular applause; and on this occasion, more than any other, he gave
grounds for the assertion, that “he bore none of his calamities like a
man, except his death.” Rome, however, could not forget her
preserver; and in the following year he was recalled, and entered
Rome in triumph, in the midst of the loud plaudits of the assembled
people.[801] Still, however, he was obliged to secure the prosperity
which he had recovered by political tergiversation. The measures of
the triumvirate, which he had formerly attacked with the utmost
virulence, he did not hesitate now to approve and defend.
After his return[802] he was appointed to a seat in the College of
Augurs; a dignity which he had anxiously coveted before his exile,
and to obtain which, he had offered almost any terms to Cæsar and
Pompey.[803] The following year, much against his will, the province
of Cilicia was assigned to him. Strictly did the accuser of Verres act
up to the high and honourable principles which he professed. His
was a model administration: a stop was put to corruption, wrongs
were redressed, justice impartially administered. Those great
occasions on which he was compelled to act on his own
responsibility, and to listen to the dictates of his beautiful soul,
“seine schöne seele,”[804] his pure, honest, and incorruptible heart,
are the bright points in Cicero’s career. The emergency of the
occasion overcame his constitutional timidity.
In the year B. C. 49, he returned to Rome, and finding himself in a
position in which he could calmly observe the current of affairs, and
determine unbiassed what part he should take in them, or whether it
was his duty to take any part at all, his weak, wavering, vacillating
temper again got the mastery over him. He would not do anything
dishonest, but he was not chivalrous enough to spurn at once that
which was dishonourable. Cæsar and Pompey were now at open war,
and he could not make up his mind which to join.[805] He felt,
probably, that the energy, ability, and firmness of Cæsar, would be
crowned with success; and yet his friends, his party, and his own
heart were with Pompey, and he dreaded the scorn which would be
heaped upon him if he forsook his political opinions. His were not
the stern, unyielding principles of a Cato; but the fear of what men
would say of him made him anxious and miserable. The struggle was
a long one between caution and honour, but at length honour
overcame caution. He made his decision, and went to the camp of
Pompey; but he could never rally his spirits, or feel sanguine as to the
result. He immediately saw that Pharsalia decided the question for
ever, and consequently hastened to Brundisium, where he awaited
the return of the conqueror. It was a long time to remain in
suspense; but at last the generous Cæsar relieved him from it by a
full and free pardon.
And now again his character rose higher, and his good qualities
had room to display themselves. There were no longer equally
balanced parties to revive the discord which formerly distracted his
mind, nor were the circumstances of the times such as to demand his
active interference in the cause of his country; but he was as great in
the exercise of his contemplative faculties as he had been in the
brightest period of his political life. The same faults may, perhaps, be
discerned in his philosophical speculations: the same indecision
which rendered him incapable of being a statesman or a patriot
caused him to adopt in philosophy a skeptical eclecticism. Truth was
to him as variable as political honesty; but he is always the advocate
and supporter of resignation, and fortitude, and purity, and virtue.
He had hitherto suffered as a public man: he was now bowed down
by domestic affliction. A quarrel with his wife Terentia ended in a
divorce:[806] such was the facility with which at Rome the nuptial tie
could be severed. His second wife was his own ward—a young lady of
large fortune; but disparity of years and temper prevented this
connexion from lasting long. In B. C. 45 he lost his daughter Tullia.
The blow was overwhelming: he sought in vain to soothe his grief in
the woody solitudes of his maritime villa at Astura, and it was long
before the bereaved father found consolation in philosophy.
The political crisis which ensued upon the assassination of Cæsar
alarmed him for his own personal safety: he therefore meditated a
voyage to Greece; but being wind-bound at Rhegium, the hopes of an
accommodation between Antony and the senate (a hope destined not
to be realized) induced him to return. Antony now left Rome, and
Cicero delivered that torrent of indignant and eloquent invective—his
twelve Philippic orations.[807] He was again the popular idol: crowds
of applauding and admiring fellow-citizens attended him to the
Forum in a kind of triumphant procession, as they had on his return
from exile. But soon the second triumvirate was formed. Each
member readily gave up friends to satisfy the vengeance of his
colleagues, and Octavius sacrificed Cicero.
The story of his death is a brief and sad one. He was enjoying the
literary retirement of his Tusculan villa when his friends warned him
of his approaching fate. He was too great a philosopher to fear death;
but too high-principled and resigned to the Divine will to commit
suicide. Still he scarcely thought life worth preserving: “I will die,” he
said, “in my fatherland, which I have so often saved.” However, at the
entreaty of his brother, to whom he was affectionately attached, he
endeavoured to escape. He first went across the country to Astura,
and there embarked. The weather was tempestuous, and as he
suffered much from sea-sickness, he again landed at Gaëta. A
treacherous freedman betrayed him, and as he was being carried in a
litter he was overtaken by his pursuers. He would not permit his
attendants to make any resistance; but patiently and courageously
submitted to the sword of the assassins, who cut off his head and
hands and carried them to Antony. A savage joy sparkled in the eyes
of the triumvir at the sight of these bloody trophies. His wife, Fulvia,
gloated with inhuman delight upon the pallid features, and in petty
spite pierced with a needle that once eloquent tongue. The head and
hands were fixed upon the rostrum which had so often witnessed his
unequalled eloquence. All that passed by bewailed his death, and
gave vent to their affectionate feelings.
Although it is impossible to be blind to the numerous faults of
Cicero, few men have been more maligned and misrepresented, and
the judgment of antiquity has been, upon the whole, generally
unfavourable. He was vain, vacillating, inconstant, constitutionally
timid, and the victim of a morbid sensibility; but he was candid,
truthful, just, generous, pure-minded, and warm-hearted. His
amiability, acted upon by timidity, led him to set too high a value on
public esteem and favour; and this weakened his moral sense and his
instinctive love of virtue. That he possessed heroism is proved by his
defence of Roscius, although the favourite of the terrible Sulla was
his adversary. He was not entirely destitute of decision, or he would
not so promptly have expressed his approbation of Cæsar’s assassins
as tyrannicides. He had resolution to strive against his over-
sensitiveness, and wisdom to see that mental occupation was its best
remedy; for in the midst of the distractions and anxieties of that
eventful and critical year which preceded the consulship of Hirtius
and Pansa an almost incredible number of works proceeded from his
pen.[808] There are many circumstances to account for his political
inconsistency and indecision. He had an early predilection for the
aristocratic party; but he saw that they were narrow-minded and
behind their age. All the patricians, except Sulla and his small party,
were on the popular side. He was proud of his connexion with
Marius; and his friend Sulpicius Rufus, whom he greatly admired,
joined the Marians. For these reasons, Cicero was inconsistent as a
politician. Again, during periods of revolutionary turbulence,
moderate men are detested by both sides; and yet it was impossible
for a philosophic temper, which could calmly and dispassionately
weigh the merits and demerits of both, to sympathize warmly with
either. Cicero saw that both were wrong: he was too temperate to
approve, too honest to pretend a zeal which he did not feel, and,
therefore, he was undecided.
Again, having a large benevolence, and a firm faith in virtue, he
was unconscious of guile himself, and thought no evil of others. He
therefore mistook flattery for sincerity, and compliments for
kindness. He was vain; but vanity is a weakness not inconsistent with
great minds, and in the case of Cicero it was fed by the unanimous
voice of public approbation.
As an advocate his delight was to defend, not to accuse.[809] In
three only of his twenty-four orations did he undertake the office of
an accuser.
Gentle, sympathizing, and affectionate, he lived as a patriot and
died as a philosopher.
CHAPTER X.
CICERO NO HISTORIAN—HIS ORATORICAL STYLE DEFENDED—ITS
PRINCIPAL CHARM—OBSERVATIONS ON HIS FORENSIC ORATION—
HIS ORATORY ESSENTIALLY JUDICIAL—POLITICAL ORATIONS—
RHETORICAL TREATISES—THE OBJECT OF HIS PHILOSOPHICAL
WORKS—CHARACTERISTICS OF ROMAN PHILOSOPHICAL
LITERATURE—PHILOSOPHY OF CICERO—HIS POLITICAL WORKS—
LETTERS—HIS CORRESPONDENTS—VARRO.
Such were the life and character of Cicero. The place which he
occupies in a history of Roman literature is that of an orator and
philosopher. It has been already stated that he had some taste for
poetry: in fact, without imagination he could scarcely have been so
eminent as an orator; but though the power which he wielded over
prose was irresistible, he had not fancy enough to give a poetical
character to the language.
Nor had he, notwithstanding the versatility of his talents, any taste
for historical investigation. He delighted to read the Greek
historians, for the same purpose for which he studied the Attic
orators, merely as an instrument of intellectual cultivation; but he
was ignorant of Roman history, because he took no interest in
original research. His countrymen[810] expected from him an
historical work, but he was unfit for the task. It is plain from his
“Republic,” how little he knew as an antiquarian.
The greatest praise of an orator’s style is to say that he was
successful. The end and object of oratory is to convince and persuade
—to rivet the attention of the hearer, and to gain a mastery over the
minds of men. If, therefore, any who study the speeches of Cicero in
the closet find faults in his style, they must remember the very faults
themselves were suited to the object which he was carrying into
execution. During the process of raising the public taste to the
highest standard, he carried his hearers with him: he was not too
much in advance; he did not aim his shafts too high; they hit the
head and heart. Senate, judges, people understood his arguments,
and felt his passionate appeals. Compared with the dignified energy
and majestic vigour of the Athenian orator, the Asiatic exuberance of
some of his orations may be fatiguing to the sober and chastened
taste of the modern classical scholar; but in order to form a just
appreciation, he must transport himself mentally to the excitements
of the thronged Forum—to the senate composed, not of aged,
venerable men, but statesmen and warriors in the prime of life,
maddened with the party spirit of revolutionary times—to the
presence of the jury of judices, as numerous as a deliberative
assembly, whose office was not merely calmly to give their verdict of
guilty or not guilty, but who were invested as representatives of the
sovereign people with the prerogative of pardoning or condemning.
Viewed in this light, his most florid passages will appear free from
affectation—the natural flow of a speaker carried away with the
torrent of his enthusiasm. The melodious rise and fall of his periods
are not the result of studied effect, but of a true and musical ear.
Undoubtedly, amongst his earlier orations, are to be found passages
somewhat too declamatory and inconsistent with the principles
which he afterwards laid down when his taste was more matured,
and when he undertook to write scientifically on the theory of
eloquence. Nor must it be concealed that some of the staid and stern
Romans of his own days were daring enough, notwithstanding his
popularity and success, to find the same fault with him. “Suorum
temporum homines,” says Quintilian, “incessere audebant eum ut
tumidiorem et Asianum[811] et redundantem et in repetitionibus
nimium et in salibus aliquando frigidum et in compositione fractum
et exsultantem et pene viro molliorem.”
But it is not only the brilliance and variety of expression, and the
finely-modulated periods, which constituted the principal charm of
Ciceronian oratory, and rendered it so effective. Its effectiveness was
mainly owing to the great orator’s knowledge of the human heart,
and of the national peculiarities of his countrymen. Its charm was
owing to his extensive acquaintance with the stores of literature and
philosophy, which his sprightly wit moulded at will, to the varied
learning which his unpedantic mind made so pleasant and popular,
to his fund of illustration at once interesting and convincing. Even if
his knowledge, because it spread over so wide a surface, was
superficial, in this case profoundness was unnecessary.
In a work like the present it is only possible to devote a few brief
observations to the most important of his numerous orations, in
which, according to the criticism of Quintilian, he combined the force
of Demosthenes, the copiousness of Plato, and the elegance of
Isocrates. Knowledge of law, far superior to that possessed by the
great orators of the day,[812] distinguishes his earliest extant oration,
the defence of P. Quinctius.[813] Hortensius was the defendant’s
counsel. Nævius, the defendant, who had unjustly possessed himself
of the property of the plaintiff’s deceased brother, was a deserter
from the Marians, and therefore a protégé of Sulla; but,
notwithstanding these disadvantages, Cicero gained his cause. In the
masterly defence of S. Roscius,[814] Cicero again defied Sulla. His
client was accused of parricide: there was not a shadow of proof, and
Cicero saved the life of an innocent man. The noble enthusiasm with
which he inveighs against tyranny in this oration strikingly contrasts
with the language, full of sweetness, in which he describes Roman
rural life. The passage on parricide was too glowing and Asiatic for
the taste of his maturer years, and he did not hesitate to make it the
subject of severe criticism.[815] Passing over speeches of less interest,
we come to the six celebrated Verrian orations. Of these chefs-
d’œuvre the first only was delivered.[816] The others were merely
published; for the voluntary exile of the criminal rendered further
pleading unnecessary. The first is entitled “Divinatio,” i. e., an
inquiry as to who should have the right of prosecuting: Cæcilius, who
had been quæstor to the accused, claimed this privilege, wishing to
make the suit a friendly one, and thus quash the proceedings.
Nothing can surpass the ironical and sarcastic exposure of this
fraudulent attempt to defeat the ends of justice. The noble passages
in the succeeding orations of the series are well known; the sketch of
the wicked proconsul’s antecedent career; the graceful eulogy of that
province, in the welfare of which Cicero himself felt so warm an
interest; the tasteful description of the statues and antiquities which
tempted the more than Roman cupidity of Verres; the interesting
history of ancient art which accompanies it; the burst of pathetic
indignation with which he paints the horrible tortures to which not
only the provincials, but even Roman citizens, were exposed.
Transports of joy pervaded the whole of Sicily at Cicero’s success;
and the Sicilians caused a medal to be struck with this inscription
—“Prostrato Verre Trinacria.” The oration for Fonteius[817] is a
skilful defence of an unpopular governor; that in defence of
Cluentius[818] is one of the most remarkable causes célèbres of
antiquity; and the complicated scene of villany which Cicero’s
forcible and soul-harrowing language paints, makes one shudder
with horror, whilst we are struck with admiration at the clearness of
intellect with which he unravels the web of guilt woven by
Oppianicus and Sassia. This remarkable oration has been analyzed
by Dr. Blair.[819]
Again, passing over other forensic orations, we come to that on
which he had evidently expended all his resources of art, taste, and
skill—the speech for the poet Archias.[820] If possible it is even too
elaborate and polished for so graceful a theme. Although the object
of the advocate was simply to establish the right of his client to
Roman citizenship, the genius of the poet of Antioch furnished an
opportunity not to be neglected for digressing into the fields of
literature, and for pronouncing a truly academical eulogium on
poetry. It is satisfactory to the admirers of Cicero to find that the
attack which has been made on the genuineness of this pleasing
oration is groundless and unwarrantable.[821]
The oration pro Cælio[822] is the most entertaining in the whole
collection. It contains a rich fund of anecdote, seasoned with witty
observations; a knowledge of human nature illustrated in a piquant
and humorous style, expressed in a tone of most gentlemanlike yet
playful eloquence, and interspersed with passages of great beauty. It
presents a marked contrast to the coarse personal abuse which
defaces the otherwise powerful invective against L. Piso, which was
delivered in the following year.[823]
The list, though many more marvellous specimens are omitted,
must be closed with the oration in defence of T. Annius Milo. On this
occasion Cicero lost his wonted self-possession. When the court
opened, Pompey was presiding on the bench, and he had caused the
Forum to be occupied with soldiers. The sight, added, perhaps, to the
consciousness that he was advocating a bad cause, struck Cicero with
alarm; his voice trembled, his tongue refused to give utterance to the
conceptions which he had formed. The judges were unmoved; and
Milo remained in his self-imposed exile at Marseilles. When Cicero
left the court his courage and calmness returned. He penned the
oration which is now extant. He had little or no proof or evidence to
offer, and therefore, as an argumentative work, it is unconvincing;
but for force, pathos, and the externals of eloquence, it deserves to be
reckoned amongst his most wonderful efforts. When the exiled Milo
read it, he is said to have exclaimed, “O, Cicero, if you had pleaded
so, I should not be eating such capital fish here!” The author himself
and his contemporaries thought this his finest oration; probably its
deficiencies were concealed by its eloquence and ingenuity. It
appears that the oration which he actually delivered was taken down
in writing by reporters, and was extant in the time of Asconius
Pedianus, the most ancient commentator on Cicero’s orations.[824] Its
feebleness proved the correctness of the judgment of antiquity.
The oratory of Cicero was essentially judicial: he was himself
conscious that his talents lay in that direction, and he saw that in
that field was the best opportunity for displaying oratorical power.
Even his political orations are rather judicial than deliberative. He
was not born for a politician. He possessed not that analytical
character of mind which penetrates into the remote causes of human
action, nor the synthetical power which enables a man to follow them
out to their farthest consequences; he had not that comprehensive
grasp of mind which can dismiss at once all points of minor
importance and useless speculation, and, seizing all the salient
points, can bring them to bear together upon questions of practical
expediency. Of the three qualities necessary for a statesman he
possessed only two, honesty and patriotism: he had not political
wisdom.
Hence, in the finest specimens of his political harangues, his
Catilinarians and Philippics, and that in support of the Manilian law,
we look in vain for the calm, practical weighing of the subject which
is necessary in addressing a deliberative assembly. This was not the
habit of his mind. He was only lashed to action by circumstances of
great emergency; but even then he is still an advocate—all is
excitement, personal feeling, and party spirit: he deals in invective
and panegyric, and the denunciation of the enemies of his country;
and the parts which especially call forth our admiration differ in
nothing from those which we admire in his judicial orations.
Nevertheless, so irresistible was the influence which he exercised
upon the minds of his hearers, that all his political speeches were
triumphs. His panegyric on Pompey,[825] in the speech for the
Manilian law, carried his appointment as commander-in-chief of the
armies of the East. The consequence of the oration de Provinciis
Consularibus continued to Cæsar his administration of Gaul. He
crushed in Catiline one of the most formidable traitors that had ever
menaced the safety of the republic. Antony’s fall followed the
complete exposure of his debauchery in private life, and the
factiousness of his public career.[826]
Of the Catilinarians, the first and fourth were delivered in the
senate, the second and third in the presence of the people. Every one
knows the burst of indignation which the consul, rising in his place,
aims at the audacious conspirator who dared to pollute with his
presence the temple of the deity, and the most august assembly of
the Roman people. In less than twenty-four hours Catiline had left
Rome, and the conspiracy had become a war. In four words Cicero
announced this to the assembled Romans the day after he had
addressed the senate. The third is a piece of self-complacent but
pardonable egotism. Success has overwhelmed him—he sees that all
eyes are turned upon himself—he is the hero of his own story; still he
demands no reward but the approbation of his fellow-citizens, and
reminds them that to the gods alone their gratitude is due.
Two days pass away, and after Cæsar and Cicero had spoken,
Cicero again addresses the senate, and recommends that measure
which was the beginning of his troubles, the condemnation of the
conspirators. The zeal of the senate made the act their own, but
Cicero paid the penalty. The position which Cicero occupies on this
occasion invests his speech with more dignity than is displayed in
any of the preceding. He is the chief magistrate of the republic,
performing the duty of pronouncing a capital sentence on the guilty.
The excitement of the crisis is subsiding; and he has the more
composure, because he knows that he carries with him the
sympathies of the senate and people.
The Philippics, so named after the orations of Demosthenes, are
fourteen in number. Cicero commenced his attack[827] upon the
object of his implacable hatred with a defence of the laws of Cæsar,
which Antony wished to repeal. He followed it up with the celebrated
second oration, in which he demolished the character of Antony; a
speech which Juvenal pronounced to be his chef-d’œuvre, but which
Niebuhr thought was undeserving of being so highly exalted. He
delivered the remaining twelve in the course of the succeeding year;
they were the last monuments of his eloquence; he never spoke
again. The fourteenth is a brilliant panegyric, but nothing more; the
gallant army of Octavius received their deserved applause; but in this
political crisis the orator could not discern or even catch a glimpse of
the future destinies of his country.
In his rhetorical works, Cicero left a legacy of practical instruction
to posterity. The treatise “De Inventione,” although it displays
genius, is merely interesting as the juvenile production of a future
great man; and the author himself alludes to it as a rude and
unfinished production.[828] Of the Rhetorical Hand-Book, in four
sections, addressed to Herennius, it is unnecessary to speak, as it is
now universally pronounced spurious.[829] The De Oratore, Brutus
sive de claris Oratoribus, and Orator ad M. Brutum,[830] are the
result of his matured experience. They form together one series; the
principles are first laid down; their developments are carried out and
illustrated; and lastly, in the Orator, he places before the eyes of
Brutus the model of ideal perfection. In his treatment of this subject,
he shows a mind imbued with the spirit of Plato: he invests it with
dramatic interest, and transports the reader into the scene which he
so graphically describes. The conversation contained in the first of
these works has been already described. The scene of the second is
laid on the lawn of Cicero’s palace at Rome: Cicero, Atticus and M.
Brutus are the dramatis personæ; and their taste receives
inspiration from a statue of Plato which adorns the garden. In the
third, Cicero himself, at the request of M. Brutus, paints, as Plato
would have done, the portrait of a faultless orator.
Three more short treatises must be added—(1.) The dialogue, De
Partitione Oratoria,[831] an elementary book, written for his son. (2.)
The De Optimo Genere Oratorum,[832] a short preface to a translation
of the Greek oration, De Corona. (3.) The Topica,[833] i. e., a treatise
on the commonplaces of judicial oratory.
Philosophy of Cicero.
Cicero somewhat arrogantly claims the credit of being the first to
awaken a taste for philosophy, and to illuminate the darkness in
which it lay hid by the light of Roman letters.[834] He did not confess
the obligations under which he lay to his predecessors, because he
never could forget that he was an orator.[835] He could not deny that
some of them thought justly; but he denied that they possessed the
power of expressing what they thought. He felt that there was
nothing in the philosophical writings already existing to tempt his
countrymen to study the subject: they were dry, unadorned,
unpolished. It required an orator to array philosophy in an enticing
garb. He proposed, therefore, to assuage his anxieties—to seek
repose from the harassing cares of politics[836]—by rendering his
countrymen independent of Greek philosophical literature.
This was all he proposed to himself: it was all that his predecessor
had attempted; nor did he pretend to originality. The periods which
he devoted to the task, and to which all philosophical works belong,
were those during which he was excluded from political life. The first
of these was the triumvirate of Cæsar, Pompey, and Crassus; the
second was coincident with the dictatorship of Cæsar and the
consulship of Antony. Not only did his contemplative spirit delight in
such studies, but, whilst all the avenues to distinction were closed
against him, his ambition sought this road to fame, and his
patriotism urged him to take this method of benefiting his country.
But as he was not the first who introduced philosophy to the
Romans, it will be necessary briefly to sketch its progress up to the
time at which his labours commenced.
Roman philosophy was neither the result of original investigation
nor the gradual development of the Greek system. It arose rather
from a study of ancient philosophical literature than from an
examination of philosophical principles. The Roman intellect did not
possess the power of abstraction in a sufficiently high degree for
research, nor was the Latin language capable of representing
satisfactorily abstract thoughts. Cicero was quite aware of the
poverty of its scientific nomenclature, as compared with that of
Greece. In one treatise,[837] he writes,—“Equidem soleo etiam, quod
uno Græci, si aliter non possum, idem pluribus verbis exprimere.”
Pliny[838] and Seneca[839] assert the same fact. “Magis damnabis,”
writes the latter, “angustias Romanas si scieris unam syllabam esse,
quam mutare non possim. Quæ hæc sit quæris? τὸ ὄν.” The practical
character also of the people prompted them to take advantage of the
material already furnished by others, and to select such doctrines as
it approved, without regard to their relation to each other.
The Roman philosopher, therefore, or rather (to speak more
correctly) philosophical student, did not throw himself into the
speculations of his age, pursue them contemporaneously, or deduce
from them fresh results. He went back to the earlier ages of Greek
philosophy, studied, commented on, and explained the works of the
best authors, and adopted some of their doctrines as fixed scholastic
dogmas. Consequently, the spirit in which philosophical study was
pursued by the Romans was a literary and not a scientific one. A
taste for literature had been awakened, and philosophy was
considered only as one species of literature, although its importance
was recognised as bearing upon the practical duties, the highest
interests and happiness of man. The practical view which Cicero took
of philosophy, and the extensive influence which he attributed to it,
is manifest from numerous passages in his works,[840] and is
embodied in the following beautiful apostrophe in the Tusculan
Disputations:[841] “O vitæ Philosophia dux! O virtutis indagatrix,
expultrixque vitiorum! Quid non modo nos, sed omnino vita
hominum sine te esse potuisset? Tu urbes peperisti; tu dissipatos
homines in societatem vitæ convocasti; tu eos inter se primo
domiciliis, deinde conjugiis, tum literarum et vocum communione
junxisti; tu inventrix legum, tu magistra morum et disciplinæ fuisti;
ad te confugimus, a te opem petimus; tibi nos, ut antea magna ex
parte, sic nunc penitus totosque tradimus.”
It is plain, therefore, that the chief characteristics of Roman
philosophy would be—(1.) Learning, for it consisted in bringing
together doctrines and opinions scattered over a wide field; (2.)
Generally speaking, an ethical purpose and object, for Romans would
be little inclined to value any subject of study which had no ultimate
reference to man’s political and social relations; (3.) Eclecticism; for
although there were certain schools, such as the Epicurean and Stoic,
which were evidently favourites, the dogmas of different teachers
were collected and combined together often without regard to
consistency.
The defects of such a system are fatal to its claim to be considered
philosophical; for the scientific connexion of its parts is lost sight of,
and results are presented independent of the chain of causes and
effects by which they are connected with principles. Such a system
must necessarily be illogical and inconsequential. Even the liberality
which adopts the principle, “Nullius jurare in verba magistri,” and
which, therefore, appears to be its chief merit, was absurd; and the
willingness with which all views were readily admitted led to
skepticism, or doubt whether such a thing as absolute truth had a
real existence.
Greek philosophy was probably first introduced into Rome by the
Achæan exiles, of whom Polybius was one.[842] The embassy of
Carneades the Academic, Diogenes the Stoic, and Critolaus the
Peripatetic, followed six years afterwards. In vain the stern M.
Porcius Cato caused their dismissal; for some of the most illustrious
and accomplished Romans, such as Africanus, Lælius, and Furius,
had already profited by their lectures and instructions.[843] Whilst the
educated Romans were gaining an historical insight into the
doctrines of these schools, the Stoic Panætius, who was entertained
in the household of Scipio Africanus, was unfolding the mysterious
and transcendental doctrines of the great object of his veneration,
Plato. But although the Romans could appreciate the majestic dignity
and poetical beauty of his style, they were not equal to the task of
penetrating his hidden meaning; they were, therefore, content to
take upon trust the glosses and commentaries of his expositors.
These inclined to the New Academy rather than to the Old: in its
skeptical spirit they compared and balanced opposing probabilities;
and went no farther than recommending the adoption of opinions
upon which they could not pronounce with certainty. Neither did the
Peripatetic doctrines meet with much favour, although the works of
Aristotle had been brought to Rome by the dictator Sulla, partly, as
Cicero says, because of the vastness of the subjects treated, partly
because they seemed incapable of satisfactory proof to unskilled and
inexperienced minds.[844]
The philosophical system which first arrested the attention of the
Romans, and gained an influence over their minds, was the
Epicurean.[845] But it is somewhat remarkable that, although this
philosophy was in its general character ethical, a people so eminently
practical in their turn of mind should have especially devoted
themselves to the study of the physical speculations of this school.[846]
The only apparent exception to this statement is Catius, but even
his principal works, although he wrote one, “de Summo Bono,” are
on the physical nature of things.[847]
Cicero accounts for the popularity of Epicureanism by saying that
it was easy—that it appealed to the blandishments of pleasure; and
that its first professors, Amafanius and Rabirius, used none of the
refinements of art or subtleties of dialectic, but clothed their
discussions in a homely and popular style, suited to the simple and
unlearned. There were many successors to Amafanius; and the
doctrines which they taught rapidly spread over the whole of Italy.
Many illustrious statesmen, also, were amongst the believers in this
fashionable creed; of whom the best known are C. Cassius, the
fellow-conspirator of Brutus, and T. Pomponius Atticus, the friend of
Cicero. All the monuments and records, however, of the Epicurean
philosophy, which were published in Latin, have perished, with the
exception of the immortal work of T. Lucretius Carus, “De Naturâ
Rerum.”
Nor was Stoicism, the severe principles of which were in harmony
with the stern old Roman virtues, without distinguished disciples;
such as were the unflinching M. Brutus, the learned Terentius Varro,
the jurist Scævola, the unbending Cato of Utica, and the magnificent
Lucullus—a Stoic in creed, though not in life and conduct. The part
which Cicero’s character qualified him to perform in the
philosophical instruction of his countrymen was scarcely that of a
guide: he could give them a lively interest in the subject, and reveal
to them the discoveries and speculations of others, but he could not
mould and form their belief, and train them in the work of original
investigation. Not being himself devoutly attached to any system of
philosophical belief, he would be cautious of offending the
philosophical prejudices of others. He loved learning, but his temper
was undecided and vacillating: whilst, therefore, he delighted in
accumulating stores of Greek erudition, the tendency of his mind
was, in the midst of a variety of inconsistent doctrines, to leave the
conclusion undetermined. Although he listened to various
instructors—Phædrus the Epicurean, Diodotus the Stoic, and Philo
the Academician—he found the eclecticism of the latter more
congenial to his taste. Its preference of probability to certainty suited
one who shrunk from the responsibility of deciding.
It is this personality, as it were, which gives a special interest to the
Ciceronian philosophy. The reflexion of his personal character which
pervades it rescues it from the imputation of being a mere transcript
of his Greek originals. Cicero brings everything as much as possible
to a practical standard. If the question arises between the study of
morals and politics and that of physics or metaphysics, he decides in
favour of the former, on the grounds that the latter transcends the
capacities of the human intellect;[848] that in morals and politics we
are under obligations from which in physics we are free; that we are
bound to tear ourselves from these abstract studies at the call of duty
to our country or our fellow-creatures, even if we were able to count
the stars or measure the magnitude of the universe.[849] In the
didactic method which he pursues he bears in mind that he is dealing
not with contemplative philosophers, or minds that have been
logically trained, but with statesmen and men of the world; he does
not therefore claim too much, or make his lessons too hard, and is
always ready to sacrifice scientific system to a method of popular
instruction. His object seems to be to recommend the subject—to
smooth difficulties, and illustrate obscurities. He evidently admires
the exalted purity of Stoical morality; and the principles of that sect
are those which he endeavours to impress upon his son.[850] His only
fear is that their system is impracticable.[851]
Cicero believed in the existence of one supreme Creator and
Governor of the universe, and also in His spiritual nature;[852] but his
belief is rather the result of instinctive conviction, than of the proofs
derived from philosophy; for as to them, he is, as on other points,
uncertain and wavering. He disbelieved the popular mythical
religion; but, uncertain as to what was the truth, he would not have
that disturbed which he looked upon as a political engine.[853] Amidst
the doubtful and conflicting reasons, respecting the human soul and
man’s eternal destiny, there is no doubt that, although he finds no
satisfactory proof, he is a believer in immortality.[854] It is
unnecessary to pursue the subject of his philosophical creed any
further, because it is not a system, but only a collection of precepts,
not of investigations. Its materials are borrowed, its illustrations
alone novel. But, nevertheless, the study of Cicero’s philosophical
works is invaluable, in order to understand the minds of those who
came after him. It must not be forgotten, that not only all Roman
philosophy after his time, but a great part of that of the middle ages,
was Greek philosophy filtered through Latin, and mainly founded on
that of Cicero. Cicero’s works on speculative philosophy generally
consist of—(1.) The Academics, or a history and defence of the belief
of the New Academy. (2.) The De Finibus Bonorum et Malorum,
dialogues on the supreme good, the end of all moral action. (3.) The
Tusculanæ Disputationes, containing five independent treatises on
the fear of death, the endurance of pain, the power of wisdom over
sorrow, the morbid passions, the relation of virtue to happiness. In
these treatises Stoicism predominates, although opinions are
adduced from the whole range of Greek philosophy. (4.) Paradoxa,
in which the six celebrated Stoical paradoxes are touched upon in a
light and amusing manner. (5.) A dialogue in praise of philosophy,
named after Hortensius. (6.) Translations of the Timæus and
Protagoras of Plato. Of these last three treatises only a few fragments
remain.
His moral philosophy comprehends—(1.) The De Officiis, a Stoical
treatise on moral obligations, addressed to his son Marcus, at that
time a student at Athens. (2.) The unequalled little essays on
Friendship and Old Age. A few words also are preserved of two books
on Glory, addressed to Atticus; and one which he wrote on the
Alleviation of Grief when bereaved of his beloved daughter.[855] He
left one theological work in three parts: the first part is on the
“Nature of the Gods;” the second on the “Science of Divination;” the
third on “Fate,” of which an inconsiderable fragment is extant. His
office of augur probably suggested to him the composition of these
treatises.
His political works are two in number—the De Republica[856] and
De Legibus; both are imperfect. The remains of the former are only
fragmentary; of the latter, three out of six books are extant, and those
not entire. Nevertheless, sufficient of both remains to enable us to
form some estimate of their philosophical character. Although he
does not profess originality, but confesses that they are imitations of
the two treatises of Plato, which bear the same name, still they are
more inductive than any of his other treatises. His purpose is, like
that of Plato, to give in the one an ideal republic, and in the other a
sketch of a model legislation; but the novelty of the treatment
consists in their principles being derived from the Roman
constitution and the Roman laws.
The questions which he proposes to answer are, what is the best
government and the best code: but the limits within which he
confines himself are the institutions of his country. In the Republic
he first discusses, like the Greek philosophers, the merits and
demerits of the three pure forms of government; and upon the whole
decides in favour of monarchy[857] as the best. With Aristotle[858] he
agrees that all the pure forms are liable to degenerate,[859] and comes
to the conclusion that the idea of a perfect polity is a combination of
all three.[860] In order to prove and illustrate his theory, he
investigates, though it must be confessed in a meager and imperfect
manner, the constitutional history of Rome, and discovers the
monarchical element in the consulship, the aristocratic in the senate,
and the popular in the assembly of the people and the tribunitial
authority.
The Romans continued jealously to preserve the shadow of their
constitution even after they had surrendered the substance.
Nominally, the titles and offices of the old republic never perished—
the Emperor was in name nothing more than (Imperator) the
commander-in-chief of the armies of the republic, but in him all
power centred: he was absolute, autocratic, the chief of a military
despotism.[861] Cicero, as the treatise De Legibus plainly shows, saw,
with approbation, that this state of things was rapidly coming to
pass; that the people were not fitted to be trusted with liberty, and
yet that they would be contented with its semblance and name.
The method which he pursues, is, firstly, to treat the subject in the
abstract, and to investigate the nature of law; and, secondly, to
propose an ideal code, limited by the principles of Roman
jurisprudence. Thus Cicero’s polity and code were not Utopian—the
models on which they were formed had a real tangible existence. His
was the system of a practical man, as the Roman constitution was
that of a practical people. It was not like Greek liberty, the realization
of one single idea; it was like that of England, the growth of ages, the
development of a long train of circumstances, and expedients, and
experiments, and emergencies. Cicero prudently acquiesced in the
ruin of liberty as a stern necessity; but he evidently thought that
Rome had attained the zenith of its national greatness immediately
before the agitations of the Gracchi.
Both these works are written in the engaging form of dialogues. In
the one, Scipio Æmilianus, Lælius, Scævola, and others, meet
together in the Latin holidays (Feriæ Latinæ), and discuss the
question of government. In the other, the writer himself, with his
brother Quintus and Atticus, converse on jurisprudence whilst they
saunter on a little islet near Arpinum at the confluence of the Liris
and Fibrena.
We must, lastly, contemplate Cicero as a correspondent. This
intercourse of congenial minds separated from one another, and
induced by the force of circumstances to digest and arrange their
thoughts in their communication, forms one of the most delightful
and interesting, and at the same time one of the most characteristic,
portions of Roman literature. A Roman thought that whenever he
put pen to paper it was his duty, to a certain extent, to avoid
carelessness and offences against good taste, and to bestow upon his
friend some portion of that elaborate attention which, as an author,
he would devote to the public eye. In fact the letter-writer was almost
addressing the same persons as the author; for the latter wrote for
the approbation of his friends, the circle of intimates in which he
lived: the approbation of the public was a secondary object. The
Greeks were not writers of letters: the few which we possess were
mere written messages, containing such necessary information as the
interruption of intercourse demanded. There was no interchange of
hopes and fears, thoughts, sentiments, and feelings.
The extent of Cicero’s correspondence is almost incredible: even
those epistles which remain form a very voluminous collection—
more than eight hundred are extant. The letters to his friends and
acquaintances (ad Familiares) occupy sixteen books; those to Atticus
sixteen more; and we have besides three books of letters to Quintus,
and one to Brutus; but the authenticity of this last collection is
somewhat doubtful. It is quite clear that none of them were intended
for publication, as those of Pliny and Seneca were. They are elegant
without stiffness, the natural outpourings of a mind which could not
give birth to an ungraceful idea. When speaking of the perilous and
critical politics of the day, more or less restraint and reserve are
apparent, according to the intimacy with the person whom he is
addressing, but no attempt at pompous display. His style is so simple
that the reader forgets that Cicero ever wrote or delivered an oration.
There is the eloquence of the heart, not of the rhetoric school. Every
subject is touched upon which could interest the statesman, the man
of letters, the admirer of the fine arts, or the man of the world. The
writer reveals in them his own motives, his secret springs of actions,
his loves, his hatreds, his strength, his weakness. They extend over
more than a quarter of a century, the most interesting period of his
own life, and one of the most critical in the history of his country.
The letters to Quintus are those of an elder brother to one who stood
in great need of good advice. Although Quintus was not deserving of
his brother’s affection, M. Cicero was warmly attached to him, and
took an interest in his welfare. Quintus was proprætor of Asia, and
not fitted for the office; and Cicero was not sparing in his
admonitions, though he offered them with kindness and delicacy.
The details of his family concerns form not the least interesting
portion of this correspondence. There is, as might be expected, more
reserve in the letters ad Familiares than in those addressed to
Atticus. They are written to a variety of correspondents, of every
shade and complexion of opinions, many of them mere
acquaintances, not intimate friends; but whilst, for this reason, less
historically valuable, they are the most pleasing of the collection, on
account of the exquisite elegance of their style. They are models of
pure Latinity. In the letters to Atticus, on the other hand, he lays bare
the secrets of his heart; he trusts his life in his hands; he is not only
his friend but his confidant, his second self. Were it not for the letters
of Cicero, we should have had but a superficial knowledge of this
period of Roman history, as well as of the inner life of Roman society.
An elegant poetic compliment paid to Cicero by Laurea Tullius, one
of his freedmen, has been preserved by Pliny.[862] The subject of it is
a medicinal spring in the neighbourhood of the Academy.
Quo tua Romanæ vindex clarissime linguæ
Silva loco melius surgere jussa viret
Atque Academiæ celebratam nomine villam
Nunc reparat cultu sub potiore Vetus:
Hic etiam adparent lymphæ non ante repertæ
Languida quæ infuso lumina rore levant.
Nimirum locus ipse sui Ciceronis honori
Hoc dedit hac fontes cum patefecit opes
Ut quoniam totum legitur sine fine per orbem
Sint plures oculis quæ medeantur, aquæ.
Father of eloquence in Rome,
The groves that once pertained to thee
Now with a fresher verdure bloom
Around thy famed Academy.
Vetus at length this favoured seat
Hath with a tasteful care restored;
And newly at thy loved retreat
A gushing fount its stream has poured.
These waters cure an aching sight;
And thus the spring that bursts to view
Through future ages shall requite
The fame this spot from Tully drew. Elton.
The correspondents of Cicero included a number of eminent men.
Atticus was the least interesting, for his politic caution rendered him
unstable and insincere; but there was Cassius the tyrannicide; the
Stoical Cato of Utica; Cæcina, the warm partisan of Pompey; the
orator Cælius Rufus; Hirtius and Oppius, the literary friends of
Cæsar; Lucceius the historian; Matius the mimiambic poet; and that
patron of arts and letters,[863] C. Asinius Pollio.
Pollio was a scion of a distinguished house, and was born at Rome
B. C. 76.[864] Even as a youth he was distinguished for wit and
sprightliness;[865] and at the age of twenty-two was the prosecutor of
C. Cato. He was with Cæsar at the Rubicon, at Pharsalia, in Africa,
and in Spain; and was finally intrusted with the conduct of the war in
that province against Sextus Pompey. On the establishment of the
second triumvirate, Pollio, after some hesitation, sent in his adhesion;
and Antony intrusted him with the administration of Gallia
Transpadana, including the allotment of the confiscated lands among
the veteran soldiers. He thus had opportunity of protecting Virgil
and saving his property. In B. C. 40, Octavian and Antony were
reconciled at Brundisium by his mediation. A successful campaign in
Illyria concluded his military career with the glories of a triumph,[866]
and he then retired from public life to his villa at Tusculum, and
devoted himself to study. He enjoyed life to the last, and died in his
eightieth year. He left three children, one of whom, Asinius
Gallus,[867] wrote a comparison between his father and Cicero, which
was answered by the Emperor Claudius.[868]
In oratory, poetry, and history, Pollio enjoyed a high reputation
among contemporary critics, and yet none of his works have
survived. The solution of this difficulty may, perhaps, be found in the
following circumstances:—1. His patronage of literary men rendered
him popular, and drew from the critics a somewhat partial verdict.
His kindness caused Horace to extol[869] him, and Virgil to address to
him his most remarkable eclogue.[870] 2. His taste was formed before
the new literary school commenced. He had always a profound
admiration for the old writers, and frequently quoted them. His style
probably appeared antiquated and pedantic, and, therefore, never
became generally popular. A later writer[871] says, that he was so
harsh and dry as to appear to have reproduced the style of Attius and
Pacuvius, not only in his tragedies, but also in his orations.
Quintilian observes,[872] that he seemed to belong to the
pre-Ciceronian period. Niebuhr, who could only form his opinion upon
the slight fragments preserved by Seneca, for the three letters in
Cicero’s collection[873] are only despatches, affirms that he seems to
stand between two distinct generations,[874] namely, the literary
periods of Cicero and Virgil. His great work was a history of the civil
wars, in seventeen books. He pretended to be a critic, but his
criticism was fastidious and somewhat ill-natured. He found
blemishes in Cicero, inaccuracies in Cæsar, pedantry in Sallust, and
provincialism (Patavinitas) in Livy. The correctness of his judgment
respecting the charming narratives of the great historian has been
assumed from generation to generation, yet no one can discover in
what this Patavinity consists. It was easier to find fault than to write
correctly; for, whilst all the labours of the critic have perished,
Cicero, Cæsar, Sallust, and Livy are immortal. Vehemence and
passion developed his character.

Curating Research Data Volume One Practical Strategies for Your Digital Repository 1st Edition Lisa R Johnston

Association of College and Research Libraries
A division of the American Library Association
Chicago, Illinois 2017

Curating Research Data
Volume One: Practical Strategies for Your Digital Repository
edited by Lisa R. Johnston

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences–Permanence of Paper for Printed Library Materials, ANSI Z39.48-1992. ∞

Cataloging-in-Publication data is on file with the Library of Congress

Copyright ©2017 by the Association of College and Research Libraries. All rights reserved except those which may be granted by Sections 107 and 108 of the Copyright Revision Act of 1976.

Printed in the United States of America.

21 20 19 18 17 5 4 3 2 1

Cover image Copyright: kentoh / 123RF Stock Photo (http://www.123rf.com/profile_kentoh)

Table of Contents

Introduction to Data Curation
Lisa R. Johnston
    Data, Data Repositories, and Data Curation: Our Terminology
    Why We Curate Research Data
    The Challenge of Providing Data Curation Services
    Reuse: the Ultimate Goal of Data Curation?
    Conclusion
    Notes
    Bibliography

Part I. Setting the Stage for Data Curation. Policies, Culture, and Collaboration

Chapter 1. Research and the Changing Nature of Data Repositories
Karen S. Baker and Ruth E. Duerr
    Introduction
    Background
    Changing Support for Data
    Expanding Support for Data in Natural and Social Sciences
    Data Repository Diversity
    Three Concepts at Work
    Data Ecosystem: Growing Interdependence
    Liaison Work and Mediation
    Continuing Design: Standards, Systems, and Models
    Changing Research Needs and New Initiatives
    Final Thoughts
    Notes
    Bibliography

Chapter 2. Institutional, Funder, and Journal Data Policies
Kristin Briney, Abigail Goben, and Lisa Zilinski
    Funding Agency Data Policies
    Institutional Data Policies
    Journal Data Policies
    Navigating the Data Policy Landscape for Curation
    Summary
    Notes
    Bibliography

Chapter 3. Collaborative Research Data Curation Services: A View from Canada
Eugene Barsky, Larry Laliberté, Amber Leahey, and Leanne Trimble
    Canadian Academic Library Involvement in Research Data Management
    Overview of Case Studies
    Local Services: University of Alberta Libraries
    Informal Regional Consortia: University of British Columbia Library
    Formal Regional Consortia: The Ontario Council of University Libraries
    Data Repository Services in Canadian Libraries
    Discovery and Access Platforms
    Long-Term Preservation
    Operational Costs of Data Repository Services
    National Collaboration: Portage
    Goal 1: Portage National Data Preservation Infrastructure
    Goal 2: Portage Network of Expertise
    Future Directions
    Conclusions
    Notes
    Bibliography

Chapter 4. Practices Do Not Make Perfect: Disciplinary Data Sharing and Reuse Practices and Their Implications for Repository Data Curation
Ixchel M. Faniel and Elizabeth Yakel
    Introduction
    Overview and Methodology for the DIPIR Project
    Disciplinary Traditions for Data Sharing and Reuse
    Social Scientists
    Archaeologists
    Zoologists
    Data Reuse and Trust
    Trust Marker: Data Producer
    Trust Marker: Documentation
    Trust Marker: Publications and Prior Reuse Indicators
    Trust Marker: Repository Reputation
    Sources of Additional Support for Data Reuse
    Social Scientists
    Archaeologists
    Zoologists
    Implications for Repository Practice
    Conclusion
    Acknowledgments
    Notes
    Bibliography

Chapter 5. Overlooked and Overrated Data Sharing: Why Some Scientists Are Confused and/or Dismissive
Heidi J. Imker
    Data Sharing in Context
    Overlooked Data Sharing: Article Publication
    Overlooked Data Sharing: Supplemental Material
    Overrated Data Sharing: Unsustained Community Resources
    Overrated Data Sharing: Hyperbolic Arguments
    Conclusions
    Acknowledgments
    Notes
    Bibliography

Part II. Data Curation Services in Action

Chapter 6. Research Data Services Maturity in Academic Libraries
Inna Kouper, Kathleen Fear, Mayu Ishida, Christine Kollen, and Sarah C. Williams
    Introduction
    Research Data and Libraries
    The Current Landscape
    RDS Maturity
    Looking into the Future
    Appendix 6A: Typology of Services and Their Descriptions on Websites
    Notes
    Bibliography

Chapter 7. Extending Data Curation Service Models for Academic Library and Institutional Repositories
Jon Wheeler
    Introduction
    Conceptual Models and Rationale
    Alignment with Existing Roles and Capabilities
    Applications: Requirements and Example Use Cases
    Defining Stakeholder Interactions and Requirements
    Harvesting and Metadata Processing
    Content Curation and Packaging
    Conclusion
    Acknowledgments
    Notes
    Bibliography

Chapter 8. Beyond Cost Recovery: Revenue Models and Practices for Data Repositories in Academia
Karl Nilsen
    Introduction
    From Costs to Revenue
    Data Repository Revenue Models
    Model 1: Public or Consortium
    Model 2: Freemium
    Model 3: Pay-to-Play
    Model 4: Pay-if-You-Can or Pay-if-You-Want
    Model 5: Grants
    Model 6: Outside-Data
    Common Challenges Associated with Revenue Practices
    Conclusion
    Notes
    Bibliography

Chapter 9. Current Outreach and Marketing Practices for Research Data Repositories
Katherine J. Gerwig
    The Survey
    The Interviews
    Measuring the Success of Repository Promotions
    Successful Promotional Techniques
    Unsuccessful Promotional Techniques
    Target Audiences
    Challenges to Increasing Awareness
    Differences in Promoting the Institutional Repository and the Data Repository
    Looking for Inspiration
    Discussion
    Conclusion
    Promotional Examples for Inspiration
    Acknowledgments
    Appendix 9A: Data Repository Promotional Practices – Initial Google Survey
    Notes
    Bibliography

Part III. Preparing Data for the Future. Ethical and Appropriate Reuse of Data

Chapter 10. Open Exit: Reaching the End of the Data Life Cycle
Andrea Ogier, Natsuko Nicholls, and Ryan Speer
    Introduction
    Comparative Exploration
    “End of Life Cycle” Terminology
    Scope
    Authority
    Appraisal Criteria
    Resources (Human, Financial, and Spatial)
    Discussion
    University Records and Information Management
    Library Collections
    Data Curation
    Conclusion
    Notes
    Bibliography

Chapter 11. The Current State of Meta-Repositories for Data
Cynthia R. Hudson Vitale
    Introduction
    Community Initiatives and Solutions to Support Meta-Repositories of Data
    Methods
    Results
    Content
    Functionality
    Metadata
    Discussion
    Conclusion
    Notes
    Bibliography

Chapter 12. Curation of Scientific Data at Risk of Loss: Data Rescue and Dissemination
Robert R. Downs and Robert S. Chen
    Benefits of Data Rescue
    Challenges of Data Rescue for Repositories
    Repository Considerations for Data Rescue
    Rescue of the Millennium Ecosystem Assessment (MA) Data
    Dissemination of the Millennium Ecosystem Assessment (MA) Data
    Lessons Learned
    Discussion and Conclusion
    Acknowledgments
    Notes
    Bibliography

Contributor Biographies
    Editor Biography
    Author Biographies

INTRODUCTION TO VOLUME ONE

Introduction to Data Curation
Lisa R. Johnston

As varied as they can be rare and precious, data are becoming the proverbial coin of the digital realm: a research commodity that might purchase reputation credit in a disciplinary culture of data sharing or buy transparency when faced with funding agency mandates or publisher scrutiny. Unlike most monetary systems, however, digital data can flow in all too great abundance. Not only does this currency actually “grow” on trees, but it comes from animals, books, thoughts, and each of us! And that is what makes data curation so essential. The abundance of digital research data challenges library and information science professionals to harness this flow of information streaming from research discovery and scholarly pursuit and preserve the unique evidence for future use. Our expertise as curators can help ensure the resiliency of digital data, and the information it represents, by addressing how the meaning, integrity, and provenance of digital data generated by researchers today will be captured and conveyed to future researchers over time.

The focus of Curating Research Data, Volume One: Practical Strategies for Your Digital Repository and the companion Volume Two: A Handbook of Current Practice is to present those tasked with long-term stewardship of digital research data a blueprint for how to curate data for eventual reuse. There are many motivations for storing and preserving data, but the ultimate goal of reuse by others will be a theme for all that follows. Following a brief overview of the terminology used in the two volumes, this introduction will explore the external motivations that impact why we develop data curation services and the driving forces behind why researchers share their data, including federal data management requirements, publisher policies for data sharing, and an overall sea change of disciplinary expectations for digital data exchange.
Next, this chapter will dive into some of the challenges that practitioners in the library and archival fields face when curating digital research data as well as some emerging solutions. In closing we will explore the sea change stemming from data reuse, from the disruptive effects that data transparency and the reproducibility movement have had on the scholarly communication life cycle to the potentially democratizing effect of digital data availability worldwide.

Data, Data Repositories, and Data Curation: Our Terminology

Data is an evolving term. At its core, data can be any information that is factual and can be analyzed. Data is “information in numerical form that can be digitally transmitted or processed.” But in the research setting, data can be more abstract and consist of any information object (numerical or otherwise).[1] For information science professionals, the term ‘research data’ has been recently defined as:

    “data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or artistic activity, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results…. Research data may be experimental, observational, operational, data from a third party, from the public sector, monitoring data, processed data, or repurposed data.”

Data are defined in the Digital Curation Centre (DCC) Curation Lifecycle Model as “any information in the binary digital form” and are treated there in the broad sense of any digital information.[3] Harvey describes the breadth of data as encompassing all things digital, based on UNESCO’s Guidelines for the Preservation of Digital Heritage, and takes into account the more subtle nuances of NSF’s description of “scientific data” to create a list of data objects that includes:

• Data sets: Observational, computational, simulated, or otherwise recorded output
• Digital collections: A grouping of digital objects, such as a photo archive or a vast text-based library of digitized books, can be interpreted as one data set
• Learning objects: Videos, digital online tutorials
• Multimedia: Recordings of film, music, and performance art
• Software: Applications including the code and documentation files[4]

Although often associated primarily with the sciences, data can be found in any discipline and in many forms.[5] Data may be raw (e.g., numbers collected by an instrument), aggregated from multiple sources, or the product of a model, simulation, or visualization (e.g., a graphic or video). Digital humanities data might include digitized or born-digital texts and monographs, digital image libraries, and 3D models, such as those used for historic reconstruction of ancient or mythological sites.[6] Social scientists produce large quantities of data, including survey data and observational data, such as complex human activity and interactions captured via sensors or video.[7] Outside of research, the business, industry, and commerce sectors produce “big data” that is used to better understand research questions about human behavior, and as a result a growing (and sometimes nefarious) economy of selling the transactional data derived from business has emerged.[8]

With the explosion of digital data produced by modern research or recorded through our general day-to-day activity, digital data repositories are storing vast amounts of information. Data repositories preserve information “by taking ownership of the records, ensuring that they are understandable to the accessing community, and managing them so as to preserve their information content and Authenticity.”[9] The co-authors of the “Key Components of Data Publishing” report use the practitioner-based Research Data Alliance (RDA) definitions developed by the Data Foundations and Terminology Working Group and Research Data Canada’s Glossary of Terms and Definitions to define digital repositories as:

    A repository (also referred to as a data repository or digital data repository) is a searchable and queryable interfacing entity that is able to store, manage, maintain and curate Data/Digital Objects.
    A repository is a managed location (destination, directory or ‘bucket’) where digital data objects are registered, permanently stored, made accessible and retrievable, and curated. Repositories preserve, manage, and provide access to many types of digital material in a variety of formats. Materials in online repositories are curated to enable search, discovery, and reuse. There must be sufficient control for the digital material to be authentic, reliable, accessible and usable on a continuing basis.[10]

Additionally, in 2005 the National Science Board anticipated the need for data repositories, stating that:

    It is exceedingly rare that fundamentally new approaches to research and education arise. Information technology has ushered in such a fundamental change. Digital data collections are at the heart of this change. They enable analysis at unprecedented levels of accuracy and sophistication and provide novel insights through innovative information integration. Through their very size and complexity, such digital collections provide new phenomena for study. At the same time, such collections are a powerful force for inclusion, removing barriers to participation at all ages and levels of education.[11]

Simply put: data includes a wide range of information, and data repositories retain this information for reuse. Therefore our challenge as data curators is to apply the archival principles of library and information sciences to a wide variety of complex data objects from all disciplines and prepare them for ingest, access, and long-term preservation within an environment (such as a data repository) that facilitates discovery and access while not diminishing their context, authenticity, and value. No short order.

As data curators we effectively become the first users of the data. In doing so we may review the various aspects of the data (such as arrangement, completeness, clarity, and quality), identify any reuse issues early on, and work with the data author to correct these issues. This concept is very important considering the long-term burden of ingesting and storing research data in our repositories. We need to first verify that those data can be understood and do our best to optimize them for reuse. Otherwise, our data repository can still do all of the things listed in the RDA definition above, the only difference being that the data might not be usable. It is the variety and complexity of data, and its context, that make it much more difficult to preserve so that others might make use of it.
Therefore our definition of data curation must also include verifying that all of the essential metadata and supplementary information, describing what the data is and how to understand it, are curated as well. For example, ensuring that supplementary files to the dataset, like codebooks, data dictionaries, schemas, and readme files, provide the additional documentation needed to understand the file contents is a key step in the data curation process. The optimization aspect can be found in the “adds value” statement of the University of Illinois’ School of Information Sciences Data Curation Specialization definition for data curation as
the active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, management, preservation, and representation.12
However, these concepts also apply to any digital object (for example, a book or an article), not necessarily just data, and therefore data curation is understood as a subset of digital curation which covers all types of digital information.13 In short, the goal of data curation is to prepare research outputs in ways that make them useful beyond their original purpose, ensure completeness, and facilitate long-term citability. Volume One of Curating Research Data explores the variety of reasons, motivations, and drivers for why data curation services are needed in the context of academic and disciplinary data repository efforts. The following twelve chapters, divided into three parts, take an in-depth look at the complex practice of data curation as it emerges around us. Part I sets the stage for data curation by describing current policies, data sharing cultures, and collaborative efforts underway that impact potential services. Part II brings several key issues, such as cost recovery and marketing strategy, into focus for practitioners when considering how to put data curation services into action. Finally, Part III describes the full life cycle of data by examining the ethical and practical reuse issues that data curation practitioners must consider as we strive to prepare data for the future.
Why We Curate Research Data
In Part I, Setting the Stage for Data Curation: Policies, Culture and Collaboration, we explore the factors that influence our actions to provide data curation services for research data. Some factors include incentives, both positive and negative, from the funding bodies and the scholarly publishing entities. Other factors come directly from the research communities themselves, some of which are demanding greater transparency in research. 
These motivations can sometimes be indirect or even at odds with a researcher’s goals.14 Overall the policies, culture, and collaborations involved with data curation provide us with an interesting canvas with which to begin our work. One driving force that leads library and information science practitioners to provide data curation services is the inherent fact that digital data are more easily shared. Data have always held value beyond their original purpose, and today, digital data can travel and reach worldwide audiences at unprecedented speeds with incremental costs. A 1989 National Academies of Sciences panel described the impact of information technology on research in the sciences, engineering, and clinical research as improving collaboration among researchers “more widely and efficiently” by reducing “the constraints of speed, cost, and distance from the researcher.”15 And incentives to collaborate across institutional or disciplinary boundaries have boomed. Rates of co-authorship are increasing not only in the sciences but across disciplines that were traditionally solo-researcher focused such as the social sciences.16 In short, digital data presents researchers with many new
ways of working collaboratively across institutional and geographic boundaries. In Chapter 1, “Research and the Changing Nature of Data Repositories,” Karen S. Baker and Ruth E. Duerr draw from their experiences working at large scientific data repositories to explore data management and curation in the broader landscape of disciplinary research. They describe how repositories, which initially were designed for highly structured data housed at key disciplinary repositories, have now emerged at the center of a modern ‘data ecosystem’ proliferated by the emerging requirements to openly, and ethically, disseminate research data. Their examples of early data registries and international data organizations (and the various stakeholders involved) paint a complex picture and provide excellent food for thought as our authors ask us to ponder how library data professionals contribute to and coordinate with the broader ecosystem of data repositories. Another significant, and more opaque, driver for data curation services is the set of emerging funding requirements for data sharing. Over the last several years, national funding agencies and political administrations worldwide have developed a growing awareness of the need for public access to the results of government-funded research and the long-term preservation of these unique digital research data sets.17 For example, a key turning point in the US was the February 22, 2013 memorandum18 by the White House Office of Science and Technology Policy (OSTP) directing federal agencies to develop plans to ensure all resulting publications and research data are publicly accessible. 
The memo’s requirements for sharing digital research data in ways that make the data “publicly accessible to search, retrieve, and analyze” suggested that federally funded researchers will soon be faced with many new requirements that:
• Ensure that the data are richly described with machine-actionable metadata
• Ensure that data are complete, self-explanatory, and accurate (quality)
• Protect confidentiality and privacy when making data available (e.g., remove identifiers, virtual data enclaves)
• Account for the long-term access and preservation needs that go beyond the life of a grant
• Identify and/or create trusted digital repositories to steward data over time19
Three years after the OSTP directive, “policies to make data and publications resulting from federally funded research publicly accessible are becoming the norm.”20 Interestingly these efforts for sharing nationally funded research data run parallel to an open data movement for government-authored data. This movement is characterized by the G8 adoption of the “Open Data Charter” in June 2013 and demonstrated by the principles set forth in the US Open Data Action Plan released in 2014.21 And it is not only federal funders that have moved the
needle towards open. Private funders of research, such as the Ford Foundation, the Alfred P. Sloan Foundation, and the Bill & Melinda Gates Foundation, now require that their funded projects release underlying data with some degree of openness.22 For a detailed listing of the current policies of federal agency responses to the OSTP memo, see SPARC Open Data’s resource for Research Funder Data Sharing Policies.23 Complex? Absolutely. Fortunately, Chapter 2, titled “Institutional, Funder, and Journal Data Policies” by Kristin Briney, Abigail Goben, and Lisa D. Zilinski, does an excellent job of describing the current landscape of funder mandates for data as well as other top-down drivers for curation services. For example, in 2009 the National Academies of Sciences put out a call for better standards for data sharing in ways that support reproducibility through the ethical sharing of data along with published research results. Authors of this report included editors of scientific journals that cited the emerging problem of “misguided efforts to clarify results” by distorting, falsifying, or even faking data.24 This trend continues today and sources such as Retraction Watch regularly report examples of publishers responding to data-related issues in publications.25 As a result, many journals have implemented policies to make the underlying data for an article more open to replication and validation. According to several studies such as Fear, Piwowar & Chapman, and Naughton & Kernohan of the Jisc-funded Journal of Research Data policy bank (JoRD) project, journal data sharing requirements come in many forms.26 The latter in particular, after reviewing the data policies of nearly 400 journals, found that half did not have a data sharing policy and of those that did, 76 percent were found to be weakly worded and vague. 
In response the JoRD project developed a model data sharing policy that could be implemented by any organization.27 Some prominent examples of journal data sharing policies include Nature, where “authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.” The PLOS data sharing policy goes one step further to say “Refusal to share data and related metadata and methods in accordance with this policy will be grounds for rejection.”28 Indeed, one such retraction occurred in 2015, albeit in a different journal (Frontiers in Neuroscience), due to an author refusing to share their data.29 Going beyond publisher requirements to simply make data accessible and linked to the article (see for example Elsevier’s platform for linking data in data repositories such as PANGAEA), some publishers have created new journals that provide a venue for “data papers” or the long-form description of a dataset in conjunction with the data release.30 Examples include Springer-Nature’s Scientific Data and Elsevier’s Data in Brief that both launched in 2014. The latter reports “an exponential rise in data articles over the six quarters since the journal came into existence, with approximately 300 publications expected in 2016 Q1.”31 An independent survey of 116 data journals found that
the growth in data papers nearly doubled from 2012 to 2013 and continues to rise at an incredible rate.32 Yet, one of the curious aspects of data journals is that the data are often not provided by the journal but rather “[the publisher does] not consider the publication of data as part of their own mission.”33 For example, Scientific Data suggests a list of recommended data repositories for deposit since “we do not ourselves host data. Instead, we ask authors to submit datasets to an appropriate public data repository.”34 It seems that scholarly communication is still rapidly adjusting to the new norm of data sharing and our data curation services will directly provide authors with the much-needed support. International collaborations providing incentives for data curation services might be key. In 2004, many countries from Europe and others such as Australia, the US, and Canada signed the “Declaration on Access to Research Data from Public Funding” by the Organisation for Economic Co-operation and Development’s (OECD’s) Committee for Scientific and Technological Policy, which set the stage for open access to digital research data resulting from public funding.35 The results stemming from this Declaration have been substantial. 
In the United Kingdom, the seven councils of Research Councils UK (RCUK) and the private funder, the Wellcome Trust, have each established a policy on access to data in the years following the RCUK’s 2011 report on “Common Principles on Data Policy.”36 The European Commission has established a pilot program for data sharing through its Horizon 2020 granting arm.37 And Canada’s three federal granting agencies are moving toward policies for research data such as those explored by Shearer in the comprehensive 2011 “Brief on Open Access to Publications and Research Data.”38 In Chapter 3, “Collaborative Research Data Curation Services: A View from Canada,” Eugene Barsky, Larry Laliberté, Amber Leahey, and Leanne Trimble provide in-depth case studies from their respective institutions, the University of British Columbia, the University of Alberta, and the Scholars Portal for the Ontario Council of University Libraries. The three case studies are presented in the context of Canada’s overarching national infrastructure initiative, the ambitious Portage network developed by the Canadian Association of Research Libraries (CARL).39 An exciting collaborative project, Portage aims to integrate existing research data repositories within a robust national discovery and preservation infrastructure network for all Canadian research data. Moreover the project will bring together library-based experts in order to share data management consultation services across a broader network. This national effort appears similar to the role that the JISC has played in the UK with its Research Data Management Shared Service Project and, on a much smaller scale for sharing curation staff expertise across institutions, the Data Curation Network project that your editor recently helped launch in the US in 2016.40
In Chapter 4, different disciplinary and cultural norms of data reuse are explored by Ixchel M. Faniel and Elizabeth Yakel, who draw from ethnographic research with archaeologists, quantitative social scientists, and zoologists in “Practices Do Not Make Perfect: Disciplinary Data Sharing and Reuse Practices and Their Implications for Repository Data Curation.” To synthesize disciplinary data sharing and reuse findings the authors partner with three repositories (the Inter-university Consortium for Political and Social Research (ICPSR), Open Context, and the University of Michigan Museum of Zoology (UMMZ)) to obtain data reuse stories and even download statistics. Their study reveals the dependencies between how data are shared and how data are reused with emphasis on the differences in disciplines, and explores the interesting elements of “trust” in the data exchanged. In Chapter 5, “Overlooked and Overrated Data Sharing: Why So Many Scientists are Confused and/or Dismissive,” Heidi J. Imker aptly shifts our attention away from scientists not sharing, or wrongly sharing, their data to how often scientists do share their data, and have historically been sharing data long before public access requirements. This chapter presents the idea that traditional methods of data sharing, though not generally meant for preservation purposes, are still valid forms of sharing within the discipline. 
For example, sharing data via publication in the traditional journal article is still very common, though much of this data is often fixed in graphs or charts found in the body of the article and therefore impractical or labor-intensive to reuse.41 As one blogger quips, “‘Send me your data - pdf is fine,’ said no one ever.”42 Similarly, lengthy data tables historically incurred costly page fees and data supplements to journal articles have been criticized as unstable and “far harder to locate than [data] in public repositories.”43 Other widespread data sharing approaches, such as posting data to a project website or sharing data upon request, may not be sustainable for the long term. For example, research has shown that ‘available by request’ does not work and furthermore that the availability of data declines rapidly with age.44 Yet, data sharing is still happening and data curation efforts may help mitigate these error-prone approaches. Imker’s exploration of these “overlooked” methods will help data curators and librarians providing data services become better educated in the larger picture of scholarly data exchange.
The Challenge of Providing Data Curation Services
In Part II, Data Curation Services in Action, we explore several examples of institutions already providing data curation services, review their service offerings, understand their technology infrastructure, and explore some of their challenging constraints, such as identifying appropriate cost-recovery models and rolling out promotion and marketing strategies that resonate with end users. In addition to the chapters described here, there are many practical examples to be found in this book’s companion volume, Curating Research Data, Volume Two: A Handbook of Current Practice, which collects 30 practitioner case studies from institutional, disciplinary, and national data repositories in an eight-step workflow for data curation, from receiving to reuse. Putting data curation into context within the broader range of research data management services is essential as libraries shift toward progressively more responsible data stewardship roles at their institutions (see Figure Intro.1). For example, Witt describes the “information bottleneck” as a place where libraries can use data curation to help push valuable data sets beyond the laboratory and out to the broader research community.45 Choudhury paints a rather bleak picture of the state of institutional repositories in 2008 and recommends data curation as a place of redemption for libraries in the larger scholarly communication landscape.46 In Chapter 6, authors Inna Kouper, Kathleen Fear, Mayu Ishida, Christine Kollen, and Sarah C. Williams address how far we have come with an empirical analysis of research data services provided by the Association of Research Libraries (ARL) in “Research Data Services Maturity in Academic Libraries.” As the title suggests, the results of their study of current ARL service offerings are categorized by frequency into topographical levels and present a vocabulary for describing research data services (RDS). 
They find that basic services, such as data management plan consultations and data management workshops, were practiced in over 50 percent of their sample, while intermediate services, such as data deposit into repositories and data preservation, were only found in 15 percent to 50 percent of the group. Finally, the concept of data curation is found in less than 15 percent of the sample and labeled as an advanced service, which includes other services such as data and researcher IDs and data analysis. Their discussion of how these RDS concepts interrelate provides an excellent snapshot of the evolving vernacular, if not actual nature, of our field. For example, the concept of data curation was still an emerging topic within the library science, archival, and information sciences disciplines just a few years ago and in fact very few academic libraries were successfully offering data curation services at all according to a study in 2011.47 The RDS maturity model presents an opportunity to self-measure the actions our library takes in the broad arena of data services and allows us to strive to expand them to the next level.
[Figure Intro.1 diagram labels: Data Curation; Data Repositories; Research Data Services]
FIGURE INTRO.1 Data curation as a subset of research data services. Note that data curation services may support or overlap with local data repository services, or curation services may be provided for data that are deposited elsewhere, such as disciplinary repositories or non-accessible (dark) storage.
The next chapter in this volume provides an excellent case study in one academic library’s ascendance from basic to advanced data services. In Chapter 7, Jon Wheeler describes how academic library-run institutional repositories might be adapted to provide complementary platforms for data publication alongside disciplinary repositories in “Extending Data Curation Service Models for Academic Library and Institutional Repositories.” Here the conflation between data sharing and data preservation comes to a head. While academic researchers may deposit their data into disciplinary repositories to achieve one, they may not always be gaining the other. Wheeler presents data repository mirroring as one way for academic libraries to complement successful disciplinary data repository efforts and goes on to provide several illustrative examples of “data mirroring” efforts underway with the University of New Mexico (UNM) Libraries. This example is unique in connecting an institutional repository to established disciplinary data repositories and coordinating their efforts. Disciplinary repositories such as FlyBase, PLEXdb, and the Cambridge Structural
Database present the collective data outputs of a sub-topic in publicly accessible platforms designed to allow for widespread reuse of the data.48 Within the context of disciplinary data repositories, several repository best practices for data curation emerge. For example, DataONE continues to educate the field by hosting workshops and publishing guides on research data management and software tools.49 Their in-depth resources help researchers better prepare their data for eventual deposit into the DataONE connected archives.50 Similarly, detailed data curation instructions for oceanographic researchers are presented in the Ocean Data Publication Cookbook, which describes step-by-step instructions for curating disciplinary data from their field and applying digital object identifiers (DOIs) as a central component to the curation approach.51 Greater collaboration between the stakeholders of disciplinary and institutional data repositories would enhance our collective understanding of data curation best practices. In one area in particular there are several lessons to be learned: financial cost models for sustaining data repositories. Disciplinary data repositories have been grappling for decades with how to maintain financial support beyond their initial start-up phase (often provided in the form of seed or grant funding).52 For example, Ember and colleagues note the dichotomy between the long-term preservation costs of maintaining digital data, often indefinitely, and the periodic and uncertain grant support on which these repositories must rely.53 Their white paper, resulting from a 2013 summit with representatives from twenty-two disciplinary data repositories, evaluated several funding models and found both advantages and disadvantages. Their goals of meeting long-term sustainability, open access, and potential for equity by all depositors were not met by a single approach. 
For example, charging user fees to access data in the repository would limit open access, while depositor-incurred submission fees would lower equity for individual depositors not backed by generous grants or institutional open access funds. Only one approach (not currently in place in the US but found in other nations) appeared to provide a good balance: the infrastructure model. This was described as, “Funding agencies pay for archives directly as a necessary aspect of research infrastructure. The funding model is structured for long-term investment, rather than being tied to three-year grant cycles.”54 Chapter 8 draws from these cost models and many more in “Beyond Cost Recovery: Entrepreneurial Business Models for Data Curation in Academia,” in which Karl Nilsen reviews and compares the popular models for financing data curation efforts and reports on a new business model emerging at the University of Maryland Libraries. One potentially effective way to secure funding for your data repository may be to demonstrate positive use trends: both in data curation activities as well as reuse of the data your repository maintains. But the challenge here is determining how best to market and promote services to our intended audiences. In Chapter 9, “Current Outreach and Marketing Practices for Research Data Repositories,” Katherine J. Gerwig from Metropolitan State University provides a mixed
methods approach to understanding the current data repository marketing and outreach strategies employed by over a dozen academic institutions. Based on survey and interview results, Gerwig makes recommendations for those struggling to get the word out about their data curation services. For example, providing library liaisons, who are often embedded within their departmental cultures, with targeted messaging about the services in the form of presentation slides or an elevator speech was shown to be one means of successful outreach. The lessons learned from current outreach efforts also demonstrate how libraries should reframe the data repository and curation efforts around the positive incentives for sharing data, such as advancing knowledge in the field or facilitating reproduction and verification, rather than the sharing requirements themselves.
Reuse: the Ultimate Goal of Data Curation?
Part III, Preparing Data for the Future, explores the outcomes of data curation efforts in numerous ways. If the ultimate goal of data curation is reuse, then how data are reused will inform the development of our services and best practices. But perhaps this is a thankless task? One illustrative quote comes from the introduction to a 2002 technical report, written by astronomer and Microsoft researcher Jim Gray, that aptly demonstrates the potentially uphill battle we face:
Once published, scientific data should remain available forever so that other scientists can reproduce the results and do new science with the data. Data may be used long after the project that gathered it ends. Later users will not implicitly know the details of how the data was gathered and prepared. 
To understand the data, those later users need the metadata: (1) how the instruments were designed and built; (2) when, where, and how the data was gathered; and (3) a careful description of the processing steps that led to the derived data products that are typically used for scientific data analysis. It’s fine to say that scientists should record and preserve all this information, but it is far too laborious and expensive to document everything. The scientist wants to do science, not be a clerk. And besides, who cares? Most data is never looked at again anyway.55
The clarity and examples for types of “metadata” needed for successful data reuse in this example are impressive. Yet the sentiment that most data would not be looked at again does not hold up just over a decade later.
Instead, we are experiencing a dramatic shift in how data are reused, not only to “do new science,” but also because data reuse may increase a paper’s potential research impact, provide greater transparency to the results, and in some cases, can even make or break an individual’s career.56 The research disciplines are often the driving force in the reproducibility (or replicability) movement using data sharing to build greater expectations for rerunning experiments, providing independent confirmations or validation of the research results, and more quickly identifying false findings.57 Again, remembering that digital data are more easily shared, it is not surprising to ask researchers to provide the digital evidence of their findings for validation purposes. Some disciplines have embraced data transparency and provide portals and virtual hubs to share data and discuss results.58 In one instance, national policy has embraced this idea of validation and Irish researchers are subject to external scrutiny when it comes to data presented in papers or captured in lab notebooks.59 Not everyone agrees that data transparency to the extreme is a positive trend. One 2016 editorial in Nature explains: “The progress of research demands transparency. But as scientists work to boost rigor, they risk making science more vulnerable to attacks. Awareness of tactics is paramount.”60 They go on to provide 10 ways to “distinguish scrutiny from harassment.”61 Another controversial take on data reuse issues erupted when the editor-in-chief of The New England Journal of Medicine (NEJM) published a sharply-worded editorial casting the role of data reuser as
…people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited. 
There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as ‘research parasites.’62
A journalist from Forbes magazine drew an interesting comparison by suggesting, “In just four years, it seems, data science has devolved from the ‘sexiest job of the 21st century’ to a community of ‘research parasites,’” where the former linked to the widely cited Harvard Business Review report describing informatics-based jobs as exciting and lucrative career choices.63 But the NEJM editorial, though sensational in some respects, does go on to make the point that researchers don’t want to be scooped, they don’t want to be proven wrong or taken out of context, and they are worried about not getting credit. Another researcher from a completely different field has a similar story. As co-author on a huge data sharing success story, the SnapShot Serengeti project hosted on the
community science driven platform Zooniverse, Kosmala describes some of the pressures faced by early career researchers to publish their results (in the form of traditional publications) and get scholarly credit for their work.64 Data sharing, she argues, though admirable, removes overarching control over the data so that anyone else could use it, with your permission or not. On the other hand, when data are shared with conditions of co-authorship, the loss of control converts itself into an opportunity (even expectation) of collaboration. As data curators we must be keenly aware of these disincentives. Data sharing may be great for end users of data, but it can be not-so-great for the data creators. In addition to researcher fears, there are costs involved with data sharing in terms of time (and occasionally monetary investments), muddy ownership claims at stake, and well, data sharing can just be a “pain in the ass…”65 In short, there is a lack of incentives for researchers to share: few carrots but many sticks. Therefore, an additional role for data curators may be to understand and assist as much as possible in the ethical and appropriate reuse of data. Library and information science professionals so often deal with the end-product in the scholarly communication pipeline, collecting the published finale of research: the papers, monographs, maps, and other well-formatted records of scholarship. Archives and special collections, on the other hand, cover a larger swath of the research process by also collecting the creation and evolution of a work in the form of an edited manuscript, unlabeled photos, and the order in which press clippings were arranged.66 Research data curation may fall somewhere in between and be viewed as one way to bridge that gap of creation and final product by working with data creators to prepare their data for eventual publication, context and all. 
In Chapter 10, “Open Exit: Reaching the End of the Data Lifecycle,” Andrea Ogier, Natsuko Nicholls, and Ryan Speer argue that data retention should be considered iteratively throughout the data life cycle and that knowledge gained from university records and information management, and library collection management can be applied to data curation efforts in order to assist with planned data obsolescence. Rather than assume reuse potential for all data, our authors appropriately ask us to define better appraisal criteria to make critical selections for which data to retain and which data to dispose of for reasons that incorporate the assessment of liability, risk, or resource cost over potential value. But what happens once data have fallen into obsolescence? Looking in the opposite direction, Chapter 12 by Robert R. Downs and Robert S. Chen asks: when should data be resurrected? They describe the data curation actions that might be taken in order to protect data that are experiencing less than ideal conditions in “Curation of Scientific Data at Risk of Loss: Data Rescue and Dissemination.” Their data rescue examples involve a data set that was originally housed in the National Biological Information Infrastructure (NBII) program of the United States Geological Survey (USGS). This repository is a
favorite among instructors of data information literacy due to its abrupt closure in response to federal budget cuts.67 The digital archive was permanently taken offline in January 2012. Here our authors provide not only practical experiences from a data rescue effort but general advice on the benefits and challenges of these attempts. Their balanced recommendations to identify critical and timely documentation rather than strive for completeness are underscored by the relevant case study presented with the NBII dataset. Particularly notable are the intellectual property and ownership issues encountered with orphaned data as time passes, and their recommendation for data curators to apply metadata now, even at the most basic level, in order to help future curators pull out the details of the dataset in the possibly all-too-near future.

Finally, I’ll close this introduction to Volume One with a focus on issues of worldwide access and discovery of data. This is an essential component of data curation, and data discovery can be a key factor for prompting worldwide inclusivity in research. The 2005 NSB report projects that “Long-lived digital data collections are powerful catalysts for progress and for democratization of science and education.”68 Yet in 2015, Soranno et al. argued that the inclusivity of data sharing is neither well discussed nor yet fully realized:

…a critical shift that is happening in both society and the environmental science community that makes data sharing not just good but ethically obligatory. This is a shift toward the ethical value of promoting inclusivity within and beyond science. An essential element of a truly inclusionary and democratic approach to science is to share data through publicly accessible data sets.69

Why?
Because open data benefits science, enhances social and economic development, and, according to one Australian study, can even be significantly profitable.70

In Chapter 11, “The Current State of Linked Data Repositories: A Comparative Analysis,” Cynthia R. Hudson Vitale assesses the impact of the complexity of data sharing options available to researchers and observes that as a result data may be scattered across various institutional, disciplinary, or general repositories. One possible solution is open and federated “meta-repositories” that search across the collective holdings of disparate data repositories. Lynch described this transition of data sharing practices as going from “journals [that] offer to accept it as ‘supplementary materials’ that accompany the article” to a future of repositories of machine-readable digital data that can be “data mined” for the generation of new knowledge.71 Hudson Vitale explores how this far end of the spectrum is emerging and compares thirteen linked data repositories, their underlying missions, and their technical approaches to federating data search and discovery using a website analysis across fifteen variables. The future of data reuse rests on the discoverability of data to potential reusers, and this chapter demonstrates that we have much to accomplish to make data repositories more interoperable.

Conclusion

Digital data is ubiquitous and rapidly reshaping how scholarship progresses now and into the future. The abundant, and sometimes chaotic, flow of data worldwide enables a new form of collaborative exploration and discovery that minimizes international and interdisciplinary barriers, connects researchers with shared goals, and accelerates the rate of scientific understanding. Just take a moment to consider the vast body of digital information housed in openly accessible data repositories across the world representing unique information products such as the mysterious and brief flashes of high-energy gamma-ray bursts originating from the far outer reaches of our universe, the Alexandrian feat that is HathiTrust bringing together into a single corpus of searchable text everything from Shakespearean plays to song lyrics by The Beatles, the echoes of evolutionary history surfacing from the endless strings of human genetic DNA, and the daily snapshot of social norms and human values which can emerge from the deluge of human-machine interactions generated across the social web.72 In 2003, Hey and Trefethen anticipated that “new types of digital libraries for scientific data with the same sort of management services as conventional digital libraries” would emerge in response to our changing world.73 That time is now. These are extraordinary times for data curators, and how we rise to the challenge of providing new services and respond to the shifting patterns of data sharing and data reuse has the potential to shape and define our profession into the future.

Notes

1. Merriam-Webster’s Learner’s Dictionary, “Data,” accessed August 6, 2016, http://www.merriam-webster.com/dictionary/data.
2. Definition from footnote 1 on page 2 in the article by Claire C. Austin, Theodora Bloom, Sünje Dallmeier-Tiessen, Varsha K. Khodiyar, Fiona Murphy, Amy Nurnberger, Lisa Raymond, Martina Stockhause, Jonathan Tedds, Mary Vardigan, and Angus Whyte, “Key components of data publishing: Using current best practices to develop a reference model for data publishing,” International Journal on Digital Libraries, June 2016, doi:10.1007/s00799-016-0178-2.
3. See the Digital Curation Center (DCC), “DCC Curation Lifecycle Model,” accessed August 6, 2016, http://www.dcc.ac.uk/resources/curation-lifecycle-model; for the history and development of this model see Sarah Higgins, “The DCC Curation Lifecycle Model,” International Journal of Digital Curation 3, no. 1 (2008): 134–40, doi:10.2218/ijdc.v3i1.48, where data are defined on p137.
4. Ross Harvey, “Chapter 4. Defining Data,” Digital Curation: A How-To-Do-It Manual, No. 025.06. (Chicago: Neal-Schuman Publishers, 2010), http://www.alastore.ala.org/pdf/digital_curation.pdf.
5. The US federal government, for example, defines research data in its OMB Circular A-110 as “recorded factual material commonly accepted in the scientific community as necessary to validate research findings”; see the full notice at Office of Management and Budget, “CIRCULAR A-110,” revised November 19, 1993, further amended September 20, 1999, https://www.whitehouse.gov/omb/circulars_a110.
6. See for example the PublicVR project, accessed August 6, 2016, http://publicvr.org/index.html, which provides virtual reality 3D environments for places such as the Grand Theater in the Roman city of Pompeii as it may have looked prior to the devastating volcanic eruption in 79 AD.
7. See for example the eMotion lab at the University of Notre Dame that uses “advanced video capture equipment to track posture, gesture, and facial expression during a variety of experimental tasks” at the University of Notre Dame, “About the eMotion and eCognition Lab,” accessed August 6, 2016, http://www3.nd.edu/~emotecog/about.html.
8. The 2015 report by McAfee Labs warns of abundant cyber security challenges such as identity theft, data breaches, and national security risks in Intel Security Group McAfee Labs, “The Hidden Data Economy,” October 15, 2015, http://www.mcafee.com/us/resources/reports/rp-hidden-data-economy.pdf; this Technology Watch report describes techniques to preserve large-scale transactional data derived from business and industry in Sara Day Thomson, “Technology Watch Report 16: Preserving Transactional Data,” Digital Preservation Coalition, May 2, 2016, doi:10.7207/twr16-02.
9. This quote is from page 2-1 of the OAIS Reference Model found in Consultative Committee for Space Data Systems, Audit and Certification of Trustworthy Digital Repositories, Recommended Practice, CCSDS 652.0-M-1, Magenta Book, Issue 1, Washington, DC: CCSDS Secretariat, September 2011, http://public.ccsds.org/publications/archive/652x0m1.pdf.
10. Footnote 2 on page 2 of Austin et al., “Key components of data publishing: Using current best practices to develop a reference model for data publishing.” Reference in the quote is to CASRAI, “Category:Research Data Domain,” The CASRAI Dictionary, last modified August 18, 2015, http://dictionary.casrai.org/Category:Research_Data_Domain; the RDA Data Foundations and Terminology working group has a growing dictionary of data related terms that is searchable at Research Data Alliance Data Foundation and Terminology Interest Group, “Term Definition Tool (TeD-T),” last modified March 1, 2016, http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page.
11. National Science Board, “NSB-05-40, Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century,” Summer 2005, National Science Foundation, http://www.nsf.gov/pubs/2005/nsb0540, p1.
12. University of Illinois Urbana-Champaign School of Information Science, “Specialization in Data Curation,” accessed August 4, 2016, http://www.lis.illinois.edu/academics/programs/specializations/data_curation.
13. Committee on Future Career Opportunities and Educational Requirements for Digital Curation; Board on Research Data and Information; Policy and Global Affairs; National Research Council, Preparing the Workforce for Digital Curation (Washington, DC: National Academies Press; April 22, 2015), http://www.nap.edu/catalog.php?record_id=18590.
14. For more in-depth coverage of this topic, read a systematic review of data sharing studies in academia. See Benedikt Fecher, Sascha Friesike, and Marcel Hebing, “What drives academic data sharing?,” PLoS One 10, no. 2 (2015), doi:10.1371/journal.pone.0118053.
15. National Academy of Sciences, National Academy of Engineering, and Institute of Medicine, Information Technology and the Conduct of Research: The User’s View (Washington, DC: The National Academies Press, 1989), doi:10.17226/763, p1.
16. Gary King, “Ensuring the Data-Rich Future of the Social Sciences,” Science 331(6018): 719–721 (2011), doi:10.1126/science.1197872.
17. An overview of these policies is found in Kathleen Shearer, “Comprehensive Brief on Research Data Management Policies,” released April 2015, http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
18. The memo from the White House’s Office of Science and Technology Policy (OSTP) was released as John P. Holdren, “Increasing Access to the Results of Federally Funded Scientific Research,” Memorandum for the Heads of Executive Departments and Agencies, Office of Science and Technology Policy, Executive Office of the President, February 22, 2013, http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
19. Adapted from Inter-university Consortium for Political and Social Research (ICPSR), “Guidelines for OSTP Data Access Plan,” accessed August 6, 2016, http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/ostp.html.
20. Jerry Sheehan, “Increasing Access to the Results of Federally Funded Science,” The White House Blog, posted February 22, 2016, https://www.whitehouse.gov/blog/2016/02/22/increasing-access-results-federally-funded-science.
21. United States Government, “US Open Data Action Plan,” May 9, 2014, https://www.whitehouse.gov/sites/default/files/microsites/ostp/us_open_data_action_plan.pdf.
22. Ford Foundation, “Ford Foundation expands Creative Commons licensing for all grant-funded projects,” February 3, 2015, https://www.fordfoundation.org/the-latest/news/ford-foundation-expands-creative-commons-licensing-for-all-grant-funded-projects; Alfred P. Sloan Foundation, “Grant Application Guidelines,” last modified January 6, 2014, http://www.sloan.org/fileadmin/media/files/application_documents/proposal_guidelines_research_officer_grants.pdf; Bill & Melinda Gates Foundation, “Bill & Melinda Gates Foundation Open Access Policy,” accessed August 6, 2016, http://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy.
23. SPARC Open Data, “Research Funder Data Sharing Policies,” accessed August 5, 2016, http://sparcopen.org/our-work/research-data-sharing-policy-initiative/funder-policies.
24. Institute of Medicine and National Academy of Sciences, Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age (Washington, DC: The National Academies Press, 2009), doi:10.17226/12615, 34.
25. Retraction Watch, “Archive for the ‘data issues’ Category,” accessed August 6, 2016, http://retractionwatch.com/category/by-reason-for-retraction/data-issues.
26. Kathleen Fear, “Building Outreach on Assessment: Researcher Compliance with Journal Policies for Data Sharing,” Bulletin of the American Society for Information Science and Technology 41, no. 6 (2015): 18–21, doi:10.1002/bult.2015.1720410609; Heather A. Piwowar and Wendy W. Chapman, “A Review of Journal Policies for Sharing Research Data,” Nature Precedings, March 20, 2008, hdl:10101/npre.2008.1700.1; Linda Naughton and David Kernohan, “Making Sense of Journal Research Data Policies,” Insights 29, no. 1 (2016), http://doi.org/10.1629/uksg.284.
27. The model is published in Paul Sturges, Marianne Bamkin, Jane H.S. Anders, Bill Hubbard, Azhar Hussain, and Melanie Heeley, “Research Data Sharing: Developing a Stakeholder-Driven Model for Journal Policies,” Journal of the Association for Information Science and Technology, doi:10.1002/asi.23336.
28. Nature, “Availability of Data, Material and Methods,” accessed August 6, 2016, http://www.nature.com/authors/policies/availability.html; PLOS One, “Data Availability,” accessed August 6, 2016, http://journals.plos.org/plosone/s/data-availability.
29. Chelsey Coombs, “Neuroscience Paper Retracted After Colleagues Object to Data Publication,” Retraction Watch, December 31, 2015, http://retractionwatch.com/2015/12/31/neuroscience-paper-retracted-after-colleagues-object-to-data-publication.
30. Elsevier, “Elsevier and the Inter-University Consortium for Political and Social Research (ICPSR) Announce Data Linking,” February 8, 2016, http://www.prnewswire.com/news-releases/elsevier-and-the-inter-university-consortium-for-political-and-social-research-icpsr-announce-data-linking-568022141.html; see the list of data repositories at Elsevier, “Supported Data Repositories,” accessed August 6, 2016, https://www.elsevier.com/?a=57755.
31. Scientific Data homepage, accessed August 6, 2016, http://www.nature.com/sdata; Data in Brief homepage, accessed August 6, 2016, http://www.journals.elsevier.com/data-in-brief; as reported in Tim Austin, “Towards a Digital Infrastructure for Engineering Materials Data,” Materials Discovery (2016), doi:10.1016/j.md.2015.12.003, 2.
32. Leonardo Candela, Donatella Castelli, Paolo Manghi, and Alice Tani, “Data Journals: A Survey,” Journal of the Association for Information Science and Technology 66, no. 9 (2015): 1747–1762, doi:10.1002/asi.23358.
33. Ibid., 1756.
34. Scientific Data, “Recommended Data Repositories,” accessed July 18, 2016, http://www.nature.com/sdata/policies/repositories.
35. The declaration signifies that each country will “Work towards the establishment of access regimes for digital research data from public funding” and with shared objectives and principles. Available as Organisation for Economic Co-operation and Development, “Declaration on Access to Research Data from Public Funding,” January 30, 2004, http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
36. The UK funding council policies are each summarized and linked to from the Digital Curation Center, “Funders’ Data Policies,” accessed August 6, 2016, http://www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies; the Wellcome Trust, “Policy on data management and sharing,” accessed August 6, 2016, https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-sharing; Research Councils UK, “RCUK Common Principles on Data Policy,” published April 2011, http://www.rcuk.ac.uk/research/datapolicy.
37. European Commission, “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020,” version 3.0, July 26, 2016, http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf.
38. Kathleen Shearer, “Comprehensive Brief on Research Data Management Policies.” In 2015 Canada also released a federal policy on the open access to publications resulting from federal funds from its three primary funding agencies (see Government of Canada, “Tri-Agency Open Access Policy on Publications,” February 27, 2015, http://www.science.gc.ca/default.asp?lang=En&n=F6765465-1), yet this requirement only applies to research articles, not data.
39. Portage network homepage, accessed August 6, 2016, https://portagenetwork.ca.
40. JISC-funded Research Data Management Shared Service Project, accessed August 4, 2016, https://www.jisc.ac.uk/rd/projects/research-data-shared-service; Data Curation Network Project homepage, accessed August 4, 2016, https://sites.google.com/site/datacurationnetwork.
41. For example, findings from reviewing a sample of 182 Data Management Plans of successful National Science Foundation grant proposals showed this to be the case for 74% of the sample in Carolyn Bishoff and Lisa R. Johnston, “Approaches to Data Sharing: An Analysis of NSF Data Management Plans from a Large Research University,” Journal of Librarianship and Scholarly Communication 3, no. 2 (2015), doi:10.7710/2162-3309.1231.
42. Caitlin Rivers, “‘Send Me Your Data—PDF is Fine,’ Said No One Ever (How to Share Your Data Effectively),” April 8, 2013, http://www.caitlinrivers.com/blog/send-me-your-data-pdf-is-fine-said-no-one-ever-how-to-share-your-data-effectively.
43. Carlos Santos, Judith Blake, and David J. States, “Supplementary Data Need to be Kept in Public Repositories,” Nature 438, no. 7069 (2005): 738, doi:10.1038/438738a.
44. Caroline J. Savage and Andrew J. Vickers, “Empirical Study of Data Sharing by Authors Publishing in PLoS Journals,” PLoS One 4, no. 9 (2009): e7078, doi:10.1371/journal.pone.0007078; Timothy H. Vines, Arianne YK Albert, Rose L. Andrew, Florence Débarre, Dan G. Bock, Michelle T. Franklin, Kimberly J. Gilbert, Jean-Sébastien Moore, Sébastien Renaut, and Diana J. Rennison, “The Availability of Research Data Declines Rapidly with Article Age,” Current Biology 24, no. 1 (2014): 94–97, doi:10.1016/j.cub.2013.11.014.
45. Michael Witt, “Institutional Repositories and Research Data Curation in a Distributed Environment,” Library Trends 57, no. 2 (2008): 191–201, doi:10.1353/lib.0.0029.
46. G. Sayeed Choudhury, “Case Study in Data Curation at Johns Hopkins University,” Library Trends 57, no. 2 (2008): 211–220, doi:10.1353/lib.0.0028.
47. Carol Tenopir, Ben Birch, and Suzie Allard, Academic Libraries and Research Data Services: Current Practices and Plans for the Future, An ACRL White Paper, Association of College and Research Libraries, a division of the American Library Association, 2012, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf.
48. Further examples of disciplinary repositories are found in re3data.org homepage, accessed August 6, 2016, http://www.re3data.org.
49. DataOne, “Best Practices,” accessed August 5, 2016, http://www.dataone.org/best-practices; DataOne, “Software Tools Catalog,” accessed August 5, 2016, https://www.dataone.org/software_tools_catalog.
50. DataOne, “ESA 2011: How to Manage Ecological Data for Effective Use and Re-use,” August 7, 2011, http://www.dataone.org/esa-2011-how-manage-ecological-data-effective-use-and-re-use.
51. Leadbetter, A., Raymond, L., Chandler, C., Pikula, L., Pissierssens, P., Urban, E., Ocean Data Publication Cookbook (Paris: UNESCO, 2013), http://www.iode.org/mg64; for further context see the slides by Lisa Raymond, “Publishing and Citing Ocean Data,” OneNOAA Science Seminar, National Oceanographic Data Center, May 22, 2013, http://www.nodc.noaa.gov/seminars/2013/support/Lisa_Raymond_OneNOAASeminar_slides.pdf.
52. Jared Lyle, George Alter, and Mary Vardigan, “‘The Price of Keeping Knowledge’ Workshop: ICPSR Position Paper,” (2013), http://www.knowledge-ex-change.info/Admin/Public/DWSDownload.aspx?File=%2FFiles%2FFiler%2Fdownloads%2FPrimary+Research+Data%2FWorkshop+Price+of+Keeping+Knowledge%2FJared+Lyle+ICPSR_Position+Paper_Price+workshop_public.pdf.
53. Carol Ember, Robert Hanisch, George Alter, Helen Berman, Margaret Hedstrom, and Mary Vardigan, “Sustaining Domain Repositories for Digital Data: A White Paper,” December 11, 2013, 10–11, http://datacommunity.icpsr.umich.edu/sites/default/files/WhitePaper_ICPSR_SDRDD_121113.pdf.
54. Ibid., 10.
55. Jim Gray, Alexander S. Szalay, Ani R. Thakar, Christopher Stoughton, and Jan vandenBerg, “Online Scientific Data Curation, Publication, and Archiving,” submitted August 7, 2002, http://arxiv.org/abs/cs.DL/0208012.
56. According to a 2007 study, openly sharing data was linked to higher citation rates for the publications associated with that data. See Heather A. Piwowar, Roger S. Day, and Douglas B. Fridsma, “Sharing Detailed Research Data is Associated with Increased Citation Rate,” PLoS One 2, no. 3 (2007): e308, doi:10.1371/journal.pone.0000308. Cases of unreplicable or faulty data have been the subject of several studies, such as the Reproducibility Studies by the Center for Open Science in the fields of psychology (Alexander A. Aarts, Christopher J. Anderson, Joanna Anderson, Marcel A.L.M van Assen, Peter R. Attridge, Angela S. Attwood, Jordan Axt, et al., 2016, “Reproducibility Project: Psychology,” Open Science Framework, July 23, https://osf.io/EZcUj/) and cancer biology (Timothy M. Errington, Fraser E. Tan, Joelle Lomax, Nicole Perfito, Elizabeth Iorns, William Gunn, Brian A. Nosek, et al., 2016, “Reproducibility Project: Cancer Biology,” Open Science Framework, July 22, https://osf.io/e81xl/). In addition, the high profile case of scientist Dong-Pyou Han in an HIV-data falsification charge actually led to jail time and $7.2 million in fines according to the report Sara Reardon, “US Vaccine Researcher Sentenced to Prison for Fraud,” Nature News, July 1, 2015, http://www.nature.com/news/us-vaccine-researcher-sentenced-to-prison-for-fraud-1.17660.
57. Victoria Stodden provides an entertaining slide presentation on “A Brief History of the Reproducibility Movement,” December 10, 2012, http://hdl.handle.net/10022/AC:P:15396; Prasad Patil, Roger D. Peng, and Jeffrey Leek, “A Statistical Definition for Reproducibility and Replicability,” BioRxiv, July 29, 2016, doi:10.1101/066803.
58. Disciplinary repositories such as the iPlant Collaborative (homepage, accessed August 6, 2016, http://www.iplantcollaborative.org), nanoHUB.org (homepage, accessed August 6, 2016, https://nanohub.org), EarthCube (homepage, accessed August 6, 2016, http://earthcube.org), and CUAHSI (Hydrologic Information System homepage, accessed August 6, 2016, http://his.cuahsi.org) represent the collective outputs of the discipline to allow for widespread reuse of the data.
59. Richard Van Noorden, “Irish University Labs Face External Audits,” Nature News, June 17, 2014, http://www.nature.com/news/irish-university-labs-face-external-audits-1.15422.
60. Stephan Lewandowsky and Dorothy Bishop, “Research Integrity: Don’t Let Transparency Damage Science,” Nature, January 25, 2016, http://www.nature.com/news/research-integrity-don-t-let-transparency-damage-science-1.19219.
61. Ibid.
62. Dan L. Longo and Jeffrey M. Drazen, “Data Sharing,” New England Journal of Medicine 374, no. 3 (2016): 276–277, doi:10.1056/NEJMe1516564.
63. David Shaywitz, “Data Scientists = Research Parasites?,” Forbes, January 21, 2016, http://www.forbes.com/sites/davidshaywitz/2016/01/21/data-scientists-research-parasites/#3ddef3453d1c; Thomas H. Davenport and D.J. Patil, “Data Scientist: The Sexiest Job of the 21st Century,” Harvard Business Review, October 2012, https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.
64. Margaret Kosmala, “Open Data, Authorship, and the Early Career Scientist,” Ecology Bits, posted June 15, 2016, http://ecologybits.com/index.php/2016/06/15/open-data-authorship-and-the-early-career-scientist/; Snapshot Serengeti dataset available as Alexandra Swanson, Margaret Kosmala, Chris Lintott, Robert Simpson, Arfon Smith, and Craig Packer, “Snapshot Serengeti, High-Frequency Annotated Camera Trap Images of 40 Mammalian Species in an African Savanna,” Dryad Digital Repository, http://dx.doi.org/10.5061/dryad.5pt92 and the paper describing the data available as Alexandra Swanson, Margaret Kosmala, Chris Lintott, Robert Simpson, Arfon Smith, and Craig Packer, “Snapshot Serengeti, High-Frequency Annotated Camera Trap Images of 40 Mammalian Species in an African Savanna,” Scientific Data 2 (2015), doi:10.1038/sdata.2015.26.
65. Terry McGlynn, “I Own My Data, Until I Don’t,” Small Pond Science, March 3, 2014, http://smallpondscience.com/2014/03/03/i-own-my-data-until-i-dont; Emilio M. Bruna, “The Opportunity Cost of My #OpenScience was 36 Hours + $690,” The Bruna Lab, September 4, 2014, http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690.
66. The archival community has dealt with curation issues in the print and analog realm for centuries, and the lessons learned translate well into the digital realm but are often overlooked by developers of new data curation services in academic and disciplinary settings, according to Helen R. Tibbo and Christopher A. Lee, “Closing the Digital Curation Gap: A Grounded Framework for Providing Guidance and Education in Digital Curation,” Archiving Conference, vol. 2012, no. 1, pp. 57–62, Society for Imaging Science and Technology, 2012, http://www.ils.unc.edu/callee/p57-tibbo.pdf. Some example archival workflows that translate well to data curation include Julianna Barrera-Gomez and Ricky Erway, Walk This Way: Detailed Steps for Transferring Born-Digital Content from Media You Can Read In-House (Dublin, OH: OCLC Online Computer Library Center, 2013), http://www.oclc.org/content/dam/research/publications/library/2013/2013-02.pdf and the AIMS Work Group, “AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship,” January 2012, http://dcs.library.virginia.edu/files/2013/02/AIMS_final.pdf.
67. US Geological Survey, “NBII to Be Taken Offline Permanently in January,” USGS Access Newsletter 14, no. 3 (Fall 2011), https://www2.usgs.gov/core_science_systems/Access/p1111-1.html.
68. National Science Board, “NSB-05-40, Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century,” https://www.nsf.gov/pubs/2005/nsb0540/.
69. Patricia A. Soranno, Kendra S. Cheruvelil, Kevin C. Elliott, and Georgina M. Montgomery, “It’s Good to Share: Why Environmental Scientists’ Ethics are Out of Date,” BioScience 65, no. 1 (2015): 69–73, doi:10.1093/biosci/biu169.
70. Australian National Data Service, “Open Research Data,” November 2014, http://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/open-research-data-report.
71. Clifford Lynch, “The Shape of the Scientific Article in the Developing Cyberinfrastructure,” CTWatch Quarterly 3, no. 3 (2007), http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/index.html.
72. Real-time observational data of the quickly dimming objects known as gamma-ray bursts (GRBs) are available to researchers through the Goddard Space Flight Center, “GCN: The Gamma-ray Coordinates Network (TAN: Transient Astronomy Network),” accessed August 6, 2016, http://gcn.gsfc.nasa.gov, and public download access to GRB recordings that predate the SWIFT satellite mission launched in 2003 is also available at Goddard Space Flight Center, “The Gamma Ray Burst Catalog,” accessed August 6, 2016, http://heasarc.gsfc.nasa.gov/grbcat/grbcat.html; HathiTrust is a searchable database of millions of digitized texts and is available at HathiTrust homepage, accessed August 6, 2016, http://babel.hathitrust.org; public access to download the human genome and tools to analyze and compare DNA are available at NCBI, “Human Genome Resources,” accessed August 6, 2016, http://www.ncbi.nlm.nih.gov/genome/guide/human; big data generated by human-computer interaction can be derived from many social web services, though some do not release their data to the public (e.g., Amazon, Facebook). Sources of public data are available via APIs that contain real-time, and sometimes historical, information. For example, Twitter interaction data can be found at the Gnip homepage, accessed August 6, 2016, https://gnip.com, and in 2016 Yahoo released a News Feed dataset of 110 billion anonymized user interactions with its home page and news sites as Yahoo, “R10—Yahoo News Feed dataset, version 1.0 (1.5TB),” accessed August 6, 2016, http://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=75.
73. Anthony J.G. Hey and Anne E. Trefethen, “The Data Deluge: An E-Science Perspective,” Grid Computing: Making the Global Infrastructure a Reality (Chichester: Wiley, 2003), 809–24, http://eprints.soton.ac.uk/id/eprint/257648.

Bibliography

Aarts, Alexander A., Christopher J. Anderson, Joanna Anderson, Marcel A.L.M van Assen, Peter R. Attridge, Angela S. Attwood, Jordan Axt, et al. 2016. “Reproducibility Project: Psychology.” Open Science Framework. July 23. osf.io/ezcuj.
AIMS Work Group. “AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship.” January 2012. http://dcs.library.virginia.edu/files/2013/02/AIMS_final.pdf.
Alfred P. Sloan Foundation. “Grant Application Guidelines.” Last modified January 6, 2014. http://www.sloan.org/fileadmin/media/files/application_documents/proposal_guidelines_research_officer_grants.pdf.
Austin, Claire C., Theodora Bloom, Sünje Dallmeier-Tiessen, Varsha K. Khodiyar, Fiona Murphy, Amy Nurnberger, Lisa Raymond, Martina Stockhause, Jonathan Tedds, Mary Vardigan, and Angus Whyte. “Key components of data publishing: Using current best practices to develop a reference model for data publishing.” International Journal on Digital Libraries, 20 June 2016. doi:10.1007/s00799-016-0178-2.
Austin, Tim. “Towards a Digital Infrastructure for Engineering Materials Data.” Materials Discovery (2016). doi:10.1016/j.md.2015.12.003.
Australian National Data Service. “Open Research Data.” November 2014. http://www.ands.org.au/working-with-data/articulating-the-value-of-open-data/open-research-data-report.
Barrera-Gomez, Julianna, and Ricky Erway. Walk This Way: Detailed Steps for Transferring Born-Digital Content from Media You Can Read In-House. Dublin, OH: OCLC Online Computer Library Center, Inc., 2013. http://www.oclc.org/content/dam/research/publications/library/2013/2013-02.pdf.
Bill & Melinda Gates Foundation. “Bill & Melinda Gates Foundation Open Access Policy.” Accessed August 6, 2016. http://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy.
Bishoff, Carolyn, and Lisa R. Johnston. “Approaches to Data Sharing: An Analysis of NSF Data Management Plans from a Large Research University.” Journal of Librarianship and Scholarly Communication 3, no. 2 (2015). doi:10.7710/2162-3309.1231.
Bruna, Emilio M. “The Opportunity Cost of My #OpenScience was 36 Hours + $690.” The Bruna Lab. September 4, 2014. http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690/.
Candela, Leonardo, Donatella Castelli, Paolo Manghi, and Alice Tani. “Data Journals: A Survey.” Journal of the Association for Information Science and Technology 66, no. 9 (2015): 1747–1762. doi:10.1002/asi.23358.
CASRAI. “Category:Research Data Domain.” The CASRAI Dictionary. Last modified August 18, 2015. http://dictionary.casrai.org/Category:Research_Data_Domain.
Choudhury, G. Sayeed. “Case Study in Data Curation at Johns Hopkins University.” Library Trends 57, no. 2 (2008): 211–220. doi:10.1353/lib.0.0028.
Committee on Future Career Opportunities and Educational Requirements for Digital Curation; Board on Research Data and Information; Policy and Global Affairs; National Research Council. Preparing the Workforce for Digital Curation. Washington, DC: National Academies Press; April 22, 2015. http://www.nap.edu/catalog.php?record_id=18590.
Consultative Committee for Space Data Systems. Audit and Certification of Trustworthy Digital Repositories. Recommended Practice, CCSDS 652.0-M-1, Magenta Book, Issue 1. Washington, DC: CCSDS Secretariat, September 2011. http://public.ccsds.org/publications/archive/652x0m1.pdf.
Coombs, Chelsey. “Neuroscience Paper Retracted After Colleagues Object to Data Publication.” Retraction Watch. December 31, 2015. http://retractionwatch.com/2015/12/31/neuroscience-paper-retracted-after-colleagues-object-to-data-publication/.
CUAHSI Hydrologic Information System homepage. Accessed August 6, 2016. http://his.cuahsi.org/.
Data Curation Network Project homepage. Accessed August 4, 2016. https://sites.google.com/site/datacurationnetwork/.
Data in Brief homepage. Accessed August 6, 2016. http://www.journals.elsevier.com/data-in-brief.
DataOne. “Best Practices.” Accessed August 5, 2016. http://www.dataone.org/best-practices.
DataOne. “ESA 2011: How to Manage Ecological Data for Effective Use and Re-use.” August 7, 2011. http://www.dataone.org/esa-2011-how-manage-ecological-data-effective-use-and-re-use.
DataOne. “Software Tools Catalog.” Accessed August 5, 2016. https://www.dataone.org/software_tools_catalog.
Davenport, Thomas H., and D.J. Patil. “Data Scientist: The Sexiest Job of the 21st Century.” Harvard Business Review. October 2012. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.
Digital Curation Center. “Funders’ Data Policies.” Accessed August 6, 2016. http://www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies.
Digital Curation Center (DCC). “DCC Curation Lifecycle Model.” Accessed August 6, 2016. http://www.dcc.ac.uk/resources/curation-lifecycle-model.
EarthCube homepage. Accessed August 6, 2016. http://earthcube.org/.
Elsevier. “Elsevier and the Inter-University Consortium for Political and Social Research (ICPSR) Announce Data Linking.” February 8, 2016. http://www.prnewswire.com/news-releases/elsevier-and-the-inter-university-consortium-for-political-and-social-research-icpsr-announce-data-linking-568022141.html.
Elsevier. “Supported Data Repositories.” Accessed August 6, 2016. https://www.elsevier.com/?a=57755.
Ember, Carol, Robert Hanisch, George Alter, Helen Berman, Margaret Hedstrom, and Mary Vardigan. “Sustaining Domain Repositories for Digital Data: A White Paper.” December 11, 2013, 10–11. http://datacommunity.icpsr.umich.edu/sites/default/files/WhitePaper_ICPSR_SDRDD_121113.pdf.
Errington, Timothy M., Fraser E. Tan, Joelle Lomax, Nicole Perfito, Elizabeth Iorns, William Gunn, Brian A. Nosek, et al. 2016. “Reproducibility Project: Cancer Biology.” Open Science Framework. July 22. osf.io/e81xl.
European Commission. “Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020. Version 3.0.” July 26, 2016. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf.
Fear, Kathleen. “Building Outreach on Assessment: Researcher Compliance with Journal Policies for Data Sharing.” Bulletin of the American Society for Information Science and Technology 41, no. 6 (2015): 18-21. doi:10.1002/bult.2015.1720410609.
Fecher, Benedikt, Sascha Friesike, and Marcel Hebing. “What Drives Academic Data Sharing?” PLoS One 10, no. 2 (2015). doi:10.1371/journal.pone.0118053.
Ford Foundation. “Ford Foundation expands Creative Commons licensing for all grant-funded projects.” February 3, 2015. https://www.fordfoundation.org/the-latest/news/ford-foundation-expands-creative-commons-licensing-for-all-grant-funded-projects/.
Gnip homepage. Accessed August 6, 2016. https://gnip.com/.
Goddard Space Flight Center. “GCN: The Gamma-ray Coordinates Network (TAN: Transient Astronomy Network).” Accessed August 6, 2016. http://gcn.gsfc.nasa.gov.
Goddard Space Flight Center. “The Gamma Ray Burst Catalog.” Accessed August 6, 2016. http://heasarc.gsfc.nasa.gov/grbcat/grbcat.html.
Government of Canada. “Tri-Agency Open Access Policy on Publications.” February 27, 2015. http://www.science.gc.ca/default.asp?lang=En&n=F6765465-1.
Gray, Jim, Alexander S. Szalay, Ani R. Thakar, Christopher Stoughton, and Jan vandenBerg. “Online Scientific Data Curation, Publication, and Archiving.” Submitted August 7, 2002. http://arxiv.org/abs/cs.DL/0208012.
Harvey, Ross. “Chapter 4. Defining Data.” Digital Curation: A How-To-Do-It Manual. No. 025.06. Chicago: Neal-Schuman Publishers, 2010.
HathiTrust homepage. Accessed August 6, 2016. http://babel.hathitrust.org.
Hey, Anthony J.G., and Anne E. Trefethen. “The Data Deluge: An E-Science Perspective.” In Grid Computing: Making the Global Infrastructure a Reality, edited by F. Berman, G. Fox, A. J.G. Hey, 809–24. Chichester: Wiley, 2003. http://eprints.soton.ac.uk/id/eprint/257648.
Higgins, Sarah. “The DCC Curation Lifecycle Model.” International Journal of Digital Curation 3, no. 1 (2008): 134–40. doi:10.2218/ijdc.v3i1.48, p137.
Holdren, John P. “Increasing Access to the Results of Federally Funded Scientific Research.” Memorandum for the Heads of Executive Departments and Agencies, Office of Science and Technology Policy, Executive Office of the President, February 22, 2013. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
Institute of Medicine and National Academy of Sciences. Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. Washington, DC: The National Academies Press, 2009. doi:10.17226/12615, 34.
Intel Security Group McAfee Labs. “The Hidden Data Economy.” October 15, 2015. http://www.mcafee.com/us/resources/reports/rp-hidden-data-economy.pdf.
Inter-university Consortium for Political and Social Research (ICPSR). “Guidelines for OSTP Data Access Plan.” Accessed August 6, 2016. http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/ostp.html.
iPlant Collaborative homepage. Accessed August 6, 2016. http://www.iplantcollaborative.org.
King, Gary. 2011. “Ensuring the Data-rich Future of the Social Sciences.” Science 331(6018): 719–721. doi:10.1126/science.1197872.
Kosmala, Margaret. “Open Data, Authorship, and the Early Career Scientist.” Ecology Bits, posted June 15, 2016. http://ecologybits.com/index.php/2016/06/15/open-data-authorship-and-the-early-career-scientist.
Leadbetter, A., Raymond, L., Chandler, C., Pikula, L., Pissierssens, P., Urban, E. Ocean Data Publication Cookbook. Paris: UNESCO, 2013. http://www.iode.org/mg64.
Lewandowsky, Stephan, and Dorothy Bishop. “Research Integrity: Don’t Let Transparency Damage Science.” Nature. January 25, 2016. http://www.nature.com/news/research-integrity-don-t-let-transparency-damage-science-1.19219.
Longo, Dan L., and Jeffrey M. Drazen. “Data Sharing.” New England Journal of Medicine 374, no. 3 (2016): 276-277. doi:10.1056/NEJMe1516564.
Lyle, Jared, George Alter, and Mary Vardigan. “The Price of Keeping Knowledge Workshop: ICPSR Position Paper.” 2013. http://www.knowledge-ex-change.info/Admin/Public/DWSDownload.aspx?File=%2FFiles%2FFiler%2Fdownloads%2FPrimary+Research+Data%2FWorkshop+Price+of+Keeping+Knowledge%2FJared+Lyle+ICPSR_Position+Paper_Price+workshop_public.pdf.
Lynch, Clifford. “The Shape of the Scientific Article in the Developing Cyberinfrastructure.” CTWatch Quarterly 3, no. 3 (2007). http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/index.html.
McGlynn, Terry. “I Own My Data, Until I Don’t.” Small Pond Science. March 3, 2014. http://smallpondscience.com/2014/03/03/i-own-my-data-until-i-dont/.
Merriam-Webster’s Learner’s Dictionary. “Data.” Web version. Accessed August 6, 2016. http://www.merriam-webster.com/dictionary/data.
nanoHUB.org homepage. Accessed August 6, 2016. https://nanohub.org/.
National Academy of Sciences, National Academy of Engineering, and Institute of Medicine. Information Technology and the Conduct of Research: The User’s View. Washington, DC: The National Academies Press, 1989. doi:10.17226/763.
National Science Board. “NSB-05-40, Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century.” Summer 2005. National Science Foundation. http://www.nsf.gov/pubs/2005/nsb0540.
Nature. “Availability of Data, Material and Methods.” Accessed August 6, 2016. http://www.nature.com/authors/policies/availability.html.
Naughton, Linda, and David Kernohan. “Making Sense of Journal Research Data Policies.” Insights 29, no. 1 (2016). doi:10.1629/uksg.284.
NCBI. “Human Genome Resources.” Accessed August 6, 2016. http://www.ncbi.nlm.nih.gov/genome/guide/human.
Office of Management and Budget. “CIRCULAR A-110.” Revised November 19, 1993 as further amended September 20, 1999. https://www.whitehouse.gov/omb/circulars_a110.
Organisation for Economic Co-operation and Development. “Declaration on Access to Research Data from Public Funding.” January 30, 2004. http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
Patil, Prasad, Roger D. Peng, and Jeffrey Leek. “A Statistical Definition for Reproducibility and Replicability.” BioRxiv. July 29, 2016. doi:10.1101/066803.
Piwowar, Heather A., Roger S. Day, and Douglas B. Fridsma. “Sharing Detailed Research Data is Associated with Increased Citation Rate.” PLoS One 2, no. 3 (2007): e308. doi:10.1371/journal.pone.0000308.
Piwowar, Heather A., and Wendy W. Chapman. “A Review of Journal Policies for Sharing Research Data.” Nature Precedings. March 20, 2008. hdl:10101/npre.2008.1700.1.
PLOS One. “Data Availability.” Accessed August 6, 2016. http://journals.plos.org/plosone/s/data-availability.
Portage network homepage. Accessed August 6, 2016. https://portagenetwork.ca/.
PublicVR project homepage. Accessed August 6, 2016. http://publicvr.org/index.html.
Raymond, Lisa. “Publishing and Citing Ocean Data.” One NOAA Science Seminar, National Oceanographic Data Center. May 22, 2013. http://www.nodc.noaa.gov/seminars/2013/support/Lisa_Raymond_OneNOAASeminar_slides.pdf.
re3data.org homepage. Accessed August 6, 2016. http://www.re3data.org/.
Reardon, Sara. “US Vaccine Researcher Sentenced to Prison for Fraud.” Nature News, July 1, 2015. http://www.nature.com/news/us-vaccine-researcher-sentenced-to-prison-for-fraud-1.17660.
Research Councils UK. “RCUK Common Principles on Data Policy.” April 2011. http://www.rcuk.ac.uk/research/datapolicy/.
Research Data Alliance Data Foundation and Terminology Interest Group. “Term Definition Tool (TeD-T).” Last modified March 1, 2016. http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page.
Research Data Management Shared Service Project homepage. Accessed August 4, 2016. https://www.jisc.ac.uk/rd/projects/research-data-shared-service.
Retraction Watch. “Archive for the ‘Data Issues’ Category.” Accessed August 6, 2016. http://retractionwatch.com/category/by-reason-for-retraction/data-issues/.
Rivers, Caitlin. “‘Send Me Your Data - PDF is Fine,’ Said No One Ever (How to Share Your Data Effectively).” April 8, 2013. http://www.caitlinrivers.com/blog/send-me-your-data-pdf-is-fine-said-no-one-ever-how-to-share-your-data-effectively.
Santos, Carlos, Judith Blake, and David J. States. “Supplementary Data Need to be Kept in Public Repositories.” Nature 438, no. 7069 (2005): 738. doi:10.1038/438738a.
Savage, Caroline J., and Andrew J. Vickers. “Empirical Study of Data Sharing by Authors Publishing in PLoS Journals.” PLoS One 4, no. 9 (2009): e7078. doi:10.1371/journal.pone.0007078.
Scientific Data homepage. Accessed August 6, 2016. http://www.nature.com/sdata.
Scientific Data. “Recommended Data Repositories.” Accessed July 18, 2016. http://www.nature.com/sdata/policies/repositories.
Shaywitz, David. “Data Scientists = Research Parasites?” Forbes, January 21, 2016. http://www.forbes.com/sites/davidshaywitz/2016/01/21/data-scientists-research-parasites/#3ddef3453d1c.
Shearer, Kathleen. “Comprehensive Brief on Research Data Management Policies.” Released April 2015. http://acts.oecd.org/Instruments/ShowInstrumentView.aspx?InstrumentID=157.
Sheehan, Jerry. “Increasing Access to the Results of Federally Funded Science.” The White House Blog. February 22, 2016. https://www.whitehouse.gov/blog/2016/02/22/increasing-access-results-federally-funded-science.
Sodden, Victoria. “A Brief History of the Reproducibility Movement.” December 10, 2012. http://hdl.handle.net/10022/AC:P:15396.
Soranno, Patricia A., Kendra S. Cheruvelil, Kevin C. Elliott, and Georgina M. Montgomery. “It’s Good to Share: Why Environmental Scientists’ Ethics are Out of Date.” BioScience 65, no. 1 (2015): 69-73. doi:10.1093/biosci/biu169.
SPARC Open Data. “Research Funder Data Sharing Policies.” Accessed August 5, 2016. http://sparcopen.org/our-work/research-data-sharing-policy-initiative/funder-policies/.
Sturges, Paul, Marianne Bamkin, Jane H.S. Anders, Bill Hubbard, Azhar Hussain, and Melanie Heeley. “Research Data Sharing: Developing a Stakeholder-Driven Model for Journal Policies.” Journal of the Association for Information Science and Technology. doi:10.1002/asi.23336.
Swanson, Alexandra, Margaret Kosmala, Chris Lintott, Robert Simpson, Arfon Smith, and Craig Packer. “Snapshot Serengeti, High-frequency Annotated Camera Trap Images of 40 Mammalian Species in an African Savanna.” Dryad Digital Repository. doi:10.5061/dryad.5pt92.
Tenopir, Carol, Ben Birch, and Suzie Allard. Academic Libraries and Research Data Services: Current Practices and Plans for the Future. An ACRL White Paper. Association of College and Research Libraries, a division of the American Library Association, 2012. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/whitepapers/Tenopir_Birch_Allard.pdf.
The Wellcome Trust. “Policy on Data Management and Sharing.” Accessed August 6, 2016. https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-sharing.
Thomson, Sara Day. “Technology Watch Report 16: Preserving Transactional Data.” Digital Preservation Coalition. May 2, 2016. doi:10.7207/twr16-02.
Tibbo, Helen R., and Christopher A. Lee. “Closing the Digital Curation Gap: A Grounded Framework for Providing Guidance and Education in Digital Curation.” In Archiving Conference, vol. 2012, no. 1, pp. 57-62. Society for Imaging Science and Technology, 2012. http://www.ils.unc.edu/callee/p57-tibbo.pdf.
United States Government. “US Open Data Action Plan.” May 9, 2014. https://www.whitehouse.gov/sites/default/files/microsites/ostp/us_open_data_action_plan.pdf.
University of Illinois Urbana-Champaign School of Information Science. “Specialization in Data Curation.” Accessed August 4, 2016. http://www.lis.illinois.edu/academics/programs/specializations/data_curation.
University of Notre Dame. “About the eMotion and eCognition Lab.” Accessed August 6, 2016. http://www3.nd.edu/~emotecog/about.html.
US Geological Survey. “NBII to Be Taken Offline Permanently in January.” USGS Access Newsletter 14, no. 3 (Fall 2011). https://www2.usgs.gov/core_science_systems/Access/p1111-1.html.
Van Noorden, Richard. “Irish University Labs Face External Audits.” Nature News, June 17, 2014. http://www.nature.com/news/irish-university-labs-face-external-audits-1.15422.
Vines, Timothy H., Arianne YK Albert, Rose L. Andrew, Florence Débarre, Dan G. Bock, Michelle T. Franklin, Kimberly J. Gilbert, Jean-Sébastien Moore, Sébastien Renaut, and Diana J. Rennison. “The Availability of Research Data Declines Rapidly with Article Age.” Current Biology 24, no. 1 (2014): 94-97. doi:10.1016/j.cub.2013.11.014.
Witt, Michael. “Institutional Repositories and Research Data Curation in a Distributed Environment.” Library Trends 57, no. 2 (2008): 191-201. doi:10.1353/lib.0.0029.
Yahoo. “R10—Yahoo News Feed dataset, version 1.0 (1.5TB).” Accessed August 6, 2016. http://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=75.

PART I
Setting the Stage for Data Curation
Policies, Culture, and Collaboration

CHAPTER 1*
Research and the Changing Nature of Data Repositories
Karen S. Baker and Ruth E. Duerr

* This work is licensed under a Creative Commons Attribution 4.0 License, CC BY (https://creativecommons.org/licenses/by/4.0/).

Introduction

This chapter explores the changing nature of research and data repositories. Trends in open data, big data, and long-tail data are ongoing,1 following shifts from analog devices and documentation to digital instrumentation and digital data. Further, recent mandates about increasing access to data in the United States come at a time when digital capabilities are increasing though digital infrastructure is in flux.2 Attention to and funding for data sharing have propelled data repository activities in both new and established digital settings. As the number and kind of repositories accepting research-generated data increase, their effectiveness depends upon developing widespread understanding of data concepts as well as the knowledge accumulated about successes and failures in the digital realm.

The full reality of managing research data and data repositories in a Digital Age is informed and shaped by past efforts carried out in many sectors. It is impacted by new participants, new roles, and changes in the distribution of responsibilities associated with data management. In addition, evolving technologies result in changing support mechanisms for documentation, preservation, and access of data. Contemporary data management efforts have more than fifty years’ experience to draw upon given early large-scale assemblies of digital data in scientific research fields such as remote sensing and weather as well as social science research fields such as survey and census methods.3 Only a portion of the insights gained from past experience with data management and data systems are readily available given the combination of emphasis on scientific findings and of succinctness required in writing for the scholarly literature. Incentives and rewards for writing about work with data have been lacking.4 New forums and journals are emerging that provide venues for discussions about past and present work with data so that past experience is available to new communities of data workers (see section “Changing Research Needs and New Initiatives” below).

This paper considers both conceptual and historical underpinnings in the story of data repositories. From work with data repositories in a variety of research fields, three concepts (data ecosystem, liaison work, and continuing design) help in understanding how work with digital data can contribute to the viability and well-being of the research process. These concepts, together with related issues and recommendations, are presented below as projects, communities, consortia, alliances, centers, programs, agencies, universities, publishers, libraries, and organizations of all kinds grapple with managing and preserving data in repositories.

Background

A few early data efforts in the sciences are presented as examples of past activities that inform today’s work.

Changing Support for Data

Work with data is embedded in the processes, methods, and goals of research. Rigor in documenting thought processes, evidence collection, and data is integral to ensuring a robust research process.
There is a long history of research data recorded in station books and laboratory notebooks.5 In addition, white papers and project newsletters as well as expedition and technical reports full of tables of numbers were, and continue to be, published outside formal academic and commercial channels by a variety of organizations. Such materials, known as “the gray literature,” are authoritative as primary sources. As the name suggests, however, they may be limited in terms of discoverability, access, and vetting. Nevertheless, these outlets have played a significant role in providing researchers access to data. While research findings traditionally appear in formal publication venues, the original, full data record was often in the gray literature as well as file cabinets.6

With the development of technologies such as cameras and strip chart recorders, a variety of organizational subunits such as photo labs emerged to handle these analog materials and to support conversion to forms that could be published. Although they did not consider themselves data publishers, they or their counterparts routinely created reports with primary data in the form of tables, photos, maps, and graphs. Many of these offices have since closed or have been transformed, such as the photo lab that becomes a digital service group. Closing often occurred before infrastructure was in place to handle documentation and data in new ways beyond the capability of an individual’s desktop.

Eventually, with Internet availability, researchers and research groups developed new practices such as delivery of content including field data under a Data tab on a research website. In a sense, the current attention to data access and new forms of data citation is a return to the norm of retrieving and citing data that appeared in the print-based gray literature. With orders of magnitude more digital data generated, however, new kinds of digital tools, capabilities, and arrangements are required to support widespread access to digital data.

Expanding Support for Data in Natural and Social Sciences

With the development of large-scale international research initiatives, support for data took a variety of forms. Spurred by twentieth-century post–World War II planning, a number of data facilities were established. For instance, World Data Centers and the Federation of Astronomical and Geophysical Data Analysis Services evolved, starting with the International Geophysical Year (IGY) in 1957–1958 with its focus on international science. From the IGY, a revolutionary vision of the earth as a whole emerged, focusing the attention of geoscientists collectively on scientific methods, measurements, and data. The International Council of Scientific Unions (now International Council for Science) established a system of World Data Centers to serve the IGY and developed data management plans for each IGY scientific discipline.7 The World Data Centers focused on replicating data across the centers and sharing data across the globe. The ICSU Committee on Data for Science and Technology (CODATA) continues to develop and share knowledge about data today.8 With their beginnings as centers full of the books and reports containing data for IGY and other initiatives, early data efforts grew to include magnetic tapes and punch cards at designated locations. Today management in data centers has grown to include digital data and physical samples as well as to accommodate many stakeholders and audiences.9 The transition and renaming of the World Data Center system in 2009 to be the World Data System represents another shift in perspective with data envisioned within an interoperable set of systems.

In the United States, federal centers developed and took many forms. Federally Funded Research and Development Centers (FFRDC) were created as public-private partnerships to support research community projects by making available large-scale resources such as the aircraft required for atmospheric science fieldwork.10 Research support includes project coordination, instrumentation, field support, and work with data. National Data Centers such as the National Climatic Data Center and the National Oceanographic Data Center were created in order to support management of data from platforms with large data streams such as from satellites. Supercomputer centers were developed as national resources to provide computational power to researchers across the nation.11 These centers have developed repositories for data of many kinds existing alongside other preservation institutions such as archives with collections of photos and manuscripts, museums with physical artifacts, and libraries with books and journals. Tape racks proliferated as recordings on seven- and nine-track tapes replaced everything from strip chart recorders to images. Tapes were replaced in turn by new storage technologies. Many other, less visible changes were occurring in data centers, driven by changes in applications, configurations, budgets, institutions, and careers.12 As the number of data centers grew, coordination activities started taking place. For instance, the National Archives and Records Administration (NARA) joined in 1992 with the scientific community and with federal and nonfederal entities that collect data about the earth to consider collectively data management and archiving procedures.13 The ramifications of this interaction resulted in recommendations that NARA collaborate with other agencies that maintain long-term custody of data.

In the social sciences, early national-level repository development was spurred by an initial need for community access to data from election studies and from the US Census.14 The Inter-university Consortium for Political and Social Research (ICPSR), which dates its origin to 1962, provides an example of responding to change over time. ICPSR began with a membership model to fund its data management costs but is now leading a call for change in support mechanisms for domain repositories.15 This consortium has responded to community interests by participating in an alliance to distribute widely backup copies of data across several repositories. ICPSR has also responded to recent mandates for public data access by creating a new level of service. This service, called OpenICPSR, supports public availability of data free of cost.16

Data Repository Diversity

Setting aside the issue of data presentation, we consider two categories of data repositories depending upon whether they ingest homogeneous or heterogeneous
tomb of Archimedes.[792] On his return home[793] he resumed his forensic practice: and in B. C. 70 was the champion of his old friends, the Sicilians, and impeached Verres, who had been prætor of Syracuse, for oppression and maladministration. In the following year[794] he was elected curule ædile by a triumphant majority. In the celebration of the games which belonged to the province of this magistrate, he exhibited great prudence by avoiding the lavish expenditure in which so many were accustomed to indulge, whilst, at the same time, no one could accuse him of meanness and illiberality. In the year B. C. 67, he obtained the prætorship, and notwithstanding the judicial duties of his office, defended Cluentius. Hitherto his speeches had been entirely of the judicial kind. He now for the first time distinguished himself as a deliberative orator, and supported the Manilian law which conferred upon Pompey, to the discomfiture of the aristocratic party, the command in chief of the Mithridatic war. The great object of his ambition now was the consulship, which seemed almost inaccessible to a new man. As all difficulties and prejudices were on the side of the aristocratic party, his only hope of surmounting them was by warmly espousing the cause of the people. Catiline and C. Antonius, who were his principal competitors, formed a coalition, and were supported by Cæsar and Crassus, but the influence of Pompey and the popular party prevailed; and Cicero and Antony were elected. He entered upon his office January 1, B. C. 63. At this period, perhaps, the moral qualities of his character are the highest, and his genius shines forth with the brightest splendour. The conspiracy of Catiline was the great event of his consulship; a plot which its historian does not hesitate to dignify with the title of a war.
Yet this war was crushed in an unparalleled short space of time; and a splendid triumph was gained over so formidable an enemy, by one who wore the peaceful toga, not the habiliments of a general. The prudence and tact of the civilian did as good service as the courage and decision of the soldier. The applause and gratitude of his fellow citizens were unbounded, and all united in hailing him the father of his country. One act alone laid him open to attack, and in fact eventually caused his ruin. There is no doubt that it was unconstitutional, although under the circumstances it was defensible, perhaps scarcely to be avoided. This act was the execution
of Lentulus, Cethegus, and the other ringleaders, without sentence being passed upon them by the comitia. The senate, seeing that the danger was imminent, had invested Cicero and his colleague with power to do all that the exigencies of the state might require (videre ne quid res publica detrimenti caperet;) and although it was Cicero who recommended the measure and argued in its favour, it was the senate who pronounced the sentence, and assumed that, as traitors, the conspirators had forfeited their rights as citizens. The grateful people saw this clearly; and when Metellus Celer, one of the tribunes, would have prevented Cicero from giving an account of his administration at the close of the consular year, he swore that he saved his country, and his oath was confirmed by the acclamations of the multitude. This was a great triumph; and in sadder times he looked back to it with a justifiable self-complacency.[795] He now, as though his mission was accomplished, refused all public dignities except that of a senator: but he did not thus escape peril; he soon exposed himself to the implacable vengeance of a powerful and unscrupulous enemy. The infamous P. Clodius Pulcher intruded himself in female attire into the rites of the Bona Dea, which were celebrated in the house of Cæsar. Suspicion fell upon Cæsar’s wife, and a divorce was the consequence.[796] Clodius was brought to trial on the charge of sacrilege, and pleaded an alibi. Cicero, however, proved his presence in Rome on the very day on which the accused asserted that he was at Interamnum. Although the guilt of Clodius was fully established, his influence over the corrupt Roman judices was powerful enough to procure an acquittal. Henceforward he never could forgive Cicero, and determined to work his ruin. He caused himself to be adopted in a plebeian family; and thus becoming qualified for the tribunate was elected to that magistracy, B. C. 59.
No sooner was he appointed, than he proposed a bill for the outlawry of any one who had caused the execution of a citizen without trial. Cicero at once saw that this blow was aimed against himself. He had disgusted Cæsar by his political coquetry; the false and selfish Pompey refused to aid him in his trouble; and spirit-broken, he fled to Brundisium,[797] and thence to Thessalonica. He had an interview with Pompey before his flight, but it led to no results.[798] He had sworn to help him as long as he
felt that there was danger, lest he should join Cæsar’s party; but when he saw that his foes were successful, he deserted him. In his absence his exile was decreed, and his town and country houses were given up to plunder. It cannot be denied that during his banishment he exhibited weakness and pusillanimity: his reverses had such an effect upon his mind that he was even supposed to be mad.[799] His great fault was vanity, of which defect he was himself conscious, and confessed it;[800] and disappointed vanity was the cause of his affliction. He could bear anything better than loss of popular applause; and on this occasion, more than any other, he gave grounds for the assertion, that “he bore none of his calamities like a man, except his death.” Rome, however, could not forget her preserver; and in the following year he was recalled, and entered Rome in triumph, in the midst of the loud plaudits of the assembled people.[801] Still, however, he was obliged to secure the prosperity which he had recovered by political tergiversation. The measures of the triumvirate, which he had formerly attacked with the utmost virulence, he did not hesitate now to approve and defend. After his return[802] he was appointed to a seat in the College of Augurs; a dignity which he had anxiously coveted before his exile, and to obtain which, he had offered almost any terms to Cæsar and Pompey.[803] The following year, much against his will, the province of Cilicia was assigned to him. Strictly did the accuser of Verres act up to the high and honourable principles which he professed. His was a model administration: a stop was put to corruption, wrongs were redressed, justice impartially administered. Those great occasions on which he was compelled to act on his own responsibility, and to listen to the dictates of his beautiful soul, “seine schöne seele,”[804] his pure, honest, and incorruptible heart, are the bright points in Cicero’s career.
The emergency of the occasion overcame his constitutional timidity. In the year B. C. 49, he returned to Rome, and finding himself in a position in which he could calmly observe the current of affairs, and determine unbiassed what part he should take in them, or whether it was his duty to take any part at all, his weak, wavering, vacillating temper again got the mastery over him. He would not do anything dishonest, but he was not chivalrous enough to spurn at once that which was dishonourable. Cæsar and Pompey were now at open war,
and he could not make up his mind which to join.[805] He felt, probably, that the energy, ability, and firmness of Cæsar, would be crowned with success; and yet his friends, his party, and his own heart were with Pompey, and he dreaded the scorn which would be heaped upon him if he forsook his political opinions. His were not the stern, unyielding principles of a Cato; but the fear of what men would say of him made him anxious and miserable. The struggle was a long one between caution and honour, but at length honour overcame caution. He made his decision, and went to the camp of Pompey; but he could never rally his spirits, or feel sanguine as to the result. He immediately saw that Pharsalia decided the question for ever, and consequently hastened to Brundisium, where he awaited the return of the conqueror. It was a long time to remain in suspense; but at last the generous Cæsar relieved him from it by a full and free pardon. And now again his character rose higher, and his good qualities had room to display themselves. There were no longer equally balanced parties to revive the discord which formerly distracted his mind, nor were the circumstances of the times such as to demand his active interference in the cause of his country; but he was as great in the exercise of his contemplative faculties as he had been in the brightest period of his political life. The same faults may, perhaps, be discerned in his philosophical speculations: the same indecision which rendered him incapable of being a statesman or a patriot caused him to adopt in philosophy a skeptical eclecticism. Truth was to him as variable as political honesty; but he is always the advocate and supporter of resignation, and fortitude, and purity, and virtue. He had hitherto suffered as a public man: he was now bowed down by domestic affliction. A quarrel with his wife Terentia ended in a divorce:[806] such was the facility with which at Rome the nuptial tie could be severed.
His second wife was his own ward—a young lady of large fortune; but disparity of years and temper prevented this connexion from lasting long. In B. C. 45 he lost his daughter Tullia. The blow was overwhelming: he sought in vain to soothe his grief in the woody solitudes of his maritime villa at Astura, and it was long before the bereaved father found consolation in philosophy. The political crisis which ensued upon the assassination of Cæsar alarmed him for his own personal safety: he therefore meditated a
voyage to Greece; but being wind-bound at Rhegium, the hopes of an accommodation between Antony and the senate (a hope destined not to be realized) induced him to return. Antony now left Rome, and Cicero delivered that torrent of indignant and eloquent invective—his twelve Philippic orations.[807] He was again the popular idol—crowds of applauding and admiring fellow-citizens attended him to the Forum in a kind of triumphant procession, as they had on his return from exile. But soon the second triumvirate was formed. Each member readily gave up friends to satisfy the vengeance of his colleagues, and Octavius sacrificed Cicero. The story of his death is a brief and sad one. He was enjoying the literary retirement of his Tusculan villa when his friends warned him of his approaching fate. He was too great a philosopher to fear death; but too high-principled and resigned to the Divine will to commit suicide. Still he scarcely thought life worth preserving: “I will die,” he said, “in my fatherland, which I have so often saved.” However, at the entreaty of his brother, to whom he was affectionately attached, he endeavoured to escape. He first went across the country to Astura, and there embarked. The weather was tempestuous, and as he suffered much from sea-sickness, he again landed at Gaëta. A treacherous freedman betrayed him, and as he was being carried in a litter he was overtaken by his pursuers. He would not permit his attendants to make any resistance; but patiently and courageously submitted to the sword of the assassins, who cut off his head and hands and carried them to Antony. A savage joy sparkled in the eyes of the triumvir at the sight of these bloody trophies. His wife, Fulvia, gloated with inhuman delight upon the pallid features, and in petty spite pierced with a needle that once eloquent tongue. The head and hands were fixed upon the rostrum which had so often witnessed his unequalled eloquence.
All that passed by bewailed his death, and gave vent to their affectionate feelings. Although it is impossible to be blind to the numerous faults of Cicero, few men have been more maligned and misrepresented, and the judgment of antiquity has been, upon the whole, generally unfavourable. He was vain, vacillating, inconstant, constitutionally timid, and the victim of a morbid sensibility; but he was candid, truthful, just, generous, pure-minded, and warm-hearted. His amiability, acted upon by timidity, led him to set too high a value on
public esteem and favour; and this weakened his moral sense and his instinctive love of virtue. That he possessed heroism is proved by his defence of Roscius, although the favourite of the terrible Sulla was his adversary. He was not entirely destitute of decision, or he would not so promptly have expressed his approbation of Cæsar’s assassins as tyrannicides. He had resolution to strive against his over-sensitiveness, and wisdom to see that mental occupation was its best remedy; for in the midst of the distractions and anxieties of that eventful and critical year which preceded the consulship of Hirtius and Pansa an almost incredible number of works proceeded from his pen.[808] There are many circumstances to account for his political inconsistency and indecision. He had an early predilection for the aristocratic party; but he saw that they were narrow-minded and behind their age. All the patricians, except Sulla and his small party, were on the popular side. He was proud of his connexion with Marius; and his friend Sulpicius Rufus, whom he greatly admired, joined the Marians. For these reasons, Cicero was inconsistent as a politician. Again, during periods of revolutionary turbulence, moderate men are detested by both sides; and yet it was impossible for a philosophic temper, which could calmly and dispassionately weigh the merits and demerits of both, to sympathize warmly with either. Cicero saw that both were wrong: he was too temperate to approve, too honest to pretend a zeal which he did not feel, and, therefore, he was undecided. Again, having a large benevolence, and a firm faith in virtue, he was unconscious of guile himself, and thought no evil of others. He therefore mistook flattery for sincerity, and compliments for kindness. He was vain; but vanity is a weakness not inconsistent with great minds, and in the case of Cicero it was fed by the unanimous voice of public approbation.
As an advocate his delight was to defend, not to accuse.[809] In three only of his twenty-four orations did he undertake the office of an accuser. Gentle, sympathizing, and affectionate, he lived as a patriot and died as a philosopher.
CHAPTER X.

CICERO NO HISTORIAN—HIS ORATORICAL STYLE DEFENDED—ITS PRINCIPAL CHARM—OBSERVATIONS ON HIS FORENSIC ORATION—HIS ORATORY ESSENTIALLY JUDICIAL—POLITICAL ORATIONS—RHETORICAL TREATISES—THE OBJECT OF HIS PHILOSOPHICAL WORKS—CHARACTERISTICS OF ROMAN PHILOSOPHICAL LITERATURE—PHILOSOPHY OF CICERO—HIS POLITICAL WORKS—LETTERS—HIS CORRESPONDENTS—VARRO.

Such were the life and character of Cicero. The place which he occupies in a history of Roman literature is that of an orator and philosopher. It has been already stated that he had some taste for poetry: in fact, without imagination he could scarcely have been so eminent as an orator; but though the power which he wielded over prose was irresistible, he had not fancy enough to give a poetical character to the language. Nor had he, notwithstanding the versatility of his talents, any taste for historical investigation. He delighted to read the Greek historians, for the same purpose for which he studied the Attic orators, merely as an instrument of intellectual cultivation; but he was ignorant of Roman history, because he took no interest in original research. His countrymen[810] expected from him an historical work, but he was unfit for the task. It is plain from his “Republic,” how little he knew as an antiquarian. The greatest praise of an orator’s style is to say that he was successful. The end and object of oratory is to convince and persuade—to rivet the attention of the hearer, and to gain a mastery over the minds of men. If, therefore, any who study the speeches of Cicero in the closet find faults in his style, they must remember the very faults themselves were suited to the object which he was carrying into
execution. During the process of raising the public taste to the highest standard, he carried his hearers with him: he was not too much in advance; he did not aim his shafts too high; they hit the head and heart. Senate, judges, people understood his arguments, and felt his passionate appeals. Compared with the dignified energy and majestic vigour of the Athenian orator, the Asiatic exuberance of some of his orations may be fatiguing to the sober and chastened taste of the modern classical scholar; but in order to form a just appreciation, he must transport himself mentally to the excitements of the thronged Forum—to the senate composed, not of aged, venerable men, but statesmen and warriors in the prime of life, maddened with the party spirit of revolutionary times—to the presence of the jury of judices, as numerous as a deliberative assembly, whose office was not merely calmly to give their verdict of guilty or not guilty, but who were invested as representatives of the sovereign people with the prerogative of pardoning or condemning. Viewed in this light, his most florid passages will appear free from affectation—the natural flow of a speaker carried away with the torrent of his enthusiasm. The melodious rise and fall of his periods are not the result of studied effect, but of a true and musical ear. Undoubtedly, amongst his earlier orations, are to be found passages somewhat too declamatory and inconsistent with the principles which he afterwards laid down when his taste was more matured, and when he undertook to write scientifically on the theory of eloquence. Nor must it be concealed that some of the staid and stern Romans of his own days were daring enough, notwithstanding his popularity and success, to find the same fault with him.
“Suorum temporum homines,” says Quintilian, “incessere audebant eum ut tumidiorem et Asianum[811] et redundantem et in repetitionibus nimium et in salibus aliquando frigidum et in compositione fractum et exsultantem et pene viro molliorem.” But it is not only the brilliance and variety of expression, and the finely-modulated periods, which constituted the principal charm of Ciceronian oratory, and rendered it so effective. Its effectiveness was mainly owing to the great orator’s knowledge of the human heart, and of the national peculiarities of his countrymen. Its charm was owing to his extensive acquaintance with the stores of literature and philosophy, which his sprightly wit moulded at will, to the varied
learning which his unpedantic mind made so pleasant and popular, to his fund of illustration at once interesting and convincing. Even if his knowledge, because it spread over so wide a surface, was superficial, in this case profoundness was unnecessary. In a work like the present it is only possible to devote a few brief observations to the most important of his numerous orations, in which, according to the criticism of Quintilian, he combined the force of Demosthenes, the copiousness of Plato, and the elegance of Isocrates. Knowledge of law, far superior to that possessed by the great orators of the day,[812] distinguishes his earliest extant oration, the defence of P. Quinctius.[813] Hortensius was the defendant’s counsel. Nævius, the defendant, who had unjustly possessed himself of the property of the plaintiff’s deceased brother, was a deserter from the Marians, and therefore a protégé of Sulla; but, notwithstanding these disadvantages, Cicero gained his cause. In the masterly defence of S. Roscius,[814] Cicero again defied Sulla. His client was accused of parricide: there was not a shadow of proof, and Cicero saved the life of an innocent man. The noble enthusiasm with which he inveighs against tyranny in this oration strikingly contrasts with the language, full of sweetness, in which he describes Roman rural life. The passage on parricide was too glowing and Asiatic for the taste of his maturer years, and he did not hesitate to make it the subject of severe criticism.[815] Passing over speeches of less interest, we come to the six celebrated Verrian orations. Of these chefs-d’œuvre the first only was delivered.[816] The others were merely published; for the voluntary exile of the criminal rendered further pleading unnecessary. The first is entitled “Divinatio,” i.
e., an inquiry as to who should have the right of prosecuting: Cæcilius, who had been quæstor to the accused, claimed this privilege, wishing to make the suit a friendly one, and thus quash the proceedings. Nothing can surpass the ironical and sarcastic exposure of this fraudulent attempt to defeat the ends of justice. The noble passages in the succeeding orations of the series are well known; the sketch of the wicked proconsul’s antecedent career; the graceful eulogy of that province, in the welfare of which Cicero himself felt so warm an interest; the tasteful description of the statues and antiquities which tempted the more than Roman cupidity of Verres; the interesting history of ancient art which accompanies it; the burst of pathetic indignation with which he paints the horrible tortures to which not
only the provincials, but even Roman citizens, were exposed. Transports of joy pervaded the whole of Sicily at Cicero’s success; and the Sicilians caused a medal to be struck with this inscription—“Prostrato Verre Trinacria.” The oration for Fonteius[817] is a skilful defence of an unpopular governor; that in defence of Cluentius[818] is one of the most remarkable causes célèbres of antiquity; and the complicated scene of villany which Cicero’s forcible and soul-harrowing language paints, makes one shudder with horror, whilst we are struck with admiration at the clearness of intellect with which he unravels the web of guilt woven by Oppianicus and Sassia. This remarkable oration has been analyzed by Dr. Blair.[819] Again, passing over other forensic orations, we come to that on which he had evidently expended all his resources of art, taste, and skill—the speech for the poet Archias.[820] If possible it is even too elaborate and polished for so graceful a theme. Although the object of the advocate was simply to establish the right of his client to Roman citizenship, the genius of the poet of Antioch furnished an opportunity not to be neglected for digressing into the fields of literature, and for pronouncing a truly academical eulogium on poetry. It is satisfactory to the admirers of Cicero to find that the attack which has been made on the genuineness of this pleasing oration is groundless and unwarrantable.[821] The oration pro Cælio[822] is the most entertaining in the whole collection. It contains a rich fund of anecdote, seasoned with witty observations; a knowledge of human nature illustrated in a piquant and humorous style, expressed in a tone of most gentlemanlike yet playful eloquence, and interspersed with passages of great beauty. It presents a marked contrast to the coarse personal abuse which defaces the otherwise powerful invective against L.
Piso, which was delivered in the following year.[823] The list, though many more marvellous specimens are omitted, must be closed with the oration in defence of T. Annius Milo. On this occasion Cicero lost his wonted self-possession. When the court opened, Pompey was presiding on the bench, and he had caused the Forum to be occupied with soldiers. The sight, added, perhaps, to the consciousness that he was advocating a bad cause, struck Cicero with alarm; his voice trembled, his tongue refused to give utterance to the
conceptions which he had formed. The judges were unmoved; and Milo remained in his self-imposed exile at Marseilles. When Cicero left the court his courage and calmness returned. He penned the oration which is now extant. He had little or no proof or evidence to offer, and therefore, as an argumentative work, it is unconvincing; but for force, pathos, and the externals of eloquence, it deserves to be reckoned amongst his most wonderful efforts. When the exiled Milo read it, he is said to have exclaimed, “O, Cicero, if you had pleaded so, I should not be eating such capital fish here!” The author himself and his contemporaries thought this his finest oration; probably its deficiencies were concealed by its eloquence and ingenuity. It appears that the oration which he actually delivered was taken down in writing by reporters, and was extant in the time of Asconius Pedianus, the most ancient commentator on Cicero’s orations.[824] Its feebleness proved the correctness of the judgment of antiquity. The oratory of Cicero was essentially judicial: he was himself conscious that his talents lay in that direction, and he saw that in that field was the best opportunity for displaying oratorical power. Even his political orations are rather judicial than deliberative. He was not born for a politician. He possessed not that analytical character of mind which penetrates into the remote causes of human action, nor the synthetical power which enables a man to follow them out to their farthest consequences; he had not that comprehensive grasp of mind which can dismiss at once all points of minor importance and useless speculation, and, seizing all the salient points, can bring them to bear together upon questions of practical expediency. Of the three qualities necessary for a statesman he possessed only two, honesty and patriotism: he had not political wisdom.
Hence, in the finest specimens of his political harangues, his Catilinarians and Philippics, and that in support of the Manilian law, we look in vain for the calm, practical weighing of the subject which is necessary in addressing a deliberative assembly. This was not the habit of his mind. He was only lashed to action by circumstances of great emergency; but even then he is still an advocate—all is excitement, personal feeling, and party spirit: he deals in invective and panegyric, and the denunciation of the enemies of his country; and the parts which especially call forth our admiration differ in
nothing from those which we admire in his judicial orations. Nevertheless, so irresistible was the influence which he exercised upon the minds of his hearers, that all his political speeches were triumphs. His panegyric on Pompey,[825] in the speech for the Manilian law, carried his appointment as commander-in-chief of the armies of the East. The consequence of the oration de Provinciis Consularibus continued to Cæsar his administration of Gaul. He crushed in Catiline one of the most formidable traitors that had ever menaced the safety of the republic. Antony’s fall followed the complete exposure of his debauchery in private life, and the factiousness of his public career.[826] Of the Catilinarians, the first and fourth were delivered in the senate, the second and third in the presence of the people. Every one knows the burst of indignation which the consul, rising in his place, aims at the audacious conspirator who dared to pollute with his presence the temple of the deity, and the most august assembly of the Roman people. In less than twenty-four hours Catiline had left Rome, and the conspiracy had become a war. In four words Cicero announced this to the assembled Romans the day after he had addressed the senate. The third is a piece of self-complacent but pardonable egotism. Success has overwhelmed him—he sees that all eyes are turned upon himself—he is the hero of his own story; still he demands no reward but the approbation of his fellow-citizens, and reminds them that to the gods alone their gratitude is due. Two days pass away, and after Cæsar and Cato had spoken, Cicero again addresses the senate, and recommends that measure which was the beginning of his troubles, the condemnation of the conspirators. The zeal of the senate made the act their own, but Cicero paid the penalty. The position which Cicero occupies on this occasion invests his speech with more dignity than is displayed in any of the preceding.
He is the chief magistrate of the republic, performing the duty of pronouncing a capital sentence on the guilty. The excitement of the crisis is subsiding; and he has the more composure, because he knows that he carries with him the sympathies of the senate and people. The Philippics, so named after the orations of Demosthenes, are fourteen in number. Cicero commenced his attack[827] upon the object of his implacable hatred with a defence of the laws of Cæsar,
which Antony wished to repeal. He followed it up with the celebrated second oration, in which he demolished the character of Antony; a speech which Juvenal pronounced to be his chef-d’œuvre, but which Niebuhr thought was undeserving of being so highly exalted. He delivered the remaining twelve in the course of the succeeding year; they were the last monuments of his eloquence; he never spoke again. The fourteenth is a brilliant panegyric, but nothing more; the gallant army of Octavius received their deserved applause; but in this political crisis the orator could not discern or even catch a glimpse of the future destinies of his country. In his rhetorical works, Cicero left a legacy of practical instruction to posterity. The treatise “De Inventione,” although it displays genius, is merely interesting as the juvenile production of a future great man; and the author himself alludes to it as a rude and unfinished production.[828] Of the Rhetorical Hand-Book, in four sections, addressed to Herennius, it is unnecessary to speak, as it is now universally pronounced spurious.[829] The De Oratore, Brutus sive de claris Oratoribus, and Orator ad M. Brutum,[830] are the result of his matured experience. They form together one series; the principles are first laid down; their developments are carried out and illustrated; and lastly, in the Orator, he places before the eyes of Brutus the model of ideal perfection. In his treatment of this subject, he shows a mind imbued with the spirit of Plato: he invests it with dramatic interest, and transports the reader into the scene which he so graphically describes. The conversation contained in the first of these works has been already described. The scene of the second is laid on the lawn of Cicero’s palace at Rome: Cicero, Atticus and M. Brutus are the dramatis personæ; and their taste receives inspiration from a statue of Plato which adorns the garden. In the third, Cicero himself, at the request of M.
Brutus, paints, as Plato would have done, the portrait of a faultless orator. Three more short treatises must be added—(1.) The dialogue, De Partitione Oratoria,[831] an elementary book, written for his son. (2.) The De Optimo Genere Oratorum,[832] a short preface to a translation of the Greek oration, De Corona. (3.) The Topica,[833] i. e., a treatise on the commonplaces of judicial oratory.

Philosophy of Cicero.

Cicero somewhat arrogantly claims the credit of being the first to awaken a taste for philosophy, and to illuminate the darkness in which it lay hid by the light of Roman letters.[834] He did not confess the obligations under which he lay to his predecessors, because he never could forget that he was an orator.[835] He could not deny that some of them thought justly; but he denied that they possessed the power of expressing what they thought. He felt that there was nothing in the philosophical writings already existing to tempt his countrymen to study the subject: they were dry, unadorned, unpolished. It required an orator to array philosophy in an enticing garb. He proposed, therefore, to assuage his anxieties—to seek repose from the harassing cares of politics[836]—by rendering his countrymen independent of Greek philosophical literature. This was all he proposed to himself: it was all that his predecessor had attempted; nor did he pretend to originality. The periods which he devoted to the task, and to which all philosophical works belong, were those during which he was excluded from political life. The first of these was the triumvirate of Cæsar, Pompey, and Crassus; the second was coincident with the dictatorship of Cæsar and the consulship of Antony. Not only did his contemplative spirit delight in such studies, but, whilst all the avenues to distinction were closed against him, his ambition sought this road to fame, and his patriotism urged him to take this method of benefiting his country. But as he was not the first who introduced philosophy to the Romans, it will be necessary briefly to sketch its progress up to the time at which his labours commenced. Roman philosophy was neither the result of original investigation nor the gradual development of the Greek system. It arose rather from a study of ancient philosophical literature than from an examination of philosophical principles.
The Roman intellect did not possess the power of abstraction in a sufficiently high degree for research, nor was the Latin language capable of representing satisfactorily abstract thoughts. Cicero was quite aware of the poverty of its scientific nomenclature, as compared with that of Greece. In one treatise,[837] he writes,—“Equidem soleo etiam, quod uno Græci, si aliter non possum, idem pluribus verbis exprimere.” Pliny[838] and Seneca[839] assert the same fact. “Magis damnabis,” writes the latter, “angustias Romanas si scieris unam syllabam esse,
quam mutare non possim. Quæ hæc sit quæris? τὸ ὄν.” The practical character also of the people prompted them to take advantage of the material already furnished by others, and to select such doctrines as it approved, without regard to their relation to each other. The Roman philosopher, therefore, or rather (to speak more correctly) philosophical student, did not throw himself into the speculations of his age, pursue them contemporaneously, or deduce from them fresh results. He went back to the earlier ages of Greek philosophy, studied, commented on, and explained the works of the best authors, and adopted some of their doctrines as fixed scholastic dogmas. Consequently, the spirit in which philosophical study was pursued by the Romans was a literary and not a scientific one. A taste for literature had been awakened, and philosophy was considered only as one species of literature, although its importance was recognised as bearing upon the practical duties, the highest interests and happiness of man. The practical view which Cicero took of philosophy, and the extensive influence which he attributed to it, is manifest from numerous passages in his works,[840] and is imbodied in the following beautiful apostrophe in the Tusculan Disputations:[841] “O vitæ Philosophia dux! O virtutis indagatrix, expultrixque vitiorum! Quid non modo nos, sed omnino vita hominum sine te esse potuisset? Tu urbes peperisti; tu dissipatos homines in societatem vitæ convocasti; tu eos inter se primo domiciliis, deinde conjugiis, tum literarum et vocum communione junxisti; tu inventrix legum, tu magistra morum et disciplinæ fuisti; ad te confugimus, a te opem petimus; tibi nos, ut antea magna ex parte, sic nunc penitus totosque tradimus.” It is plain, therefore, that the chief characteristics of Roman philosophy would be—(1.)
Learning, for it consisted in bringing together doctrines and opinions scattered over a wide field; (2.) Generally speaking, an ethical purpose and object, for Romans would be little inclined to value any subject of study which had no ultimate reference to man’s political and social relations; (3.) Eclecticism, for although there were certain schools, such as the Epicurean and Stoic, which were evidently favourites, the dogmas of different teachers were collected and combined together often without regard to consistency.
The defects of such a system are fatal to its claim to be considered philosophical; for the scientific connexion of its parts is lost sight of, and results are presented independent of the chain of causes and effects by which they are connected with principles. Such a system must necessarily be illogical and inconsequential. Even the liberality which adopts the principle, “Nullius jurare in verba magistri,” and which, therefore, appears to be its chief merit, was absurd; and the willingness with which all views were readily admitted led to skepticism, or doubt whether such a thing as absolute truth had a real existence. Greek philosophy was probably first introduced into Rome by the Achæan exiles, of whom Polybius was one.[842] The embassy of Carneades the Academic, Diogenes the Stoic, and Critolaus the Peripatetic, followed six years afterwards. In vain the stern M. Porcius Cato caused their dismissal; for some of the most illustrious and accomplished Romans, such as Africanus, Lælius, and Furius, had already profited by their lectures and instructions.[843] Whilst the educated Romans were gaining an historical insight into the doctrines of these schools, the Stoic Panætius, who was entertained in the household of Scipio Africanus, was unfolding the mysterious and transcendental doctrines of the great object of his veneration, Plato. But although the Romans could appreciate the majestic dignity and poetical beauty of his style, they were not equal to the task of penetrating his hidden meaning; they were, therefore, content to take upon trust the glosses and commentaries of his expositors. These inclined to the New Academy rather than to the Old: in its skeptical spirit they compared and balanced opposing probabilities; and went no farther than recommending the adoption of opinions upon which they could not pronounce with certainty.
Neither did the Peripatetic doctrines meet with much favour, although the works of Aristotle had been brought to Rome by the dictator Sulla, partly, as Cicero says, because of the vastness of the subjects treated, partly because they seemed incapable of satisfactory proof to unskilled and inexperienced minds.[844] The philosophical system which first arrested the attention of the Romans, and gained an influence over their minds, was the Epicurean.[845] But it is somewhat remarkable that, although this philosophy was in its general character ethical, a people so eminently
practical in their turn of mind should have especially devoted themselves to the study of the physical speculations of this school.[846] The only apparent exception to this statement is Catius, but even his principal works, although he wrote one, “de Summo Bono,” are on the physical nature of things.[847] Cicero accounts for the popularity of Epicureanism by saying that it was easy—that it appealed to the blandishments of pleasure; and that its first professors, Amafanius and Rabirius, used none of the refinements of art or subtleties of dialectic, but clothed their discussions in a homely and popular style, suited to the simple and unlearned. There were many successors to Amafanius; and the doctrines which they taught rapidly spread over the whole of Italy. Many illustrious statesmen, also, were amongst the believers in this fashionable creed; of whom the best known are C. Cassius, the fellow-conspirator of Brutus, and T. Pomponius Atticus, the friend of Cicero. All the monuments and records, however, of the Epicurean philosophy, which were published in Latin, have perished, with the exception of the immortal work of T. Lucretius Carus, “De Rerum Naturâ.” Nor was Stoicism, the severe principles of which were in harmony with the stern old Roman virtues, without distinguished disciples; such as were the unflinching M. Brutus, the learned Terentius Varro, the jurist Scævola, the unbending Cato of Utica, and the magnificent Lucullus—a Stoic in creed, though not in life and conduct. The part which Cicero’s character qualified him to perform in the philosophical instruction of his countrymen was scarcely that of a guide: he could give them a lively interest in the subject, and reveal to them the discoveries and speculations of others, but he could not mould and form their belief, and train them in the work of original investigation.
Not being himself devoutly attached to any system of philosophical belief, he would be cautious of offending the philosophical prejudices of others. He loved learning, but his temper was undecided and vacillating: whilst, therefore, he delighted in accumulating stores of Greek erudition, the tendency of his mind was, in the midst of a variety of inconsistent doctrines, to leave the conclusion undetermined. Although he listened to various instructors—Phædrus the Epicurean, Diodotus the Stoic, and Philo the Academician—he found the eclecticism of the latter more
congenial to his taste. Its preference of probability to certainty suited one who shrunk from the responsibility of deciding. It is this personality, as it were, which gives a special interest to the Ciceronian philosophy. The reflexion of his personal character which pervades it rescues it from the imputation of being a mere transcript of his Greek originals. Cicero brings everything as much as possible to a practical standard. If the question arises between the study of morals and politics and that of physics or metaphysics, he decides in favour of the former, on the grounds that the latter transcends the capacities of the human intellect;[848] that in morals and politics we are under obligations from which in physics we are free; that we are bound to tear ourselves from these abstract studies at the call of duty to our country or our fellow-creatures, even if we were able to count the stars or measure the magnitude of the universe.[849] In the didactic method which he pursues he bears in mind that he is dealing not with contemplative philosophers, or minds that have been logically trained, but with statesmen and men of the world; he does not therefore claim too much, or make his lessons too hard, and is always ready to sacrifice scientific system to a method of popular instruction. His object seems to be to recommend the subject—to smooth difficulties, and illustrate obscurities. He evidently admires the exalted purity of Stoical morality; and the principles of that sect are those which he endeavours to impress upon his son.[850] His only fear is that their system is impracticable.[851] Cicero believed in the existence of one supreme Creator and Governor of the universe, and also in His spiritual nature;[852] but his belief is rather the result of instinctive conviction, than of the proofs derived from philosophy; for as to them, he is, as on other points, uncertain and wavering.
He disbelieved the popular mythical religion; but, uncertain as to what was the truth, he would not have that disturbed which he looked upon as a political engine.[853] Amidst the doubtful and conflicting reasons, respecting the human soul and man’s eternal destiny, there is no doubt that, although he finds no satisfactory proof, he is a believer in immortality.[854] It is unnecessary to pursue the subject of his philosophical creed any further, because it is not a system, but only a collection of precepts, not of investigations. Its materials are borrowed, its illustrations alone novel. But, nevertheless, the study of Cicero’s philosophical
works is invaluable, in order to understand the minds of those who came after him. It must not be forgotten, that not only all Roman philosophy after his time, but a great part of that of the middle ages, was Greek philosophy filtered through Latin, and mainly founded on that of Cicero. Cicero’s works on speculative philosophy generally consist of—(1.) The Academics, or a history and defence of the belief of the New Academy. (2.) The De Finibus Bonorum et Malorum, dialogues on the supreme good, the end of all moral action. (3.) The Tusculanæ Disputationes, containing five independent treatises on the fear of death, the endurance of pain, the power of wisdom over sorrow, the morbid passions, the relation of virtue to happiness. In these treatises Stoicism predominates, although opinions are adduced from the whole range of Greek philosophy. (4.) Paradoxa, in which the six celebrated Stoical paradoxes are touched upon in a light and amusing manner. (5.) A dialogue in praise of philosophy, named after Hortensius. (6.) Translations of the Timæus and Protagoras of Plato. Of these last three treatises only a few fragments remain. His moral philosophy comprehends—(1.) The De Officiis, a Stoical treatise on moral obligations, addressed to his son Marcus, at that time a student at Athens. (2.) The unequalled little essays on Friendship and Old Age. A few words also are preserved of two books on Glory, addressed to Atticus; and one which he wrote on the Alleviation of Grief when bereaved of his beloved daughter.[855] He left one theological work in three parts: the first part is on the “Nature of the Gods;” the second on the “Science of Divination;” the third on “Fate,” of which an inconsiderable fragment is extant. His office of augur probably suggested to him the composition of these treatises. His political works are two in number—the De Republica[856] and De Legibus; both are imperfect.
The remains of the former are only fragmentary; of the latter, three out of six books are extant, and those not entire. Nevertheless, sufficient of both remains to enable us to form some estimate of their philosophical character. Although he does not profess originality, but confesses that they are imitations of the two treatises of Plato, which bear the same name, still they are more inductive than any of his other treatises. His purpose is, like that of Plato, to give in the one an ideal republic, and in the other a
sketch of a model legislation; but the novelty of the treatment consists in their principles being derived from the Roman constitution and the Roman laws. The questions which he proposes to answer are, what is the best government and the best code: but the limits within which he confines himself are the institutions of his country. In the Republic he first discusses, like the Greek philosophers, the merits and demerits of the three pure forms of government; and upon the whole decides in favour of monarchy[857] as the best. With Aristotle[858] he agrees that all the pure forms are liable to degenerate,[859] and comes to the conclusion that the idea of a perfect polity is a combination of all three.[860] In order to prove and illustrate his theory, he investigates, though it must be confessed in a meager and imperfect manner, the constitutional history of Rome, and discovers the monarchical element in the consulship, the aristocratic in the senate, and the popular in the assembly of the people and the tribunitial authority. The Romans continued jealously to preserve the shadow of their constitution even after they had surrendered the substance. Nominally, the titles and offices of the old republic never perished—the Emperor was in name nothing more than (Imperator) the commander-in-chief of the armies of the republic, but in him all power centred: he was absolute, autocratic, the chief of a military despotism.[861] Cicero, as the treatise De Legibus plainly shows, saw, with approbation, that this state of things was rapidly coming to pass; that the people were not fitted to be trusted with liberty, and yet that they would be contented with its semblance and name. The method which he pursues, is, firstly, to treat the subject in the abstract, and to investigate the nature of law; and, secondly, to propose an ideal code, limited by the principles of Roman jurisprudence.
Thus Cicero’s polity and code were not Utopian—the models on which they were formed had a real tangible existence. His was the system of a practical man, as the Roman constitution was that of a practical people. It was not like Greek liberty, the realization of one single idea; it was like that of England, the growth of ages, the development of a long train of circumstances, and expedients, and experiments, and emergencies. Cicero prudently acquiesced in the ruin of liberty as a stern necessity; but he evidently thought that
Rome had attained the zenith of its national greatness immediately before the agitations of the Gracchi. Both these works are written in the engaging form of dialogues. In the one, Scipio Æmilianus, Lælius, Scævola, and others, meet together in the Latin holidays (Feriæ Latinæ), and discuss the question of government. In the other, the writer himself, with his brother Quintus and Atticus, converse on jurisprudence whilst they saunter on a little islet near Arpinum at the confluence of the Liris and Fibrena. We must, lastly, contemplate Cicero as a correspondent. This intercourse of congenial minds separated from one another, and induced by the force of circumstances to digest and arrange their thoughts in their communication, forms one of the most delightful and interesting, and at the same time one of the most characteristic, portions of Roman literature. A Roman thought that whenever he put pen to paper it was his duty, to a certain extent, to avoid carelessness and offences against good taste, and to bestow upon his friend some portion of that elaborate attention which, as an author, he would devote to the public eye. In fact the letter-writer was almost addressing the same persons as the author; for the latter wrote for the approbation of his friends, the circle of intimates in which he lived: the approbation of the public was a secondary object. The Greeks were not writers of letters: the few which we possess were mere written messages, containing such necessary information as the interruption of intercourse demanded. There was no interchange of hopes and fears, thoughts, sentiments, and feelings. The extent of Cicero’s correspondence is almost incredible: even those epistles which remain form a very voluminous collection—more than eight hundred are extant.
The letters to his friends and acquaintances (ad Familiares) occupy sixteen books; those to Atticus sixteen more; and we have besides three books of letters to Quintus, and one to Brutus; but the authenticity of this last collection is somewhat doubtful. It is quite clear that none of them were intended for publication, as those of Pliny and Seneca were. They are elegant without stiffness, the natural outpourings of a mind which could not give birth to an ungraceful idea. When speaking of the perilous and critical politics of the day, more or less restraint and reserve are apparent, according to the intimacy with the person whom he is
addressing, but no attempt at pompous display. His style is so simple that the reader forgets that Cicero ever wrote or delivered an oration. There is the eloquence of the heart, not of the rhetoric school. Every subject is touched upon which could interest the statesman, the man of letters, the admirer of the fine arts, or the man of the world. The writer reveals in them his own motives, his secret springs of actions, his loves, his hatreds, his strength, his weakness. They extend over more than a quarter of a century, the most interesting period of his own life, and one of the most critical in the history of his country. The letters to Quintus are those of an elder brother to one who stood in great need of good advice. Although Quintus was not deserving of his brother’s affection, M. Cicero was warmly attached to him, and took an interest in his welfare. Quintus was proprætor of Asia, and not fitted for the office; and Cicero was not sparing in his admonitions, though he offered them with kindness and delicacy. The details of his family concerns form not the least interesting portion of this correspondence. There is, as might be expected, more reserve in the letters ad Familiares than in those addressed to Atticus. They are written to a variety of correspondents, of every shade and complexion of opinions, many of them mere acquaintances, not intimate friends; but whilst, for this reason, less historically valuable, they are the most pleasing of the collection, on account of the exquisite elegance of their style. They are models of pure Latinity. In the letters to Atticus, on the other hand, he lays bare the secrets of his heart; he trusts his life in his hands; he is not only his friend but his confidant, his second self. Were it not for the letters of Cicero, we should have had but a superficial knowledge of this period of Roman history, as well as of the inner life of Roman society.
An elegant poetic compliment paid to Cicero by Laurea Tullius, one of his freedmen, has been preserved by Pliny.[862] The subject of it is a medicinal spring in the neighbourhood of the Academy.

Quo tua Romanæ vindex clarissime linguæ
Silva loco melius surgere jussa viret
Atque Academiæ celebratam nomine villam
Nunc reparat cultu sub potiore Vetus:
Hic etiam adparent lymphæ non ante repertæ
Languida quæ infuso lumina rore levant.
Nimirum locus ipse sui Ciceronis honori
Hoc dedit hac fontes cum patefecit opes
Ut quoniam totum legitur sine fine per orbem
Sint plures oculis quæ medeantur, aquæ.

Father of eloquence in Rome,
The groves that once pertained to thee
Now with a fresher verdure bloom
Around thy famed Academy.
Vetus at length this favoured seat
Hath with a tasteful care restored;
And newly at thy loved retreat
A gushing fount its stream has poured.
These waters cure an aching sight;
And thus the spring that bursts to view
Through future ages shall requite
The fame this spot from Tully drew.
Elton.

The correspondents of Cicero included a number of eminent men. Atticus was the least interesting, for his politic caution rendered him unstable and insincere; but there was Cassius the tyrannicide; the Stoical Cato of Utica; Cæcina, the warm partisan of Pompey; the orator Cælius Rufus; Hirtius and Oppius, the literary friends of Cæsar; Lucceius the historian; Matius the mimiambic poet; and that patron of arts and letters,[863] C. Asinius Pollio. Pollio was a scion of a distinguished house, and was born at Rome B.C. 76.[864] Even as a youth he was distinguished for wit and sprightliness;[865] and at the age of twenty-two was the prosecutor of C. Cato. He was with Cæsar at the Rubicon, at Pharsalia, in Africa, and in Spain; and was finally intrusted with the conduct of the war in that province against Sextus Pompey. On the establishment of the second triumvirate, Pollio, after some hesitation, sent in his adhesion; and Antony intrusted him with the administration of Gallia Transpadana, including the allotment of the confiscated lands among
the veteran soldiers. He thus had opportunity of protecting Virgil and saving his property. In B.C. 40, Octavian and Antony were reconciled at Brundisium by his mediation. A successful campaign in Illyria concluded his military career with the glories of a triumph,[866] and he then retired from public life to his villa at Tusculum, and devoted himself to study. He enjoyed life to the last, and died in his eightieth year. He left three children, one of whom, Asinius Gallus,[867] wrote a comparison between his father and Cicero, which was answered by the Emperor Claudius.[868] In oratory, poetry, and history, Pollio enjoyed a high reputation among contemporary critics, and yet none of his works have survived. The solution of this difficulty may, perhaps, be found in the following circumstances:—1. His patronage of literary men rendered him popular, and drew from the critics a somewhat partial verdict. His kindness caused Horace to extol[869] him, and Virgil to address to him his most remarkable eclogue.[870] 2. His taste was formed before the new literary school commenced. He had always a profound admiration for the old writers, and frequently quoted them. His style probably appeared antiquated and pedantic, and, therefore, never became generally popular. A later writer[871] says, that he was so harsh and dry as to appear to have reproduced the style of Attius and Pacuvius, not only in his tragedies, but also in his orations. Quintilian observes,[872] that he seemed to belong to the pre-Ciceronian period. Niebuhr, who could only form his opinion upon the slight fragments preserved by Seneca, for the three letters in Cicero’s collection[873] are only despatches, affirms that he seems to stand between two distinct generations,[874] namely, the literary periods of Cicero and Virgil. His great work was a history of the civil wars, in seventeen books. He pretended to be a critic, but his criticism was fastidious and somewhat ill-natured.
He found blemishes in Cicero, inaccuracies in Cæsar, pedantry in Sallust, and provincialism (Patavinitas) in Livy. The correctness of his judgment respecting the charming narratives of the great historian has been assumed from generation to generation, yet no one can discover in what this Patavinity consists. It was easier to find fault than to write correctly; for, whilst all the labours of the critic have perished, Cicero, Cæsar, Sallust, and Livy are immortal. Vehemence and passion developed his character.