Data	Curation
Why	should	we	care?
Yasmin	AlNoamany
Old	Dominion	University
Web	Science	and	Digital	Libraries	Group	
ws-dl.cs.odu.edu
@yasmina_anwar @WebSciDL
Presented for CLIR Postdoctoral Fellow for Data Curation at
Vanderbilt University
About	me
Academic	degrees
Yasmin	AlNoamany
Ph.D.	Candidate	at	ODU
yasmin@cs.odu.edu
• Bachelor's degree in Computer Science
• Master's degree in Computer Science
• Doctor of Philosophy in Computer Science
Old Dominion University (2011-2016)
• Research Assistant: integrating the past with the present, “Storytelling for Summarizing Collections in Web Archives”
• Teaching Assistant
Archived collections
Storytelling services
Archived enriched stories
Internet Archive (Summer 2014 to Fall 2014)
• Log analysis
• Tools for managing seed URIs
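The log analysis listed above can be sketched as a small parser. This is a minimal sketch, assuming the field layout visible in the sample entries on this slide (client IP, bracketed timestamp, quoted request, status code, response size, referrer, user agent). The names `LogEntry`, `LOG_PATTERN`, and `parse_line` are hypothetical and are not part of any Internet Archive tooling.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the sample lines on this slide only (assumption);
# real Wayback Machine access logs may carry additional fields.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


class LogEntry(NamedTuple):
    ip: str
    timestamp: datetime
    method: str
    uri: str
    status: int
    size: int
    referrer: str
    agent: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one access-log line; return None if it does not match."""
    m = LOG_PATTERN.match(line.strip())
    if m is None:
        return None
    return LogEntry(
        ip=m["ip"],
        timestamp=datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S"),
        method=m["method"],
        uri=m["uri"],
        status=int(m["status"]),
        size=int(m["size"]),
        referrer=m["referrer"],
        agent=m["agent"],
    )


# One of the sample entries from the slide, joined onto a single line.
sample = (
    '0.151.147.108 [02/Feb/2012:00:01:03] "GET '
    'http://web.archive.org/web/20100102003557/about:blank HTTP/1.1" '
    '302 0 "www.xx.com" "Mozilla/4.0"'
)
entry = parse_line(sample)
```

Once each line is a structured record, questions such as which requests were redirects or which user agents dominate become simple filters over `LogEntry` values.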
0.11.160.135 [02/Feb/2012:00:01:03] "GET http://web.archive.org/web/20070519015308im_/http://www.jcdl.org/images/jcdl2007-edie.jpg HTTP/1.1" 200 2137 "-" "Mozilla/5.0"
0.11.160.135 [02/Feb/2012:00:01:03] "GET http://staticweb.archive.org/images/toolbar/wayback-toolbar-logo.png HTTP/1.1" 200 3700 "-" "Mozilla/5.0"
0.151.147.108 [02/Feb/2012:00:01:03] "GET http://web.archive.org/web/20100102003557/about:blank HTTP/1.1" 302 0 "www.xx.com" "Mozilla/4.0"
Personal
• Women in Tech communities: @anitaborg, @systers, @arabwic
• Photography
• Mom to an adorable 7-year-old
Awards and publications
• Best Teaching Award
• Best Student Paper Award
• 9 papers, 3 of which are journal articles
Data	Curation
Why	should	we	care?
Data	Management
Data	Management
How	I	got	the	logs	from	the	IA
Source: http://www.tamr.com/real-data-scientists-enterprise/
Even if we save the data, how will it be shared and re-used?
Metadata	is	important
The	call	for	a	revolution	in	Egypt
• It all started on Facebook
Multiple initiatives for documenting the Egyptian Revolution
Several studies and books about the Egyptian Revolution
These studies and books cited these sites
They do not exist anymore!
Data preservation is important for posterity
• A year after the Egyptian Revolution, 11% of the social media documentation is gone.
Source: http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html
Data Curation is important for scholarly research
• Managing your research data saves time
• Universities and other research organizations invest very large sums of money into research activities
• Digital data is inherently prone to loss
• Future access to valuable digital assets depends upon curation/preservation actions taken today
• Funding agency requirements
• Research data should be shared and publicly accessible:
  • Increase the impact of your research
  • Make attribution easy
  • Call for accountability and transparency
  • Permits others to replicate the findings of a study
• Scholarly communication chain: connecting data to publication
What	is	Data	Curation?
• “Data curation is the active and ongoing management of research data through its lifecycle of interest and usefulness to scholarship, science, and education.” – Carole Palmer, UIUC GSLIS
• Data	management
• Adding	value	to	data	
• Data	preservation	for	later	re-use
DCC	Curation	life	cycle
Source: http://www.dcc.ac.uk/resources/curation-lifecycle-model
CONCEPTUALIZE
Step-by-step instructions and templates for creating, publishing, and sharing data management plans that satisfy funding agency mandates
CREATE	OR	RECEIVE
A	collaborative	working	space	and	data-sharing	platform
APPRAISE	&	SELECT
Identification,	Validation,	Characterization
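As an illustration of the identification step, a file's format can often be recognized from its leading "magic" bytes rather than its extension. This is a minimal sketch under that assumption. The `SIGNATURES` table and the `identify_format` function are hypothetical and cover only a handful of well-known signatures, so the sketch is not a substitute for a dedicated format-identification tool.

```python
from typing import Optional

# A few well-known file signatures (magic bytes). Illustrative only:
# a real appraisal workflow would rely on a maintained format registry.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"PK\x03\x04": "ZIP container",
}


def identify_format(data: bytes) -> Optional[str]:
    """Return a format name if the leading bytes match a known signature."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None  # unknown format: flag for manual appraisal
```

Identification is only the first of the three steps. Validation and characterization would then check that the file conforms to the identified format and record its technical properties.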
INGEST
• Handle a wide variety of transfer processes
• Assure the availability of the research data across institutions and publishers and keep it discoverable
PRESERVATION	ACTION
• Extract metadata in XML format
• Create a checksum (hash) for the data objects
• Facilitate	data	discovery	and	re-use
• Raise	interest	in	your research
• Facilitate	preservation
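The first two actions above, extracting metadata as XML and creating a checksum, can be sketched with the Python standard library. This is a minimal sketch, not a preservation-grade implementation: the element names (`object`, `name`, `sizeBytes`, `fixity`) are made up for illustration and do not follow any particular metadata standard, and SHA-256 is an assumed choice of hash algorithm.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 checksum, reading the file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe(path: Path) -> str:
    """Build a small XML record for a data object (hypothetical schema)."""
    root = ET.Element("object")
    ET.SubElement(root, "name").text = path.name
    ET.SubElement(root, "sizeBytes").text = str(path.stat().st_size)
    fixity = ET.SubElement(root, "fixity", algorithm="SHA-256")
    fixity.text = sha256_of(path)
    return ET.tostring(root, encoding="unicode")
```

Recomputing the checksum later and comparing it with the stored value is a simple way to detect silent corruption of a stored object.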
STORE
• Get credit for your data and build your reputation.
• Your data is discoverable and can be attributed to you.
• Other researchers can find data associated with a publication and explore new ways to use it.
ACCESS,	USE	&	REUSE
TRANSFORM
Migrate the data into another format
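A format migration can be as simple as re-serializing tabular data. The sketch below converts CSV text to JSON using only the standard library. The function name `csv_to_json` is hypothetical, and the choice of CSV and JSON as source and target formats is an assumption made for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (with a header row) to a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Values stay as strings: csv does not infer types, so any type
    # conversion should be an explicit, documented migration decision.
    return json.dumps(rows, indent=2)
```

For example, `csv_to_json("id,name\n1,a\n")` yields a JSON array holding one object with the keys `id` and `name`. Keeping the original file alongside the migrated copy lets the transformation be audited or redone later.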
Summary
• What is Data Curation?
  • Annotation
  • Management
  • Validation
  • Preservation
  • Sharing
  • Access and Re-use
  • Authentication
• Why do we need Data Curation?
  • Long-term access
  • Re-use
  • Interoperability
  • Reproducibility
  • Cost-effective
  • Time-saving
  • Creditability
  • Accountability
“Data curation systems should be integrated with the active research phase”
Yasmin AlNoamany
Old Dominion University
Web Science and Digital Libraries Group
ws-dl.cs.odu.edu
http://www.cs.odu.edu/~yasmin/
https://www.linkedin.com/in/yasminalnoamany
https://github.com/yasmina85/
@yasmina_anwar @WebSciDL
Backup	Slides
Data & Complexity
• Research problems are increasingly interdisciplinary and complex
• Collaboration requires open sharing of data
• Data are highly heterogeneous and largely incompatible in their native forms
• The semantics and contexts within which data are gathered and interpreted are important to preserve
(Comic from The Official Dilbert Store)
http://www.christianitytoday.com/edstetzer/2015/february/3-ways-social-media-benefits-church-leaders.html
