Memento Tracer
@mart1nkle1n @hvdsomp
The web that was, Amsterdam, NL, June 20 2019
Martin Klein
Los Alamos National Laboratory
@mart1nkle1n
Herbert Van de Sompel
DANS
@hvdsomp
Memento Tracer
An Innovative Approach Towards Balancing
Scale and Fidelity for Web Archiving
Background: Scholarly Orphans Project
The Scholarly Orphans project
is funded by the Andrew W. Mellon Foundation
Scholarly Orphans Team
• Los Alamos National Laboratory:
  • Lyudmila Balakireva
  • Martin Klein
  • James Powell
  • Harihar Shankar
  • Herbert Van de Sompel (now at DANS)
• Old Dominion University:
  • Sawood Alam
  • Grant Atkins (now at MITRE)
  • Shawn Jones
  • Mat Kelly
  • Michael L. Nelson
Research and Research Communication on the Web
• Consideration
  • Researchers deposit artifacts in web platforms
• Status quo: not systematically archived
  • No frameworks like LOCKSS/Portico exist for these artifacts
  • Researchers only selectively deposit artifacts in portals that provide archival guarantees, in order to obtain a citable DOI
  • Researchers cannot be expected to (also) upload all artifacts to institutional repositories (IRs)
  • Web archives only incidentally archive these artifacts, cf. anecdotal and Hiberlink project evidence
Emma Schymanski
https://orcid.org/0000-0001-6868-8145
https://github.com/schymane
https://www.slideshare.net/EmmaSchymanski
https://figshare.com/authors/Emma_Schymanski/5087039
https://publons.com/author/1538491/emma-schymanski#profile
https://www.eawag.ch/en/aboutus/portrait/organisation/staff/profile/emma-schymanski/
Emma’s SlideShare Artifact: 0 Mementos
https://www.slideshare.net/EmmaSchymanski/dmcm2018-community-resources-connecting-chemistry-and-toxicity-knowledge
http://timetravel.mementoweb.org/
Shawn Jones
https://orcid.org/0000-0002-4372-870X
http://www.shawnmjones.org/
https://github.com/shawnmjones
https://www.slideshare.net/shawnmjones
https://en.wikipedia.org/wiki/User:Shawnmjones
https://www.blogger.com/profile/17827543974149663194
Shawn’s GitHub Artifact: 1 Memento
https://github.com/shawnmjones/mediawiki
https://web.archive.org/web/*/https://github.com/shawnmjones/mediawiki
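The Memento counts on the two artifact slides above come from looking up each artifact's URI in web archives. As a minimal sketch, assuming a link-format TimeMap as defined by the Memento protocol (RFC 7089) has already been retrieved as text, such a count could be computed as follows. The sample TimeMap, its URIs, and its datetime are made up for illustration, and this is not the code behind the Time Travel service.

```python
import re

# One link-value of a link-format TimeMap: <URI> followed by ;name="value" pairs.
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM = re.compile(r'([\w-]+)="([^"]*)"')


def count_mementos(timemap_text):
    """Count link-values whose rel attribute contains the token 'memento'."""
    count = 0
    for match in LINK_VALUE.finditer(timemap_text):
        params = dict(PARAM.findall(match.group(2)))
        # rel may hold several tokens, e.g. "first last memento".
        if "memento" in params.get("rel", "").split():
            count += 1
    return count


# Hypothetical TimeMap for illustration only; all URIs and the datetime are invented.
SAMPLE_TIMEMAP = (
    '<https://example.org/artifact>; rel="original",\n'
    '<https://archive.example/timemap/link/https://example.org/artifact>; '
    'rel="self"; type="application/link-format",\n'
    '<https://archive.example/timegate/https://example.org/artifact>; rel="timegate",\n'
    '<https://archive.example/web/20180601000000/https://example.org/artifact>; '
    'rel="first last memento"; datetime="Fri, 01 Jun 2018 00:00:00 GMT"'
)

print(count_mementos(SAMPLE_TIMEMAP))  # 1
```

A TimeMap that lists only the original resource and a TimeGate yields a count of 0, which corresponds to the "0 Mementos" case.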
Hiberlink Evidence
Web resources referenced in Elsevier corpus (1996-2012)
without representative Memento in public web archives
Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0115253
Scholarly Orphans Project
How to faithfully capture Scholarly Orphans
for long-term archiving?
Web Archiving: Scale vs. Fidelity
Web Archiving: Scale!
https://twitter.com/brewster_kahle/status/1016003169589981184
Web Archiving: Scale!!
https://twitter.com/brewster_kahle/status/1118172506777509890
Web Archiving: Scale!!!
https://twitter.com/brewster_kahle/status/1139700494748663809
Web Archiving: Fidelity?
https://ws-dl.blogspot.com/2017/01/2017-01-20-cnncom-has-been-unarchivable.html
http://web.archive.org/web/*/http://cnn.com
Web Archiving: Fidelity!!
https://twitter.com/ianmilligan1/status/1136703505442324481
https://twitter.com/MellonFdn/status/1138811967060267011
Web Archiving: Scale?
https://twitter.com/mart1nkle1n/status/1136705116738904067
Resource Boundary
https://www.slideshare.net/hvdsomp/paul-evan-peters-lecture
Resource Boundary
https://github.com/mementoweb/memento_extensions
Memento Tracer Framework
http://tracer.mementoweb.org
Inspired by:
• LOCKSS
  • Same automated approach for resources of a class
• Webrecorder
  • Manual recording of web resources
• Various attempts aimed at automating interactions/behaviors
  • E.g., Brozzler, Browsertrix
Memento Tracer DEMO
http://tracer.mementoweb.org
Current Memento Tracer Capabilities
• Single clicks/links
• All links in an area
• Repeated click on links, with stop condition
  • Slides
  • Pagination
• Nested traces, i.e., “trace in a trace”
  • Trace for portal A → follow link to portal B → execute trace for portal B
• Identification of page/portal for which a trace exists by URI (pattern)
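How a trace might be selected by URI pattern, including the nested case where the trace for portal A follows a link into portal B, can be sketched in Python. The Trace record, the registry, the patterns, and the action names are all hypothetical; the slides do not show the actual Tracer trace format.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical trace record, not the real Tracer format."""
    name: str
    uri_pattern: str                              # regex identifying a class of pages
    actions: list = field(default_factory=list)   # illustrative action labels


# Hypothetical registry of traces; patterns are assumptions for illustration.
TRACES = [
    Trace("slideshare-deck", r"^https://www\.slideshare\.net/[^/]+/[^/]+$",
          ["click_until:next-slide"]),
    Trace("github-repo", r"^https://github\.com/[^/]+/[^/]+$",
          ["click_all:file-list"]),
]


def select_trace(uri, traces=TRACES):
    """Return the first trace whose URI pattern matches, or None."""
    for trace in traces:
        if re.match(trace.uri_pattern, uri):
            return trace
    return None


def plan_capture(start_uri, outlinks, traces=TRACES):
    """Pair the start page, and each followed link that has its own trace, with a trace name."""
    start = select_trace(start_uri, traces)
    if start is None:
        return []  # no trace exists for the start page
    plan = [(start_uri, start.name)]
    for link in outlinks:
        nested = select_trace(link, traces)
        if nested is not None:
            plan.append((link, nested.name))  # "trace in a trace"
    return plan


plan = plan_capture(
    "https://www.slideshare.net/hvdsomp/paul-evan-peters-lecture",
    ["https://github.com/mementoweb/memento_extensions", "https://example.org/other"],
)
print(plan)
```

In this sketch the first matching pattern wins, so more specific patterns would have to be listed before general ones; selecting a trace by means other than URI pattern is left open.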
Memento Tracer Benefits
• Scalability
  • Trace created once is applicable to all web resources of the same class
  • Traces shared via repository (edits, versioning)
• Quality
  • Trace used as set of instructions for browser-based capture framework
  • Resource boundary explicit
• Tradeoff
  • Quality vs. performance
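The scalability and quality points can be illustrated with a toy simulation: one "repeated click with stop condition" routine is applied unchanged to two resources of the same class, and each run returns the explicit list of URIs that make up the resource boundary. The real framework drives a browser; here pages are simulated as dictionaries, and every name and URI is a made-up assumption.

```python
def run_click_until(start_uri, next_link, max_clicks=1000):
    """Follow the 'next' link repeatedly.

    Stops when there is no next link, when a page would be revisited,
    or after max_clicks clicks. Returns the visited URIs in order.
    """
    boundary = [start_uri]
    seen = {start_uri}
    current = start_uri
    for _ in range(max_clicks):
        nxt = next_link.get(current)
        if nxt is None or nxt in seen:
            break  # stop condition reached
        boundary.append(nxt)
        seen.add(nxt)
        current = nxt
    return boundary


# Two hypothetical paginated resources of the same class,
# each simulated as a mapping from a page to its "next" page.
DECK_A = {"a/1": "a/2", "a/2": "a/3"}
DECK_B = {"b/1": "b/2"}

print(run_click_until("a/1", DECK_A))  # ['a/1', 'a/2', 'a/3']
print(run_click_until("b/1", DECK_B))  # ['b/1', 'b/2']
```

The max_clicks cap is one way to express the quality vs. performance tradeoff: a lower cap finishes sooner but may cut the boundary short.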
Memento Tracer Challenges
• Memento Tracer:
  • Language used to express Traces (interoperability)
  • Organization of the shared repository for Traces
  • Limitations of the browser event listener approach for recording Traces
  • Selection of a Trace for capturing a web publication by means other than URI pattern
• Legal constraints
myresearch.institute - Pilot
For more details and statistics, see our 2019 CNI Spring meeting slides:
https://www.slideshare.net/martinklein0815/an-institutional-perspective-to-rescue-scholarly-orphans
