Webrecorder:
Web archiving for all!
ARLIS/NA
February 26 & 27, 2018
Anna Perricci
Webrecorder / Rhizome
Web archiving fundamentals
• Web archiving: the process of selecting, capturing, saving and making
accessible content that is available online (e.g. websites)
• Web archiving is a new and growing field and we need people with new ideas
and evolving skill sets
• Web archiving has a distinct lack of ‘silver bullets’ or comprehensive
one-size-fits-all solutions
About Webrecorder
Create high-fidelity, interactive captures of any web pages you browse
http://webrecorder.io Webrecorder Player App
A project by
with generous support from
Webrecorder Project
● Robust tools
● Free to use
● Fully open source
● Using open standards
● Growing user community
● Quickly evolving
Webrecorder Team
Dragan Espenschied
Rhizome's Digital Conservator
Ilya Kreymer
Lead developer & Creator
Mark Beasley
Senior Front-End Developer
Pat Shiu
Design Lead
Anna Perricci
Associate Director of
Strategic Partnerships
High fidelity web collecting (archiving)
• Capture any web page loaded in the browser
• Archive interactive content (only available after user input)
• Same system for recording and playback (web browser)
Collecting at human scale
• Webrecorder: web archiving for all!
• Collecting is done by a person via a web browser one page at a time
• Can import and augment collections created by crawlers
The payoff for careful capture is an
accurate representation of the original
Record=capture / replay=browse
• Webrecorder.io is used to make interactive captures of web pages as users
see them while archiving; it is not screen recording software that plays
recordings back like a video
• Replay means you can access the content captured in the web archive and
browse it interactively like the live web (or a bit like a slideshow with arrow
buttons)
Symmetrical archiving
Browsing a bound archive
• Each collection is a separate unit so at this time you can only navigate
content within one collection at a time
• This gives tight curatorial control, though the boundaries of the collection
can sometimes be hit quickly
Patching with Open Web Archives & Live Web
• What is Patching? – Filling in missing resources in an archive using other
sources
• Other sources = other web archives and/or the live web
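The idea of patching can be sketched in a few lines of Python. This is only an illustration of the fallback logic described above, not Webrecorder's actual code: the function `patch_resource`, the `Source` type, and the toy dictionaries standing in for "another web archive" and "the live web" are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a source takes a URL and returns the captured bytes,
# or None when it holds no copy of that resource.
Source = Callable[[str], Optional[bytes]]


def patch_resource(
    url: str,
    collection: Dict[str, bytes],
    fallbacks: List[Tuple[str, Source]],
) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (body, origin) for url, filling gaps from fallback sources."""
    if url in collection:
        return collection[url], "collection"
    for name, source in fallbacks:
        body = source(url)
        if body is not None:
            collection[url] = body  # the gap is now filled for future replay
            return body, name
    return None, None


# Toy stand-ins for "another web archive" and "the live web" (assumptions).
other_archive = {"http://example.com/logo.png": b"old-logo"}
live_web = {"http://example.com/style.css": b"body{}"}
my_collection = {"http://example.com/": b"<html>home</html>"}
sources = [("other archive", other_archive.get), ("live web", live_web.get)]

logo = patch_resource("http://example.com/logo.png", my_collection, sources)
css = patch_resource("http://example.com/style.css", my_collection, sources)
missing = patch_resource("http://example.com/gone.js", my_collection, sources)
```

In this sketch the sources are tried in order, so listing other archives before the live web prefers older captures; a resource that no source holds simply stays missing.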
Importing Content from Open Web Archive
• Extraction is the importation of content from other open web archives
• Content can be extracted from any archive listed in the public-web-archives repository
Preconfigured browsers
• Using a preconfigured browser to capture and replay web content that may
not be supported in current or future web browsers
• e.g. Java applets or Flash
• Access with a preconfigured browser ensures greater faithfulness to the
original look and feel of web pages
• Browsers use HTTP proxy mode = even better fidelity
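One plausible reading of why proxy mode helps fidelity: without a proxy, a replay system usually has to rewrite the links in a page so they point back into the archive, and rewriting can miss URLs that scripts build at run time. The Python sketch below illustrates that limitation only; the prefix `ARCHIVE_PREFIX`, the function `rewrite_links`, and the simple regular expression are assumptions made for the example and do not describe Webrecorder's internals.

```python
import re

# Hypothetical replay prefix; a real system would use its own URL scheme.
ARCHIVE_PREFIX = "https://archive.example/my-collection/2018/"

# Matches absolute http(s) URLs inside double-quoted href or src attributes.
LINK_PATTERN = re.compile(r'(href|src)="(https?://[^"]+)"')


def rewrite_links(html: str, prefix: str = ARCHIVE_PREFIX) -> str:
    """Point href/src attributes at the archive by prefixing their URLs."""
    return LINK_PATTERN.sub(
        lambda m: f'{m.group(1)}="{prefix}{m.group(2)}"', html
    )


page = (
    '<a href="http://example.com/next">next</a>'
    '<script>load("http://example.com/data.json")</script>'
)
rewritten = rewrite_links(page)
```

Here the link in the `href` attribute is redirected into the archive, but the URL inside the script is left untouched and would still reach out to the live web. When the browser sends every request through an HTTP proxy, pages can be served at their original URLs and no rewriting of this kind is needed.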
Recording and replaying Flash content
What about social media?
• Webrecorder can capture content from social media sites, and works
especially well with Instagram and Twitter
• Some websites deliver content individualized for each user
• Webrecorder can record the content you see when you are logged in to a
social media profile
Account login is optional
• One does not need to log in to use Webrecorder to capture web content
(though we do recommend it!)
• Users can download the captures right away (as a WARC file) & save
them locally
• For continued access to archived content online & to be able to add to a
collection, one must create and log in to a free account
Access & sharing options
• User-created collections can be kept private or made public through
Webrecorder.io
• Public collections can be viewed by anyone
• Finer access controls are being considered
Webrecorder Sample
Collections
https://webrecorder.io/wrsc
Webrecorder Player
• Desktop application for macOS, Windows and Linux
• User-friendly application to browse any web archive (saved in standard WARC
format)
• Can browse web archives offline, no internet connection required!
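For readers curious about what is inside a WARC file, here is a minimal Python sketch that walks the records of an uncompressed WARC stream using only the standard library. It is a simplified illustration, not how Webrecorder Player works: the helper names `build_record` and `iter_warc_records` are hypothetical, and the sample records omit mandatory fields such as WARC-Record-ID and WARC-Date for brevity. Real files are usually gzip-compressed and are better handled with a dedicated WARC library.

```python
import io
from typing import Dict, Iterator, Tuple


def build_record(record_type: str, target_uri: str, block: bytes) -> bytes:
    """Build a simplified WARC record (some mandatory fields are omitted)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {record_type}\r\n"
        f"WARC-Target-URI: {target_uri}\r\n"
        f"Content-Length: {len(block)}\r\n"
        "\r\n"
    ).encode("utf-8")
    # Each record ends with two CRLF pairs after the content block.
    return head + block + b"\r\n\r\n"


def iter_warc_records(stream) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, block) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if line.strip() == b"":
            continue  # blank lines separating records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line")
        headers: Dict[str, str] = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # Content-Length says exactly how many bytes the block occupies.
        yield headers, stream.read(int(headers["Content-Length"]))


http_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>hi</h1>"
sample = (
    build_record("request", "http://example.com/", b"GET / HTTP/1.1\r\n\r\n")
    + build_record("response", "http://example.com/", http_response)
)
records = list(iter_warc_records(io.BytesIO(sample)))
```

Each record carries its own headers and the raw bytes that were exchanged, which is why one file can be replayed by any tool that understands the standard.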
Using Webrecorder
Hosted Service
Sign up at https://webrecorder.io/ for a free account
Run your own Webrecorder instance
Install from https://github.com/webrecorder/webrecorder-deploy
Use Webrecorder Player on your Desktop
Download from https://github.com/webrecorder/webrecorderplayer-electron
Toosheh project
https://www.netfreedompioneers.org/toosheh1
The (Obama) White House
Social Media Archive
http://archive.rhizome.org/narrative-archives/thxobama.html
Net Art Anthology: Marisa Olson
https://anthology.rhizome.org/marisa-s-american-idol-audition-training-blog
Rhizome net art Microgrants
http://rhizome.org/editorial/2017/jul/18/open-call-rhizome-microgrants-2017/
Ethics & Archiving the Web
Hope to see you tomorrow!
Thank you

Editor's Notes

  • #20: Demo this in the browser. Use as an example: http://howtoappearofflineforever.online/
  • #22: Show patching in action.
  • #24: Show extracting in action.
  • #27: Flash-based interactive documentary – At Home – from NFB (National Film Board of Canada). Demo example.
  • #28: Demo recording of Twitter.
  • #32: Demo with the player – later on in the Workshop.