Web 1.0, Web 2.0 and Digital Preservation Brian Kelly UKOLN University of Bath Bath, UK Email [email_address] UKOLN is supported by: http://www.ukoln.ac.uk/web-focus/events/workshops/mla-london-2008-07/ About This Talk A recap of Web preservation challenges and approaches to the preservation of Web content. But will use of Web 2.0 services lead to new preservation concerns? How much of a concern is this? And what steps can be taken to minimise the risks of data loss? This work is licensed under an Attribution-NonCommercial-ShareAlike 2.5 licence (but note caveat) Resources bookmarked using 'mla-london-2008-07' tag
Contents What’s the Problem? Disappearing domains Disappearing data Broken services Preservation and Web 1.0 Case studies Toolkit Preservation and Web 2.0 Third party services Communications rather than resources
Is Web Site Preservation An Issue? Digital Resources Don't Rot Digital resources (images, video, software, Web sites, …) don't degrade due to environmental factors.  This is a key difference with physical resources. Web sites are made from various digital resources: HTML pages, GIF, JPEG, etc. image files, PDF resources, software (scripts, JavaScript, etc.) These won't degrade so why is Web site preservation an issue? Isn't the fact that old Web sites won't disappear and may be embarrassing more of a challenge? The Problem
Digital Resources Do Rot! In fact digital resources do 'rot': Operating systems are upgraded and existing applications cease to work Security holes are identified and there is a need to install patches Resources may be dependent on external resources (e.g. links, news feeds, …) which may disappear Resources may be hosted by external services and there is a need for ongoing funding for the hosting … The Problem
Preservation In A Web 1.0 World The Web 1.0 environment: Static content Managed by organisation The challenges: Changes in funding “Mothballing” Legal issues (not covered!) … Web 1.0
The Nightmare Scenario To be avoided: The funding finishes Project staff leave, partnership dissolves Hosting agency upgrades operating system, breaking the scripts that access resources from the backend database User finds page with invitation to project launch and travels to meeting. Unfortunately the event took place in 2002 Invoice for domain name is not paid, as administrator has left Web site domain taken over by porn company Prime Minister picks up pen containing project URL and visits pornographic Web site Web 1.0
It Has Happened Webtechs.com Software company which hosted early HTML validation service In 1998/99 confusion over payment of domain name March 1999 company receives many messages saying validation service is now a porn site Over 30,000 links to Web site! Sept 1999 porn company agrees to sell domain name back to Webtechs Web 1.0
Technical Issues Standards And Formats Has the Web site been designed using open standards, which should help future-proofing? Have proprietary formats been used (for which backwards compatibility may not be considered)? Architecture & Implementation Has the technical architecture of the Web site been documented? Can I continue to use technical systems after funding has finished? Web 1.0 Note that in reality content owners may have little control over the formats used and the technical architecture.
Content Issues Accuracy: Is the content of my Web site accurate today – and tomorrow? Could the content of my Web site be misleading? Usability: Are links working today – and tomorrow? Legal: Is my Web site legal today (accessibility; copyright; defamation; IPR; …)? Will my Web site be legal tomorrow, if new legislation is enacted? Web 1.0 Note that in reality, rather than necessarily taking a safe position over, say, legal issues, a risk assessment approach may be taken.
Mothballing Your Web Site (1) Before funding finishes you should take steps for the mothballing of your Web site: Run a link check across the Web site. Fix broken internal links and as many external links as is reasonable. Document the link report. Run HTML (and CSS) validation checks across the Web site. Fix as many invalid pages as is reasonable. Document the findings. Run an accessibility check across the Web site. Fix as many inaccessible pages as is reasonable. Document the findings. This should not be an onerous task if you have been following best practices. Note that errors found later occurred after your funding finished. Web 1.0
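The link check described above could be sketched roughly as follows. This is a minimal illustration only, not the tool used by the author: it assumes the pages have already been gathered into a dictionary of URL to HTML text, and the names `LinkCollector` and `check_links` are hypothetical. External links are only listed, since testing them needs network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href/src targets found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_links(pages, base_url):
    """Return (broken_internal, external) for a dict of {url: html}.

    Internal links are resolved against the page URL and reported as
    broken when the target is not one of the known pages.
    """
    site = urlparse(base_url).netloc
    broken, external = [], []
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Resolve relative links and ignore fragment identifiers.
            target = urljoin(page_url, link).split("#")[0]
            parsed = urlparse(target)
            if parsed.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, etc.
            if parsed.netloc != site:
                external.append((page_url, target))
            elif target not in pages:
                broken.append((page_url, target))
    return broken, external
```

The two lists returned can be saved alongside the site as the documented link report.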
Mothballing Your Web Site (2) You should also address technical areas: Remove any backend scripts which are no longer needed (e.g. online booking forms for old events). Remember that scripts, etc. are liable to go wrong. Ensure that applications are configured to break gracefully and provide meaningful errors, for example: The config.ssi is missing. This should be reported to the systems administrator (email administrator@foo.org.uk or ring +44 020 123 123. Please provide the URL of the broken page and the project name) rather than a cryptic message such as: Apache error 6963 Web 1.0
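As an illustration of "breaking gracefully", the sketch below catches a missing configuration file and returns a message that tells the reader what to report and to whom. The function name `load_config` and its parameters are hypothetical, and the contact address simply echoes the example on the slide.

```python
from pathlib import Path

# Hypothetical contact details, echoing the slide's example message.
ADMIN_EMAIL = "administrator@foo.org.uk"


def load_config(path, page_url, project):
    """Return (config_text, error_message); exactly one of them is None."""
    try:
        return Path(path).read_text(encoding="utf-8"), None
    except OSError:
        # Explain what failed and what the user should do about it,
        # instead of surfacing a bare error code.
        message = (
            f"The file {Path(path).name} is missing or unreadable. "
            f"Please report this to the systems administrator ({ADMIN_EMAIL}), "
            f"quoting the page {page_url} and the project name '{project}'."
        )
        return None, message
```

A generic role address is used here for the same reason the slides recommend it elsewhere: it should outlive individual members of staff.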
Mothballing Your Web Site (3) You should also address the content of your Web site: Clarify the status of the Web site on the home page. Ensure the tense of the content reflects the position i.e. don't say "This project will …" Ensure that contact details will remain valid i.e. provide generic email addresses not an individual's Remember that many users will arrive deep in your Web site (e.g. using Google). If necessary use CSS to flag all pages with a watermark: This Web site is no longer maintained. See home page for details See <http://www.ukoln.ac.uk/qa-focus/documents/briefings/briefing-04/> Web 1.0
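The slide suggests a CSS watermark. Where pages do not share a stylesheet, one alternative is to stamp a notice into each page, which a CSS rule can then style. The sketch below assumes static HTML files; the function name `add_notice` and the class name `archived-notice` are hypothetical.

```python
import re

# Notice text taken from the slide; the markup around it is an assumption.
NOTICE = (
    '<div class="archived-notice">This Web site is no longer maintained. '
    'See the <a href="/">home page</a> for details.</div>'
)


def add_notice(html):
    """Insert the notice straight after the opening <body> tag, once only."""
    if 'class="archived-notice"' in html:
        return html  # page has already been flagged
    return re.sub(
        r"(<body\b[^>]*>)",
        lambda match: match.group(1) + NOTICE,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

Pages without a `<body>` tag are returned unchanged, so the function is safe to run over a whole directory more than once.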
Case Study 1 - Exploit Interactive Exploit Interactive: EU-funded ejournal available at <http://www.exploit-lib.org/> Funded from Jan 1999 – Dec 2000 Web site is still hosted locally Issues: Should we continue hosting domain after 3 years? What is the cost of this (domain name registration, disk storage, system maintenance)? Web 1.0
Case Study - Exploit Interactive Findings: Disk storage is 4 GB (large proportion is log files) A 30 GB disk drive costs ~ £40 It was decided to run an annual link check of the Web site. Although there were broken links to external sites, the internal links all worked. It was estimated that it would take about 30 minutes / year to run a link check and document findings. A policy for the ongoing provision of the Web site was agreed See <http://www.ukoln.ac.uk/qa-focus/documents/case-studies/case-study-17/> Web 1.0
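The figures above can be turned into a rough cost estimate. The sketch below uses the slide's numbers for disk space and link-check time; the hourly staff rate is a purely hypothetical parameter, since the slides do not give one, and domain registration is left out for the same reason.

```python
def storage_cost(site_gb, disk_gb, disk_price):
    """One-off share of a disk's price taken up by the site."""
    return disk_price * site_gb / disk_gb


def annual_staff_cost(minutes_per_year, hourly_rate):
    """Yearly cost of the link check at an assumed hourly rate."""
    return hourly_rate * minutes_per_year / 60


# Slide figures: a 4 GB site on a 30 GB drive costing about 40 pounds.
disk_share = storage_cost(site_gb=4, disk_gb=30, disk_price=40)

# 30 minutes per year from the slide; the rate of 20.0 is an assumption.
yearly_check = annual_staff_cost(minutes_per_year=30, hourly_rate=20.0)
```

On these numbers the storage share is a little over £5, which supports the case study's conclusion that continued hosting is cheap compared with the risk of losing the site.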
Is Web 2.0 Different? How does Web site preservation differ for Web 2.0: Use of 3rd party services Emphasis on collaboration and communication, rather than access to resources More complex IPR issues Richer diversity of services Let’s look at: Case study 1 – wikis Case study 2 – blogs Case study 3 – reusing data Case study 4 – comms tools (disposable data) Case study 5 – recording events Web 2.0
Case Study 1: A Public Wiki WetPaint wiki used to support UKOLN workshop Approaches taken: Open access to all prior to & during event (to minimise barriers to creating content) Access restricted to WetPaint users after event Access later restricted to event organisers Web 2.0 Many aspects of Web site curation are to do with implementing such best practices, rather than implementing technical solutions
Case Study 1: A Public Wiki WetPaint provides an option for backing up data. A zipped file of the pages can be saved for storing on a locally managed service. Web 2.0 There are limitations in this particular service (poor quality HTML, internal links don’t work, …) But this does illustrate an approach which can be taken.
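Once a zipped backup has been downloaded, it is worth recording what it contains before storing it locally. The sketch below builds a simple manifest of file sizes and SHA-256 checksums from the zip's bytes. It makes no assumption about how WetPaint lays out its export, and the function name `backup_manifest` is hypothetical.

```python
import hashlib
import io
import zipfile


def backup_manifest(zip_bytes):
    """Return {member_name: (size_in_bytes, sha256_hex)} for a zipped backup."""
    manifest = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no content to checksum
            data = archive.read(info.filename)
            manifest[info.filename] = (len(data), hashlib.sha256(data).hexdigest())
    return manifest
```

Keeping the manifest next to the archive makes it possible to confirm later that the stored copy is complete and unchanged.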
Case Study 2: Blog Migration How might you migrate the contents of a blog (e.g. you’re leaving college)? This question was raised by Casey Leaver, shortly before leaving Warwick University Web 2.0
Case Study 2: Blog Migration She migrated her blog from blogs at Warwick Univ to WordPress Web 2.0 Note, though, that not all data was transferred (e.g. title, but not contents) so there’s a need to check transfer mechanisms
Case Study 2: Blog Migration A backup of UK Web Focus blog is available on Vox: Manual migration of new posts every few weeks Only migrates text Doesn’t migrate images, embedded videos, internal links, comments, … Web 2.0 Migration of blogs, wikis, etc. is not currently an easy task
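The need to check transfer mechanisms can be illustrated with a simple comparison of the original and migrated posts. The sketch assumes both blogs have been exported into dictionaries keyed by a post identifier. That structure, the field names and the function name `incomplete_posts` are all hypothetical, since real export formats differ between services.

```python
def incomplete_posts(source_posts, migrated_posts, fields=("title", "body", "comments")):
    """Return {post_id: [fields that were lost or changed]} after a migration."""
    problems = {}
    for post_id, original in source_posts.items():
        copy = migrated_posts.get(post_id)
        if copy is None:
            problems[post_id] = ["<post missing>"]
            continue
        # A field counts as lost if the migrated value differs from the original.
        lost = [field for field in fields if original.get(field) != copy.get(field)]
        if lost:
            problems[post_id] = lost
    return problems
```

A report like this would have flagged the case above, where titles were transferred but contents were not.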
Case Study 3: Reusing Data Blog post in Facebook. Possible concerns: It’s not sustainable You’ve given ownership to Facebook Web 2.0 Response: The post is managed in WordPress; Fb displays copy (to new audience) Fb don’t claim ownership – they claim rights to make money (e.g. through ads)
Case Study 4: Disposable Data Twitter – example of a micro-blogging application Facebook status messages are another related example Web 2.0 Issues: Will the Twitter service be sustainable over a long period? What will happen to the data? What about the IPR for ‘tweets’? …
Case Study 4: Disposable Data Many twitterers regard their tweets as disposable I tend to use Twitter as a ‘virtual water cooler’ – sharing gossip, jokes and occasional work-related information with (mainly) people I know Web 2.0 You could make use of clients which manage your tweets (e.g. treat like email) But you should develop your policies first, prior to exploring technologies
Case Study 4: Disposable Data Skype (or your preferred VoIP application) is growing in popularity Web 2.0 Issues: Is the digital data (the call) preserved? What about the video and the IM chats? Possible responses: Am I bovvered? I didn’t bother with analogue phones, why should I worry now?
Case Study 5: Digitized Talks Seminar on Open Science given at UKOLN in Feb 2008. Video clip of opening 10 mins taken & uploaded to YouTube Issues: Privacy Quality Benefits Long term access Benefits identified – now how do we seek to deploy recordings of seminars, conferences, etc. on a more systematic basis? This is work in progress – but see IWMW 2007 videos
Role Of The Internet Archive Can we leave everything to the Internet Archive? Has role to play in Web 1.0 Seems to archive some public blogs May not access images or other embedded content Still has limitations (cf. UCE/BCU) Can’t (currently) access Facebook pages, for example Web 2.0
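Whether the Internet Archive holds a copy of a given page can be checked programmatically. The sketch below assumes the Wayback Machine availability endpoint as currently published by the Internet Archive, which postdates this talk. The helper names are hypothetical, and the network request itself is deliberately left to the caller so that the parsing can be tested offline.

```python
import json
from urllib.parse import urlencode


def availability_url(page_url):
    """Build the query URL for the Wayback Machine availability endpoint."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


def closest_snapshot(response_text):
    """Return the URL of the closest archived snapshot, or None if absent."""
    payload = json.loads(response_text)
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

An empty result does not prove a page was never archived, and a snapshot may still lack images or embedded content, as noted above.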
Role Of The Internet Archive The Open University has a presence in Facebook.  In Feb 2008: 5,411 fans 705 wall posts 31 discussion topics Is anyone: Recording the history? Curating the data Managing possible risks? Web 2.0
The Research Challenges Some thoughts: Preservation of Web sites is known to be difficult Additional difficulties in a Web 2.0 world Complexities include technical challenges and business issues However: Is avoiding Web 2.0 a realistic answer? There may be some simple processes which may help Web 2.0
Questions
