The meaning and value of web
archives for research
Deutsche Nationalbibliothek
28 November 2018
Dr Peter Webster
Webster Research and Consulting
@pj_webster
On the contemporary
religious history of the Web
• ‘Religion in Web history’ in The Sage Handbook of Web History
(2018)
• ‘Technology, ethics and religious language: early Anglophone
Christian reactions to “cyberspace”’, Internet Histories 2:3
(2018)
• ‘Rowan Williams, archbishop of Canterbury, and the sharia law
controversy of 2008’ in The Web as History (2017)
• ‘Lessons from cross-border religion in the Northern Irish web
sphere….’ in The Historical Web and Digital Humanities. The
case of national web domains (2019)
The web its own archive?
Open UK Web Archive 2004-13 comparison.
@anjacks0n http://britishlibrary.typepad.co.uk/webarchive/2014/10/what-is-still-on-the-web-after-10-years-of-archiving-.html
Who might use web archives?
• journalists
• activists
• lawyers
• anyone with a stake in a record of the recent
past
Planned disappearance
southtippcoco.ie, captured by archive.org, 4 Jan 2014
Unplanned disappearance
‘Perhaps Stanley will counter that at the
technical end of his period, 2000, the
internet was not what it has become
eighteen years later…. [but] the internet
does not merit an entry in Stanley’s index,
yet it has changed everything.’
Diarmaid MacCulloch, reviewing Brian
Stanley, Christianity in the Twentieth Century
(TLS, 7 Sept 2018)
The emerging discipline of
Web history
Conferences: RESAW 2015, 2017, 2019
Journal: Internet Histories (2017-)
Method: The Sage Handbook of Web History
(2018); Brügger, The Archived Web (2018)
Case studies: Web History (Peter Lang, 2010);
The Web as History (UCL Press, 2017); Web25
(Peter Lang, 2017)
What is web archiving?
“deliberate and purposive collection and
preservation of web material” (Brügger,
2018)
• very small- or very large-scale
• harvesting, screen capture, file delivery
• public, restricted, or no access
Who are the Web archivists?
• Internet Archive
• national libraries
• corporate archives
• research-driven: universities and individuals
• activists
See: Webster, ‘Existing web archives’, Sage
Handbook of Web History (2018); ‘Towards a
cultural history of web archiving’, Web25 (2017)
National libraries
• 16 of 28 EU member states
• Iceland, Switzerland, Norway also
• Sweden the first (1996)
• some with legal deposit provision: Denmark
(2005), France (2006), UK (2013)
Legal deposit web archiving:
characteristics
• broad domain crawl, plus selective
• definition of the nation varies
• types of content included varies
• access restrictions
Selective harvesting
• in the absence of non-print legal deposit (NPLD), based on permissions
• part of the case for obtaining NPLD law
• key resources, e.g. government, media
• events: elections, Olympics, Eurovision
• themes: political extremism, climate change
Web archives in the UK
Archive                     Temporal scope   Content scope             Access
Open UKWA                   2004-present     Selective                 Online
Legal Deposit UKWA          2013-present     Comprehensive (for UK)    Onsite
JISC UK Domain Dataset      1996-2013        Comprehensive (for .uk)   Index only
UK Government Web Archive   1996-present     UK government             Online
Parliamentary Web Archive   2009-present     UK parliament             Online
Univ. of Oxford             2011-present     University sites          Online
University-based archiving
As records management
• Bodleian Libraries, Oxford
For research
• Innsbrucker Zeitungsarchiv
• Digital Archive of Chinese Studies [Leiden /
Heidelberg]
The meaning and value of web archives for research
Five analytical strata
• element (paragraph, image, border, menu)
• page
• site
• Web sphere
• the Web as a whole
… each of which has visible and invisible
aspects.
[Brügger, The Archived Web (2018), 31-35]
The Web element
Visible
• circulation of images or memes, embedded
media
Invisible
• Anne Helmond on trackers (Web25)
• Brügger on the hyperlink (Web25)
The Web page: changing aesthetic
gov.ie, captured by archive.org, 15 August 2000
[https://web.archive.org/web/19980129080224/http://www.kbr.be/fr/index.html]
Changing page content over time
Anthony Cocciolo, Information Research 20:3 (2015)
http://www.informationr.net/ir/20-3/paper682.html
Single pages as social and
political evidence
A case study of a public dispute in the UK
about the place of religion in public life:
Webster, ‘Religious discourse in the archived Web: Rowan
Williams, archbishop of Canterbury, and the sharia law
controversy of 2008’ in Brügger and Schroeder (eds), The Web as
History (2017)
[Wikimedia Commons, CC BY SA 2.0, by Brian (of Toronto)]
Rowan Williams and sharia law
[https://web.archive.org/web/20080211003812/http://www.newsoftheworld.co.uk/1002_sharia.shtml]
[https://web.archive.org/web/20080212010015/http://www.britishblogs.co.uk/categories/sharia-law/]
[https://web.archive.org/web/20080214231017/http://community.tigranetworks.co.uk/]
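Each Internet Archive capture cited here carries its own date: the 14-digit run in the address is a capture timestamp (YYYYMMDDhhmmss, UTC), followed by the address of the original page. A minimal sketch of how a researcher might separate the two when citing single pages as evidence; the function name `parse_wayback` and the regular expression are illustrative assumptions, not part of any Internet Archive tooling.

```python
import re
from datetime import datetime

# Assumed pattern: 14-digit timestamp, optional replay modifier (e.g. "id_"),
# then the original URL. Illustrative only, not an official specification.
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def parse_wayback(url: str) -> tuple[datetime, str]:
    """Split a Wayback Machine address into capture time and original URL."""
    match = WAYBACK.match(url.strip())
    if match is None:
        raise ValueError(f"not a recognised Wayback address: {url!r}")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured, match.group(2)


captured, original = parse_wayback(
    "https://web.archive.org/web/20080211003812/"
    "http://www.newsoftheworld.co.uk/1002_sharia.shtml"
)
print(captured.date(), original)
```

The capture date matters for a fast-moving controversy such as this one, where the same address may hold different content from one day to the next.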
Studies on whole sites
Single organisations
Allah.com (Hofheinz in Web History, 2010)
University of Bologna (Nanni, DHQ 11,
2017)
Platforms
Milligan on Geocities (The Web as History,
2017)
Paloque-Berges on Usenet (Web25, 2017)
The Web sphere
Definition: Web materials from more than one
site with a ‘shared event, concept, theme or
geographic area’ (Brügger, 2018)
Two examples: one national, one thematic
The shape of a national web sphere
Anat Ben-David (@anatbd), ‘What does the Web remember of its deleted past? An
archival reconstruction of the former Yugoslav top-level domain’, New Media and
Society, 18:7 (2016)
One island, two states
Counties of Ireland, north and south
(Wikimedia Commons)
CC-BY-SA 3.0
A unique mix of faith and politics?
Ian Paisley and Edward Carson, Stormont (1985)
(Burns Library, Boston College, CC-BY-NC-ND 2.0 via Flickr)
Cross-border religion?
• Historic Christian denominations: RC,
Presbyterian (PCI), Church of Ireland,
Methodist, Baptist
• all organised on an all-Ireland basis
• … spanning two political jurisdictions
• … and two ccTLDs: .uk and .ie
All-Ireland religion
Church of Ireland dioceses
(CoI, via Wikimedia Commons)
CC-BY-SA 3.0
Research questions
Using link graph data, to ask:
• how does the web estate of each church
interact across the border (& between
ccTLDs)?
• are there distinct web spheres for each in
NI and the RoI?
Baptists in Ireland (2016)
• Association of Baptist Churches in Ireland
has 117 congregations: 28 in RoI, 89 in NI
• 8.5k members, community of 20k
• Including independents, 93 in NI and 30 in
RoI
• 28 congregations with domains in RoI, 77 NI
Counties of Northern Ireland
(Map by Maximilian Dörrbecker, CC-BY-SA 2.5)
Where are the congregations?
County        County code   % of congregations (with domains)
Antrim        AN            44
Armagh        AR             7
Down          DO            25
Londonderry   LD            10
Tyrone        TY            10
Fermanagh     FE             4
Where are the domains?
                   Domains   Coverage (%)   .uk (%)   .com (%)   .ie (%)   Other (%)
Baptist            101       > 80           40        24         -         36
Baptist (Antrim)   48                       40        31                   29
UK Host Link Graph (1996-2010)
• 2008 | catholic_church.co.uk | catholic_church.ie | 4
• 2001 | belfast_anglican.co.uk | derry_anglican.co.uk | 1
• 2002 | derry_anglican.org.uk | derry_catholic.co.uk | 1
Data in public domain: data.webarchive.org.uk
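The records above can be read programmatically. What follows is a minimal sketch, assuming the pipe-delimited layout shown on the slide (year | source host | target host | number of links); the host names are the slide's own placeholders, and the names `HostLink`, `parse_link`, `tld` and `is_cross_tld` are hypothetical helpers introduced for illustration.

```python
from typing import NamedTuple


class HostLink(NamedTuple):
    year: int
    source: str
    target: str
    count: int


def parse_link(line: str) -> HostLink:
    """Parse one 'year | source | target | count' record."""
    year, source, target, count = (field.strip() for field in line.split("|"))
    return HostLink(int(year), source, target, int(count))


def tld(host: str) -> str:
    """Return the last dot-separated label of a host name, e.g. 'uk' or 'ie'."""
    return host.rsplit(".", 1)[-1].lower()


def is_cross_tld(link: HostLink) -> bool:
    """True when source and target sit under different top-level domains."""
    return tld(link.source) != tld(link.target)


sample = [
    "2008 | catholic_church.co.uk | catholic_church.ie | 4",
    "2001 | belfast_anglican.co.uk | derry_anglican.co.uk | 1",
    "2002 | derry_anglican.org.uk | derry_catholic.co.uk | 1",
]
links = [parse_link(line) for line in sample]
cross = [link for link in links if is_cross_tld(link)]
print(cross)
```

Of the three sample records, only the first crosses from .uk to .ie. Comparing top-level domains in this way will miss any church site hosted under .com or .org, which is one reason a ccTLD is an imperfect proxy for a national web sphere.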
Coded link graph (NI-to-NI)
• 1999 | AN27 | AR07 | 11
• 2003 | DW13 | LD05 | 17
• 2010 | AN11 | AN21 | 3
Total NI-to-NI edges
     Inbound   Outbound   Internal
AN   248       299        151
AR    56        98         10
DW   125        64         15
LD    52        20          2
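Totals of this kind can be derived from the coded link graph. The sketch below rests on assumptions that the slides do not state: that the first two letters of each code identify the county, that "internal" means source and target share a county, and that link counts are summed rather than records counted. The function `edge_totals` is a hypothetical helper, and the three input records are the samples shown earlier, not the full dataset.

```python
from collections import defaultdict


def county(code: str) -> str:
    """Assumed convention: 'AN27' is congregation 27 in county AN."""
    return code[:2]


def edge_totals(records: list[str]) -> dict[str, dict[str, int]]:
    """Sum inbound, outbound and internal link counts per county."""
    totals: dict[str, dict[str, int]] = defaultdict(
        lambda: {"inbound": 0, "outbound": 0, "internal": 0}
    )
    for line in records:
        _year, source, target, count = (f.strip() for f in line.split("|"))
        src, dst, n = county(source), county(target), int(count)
        if src == dst:
            totals[src]["internal"] += n
        else:
            totals[src]["outbound"] += n
            totals[dst]["inbound"] += n
    return dict(totals)


sample = [
    "1999 | AN27 | AR07 | 11",
    "2003 | DW13 | LD05 | 17",
    "2010 | AN11 | AN21 | 3",
]
totals = edge_totals(sample)
for code in sorted(totals):
    print(code, totals[code])
```

If each record should instead count as a single edge regardless of its weight, replacing `n` with `1` in the three increments gives that variant.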
Conclusions
• the Baptist web sphere very tightly localised
• … but spread across several TLDs
• little cross-border linkage
• link analysis hard in national web archives
Webster, ‘Lessons from cross-border religion in the
Northern Irish web sphere’ in The Historical Web and
Digital Humanities. The case of national web domains
(2019)
Questions ?
Peter Webster
peter@websterresearchconsulting.com
@pj_webster / @WebsterRandC
peterwebster.me
websterresearchconsulting.com
