WWW Caching
George Neisser
Manchester Computing
University of Manchester
George.Neisser@mcc.ac.uk
Overview of Presentation
• Why caching?
• Caching Infrastructures.
• National Caching.
• Caching hardware and software
• Implementation of caching
• Non-Technical Issues
Why Caching?
• 1000s of users ‘surfing’ the Internet each
with their own browser.
• Users and browsers are ‘independent’
resulting in a large amount of replication
of information carried over the network.
• Popular Web sites may have many
simultaneous connections transmitting
identical copies of a single item over the
same network trunk routes. This state of
affairs is highly undesirable because...
Why Caching?
• Bandwidth - especially
international bandwidth - is
very expensive, and must be
used cost-effectively.
• Web ‘hot-spots’ are created.
• Web object retrieval times are
increased.
Why Caching?
• Web caches are an
attempt to:
– Minimise bandwidth wastage.
– Decrease object retrieval times.
– Reduce the number of ‘hot-spots’.
Caching Infrastructures
• Caches may be implemented:
– Within departments
– Within Institutions
– Nationally
– Internationally
• Caches can co-operate. So we
have meshes of caches or
caching infrastructures.
Caching Infrastructures
• Caching infrastructures are
developing at every level.
– Quite a few departmental caches.
– Many Institutions now operate
caches.
– Within the UK a National caching
infrastructure is developing.
– International infrastructures in
place and developing.
Caching Infrastructures
• Cooperation between caches.
– Achieved by the ICP cache
communication protocol in one of
two modes:
• Unicast mode - a separate ICP
query is sent to each cache
being interrogated.
• Multicast mode - a single ICP
packet is multicast to a group of
cooperating caches.
– Intuitively the multicast approach
should be more efficient - reduce
bandwidth, etc.
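To make the two modes concrete, here is a minimal sketch of what an ICP query looks like on the wire. It assumes the ICP version 2 layout described in RFC 2186 (a 20-byte header followed by a 4-byte requester address and a NUL-terminated URL). The function names and peer host names are hypothetical and are not part of Squid or of the Manchester setup.

```python
import socket
import struct

ICP_OP_QUERY = 1  # opcode for "do you hold this URL?" (RFC 2186)
ICP_VERSION = 2


def build_icp_query(url: str, request_number: int) -> bytes:
    """Build one ICP query datagram asking a peer whether it holds `url`."""
    # Query payload: 4-byte requester address (left as zero) + NUL-terminated URL.
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    length = 20 + len(payload)  # the fixed header is 20 bytes
    header = struct.pack(
        "!BBHIII4s",
        ICP_OP_QUERY,
        ICP_VERSION,
        length,
        request_number,
        0,            # options
        0,            # option data
        b"\x00" * 4,  # sender host address (left as zero in this sketch)
    )
    return header + payload


def query_peers_unicast(packet: bytes, peers) -> None:
    """Unicast mode: one UDP datagram per peer in the list."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for host, port in peers:
            sock.sendto(packet, (host, port))


packet = build_icp_query("http://www.example.org/", request_number=1)
# Hypothetical peers; 3130 is the port Squid conventionally uses for ICP.
# query_peers_unicast(packet, [("cache-a.example.ac.uk", 3130),
#                              ("cache-b.example.ac.uk", 3130)])
```

In multicast mode the loop would collapse to a single `sendto` addressed to the group, which is where the intuitive bandwidth saving comes from: one outgoing packet instead of one per peer.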
Caching Infrastructures
• For example at Manchester:
– Central campus cache and several
departmental caches use ICP in
unicast mode.
– Parent relationships with other
caches in the UK, Europe and
USA.
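The difference between a sibling and a parent relationship can be sketched as a small simulation. This is an illustrative model only, not Squid's algorithm: the `Cache` class, its method names, and the cache names are all assumptions. The rule it encodes is the usual one: a sibling is only asked for objects it already holds, while a parent will fetch a missing object on the child's behalf.

```python
class Cache:
    """Toy model of one cache in a hierarchy (hypothetical, for illustration)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.siblings = []
        self.store = {}  # url -> body

    def has(self, url):
        return url in self.store

    def fetch(self, url, origin):
        """Return (body, source), where source names where the object came from."""
        if url in self.store:
            return self.store[url], self.name
        # Siblings are only used when they already hold the object.
        for sibling in self.siblings:
            if sibling.has(url):
                body = sibling.store[url]
                self.store[url] = body
                return body, sibling.name
        # A parent resolves the miss for us; with no parent, go to the origin server.
        if self.parent is not None:
            body, source = self.parent.fetch(url, origin)
        else:
            body, source = origin(url), "origin"
        self.store[url] = body
        return body, source


def origin_server(url):
    return "body of " + url


national = Cache("national")
campus = Cache("campus", parent=national)
dept_a = Cache("dept-a", parent=campus)
dept_b = Cache("dept-b", parent=campus)
dept_a.siblings.append(dept_b)
dept_b.siblings.append(dept_a)

first = dept_a.fetch("http://www.example.org/", origin_server)
second = dept_b.fetch("http://www.example.org/", origin_server)
```

Here the first request travels up through the campus and national caches to the origin server, and the second is answered by the neighbouring departmental cache without leaving the campus.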
National Caching
• HENSA pioneered caching with
their Public Caching Proxy
Server. Initiated around 1992.
– Used Lagoon initially
– Then the CERN server
– Then Netscape Proxy
– And some Squid
• Details described at First
International WWW
Conference:
http://www.hensa.ac.uk/www94
National Caching
• The existing service is hosted
by University of Kent at
Canterbury and University of
Leeds.
• From 1st August 1997 it will be
hosted by the University of
Manchester and Loughborough
University.
• Selection by a recent
competitive tendering process.
National Caching
• The situation so far.
– Service still at HENSA and
Leeds. We are preparing for the
transition.
– Initially existing equipment will
be used.
– Projection of demand performed
and hardware upgrade path
budgeted for.
National Caching
• The ‘new’ service will have:
– a service ‘arm’
– a development ‘arm’
• The National service will be
directed by a steering
committee and will be, as far as
possible, user driven.
• National Caching Web site,
regular newsletter, mailing lists,
help desk system, fault
reporting mechanism, etc, etc.
Benefits of National
Caching
• Trans-Atlantic bandwidth and
bandwidth to Europe are both
very expensive and in great
demand. Caching reduces
bandwidth consumption. The
resulting cost savings can be
used to fund other things.
• Faster document retrieval time -
in theory!
National Caching - Useful
addresses and URLs
• Email addresses:
– wwwcache-users@wwwcache.ja.net
general mailing list for users.
– cybercache@wwwcache.ja.net mailing
list for Special Interest Group.
– natcache@wwwcache.ja.net, National
Cache Joint Team mailing list.
• Some URLs:
– http://www.hensa.ac.uk
– http://www.net.lboro.ac.uk/caching/
– http://www.mcc.ac.uk/Cache/
Caching Hardware
• Any Unix platform
• Linux
• FreeBSD
Caching Software
• Lagoon
• CERN
• Netscape
• Harvest
• Squid
Using Caches
• Users interact with caches
directly using their favourite
browser.
• Caches interact or co-operate
with other caches using ICP.
• Browser - cache interaction is a
‘client-server’ type interaction.
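The client-server interaction can be illustrated with any HTTP client, not only a browser. The sketch below uses Python's standard `urllib` to build a client that sends its HTTP requests to a cache rather than to the origin server. The cache host name is hypothetical; port 3128 is assumed because it is Squid's usual default for `http_port`.

```python
import urllib.request


def make_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes plain HTTP requests through the given cache."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical institutional cache address.
opener = make_proxy_opener("http://cache.example.ac.uk:3128")

# With a real cache in place, this request would go to the cache, which either
# answers from its store or fetches the object on the client's behalf:
# response = opener.open("http://www.example.org/")
```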
Implementation -
Browsers
• Netscape
– Manual configuration - Select
network preferences from Options
menu...
– Automatic configuration - proxy
configuration can be automated
with Javascript...
• Others: Lynx, Mosaic, Microsoft
Internet Explorer.
Implementation - caches
• With reference to Squid
– Installation
– Configuration
– Operations
• Some problems
– disk space
– discarding documents
Implementation - Installation
• Retrieve from:
– http://squid.nlanr.net/Squid/
– Decompress and extract.
– configure
– compile
– install
• Operating Systems
– Unix, AIX, FreeBSD, HP-UX,
IRIX, Linux, OSF/1, Solaris,
SunOS
Implementation - Configuration
• Configuration file
– http_port
– icp_port
– mcast_groups
– cache_host
– cache_host_domain
– cache_swap
– cache_swap_low
– cache_swap_high
– cache_dir
– cache_access_log
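The `cache_swap_low` and `cache_swap_high` settings relate to the disk space and document discarding problems mentioned earlier. As a rough illustration of the idea, the sketch below models a least-recently-used store with a high and a low water mark: once usage passes the high mark, the oldest objects are discarded until usage is back at or below the low mark. This is a simplified model under that assumption, not Squid's actual replacement code, and the class name and percentages are hypothetical.

```python
from collections import OrderedDict


class WatermarkCache:
    """Toy LRU store that discards objects between two water marks."""

    def __init__(self, capacity_bytes, low_pct, high_pct):
        self.low = capacity_bytes * low_pct // 100
        self.high = capacity_bytes * high_pct // 100
        self.objects = OrderedDict()  # url -> size, least recently used first
        self.used = 0

    def get(self, url):
        """Return True on a hit and mark the object as recently used."""
        if url not in self.objects:
            return False
        self.objects.move_to_end(url)
        return True

    def put(self, url, size):
        """Store an object, then discard old objects if the high mark is passed."""
        if url in self.objects:
            self.used -= self.objects.pop(url)
        self.objects[url] = size
        self.used += size
        if self.used > self.high:
            while self.used > self.low and self.objects:
                _, freed = self.objects.popitem(last=False)
                self.used -= freed


cache = WatermarkCache(capacity_bytes=100, low_pct=60, high_pct=80)
cache.put("a", 30)
cache.put("b", 30)
cache.get("a")      # "a" is now more recently used than "b"
cache.put("c", 30)  # usage reaches 90, above the high mark, so "b" is discarded
```

Using two marks rather than one means discarding happens in occasional batches instead of on every new object.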
Implementation - configuration
• Configuration file continued...
– pid_filename
– debug_options
– ftpget_program
– negative_ttl
• Access Control lists
– http_access allow
– http_access deny
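The effect of a list of `http_access allow` and `http_access deny` rules can be sketched as a first-match check against the client's address. This is a simplified model: the function name, the rule representation, and the example networks are assumptions, and the sketch denies any client that matches no rule, which is a simplification of how Squid chooses its default.

```python
import ipaddress


def is_allowed(client_ip, rules):
    """Apply (action, network) rules in order; the first matching rule decides."""
    address = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if address in ipaddress.ip_network(network):
            return action == "allow"
    return False  # no rule matched: deny (a simplification)


# Hypothetical policy: allow the local site, deny everyone else.
rules = [
    ("allow", "192.0.2.0/24"),
    ("deny", "0.0.0.0/0"),
]

local_ok = is_allowed("192.0.2.10", rules)
outsider_ok = is_allowed("198.51.100.7", rules)
```

Because the first match wins, the order of the rules matters: placing the catch-all deny first would block local users as well.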
Implementation - configuration
• Administration parameters
– cache_mgr
– cache_announce
– logfile_rotate
– minimum_direct_hops
– and so on...
Operation
• Parent or sibling?
• Log files
• Statistics
• Number of requests per day
• Machine loading
• Network loading
• Improvement in latency?
• Reduction in bandwidth usage?
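Several of these questions can be answered from the access log. The sketch below computes a request hit ratio and a byte hit ratio from log lines. It assumes a whitespace-separated format in which the fourth field is a result code such as `TCP_HIT/200` and the fifth is the object size in bytes, as in Squid's native access log. The field positions and the sample lines are assumptions made for illustration, not real log data.

```python
def summarise(lines):
    """Return request and byte hit ratios for an iterable of access-log lines."""
    requests = hits = total_bytes = hit_bytes = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        result = fields[3].split("/")[0]  # e.g. "TCP_HIT" from "TCP_HIT/200"
        size = int(fields[4])
        requests += 1
        total_bytes += size
        if "HIT" in result:
            hits += 1
            hit_bytes += size
    return {
        "requests": requests,
        "hit_ratio": hits / requests if requests else 0.0,
        "byte_hit_ratio": hit_bytes / total_bytes if total_bytes else 0.0,
    }


# Invented sample lines in the assumed format.
sample = [
    "868000000.100 120 192.0.2.10 TCP_HIT/200 4000 GET http://www.example.org/a.html",
    "868000001.200 950 192.0.2.11 TCP_MISS/200 6000 GET http://www.example.org/b.html",
    "868000002.300 110 192.0.2.10 TCP_HIT/200 2000 GET http://www.example.org/c.gif",
    "868000003.400 870 192.0.2.12 TCP_MISS/200 8000 GET http://www.example.org/d.html",
]

stats = summarise(sample)
```

The byte hit ratio is the figure that relates to bandwidth saved, since it weights each hit by object size. The request hit ratio says more about how often users see a fast response.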
Other Issues
• Copyright
• Pornography
• Log statistics
• Data Protection Act.
Should I run a cache?
• Should I run a:
– Departmental cache?
– Institutional cache?
• Should I link together
departmental caches?
• Should I link departmental
caches to my Institutional
cache?
• Should I link my institutional
cache to the National Cache?
Should I run a cache?
• There are no hard and fast rules.
Clearly caching saves bandwidth and
improves latency, but it is not
obvious how best to construct a
hierarchy to achieve this.
• We are at the learning stage. Part
of the remit of the National Web
Network Caching Service will be to
investigate this and produce
guidelines and recommendations for
individual sites.
Should I run a cache?
• The answer is yes!
• Consider
– number of users
– Type of work
– Local Area Network
• loading
• Bottlenecks
– Expected demand
• Analyse statistics
Futures
• The National WWW Network
Caching Service will be
involved in the development of
caching in the UK. Will
investigate hardware and
software. Findings will be
published on the National
Cache Web site:
URL: http://www.wwwcache.ac.uk

IWMW 1997: WWW Caching
