HTTP Caching




Alexander Shopov

By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP
Contacts:
E-mail: ash@kambanaria.org
Jabber: al_shopov@jabber.minus273.org
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”
Please Learn And Share



       License: CC-BY v3.0
Creative Commons Attribution v3.0
Disclaimer



My opinions, knowledge and experience!
         Not my employer's.
Why Cache At All?
●   Lowers number of requests, improves latency,
    provides scaling
●   AJAX caching leads to lively applications
●   Lowers server load for all kinds of content, but
    especially important (and hard) for dynamic content
MOST IMPORTANT RESOURCE!
●   RFC 2616 http://tools.ietf.org/html/rfc2616
●   HTTP caching:
       –   http://tools.ietf.org/html/rfc2616#section-13
Purpose of caching
●   Eliminate the need for requests
        –   No server round trip at all – fastest way
        –   Expiration – received data is fine
●   Eliminate the need for full answers
        –   Lower traffic, lower bandwidth use
        –   Validation – received data probably fine, check it
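
The two purposes above map onto a simple decision a cache makes on every lookup. A minimal sketch, assuming a hypothetical `CachedEntry` record that already knows its age and freshness lifetime in seconds (neither name comes from the slides):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedEntry:
    body: bytes
    age_seconds: int          # how long ago the response was generated
    freshness_lifetime: int   # how long the server said it stays fresh


def next_step(entry: Optional[CachedEntry]) -> str:
    """Pick the cheapest correct action for a cache lookup."""
    if entry is None:
        return "fetch"        # nothing cached: full request, full answer
    if entry.age_seconds < entry.freshness_lifetime:
        return "serve"        # expiration: no server round trip at all
    return "revalidate"       # validation: ask the server, hope for a short answer
```

`"serve"` removes the request entirely, while `"revalidate"` still costs a round trip but can avoid the full answer.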
HTTP participants
●   All of them in the protocol from day 1 – not an
    afterthought!
        –   Origin server
        –   Gateway/reverse proxy (shared cache)
        –   Proxy (shared cache)
        –   Client (can have internal cache – non shared cache)
●   Gateway is similar to Proxy
        –   Proxies – chosen by client (or clients)
        –   Gateways – chosen by server
Client ↔ Intermediaries ↔ Server
●   Easy/safe upgrade of protocol during conversation
●   Caching principles:
        –   Semantically transparent
        –   Explicit permits for non transparent actions
        –   Intermediaries can add warnings
        –   Caching headers/directives can be one way
●   Different behaviour for requests:
        –   Safe requests: GET/HEAD. Breaking this is server's fault, not
              clients'. All other requests must reach origin server
        –   Idempotent requests – repeating ≡ doing them once:
              GET/HEAD + PUT/DELETE/OPTIONS/TRACE
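
The method classes listed above can be written down as two sets. This is only an illustration of the slide's grouping; the helper names are made up for this example:

```python
# Method groups as listed on the slide.
SAFE_METHODS = frozenset({"GET", "HEAD"})
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE", "OPTIONS", "TRACE"}


def may_answer_from_cache(method: str) -> bool:
    """Only safe requests may be answered without reaching the origin server."""
    return method.upper() in SAFE_METHODS


def may_repeat(method: str) -> bool:
    """Repeating an idempotent request has the same effect as doing it once."""
    return method.upper() in IDEMPOTENT_METHODS
```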
HTML: Meta tags
●   Widely used and as widely ineffective:
        –   The only caching control HTML designers can set
        –   Not read/used by intermediaries
        –   Not all browser caches honour it
●   Do not rely on them! No real reason to use them.
    (actually the real reason is that habit is second
    nature).
HTTP 1.0
●   Pragma: no-cache
        –   Pragmas are problematic – not all participants honour
              them.
●   Proper equivalent in HTTP 1.1:
        –   Cache-Control: no-cache
        –   Take from server even if available from cache
HTTP 1.1
●   Expires – the response stays fresh until this date
●   ETag – (do) you have this version
●   Cache-Control – fine grained tuning
Expires
●   Expires: absolute_date
●   To mark a resource as already expired, send Expires
    with the same value as the Date header
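
A possible way to produce both headers with the Python standard library. The function name `expiry_headers` is an assumption for this sketch; `format_datetime` with `usegmt=True` needs a timezone-aware UTC datetime:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(now: datetime, lifetime_seconds: int) -> dict:
    """Build Date and Expires headers; a lifetime of 0 means already expired."""
    expires = now + timedelta(seconds=lifetime_seconds)
    return {
        "Date": format_datetime(now, usegmt=True),
        "Expires": format_datetime(expires, usegmt=True),
    }


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(expiry_headers(now, 3600))
print(expiry_headers(now, 0))   # Expires equals Date: already expired
```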
ETag – 1
●   No ordering, just value – either matches (single
    value or a value from set) or does not.
●   Per URI – no sense in comparing tags from different
    URIs, E = entity
●   ETag: resource tag
        –   ETag: "xyzzy" – strong, bit by bit equivalence
        –   ETag: W/"xyzzy" – weak, semantic equivalence
●   Different matches
        –   Strong – matches and all tags are strong.
        –   Weak – matches, possible for tag to be weak.
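
The two kinds of match can be sketched as two small comparison functions. The helper names are illustrative, and the parsing assumes well-formed tags such as `"xyzzy"` or `W/"xyzzy"`:

```python
def parse_etag(value: str):
    """Split an entity tag into (is_weak, opaque_tag)."""
    value = value.strip()
    if value.startswith("W/"):
        return True, value[2:]
    return False, value


def strong_match(a: str, b: str) -> bool:
    """Both tags must be strong and identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """The opaque parts must be identical; either tag may be weak."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```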
ETag – 2
●   Conditional requests: if matching – just
    confirmation, otherwise – data itself
        –   If-Modified-Since
        –   If-Unmodified-Since
        –   If-Match
        –   If-None-Match
        –   If-Range
●   Strong tags allow for caching of partial answers
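
A rough server-side sketch of the "confirmation or data" choice for a GET with If-None-Match. The function and its tuple return shape are assumptions for illustration; the header split is naive and assumes no commas inside the quoted tags:

```python
def opaque(tag: str) -> str:
    """Drop the W/ prefix so that tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET carrying If-None-Match."""
    header = request_headers.get("If-None-Match", "")
    # Naive split: assumes no commas inside the quoted tags.
    candidates = [t.strip() for t in header.split(",") if t.strip()]
    matched = "*" in candidates or any(
        opaque(t) == opaque(current_etag) for t in candidates
    )
    if matched:
        return 304, {"ETag": current_etag}, b""   # just confirmation
    return 200, {"ETag": current_etag}, body      # the data itself
```

If-None-Match on GET uses the weak comparison, which is why the `W/` prefix is ignored here.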
Cache-Control
●   All HTTP 1.1 participants MUST obey it (otherwise
    they are broken).
●   MUST reach all participants
●   Cannot target a particular intermediary
Cache-Control
●   cache-request-directive    ●   cache-response-directive
        –   no-cache                   –   public
        –   no-store                   –   private
        –   max-age                    –   no-cache
        –   max-stale                  –   no-store
        –   min-fresh                  –   no-transform
        –   no-transform               –   must-revalidate
        –   only-if-cached             –   proxy-revalidate
        –   cache-extension            –   max-age
                                       –   s-maxage
                                       –   cache-extension
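
Reading these directives out of a header value could look roughly like this. It is a simplified sketch: `parse_cache_control` is a made-up helper, and it does not handle quoted values that contain commas (for example a field list after `private=`):

```python
def parse_cache_control(header_value: str) -> dict:
    """Map directive names (lower-cased) to an int, a string, or None."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        name = name.strip().lower()
        if not sep:
            directives[name] = None          # flag directive, e.g. no-store
        else:
            value = value.strip().strip('"')
            directives[name] = int(value) if value.isdigit() else value
    return directives


print(parse_cache_control("public, max-age=3600"))
```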
Cache-Control Categories
●   What is cacheable – only imposed by server
●   What can be stored in cache – imposed by server
    and client
●   Modifications on expiration – imposed by server
    and client
●   Control over cache revalidation and reload – only
    imposed by client
●   Control over transformation of entities
●   Extensions to the caching system
Cache-Control – Requests 1
●   no-cache – cache should revalidate with server
●   no-store – do not store on durable media
●   max-age[=sec] – client wants info no older than
    this
●   max-stale[=sec] – client accepts stale information
    but no more stale than this
Cache-Control – Requests 2
●   min-fresh[=sec] – client wants info that will stay
    fresh for this time
●   no-transform – no transform by intermediary
        –   Medical Xray Photo from PNG to JPEG
●   only-if-cached – when connection to server is bad.
    Better to get 504 (Gateway Timeout) than wait
●   cache-extension – extensions
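
How a cache might weigh max-age, max-stale and min-fresh against a stored response, as a sketch. All values are seconds; the function name and the choice to check min-fresh before max-stale are assumptions of this example:

```python
from typing import Optional


def acceptable_for_client(
    age: int,
    lifetime: int,
    max_age: Optional[int] = None,
    max_stale: Optional[int] = None,
    min_fresh: Optional[int] = None,
) -> bool:
    """Can a stored response be returned without contacting the server?"""
    if max_age is not None and age > max_age:
        return False                       # older than the client accepts
    remaining = lifetime - age             # freshness left, negative if stale
    if min_fresh is not None:
        return remaining >= min_fresh      # must stay fresh this much longer
    if remaining > 0:
        return True                        # still fresh
    staleness = -remaining
    return max_stale is not None and staleness <= max_stale
```

For example, a response that is 90 seconds old with a 60 second lifetime is rejected by default but accepted with `max_stale=60`.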
Cache-Control – Responses 1
●   public – may be cached by any cache
●   private – must not be cached by shared cache
●   no-cache – cache should revalidate with server
●   no-store – do not store on durable media
●   no-transform – no transform by intermediary
●   must-revalidate – server requested revalidation of
    stale data
●   proxy-revalidate – same as above but not for user
    agent cache
Cache-Control – Responses 2
●   max-age[=sec] – for any cache
●   s-maxage[=sec] – for shared cache, priority over
    max-age and Expires.
●   cache-extension – extensions
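
The response directives can be reduced to two questions: may this cache store the response, and for how long is it fresh. A hedged sketch, assuming the directives were already parsed into a dict of name to value and that `expires_minus_date` is the difference in seconds between the Expires and Date headers (both names are hypothetical):

```python
from typing import Optional


def may_store(directives: dict, shared_cache: bool) -> bool:
    """no-store forbids storing; private forbids shared caches only."""
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True


def freshness_lifetime(
    directives: dict, expires_minus_date: Optional[int], shared_cache: bool
) -> Optional[int]:
    """Seconds the response stays fresh, or None if nothing explicit was sent."""
    if shared_cache and "s-maxage" in directives:
        return directives["s-maxage"]      # wins over max-age and Expires
    if "max-age" in directives:
        return directives["max-age"]       # wins over Expires
    if expires_minus_date is not None:
        return max(0, expires_minus_date)  # Expires in the past: already stale
    return None
```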
Status Codes 1
●   201 Created – can contain ETag, resource created
        –   (contrast with 202)
●   203 Non-Authoritative Information
        –   returned metainformation comes from a local or
              third-party copy, not from the origin server
●   206 Partial Content – range partial GET request
        –   (contains ETag, Expires, Cache-Control, Vary if
              changeable). Also the answer to If-Range. If either
              ETag or Last-Modified does not match – cache does
              not combine the parts with others. If the cache does
              not support ranges – 206 is not cached.
Status Codes 2
●   302 Found – redirect that can change. Use
    Cache-Control or Expires
●   304 Not Modified – conditional GET, resource not
    changed, body of response empty (ETag/Content-Location,
    Expires, Cache-Control or Vary)
●   305 Use Proxy – per request, generated by server
●   307 Temporary Redirect – similar to 302
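
What a cache might do with a 304, as a small sketch: keep the stored body and take over any header values the 304 carries. The `(headers, body)` tuple and the function name are assumptions for illustration:

```python
def apply_not_modified(cached: tuple, response_headers: dict) -> tuple:
    """Refresh a stored (headers, body) entry from a 304 Not Modified."""
    headers, body = cached
    merged = dict(headers)            # copy, leave the old entry untouched
    merged.update(response_headers)   # new Expires, Cache-Control, ETag, ...
    return merged, body               # the body is reused as is
```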
Conditional requests/responses
●   Origin servers
        –   Should provide both ETag (preferably strong unless
              not feasible) and Last-Modified
        –   Must avoid reusing specific strong ETag for different
             entities
●   Clients
        –   Must/should use ETag and Last-Modified in
             conditional requests
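
On the client side this amounts to echoing the stored validators back in the next request. A minimal sketch with a hypothetical helper name:

```python
def conditional_headers(cached_headers: dict) -> dict:
    """Build request headers that revalidate a stored response."""
    headers = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


print(conditional_headers({
    "ETag": '"xyzzy"',
    "Last-Modified": "Mon, 01 Jan 2024 12:00:00 GMT",
}))
```

Sending both validators lets HTTP 1.1 servers use the ETag while older ones can still fall back to the date.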
AJAX
●   Use cache directives in AJAX
●   Try to make your AJAX responses cacheable (you
    will have to think!)
●   POSTs are mostly uncacheable, prefer GETs to
    fetch information
●   Generate Content-Length response headers and
    reuse TCP/IP connection
Tools 1
●   Firefox addons:
        –   Firebug
        –   LiveHTTPHeaders
        –   Modify Headers
●   Chrome, Opera, Internet Explorer dev tools (F12)
Tools 2
●   Mark Nottingham: Caching tutorial
●   Redbot: Check cacheability
●   Old, but gold: Cacheability
