Speed and cost
- I'm a browser and the user returns to a photo page. Why spend the time and network bandwidth fetching it again?
- I'm a proxy and a client has requested a static page I served another client recently. Why fetch it from the origin again?
Storage
- I'm a browser and the user returns to a photo page. But they were last there weeks ago; if I kept everything that long, the hard drive would be full.
- I'm a proxy and a client has requested a static page I served another client today. But I served loads of stuff today. I can't keep all of it just in case!
Freshness
- I'm a browser and the user returns to a newspaper front page. I could give them the same data as last time, but they want up-to-date news.
- I'm a proxy and a client has requested a static page I served another client today. But that page was updated by its owner since I last served it. The client doesn't want my stale version.
This is fundamental to an efficient WWW, and hence the HTTP standards have it all nailed down.
See: https://www.rfc-editor.org/rfc/rfc9111.html
At a fairly fundamental level, `GET` is the HTTP method that benefits from caching. The other mainstream methods either have (potential) side-effects, or no body to cache.
If you want caching, use `GET`.
We have three collaborators here. You can't force any of them to do anything, but you want them to cooperate so that everyone gets the best outcome:
- browser wants:
  - snappy browsing
  - efficient use of its own resources (storage, network)
  - non-stale content
- proxy wants:
  - snappy browsing for its clients
  - efficient use of its own resources (storage, downstream network)
  - to deliver non-stale content
  - minimal redundant requests (the same client asking for the same content again)
- origin server wants:
  - to deliver up-to-date content when necessary
  - minimal redundant requests (the same client asking for the same content again, even if that client is a proxy)
Also remember that there could be any number of proxies, in a chain -- or none.
How can caching be controlled, when these components are loosely coupled, and have partially overlapping priorities?
The answer is HTTP headers (there's a quick way to inspect these for yourself, just after the examples below):
- Response headers: the origin server tells the proxy and the client things that help them decide about caching
  - Example: `Expires: Wed, 09 Oct 2024 13:28:00 GMT`
  - Example: `Cache-Control: max-age=300`
- Request headers: the client tells the server things that help it respond in cache-friendly ways
  - Example: `If-Modified-Since: Wed, 09 Oct 2024 15:06:05 GMT`
  - Example: `If-None-Match: "bc8fce5e"`
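If you want to see these for yourself, here's a quick sketch (runnable with Node.js 18+, or in a browser console against a same-origin URL; `https://example.com/` is just a stand-in for any resource you care about):

```js
// Log the cache-relevant headers of a response.
const res = await fetch("https://example.com/");
for (const name of ["cache-control", "expires", "last-modified", "etag", "age"]) {
  console.log(`${name}: ${res.headers.get(name)}`);
}
```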
A fun thing is, almost all of it is "advisory". Because the web is so focussed on backward-compatibility, loose coupling, and untrusted collaborators, when a server responds with, for example, `Cache-Control: no-cache`, there is no way to guarantee that a proxy even understands that directive, or that it won't go ahead and cache the response anyway.
However, because the collaborators' interests are for the most part aligned, in mainstream circumstances you can expect headers to be honoured.
In this document, the way we'll think of things is that regardless of which cache it is -- browser, proxy, whatever -- a cached copy of a resource will always be used, unless:
- the cache has never seen the resource before
- the cache has dropped its copy of the resource for arbitrary reasons of its own
- the cache can, through rules, invalidate the cached copy
Invalidation uses all that information in the headers, and usually a clock, to reject stale items from the cache.
There are more cache-relevant headers than you might expect, and some trump others. For example, the standards instruct us to ignore `Expires` if `Cache-Control: max-age=...` is present. We will not attempt to discuss them all here. Instead we'll describe a few scenarios in which some modern headers influence caching.
As a blog publisher, publishing a couple of articles a day, I don't want my server to be hammered by requests as my users return to the page. But equally, when I publish an article, I want eyes on it. So I decide that I want clients and proxies to cache for 5 minutes.
I achieve this by configuring the server to set the header `Cache-Control: max-age=300` (300 seconds = 5 minutes).
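For instance, if the blog were served by a Node.js/Express app (an assumption for illustration; the same header can be set in nginx, Apache, or a CDN console), it might look like this:

```js
import express from "express";

const app = express();

// Serve articles with a five-minute cache lifetime.
app.get("/articles/:slug", (req, res) => {
  res.set("Cache-Control", "max-age=300"); // seconds
  res.send(`<h1>${req.params.slug}</h1><p>Article body goes here.</p>`);
});

app.listen(3000);
```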
Now a well-behaved client will, if it hasn't seen this resource before, set an expiration time of now + 300s.
The next time the client wants that resource, it will compare its clock time with the expiration (there's a sketch of this check in code below):
- if `now < expirationTime`: use the cached copy
- else: forget the cached copy, fetch anew (and create another cached copy)
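Here is a minimal sketch of that check as a JavaScript wrapper around `fetch`, using a hypothetical in-memory cache keyed by URL. It also covers the "never seen it before" and "invalidated" cases from the rules earlier; a real cache is far more sophisticated.

```js
// Toy cache: URL -> { response, expiresAt }. Real caches also handle no-store,
// Vary, heuristics, storage limits, and much more.
const cache = new Map();

function maxAgeSeconds(cacheControl) {
  const match = /max-age=(\d+)/.exec(cacheControl ?? "");
  return match ? Number(match[1]) : null;
}

async function cachedFetch(url) {
  const entry = cache.get(url);
  if (entry && Date.now() < entry.expiresAt) {
    return entry.response.clone(); // still fresh: use the cached copy
  }
  // Never seen, dropped, or expired: fetch anew and (maybe) cache the new copy.
  const response = await fetch(url);
  const maxAge = maxAgeSeconds(response.headers.get("cache-control"));
  if (maxAge !== null) {
    cache.set(url, {
      response: response.clone(),
      expiresAt: Date.now() + maxAge * 1000,
    });
  }
  return response;
}
```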
Now browsers won't fetch the resource more than once every 5 minutes -- unless the user overrides with a forced reload.
Better still, proxies will also honour this, so if 1000 people are coming through the same proxy, between them they'll only send one request to the origin every 5 minutes.
(You might be wondering how the 5 minutes don't stack up to more through chains of proxies, or how a hard refresh works when there's a proxy in the mix. The answers: the `Age` header and `Cache-Control: no-cache`.)
As a JS application running in a browser, I want to poll a list of posts, so I can update the user's view.
I poll every 10 seconds, but fetching the whole list of posts again every 10 seconds is affecting performance and server costs.
So, each time I poll, I record a `lastPoll` timestamp, and I add a request header: `If-Modified-Since: {lastPoll}`.
Now, if the resource has been modified since that time, the origin will respond with a `200 OK` and the body, as normal. But if the resource has not been modified, it will respond with `304 Not Modified`. My app can keep using the copy it already has.
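A sketch of that polling loop (the `/posts` URL is an assumption, and error handling is omitted):

```js
let lastPoll = null; // HTTP-date of the previous poll, e.g. "Wed, 09 Oct 2024 15:06:05 GMT"
let posts = [];      // the copy we already have

async function poll() {
  const headers = lastPoll ? { "If-Modified-Since": lastPoll } : {};
  lastPoll = new Date().toUTCString();
  const res = await fetch("/posts", { headers });
  if (res.status === 200) {
    posts = await res.json(); // changed: take the new body
  }
  // On 304 Not Modified there is no body: keep using `posts` as-is.
}

setInterval(poll, 10_000); // poll every 10 seconds
```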
Static site servers generally honour `If-Modified-Since`, based on file metadata, as you might expect.
Proxies are likely to use `If-Modified-Since` to extend the validity of a cached resource that's past its `max-age`.
If the `max-age` has passed, they re-request the resource with `If-Modified-Since` set to the time it was last fetched. If the response is `304`, they keep the cached copy and update its timestamps.
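A rough sketch of that revalidation step, assuming a cache entry shaped like `{ response, fetchedAt, maxAge, expiresAt }` (an illustrative shape, not anything standardised):

```js
// Revalidate an expired entry instead of blindly re-downloading it.
async function revalidate(url, entry) {
  const res = await fetch(url, {
    headers: { "If-Modified-Since": entry.fetchedAt.toUTCString() },
  });
  if (res.status === 304) {
    // Unchanged at the origin: keep the cached body, extend its freshness.
    entry.fetchedAt = new Date();
    entry.expiresAt = Date.now() + entry.maxAge * 1000;
    return entry.response.clone();
  }
  // Changed: replace the cached copy with the new response.
  entry.response = res.clone();
  entry.fetchedAt = new Date();
  entry.expiresAt = Date.now() + entry.maxAge * 1000;
  return res;
}
```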
Sometimes timestamps aren't the most appropriate way to think about freshness. `ETag` is a mechanism to conditionally fetch versions of resources without using time.
We'd typically encounter the ETag as a response header when we first fetch the resource. As a browser app retrieving a list of posts, again, perhaps the response is:
200 OK
ETag: "bc8fce5e"
The ETag is a unique version tag for a resource. There are no official semantics or specifications for how it's generated. It could be a UUID, a hash, an incrementing number. What's specified is that if the ETag remains the same, then the body also remains the same.
So, next time my client polls, it can use the ETag in a request header: `If-None-Match: "bc8fce5e"`.
If the server's current version of the resource has a matching ETag, it may respond with `304 Not Modified`, saving network and computational resources for both client and server. Otherwise it responds with `200 OK` and the body, as normal, with a new ETag.
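The polling client from earlier, switched from timestamps to ETags (same assumptions as before):

```js
let etag = null; // ETag from the last 200 response
let posts = [];

async function pollPosts() {
  const headers = etag ? { "If-None-Match": etag } : {};
  const res = await fetch("/posts", { headers });
  if (res.status === 200) {
    etag = res.headers.get("ETag");
    posts = await res.json();
  }
  // On 304 Not Modified, the body (and ETag) we already have are still good.
}

setInterval(pollPosts, 10_000);
```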
(Note: while we're concentrating on cacheable `GET` requests, ETag has broader uses. You can use `If-Match` to guard a `PUT` request against being based on a stale previous version.)
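For completeness, a sketch of that `If-Match` guard (the URL and payload are made up):

```js
// Only apply the update if the server's version still matches the one we read.
const res = await fetch("/posts/42", {
  method: "PUT",
  headers: {
    "Content-Type": "application/json",
    "If-Match": '"bc8fce5e"', // the ETag we received when we fetched the post
  },
  body: JSON.stringify({ title: "Updated title" }),
});
if (res.status === 412) {
  console.warn("Precondition Failed: someone changed the post since we read it");
}
```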
Caching contributes greatly to the efficiency and usability of the web. But sometimes aggressive caching causes pain for site owners. A classic case is setting a very long cache expiration on elements of a site, then later republishing the site with some of those elements changed, while retaining the same URL. For example, http://example.com/team-photo.jpeg might be changed when a new person joins, but since an admin set `Cache-Control: max-age=31536000`, users with a cache won't see the change for a year!
So, a fairly common strategy is to add a "cache-busting" suffix to the filename, and update links to it accordingly: http://example.com/team-photo-ac55de2a.jpeg. This assumes that the HTML linking to this resource is not so aggressively cached; long max-ages are typically reserved for large media resources.
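Bundlers and static site generators usually produce these hashed names for you. If you were doing it by hand in a Node.js build script, the idea is roughly this (file names are assumptions):

```js
import { createHash } from "node:crypto";
import { copyFileSync, readFileSync } from "node:fs";

// Derive a short content hash and copy the file to a "cache-busted" name.
const original = "team-photo.jpeg";
const hash = createHash("sha256").update(readFileSync(original)).digest("hex").slice(0, 8);
const busted = `team-photo-${hash}.jpeg`; // e.g. team-photo-ac55de2a.jpeg

copyFileSync(original, busted);
console.log(`Update your HTML to reference /${busted}`);
```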
I see cache-busting as something of a hack; use it when you're forced to, but for the most part really long `max-age` values aren't necessary.
A lot of caching just quietly happens while you're not looking, and Just Works. Sometimes you notice it when it goes wrong -- hence cache-busting. Requests initiated by the browser itself (as opposed to by JavaScript), and interactions between proxies and origin servers, generally set and interpret cache-control headers smartly, so you always get fresh-enough content but avoid overuse of the network.
Servers serving static sites from a filesystem (or the equivalent) will generally set the headers necessary for some caching. Sometimes it pays to change the configuration from the defaults.
Reverse proxies are built into services like AWS CloudFront, Cloudflare, etc., and content will be cached at the edge by default. You can configure their caching policies, and the headers you set in the client and the origin server will have an effect.
If you're programmatically handling GET requests in a server, it's up to you to handle cache directives if you want to -- or find a library that does it. Interpreting `If-Modified-Since`, or supporting ETags, may save you hosting costs and improve application performance, especially if you also control the client.
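For example, a hand-rolled ETag check in an Express handler might look like this (a sketch; `loadPosts` is a stand-in for however you load your data, and Express can also generate ETags for `res.send` automatically):

```js
import express from "express";
import { createHash } from "node:crypto";

const app = express();
const loadPosts = () => [{ id: 1, title: "Hello" }]; // stand-in data source

app.get("/posts", (req, res) => {
  const body = JSON.stringify(loadPosts());
  const etag = `"${createHash("sha256").update(body).digest("hex").slice(0, 16)}"`;
  res.set("ETag", etag);
  if (req.get("If-None-Match") === etag) {
    return res.status(304).end(); // the client's copy is still good: send no body
  }
  res.type("application/json").send(body);
});

app.listen(3000);
```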
If you're programmatically making GET requests in the browser, you may get caching for free, but test and find out.
If you're programmatically making GET requests in a back-end application, it's probable that you'll need to take some proactive steps to use caching effectively.
- You can explicitly code for `If-Modified-Since` or `If-None-Match`.
- You can explicitly look at `max-age` or `Expires` to avoid re-fetching fresh resources.
- Or you can reach for your language's library ecosystem. For example, for Axios on Node.js there is Axios Cache Interceptor. This will (for example) inject an `If-Modified-Since` header into your requests, intercept `304 Not Modified` responses, and replace them with a `200 OK` containing the cached resource (see the sketch below).
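A minimal usage sketch (check the library's documentation for the current API and options):

```js
import Axios from "axios";
import { setupCache } from "axios-cache-interceptor";

// Wrap a plain axios instance; the interceptor stores responses and
// revalidates stale ones using the headers discussed above.
const axios = setupCache(Axios.create());

const first = await axios.get("https://example.com/posts");
const second = await axios.get("https://example.com/posts");
console.log(second.cached); // true if served from the interceptor's cache
```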