Speed and cost
- I'm a browser and the user returns to a photo page. Why spend the time and network bandwidth fetching it again?
- I'm a proxy and a client has requested a static page I served another client recently. Why fetch it from the origin again?
Storage
- I'm a browser and the user returns to a photo page. But they were last there weeks ago; if I kept everything that long, the hard drive would be full.
- I'm a proxy and a client has requested a static page I served another client today. But I served loads of stuff today. I can't keep all of it just in case!
Freshness
- I'm a browser and the user returns to a newspaper front page. I could give them the same data as last time, but they want up-to-date news.
- I'm a proxy and a client has requested a static page I served another client today. But that page was updated by its owner since I last served it. The client doesn't want my stale version.
This is fundamental to an efficient WWW, and hence the HTTP standards have it all nailed down.
See: https://www.rfc-editor.org/rfc/rfc9111.html
At a fairly fundamental level, `GET` is the HTTP method that benefits from caching. The other mainstream methods either have (potential) side-effects, or no body to cache.
If you want caching, use `GET`.
We have three collaborators here. You can't force any of them to do anything, but you want them to cooperate so that everyone gets the best outcome:
- browser wants:
  - snappy browsing
  - efficient use of its own resources (storage, network)
  - non-stale content
- proxy wants:
  - snappy browsing for its clients
  - efficient use of its own resources (storage, downstream network)
  - to deliver non-stale content
  - minimal redundant requests (the same client asking for the same content again)
- origin server wants:
  - to deliver up-to-date content when necessary
  - minimal redundant requests (the same client asking for the same content again, even if that client is a proxy)
Also remember that there could be any number of proxies, in a chain -- or none.
How can caching be controlled, when these components are loosely coupled, and have partially overlapping priorities?
The answer is HTTP headers (there's a quick way to inspect these for yourself, just after the examples below):
- Response headers: the origin server tells the proxy and the client things that help them decide about caching
  - Example: `Expires: Wed, 09 Oct 2024 13:28:00 GMT`
  - Example: `Cache-Control: max-age=300`
- Request headers: the client tells the server things that help it respond in cache-friendly ways
  - Example: `If-Modified-Since: Wed, 09 Oct 2024 15:06:05 GMT`
  - Example: `If-None-Match: "bc8fce5e"`
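If you want to see these for yourself, here's a quick sketch (runnable with Node.js 18+, or in a browser console against a same-origin URL; `https://example.com/` is just a stand-in for any resource you care about):

```js
// Log the cache-relevant headers of a response.
const res = await fetch("https://example.com/");
for (const name of ["cache-control", "expires", "last-modified", "etag", "age"]) {
  console.log(`${name}: ${res.headers.get(name)}`);
}
```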
A fun thing is, almost all of it is "advisory". Because the web is so focussed on backward-compatibility, loose coupling, and untrusted collaborators, when a server responds with, for example, `Cache-Control: no-cache`, there is no way to guarantee that a proxy even understands that directive, or that it won't go ahead and cache the response anyway.
However, because the collaborators' interests are for the most part aligned, in mainstream circumstances you can expect headers to be honoured.
In this document, the way we'll think of things is that regardless of which cache it is -- browser, proxy, whatever -- a cached copy of a resource will always be used, unless:
- the cache has never seen the resource before
- the cache has dropped its copy of the resource for arbitrary reasons of its own
- the cache can, through rules, invalidate the cached copy
Invalidation uses all that information in the headers, and usually a clock, to reject stale items from the cache.
There are more cache-relevant headers than you might expect, and some trump others. For example, the standards instruct us to ignore `Expires` if `Cache-Control: max-age=...` is present. We will not attempt to discuss them all here. Instead we'll describe a few scenarios in which some modern headers influence caching.
As a blog publisher, publishing a couple of articles a day, I don't want my server to be hammered by requests as my users return to the page. But equally, when I publish an article, I want eyes on it. So I decide that I want clients and proxies to cache for 5 minutes.
I achieve this by configuring the server to set the header `Cache-Control: max-age=300` (300 seconds = 5 minutes).
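For instance, if the blog were served by a Node.js/Express app (an assumption for illustration; the same header can be set in nginx, Apache, or a CDN console), it might look like this:

```js
import express from "express";

const app = express();

// Serve articles with a five-minute cache lifetime.
app.get("/articles/:slug", (req, res) => {
  res.set("Cache-Control", "max-age=300"); // seconds
  res.send(`<h1>${req.params.slug}</h1><p>Article body goes here.</p>`);
});

app.listen(3000);
```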
Now a well-behaved client will, if it hasn't seen this resource before, set an expiration time of now + 300s.
The next time the client wants that resource, it will compare its clock time with the expiration (there's a sketch of this check in code below):
- if `now < expirationTime`: use the cached copy
- else: forget the cached copy, fetch anew (and create another cached copy)
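Here is a minimal sketch of that check as a JavaScript wrapper around `fetch`, using a hypothetical in-memory cache keyed by URL. It also covers the "never seen it before" and "invalidated" cases from the rules earlier; a real cache is far more sophisticated.

```js
// Toy cache: URL -> { response, expiresAt }. Real caches also handle no-store,
// Vary, heuristics, storage limits, and much more.
const cache = new Map();

function maxAgeSeconds(cacheControl) {
  const match = /max-age=(\d+)/.exec(cacheControl ?? "");
  return match ? Number(match[1]) : null;
}

async function cachedFetch(url) {
  const entry = cache.get(url);
  if (entry && Date.now() < entry.expiresAt) {
    return entry.response.clone(); // still fresh: use the cached copy
  }
  // Never seen, dropped, or expired: fetch anew and (maybe) cache the new copy.
  const response = await fetch(url);
  const maxAge = maxAgeSeconds(response.headers.get("cache-control"));
  if (maxAge !== null) {
    cache.set(url, {
      response: response.clone(),
      expiresAt: Date.now() + maxAge * 1000,
    });
  }
  return response;
}
```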
Now browsers won't fetch the resource more than once every 5 minutes -- unless the user overrides with a forced reload.
Better still, proxies will also honour this, so if 1000 people are coming through the same proxy, between them they'll only send one request to the origin every 5 minutes.
(You might be wondering how the 5 minutes don't stack up to more through chains of proxies, or how a hard refresh works when there's a proxy in the mix. The answers: the `Age` header and `Cache-Control: no-cache`.)
As a JS application running in a browser, I want to poll a list of posts, so I can update the user's view.
I poll every 10 seconds, but fetching the whole list of posts again every 10 seconds is affecting performance and server costs.
So, each time I poll, I record a `lastPoll` timestamp, and I add a request header: `If-Modified-Since: {lastPoll}`.
Now, if the resource has been modified since that time, the origin will respond with a `200 OK` and the body, as normal. But if the resource has not been modified, it will respond with `304 Not Modified`. My app can keep using the copy it already has.
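A sketch of that polling loop (the `/posts` URL is an assumption, and error handling is omitted):

```js
let lastPoll = null; // HTTP-date of the previous poll, e.g. "Wed, 09 Oct 2024 15:06:05 GMT"
let posts = [];      // the copy we already have

async function poll() {
  const headers = lastPoll ? { "If-Modified-Since": lastPoll } : {};
  lastPoll = new Date().toUTCString();
  const res = await fetch("/posts", { headers });
  if (res.status === 200) {
    posts = await res.json(); // changed: take the new body
  }
  // On 304 Not Modified there is no body: keep using `posts` as-is.
}

setInterval(poll, 10_000); // poll every 10 seconds
```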
Static site servers generally honour `If-Modified-Since`, based on file metadata, as you might expect.
Proxies are likely to use `If-Modified-Since` to extend the validity of a cached resource that's past its `max-age`.
If the `max-age` has passed, they re-request the resource with `If-Modified-Since` set to the time it was last fetched. If the response is `304`, they keep the cached copy and update its timestamps.
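A rough sketch of that revalidation step, assuming a cache entry shaped like `{ response, fetchedAt, maxAge, expiresAt }` (an illustrative shape, not anything standardised):

```js
// Revalidate an expired entry instead of blindly re-downloading it.
async function revalidate(url, entry) {
  const res = await fetch(url, {
    headers: { "If-Modified-Since": entry.fetchedAt.toUTCString() },
  });
  if (res.status === 304) {
    // Unchanged at the origin: keep the cached body, extend its freshness.
    entry.fetchedAt = new Date();
    entry.expiresAt = Date.now() + entry.maxAge * 1000;
    return entry.response.clone();
  }
  // Changed: replace the cached copy with the new response.
  entry.response = res.clone();
  entry.fetchedAt = new Date();
  entry.expiresAt = Date.now() + entry.maxAge * 1000;
  return res;
}
```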
Sometimes timestamps aren't the most appropriate way to think about freshness. `ETag` is a mechanism to conditionally fetch versions of resources without using time.
We'd typically encounter the ETag as a response header when we first fetch the resource. As a browser app retrieving a list of posts, again, perhaps the response is:
200 OK
ETag: "bc8fce5e"
The ETag is a unique version tag for a resource. There are no official semantics or specifications for how it's generated. It could be a UUID, a hash, an incrementing number. What's specified is that if the ETag remains the same, then the body also remains the same.
So, next time my client polls, it can use the ETag in a request header: `If-None-Match: "bc8fce5e"`.
If the server's current version of the resource has a matching ETag, it may respond with `304 Not Modified`, saving network and computational resources for both client and server. Otherwise it responds with `200 OK` and the body, as normal, with a new ETag.
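The polling client from earlier, switched from timestamps to ETags (same assumptions as before):

```js
let etag = null; // ETag from the last 200 response
let posts = [];

async function pollPosts() {
  const headers = etag ? { "If-None-Match": etag } : {};
  const res = await fetch("/posts", { headers });
  if (res.status === 200) {
    etag = res.headers.get("ETag");
    posts = await res.json();
  }
  // On 304 Not Modified, the body (and ETag) we already have are still good.
}

setInterval(pollPosts, 10_000);
```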
(Note: while we're concentrating on cacheable `GET` requests, ETag has broader uses. You can use `If-Match` to guard a `PUT` request against being based on a stale previous version.)
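For completeness, a sketch of that `If-Match` guard (the URL and payload are made up):

```js
// Only apply the update if the server's version still matches the one we read.
const res = await fetch("/posts/42", {
  method: "PUT",
  headers: {
    "Content-Type": "application/json",
    "If-Match": '"bc8fce5e"', // the ETag we received when we fetched the post
  },
  body: JSON.stringify({ title: "Updated title" }),
});
if (res.status === 412) {
  console.warn("Precondition Failed: someone changed the post since we read it");
}
```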
Caching contributes greatly to the efficiency and usability of the web. But sometimes aggressive caching causes pain for site owners. A classic case is setting a very long cache expiration on elements of a site, then later republishing the site with some of those elements changed, while retaining the same URL. For example, http://example.com/team-photo.jpeg might be changed when a new person joins, but since an admin set `Cache-Control: max-age=31536000`, users with a cache won't see the change for a year!
So, a fairly common strategy is to add a "cache-busting" suffix to the filename, and update links to it accordingly: http://example.com/team-photo-ac55de2a.jpeg. This assumes that the HTML linking to this resource is not so aggressively cached; long max-ages are typically reserved for large media resources.
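Bundlers and static site generators usually produce these hashed names for you. If you were doing it by hand in a Node.js build script, the idea is roughly this (file names are assumptions):

```js
import { createHash } from "node:crypto";
import { copyFileSync, readFileSync } from "node:fs";

// Derive a short content hash and copy the file to a "cache-busted" name.
const original = "team-photo.jpeg";
const hash = createHash("sha256").update(readFileSync(original)).digest("hex").slice(0, 8);
const busted = `team-photo-${hash}.jpeg`; // e.g. team-photo-ac55de2a.jpeg

copyFileSync(original, busted);
console.log(`Update your HTML to reference /${busted}`);
```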
I see cache-busting as something of a hack; use it when you're forced to, but for the most part really long `max-age` values aren't necessary.
A lot of caching just quietly happens while you're not looking, and Just Works. Sometimes you notice it when it goes wrong -- hence cache-busting. Requests initiated by the browser itself (as opposed to by JavaScript), and interactions between proxies and origin servers, generally set and interpret cache-control headers smartly, so you always get fresh-enough content but avoid overuse of the network.
Servers serving static sites from a filesystem (or the equivalent) will generally set the headers necessary for some caching. Sometimes it pays to change the configuration from the defaults.
Reverse proxies are built into services like AWS CloudFront, Cloudflare, etc., and content will be cached at the edge by default. You can configure their caching policies, and the headers you set in the client and the origin server will have an effect.
If you're programmatically handling GET requests in a server, it's up to you to handle cache directives if you want to -- or find a library that does it. Interpreting `If-Modified-Since`, or supporting ETags, may save you hosting costs and improve application performance, especially if you also control the client.
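For example, a hand-rolled ETag check in an Express handler might look like this (a sketch; `loadPosts` is a stand-in for however you load your data, and Express can also generate ETags for `res.send` automatically):

```js
import express from "express";
import { createHash } from "node:crypto";

const app = express();
const loadPosts = () => [{ id: 1, title: "Hello" }]; // stand-in data source

app.get("/posts", (req, res) => {
  const body = JSON.stringify(loadPosts());
  const etag = `"${createHash("sha256").update(body).digest("hex").slice(0, 16)}"`;
  res.set("ETag", etag);
  if (req.get("If-None-Match") === etag) {
    return res.status(304).end(); // the client's copy is still good: send no body
  }
  res.type("application/json").send(body);
});

app.listen(3000);
```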
If you're programmatically making GET requests in the browser, you may get caching for free, but test and find out.
If you're programmatically making GET requests in a back-end application, it's probable that you'll need to take some proactive steps to use caching effectively.
- You can explicitly code for `If-Modified-Since` or `If-None-Match`.
- You can explicitly look at `max-age` or `Expires` to avoid re-fetching fresh resources.
- Or you can reach for your language's library ecosystem. For example, for Axios on Node.js there is Axios Cache Interceptor. This will (for example) inject an `If-Modified-Since` header into your requests, intercept `304 Not Modified` responses, and replace them with a `200 OK` containing the cached resource (see the sketch below).
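A minimal usage sketch (check the library's documentation for the current API and options):

```js
import Axios from "axios";
import { setupCache } from "axios-cache-interceptor";

// Wrap a plain axios instance; the interceptor stores responses and
// revalidates stale ones using the headers discussed above.
const axios = setupCache(Axios.create());

const first = await axios.get("https://example.com/posts");
const second = await axios.get("https://example.com/posts");
console.log(second.cached); // true if served from the interceptor's cache
```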