- HTTP Archive page speed reports
- Cost to the user
- Web performance optimisation case studies
- 2012, Page weight matters
- 2013, Ericsson mobility report (Visualiser with 2020 data)
- 2014, "websites do not need to look exactly the same in every browser"
- 2014, "there is no fold"
- July 2018, Google adds page speed to ranking
- 2019, The cost of JavaScript
- Is it happening? -> First paint, First contentful paint
- Is it useful? (Can users engage with the content?) -> First meaningful paint
- Is it usable? (Can users interact with the page?) -> Time to interactive, DOMContentLoaded
- Is it delightful? (Are the interactions smooth and natural?) -> Cumulative layout shift
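The first two of these can be read in the browser with the Performance Observer API. A minimal sketch (the `paint` entry type is standard, though not supported by every browser):

```js
// Log first-paint and first-contentful-paint as the browser reports them.
// `buffered: true` replays entries that fired before the observer was created.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(entry.name, Math.round(entry.startTime), 'ms');
  }
}).observe({ type: 'paint', buffered: true });
```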
This is the time from when the request is initiated until the browser receives some data from the server
It depends on:
- Latency
- Connection speed
- Server timing
Each of these steps adds delay:
- The user makes the initial request e.g. by typing in a URL or clicking a link
- The browser resolves the domain
- The browser establishes a connection to the server
- The browser and server negotiate a secure connection
- The server issues potential redirects
- The browser receives the HTML response from the server
Round trip time (RTT) to the server is a huge factor, which is why we use CDNs.
Ilya Grigorik wrote an article about this: Latency: The New Web Performance Bottleneck
As an example, if a request has to cross the Atlantic ocean, a round trip to the server will take around 100 milliseconds at best, and on a slow network the RTT can be more than a second.
The speed of the network also matters; mobile networks are a lot slower.
Every 3rd party request to a new domain is a new connection process.
CSS and scripts in the head tag block following content from rendering.
Largest contentful paint (LCP) is the render time of the largest content element visible in the viewport.
It's a rough indicator of when the user can meaningfully engage with the website.
The following resources can all cause delays between the first paint and LCP, because the browser has to make additional round trips to the server to process them:
- Images (additional round trips)
- Custom fonts
- Javascript
Images are particularly problematic because after they load, the text beneath them reflows, which is very disruptive.
Images that use data URIs are blocking: since all the data is already in the document, the browser starts rendering them immediately, before the content that follows.
By default, fetching external scripts blocks rendering as well.
If a server fails to respond, browsers can wait ages for a response, which leaves the website in a broken state. Chrome waits 30s; iOS waits 75s.
3rd party scripts can request other 3rd party scripts. Request Map is a useful tool for visualising them. For example, the Optimizely JS makes 3 more requests to other Optimizely URLs.
3rd party scripts can inject content haphazardly
Median time to interactive is 9.3 seconds on mobile. CNN is 10s on desktop!
You should aim for < 5s on a slow 3G connection on a median mobile device.
When using frameworks like React, there is a pattern of rendering a version of the page on the server and then rehydrating on the client. But this forces the device to render the content twice!
The React docs state:
If you intentionally need to render something different on the server and the client, you can do a two-pass rendering. Components that render something different on the client can read a state variable like this.state.isClient, which you can set to true in componentDidMount(). This way the initial render pass will render the same content as the server, avoiding mismatches, but an additional pass will happen synchronously right after hydration. Note that this approach will make your components slower because they have to render twice, so use it with caution.
But this comes with the following warning:
Remember to be mindful of user experience on slow connections. The JavaScript code may load significantly later than the initial HTML render, so if you render something different in the client-only pass, the transition can be jarring. However, if executed well, it may be beneficial to render a “shell” of the application on the server, and only show some of the extra widgets on the client. To learn how to do this without getting the markup mismatch issues, refer to the explanation in the previous paragraph.
Rendering on the Web describes different rendering models, including this one.
The primary downside of SSR with rehydration is that it can have a significant negative impact on Time To Interactive, even if it improves First Paint. SSR’d pages often look deceptively loaded and interactive, but can’t actually respond to input until the client-side JS is executed and event handlers have been attached. This can take seconds or even minutes on mobile.
Perhaps you’ve experienced this yourself - for a period of time after it looks like a page has loaded, clicking or tapping does nothing. This quickly becomes frustrating... “Why is nothing happening? Why can’t I scroll?”
Performance metrics collected from real websites using SSR rehydration indicate its use should be heavily discouraged. Ultimately, the reason comes down to User Experience: it's extremely easy to end up leaving users in an “uncanny valley”.
There’s hope for SSR with rehydration, though. In the short term, only using SSR for highly cacheable content can reduce the TTFB delay, producing similar results to prerendering. Rehydrating incrementally, progressively, or partially may be the key to making this technique more viable in the future.
The total kilobytes metric is important as data costs money. But it's not something a user would notice directly until they saw their bill.
- Input delay: how long does it take to respond to user interaction
- Custom metrics for your site (for example, Twitter used the time to open the tweet box)
What's causing the metric to be slower than what we expect?
-> Set goals and budgets -> Avoid regression
Focus first on everything to the left of the "Start render" line, because all of those resources block rendering.
The browser main thread row shows how busy the main thread is.
How to read a WebPageTest Waterfall View chart by Matt Hobbs
- This is the best place to start
- Script tab can be used for automation
- Block tab can filter out URLs, so you can see how dependent your site is on particular URLs
- SPOF tab (single point of failure) mimics real life failures - you can emulate what happens if a dependency is hanging
- Can use blackhole.webpagetest.org directly in source code as well
This simulates devices rather than using real devices. Gives a high level score, web vitals metrics, and recommendations.
Cmd+Option+I
Firefox/Chrome:
- Performance tab tells you why page might feel laggy after loading
- Network tab shows waterfall
Chrome audits tab creates Lighthouse reports locally
Can connect a real mobile device with USB for remote debugging
- Budgets are the worst performance that's acceptable
- Goals are where you want to be
Start with competitors
See where competitors are doing better than you
Or use statistics from HTTP archive
E.g. 53% of visits to mobile sites are abandoned after 3 seconds
Can mark this on presentations
- Setting a performance budget
- Performance budgets, pragmatically
- Responsive design on a budget
- Performance budget calculator - estimate TTI based on asset sizes
- Lighthousebot integrates with github, but doesn't catch everything happening on production
- Speedcurve - dashboard and alerts
- Calibre - good for communicating cost of 3rd party trackers and ads
- Audit sites worked on
- Identify 3 most critical performance problems
Cloudinary is a service that handles serving appropriately sized images and videos. Can use it for media management.
Downside is dependency on 3rd party URLs:
- point of failure
- additional network costs
- ImageOptim - a mac desktop app (jpg and png)
- Optimage - a mac desktop app (freemium). Supports other formats like webp
- WebPonize - converts to WebP
- Imagemin - compresses images
WebP is often lighter than jpg and png. Support is pretty good, but older browsers don't support it.
One way to serve it is to configure the server to rewrite jpg/png requests to WebP when the browser supports it, as in the sketch below.
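A minimal sketch of that idea as Node/Express middleware - the framework, the `public` directory, and the file layout are assumptions for illustration, not something the notes specify:

```js
const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();

// If the browser advertises WebP support and a .webp sibling of the
// requested jpg/png exists on disk, serve that instead.
app.use((req, res, next) => {
  const accepts = req.headers.accept || '';
  if (/\.(jpe?g|png)$/i.test(req.path) && accepts.includes('image/webp')) {
    const webpFile = path.join(
      __dirname, 'public', req.path.replace(/\.(jpe?g|png)$/i, '.webp')
    );
    if (fs.existsSync(webpFile)) {
      res.set('Vary', 'Accept'); // caches must key on the Accept header
      return res.sendFile(webpFile);
    }
  }
  next();
});

app.use(express.static('public'));
app.listen(3000);
```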
AVIF is even newer https://jakearchibald.com/2020/avif-has-landed/
- HandBrake
- MiroVideoConverter
MPEG-4 works everywhere - but big file size
WebM - better compression, less supported
Recommended: serve both and negotiate with the browser (see the sketch below)
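A minimal sketch of that negotiation (filenames are illustrative; the browser plays the first `source` type it supports):

```html
<video controls width="640">
  <source src="clip.webm" type="video/webm">
  <source src="clip.mp4" type="video/mp4">
  Sorry, your browser doesn't support embedded video.
</video>
```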
Fonts
- "subsetting" removes characters that aren't used. e.g. removing languages that won't be used.
- glyphhanger crawls a site and tells you what should be supported
- WOFF 2.0 is the most modern format (but isn't supported by IE)
- Can use WOFF 1 as a fallback for IE11 (see the sketch after this list)
- Can package multiple variants of a font in one file
- Font foundries offer fonts as a service, e.g. Google Fonts - but this has the cost of a 3rd party request. Recommended: host your own.
- There may be licensing restrictions on hosting fonts on your own site
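A minimal `@font-face` sketch of the WOFF 2/WOFF fallback mentioned above (family name and paths are illustrative):

```css
@font-face {
  font-family: "Body";
  src: url("/fonts/body-subset.woff2") format("woff2"), /* modern browsers */
       url("/fonts/body-subset.woff") format("woff");   /* IE11 fallback */
  font-weight: 400;
  font-style: normal;
}
```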
- Minifying (e.g. removing unnecessary whitespace)
- CSS minifier is a tool for 1 off minifying
- cssmin is a build tool
- svgo is the standard tool. lots of options and good defaults.
- uglify-js
Webpack can do "Tree Shaking" (dead-code elimination in javascript)
- UnCSS removes unused rules
- UnCSS online can be used to test it out
- deflate is the main algorithm used (gzip)
- in network panel in dev tools you should see a difference between transfer size and actual size
- Brotli makes smaller files than gzip (it's a separate format, not deflate). The browser tells the server it can accept it via `Accept-Encoding: br`.
- Should degrade to gzip for older browsers
- Real-World Effectiveness of Brotli: Brotli FCP improvement vs. Gzip: 3.462% decrease
Latency
- physical distance
- connection speed
Upfront connections (DNS/TLS/etc)
Static files are faster than dynamically generated files
High performance browser networking
e.g. trailing slashes (use a canonical tag and serve both; this is no longer needed for SEO)
Can use caching plugins or CDNs to serve static versions, mimicking the static server situation
e.g. Eleventy (11ty), Jekyll
e.g. React
good for time to first byte
e.g. initially deliver a skeleton with grey boxes
not very useful for users
The user gets nothing until they get everything
The browser blocks rendering while fetching standard link/script tags in the head of the page. "blocking file latency"
If they're not needed for rendering, you can delay scripts with the `defer` or `async` attributes. This is a common pattern for JS.
- Async - less common. Executes whenever it arrives.
- Defer - executes just before `DOMContentLoaded`, respects order in the HTML.
type="module"
scripts are defered by default.- You can use a script to generate a script tag and make that async
Deferring JavaScript is the biggest, easiest improvement for time to first paint.
Moving scripts to the end of the page also makes them load later (see the sketch below).
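A minimal sketch of the options (filenames are illustrative):

```html
<!-- Blocks parsing while it downloads and executes -->
<script src="legacy.js"></script>

<!-- Downloads in parallel, executes whenever it arrives (order not guaranteed) -->
<script src="analytics.js" async></script>

<!-- Downloads in parallel, executes in document order just before DOMContentLoaded -->
<script src="app.js" defer></script>

<!-- Module scripts are deferred by default -->
<script type="module" src="app.mjs"></script>
```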
You may want to load some critical JS early:
- Feature tests
- Polyfills
- File loaders
- Conditional logic to bootstrap the page
The goal is to deliver one smooth rendering, so you want to "enhance optimistically"
With this pattern there is a small amount of JS in the head that adds a class to the page saying "this is being enhanced". You can style based on the presence of that class even while waiting for the rest of the JS. If the script fails to load after a reasonable amount of time, you can remove the class again to fall back to the non-enhanced version of the page (see the sketch below).
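A minimal sketch of the pattern - the class name, the timeout, and the flag the main bundle would set are all assumptions for illustration:

```html
<script>
  // Optimistically mark the page as "being enhanced".
  document.documentElement.classList.add('enhanced');

  // If the main bundle hasn't signalled success after a reasonable wait,
  // back out to the non-enhanced version of the page.
  setTimeout(function () {
    if (!window.enhancementsLoaded) { // flag assumed to be set by the main JS
      document.documentElement.classList.remove('enhanced');
    }
  }, 8000);
</script>
```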
There's no way to add `async`/`defer` to CSS. But print stylesheets always download in the background.
This leads to a hack for loading CSS asynchronously: `<link rel="stylesheet" href="site.css" media="print" onload="this.media='all'">`
You generally don't want to load CSS async because of the flash of unstyled content, but there is a pattern that uses it only for "critical CSS".
This leaves the HTML the same but the server/edge has to do stuff.
Cloudflare supports it via a header https://www.cloudflare.com/en-gb/website-optimization/http2/serverpush/
But it was removed from Edge and Chrome.
Indicates resources that will be required for rendering later on so you can load late-discovered resources early.
This differs from `rel="prefetch"`, which is low priority and intended for the next navigation.
Example use cases:
- Resources that are pointed to from inside CSS, like fonts or images.
- Resources that JavaScript can request, like JSON, imported scripts, or web workers.
- Larger images and video files.
<link rel="preload" href="font.woff2" as="font" type="font/woff2" crossorigin>
If external CSS resources are small, you can insert them directly into the HTML. This is bad for caching though. Tip: identify and inline the CSS necessary for rendering "above the fold" content.
`grunt-criticalcss` is one of the tools that extracts critical CSS. This can be combined with the async trick to load the rest of it.
- You can have shared vs template CSS files to make use of cached files across pages.
- You can use media queries to load CSS for different screens separately (see the sketch below). If the query doesn't match, the file loads asynchronously, like print stylesheets.
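A minimal sketch of the media-query split (filenames and breakpoint are illustrative):

```html
<!-- Always render-blocking -->
<link rel="stylesheet" href="base.css">
<!-- Loaded without blocking rendering on viewports where it doesn't match -->
<link rel="stylesheet" href="wide.css" media="(min-width: 60em)">
<link rel="stylesheet" href="print.css" media="print">
```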
This is about the focal points of the page. The metric measures the largest content element visible in the viewport.
Analogy: if the first paint sets the stage, LCP brings the characters into the scene.
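For measurement, a minimal sketch with the Performance Observer API (the `largest-contentful-paint` entry type is Chromium-only at the time of writing):

```js
// Each entry is a new LCP candidate; the last one before user input wins.
new PerformanceObserver((list) => {
  const entries = list.getEntries();
  const latest = entries[entries.length - 1];
  console.log('LCP candidate:', Math.round(latest.startTime), 'ms', latest.element);
}).observe({ type: 'largest-contentful-paint', buffered: true });
```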
This allows the browser to connect to a domain in advance. Use it when fetching a 3rd party script later down the page (see the sketch below).
`dns-prefetch` is similar but just does the DNS resolution. It's not as useful.
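A minimal sketch (the domain is illustrative; `crossorigin` is needed when the eventual request uses CORS, e.g. fonts):

```html
<link rel="preconnect" href="https://cdn.example.com" crossorigin>
<!-- dns-prefetch doubles as a fallback for browsers without preconnect -->
<link rel="dns-prefetch" href="https://cdn.example.com">
```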
Preload is a good tool for shuffling priorities if the architecture is messy
Example: preloading a/b test code that is dynamically included later
This can slow things down if you preload low priority things, so always test first.
If you use width/height attributes in the HTML, browsers use them as a hint for the aspect ratio and can paint a correctly shaped blank box while the image loads. CSS is still used for the actual size (see the sketch below).
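A minimal sketch - the attributes give the browser the aspect ratio, while the CSS keeps the image fluid:

```html
<img src="photo.jpg" width="800" height="600" alt="A photo">
<style>
  img { max-width: 100%; height: auto; } /* CSS still controls the actual size */
</style>
```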
`srcset` is supported in most browsers. You specify a width for each image source and the browser decides which one to load.
`sizes` specifies up front the size the image will be rendered at in the layout.
For example `sizes="(max-width: 500px) 100vw, 50vw"` means 100% of the viewport width for viewports up to 500px, then 50%.
Here `vw` = viewport width.
You can use `sizes` to implement a zoom feature!
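A minimal sketch putting `srcset` and `sizes` together (filenames and widths are illustrative):

```html
<img src="photo-800.jpg"
     srcset="photo-400.jpg 400w,
             photo-800.jpg 800w,
             photo-1600.jpg 1600w"
     sizes="(max-width: 500px) 100vw, 50vw"
     alt="A photo">
```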
This is good for when you have different crops at different viewport sizes (e.g. art-directed imagery).
Another common pattern is providing type fallbacks; e.g. provide WebP but fall back to another type (see the sketch below).
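A minimal `picture` sketch of the type fallback (filenames are illustrative; the browser uses the first `source` whose type it supports):

```html
<picture>
  <source srcset="photo.avif" type="image/avif">
  <source srcset="photo.webp" type="image/webp">
  <img src="photo.jpg" alt="A photo">
</picture>
```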
`srcset` alone can request very large images if the device is high resolution, so it's useful to constrain `srcset`s by providing a `max-width`. Then you can provide a fallback for the largest size.
`preload` can be paired with `picture` or `img` elements with matching constraints. It's probably overkill for most images, though.
Video can be responsive too in a similar way to picture.
The BBC use the lazysizes project for lazy loading images when they come into the viewport.
Now there is a native attribute: `loading="lazy"`.
By default browsers hide text set in a custom font while it loads (AKA the flash of invisible text). `font-display: swap` in CSS renders the text in the fallback font and swaps in the custom font when it loads (see the sketch below).
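A minimal sketch (family name and path are illustrative):

```css
@font-face {
  font-family: "Body";
  src: url("/fonts/body.woff2") format("woff2");
  font-display: swap; /* show fallback text immediately, swap when loaded */
}
```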
It's dangerous to use icon fonts, because if they are blocked you see ridiculous things (see the TripAdvisor rating of four fax machines and a laptop).
If you have a lot of separate fonts, they could come in at different times and cause a lot of repaints. In this situation you could use JS to load fonts.
Another advantage of JS approaches is that you can inspect `navigator.connection` to better target the enhancement.
See A comprehensive guide to font loading strategies by Zach Leatherman.
Remove unnecessary scripts (The Telegraph approach: remove it and see if anyone notices)
Some options if you can't do that:
- don't vary content for first-time visits
- load it async and only mess with stuff far down the page
- preconnect the scripts
- move content variation to server side
You can also use Cloudflare Workers to push browser functionality into the middle tier. This is great for A/B tests and personalisation.
This should be less problematic if the other metrics are good.
Server side rendering with rehydration has the drawback that the page can look ready but still take a long time to become interactive.
We can measure total blocking time.
The Vanilla Javascript Toolkit
You don't need JS to style selects, and diverging from browser native behaviour requires a lot of thought (e.g. about accessibility).
rollup.js is a great tool for that
Webpack can also do code splitting to split your code into different bundles. You can use this to load only what you need when you need it, and defer loading of features you don't need (see the sketch below).
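A minimal sketch of on-demand loading with a dynamic `import()` - the element IDs and module path are illustrative; bundlers like webpack and Rollup turn this into a separate chunk:

```js
// Load the charting code only when the user actually asks for it.
document.querySelector('#show-chart').addEventListener('click', async () => {
  const { renderChart } = await import('./chart.js');
  renderChart(document.querySelector('#chart'));
});
```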
In chrome dev tools you can view the functions that take up the most CPU time.
Be sure to throttle the network and CPU.
First input delay is the time to respond to user interaction.
`window.requestIdleCallback` is a more sophisticated `setTimeout` which can help avoid interactivity delays. You can set a time frame (see the sketch below).
`window.requestAnimationFrame` is a similar idea for painting to the screen.
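A minimal `requestIdleCallback` sketch (the task list is illustrative):

```js
// Non-urgent work, e.g. warming caches or sending analytics.
const tasks = [
  () => console.log('send analytics'),
  () => console.log('prefetch next page'),
];

window.requestIdleCallback((deadline) => {
  // Run tasks only while the browser says there's spare time on the main
  // thread, but don't wait longer than 2 seconds in total.
  while (deadline.timeRemaining() > 0 && tasks.length > 0) {
    tasks.shift()();
  }
}, { timeout: 2000 });
```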
Intersection Observer lets us observe elements as they come in and out of the viewport. If you are listening to scroll events and checking if stuff is visible, this is a much more performant way to do that.
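A minimal sketch using it for image lazy loading (the `data-src` convention is an assumption for illustration):

```js
const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (entry.isIntersecting) {
      const img = entry.target;
      img.src = img.dataset.src; // swap in the real image
      obs.unobserve(img);        // each image only needs this once
    }
  }
}, { rootMargin: '200px' }); // start loading a little before it's visible

document.querySelectorAll('img[data-src]').forEach((img) => observer.observe(img));
```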
The `Expires` header is the old way of doing this.
`Cache-Control` is more flexible.
`Cache-Control: no-cache`: the browser will cache, but revalidates the file with the server every time.
`Cache-Control: public, max-age=2628000`: can be cached by the browser and anything in the middle for a long time.
If the user explicitly refreshes the page, the browser will still revalidate.
`Cache-Control: public, max-age=2628000, immutable` avoids this.
If you change your mind you can bust caches by varying the filename. This is why it's useful to version asset filenames.
`prefetch` asks the browser to download and cache a resource in the background. It's treated as low priority (unlike preload).
`prerender` loads a URL and recursively fetches its resources.
You can also use service workers to manage requests and responses. The service worker will load and install after the initial page has loaded.
You can then cache a bunch of specific files for completely offline use. You can design for offline-first, then network, rather than the other way round.
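A minimal cache-first service worker sketch (cache name and file list are illustrative):

```js
// sw.js - registered from the page with: navigator.serviceWorker.register('/sw.js')
const CACHE = 'site-v1';

self.addEventListener('install', (event) => {
  // Pre-cache a known set of files so they work offline.
  event.waitUntil(
    caches.open(CACHE).then((cache) =>
      cache.addAll(['/', '/styles.css', '/app.js', '/offline.html'])
    )
  );
});

self.addEventListener('fetch', (event) => {
  // Serve from the cache first, fall back to the network.
  event.respondWith(
    caches.match(event.request).then((cached) => cached || fetch(event.request))
  );
});
```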
Tuning Performance for new and "Old" friends
You can use service workers to set a custom header with the versions of files in the cache. The server can then use this to understand client state.
https://scottjehl.thinkific.com/courses/take/lfwp