@nicolasdao
Last active March 23, 2024 13:25
AWS CloudFront Guide. Keywords: aws cloudfront cdn edge gzip brotli

AWS CLOUDFRONT GUIDE

Examples of how to provision several CloudFront configuration flavors are available at https://gist.github.com/nicolasdao/d90015ff90aae77c0a599621f5a8f432#static-website.

CloudFront for S3 Static Website

  1. Create a static website on S3.
  2. Create a new CloudFront distribution and configure it as follows:
    • Under Origin domain, select the S3 static website that you created in the previous step. A message should appear with a button called Use website endpoint. Click on it. This updates the domain from YOURBUCKET.s3.amazonaws.com to the S3 website endpoint (whatever was configured in the previous step).
    • Under the Default cache behavior section, under the Viewer setting, select the Redirect HTTP to HTTPS option.
    • If you need a custom domain, under the Settings section, configure the following aspects:
      • Under the Alternate domain name (CNAME) - optional config, add one or more custom domains (e.g., example.com, www.example.com, anotherexample.com).
      • Under the Custom SSL certificate - optional config, select an SSL certificate from the AWS Certificate Manager list. The certificate must cover all the domains configured in the previous step (in our case: example.com, www.example.com, anotherexample.com), and CloudFront only accepts ACM certificates issued in the us-east-1 region. If no certificate exists yet, create one by clicking on the Request certificate link.
    • Click on the Create distribution button.
  3. If you need a custom domain, configure the hosted zone in AWS Route 53.
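
For readers who provision with an IaC tool rather than the console, the steps above could be sketched as the plain object below. It mirrors the argument shape of Pulumi's aws.cloudfront.Distribution, but it imports nothing so that it runs on its own. The function name, the originId value and the option names are hypothetical, and the property names should be checked against the Pulumi documentation before use.

```typescript
// Hypothetical options for this sketch (not part of any AWS or Pulumi API).
interface StaticSiteOptions {
  websiteEndpoint: string;    // S3 website endpoint created in step 1
  aliases?: string[];         // custom domains, if any
  acmCertificateArn?: string; // certificate covering every alias
}

// Builds an object shaped like the arguments of Pulumi's aws.cloudfront.Distribution.
function staticSiteDistributionArgs(opts: StaticSiteOptions) {
  const originId = "s3-website-origin"; // hypothetical identifier
  return {
    enabled: true,
    aliases: opts.aliases ?? [],
    origins: [{
      originId,
      domainName: opts.websiteEndpoint,
      // S3 website endpoints only speak HTTP, hence a custom HTTP-only origin.
      customOriginConfig: {
        httpPort: 80,
        httpsPort: 443,
        originProtocolPolicy: "http-only",
        originSslProtocols: ["TLSv1.2"],
      },
    }],
    defaultCacheBehavior: {
      targetOriginId: originId,
      viewerProtocolPolicy: "redirect-to-https", // the Redirect HTTP to HTTPS option
      allowedMethods: ["GET", "HEAD"],
      cachedMethods: ["GET", "HEAD"],
      // A cachePolicyId (or the legacy forwardedValues) is still required; omitted here.
    },
    restrictions: { geoRestriction: { restrictionType: "none" } },
    viewerCertificate: opts.acmCertificateArn
      ? { acmCertificateArn: opts.acmCertificateArn, sslSupportMethod: "sni-only" }
      : { cloudfrontDefaultCertificate: true },
  };
}
```

With Pulumi, the returned object would be passed (after adding a cache policy) as the second argument of new aws.cloudfront.Distribution(name, args).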

gzip and brotli Compression

By default, new CloudFront distributions do not enable compression. To enable gzip (for brotli keep reading):

  • Select the Behaviors tab.
  • Select the default behavior (Path pattern: Default (*)) and click on the Edit button.
  • Under the Settings pane, under the Compress objects automatically section, tick Yes.

If you're using an IaC tool (CloudFormation, Pulumi, Terraform), the distribution's default cache behavior should expose a compress flag. For example, with Pulumi, it is the compress property of the defaultCacheBehavior passed to the aws.cloudfront.Distribution constructor (https://www.pulumi.com/registry/packages/aws/api-docs/cloudfront/distribution/#distributiondefaultcachebehavior).

Though that compress flag enables gzip, it does not enable brotli by default. To enable brotli, the cache policy must be updated.
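
As a rough sketch of what that cache policy update could look like, the plain objects below mirror the argument shapes of Pulumi's aws.cloudfront.CachePolicy and of a distribution's defaultCacheBehavior. The function names and the TTL values are assumptions chosen for illustration; the two enableAcceptEncoding flags are the part that matters.

```typescript
// Shaped like the arguments of Pulumi's aws.cloudfront.CachePolicy.
function compressionCachePolicyArgs(name: string) {
  return {
    name,
    minTtl: 0,         // illustrative values, not recommendations
    defaultTtl: 86400,
    maxTtl: 31536000,
    parametersInCacheKeyAndForwardedToOrigin: {
      enableAcceptEncodingGzip: true,   // lets CloudFront cache and serve gzip objects
      enableAcceptEncodingBrotli: true, // lets CloudFront cache and serve brotli objects
      cookiesConfig: { cookieBehavior: "none" },
      headersConfig: { headerBehavior: "none" },
      queryStringsConfig: { queryStringBehavior: "none" },
    },
  };
}

// Fragment of a defaultCacheBehavior: the compress flag plus the policy link.
function compressedBehavior(cachePolicyId: string) {
  return { compress: true, cachePolicyId };
}
```

Both pieces are needed: the compress flag turns compression on, and the cache policy decides which encodings CloudFront accepts.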

MORE ABOUT BROTLI COMING SOON...

Managing Cache policies

Original AWS doc: Improve your website performance with Amazon CloudFront

The confusing bit about configuring CloudFront cache

The CloudFront cache configuration is done via one or more Cache Behaviors. There is always a default cache behavior. A cache behavior defines the following 3 optional properties (more about them later):

  • minTtl
  • defaultTtl
  • maxTtl

This (including the official AWS documentation) can trick you into believing that those properties also set the cache-control response header, but they do not. They only control how long CloudFront itself keeps the object. If the origin server does not explicitly return a cache-control header, CloudFront does not add one to the response, at least not by default. If you need CloudFront to set this cache-control header, you must:

  1. Create a new Response Headers Policy and define a custom cache-control header.
  2. Link it to your specific Cache Behavior.

Understanding the native Cache-Control HTTP header

In a nutshell:

  • cache-control: max-age=600: Object cached on both the CDN and the browser for 10 minutes.
  • cache-control: max-age=0,s-maxage=600: Object cached on the CDN for 10 minutes but not in the browser.
  • cache-control: private,max-age=600: Object only cached in the browser for 10 minutes.

Keep reading if you wish to understand this in greater depth.

This header is the native response header that instructs the browser on how it should cache the response. CloudFront follows that specification too (more about this in the next section Configuring CloudFront's cache policy with default_ttl, min_ttl and max_ttl).

The most common values for this header are:

  • max-age: Value in seconds that defines how long the object can be cached.
  • s-maxage (the s stands for shared): Same as max-age except it only affects proxies (the browser ignores this property). If this property exists, the proxy uses it instead of max-age. CloudFront follows this specification.
  • public: No value, just add this attribute. This means that the object is cacheable by proxies. When max-age is specified, public is the default.
  • private: No value, just add this attribute. This means that the object is NOT cacheable by proxies. It is only cacheable by the browser. This attribute is useful when the content is specific to the user. It allows the object to be cached in the browser without affecting other users.

Let's have a look at a few HTTP response examples:

cache-control: max-age=600

This means that the object is cacheable for 10 minutes both on the browser and the proxy (e.g., CloudFront). The first user to request the object will fetch it from the origin server. If they request this object a second time within 10 minutes, no request is issued at all since the object is still in the browser. When a second user requests that object within 10 minutes, the request is served from the proxy's cache and no request reaches the origin server.

cache-control: max-age=0,s-maxage=600

This means that the object is not cached on the browser, but is cached for 10 minutes in the proxy (e.g., CloudFront).

cache-control: private,max-age=30

This means that the object is cacheable for 30 seconds on the browser, but not in any shared cache (e.g., a proxy or CloudFront).
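
The three examples above can be reproduced with a small helper. This is a simplified sketch that only understands max-age, s-maxage and private; it ignores other directives such as no-store or no-cache, and the function names are made up for this guide.

```typescript
interface CacheLifetimes {
  browserSeconds: number; // how long the browser may keep the object
  sharedSeconds: number;  // how long a proxy/CDN may keep the object
}

// Splits "private,max-age=600" into { private: "", "max-age": "600" }.
function parseDirectives(cacheControl: string): Record<string, string> {
  const directives: Record<string, string> = {};
  for (const part of cacheControl.split(",")) {
    const pieces = part.trim().split("=");
    const name = (pieces[0] ?? "").toLowerCase();
    if (name !== "") directives[name] = pieces[1] ?? "";
  }
  return directives;
}

// Reads a numeric directive, or undefined when it is absent or not a number.
function seconds(directives: Record<string, string>, name: string): number | undefined {
  const raw = directives[name];
  if (raw === undefined) return undefined;
  const value = parseInt(raw, 10);
  return isNaN(value) ? undefined : value;
}

function cacheLifetimes(cacheControl: string): CacheLifetimes {
  const directives = parseDirectives(cacheControl);
  const maxAge = seconds(directives, "max-age") ?? 0;
  const isPrivate = directives["private"] !== undefined;
  // Shared caches prefer s-maxage over max-age and never store private objects.
  const sharedSeconds = isPrivate ? 0 : (seconds(directives, "s-maxage") ?? maxAge);
  return { browserSeconds: maxAge, sharedSeconds };
}
```

For instance, cacheLifetimes("max-age=0,s-maxage=600") yields 0 seconds for the browser and 600 seconds for the shared cache.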

Configuring CloudFront's cache policy with default_ttl, min_ttl and max_ttl

As explained in the section The confusing bit about configuring CloudFront cache, there are 2 aspects when configuring the cache behavior of your content when using CloudFront:

  1. Configuring CloudFront's cache
  2. Configuring the response cache-control header

Configuring CloudFront's cache

As explained above, CloudFront follows the HTTP cache-control specification. However, it supports additional configuration that allows it to override the cache-control value from the origin's response. Those configurations are:

  • default_ttl: Value in seconds that is used if the max-age or s-maxage are not set by the origin server.
  • min_ttl: Value in seconds that overrides the max-age or s-maxage value if those values are too low. For example, if max-age is equal to 0 and min_ttl is 30, then the object is cached for 30 seconds in CloudFront. If the max-age had been 60, then the min_ttl would have been ignored and the object would have been cached for 60 seconds in CloudFront.
  • max_ttl: Works exactly like min_ttl, but for the upper limit.
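
The interaction between these three settings boils down to a clamp. The sketch below assumes the origin value has already been extracted from the response (s-maxage when present, otherwise max-age, or undefined when the origin sends neither). It deliberately ignores directives such as no-store, no-cache and private, and the names are hypothetical.

```typescript
interface TtlSettings {
  minTtl: number;
  defaultTtl: number;
  maxTtl: number;
}

// Returns how long (in seconds) CloudFront would keep the object under this simplified model.
function cloudFrontTtl(originSeconds: number | undefined, settings: TtlSettings): number {
  // No max-age or s-maxage from the origin: fall back to default_ttl.
  const wanted = originSeconds ?? settings.defaultTtl;
  // Too low: raised to min_ttl. Too high: capped at max_ttl.
  return Math.min(Math.max(wanted, settings.minTtl), settings.maxTtl);
}
```

With min_ttl set to 30, an origin max-age of 0 gives 30 seconds, whereas an origin max-age of 60 is left untouched.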

Configuring the response cache-control header

  1. Create a new Response Headers Policy and define a custom cache-control header.
  2. Link it to your specific Cache Behavior.
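
With an IaC tool, these two steps could look like the sketch below. The objects mirror the argument shapes of Pulumi's aws.cloudfront.ResponseHeadersPolicy and of a cache behavior, without importing Pulumi so that the snippet runs on its own. The function names are hypothetical, and the override choice is an assumption that suits origins which sometimes send their own header.

```typescript
// Step 1: shaped like the arguments of Pulumi's aws.cloudfront.ResponseHeadersPolicy.
function cacheControlHeaderPolicyArgs(name: string, cacheControl: string) {
  return {
    name,
    customHeadersConfig: {
      items: [{
        header: "Cache-Control",
        value: cacheControl,
        // false: keep the origin's cache-control when it sends one;
        // true: always replace it with the value above.
        override: false,
      }],
    },
  };
}

// Step 2: fragment of a cache behavior that links the policy by its ID.
function behaviorWithHeaderPolicy(responseHeadersPolicyId: string) {
  return { responseHeadersPolicyId };
}
```

For example, cacheControlHeaderPolicyArgs("ten-minutes", "max-age=600") describes a policy that makes browsers cache responses for 10 minutes when the origin says nothing.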

Edge compute

CloudFront supports edge computing in two flavors:

  • Lambda@edge: For heavyweight processing or for tweaking the origin request or response before it is cached. Those functions could take a few seconds to execute. That cost may be mitigated if the function is only executed when the cache misses (origin-request or origin-response events).
  • CloudFront functions: For lightweight processing such as managing the cache-key, URL rewrites, basic token verification. Those functions are executed for each request but are extremely fast (~1ms).

Those functions can be triggered for the following 4 types of event:

  • viewer-request: Triggers the function for each request before it hits the cache.
  • viewer-response: Triggers the function for each response after it comes back from the cache.
  • origin-request (Lambda@edge only): Triggers the Lambda@edge function for each request that missed the cache before hitting the origin.
  • origin-response (Lambda@edge only): Triggers the Lambda@edge function for each response that comes back from the origin after a cache miss, before it is stored in the cache.
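
To make the viewer-request event more concrete, here is a minimal sketch of a URL rewrite of the kind a CloudFront function typically performs. CloudFront functions are written in JavaScript, so the type annotations below are only there for readability and would have to be stripped before deployment. The interfaces are a reduced, hypothetical view of the real event object.

```typescript
// Reduced view of the viewer-request event (only the fields this sketch uses).
interface ViewerRequest {
  method: string;
  uri: string;
}
interface ViewerRequestEvent {
  request: ViewerRequest;
}

// Runs before the cache lookup: maps "directory" URLs to their index.html object.
function handler(event: ViewerRequestEvent): ViewerRequest {
  const request = event.request;
  const uri = request.uri;
  const lastSegment = uri.substring(uri.lastIndexOf("/") + 1);
  if (uri.charAt(uri.length - 1) === "/") {
    request.uri = uri + "index.html";   // "/blog/" becomes "/blog/index.html"
  } else if (lastSegment.indexOf(".") === -1) {
    request.uri = uri + "/index.html";  // "/blog" becomes "/blog/index.html"
  }
  // Anything with a file extension (e.g., "/app.js") is left untouched.
  return request;
}
```

Returning the request object lets CloudFront continue processing it with the rewritten URI.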

Lambda@edge

Debugging a Lambda@edge

  • Browse to the CloudFront distro.
  • Click on Monitoring under the Telemetry menu, then select the Lambda@Edge tab.
  • Tick the desired lambda@edge, then click on the View distribution metrics button.
  • Click on the View function logs button and select the region you think contains valuable logs.
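
The reason a region must be picked is that Lambda@edge writes its logs to CloudWatch Logs in the region closest to the edge location that executed the function, in a log group prefixed with us-east-1. The helper below is a small sketch that lists where to look; the function name and the sample regions are assumptions.

```typescript
interface LogLocation {
  region: string;   // region whose CloudWatch Logs console to open
  logGroup: string; // log group to search for in that region
}

// Lists the log group to look for in each candidate region.
function lambdaEdgeLogLocations(functionName: string, regions: string[]): LogLocation[] {
  // The log group name keeps the us-east-1 prefix whatever the region.
  const logGroup = "/aws/lambda/us-east-1." + functionName;
  return regions.map((region) => ({ region, logGroup }));
}
```

For example, lambdaEdgeLogLocations("my-edge-fn", ["ap-southeast-2", "eu-west-1"]) points at the same log group name in both regions.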

CloudFront functions

Limitations

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/edge-functions-restrictions.html#lambda-requirements-lambda-function-configuration

Advanced configuration

Dealing with 404 for SPA or PWA

SPAs (Single Page Applications) or PWAs (Progressive Web Applications) use dynamic routing. This means that the URL's path represents an application state, but not necessarily a physical resource on a server (e.g., a static page in an S3 bucket). By default, such an application hosted on S3 and cached via CloudFront exposes a single object (most likely index.html). This resource contains JavaScript that updates the URL history as the state changes (e.g., clicking on a button in the home page / opens the /blog page). When a user loads or refreshes such a URL directly (e.g., /blog), no object physically exists at that path in the origin server, so CloudFront returns a 404 error.

The solution is to configure CloudFront to catch all 404 errors and return an S3 resource that exists (in our case the path to the index.html object) along with a 200 status. This can be done via the Error pages tab in the CloudFront console or via the customErrorResponses option in an IaC tool (CloudFormation, Terraform, Pulumi). For example, with Pulumi:

customErrorResponses: [{
	errorCode: 404,
	errorCachingMinTtl: 300,
	responseCode: 200,
	responsePagePath: '/'
}]
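
When the origin is the S3 REST endpoint rather than the website endpoint, a missing object usually comes back as a 403 instead of a 404 (S3 hides the object's absence from callers that cannot list the bucket). In that case both codes may need the same treatment. The helper below is a sketch that generates the entries; its name and default values are assumptions.

```typescript
// Builds customErrorResponses entries that all fall back to the SPA entry point.
function spaErrorResponses(indexPath: string = "/index.html", errorCodes: number[] = [403, 404]) {
  return errorCodes.map((errorCode) => ({
    errorCode,
    errorCachingMinTtl: 300, // how long (in seconds) CloudFront caches the error response
    responseCode: 200,
    responsePagePath: indexPath,
  }));
}
```

The result can be assigned to customErrorResponses as is, e.g., customErrorResponses: spaErrorResponses().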

FAQ

How to check whether my CloudFront distribution compresses content with gzip or brotli?

  1. Make sure that the browser supports it. If it does, the request should define an accept-encoding header containing the br value (e.g., accept-encoding: gzip, deflate, br means that the browser supports those 3 encodings).
  2. The encoding type is described in the response header content-encoding. If it contains br, the content is compressed with brotli. If it contains gzip, the content is compressed with gzip.
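
The same two checks can be written as a small helper that takes the header values copied from the browser's network tab. This is only a sketch, and the function names and returned labels are made up for this guide.

```typescript
// Parses "gzip;q=0.8, deflate, br" into ["gzip", "deflate", "br"].
function acceptedEncodings(acceptEncoding: string): string[] {
  return acceptEncoding
    .split(",")
    .map((entry) => (entry.split(";")[0] ?? "").trim().toLowerCase())
    .filter((entry) => entry !== "");
}

// Combines the request's accept-encoding with the response's content-encoding.
function describeCompression(acceptEncoding: string, contentEncoding: string | undefined): string {
  const used = (contentEncoding ?? "").trim().toLowerCase();
  if (used === "br") return "brotli";
  if (used === "gzip") return "gzip";
  const accepted = acceptedEncodings(acceptEncoding);
  if (accepted.indexOf("br") !== -1 || accepted.indexOf("gzip") !== -1) {
    return "supported by the browser but not applied";
  }
  return "not requested by the browser";
}
```

For instance, describeCompression("gzip, deflate, br", "br") reports brotli, while a missing content-encoding header with the same request reports that compression was supported but not applied.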
