@hjhart
Last active August 6, 2021 07:35

Effective Caching for Yarn, Bundler, and Rails Asset Pipeline in CircleCI

Phil Karlton said: "There are two hard problems in computer science: cache invalidation, naming things, and off-by-1 errors."

Well, maybe he didn't say exactly that, but it's a decent joke anyway.

In this article I'll talk about the first of those problems, cache invalidation, and how handling it well reduced our CircleCI build times by 33%. For those of you raising your eyebrows and saying "33% off of what?!", I'll direct your eyeballs to the graph below for some absolute numbers.

[ TODO: Show a bar graph of each job going down by a percentage ]

Amazing, right?!

Now I'll show you how. But first, a note on all of the code snippets in this blog post!

All of the code snippets I'm sharing are CircleCI commands, so they can be copied and pasted easily if you're using CircleCI config version 2.1.

Yarn Caching

The important thing is knowing when to invalidate a cache. In this case, we know that the node_modules directory changes if and only if our yarn.lock file changes. So Yarn is actually pretty easy!

Bundler Caching

Bundler is similar. The Gemfile.lock file pins every dependency explicitly, so we only need to invalidate the cache and generate a new one when that file changes.
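
A minimal sketch of such a command, modeled on the yarn_install command in this gist. The two cache keys are the ones discussed in this section; the command name bundle_install, the vendor/bundle install path, and the step name are assumptions for illustration, not necessarily the exact config we run.

```yaml
# Sketch only: command name, install path, and step name are assumptions.
bundle_install:
  description: Bundle Install
  steps:
    - restore_cache:
        keys:
          # Exact match: Gemfile.lock is unchanged since the cache was saved.
          - bundle-{{ arch }}-{{ checksum "~/voom/Gemfile.lock" }}
          # Fallback: any cache whose key starts with this prefix.
          - bundle-{{ arch }}-
    - run:
        name: Bundle Install
        # --frozen stops Bundler from rewriting Gemfile.lock during the install.
        command: cd ~/voom; bundle install --frozen --path vendor/bundle
    - save_cache:
        key: bundle-{{ arch }}-{{ checksum "~/voom/Gemfile.lock" }}
        paths:
          - ~/voom/vendor/bundle
```

Recent Bundler versions deprecate the --frozen and --path flags in favor of bundle config settings, so adjust the install command to whatever your Bundler version expects.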

You'll notice that there are two keys within the restore_cache step. The first key, bundle-{{ arch }}-{{ checksum "~/voom/Gemfile.lock" }}, matches whenever the Gemfile.lock is identical to the one the cache was saved with.

But what if the Gemfile.lock changes?

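Well, it will fall back to the bundle-{{ arch }}- key. CircleCI matches restore keys by prefix, so this key matches any previously saved cache whose key starts with bundle-{{ arch }}-, including the ones saved under bundle-{{ arch }}-{{ checksum "~/voom/Gemfile.lock" }}, and restores the most recently saved of them.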

Caches are immutable in CircleCI: once a cache has been saved under a key, it is never overwritten.

This presents us with a challenging caching scenario. Let's say that I run this job in January of this year. If I run it again six months later and only the bundle-{{ arch }}- fallback key matches, the cache I restore may be six months old. That might not be much of an improvement!

So it is good to invalidate those caches every so often. Let's change the key so that it rotates every month or so.

This allows us to fall back to a cache if the initial key doesn't match.
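
A minimal sketch of monthly rotation. It records the current month in an extra step and folds it into both keys, so neither key can match a cache saved in a previous month. Exactly how a BASH_ENV variable gets wired into the cache key is not spelled out here, so this sketch also writes the month to a file and uses the checksum template on that file, which is a swapped-in technique rather than a confirmed copy of our config. The file path /tmp/cache_month, the variable name CACHE_MONTH, and the vendor/bundle path are assumptions.

```yaml
# Sketch only: the stamp file path and the CACHE_MONTH name are assumptions.
bundle_install:
  description: Bundle Install (cache rotates monthly)
  steps:
    - run:
        name: Record the current month for the cache key
        command: |
          # Make the month available to later run steps via BASH_ENV.
          echo "export CACHE_MONTH=$(date +%Y-%m)" >> "$BASH_ENV"
          # Write it to a file too, so the key template can checksum it.
          date +%Y-%m > /tmp/cache_month
    - restore_cache:
        keys:
          - bundle-{{ arch }}-{{ checksum "/tmp/cache_month" }}-{{ checksum "~/voom/Gemfile.lock" }}
          - bundle-{{ arch }}-{{ checksum "/tmp/cache_month" }}-
    - run:
        name: Bundle Install
        command: cd ~/voom; bundle install --frozen --path vendor/bundle
    - save_cache:
        key: bundle-{{ arch }}-{{ checksum "/tmp/cache_month" }}-{{ checksum "~/voom/Gemfile.lock" }}
        paths:
          - ~/voom/vendor/bundle
```

Swapping the date format (for example to a week number) is one way to tighten the rotation window.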

Okay, this looks better. Now we have another step that dynamically exports a variable to BASH_ENV, which is a hugely useful tool for making your commands in CircleCI more dynamic.

Now, if our Gemfile.lock changes, the oldest cache we can fall back to was generated at the beginning of the current month. So instead of a six-month-old cache, we get one that is at most a month old.

You can tighten or loosen the time-based invalidation based on what you prefer. The first job where the cache is invalidated will be slow, and you should expect each subsequent build to be fast after that.

At this point, I need to point out a huge time waster that I discovered in my journey: If you are using checksum for cache invalidation, make sure the file does not change in between restore_cache and save_cache.

For instance, I was running bundle install with a different version of Bundler inside of CircleCI. That would update the Gemfile.lock to record the version of Bundler used.

That means that the checksum when saving the cache was based on a file that was never checked into version control, so no later restore_cache could ever match that key.

:sad_emoji:

So that's why we use Bundler's --frozen flag when running bundle install: it makes sure that doesn't happen (again).
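
As an extra guard, a step like the following could fail the build whenever something has modified Gemfile.lock between restore_cache and save_cache. This is a sketch under assumptions: the command name verify_lockfile_unchanged is hypothetical, and it presumes ~/voom is the git checkout of the project.

```yaml
# Sketch only: the command name is hypothetical.
verify_lockfile_unchanged:
  description: Fail if an earlier step modified Gemfile.lock
  steps:
    - run:
        name: Check that Gemfile.lock still matches the committed version
        # git diff --exit-code exits non-zero when the file differs from the index.
        command: cd ~/voom; git diff --exit-code -- Gemfile.lock
```

Running it right before save_cache means a changed lockfile fails loudly instead of quietly producing a key that can never be restored.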

Asset Pipeline / Webpacker Caching

Here is another strategy for cache invalidation: find ~/voom/app/javascript ~/voom/app/assets -type f -exec md5 -q {} \; > ~/voom/dependency_checksum

What it says is: if any file within these directories (~/voom/app/javascript or ~/voom/app/assets) changes, then let's generate a new cache.

I suspect there are holes in this cache key. For example, a change to yarn.lock should probably invalidate the cache too. And since caching is hard, and I don't really want to think too hard about it, I like to add a note to the step name telling people to bump the cache prefix when the step gets too slow (with an explicit threshold, so anyone can make the call).
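
A minimal sketch of how the checksum file can drive a cache, following the same shape as the yarn_install command. The find command is the one shown above; the command name assets_precompile, the v1-assets key prefix, the cached paths, the precompile command, and the two-minute threshold are all assumptions for illustration.

```yaml
# Sketch only: command name, key prefix, cached paths, and threshold are assumptions.
assets_precompile:
  description: Precompile Assets
  steps:
    - run:
        name: Generate asset checksum file
        command: |
          # Hash every asset source file. md5 -q is the macOS/BSD tool;
          # on Linux images md5sum would be the equivalent.
          find ~/voom/app/javascript ~/voom/app/assets -type f -exec md5 -q {} \; > ~/voom/dependency_checksum
    - restore_cache:
        keys:
          - v1-assets-{{ arch }}-{{ checksum "~/voom/dependency_checksum" }}
          - v1-assets-{{ arch }}-
    - run:
        name: "Precompile Assets (Note: Bump cache prefix if this step takes over 2 minutes.)"
        command: cd ~/voom; bundle exec rails assets:precompile
    - save_cache:
        key: v1-assets-{{ arch }}-{{ checksum "~/voom/dependency_checksum" }}
        paths:
          - ~/voom/public/assets
          - ~/voom/public/packs
          - ~/voom/tmp/cache
```

Because find does not guarantee a stable file order, piping the hashes through sort before writing the file is a possible hardening step.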

Bonus: macOS Homebrew Caching

I'm going to post this one here because it took about a minute off of each macOS build for me, and it is working great. You've seen the strategies already, so I won't go into them:

Wrapping Up

Well, we have come a long way, but we still have a long way to go. CocoaPods installation still takes 3 minutes. Xcode building takes about 5 minutes. We stand to gain a lot from a better caching strategy!

Please reach out to us at Voom if this article helped, or if you've found some other strategies that work well for you. Making CI faster means we can deploy faster!

yarn_install:
  description: Yarn Install
  steps:
    - restore_cache:
        keys:
          - v1-yarn-{{ arch }}-{{ checksum "~/voom/yarn.lock" }}
          - v1-yarn-{{ arch }}-
    - run:
        name: "Yarn Install (Note: Bump cache prefix if this step takes over 30 seconds.)"
        command: cd ~/voom; yarn install
    - save_cache:
        key: v1-yarn-{{ arch }}-{{ checksum "~/voom/yarn.lock" }}
        paths:
          - ~/voom/node_modules