@jamesk
Created April 26, 2017 10:35
A proposal for how Drone could handle monorepos

Proposal for how Drone could handle monorepos

Summary

By a monorepo I mean a single repository that contains multiple projects. As the number of projects in the repo grows, or if the projects have very expensive builds, rebuilding every project on every commit becomes very cumbersome.

So at the very least, for a monorepo you would want to somehow filter which builds / build steps are triggered by a commit.

Approaches

Keep current .drone.yml but provide a changeset filter for pipeline steps

As suggested in harness/harness#1021 (comment), add an include/exclude changeset filter to pipeline steps, so that a step only runs when the commit touches matching paths.
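
A minimal sketch of how such a filter could behave, in Python for brevity. The `include` / `exclude` keys, the step names, and the glob semantics are assumptions made for illustration; they are not existing Drone syntax.

```python
from fnmatch import fnmatchcase


def matches_any(path, patterns):
    # fnmatchcase's "*" also matches "/", so "api/*" covers nested files.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def step_should_run(step, changed_files):
    """True if at least one changed file passes the step's changeset filter."""
    include = step.get("include", ["*"])  # no include list: any path counts
    exclude = step.get("exclude", [])
    return any(
        matches_any(path, include) and not matches_any(path, exclude)
        for path in changed_files
    )


def steps_to_run(pipeline, changed_files):
    """Names of the steps to run, in pipeline order."""
    return [
        name for name, step in pipeline.items()
        if step_should_run(step, changed_files)
    ]


# Hypothetical pipeline: two projects sharing a single .drone.yml.
pipeline = {
    "build-api": {"include": ["api/*"]},
    "build-web": {"include": ["web/*"], "exclude": ["web/docs/*"]},
    "lint": {},  # no filter: runs for any changed file
}
```

With this pipeline, a commit touching only `api/main.go` would select `build-api` and `lint`, while a commit touching only `web/docs/intro.md` would select just `lint`.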

Pros

  • Small amount of changes to current Drone architecture
  • Doesn't add new concepts to Drone and is a very simple idea to understand
  • Also covers the non-monorepo case but where you don't want to run all build steps all the time e.g. for expensive test suites
  • If the projects in the repo share build steps you can easily reuse them (providing the ordering is the same)

Cons

  • Could quickly end up with unwieldy .drone.yml files
  • Harder to read .drone.yml with lots of conditional logic to think about
  • Possibly more merge conflicts, as people working on different projects could be editing .drone.yml concurrently

Allow for multiple .drone.ymls per repo

In this case Drone needs to know where the sub .drone.yml files are. It also needs to know when to run builds for those sub ymls. And if we want to handle renaming / moving of projects within the repo, we also need a way to track the sub ymls, i.e. an id that never changes and refers to the sub .drone.yml.

Root .drone.yml contains location of sub configs

Store the metadata for sub .drone.ymls in a root .drone.yml (perhaps with a different name e.g. .drone.root.yml)

Data per sub yml

  • An id for the sub yml
  • a display name
  • the location in the repo of the sub yml
  • An include array of paths that should trigger the sub yml
  • An exclude array of paths that should not trigger the sub yml
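
As a rough sketch, the per-sub-yml data above could be modelled like this once the root file has been parsed. The field names, the example projects, and the use of glob patterns are assumptions for illustration only, not a proposed file format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class SubConfig:
    id: str    # stable id, unchanged when the project is moved or renamed
    name: str  # display name
    path: str  # location of the sub yml in the repo
    include: list = field(default_factory=list)  # paths that trigger it
    exclude: list = field(default_factory=list)  # paths that never trigger it

    def triggered_by(self, changed_files):
        def hit(path, patterns):
            # "*" also matches "/" here, so "libs/common/*" covers nested files.
            return any(fnmatchcase(path, p) for p in patterns)

        return any(
            hit(f, self.include) and not hit(f, self.exclude)
            for f in changed_files
        )


def configs_to_build(root, changed_files):
    """Ids of the sub ymls that a commit's changed files should trigger."""
    return [cfg.id for cfg in root if cfg.triggered_by(changed_files)]


# Hypothetical contents of a parsed root config.
ROOT = [
    SubConfig("api", "API service", "services/api/.drone.yml",
              include=["services/api/*", "libs/common/*"]),
    SubConfig("web", "Web frontend", "services/web/.drone.yml",
              include=["services/web/*", "libs/common/*"],
              exclude=["services/web/docs/*"]),
]
```

Because everything lives in one list, answering "which projects care about this file?" is a single pass over the root config: a change under `libs/common/` would trigger both `api` and `web` here.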

Pros

  • The drone server does not need to hold any new configuration that may well change over time / across branches
  • Simple to always find a repo's builds, just look at the root yml
  • Easily tell which files each project is concerned about
  • Each sub yml should remain identical to current .drone.yml

Cons

  • if you have a lot of sub ymls then perhaps the parsing on each commit could get slow?
    • Did a test with grep against 867,374 paths (all my system's paths) with the pattern '.*/.*\.js' returning 27,857 paths, the time taken was only 60ms
  • Easier to get merge conflicts as people working on different projects need to edit one file?
  • Might use same id on two different branches and drone would treat them as the same project?
    • Don't think there is any way around this type of problem; even if you use the path to the sub yml you can have a similar issue (although it is less likely)
  • Introduces a new concept, this new "root" config file

Store sub yml metadata on drone server

Data per sub yml

  • id for the sub yml
  • relative path of the sub yml
  • Display name for the sub yml?
  • Include/Exclude paths?

Pros

  • Since data is on the server it should always be fast and easy to look up

Cons

  • What happens when a yml location moves? Server needs to be aware and update db record
  • What happens if the yml has moved on one branch but not on another?
  • Puts state / config about what / when to build into Drone server, now it is not so simple to look at a repo and know what is being built

Implicit scoping of sub yml files to their children directories

All .drone.yml files in a repo are picked up and builds will be run for them. A build only runs if a file in the same directory as its .drone.yml, or in a descendant directory, has changed in the commit.
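
The scoping rule can be sketched as follows, assuming repo-relative POSIX paths. The function name and example paths are hypothetical.

```python
from pathlib import PurePosixPath


def triggered_configs(config_paths, changed_files):
    """Return the .drone.yml files whose directory contains a changed file."""
    triggered = []
    for config in config_paths:
        scope = PurePosixPath(config).parent  # directory the yml lives in
        # A changed file is in scope if `scope` is one of its ancestor
        # directories; for repo-relative paths the root "." is always one.
        if any(scope in PurePosixPath(f).parents for f in changed_files):
            triggered.append(config)
    return triggered


configs = [".drone.yml", "api/.drone.yml", "web/.drone.yml"]
```

Here a change to `api/src/main.go` triggers both `api/.drone.yml` and the root `.drone.yml`, because the root directory is an ancestor of every file in the repo.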

Pros

  • No extra config or state in Drone server
  • Simple concept

Cons

  • Lacks flexibility: you might want to put a .drone.yml in a base directory to cover 2 of its sub-directories, but it would implicitly cover all of the sub-directories, perhaps quite a lot of them
  • If a sub yml's location is changed how do we track build history?
    • Could have an id in the sub yml just like for root yml approach
  • Can't have an unused .drone.yml in the repo, it would be picked up always

Scan for and read all sub yml files on each commit

As above but without the implicit scoping

Pros

  • No extra config or state in Drone server
  • Simple concept in the simple case

Cons

  • Finding what might trigger a build is now complex: you need to check every yml in the repo to see whether its includes/excludes cover a file/directory
  • If a sub yml's location is changed how do we track build history?
    • Could have an id in the sub yml just like for root yml approach
  • Can't have an unused .drone.yml in the repo, it would be picked up always
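
To illustrate the first con, answering "what builds when this file changes?" means consulting the filters of every discovered yml. A small sketch, assuming each yml has already been located and parsed into a dict with hypothetical `include` / `exclude` keys:

```python
from fnmatch import fnmatchcase


def configs_covering(path, parsed_configs):
    """List the yml locations whose own include/exclude filters cover `path`."""
    covering = []
    for location, cfg in parsed_configs.items():  # every yml must be checked
        included = any(fnmatchcase(path, p) for p in cfg.get("include", []))
        excluded = any(fnmatchcase(path, p) for p in cfg.get("exclude", []))
        if included and not excluded:
            covering.append(location)
    return covering


# Hypothetical result of scanning the repo for .drone.yml files.
parsed = {
    "api/.drone.yml": {"include": ["api/*", "shared/*"]},
    "web/.drone.yml": {"include": ["web/*", "shared/*"], "exclude": ["shared/docs/*"]},
}
```

With these filters, `shared/docs/readme.md` is covered only by `api/.drone.yml`, which is not obvious without reading both files.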