To avoid peppering too many opinions in this “how to” guide, I’ve written a separate “opinion-piece” article on monorepos — you can read it here: https://medium.com/streamdal/mostly-terrible-the-monorepo-5db704f76bdb
Sometime in early Dec 2023, our team decided to migrate ~10 public repositories for an OSS project to a monorepo.
This article provides a “rough” outline for the meatiest parts of the migration.
Good luck! 🤞
We’ve got two goals for the migration:
- Put the contents of the individual repos under a subdir in the monorepo
- Inject the original commit log of the individual repos into the monorepo
We will be migrating the following repositories to a monorepo github.com/streamdal/mono:
streamdal/server
- Golang
streamdal/console
- TypeScript + Deno + React
streamdal/cli
- Golang
streamdal/docs
- Astro
streamdal/wasm
- Rust + Wasm
streamdal/protos
- Protobuf schemas for Go, Python, Rust, TS
streamdal/wasm-detective
- Rust lib
streamdal/wasm-transform
- Rust lib
- We’ll be operating from /Users/dselans/Code, referred to as the “work-dir”
- You will need to have git and zsh (for native chdir()) installed locally
- Most of the migration will be handled by a migration script
- Last, I performed this migration on macOS (Sonoma); if you’re using something else, you might need to do some tweaking 🤷‍♂️
You need to come up with a directory layout for your new monorepo. This is an extremely important step: if you fuck this up now, it’ll be twice as painful to unfuck later on.
This is the layout I used for streamdal/streamdal - it is fairly common and non-controversial - it might work for you.
┌── assets <---- static assets used in monorepo
│ ├── img
│ └── ...
├── apps <---- target dir that will contain apps
│ ├── cli
│ ├── console
│ ├── docs
│ ├── server
│ └── ...
├── docs
│ ├── install
│ │ ├── docker
│ │ └── ...
│ ├── instrument
│ └── ...
├── libs <--- target dir for app dependencies, common/forked libs
│ ├── protos
│ ├── wasm
│ ├── wasm-detective
│ ├── wasm-transform
│ └── ...
├── scripts
│ ├── install
│ │ ├── streamdal.sh
│ │ └── ...
│ └── ...
├── LICENSE
├── Makefile
└── README.md
You should probably do this during off-hours when folks aren’t updating repos often. If that’s not possible, no big deal, you’ll just have to do some syncing post-migration.
Go through the list of repos and clone them to your work dir:
# Change into the work-dir
$ cd /Users/dselans/Code
# Grab the migration script
$ curl -o migrate.sh https://raw.githubusercontent.com/streamdal/streamdal/main/scripts/monorepo/migrate.sh
# Clone the "to-be-migrated" repos
$ git clone git@github.com:streamdal/server
$ ...
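If you have more than a couple of repos, cloning them one by one gets old fast. Here is a minimal sketch of a loop that does it for you. `clone_all` is a hypothetical helper (it is not part of migrate.sh), and it assumes every repo lives under the same base URL.

```shell
# Hypothetical helper (not part of migrate.sh): clone every repo in a
# space-separated list from a single base URL into the current directory.
# The body runs in a subshell so its variables do not leak into your shell.
clone_all() (
  base="$1"   # e.g. git@github.com:streamdal
  repos="$2"  # e.g. "server console cli"

  # tr turns the space-separated list into one repo per line; this behaves the
  # same in sh, bash, and zsh (zsh does not word-split unquoted variables).
  printf '%s\n' "$repos" | tr ' ' '\n' | while IFS= read -r repo; do
    if [ -z "$repo" ]; then
      continue
    fi
    if [ -d "$repo" ]; then
      echo "skipping $repo (already cloned)"
    else
      # </dev/null keeps git (or ssh) from consuming the loop's input.
      git clone -q "$base/$repo" "$repo" </dev/null
    fi
  done
)
```

From the work-dir, something like `clone_all "git@github.com:streamdal" "server console cli docs"` would then grab the repos in one go and skip any you already have.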
- The migrate.sh script expects repo dirs to exist
- Open migrate.sh in your editor and update the following bits:
- MONO_DIR - specify the target monorepo dir (mono)
- BASE_DIRS - specify the dirs that the script should create (can leave as-is, if the layout above makes sense)
- FILES - specify what files the script should create / touch
- Update REPOS with a space-separated list of the “to-be-migrated” repos you cloned
- Update SUB_DIR with the dir you want the migrated repos to live under
- For example, with REPOS="foo bar", MONO_DIR="mono", and SUB_DIR="apps", the “foo” and “bar” repos will be migrated to ./mono/apps/foo and ./mono/apps/bar.
- Save and exit editor
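Before running anything, it can help to sanity-check where each repo is going to land. The sketch below uses the same example values as above (REPOS, MONO_DIR, SUB_DIR). BASE_DIRS and FILES are left out because their exact format depends on the script. `preview_targets` is a hypothetical helper, not something migrate.sh ships with.

```shell
# Example values from the bullet above; adjust them to match your setup.
MONO_DIR="mono"   # target monorepo dir
REPOS="foo bar"   # space-separated list of "to-be-migrated" repos
SUB_DIR="apps"    # dir the migrated repos will live under

# Hypothetical helper (not part of migrate.sh): print the target path per repo.
preview_targets() {
  # Split on spaces via tr so this works in sh, bash, and zsh alike
  # (zsh does not word-split unquoted variables by default).
  printf '%s\n' "$REPOS" | tr ' ' '\n' | while IFS= read -r repo; do
    if [ -n "$repo" ]; then
      printf '%s -> ./%s/%s/%s\n' "$repo" "$MONO_DIR" "$SUB_DIR" "$repo"
    fi
  done
}

preview_targets
# foo -> ./mono/apps/foo
# bar -> ./mono/apps/bar
```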
We are ready to begin the migration.
# From work-dir
$ zsh migrate.sh
The script will attempt to do the following:
- Create $MONO_DIR and initialize a git repo in it
- Make a copy of ../$REPO as ../$REPO.clone
- Perform all further work from the ../$REPO.clone dir
- Move ../$REPO.clone/* into ../$REPO/$SUB_DIR
- Commit and merge changes in ../$REPO.clone/*
- Chdir to $MONO_DIR and set ../$REPO.clone as a remote
- Merge ../$REPO’s main into $MONO_DIR
- Commit and move on to the next $REPO specified in REPOS=
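If you’re curious what this looks like in plain git, here is a rough sketch of the same idea for a single repo: move everything into a subdir inside a throwaway copy, then merge that copy into the monorepo as an unrelated history. To be clear, `migrate_one` is a hypothetical function I’m using for illustration, not the actual migrate.sh. It assumes relative dir names, a git version that supports `--allow-unrelated-histories` (2.9+), and that no top-level entry in the repo shares a name with the sub-dir.

```shell
# Sketch only: migrate one repo into the monorepo while keeping its history.
# migrate_one is a hypothetical helper, NOT the real migrate.sh.
# Usage (from the work-dir): migrate_one <repo-dir> <mono-dir> <sub-dir>
# The body runs in a subshell, so the cd calls do not affect your shell.
migrate_one() (
  set -e
  repo="$1"; mono="$2"; sub="$3"
  work="$(pwd)"

  # Create the monorepo on first use; the empty commit gives merges a branch to land on.
  if [ ! -d "$mono/.git" ]; then
    git init -q "$mono"
    git -C "$mono" commit -q --allow-empty -m "Initial commit"
  fi

  # Work on a throwaway copy so the original checkout stays untouched.
  rm -rf "$repo.clone"
  cp -R "$repo" "$repo.clone"
  cd "$repo.clone"
  branch="$(git symbolic-ref --short HEAD)"

  # Move every tracked top-level entry under <sub-dir>/<repo>.
  # Assumes no top-level entry is itself named <sub-dir>.
  mkdir -p "$sub/$repo"
  git ls-tree --name-only HEAD | while IFS= read -r entry; do
    git mv "$entry" "$sub/$repo/$entry"
  done
  git commit -q -m "Move $repo into $sub/$repo"

  # Add the rewritten clone as a remote and merge its history into the monorepo.
  cd "$work/$mono"
  git remote add "$repo" "$work/$repo.clone"
  git fetch -q "$repo"
  git merge -q --allow-unrelated-histories -m "Merge $repo into monorepo" "$repo/$branch"
)
```

After something like `migrate_one foo mono apps`, running `git -C mono log --oneline` should list foo’s original commits alongside the move and merge commits.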
The “meaty” part of the migration is complete.
🫡
Performing the git part of migration is the first step in your “journey”. There will be a handful of other things you’ll need to do to get things into decent shape and ready for production.
Here are some tips to get you started:
- Repo size is a legitimate concern now, so start by identifying large, dupe, or garbage assets. Run du -sh * | grep M in your monorepo dir. Look for log files, build dirs, node_modules, Rust’s target dirs, accidentally checked in Docker-compose volumes, etc.
- When you find garbage, don’t forget to add the paths to a .gitignore
- Unless you have a large number of repos (20+), I would do one repo migration at a time to catch any potential issues and fix them on the spot.
- You will probably not like the structure or change your mind about something and ultimately have to rerun the migration. Run rm -rf $MONO_DIR && rm -rf *.clone to start fresh.
- Don’t forget to update the README.md files in the migrated repos to indicate that the main repo has moved to $XYZ. For good measure, “archive” the repo as well (via repo settings in GitHub).
- Attempting to update everything in one go is a huge undertaking and will take longer than you anticipated. Migrate the repos first, finish that, then tackle CI, then tackle READMEs, docs, and so on.
- Updating CI will probably be the heaviest lift. You first want to gate the workflows so that they run only when the PR contains changes for /apps/some-app/*
- You can accomplish this by using GitHub’s path filters on a pull_request trigger (and on push to main). Take a look at our workflows for reference.
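To make the path-filter tip a bit more concrete, here is a small sketch that stamps out a gated workflow file for one app. `write_gated_workflow` is a hypothetical helper, and the job itself (the name, the runner, the echo step) is a placeholder rather than anything from our actual workflows. The `paths` filters on `pull_request` and `push` are the part that matters.

```shell
# Hypothetical helper: write a GitHub Actions workflow for one app that only
# runs when files under apps/<app>/ change. Run it from the monorepo root.
write_gated_workflow() {
  app="$1"
  mkdir -p .github/workflows
  # Unquoted heredoc so $app is expanded into the YAML below.
  cat > ".github/workflows/$app.yml" <<EOF
name: $app
on:
  pull_request:
    paths:
      - "apps/$app/**"
  push:
    branches:
      - main
    paths:
      - "apps/$app/**"
jobs:
  test:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: apps/$app
    steps:
      - uses: actions/checkout@v4
      # Placeholder step: swap in the app's real build and test commands.
      - run: echo "build and test $app here"
EOF
}
```

Calling `write_gated_workflow server` from the monorepo root would produce .github/workflows/server.yml, which only triggers for changes under apps/server. For things under libs, you would adjust the path prefix accordingly.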
UPDATE 01.2024: Gotta admit, having everything in one place is pretty nice. IntelliJ appears to be smart enough to understand that different subdirs use different languages. I would’ve imagined it would have problems, at least with indexing. Not bad.
We’ve got a community!
Want to nerd out with me and other misfits about your experiences with monorepos, deep-tech, or anything engineering-related?
Join our Discord, we’d love to have you! https://discord.gg/streamdal