- Follow C4 (https://rfc.zeromq.org/spec/42/).
- C4 by itself implies a substantial amount of process.
- Little else should be added or documented, as C4 is nearly sufficient as-is.
- Roles per C4: Maintainer, Contributor, Administrator
- Contributors are literally anyone with a GitHub account (as per C4).
- Maintainers are as per C4, and should be documented publicly in a CODEOWNERS file rather than operating as a semi-secret cabal.
- Administrators are as per C4, with the initial set: Aneesha, Evan, Salah, Brandon, maybe Robin
- Introduce "tiered CI" with 3 testing tiers (deviation from C4 Sections 2.5.1 and 2.5.3)
- Automated "tier 1" tests (a.k.a. the "integration tests") that must pass in order for a patch to merge into the 'developer' branch
- Automated "tier 2" tests (a.k.a. the "regression tests") that must pass in order for a patch to merge into the 'staging' branch
- Automated "tier 3" tests (a.k.a. the "long running tests" or the "performance tests") that must pass in order for a patch to merge into the 'release' branch
- In order to merge into the 'developer' branch, there must be a PR that passes
the (tier 1) integration tests and that has been awarded a passing review by
a Maintainer.
- The Maintainer must ensure that the patch:
- Meets the Patch Requirements (Section 2.3) of C4.
- Does not introduce untested code (implying that tests must precede or accompany new functionality), unless that code is disabled by a feature flag.
- Adds entries describing the changes in the CHANGELOG file, if user-visible changes result.
- Maintainers are not permitted to object to patches for any other reason. (See C4 Section 2.4.14: "Maintainers SHALL NOT make value judgments on correct patches".)
- Maintainers are expected to respond to requests for review in a timely manner (for example, within one day). "Punting" to another Maintainer is a valid response.
- The commits on the 'developer' branch that pass integration testing (tier 1) are candidates for the regression tests (tier 2). Commits that pass the regression tests (tier 2) are merged into the 'staging' branch.
- The 'staging' branch commits that pass the long-running tests (tier 3) are merged into the 'release' branch.
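
The promotion rules above can be sketched as a small function. This is only an illustrative model, not part of any existing tooling: the names `Tier`, `BRANCH_FOR_TIER`, and `furthest_branch` are hypothetical, and the sketch assumes that a commit must pass the tiers in order and that Maintainer approval is needed only once, for the merge into 'developer'.

```python
from enum import IntEnum
from typing import Optional, Set


class Tier(IntEnum):
    """The three testing tiers described above (names are illustrative)."""
    INTEGRATION = 1   # tier 1: gates 'developer'
    REGRESSION = 2    # tier 2: gates 'staging'
    LONG_RUNNING = 3  # tier 3: gates 'release'


# Branch that a commit may enter once the given tier has passed.
BRANCH_FOR_TIER = {
    Tier.INTEGRATION: "developer",
    Tier.REGRESSION: "staging",
    Tier.LONG_RUNNING: "release",
}


def furthest_branch(passed: Set[Tier], maintainer_approved: bool) -> Optional[str]:
    """Return the furthest branch a commit may reach, or None if it cannot merge.

    Assumption: tiers are cumulative, so a tier only counts if every
    lower tier has also passed.
    """
    if not maintainer_approved:
        return None
    branch = None
    for tier in Tier:  # iterates in definition order: 1, 2, 3
        if tier not in passed:
            break
        branch = BRANCH_FOR_TIER[tier]
    return branch
```

For example, a reviewed commit that has passed only tier 1 maps to `"developer"`, and a commit that passed tier 3 but not tier 2 still stops at `"developer"`, because under this model only failing (or missing) tests hold a commit back.
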
- Using the above, simply choose the most recent commit on the 'release' branch
every quarter, tag it appropriately (e.g. '2024-Q1'), and that becomes the
release.
- With this method, it is not possible to miss a deadline.
- It is not possible to "not be able to release", as the commits on the 'release' branch are known-working and tested.
- Those who think that a release is "not ready" are invited to extend the test suites with (possibly) failing tests to prove it. Only failing tests can stop a commit from progressing to the 'release' branch.
- What has changed since the last release is very easy to view (with Git).
- The release notes are simply the entries from the CHANGELOG file since the last release, making comms about the release very easy.
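
As a rough sketch of the quarterly tagging step, the snippet below derives a tag name in the '2024-Q1' style and builds the Git commands that would tag the tip of 'release' and list what changed since the previous release. The helper names `quarter_tag` and `release_commands` are hypothetical, and the commands are only constructed here, not executed.

```python
import datetime
from typing import List


def quarter_tag(day: datetime.date) -> str:
    """Return a tag such as '2024-Q1' for the quarter containing `day`."""
    quarter = (day.month - 1) // 3 + 1
    return f"{day.year}-Q{quarter}"


def release_commands(day: datetime.date, previous_tag: str) -> List[List[str]]:
    """Build (but do not run) the Git commands for a quarterly release.

    Assumption: the release branch is literally named 'release' and the
    previous release was tagged with `previous_tag`.
    """
    tag = quarter_tag(day)
    return [
        # Create an annotated tag on the most recent 'release' commit.
        ["git", "tag", "-a", "-m", f"Release {tag}", tag, "release"],
        # List the commits that landed since the previous release.
        ["git", "log", "--oneline", f"{previous_tag}..{tag}"],
    ]
```

Each command list could be passed to `subprocess.run(cmd, check=True)` from a checkout of the repository; that step is left out here so the sketch stays free of side effects.
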
- The Mina Foundation should maintain a team that chases developers who have
broken tests. This is akin to the andon cord of the Toyota Production System.
Tests that were previously passing and which begin to fail are treated like an
emergency that "stops the world" for the entire team. No work is done other
than making the tests pass. This can be achieved by a code reversion if a
solution is not likely to be found shortly.
- This activity may turn out to be the primary (and most valuable) function of the Mina Foundation engineering team.
- Tier 1 tests (integration tests) are intended to run very quickly (10m maximum). They include buildability, linting, unit tests, fast smoke tests, and fast end-to-end tests.
- Tier 2 tests (regression tests) are intended to be thorough test suites that ensure existing functionality is not broken. They should include numerous end-to-end tests, and challenging tests that check for corner cases - not merely "happy path" tests. Tier 2 tests may take many hours to run. This is okay because developers need not wait for Tier 2 tests before merging their code.
- Tier 3 tests (long-running tests and performance tests) are intended to simulate real-world scenarios with large clusters of heterogeneous machines which take a long time (days) to simulate. Performance tests should also be included in this tier.
- Breakages in tests of any tier are considered stop-the-world emergencies: they must be corrected immediately, or the associated patches reverted.
- Performance regressions are less urgent, but still merit up-prioritization.
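
One way to detect the "previously passing, now failing" condition that triggers a stop-the-world response is to compare two test runs. The sketch below is an assumption about how results might be represented (a mapping from test name to pass/fail); the function name `newly_broken` is hypothetical.

```python
from typing import Dict, List


def newly_broken(previous: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Return the tests that passed in `previous` but fail in `current`.

    Tests that are new in `current`, or that were already failing, are not
    reported: under the policy above, only a pass-to-fail transition is
    treated as an emergency.
    """
    return sorted(
        name
        for name, passed in current.items()
        if not passed and previous.get(name, False)
    )
```

A non-empty result would be the signal to pull the andon cord: fix the listed tests or revert the associated patches before any other work proceeds.
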
- Create a Contributor list in the repo.
- Create a CHANGELOG file in the repo.
- Deprecate everything to do with "teams" on the MinaProtocol/mina repo, as the team membership and structure is semi-private and semi-secret. Use a (documented, publicly viewable) CODEOWNERS file instead. The permissions allocated in this file are enforced automatically by GitHub.
- Create the 3 testing tiers and automate them. (That is a LOT of work, but the Mina Foundation engineering team began implementing this in June 2023.)
- Up-prioritize making tests pass (e.g. bug fixes that were previously thought too risky).
- ... more to come.