This is a proposal to develop a changelog generator tool that is more suitable for user-facing changelogs and is based on the experiences of existing projects. The rationale is provided to explain why this is better than existing alternatives. The foundational axioms are:
- Git history is not suitable as a changelog because some commits are not important.
- Git messages are written for developers and are thus generally not suitable for a changelog, which is intended for users.
- Chronological ordering is not the best. Important changes like vulnerabilities or API breaks need to be at the top.
- Changes should be organized into sections so that people can easily navigate them.
- Editing a file slows down git merges.
- People are forgetful.
Since I know someone who's learning Rust and this project is not too difficult, I propose to give him this as a learning task. We can kill multiple birds with one stone: the difficulty should make it a great task for a beginner, it gets our problem solved, and we can share costs and make his learning experience better at the same time. I will personally review his code and am willing to contribute satoshis. Yes, he accepts BTC.
A cool thing about Rust is we can produce a single binary (and cross-sign it) so the target systems don't need to install interpreters.
A single-binary program will be created. It will have these two subcommands:
- `check` will verify the structure of commit messages since the last version/in the PR. Intended to be used in CI on PRs.
- `gen` will process git history and output the changelog in markdown format into a file/stdout.
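As a rough illustration of the two subcommands, here is a minimal sketch of argument dispatch using only the standard library. The names `Subcommand` and `parse_subcommand` are hypothetical, and the optional output path after `gen` is an assumption made for this sketch; the proposal only says the output goes into a file or stdout.

```rust
/// The two subcommands named in the proposal (hypothetical type).
#[derive(Debug, PartialEq)]
enum Subcommand {
    /// Verify the structure of commit messages (meant for CI on PRs).
    Check,
    /// Generate the changelog; `None` means "write to stdout".
    Gen { output: Option<String> },
}

/// Parses the arguments that follow the program name.
/// The optional path after `gen` is an assumption of this sketch.
fn parse_subcommand(args: &[&str]) -> Result<Subcommand, String> {
    match args {
        ["check"] => Ok(Subcommand::Check),
        ["gen"] => Ok(Subcommand::Gen { output: None }),
        ["gen", path] => Ok(Subcommand::Gen { output: Some(path.to_string()) }),
        [] => Err("expected a subcommand: check or gen".to_string()),
        [other, ..] => Err(format!("unknown or malformed subcommand: {other}")),
    }
}
```

A `main` function could collect `std::env::args().skip(1)` into a vector, borrow the items as `&str`, and pass them to `parse_subcommand`, exiting with a non-zero status on `Err` so that CI fails.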
The command is configured using `.mkchlog.yml`, a per-project configuration file.
This is created when the project is started (or added later) and rarely needs to be changed.
The main input is in commit messages.
# Possibly general settings here, probably none in the initial version
sections:
    # section identifier selected by project maintainer
    security:
        # The header presented to the user
        title: Security
        # description is optional and will appear above changes
        description: This section contains very important security-related changes.
        subsections:
            vuln_fixes:
                title: Fixed vulnerabilities
    features:
        title: New features
    perf:
        title: Performance improvements
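To make the shape of this configuration concrete, here is a minimal sketch of an in-memory model for it. The type `Section` and the functions `example_config` and `resolve` are hypothetical names, and the example is built by hand; a real tool would parse the YAML file instead. The colon-separated path form (`security:vuln_fixes`) is taken from the commit message examples in this proposal.

```rust
/// One section (or subsection) of the changelog (hypothetical type).
#[derive(Debug, Default)]
struct Section {
    /// Identifier chosen by the maintainer, e.g. `security`.
    id: String,
    /// Header presented to the user.
    title: String,
    /// Optional text shown above the changes.
    description: Option<String>,
    /// Nested sections, kept in declaration order.
    subsections: Vec<Section>,
}

/// Builds the example configuration by hand (no YAML parsing in this sketch).
fn example_config() -> Vec<Section> {
    vec![
        Section {
            id: "security".into(),
            title: "Security".into(),
            description: Some(
                "This section contains very important security-related changes.".into(),
            ),
            subsections: vec![Section {
                id: "vuln_fixes".into(),
                title: "Fixed vulnerabilities".into(),
                ..Default::default()
            }],
        },
        Section { id: "features".into(), title: "New features".into(), ..Default::default() },
        Section { id: "perf".into(), title: "Performance improvements".into(), ..Default::default() },
    ]
}

/// Resolves a colon-separated path such as `security:vuln_fixes`.
/// Returns `None` if any component of the path is unknown.
fn resolve<'a>(sections: &'a [Section], path: &str) -> Option<&'a Section> {
    let mut level = sections;
    let mut found = None;
    for id in path.split(':') {
        let section = level.iter().find(|s| s.id == id)?;
        level = section.subsections.as_slice();
        found = Some(section);
    }
    found
}
```

Keeping the sections in a `Vec` rather than a map preserves the order chosen by the maintainer, which is one way to keep important sections at the top of the output.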
Added ability to skip commits.
This allows commits to be skipped by typing changelog: skip
at the end of the commit. This is mainly useful for typo
fixes or other things irrelevant to the user of a project.
changelog:
    inherit: all
    section: features
Fixed grammar mistakes.
We found 42 grammar mistakes that are fixed in this commit.
changelog: skip
Don't reallocate the buffer when we know its size
This computes the size and allocates the buffer upfront.
Avoiding allocations like this introduces 10% speedup.
changelog:
    section: perf
    title: Improved processing speed by 10%
    title-is-enough: true
Fixed TOCTOU race condition when opening file
Previously we checked the file permissions before opening
the file. Now we check the metadata using the file descriptor
after opening the file (before reading).
changelog:
    section: security:vuln_fixes
    title: Fixed vulnerability related to opening files
    description: The application was vulnerable to attacks
                 if the attacker had access to the working
                 directory. If you run this in such
                 environment you should update ASAP. If your
                 working directory is **not** accessible by
                 unprivileged users you don't need to worry.
## Security
This section contains very important security-related changes.
### Fixed vulnerabilities
#### Fixed vulnerability related to opening files
The application was vulnerable to attacks if the attacker had access to the working directory.
If you run this in such environment you should update ASAP.
If your working directory is **not** accessible by unprivileged users you don't need to worry.
## New features
### Added ability to skip commits.
This allows commits to be skipped by typing changelog: skip
at the end of the commit. This is mainly useful for typo
fixes or other things irrelevant to the user of a project.
## Performance improvements
* Improved processing speed by 10%
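The layout of the output above can be sketched as a small recursive renderer. This is only an illustration under assumptions read off the example: entries with a description get a heading one level below their section, title-only entries become bullet points, and sections without any entries are dropped. The names `Entry`, `Section`, `has_entries` and `render` are hypothetical.

```rust
/// A single changelog entry taken from one commit (hypothetical type).
#[derive(Default)]
struct Entry {
    title: String,
    /// `None` models `title-is-enough: true`.
    description: Option<String>,
}

/// A section with its entries and nested subsections (hypothetical type).
#[derive(Default)]
struct Section {
    title: String,
    description: Option<String>,
    entries: Vec<Entry>,
    subsections: Vec<Section>,
}

/// True if the section or any of its subsections holds at least one entry.
fn has_entries(section: &Section) -> bool {
    !section.entries.is_empty() || section.subsections.iter().any(has_entries)
}

/// Appends the section to `out` as markdown; top-level sections use level 2.
fn render(section: &Section, level: usize, out: &mut String) {
    if !has_entries(section) {
        return; // sections without commits are left out entirely
    }
    out.push_str(&format!("{} {}\n", "#".repeat(level), section.title));
    if let Some(text) = &section.description {
        out.push_str(&format!("{text}\n"));
    }
    for entry in &section.entries {
        match &entry.description {
            // Entries with a description get a heading one level deeper.
            Some(text) => out.push_str(&format!(
                "{} {}\n{text}\n",
                "#".repeat(level + 1),
                entry.title
            )),
            // Title-only entries become bullet points.
            None => out.push_str(&format!("* {}\n", entry.title)),
        }
    }
    for sub in &section.subsections {
        render(sub, level + 1, out);
    }
}
```

Calling `render(section, 2, &mut out)` for each top-level section in configuration order would yield a document shaped like the example, with unused sections silently omitted.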
Commit messages should contain descriptive information about the change.
However, not all of it is suitable to be in the changelog.
Each commit must either be explicitly marked as skipped or have some basic information filled in.
Commits with `changelog: skip` will obviously not be included in the changelog.
Commits with `inherit: all` will simply include both the title and the description of the commit in the changelog.
This should be used when the commit title and description are equally useful for developers and users.
`inherit` could also accept additional options like `title` to only copy the title.
`section` is mandatory and defines which section the change belongs in.
`title` and `description` are the ones intended for the user.
The fictitious "TOCTOU vulnerability fix" commit message above is hopefully a clear example.
For users it describes how the issue impacts them, while for programmers it explains the technical details.
`title-is-enough: true` explicitly opts out of the description; it is intended for situations where the user needs no additional information.
People refer to sections by their identifiers, not titles, so that they don't accidentally duplicate a section just because of a typo. Unknown sections in commit messages are rejected. Sections without commits are not present in the output at all, so projects can have big templates without worrying about bloat.
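The rules above could be enforced by `check` roughly as follows. This is a minimal sketch, not a real YAML parser: it assumes the `changelog:` line starts at the beginning of a line, that every following line belongs to the block, and that any line not starting with a known key continues the previous value. The names `Directive`, `KEYS` and `parse_directive` are hypothetical, and the exact validation rules (a `title` plus either a `description` or `title-is-enough: true` unless `inherit: all` is given) are one reading of the proposal.

```rust
/// Keys accepted inside a `changelog:` block, as listed in the proposal.
const KEYS: [&str; 5] = ["inherit", "section", "title", "description", "title-is-enough"];

/// What a commit message asks the changelog generator to do (hypothetical type).
#[derive(Debug, PartialEq)]
enum Directive {
    Skip,
    Entry {
        section: String,
        title: Option<String>,
        description: Option<String>,
        inherit_all: bool,
    },
}

/// Parses and validates the `changelog:` part of a commit message.
/// `known_sections` holds the section paths defined in the configuration.
fn parse_directive(message: &str, known_sections: &[&str]) -> Result<Directive, String> {
    let mut lines = message.lines().skip_while(|line| !line.starts_with("changelog:"));
    let head = lines.next().ok_or("missing `changelog:` entry")?;
    let rest = head["changelog:".len()..].trim();
    if rest == "skip" {
        return Ok(Directive::Skip);
    }
    if !rest.is_empty() {
        return Err(format!("unexpected value after `changelog:`: {rest}"));
    }

    // Collect `key: value` pairs; other lines continue the previous value.
    let mut fields: Vec<(String, String)> = Vec::new();
    for line in lines.map(str::trim).filter(|line| !line.is_empty()) {
        match line.split_once(':') {
            Some((key, value)) if KEYS.iter().any(|k| *k == key) => {
                fields.push((key.to_string(), value.trim().to_string()));
            }
            _ => match fields.last_mut() {
                Some((_, value)) => {
                    value.push(' ');
                    value.push_str(line);
                }
                None => return Err(format!("unexpected line: {line}")),
            },
        }
    }
    let get = |name: &str| fields.iter().find(|(k, _)| k == name).map(|(_, v)| v.clone());

    let section = get("section").ok_or("`section` is mandatory")?;
    if !known_sections.iter().any(|s| *s == section) {
        return Err(format!("unknown section: {section}"));
    }
    let inherit_all = get("inherit").as_deref() == Some("all");
    let title = get("title");
    let description = get("description");
    let title_is_enough = get("title-is-enough").as_deref() == Some("true");
    if !inherit_all {
        if title.is_none() {
            return Err("needs `inherit: all` or a `title`".to_string());
        }
        if description.is_none() && !title_is_enough {
            return Err("needs a `description` or `title-is-enough: true`".to_string());
        }
    }
    Ok(Directive::Entry { section, title, description, inherit_all })
}
```

Because a misspelled identifier such as `featurs` is simply not in `known_sections`, it is reported as an error instead of silently creating a duplicate section.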
This is based on these experiences:
- Editing a single file in PRs gets annoying with a sufficient number of contributions because it causes frequent rebases with conflicts. Commit messages cannot possibly cause conflicts.
- Sorting changes into categories later is difficult because it's hard to know if someone forgot to add a label on GitHub or the label is not appropriate. CI complaining about missing information prevents forgetting.
Additionally, we don't want to depend on GitHub so that we can migrate easily if needed - thus not pulling information from PRs.
I found a nice list of changelog generators. Many of them are dead, many work with GH (web VCS) only, or blindly include all commits and don't allow separating developer and user information. As far as I can see, none of them satisfies the requirements for high output quality and linting during PRs.
YAML is chosen for configuration and commit message inputs because it's widely known and the least annoying format to write deeply nested structures in. Markdown is used for the output because it's simple, user-editable in a text editor, and popular. The output is stored in a file to allow easy editing.
Are you someone who would like to use something like this? Are you willing to contribute? Are there things that are not entirely in line with your desires? Please let me know!
Make sure to integrate a git hook to enforce it locally. Otherwise it's just a drag to have people hitting it in CI.