-
Create a PR
-
When ready, submit a new iteration via CanIHasReview.
-
Wait/respond to review comments.
-
Make changes, go back to step 2, repeat indefinitely.
… Pretty similar to most projects
What makes SE-EDU’s development process different compared to other GitHub projects you may have come across:
-
Clear signaling of PR status via CanIHasReview
-
Rebase workflow / "Commit organization" (Every commit is a single change)
-
Commit messages
-
Given A PR, reviewer decides "merge" or "don’t merge".
-
How should a reviewer decide?
-
Let’s think of some criteria:
-
Correctness of the code according to its specification
-
E.g. Function gives the correct output for an input
-
Question: What is the specification?
-
-
Does the PR fix the bug?
-
What does "fixing the bug" mean?
-
Is the bug just a sign of a much deeper problem?
-
-
Does the PR enhance the product?
-
Is it truly an enhancement?
-
Is it aligned with the project goals?
-
-
Given just the diff of the PR (and nothing else, not even communication with the PR author), can you decide if the PR should be merged?
-
"Each commit contains a single logical change, and this change must stand on its own."
-
"Each commit has a single responsibility, and the responsibility must be fully carried out.
Sounds familiar? This rule is inspired by two ideas in programming: the Single Responsibility Principle and Levels of Abstraction.
(via Wikipedia)
The single responsibility principle is a computer programming principle that states that every module, class, or function should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility. Robert C. Martin expresses the principle as, "A class should have only one reason to change," although, because of confusion around the word "reason" he more recently stated "This principle is about people."
- Abstraction
-
The process of hiding details to more closely attend to other details
- Levels/Layers of Abstraction
-
Each level represents a different model of the same information, but with varying amounts of detail.
-
Makes it easy to reason about the program, as well as to communicate with others.
-
Likewise, we can apply this concept to PRs as well, so as to make it easier for contributors and reviewers to reason about changes in the PR, and to communicate with each other about details in the PR.
- Logical
-
The logical framework(s) used by the project. (e.g. for software code: a refactoring)
- Single change
-
A single entity when viewer from a certain level of abstraction.
-
Which level of abstraction is suitable? Depends on who you ask, but do note that we only have space for two levels of abstraction — the PR level and the commit level.
-
Try to choose the level of abstraction that makes it easy for you and the reviewer to reason about the change!
-
-
Changes made to a single file
-
A single diff hunk
-
Why: The level of abstraction is usually too low for reviewers to easily reason about the diff.
-
Also, Git already provides this information, so choosing them as the level of abstraction won’t add any value.
-
Look for patterns in the changes you are making to the code. Changes matching the same pattern will likely belong to the same responsibility.
-
Try to give a "one liner summary" of what your changes are doing, without using "and" (or its equivalents).
-
Also, consider that each commit must not break the build.
- Prevent diffs from different changes from mixing together.
-
Makes it easier to review since reviewers won’t need to spend extra brain cycles separating the changes into their different logical components while reading the diff.
- Prevent unnecessary changes from being made.
-
When the "single logical change" is identified and stated in the commit message, any unnecessary change in the diff will stick out like a sore thumb.
- Makes discussing changes easier.
-
Sometimes, a single logical change could stretch across different files and diff hunks. Without separating them into different commits, we can only refer to them vaguely with prose or at the granularity of a diff hunk or file.
- Easier manipulation of the PR.
-
Commits in the PR can be easily split into separate PRs or joined together into a single PR as needed. (Depending on project requirements.)
-
"Squash and Merge" workflow: Very popular with GitHub projects nowadays.
-
"Single logical change per PR"?
-
Less flexible since we only have one level of abstraction to work with.
-
Each PR requires a separate branch, too heavyweight, discourages contributors from actually separating changes.
-
"My commits would all be squashed into a single one anyway, so why bother?"
-
For SE-EDU development, you have the freedom to rearrange/rewrite your commits. Try playing around to get a feel for how to effectively organize your commits.
Splitting a PR into commits, each with a clearly defined responsibility, gives you a language to reason about and communicate changes at a higher level.
-
… we should probably drop [3/6] because …
-
… perhaps [4/8] should come before [2/8] …
-
… compared to [v2 4/6], the implementation used in [v3 5/7] seems less clean because …
Try to strike a balance between a level of abstraction that is too low and a level of abstraction that is too high.
-
Don’t be afraid of submitting PRs with only one or two commits.
-
Most PRs will only have one commit.
-
You can always split your single commits into multiple commits when it becomes clear that it should be done.
-
-
Why is the PR too big?
-
Does the PR contain commits which are irrelevant to the overall goal?
-
Split them into another PR.
-
-
Is the goal of the PR defined too broadly?
-
Split into smaller PRs, each with smaller scope.
-
-
-
PRs that are split out must be valuable on their own.
-
Not convincing: This PR #1 doesn’t really add much value when it is merged, but when PR #2 is merged everything would be so much more awesome! (If so, why not put them together?)
-
You never truly know whether a PR/commit is needed in accomplishing a high level goal until you actually accomplish that high level goal.
-
-
Don’t be afraid of submitting PRs that are too "big" as well.
-
The "single logical change per commit" rule must be followed, no matter how many commits it ends up producing!
-
We have CanIHasReview to help us with generating interdiffs, so we can handle big PRs with ease.
-
Basically, don’t sweat the details, trust your intuition. As long as you follow the "single logical change per commit" rule, you are on the right track, no matter how big or small the PR ends up being.
If you want to build and refine your intuition, just practice :-)
-
To inform.
-
What is the problem.
-
What is the single responsibility of the commit. (You did decide that, right?)
-
What is the intent and expected result of the change.
-
Related background information.
-
-
More importantly, to persuade.
-
Give the reviewer/maintainer confidence that the change is a good thing for the project.
-
Don’t just describe the change, argue for the change!
-
Commit messages are not special! Any technique you already use to make your issue/PR description a good one applies to commit messages as well.
Don’t just follow the commit message template. Ask yourself (or your friends) whether the commit message you wrote is persuasive.
- To support the "one logical change per commit" rule
-
If the rule is followed, each commit would have a different topic, and thus a different description. Using commit messages would thus allow us to tie the description together with the commit, as opposed to squashing them all together into the issue/PR description.
- To leave an audit trail
-
E.g. for git bisect, searching through Git history
- For extra convenience
-
-
Since we have spent so much time crafting the issue/PR description, might as well bundle it with the commit itself as well.
-
GitHub allows us to view all commit messages at one go when browsing commits, can’t do that with issue/PR descriptions.
-
Supported by more tooling: IntelliJ will show the commit message when hovering over commit hashes, git bisect will print the commit message of any offending commit etc.
-
Commit messages are part of the SE-EDU review process to ensure they are kept up to date.
-
Don’t just look at the style, whether it follows the commit message template etc.!
-
Check whether arguments made are sound.
-
Check the understanding of contributors
-
If the contributor has thorough understanding of the technicalities behind the change, it is more likely that the change is correct.
-
If the contributor has also considered other alternatives, it is even more likely that the change would be beneficial for the project.
-
-
Allows reviewers to check that the interests of the contributors are aligned with the project.
Signs that the change may be a bad idea: Hand waving, imprecise language, glossing over details, factual errors, bold claims without substantiation, invalid or unsound arguments.
Believe it or not, SE-EDU’s contribution process isn’t really novel at all. Here are some projects which do nearly the same things that we do, and some examples of commits from their history.
daemon: fix length computation in newline stripping When git-daemon gets a pktline request, we strip off any trailing newline, replacing it with a NUL. Clients prior to 5ad312bede (in git v1.4.0) would send: git-upload-pack repo.git\n and we need to strip it off to understand their request. After 5ad312bede, we send the host attribute but no newline, like: git-upload-pack repo.git\0host=example.com\0 Both of these are parsed correctly by git-daemon. But if some client were to combine the two: git-upload-pack repo.git\n\0host=example.com\0 we don't parse it correctly. The problem is that we use the "len" variable to record the position of the NUL separator, but then decrement it when we strip the newline. So we start with: git-upload-pack repo.git\n\0host=example.com\0 ^-- len and end up with: git-upload-pack repo.git\0\0host=example.com\0 ^-- len This is arguably correct, since "len" tells us the length of the initial string, but we don't actually use it for that. What we do use it for is finding the offset of the extended attributes; they used to be at len+1, but are now at len+2. We can solve that by just leaving "len" where it is. We don't have to care about the length of the shortened string, since we just treat it like a C string. No version of Git ever produced such a string, but it seems like the daemon code meant to handle this case (and it seems like a reasonable thing for somebody to do in a 3rd-party implementation). Reported-by: Michael Haggerty <[email protected]> Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
rtc: rk808: Compensate for Rockchip calendar deviation on November 31st In A.D. 1582 Pope Gregory XIII found that the existing Julian calendar insufficiently represented reality, and changed the rules about calculating leap years to account for this. Similarly, in A.D. 2013 Rockchip hardware engineers found that the new Gregorian calendar still contained flaws, and that the month of November should be counted up to 31 days instead. Unfortunately it takes a long time for calendar changes to gain widespread adoption, and just like more than 300 years went by before the last Protestant nation implemented Greg's proposal, we will have to wait a while until all religions and operating system kernels acknowledge the inherent advantages of the Rockchip system. Until then we need to translate dates read from (and written to) Rockchip hardware back to the Gregorian format. This patch works by defining Jan 1st, 2016 as the arbitrary anchor date on which Rockchip and Gregorian calendars are in sync. From that we can translate arbitrary later dates back and forth by counting the number of November/December transitons since the anchor date to determine the offset between the calendars. We choose this method (rather than trying to regularly "correct" the date stored in hardware) since it's the only way to ensure perfect time-keeping even if the system may be shut down for an unknown number of years. The drawback is that other software reading the same hardware (e.g. mainboard firmware) must use the same translation convention (including the same anchor date) to be able to read and write correct timestamps from/to the RTC. Signed-off-by: Julius Werner <[email protected]> Reviewed-by: Douglas Anderson <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]>
Fix plan created for inherited UPDATE/DELETE with all tables excluded. In the case where inheritance_planner() finds that every table has been excluded by constraints, it thought it could get away with making a plan consisting of just a dummy Result node. While certainly there's no updating or deleting to be done, this had two user-visible problems: the plan did not report the correct set of output columns when a RETURNING clause was present, and if there were any statement-level triggers that should be fired, it didn't fire them. Hence, rather than only generating the dummy Result, we need to stick a valid ModifyTable node on top, which requires a tad more effort here. It's been broken this way for as long as inheritance_planner() has known about deleting excluded subplans at all (cf commit 635d42e9c), so back-patch to all supported branches. Amit Langote and Tom Lane, per a report from Petr Fedorov. Discussion: https://postgr.es/m/[email protected]