The git history of a repository a few years in or with more than one collaborator will look like thick weed unless you take care of it. During last Autumn in Lookback, the engineering team noticed that we needed to take care of some weeds.
Not only in our committing code culture, but also in the land of code reviews and Pull Requests at GitHub.
Here are some of the principles and guidelines we worked out.
Empathy for others and your future self
"Please have empathy for your coworkers when committing code. Be considerate of them as well as your future self".
When working in a long living code base with lots of complex and moving parts, bugs can surface weeks or months after you wrote the code. Guess what system helps you go back in time and find exactly what, when, and who changed a line of code?! git can! Guess what doesn't help git help you? These things:
- Unclear commit messages.
- Incoherent and too large commits.
- Not easily revertable commits (connected with the above).
- Merge commits.
Failure to avoid these things will bite you when debugging something in the future.
For us, we didn't enforce or point these things out in code reviews either. One person could file a Pull Request 10+ commits, where some of them wouldn't even be related to the PR itself. It was hard to give good feedback on a commit which said
fix some things in logging: what did you fix? Why?
"You've got to give love to receive love". Meaning, "you've got to give your co-workers good PRs in order to receive good feedback". A related joke is the old "20 commit PR? LGTM. One commit? A novella of feedback".
We decided to reboot our engineering culture around clean commits and PRs. What follows are some of the guidelines we drafted for working with code.
1. Committing code
Make each commit atomical. That is, make it include only the relevant code for your intended change. There's a lot to read about this subject on the web.
When submitting a patch for code review, consider grouping together similar commits with
git rebase("squashing") to keep commit count down (without breaking the rule about atomical commits above!).
master branch's history should ideally be a long string of commits. We should avoid merge commits by rebasing in
master into your feature branch before merging back.
- Try to focus on the why in commit messages. Not necessarily what. Think about how this commit will be read when doing a
git blamehalf a year later. The reader is probably interested in why this change happened.
- Capturing the why context is super important, since it's probably tricky to recover later. Save yourself some future slack and capture the why now.
- Adding any references (external stories, cards, etc) is appreciated.
"How to Write a Git Commit Message" includes some no brainer guidelines for good commit messages.
2. Submitting Pull Requests
Have mercy for your coworkers. Don't dump an unstructured, large PR on them with "Review this!!". Be considerate of their time and energy when submitting, as it might be very hard for them to get into the context you've been into the last couple of days while working on the code.
Make all PRs minimal, concise, and relevant. Don't include non-trivial refactors of modules in a PR that's really supposed to do something else.
Avoid long living branches. By making small, incremental changes, we avoid messy merge conflicts and best of all: we avoid risky deploys. It's easier to reason about small patches, and potentially rolling them back, than huge blobs of "fix everything" PRs.
Make sure your PR is rebased on
master when submitting it. That removes the need for merge commits when merging back to
When your PR is ready for review (implies that you also think it's ready for merge), you must make sure it's neat and tidy.
- Squash any similar commits together.
- Look at each commit and see if it makes sense: does it have a nice message? Is the code changes relevant and atomical? If a coworker does a
git login two months, will they be able to understand this change?
After receiving a code review with change requests, always try to squash those changes together with the initial commits made when first submitting the PR. It's okay if you have lots of smaller commits in your feature branch while working with the code. But once it's ready for review and later merge into
master, the final PR must be in a concise, minimal state.
The code reviewer may reject your PR if it's:
- Too large ("please break into separate PRs").
- Unstructured/needs squashing ("please restructure this PR", "please redo your commit history").
This does not mean "you suck". It's a signal that you can learn how to avoid this the next time, as well as avoding wasting your coworkers' time. We aim to learn and become a better developer (and co-developer!) with code reviews, since it's so easy to do your own thing when you've been down in a code cave for a couple of days.
3. Tools & patterns
When rebasing your branch – either to rebase in
master or rewrite history – the branch will be diverging from the remote, if you've pushed it already. So you need to force push next time (
git push -f).
Commit message templates
I just learned about this a couple of months ago: git supports message templates. Every time you're committing something from an external editor (not from the command line), you can use a pre-written template. The template I stole from the post linked below reminds me of the "Because". Why are you even doing this commit? Any business purpose, or is it tech?
[one line-summary of changes] Because: - [relevant context] - [why you decided to change things] - [reason you're doing it now] This commit: - [does X] - [does Y] - [does Z]
Read more about setting this template up.
See this guide from the git docs: "Rebasing".
git checkout my-feature-branch # ...do work # ...changes have been made in remote master # fetch/pull latest master git rebase master
This will re-apply your commits in the feature branch on top of what's in
master , to make it appear as your branch is branched off of latest commit in
git fetch origin master:master from your feature branch to fetch latest
master without leaving your branch.
There's a ton of magic things one can do with
rebase, please consult the internet or ask coworkers about rebasing strategies.
- Do rebase
masterinto your feature branch when there are upstream changes in
- Don't use
git mergewhen including changes from
masterto your feature branch.
At some point, there will be conflicts when rebasing. You'll be able to solve them in the same way as when manually solving conflicts when doing
git merge. This is all described in the terminal when there's a conflict. In a nutshell
# Find conflicts in code: git status # ..Fix them :) # Mark as resolved: git add
# Continue rebasing git rebase --continue
Force pushing to collaborative branches
Rewriting history requires you to force push the PR branch. I.e. you need to push with
git push -f
The problem with force pushing is that anyone collaborating on the branch will not be able to just
git pull. However, git introduced a switch on
pull which fixes this:
git pull --rebase
This is essentially a
git fetch origin git rebase origin/
Squashing commits together
This is nice for when you've done a lot of small, work preserving commits (WIPs or just to not lose them) that need to be "beautified" before submitting for review.
See this guide from the git docs: "Rewriting History".
In a nutshell:
# -i is for "interactive" # git rebase -i
# HEAD~ git rebase -i HEAD~5 # or git rebase -i git-hash^ # ^ is targetting the parent commit
An editor will open with your included commits listed. There'll be a legend below them describing how to rewrite history. You can drop commits (
d), squash together commits (
s), and more. You'll be able to rewrite the commits messages too.
After we introduced some baselines in how we commit and share code, I personally felt much more motivated to keep things neat and tidy (aka. "Broken window syndrome"). Git isn't just a tool to transport code from your laptop to a server: it's a super cool tool created to help us collaborate better, and to fix stuff when things go bad (which will happen sometime).
I found myself actually enjoying writing extended paragraphs in my commit messages. It brought back some memories in doing literate programming in CoffeeScript for university lab assignments. The shorter the commit diff, the longer my commit message would be – a funny inverse proportional property!
It also touches me every time I go back and see somebody had put in some effort in the commit message, and documented the process, the Why, and the outcome. It's like a time machine in your colleagues' minds, and also in their personal development to become better programmers.