9th May 2020
If you're a developer reading this, you're most likely familiar with Git. It's probably the tool you use most next to your text editor, but also the one you're least familiar with. You're probably not alone as many developers don't seem to take the time to develop the skills to use it effectively.
In this post, I'll go over the ways I use Git and hopefully this can help you get better at using it with a minimal amount of effort. Disclaimer - I don't profess to be an expert in Git, but over the years I've figured out how to get the most out of it for my workflow. Depending on your workflow, some of these tips might not work well, but I believe they should be useful in the majority of cases.
tldr - Use git pull --rebase (gup when using Oh My Zsh) to pull new changes on a branch.
Suppose you're working on an unprotected branch (you can push commits directly onto it) with multiple people e.g. master. Assume that one of your colleagues has pushed some new commits since the last time you did a git pull. You try to git push and you get the following:
    error: failed to push some refs to 'git@github.com:your-repo/your-repo.git'
    hint: Updates were rejected because the tip of your current branch is behind
    hint: its remote counterpart. Integrate the remote changes (e.g.
    hint: 'git pull ...') before pushing again.
    hint: See the 'Note about fast-forwards' in 'git push --help' for details
Instinctively, you run a git pull (the prompt even tells you to!). You've probably even habitualised this and do it without thinking. If there weren't any merge conflicts, Git magically sorts things out and then you can git push successfully. End of story, right?
Unfortunately, no. If we look at the commit history we'll see something like the following:
    * ed10c57bc - Merge branch 'master' of github.com:test/test into master
    |\
    | * 0d4f8bce2 - My commit
    * | 2a95c660a - Colleague's commit
    |/
    * 585b1b3e9 - Some other commit
We can see what looks like a hairpin loop in the history and a merge commit which says that the branch was merged into itself. If you take a second to think about this, it was probably not what you were trying to achieve when you went to push your code.
When working on smaller projects this might be fine, but on larger projects, many people doing this exact same pattern can quickly make the commit history into a thick forest of branches. Commit histories like this are nearly guaranteed to make your life unnecessarily hard further down the line when you're trying to figure out why a change was made.
What to do instead?
The simple fix is to use git pull --rebase instead. This change means that the commit history now looks like this:
    * 0d4f8bce2 - My commit
    * 2a95c660a - Colleague's commit
    * 585b1b3e9 - Some other commit
This is a massive difference and it is instantly clear what is happening. No more hairpin loop or merge commit.
I use Oh My Zsh, and use its handy gup alias for doing this.
You can also configure Git to use this by default by running git config --global pull.rebase true.
If you're using IntelliJ, you can set Settings > Version Control > Git > Update method to 'Rebase'.
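To see the difference for yourself, here is a rough sketch that replays the scenario in a throwaway directory. The repository names (seed, remote.git, me, colleague), the file names and the commit_file helper are all made up for illustration; only the git commands themselves come from the post.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: commit_file <repo> <file> <message>
commit_file() {
  echo "$3" > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "$3"
}

# A shared "remote" with one commit on master, cloned by two people.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit_file seed base.txt "Some other commit"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git colleague

# The colleague pushes first, then you commit locally.
commit_file colleague theirs.txt "Colleague's commit"
git -C colleague push -q origin master
commit_file me mine.txt "My commit"

# Replay your commit on top of theirs instead of creating a merge commit.
git -C me pull -q --rebase origin master
git -C me push -q origin master

merge_commits=$(git -C me rev-list --merges --count HEAD)
total_commits=$(git -C me rev-list --count HEAD)
git -C me log --format='%h - %s'
```

The final log should list three commits in a straight line, with no merge commit among them.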
tldr - Use git rebase (grb when using Oh My Zsh) to pull in new changes from a parent branch, or change the parent branch completely. Avoid rebasing branches other people are working on.
We've talked about using rebasing in the context of a git pull --rebase, but it can also generally be used as a substitute for a git merge. Typically, we need to merge when working on a feature branch and need new changes from the parent branch. Conversely, we might need to merge a pull request (PR) from the feature branch into the parent.
Both of the above cases would result in merge commits in our commit history. For example, say we're working on a branch called your-branch and want changes from master. Most people will probably git merge master and inadvertently create a commit history like the following:
    * ed10c57bc - Merge pull request #1 from test-repo/your-branch
    |\
    | * 0d4f8bce2 - Commit on your-branch
    * | 2a95c660a - Commit on master
    | * 6a2416680 - Merge branch 'master' into your-branch
    |/|
    * | 585b1b3e9 - Commit on master
    | * 217dd11cf - Commit on your-branch
This is quite confusing to look at. Notice that we now have two hairpins because we have the additional merge commit at 6a2416680 where we merged in master's changes.
If we use git rebase master instead, we can avoid the merge commit. The branch's commits are replayed on top of master's latest commit, so we get the following:
    * ed10c57bc - Merge pull request #1 from test-repo/your-branch
    |\
    | * 0d4f8bce2 - Commit on your-branch
    | * 217dd11cf - Commit on your-branch
    |/
    * 2a95c660a - Commit on master
    * 585b1b3e9 - Commit on master
This is definitely an improvement as we've removed the unnecessary merge commit. We could stop here, but some people advocate going further and rebasing the entire PR branch onto the master branch. On GitHub, this is possible using the 'Rebase and merge' option.
This is somewhat advantageous in that you get a linear commit history with the least noise possible:
    * 0d4f8bce2 - Commit on your-branch
    * 217dd11cf - Commit on your-branch
    * 2a95c660a - Commit on master
    * 585b1b3e9 - Commit on master
Unfortunately, the downside is that we lose the extra information from the PR's merge commit. These merge commits typically link to the PR description, which can often provide additional context that the commits lack.
What you choose is ultimately up to you, but I'd advise only rebasing simpler PRs where the merge commit wouldn't provide any additional information.
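For reference, the feature branch rebase described in this section can be sketched in a scratch repository like this. The file names and the commit_file helper are assumptions for the demo; the branch names match the ones used above.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit_file <file> <message>
commit_file() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit_file base.txt "Commit on master (1)"
git checkout -q -b your-branch
commit_file feature.txt "Commit on your-branch"

# master moves on while we work on the feature branch.
git checkout -q master
commit_file other.txt "Commit on master (2)"

# Pull master's changes into the branch by replaying our commits on top.
git checkout -q your-branch
git rebase -q master

merge_commits=$(git rev-list --merges --count your-branch)
fork_point=$(git merge-base master your-branch)
master_tip=$(git rev-parse master)
git log --graph --format='%h - %s' your-branch
```

After the rebase, the branch forks from the tip of master, which is why merging it later needs no extra 'Merge branch master into your-branch' commit.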
tldr - Use git push --force to push your rebased changes to your branch. Avoid using this if you're working on a branch that is shared by other people.
It's important to understand that rebasing changes your commit history whilst merging does not. A rebase usually rewrites your commits: each one gets a new hash and a committer date set to the time the command was run, while the original author date is preserved. Suppose we have some commits like the following (the times shown are committer dates):
    * 0d4f8bce2 - My commit 1 (1 day ago)
    * 2a95c660a - My commit 2 (2 days ago)
Upon doing a git rebase the history might look like:
    * 993e0b281 - My commit 1 (1 minute ago)
    * c76293c19 - My commit 2 (1 minute ago)
You should notice that the commit hashes and timestamps have changed. Normally this isn't a problem as we're unlikely to be interested in preserving the exact creation times.
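If you want to check exactly what a rebase changes, a small experiment along these lines should do it. The dates, file names and variable names are invented for the demo. It compares the hash, the author timestamp (%at) and the committer timestamp (%ct) of a commit before and after a rebase.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "Some other commit"

# Create a branch commit that pretends to be from the past.
git checkout -q -b your-branch
echo feature > feature.txt
git add feature.txt
GIT_AUTHOR_DATE="2020-05-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2020-05-01 12:00:00 +0000" \
  git commit -q -m "My commit"

hash_before=$(git rev-parse HEAD)
author_before=$(git log -1 --format=%at)
committer_before=$(git log -1 --format=%ct)

# master gains a commit, so the rebase really has to rewrite ours.
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "Colleague's commit"
git checkout -q your-branch
git rebase -q master

hash_after=$(git rev-parse HEAD)
author_after=$(git log -1 --format=%at)
committer_after=$(git log -1 --format=%ct)

echo "hash:      $hash_before -> $hash_after"
echo "author:    $author_before -> $author_after"
echo "committer: $committer_before -> $committer_after"
```

The hash and committer timestamp should differ after the rebase, while the author timestamp stays the same.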
If you've previously pushed these commits up to the remote branch, doing a git push will get you
    error: failed to push some refs to 'git@github.com:your-repo/your-repo.git'
    hint: Updates were rejected because the tip of your current branch is behind
    hint: its remote counterpart. Integrate the remote changes (e.g.
    hint: 'git pull ...') before pushing again.
    hint: See the 'Note about fast-forwards' in 'git push --help' for details.
This should look familiar. Run a git status, and you should see:
    On branch your-branch
    Your branch and 'origin/your-branch' have diverged,
    and have 2 and 2 different commits each, respectively.
      (use "git pull" to merge the remote branch into yours)
Git is telling us that your local branch and the remote branch have diverged. This is understandable as we changed the commit history on the local branch. If we were to naively follow the advice of git pull, the commit history would look like:
    * 993e0b281 - My commit 1 (1 minute ago)
    * c76293c19 - My commit 2 (1 minute ago)
    * 0d4f8bce2 - My commit 1 (1 day ago)
    * 2a95c660a - My commit 2 (2 days ago)
This is definitely not what you want as the commits have been repeated!
So what do we need to do?
Instead of trying to git push, use git push --force. This will forcibly push up the local branch's changes to the remote branch and override its commits. We are telling Git to accept our local branch's version of the commit history as the source of truth.
git push --force is a potentially dangerous command. Normally it isn't a problem because you will use it on a branch you created and that no one else is working on.
Now imagine you're working on a branch with someone. They push commits to the remote branch, and you haven't pulled those changes down into your local branch. You decide to rebase and then force push. Doing this will erase the other person's commits on the remote branch and this could potentially be irreversible.
It's entirely possible to sort this out after the fact, but it can feel like a lot of trouble and is potentially asking for annoyed teammates.
A good rule of thumb is to not use git rebase when you're working on a shared branch (git pull --rebase is still fine). This way you won't need to git push --force either.
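One safety net that the post doesn't cover is git push --force-with-lease. It is a standard Git flag that refuses to overwrite the remote branch if it has moved since you last fetched. Below is a sketch of the shared-branch accident described above, with made-up repository and file names, to show how it could behave.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: commit_file <repo> <file> <message>
commit_file() {
  echo "$3" > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "$3"
}

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit_file seed base.txt "Commit on master"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git colleague

# You create the branch and push it.
git -C me checkout -q -b your-branch
commit_file me mine.txt "My commit"
git -C me push -q -u origin your-branch

# A colleague adds a commit to the same branch; you never fetch it.
git -C colleague fetch -q origin
git -C colleague checkout -q -b your-branch origin/your-branch
commit_file colleague theirs.txt "Colleague's commit"
git -C colleague push -q origin your-branch

# Meanwhile master moves on and you rebase your branch onto it.
git -C me checkout -q master
commit_file me other.txt "Another commit on master"
git -C me checkout -q your-branch
git -C me rebase -q master

# A plain --force would silently drop the colleague's commit here.
if git -C me push -q --force-with-lease origin your-branch 2>/dev/null; then
  outcome=pushed
else
  outcome=rejected
fi
remote_subjects=$(git -C remote.git log --format=%s your-branch)
echo "push was $outcome"
```

Here the push should be rejected and the colleague's commit should still be on the remote branch. This is a guard rail rather than a replacement for the rule of thumb above.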
tldr - Use git rebase -i (grbi when using Oh My Zsh) to clean up your branch's history. git commit --amend is great for changing the most recent commit.
As we've learnt earlier, git rebase is able to change a branch's commit history. One interesting flag we can use with it is -i. It stands for 'interactive' and can be used to interactively rewrite the commit history.
For example, if we want to change the last 2 commits on our branch we can do git rebase -i HEAD~2.
This will show us the following interface in the CLI:
    pick dca657264 Another commit
    pick ad8830adf Some commit
    # Rebase 5b95f1854..ad8830adf onto 5b95f1854 (2 commands)
    #
    # Commands:
    # p, pick <commit> = use commit
    ...
The interface offers a bunch of options, but the ones I commonly use are:
This is a great way to tidy up a feature branch before submitting a PR for it. For example, the branch's commit history might look like this:
    * 2a95c660a - Fix typo in feature
    * dca657264 - Add other part of feature
    * 0d4f8bce2 - Format feature code
    * ad8830adf - Add feature
Commits like 'Fix typo' or 'Format code' have limited value, and are particularly worthless if they're just fixing new code that should be working in the first place.
In general, it's not good practice to commit code in a broken state, but this can contradict other advice to make frequent, smaller commits. By using git rebase -i we can have the best of both worlds. We can create frequent WIP commits and then clean up the commit history later, perhaps before submitting the PR.
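As a sketch of this clean-up, the snippet below folds a 'fix typo' change back into the commit it belongs to. It uses git commit --fixup and git rebase -i --autosquash, which are standard Git options but not ones the post itself mentions. Setting GIT_SEQUENCE_EDITOR=true accepts the generated todo list unchanged so the example runs without opening an editor. File names and contents are made up.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "Some commit"

# A feature commit containing a typo.
echo "featrue" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
target=$(git rev-parse HEAD)

echo other > other.txt
git add other.txt
git commit -q -m "Add other part of feature"

# Record the typo fix as a fixup of the commit that introduced it.
echo "feature" > feature.txt
git add feature.txt
git commit -q --fixup="$target"

# Let the interactive rebase reorder and squash the fixup automatically.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

subjects=$(git log --format=%s)
commit_count=$(git rev-list --count HEAD)
feature_contents=$(git show HEAD~1:feature.txt)
echo "$subjects"
```

The history should end up with three commits and no trace of the typo fix as a separate commit.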
Whilst git rebase -i is a great general-purpose command, I find it goes through a bit too much ceremony when I simply want to change the most recent commit. For this purpose, we can use git commit --amend instead. It allows us to update the most recent commit with whatever is currently staged, and even change the commit message.
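A minimal illustration of amending, using a made-up file and commit message in a scratch repository:

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

# First attempt: typos in both the file and the message.
echo "hello wrold" > greeting.txt
git add greeting.txt
git commit -q -m "Add greting"
count_before=$(git rev-list --count HEAD)

# Stage the fix and fold it into the same commit with a corrected message.
echo "hello world" > greeting.txt
git add greeting.txt
git commit -q --amend -m "Add greeting"

count_after=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
contents=$(git show HEAD:greeting.txt)
echo "$count_before -> $count_after commit(s): $subject"
```

Like a rebase, amending replaces the commit with a new one, so the same force-push caveats apply if it was already pushed.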
tldr - Use a commit GUI to select specific lines and files for a commit. Try to keep commits as tight in scope as possible.
When working on new features, it's easy to end up making lots of changes without really thinking about how the commits will look. In general, it's advisable to make frequent, narrowly scoped commits, but many people end up with some combination of the following:
It can be hard to craft a commit history that makes sense. Complicated features can necessitate trial and error and often involve churn. This is a natural part of the development cycle, but we can still try to organise the chaos.
One thing I do to improve the scope of my commits is to use a GUI to select specific lines or files to include. In IntelliJ, the commit GUI is available via Ctrl + K or VCS > Commit.
This allows me to (usually) work in a more unorganised way, and then think about the commit structure a little later. For bigger features, I may end up chopping it up into several commits. This is much better than a single, amorphous blob of changes where I can't easily describe what changed in the commit message.
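If you'd rather stay on the command line, the same idea can be approximated by staging files selectively. This is a sketch with invented file names. For picking individual lines, git add -p offers an interactive prompt, which is left out here so the example can run unattended.

```shell
set -eu
# Throwaway identity for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "Some commit"

# Two unrelated changes pile up in the working tree.
echo "retry three times" > retry.txt
echo "how to run the tests" > README.txt

# Stage and commit them separately so each commit has one purpose.
git add retry.txt
git commit -q -m "Add retry logic"
git add README.txt
git commit -q -m "Document how to run the tests"

first_files=$(git diff-tree --no-commit-id --name-only -r HEAD~1)
second_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "first commit:  $first_files"
echo "second commit: $second_files"
```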
tldr - Aim to make the subject line around 50 characters. Add detailed descriptions to the message body when possible. Wrap body lines at 72 characters. Use a commit GUI to help with these.
If you've been following up to now, my tips should hopefully have got you some of the way to writing better commit messages already. By structuring commits and their content better, we should find that we write more refined messages as a side effect.
Unfortunately, writing a good commit message is fairly subjective and could probably warrant its own blog post. Thankfully there's already some good advice out there. Chris Beams' fantastic post on How to Write a Git Commit Message is my usual recommendation. I've boiled it down to these main points:
I would highly recommend giving it a read as it goes into much more detail about the why. A decent commit GUI should help you with following the 72 character limit and wrapping lines correctly. In IntelliJ you can find these settings under Settings > Version Control > Commit > Commit message inspections.
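Without a GUI, a small script can flag the same length guidelines. The check_commit_message function below is a hypothetical helper written for this post, not part of Git. It treats the 50 and 72 character guidelines as hard limits, which is stricter than the 'around 50' advice, so adjust to taste.

```shell
set -eu

# check_commit_message <file>: prints one warning per over-long line and
# returns non-zero if any line breaks the length guidelines.
check_commit_message() {
  awk '
    NR == 1 && length($0) > 50 { print "subject is longer than 50 characters"; bad = 1 }
    NR > 1  && length($0) > 72 { print "line " NR " is longer than 72 characters"; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}

msg_dir=$(mktemp -d)
printf '%s\n' \
  "Allow config objects to extend other configs" \
  "" \
  "Explain the motivation for the change here, wrapped at 72 characters." \
  > "$msg_dir/good"
printf '%s\n' \
  "This subject line rambles on for far longer than the fifty character guideline" \
  > "$msg_dir/bad"

if check_commit_message "$msg_dir/good" > /dev/null; then good=pass; else good=fail; fi
if check_commit_message "$msg_dir/bad" > /dev/null; then bad=pass; else bad=fail; fi
echo "good message: $good, bad message: $bad"
```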
If you need convincing that commit messages are important, you should try to think about commits as documentation for your colleagues and future self to use. There's a bit of an upfront cost to writing it, but it can avoid many hours of wasted effort in trying to understand why a change was made. Peter Hutterer sums this up nicely:
Re-establishing the context of a piece of code is wasteful. We can’t avoid it completely, so our efforts should go to reducing it [as much] as possible. Commit messages can do exactly that and as a result, a commit message shows whether a developer is a good collaborator.
tldr - Use the Conventional Commits format. If your project allows it, you can use it to automate change logs and semantic version bumps based on commits.
Hopefully by now you should be equipped with the knowledge to author good commit messages, but what if we could take it even further, to the real endgame?
To make commit messages the best they can be, we have the Conventional Commits specification. It prescribes the exact format that a commit message should take. The format was originally adopted from the Angular project and looks like the following:
    feat(config)!: allow provided config object to extend other configs

    BREAKING CHANGE: `extends` key in config file is now used for extending other config files
The main characteristics are:
The subject line starts with a type such as feat (feature). Other types like test can also be used
Breaking changes are flagged with !: in the subject line
There is some wiggle room to change some of these parts, but ideally the format should be enforced by linting rules in a Git hook (commit-msg is the hook that receives the message, whereas pre-commit runs before the message is written).
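As a rough idea of what such a lint rule could look like, here is a hand-rolled check of the subject line. The is_conventional function and its list of allowed types are assumptions made for this sketch. Real projects would more likely use a dedicated linter and its own configuration.

```shell
set -eu

# is_conventional <subject>: succeeds if the subject looks like
# "type(optional-scope)!: description". The type list is an assumption.
is_conventional() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|docs|refactor|test|chore)(\([a-z0-9-]+\))?!?: .+'
}

# In a commit-msg hook, Git passes the path of the message file as $1,
# so the subject could be read with: head -n 1 "$1"
for subject in \
  "feat(config)!: allow provided config object to extend other configs" \
  "test: cover config merging" \
  "Fix typo in feature"
do
  if is_conventional "$subject"; then
    echo "ok:       $subject"
  else
    echo "rejected: $subject"
  fi
done
```

The first two subjects should be accepted and the last one rejected.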
By using this commit format, we can access tooling to automate change log generation and semantic version bumps for your project. For example, feat commits since the last release will automatically trigger a minor version bump. This is recommended when using tools like Lerna and makes version management fairly trivial across a monorepo.
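To give a feel for how such tooling might derive the bump, here is a simplified sketch. The next_bump function is hypothetical and is not how Lerna or any particular tool is implemented. It assumes the usual mapping from the Conventional Commits specification: breaking changes give a major bump, feat a minor bump and fix a patch bump.

```shell
set -eu

# next_bump: reads commit message lines on stdin and prints the largest
# bump they call for: major, minor, patch or none.
next_bump() {
  awk '
    /^[a-z]+(\([^)]*\))?!:/ || /^BREAKING CHANGE:/ { level = 3 }
    /^feat(\([^)]*\))?:/ { if (level < 2) level = 2 }
    /^fix(\([^)]*\))?:/  { if (level < 1) level = 1 }
    END {
      split("none patch minor major", names, " ")
      print names[level + 1]
    }
  '
}

bump_features=$(printf '%s\n' "fix: handle empty config" "feat: add retries" | next_bump)
bump_breaking=$(printf '%s\n' "feat(config)!: allow configs to extend others" | next_bump)
bump_docs=$(printf '%s\n' "docs: explain the release process" | next_bump)
echo "$bump_features $bump_breaking $bump_docs"
```

In a real repository the input could come from something like git log --format=%B over the commits since the last release tag.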
Conventional Commits are probably the most useful for projects that are consumed by others, e.g. libraries and frameworks, but I would still recommend using them for private ones too. As commit messages are treated as first-class citizens, it forces everyone to apply more discipline and hopefully create better commits.
Whilst Git is a complicated tool, you can easily become proficient simply by acquiring these two skills:
Both of these may take a bit of time as they require you to habitualise new techniques that may not be immediately obvious. I encourage you to persevere! As your commits start to improve, your colleagues and future self will massively thank you when they next have to look at the commit history.