Why and how to correctly amend GitHub pull requests

By , March 24, 2015 3:00 pm

Like many F/OSS developers, I’m a heavy user of GitHub, collaborating on many projects which use the typical “fork & pull” workflow based on pull requests. The GitHub documentation on pull requests covers this workflow fairly comprehensively, but there seems to be one area which is significantly lacking in detail: why and how to amend existing pull requests. The article simply says:

After your pull request is sent, any new commits pushed to your branch will automatically be added to the pull request. This is especially useful if you need to make more changes.

The problem is that this completely ignores the fact that there are often very good reasons for amending existing commits within the pull request, not just for adding new commits it.

Why amend an existing pull request?

A peer review cycle can potentially reveal many issues which make the pull request unready for merging, e.g.

  • typos
  • bugs in the proposed code changes
  • missing features in the proposed code changes
  • incomplete test coverage
  • incomplete documentation changes
  • style inconsistencies (including whitespace issues)
  • incorrect or incomplete commit messages
  • the commits violate the rule of one logical change per commit
  • some changes are outside the scope of the pull request

This is of course what makes peer review of pull requests so valuable: the problems can be addressed even before they hit the master branch, which helps maintain high quality in the development trunk. But then how do we address the issues?

How not to amend your existing pull requests

One approach is for the submitter fix the issues in their original local branch, close the pull request, and submit a new one. However in most cases this is a bad idea, because each time you do it, you create an “orphaned” abandoned pull request, and there is no automatic linking between the new revised one and the abandoned one, which makes it a lot harder to follow the history of the review process.

Another approach (the one apparently implied by the GitHub documentation mentioned above) is to append “fixup” commits to the series which fix issues introduced by commits earlier in the series. But that’s almost always a really bad idea, because again it violates the rule of one logical change per commit. In particular:

  • It makes life much harder for reviewers, because they have to review a combination of broken commits and subsequent commits which are supposed to fix that brokenness. So for each change, they have follow this process:

    1. Review it as normal.
    2. If they spot brokenness, refrain from immediately commenting on it, and instead look ahead for any subsequent commit in the series which may claim to fix that brokenness.
    3. If such a commit is found, verify that it really does fix the brokenness.
    4. If it does fix it, selectively ignore that brokenness, whilst taking care not to accidentally ignore other brokenness in the same area.

    This problem is significantly compounded each review cycle, i.e. each time fixup commits are appended to the pull request.

    The only workaround is to review via the “Files changed” tab, but that brings considerable disadvantages when the pull request contains more than one commit, because then the reviewer cannot review individual logical changes one at a time.

  • After merging the pull request, it pollutes the project’s history with useless noise: fixup commits which are not linked to the commits they are fixing.
  • It breaks git bisect, git revert, and git blame.

The right way to amend your existing pull requests

The good news is that GitHub actually supports a much better alternative which works really nicely. As a maintainer and reviewer of many GitHub repositories, I’ve been continually surprised how many people are unaware of this simple trick; perhaps it’s due to the hole in GitHub’s pull request documentation which I mentioned above? The trick is simply this:

  1. Rewrite the history of your local branch corresponding to the pull request, so that the issues revealed during the review of the pull request are addressed.
  2. Do a force push (i.e. a git push command which includes the --force-with-lease option, not -f / --force which is dangerous and should never be used).

I won’t go into details here about how to rewrite history. git history rewriting via git rebase (with fixup, squash), git commit --amend etc. is something of an art form, and there are already many great resources out there which there would be no value in me replicating.

So really this whole blog post is a long way of saying:

When you force-push a new head for a pull request branch, GitHub handles it gracefully and automatically updates the pull request in the way you want.

It’s almost like magic. For example, if a reviewer commented on two hunks in the original pull request, and your force-push effectively rewrites one of those hunks but not the other, then the GitHub pull request UI will get updated to say something like “@reviewer commented on an outdated diff” for the rewritten hunk, but any outstanding review comments on the changed hunk will be presented as before the force-push.

Now don’t get me wrong – GitHub is not perfect. In some (but not all) ways, it is inferior to Gerrit’s review system which is better at exposing the complete history of the review cycle. I previously wrote another blog post examining GitHub’s review system in more depth. But in general, it’s usually plenty good enough.

A footnote about git’s force-push

If you’ve previously been fed the mantra “git force-push is evil, don’t ever use it”, or the more specific version “don’t ever force-push to public branches” then hopefully you now realise neither are correct. There are many perfectly valid use cases for force push, otherwise the extremely clever developers who implemented it wouldn’t have bothered (and let’s face it, they know far more about git than you or I ever will). Ultimately the validity / safety of doing a force push to a public branch depends on the audience’s expectations for the stability of that branch. If people expect it to be stable and never get rewritten (only fast-forwarded), then you shouldn’t force push, or at least not without a decent effort to correct those people’s expectations. However there is generally no reason why people should expect GitHub pull request branches to be stable, because people generally don’t base other branches on top of them.

The article entitled “The Dark Side of the Force Push” may help further your understanding.

Taking over amendment of someone else’s pull requests

Note that the title of the previous sections referred to your existing pull requests. GitHub doesn’t let you amend anyone else’s pull requests, and that’s a good thing for obvious reasons. But what if one developer submits a pull request, vanishes off on holiday, and whilst they are away, it gets reviewed and issues are revealed?

In this case it’s easy to effectively take over the process. Simply create a local branch which points to the same commit as the head of the pull request. If you have the totally awesome hub CLI tool installed, then this is as simple as:

git checkout https://github.com/github/hub/pull/134

Then amend your branch as necessary according to the existing reviews, and submit a new pull request which references the old pull request it’s superceding, either by URL or by number. You will probably also want to arrange for the old pull request to be closed with a comment linking to its successor.

Share

13 Responses to “Why and how to correctly amend GitHub pull requests”

  1. Cornelius Schumacher says:

    I have a very different view on this. I was bitten by force-pushed changes to pull request just a few too many times. It makes it hard or sometimes even impossible to follow what actually happened. References will be missing or be wrong (e.g. the list of commits in the initial comment of the pull request). GitHub tries to make sense of it, but it simply can’t when the history is being rewritten.

    So I very much prefer to never change history, not even of transient pull requests. Pushing additional commits to a pull request actually can be quite nice, because as a reviewer you then see how what comments have been addressed. With a force push you basically have to always review the full change again.

    Usually it is a good idea to clean up after the review and create a polished version of the pull requests with squashed commits, fixed typos in commit messages, etc. I find it a nice way to simply create a new pull request with a cleaned up version of the commits and put in the number of the original pull request in the comment. This will magically create the cross-linking between the two requests.

    That is my favorite work flow. I’m sure there are seven or seventy more 😉

    • Adam says:

      I have a very different view on this.

      Actually I don’t think you do …

      I was bitten by force-pushed changes to pull request just a few too many times. It makes it hard or sometimes even impossible to follow what actually happened. References will be missing or be wrong (e.g. the list of commits in the initial comment of the pull request). GitHub tries to make sense of it, but it simply can’t when the history is being rewritten.

      Sure; did you read the paragraph where I wrote this, or the earlier blog post linked from it?

      Now don’t get me wrong – GitHub is not perfect. In some (but not all) ways, it is inferior to Gerrit’s review system which is better at exposing the complete history of the review cycle. I previously wrote another blog post examining GitHub’s review system in more depth. But in general, it’s usually plenty good enough.

      I already mentioned some of these issues there. However in practice I find it very rare that there are significant problems.

      So I very much prefer to never change history, not even of transient pull requests. Pushing additional commits to a pull request actually can be quite nice, because as a reviewer you then see how what comments have been addressed. With a force push you basically have to always review the full change again.

      As I already said, Gerrit handles this better (but Gerrit is horrible in other ways). But you have to admit there are also disadvantages to pushing additional commits, e.g. there is no way to view the effect of some squashed commits until they are actually squashed.

      Usually it is a good idea to clean up after the review and create a polished version of the pull requests with squashed commits, fixed typos in commit messages, etc. I find it a nice way to simply create a new pull request with a cleaned up version of the commits and put in the number of the original pull request in the comment. This will magically create the cross-linking between the two requests.

      That’s an interesting idea, which combats one of the biggest problems of the “push additional commits into the PR” approach, namely pollution of the history with a bunch of ugly fixup commits. However two PRs per topic also makes it quite a bit more heavyweight.

      That is my favorite work flow. I’m sure there are seven or seventy more 😉

      Exactly, and this flexibility is one of the joys of git (and GitHub).

  2. David Majda says:

    For me, the force-push workflow doesn’t work very well.

    Many times I came to a pull request which was after several rounds of reviews and updates in this style. Almost invariably, such a pull request was a big mess with commits on the main page scattered among comments which often lacked context (because they referred to code or comments that were no longer visible in the pull request). I could make sense of the whole thing only by reading a compete set of notification e-mails (if I had it), but there I often found broken links to obsolete diffs, so I still didn’t have the complete picture.

    My ideal solution would be simple: Detect force-pushes on GitHub side and create a new version of the pull request each time (with a completely blank slate — just the title, description, and new set of commits). Link these versions together using next/previous links or some index. Then I’d be perfectly happy because the history and context in which comments and changes happened would be completely preserved.

    As a side note, what you perceive as GitHub doing the right thing on history rewrites (hunks etc.) I always perceived as randomness. Sometimes comments were preserved, sometimes not, sometimes notification links worked after a force-push, sometimes they broke, etc. Sure, I knew there is some system behind this, but it never became obvious to me. And I know I’m not alone.

  3. Thomas Gagne says:

    I’m trying to figure out commits /after/ the pull request are included in the pull request. I thought (or was expecting) that a pull request was specific to a commit. I added another feature after my pull request that isn’t ready to be pulled, but it’s included in the original.

    Is there a way to create a pull request at a specific commit hash, or must they always be fore wherever HEAD is?

    • Thomas Gagne says:

      Ok. So I checked-out the commit. Switched to a new branch, pushed it to the remote, then created a new pull request from the branch.

      Crazy.

      • Adam says:

        Each pull request is tied to a unique branch in your fork, so yes if you want to pile more commits on top of an existing pull request and publish them without adding them to the pull request, you need to put them on a different branch.

  4. aplanas says:

    Actually I think that force push is not so wrong. I can understand why in long PR discussions some noise appears after several forced push. I partially resolve this creating a comment after the push where I describe the changes in the commit.

    In this way the final view is a ordered list of comments from the author of the PR that describe the evolution of the code, and the reviewers can detect easily (via notification) that this change is in place.

    For example:
    https://github.com/crowbar/barclamp-nova_dashboard/pull/216

    There are some noise, but basically you can see the evolution of the changes over time, without creating extra PR

  5. Aleksey Kladov says:

    It is safer to use a `–force-with-lease` flag then a simple `–force`.

  6. IMO you should fixup and Squash Merge. That way you have multiple commits to follow and see changes made, no references are lost and at the end you have a single commit

  7. Pictor13 says:

    I am not much experienced with pull requests, so I could be wrong but, what if:

    * the author copies the SHA of the commit where the PR is actually pointing (let’s say: A)
    * it resets the HEAD and fixes what’s needed (let’s say: B, C and D)
    * pushes with force the new head of the PR (D)
    * adds a comment in the PR (or directly in the commit message of B or D) referencing the reset commit (A)

    This way one can see a clean pull request, but can also make sense of what changed and what previous comments are referring to, because it could go to the A commit and check its history.

    I think it would be the cleaner approach.
    Just my ignorance doesn’t let me know if the A commit and its history (till the common ancestor with B/C/D) would still be kept available or if there is the risk that a prune will remove the A commit and its eventual history because of not being referenced by any branch or tag…

    • Adam says:

      You’re essentially proposing what the blog post proposes, with the addition of adding a comment referring to the PR branch’s old HEAD. However your concern about commit A being garbage-collected was spot on – this is indeed a serious problem which I documented in https://github.com/isaacs/github/issues/997 and other issues linked from that.

Leave a Reply

 

Panorama Theme by Themocracy