git branch auto-magic: git-splice, git-transplant, git-deps, and announcing git-explode!

June 14, 2018 11:00 pm

For the last few years I’ve been enjoying the luxury of SUSE’s generous HackWeek policy to work on building four tools supporting higher-level workflows on top of git. I’d already (quietly) announced three of them: git-splice, git-transplant, and git-deps (more details below). But I’m now excited to announce that I’ve just released the fourth: git-explode!

git-explode automatically explodes a large topic branch into a set of smaller, independent topic branches. It does this by harnessing git-deps to automatically detect inter-dependencies between commits in the large source branch and using that dependency tree to construct the smaller topic branches.

I recently presented all four tools at a Meetup of the Git London User Group, and thanks to the awesome services of the host venue Skills Matter, I’m delighted to announce that the talk is now online:

video of my talk on git auto-magic at the Git London User Group Meetup

If you don’t have time to watch the whole thing, you can look at the slides, or just keep on reading to see which ones you might be interested in. I’ll list them first, and then talk about the motivation for writing them.

Cloud rearrangement for fun and profit

May 17, 2015 4:42 am

In a populated compute cloud, there are several scenarios in which it’s beneficial to be able to rearrange VM guest instances into a different placement across the hypervisor hosts via migration (live or otherwise). These use cases typically fall into three categories:

  1. Rebalancing – spread the VMs evenly across as many physical VM host machines as possible (conceptually similar to vSphere DRS). Example use cases:
  2. Consolidation – condense VMs onto fewer physical VM host machines (conceptually similar to vSphere DPM). Typically involves some degree of defragmentation. Example use cases:
  3. Evacuation – free up physical servers:

Whilst one-shot manual or semi-automatic rearrangement can bring immediate benefits, the biggest wins often come when continual rearrangement is automated. The approaches can also be combined, e.g. first evacuate and/or consolidate, then rebalance on the remaining physical servers.

Other custom rearrangements may be required according to other IT- or business-driven policies, e.g. only rearrange VM instances relating to a specific workload, in order to increase locality of reference, reduce latency, respect availability zones, or facilitate other out-of-band workflows or policies (such as data privacy or other legalities).

In the rest of this post I will expand this topic in the context of OpenStack, talk about the computer science behind it, propose a possible way forward, and offer a working prototype in Python.

If you’re in Vancouver for the OpenStack summit which starts this Monday and you find this post interesting, ping me for a face-to-face chat!

managing your github notifications inbox with mutt

October 5, 2014 1:59 pm

Like many F/OSS developers, I’m a heavy user of GitHub. This means I interact with other developers via GitHub multiple times a day. GitHub has a very nice notifications system which lets me know when there has been some activity on a project I’m collaborating on.

I’m a fan of David Allen’s GTD (“Getting Things Done”) system, and in my experience I get the best results by minimising the number of inboxes I have to look at every day. So I use another great feature of GitHub, which is the ability to have notification emails delivered directly to your email inbox. This means I don’t have to keep checking in addition to my email inbox.

However, this means that I receive GitHub notifications in two places. Wouldn’t it be nice if when I read them in my email inbox, GitHub could somehow realise and mark them read at too, so that when I look there, I don’t end up getting reminded about notifications I’ve already seen in my inbox? Happily the folks at GitHub already thought of this too, and came up with a solution:

If you read a notification email, it’ll automatically be marked as read in the Notifications section. An invisible image is embedded in each mail message to enable this, which means that you must allow viewing images from in order for this feature to work.

But there’s a catch! Like many Linux geeks, I use mutt for reading and writing email. In fact, I’ve been using it since 1997 and I’m still waiting for another MUA to appear which is more powerful and lets me crunch through email faster. However, mutt is primarily text-based, which means that by default it doesn’t download images when displaying HTML-based email. Of course, it can. But do I want it to automatically open a new tab in my browser every time I encounter an HTML attachment? No! That would slow me down horribly. Even launching a terminal-based HTML viewer such as w3m or links or lynx would be too slow.

So I figured out a better solution. mutt has a nice message-hook feature where you can configure it to automatically execute mutt functions for any message matching specific criteria just before it displays the message. So we can use that to pipe the whole email to a script whenever a message is being read for the first time:

message-hook "(~N|~O) ~f" "push '<pipe-message>read-github-notification\n'"

(~N|~O) matches mails which have the N flag (meaning new unread email) or O (meaning old unread email) set.

The read-github-notification script reads the email on STDIN, extracts the URL of the 1-pixel read-notification beacon <img> embedded in the HTML attachment, and sends an HTTP request for that image, so that GitHub knows the notification has been read.

This means an extra delay of 0.5 seconds or so when viewing a notification email, but for me it’s a worthwhile sacrifice.

If you want to try it, simply download the script and stick it somewhere on your $PATH, and then add the above line to your ~/.muttrc file.
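For anyone curious what such a script involves, here is a rough Python sketch of the three steps described above: parse the mail, find the beacon image, fetch it. It is not the script linked from this post. The regular expression is an assumption, since the exact shape of the beacon URL isn't given here; it simply grabs the first <img> whose src mentions "github". The function names are made up for illustration.

```python
import email
import html
import re
import sys
import urllib.request
from email import policy

# Assumption: the beacon is an <img> whose src is an https URL containing
# "github". The real URL format is not stated in this post.
IMG_SRC = re.compile(
    r'<img[^>]+src=["\'](https://[^"\']*github[^"\']*)["\']', re.IGNORECASE
)


def find_beacon_url(raw_message):
    """Return the first GitHub-looking <img> URL in an HTML part, or None."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            match = IMG_SRC.search(part.get_content())
            if match:
                # Attribute values may contain entities such as &amp;
                return html.unescape(match.group(1))
    return None


def main():
    """Read one mail from STDIN and request its beacon image, if any."""
    url = find_beacon_url(sys.stdin.buffer.read())
    if url:
        # The response body is irrelevant; making the request is the point.
        urllib.request.urlopen(url, timeout=5).close()
```

To use something like this from the message-hook, the file would need to call main() when run as a script and be executable on your $PATH. The 5-second timeout is an arbitrary choice to stop mutt hanging if the network is down.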

UPDATE 2017/08/29: Unfortunately I have since discovered that the above solution breaks when you try to save one of these notifications before reading it. Suggestions on how to fix this are most welcome!


Email inboxes and the GTD 2-minute rule

March 20, 2014 12:46 am

Today’s dose of structured procrastination resulted in something I’ve been meaning to build for quite a while: a timer to help apply the two minute rule from David Allen’s famous GTD (Getting Things Done) system to the processing of a maildir-format email inbox.

Briefly, the idea is that when processing your inbox, for each email you have a maximum of two minutes to either:

  • perform any actions required by that email, or
  • add any such actions to your TODO list, and move the email out of the inbox. (IMHO, best practice is to move it to an archive folder and have a system for rapid retrieval of the email via the TODO list item, e.g. via a hyperlink which will retrieve the email based on its Message-Id: header, using an indexing mail search engine.)

However, I find that I frequently exhibit the bad habit of fidgeting with my inbox – in other words, checking it frequently to satisfy my curiosity about what new mail is there, without actually taking any useful action according to the processing (a.k.a. clarification) step. This isn’t just a waste of time; it also increases my stress levels by making me aware of things I need to do whilst miserably failing to help get them done.

Another bad habit I have is mixing up the processing/clarification phase with the organizing and doing phases – in other words, I look at an email in my inbox, realise that it requires me to perform some actions, and then I immediately launch right into the actions without any thought as to how urgent they are or how long they will take. This is another great way of increasing stress levels when they are not urgent and could take a long time, because at least subconsciously I’m usually aware that this is just another form of procrastination.

So today I wrote this simple Ruby timer program which constantly monitors the number of emails in the given maildir-formatted folder, and shows you how much of the two minutes you have left to process the item you are currently looking at. Here’s a snippet of the output:

1:23    24 mails
1:22    24 mails
1:21    24 mails
1:20    24 mails
Processed 1 mail in 41s!
Average velocity now 57s per mail
At this rate you will hit your target of 0 mails in 21m 55s, at 2014-03-19 23:18:59 +0000
2:00    23 mails
1:59    23 mails
1:58    23 mails
1:57    23 mails

You can see that each time you process mail and remove it from the email folder, it resets the counter back to two minutes. If you exceed the two minute budget, it will start beeping annoyingly, to prod you back into adherence to the rule.

So for example if you have 30 mails in your inbox, using this timer it should take you an absolute maximum of one hour to process them all (“process” in the sense defined within David Allen’s GTD system, not to complete all associated tasks).
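The logic behind the countdown and the estimate is simple enough to sketch. The following is a Python illustration of the idea, not the Ruby program itself; all the function names are invented for this example, and it assumes a standard maildir layout where messages live in the new/ and cur/ subdirectories.

```python
import os
import time


def count_mails(maildir):
    """Count messages in the new/ and cur/ subdirectories of a maildir."""
    total = 0
    for sub in ("new", "cur"):
        path = os.path.join(maildir, sub)
        if os.path.isdir(path):
            total += len(os.listdir(path))
    return total


def format_clock(seconds):
    """Render a number of seconds as M:SS, clamping negatives to 0:00."""
    minutes, secs = divmod(max(int(seconds), 0), 60)
    return f"{minutes}:{secs:02d}"


def seconds_to_empty(processed, elapsed, remaining):
    """Estimate the time to clear `remaining` mails at the average pace so far."""
    if processed == 0:
        return None  # no velocity to extrapolate from yet
    return elapsed / processed * remaining


def run(maildir, budget=120):
    """Print a countdown that resets whenever a mail leaves the folder."""
    started = item_started = time.monotonic()
    last = count_mails(maildir)
    processed = 0
    while last > 0:
        time.sleep(1)
        now = time.monotonic()
        current = count_mails(maildir)
        if current < last:
            processed += last - current
            eta = seconds_to_empty(processed, now - started, current)
            print(f"Processed {processed} so far; roughly {format_clock(eta)} to go")
            item_started = now  # fresh two-minute budget for the next mail
        last = current
        left = budget - (now - item_started)
        bell = "\a" if left < 0 else ""  # terminal bell once over budget
        print(f"{format_clock(left)}    {current} mails{bell}")
```

Calling run() with the path to an inbox starts the loop. The worst case quoted above falls straight out of the budget: 30 mails at 120 seconds each is 3600 seconds, i.e. one hour.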

Since gamification seems to be the new hip buzzword on the block, I should mention I’m already enjoying the fact that this turns the mundane chore of churning through an inbox into something of a fun game – seeing how quickly I can get through everything. And I already have an item on the TODO list for collecting statistics about each “run”, so that I can see stuff like:

  • on average how many emails I process daily
  • how often I process email
  • on average how many emails I process during each “sitting”
  • how much time I spend processing email
  • whether I’m getting faster over time

I also really like being able to see an estimate of the remaining time – I expect this will really help me decide whether I should be processing or doing. E.g. if I have deadlines looming and I know it’s going to take two hours to process my inbox, I’m more likely to consciously decide to ignore it until the work for my deadline is complete.

Other TODO items include improving the interface to give a nice big timer and/or progress bar, and the option of a GTK interface or similar. Pull requests are of course very welcome 😉

For mutt users, this approach can work nicely in conjunction with a trick which helps focus on a single mail thread at a time.

Hope that was useful or at least interesting. If you end up using this hack, I’d love to hear about it!


Easier upstreaming / back-porting of patch series with git

September 19, 2013 9:22 pm

Have you ever needed to port a selection of commits from one git branch to another, but without doing a full merge? This is a common challenge, e.g.

  • forward-porting / upstreaming bugfixes from a stable release branch to a development branch, or
  • back-porting features from a development branch to a stable release branch.

Of course, git already goes quite some way to making this possible:

  • git cherry-pick can port individual commits, or even a range of commits (since git 1.7.2) from anywhere, into the current branch.
  • git cherry can compare a branch with its upstream branch and find which commits have been upstreamed and which haven’t. This command is particularly clever because, thanks to git patch-id, it can correctly spot when a commit has been upstreamed, even when the upstreaming process resulted in changes to the commit message, line numbers, or whitespace.
  • git rebase --onto can transplant a contiguous series of commits onto another branch.

It’s not always that easy …

However, on the occasions when you need to sift through a larger number of commits on one branch, and port them to another branch, complications can arise:

  • If cherry-picking a commit results in changes to its patch context, git patch-id will return a different SHA-1, and subsequent invocations of git cherry will incorrectly tell you that you haven’t yet ported that commit.
  • If you mess something up in the middle of a git rebase, recovery can be awkward, and git rebase --abort will land you back at square one, undoing a lot of your hard work.
  • If the porting process is big enough, it could take days or even weeks, so you need some way of reliably tracking which commits have already been ported and which still need porting. In this case you may well want to adopt a divide-and-conquer approach by sharing out the porting workload between team-mates.
  • The more the two branches have diverged, the more likely it is that conflicts will be encountered during cherry-picking.
  • There may be commits within the range you are looking at which, after reviewing, you decide should be excluded from the port, or whose porting at least needs to be postponed to a later point.
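The first complication in the list above is easier to see with a toy model. The Python sketch below is not git's actual patch-id algorithm, and toy_patch_id is a made-up name. It only mimics the documented behaviour: hash the diff while ignoring hunk line numbers and whitespace. Context lines still feed into the hash, which is why a changed context yields a different ID.

```python
import hashlib


def toy_patch_id(diff_text):
    """SHA-1 of a diff, ignoring hunk line numbers and all whitespace.

    A deliberately simplified imitation of the idea behind git patch-id,
    not a reimplementation of it.
    """
    digest = hashlib.sha1()
    for line in diff_text.splitlines():
        if line.startswith(("diff --git", "index ")):
            continue  # per-file metadata, not part of the change itself
        if line.startswith("@@"):
            continue  # hunk header: only carries line numbers
        # Drop all whitespace so re-indentation does not alter the hash.
        digest.update(("".join(line.split()) + "\n").encode())
    return digest.hexdigest()


original = """--- a/greet.py
+++ b/greet.py
@@ -10,3 +10,3 @@
 def greet(name):
-    print("hello", name)
+    print("Hello,", name)
     return name
"""

# Same change, but the hunk sits at different line numbers after porting.
moved = original.replace("@@ -10,3 +10,3 @@", "@@ -42,3 +45,3 @@")

# Same change, but a neighbouring context line differs on the target branch.
context_changed = original.replace(
    " def greet(name):", " def greet(name, punctuation):"
)

print(toy_patch_id(original) == toy_patch_id(moved))
print(toy_patch_id(original) == toy_patch_id(context_changed))
```

The first comparison holds and the second does not. That is the situation in which git cherry stops recognising a commit as already ported.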

It could be argued that all of these problems can be avoided with the right branch and release management workflows, and I don’t want to debate that in this post. However, this is the real world, and sometimes it just happens that you have to deal with a porting task which is less than trivial. Well, that happened to me and my team not so long ago, so I’m here to tell you that I have written and published some tools to solve these problems. If that’s of interest, then read on!

