Structured Procrastination work Archives – Structured Procrastination

Improving trust in the cloud with OpenStack and AMD SEV

By Adam, September 13, 2019 1:00 pm

This post contains an exciting announcement, but first I need to provide some context!

Ever heard that joke “the cloud is just someone else’s computer”?

Coffee mug saying "There is no cloud. It's just someone else's computer"

Of course it’s a gross over-simplification, but there’s more than a grain of truth in it. And that raises the question: if your applications are running in someone else’s data-centre, how can you trust that they’re not being snooped upon, or worse, invasively tampered with?

Until recently, the answer was “you can’t”. Well, that’s another over-simplification. You could design your workload to be tamperproof; for example even if individual mining nodes in Bitcoin or Ethereum are compromised, the blockchain as a whole will resist the attack just fine. But there’s still the snooping problem.

Hardware to the rescue?

However, there’s some good news on this front. Intel and AMD realised this was a problem, and have both introduced new hardware capabilities to help improve the level to which cloud users can trust the environment in which their workloads are executed, e.g.:

AMD SEV (Secure Encrypted Virtualization) which can encrypt the memory of a running VM with a key which is only accessible to the owner of that VM. This is done on-chip so that even if you have physical access to the machine, it makes it a lot harder to snoop in on the running VM¹.

It can also provide the guest owner with an attestation which cryptographically proves that the memory was encrypted correctly and can only be decrypted by the owner.
Intel MKTME (Multi-Key Total Memory Encryption) which is a similar approach.

But even with that hardware support, there is the question to what degree anyone can trust public clouds run on proprietary technology. There is a growing awareness that Free (Libre) / Open Source Software tends to be inherently more secure and trustworthy, since its transparency enables unlimited peer review, and its openness allows anyone to contribute improvements.

And these days, OpenStack is pretty much the undisputed king of the Open Source cloud infrastructure world.

An exciting announcement

So I’m delighted to be able to announce a significant step forward in trustworthy cloud computing: as of this week, OpenStack is now able to launch VMs with SEV enabled! (Given the appropriate AMD hardware, of course.)

The core functionality is all merged and will be in the imminent Train release. You can read the documentation, and you will also find it mentioned in the Nova Release Notes.

While this is “only” an MVP and far from the end of the journey (see below), it’s an important milestone in a strong partnership between my employer SUSE and AMD. We started work on adding SEV support into OpenStack around a year ago:

This resulted in one of the most in-depth technical specification documentations I’ve ever had to write, plus many months of intense collaboration on the code and several changes in design along the way.

I’d like to thank not only my colleagues at SUSE and AMD for all their work so far, but also many members of the upstream OpenStack community, especially the Nova team. In particular I enjoyed fantastic support from the PTL (Project Technical Lead) Eric Fried, and several developers at Red Hat, which I think speaks volumes to how well the “coopetition” model works in the Open Source world.

The rest of this post gives a quick tour of the implementation via screenshots and brief explanations, and then concludes with what’s planned next.

Continue reading 'Improving trust in the cloud with OpenStack and AMD SEV'»

front page, geek, work | AMD, architecture, cloud, coopetition, OpenStack, SEV

git branch auto-magic: git-splice, git-transplant, git-deps, and announcing git-explode!

Comments (10)

By Adam, June 14, 2018 11:00 pm

For the last few years I’ve been enjoying the luxury of SUSE’s generous HackWeek policy to work on building four tools supporting higher-level workflows on top of git. I’d already (quietly) announced three of them: git-splice, git-transplant, git-deps (more details below). But I’m now excited to announce that I’ve just released the fourth: git-explode !

git-explode automatically explodes a large topic branch into a set of smaller, independent topic branches. It does this by harnessing git-deps to automatically detect inter-dependencies between commits in the large source branch and using that dependency tree to construct the smaller topic branches.

I recently presented all four tools at a Meetup of the Git London User Group, and thanks to the awesome services of the host venue Skills Matter, I’m delighted to announce that the talk is now online:

If you don’t have time to watch the whole thing, you can look at the slides, or just keep on reading to see which ones you might be interested in. I’ll list them first, and then talk about the motivation for writing them.

Continue reading 'git branch auto-magic: git-splice, git-transplant, git-deps, and announcing git-explode!'»

front page, geek, hacks, work | backporting, branches, development, forward-porting, Free Software, git, hacking, software, upstreaming, version control, workflow

Report from the OpenStack PTG in Dublin

Comments (1)

By Adam, March 9, 2018 7:30 pm

Last week I attended OpenStack’s PTG (Project Teams Gathering) in Dublin. This event happens every 6 months in a different city, and is a fantastic opportunity for OpenStack developers and upstream contributors to get together and turbo-charge the next phase of collaboration.

I wrote a private report for my SUSE colleagues summarising my experience, but then Colleen posted her report publicly, which made me realise that it would be far more in keeping with OpenStack’s Four Opens to publish mine online. So here it is!

Continue reading 'Report from the OpenStack PTG in Dublin'»

front page, geek, travel, work | architecture, cloud, development, OpenStack

Squash-merging and other problems with GitHub

Comments (3)

By Adam, August 16, 2017 6:45 pm

(Thanks to Ben North, Colleen Murphy, and Nicolas Bock for reviewing earlier drafts of this post.)

In April 2016, GitHub announced a new feature supporting squashing of multiple commits in a PR at merge-time (announced on April 1st, but it was actually bona-fide 😉 ).

I appreciate that there was high demand for this feature (and similarly on GitLab), and apparently that many projects have a “squash before submitting a PR” policy, but I’d like to contend that this is a poor-man’s workaround for the lack of a real solution to the underlying problems.

Why squash-merge?

So what are the underlying problems which made this such a frequently requested feature? From reading the various links above, it seems that by far the biggest motivator is that people frequently submit pull requests (or merge requests, in GitLab-speak) which contain multiple commits, and these commits are seen as too “noisy” / fine-grained. In other words there is a desire to not pollute the target/trunk branch (e.g. master) with these fine-grained commits, and instead only have larger, less fine-grained commits merged.

But where does this desire come from? Well, if the fine-grained commits which accumulate on a PR branch are frequently amendments to earlier commits in the same PR (like “oops, fix typo I just made” or “oops, fix bug I just introduced”) then this desire is entirely understandable, because noone wants to see that kind of mess on master. However the real problem here is that that kind of mess should have never made it onto GitHub in the first place – not even onto a PR branch! It should have instead been fixed in the developer’s local repository. That is why there is a whole section in the “Pro Git” book dedicated to explaining how to rewrite local history, and why git-commit(1) and git-rebase(1) have native support for creating and squashing “fixup” commits into commits which they fix.

Use the force-push, Luke

If an existing PR needs to be amended, make the change and then rewrite local history so that it’s clean. The new version of the branch can then be force-pushed to GitHub via git push -f, which is an operation GitHub understands and in many situations handles reasonably gracefully. I have previously blogged about why this way is better, but one way of quickly summarising it is: don’t wash your dirty linen in public any more than you have to.

Continue reading 'Squash-merging and other problems with GitHub'»

front page, geek, work | development, git, github, hacking

Cloud rearrangement for fun and profit

Comments (1)

By Adam, May 17, 2015 4:42 am

In a populated compute cloud, there are several scenarios in which it’s beneficial to be able to rearrange VM guest instances into a different placement across the hypervisor hosts via migration (live or otherwise). These use cases typically fall into three categories:

Rebalancing – spread the VMs evenly across as many physical VM host machines as possible (conceptually similar to vSphere DRS). Example use cases:
- Optimise workloads for performance, by reducing CPU / I/O hotspots.
- Maximise headroom on each physical machine.
- Reduce thermal hotspots in order to reduce power consumption; for example, back in 2003, HP showed that intelligent workload placement can reduce energy consumption by more than 14%. (Here’s an old slidedeck I made years ago when researching this use case.)
Consolidation – condense VMs onto fewer physical VM host machines (conceptually similar to vSphere DPM). Typically involves some degree of defragmentation. Example use cases:
- Increase CPU / RAM / I/O utilization. CERN blogged last summer about this Tetris-like challenge, and it can be taken even further over-committing CPU/RAM and/or memory page sharing.
- Free up physical servers to reduce power consumption.
Evacuation – free up physical servers:
- for repurposing or decommissioning
- for maintenance (BIOS upgrades, re-cabling etc.) or repair
- to protect SLAs, e.g. when monitors indicate potential imminent hardware failure, or when HVAC failures are likely to cause servers to shutdown due to over-heating. (This is different to the somewhat confusingly named nova evacuate functionality which moves VMs to a new host after the server has already failed.)

Whilst one-shot manual or semi-automatic rearrangement can bring immediate benefits, the biggest wins often come when continual rearrangement is automated. The approaches can also be combined, e.g. first evacuate and/or consolidate, then rebalance on the remaining physical servers.

Other custom rearrangements may be required according to other IT- or business-driven policies, e.g. only rearrange VM instances relating to a specific workload, in order to increase locality of reference, reduce latency, respect availability zones, or facilitate other out-of-band workflows or policies (such as data privacy or other legalities).

In the rest of this post I will expand this topic in the context of OpenStack, talk about the computer science behind it, propose a possible way forward, and offer a working prototype in Python.

If you’re in Vancouver for the OpenStack summit which starts this Monday and you find this post interesting, ping me for a face-to-face chat!

Continue reading 'Cloud rearrangement for fun and profit'»

front page, geek, hacks, work | cloud, DPM, DRS, Free Software, migration, OpenStack, placement, software, virtualization, VMware, vSphere, workloads