The Martian

"I'm stranded on Mars. I have no way to communicate with Hermes or Earth. Everyone thinks I'm dead. I'm in a Hab designed to last 31 days. If the Oxygenator breaks down, I'll suffocate. If the Water Reclaimer breaks down, I'll die of thirst. If the Hab breaches, I'll just kind of explode. If none of those things happen, I'll eventually run out of food and starve to death. So yeah. I'm fucked." - Mark Watney

The Martian is one of my favorite books in a long while. I've always felt that the "man vs. environment" theme is far underexplored in sci-fi. Space is deadly. Most of the universe is completely hostile to life. And yet when major motion pictures do Mars movies, they invent killer robots to manufacture suspense.

The Martian is a straight-up hard sci-fi book about being stranded on Mars and surviving. It's got a great mix of problem solving, the unexpected, and a wisecracking protagonist. Every challenge he has to overcome is completely realistic. No crazy deus ex machina to inject suspense where there is none. If you like hard sci-fi, you'll love this book.

And it's being turned into a major motion picture this October, hopefully landing before our drive-in closes for the year. So if books aren't your thing, you could wait for the movie. But you should really read the book. It's a lot of fun.

Neil Gaiman: How Stories Last - The Long Now

“Stories,” Gaiman said, “teach us how the world is put together and the rules of living in the world, and they come in an attractive enough package that we take pleasure from them and want to help them propagate.” Northwest coast native Americans have a tale about a beautiful woman and young man whose forbidden love was punished by the earth shaking, and black ash on snow, and finally fire coming from a mountain, killing many people. It stopped only when the beautiful woman was thrown into the burning mountain.

That is important information-- solid-seeming mountains can suddenly erupt, and early warnings of that are earthquakes and ash. As pure information it won’t last beyond three generations. But add in beauty and forbidden love and tragic death, and the story will be told as long as people live in the mountains.

Source: Neil Gaiman: How Stories Last - The Long Now

Neil Gaiman gave the latest Seminar About Long-term Thinking, with audio available to all and video available to Long Now members. Two years in the making, this is a story about stories, and how we have stories that date back 5,000 years.

I think my favorite moment was his explanation that stories are lies. When you say "Once upon a time", it's code for "I'm going to lie to you now". And when you say "this happened to a friend of mine", it's code for "I'm going to lie to you now, but I think there is a chance this might be true". But in those lies we layer elements of truth that endure, even as the stories adapt to the modern age.

As with all Long Now talks, it comes in at over an hour and a half of content, but it's well worth your time.

Pluto is Red

Pluto is a completely different colour from the one we thought it was, according to new images that also show the huge heart that seems to be carved into its side.

Source: Pluto is red: New Horizons images throw out previous understanding of dwarf planet - News - Gadgets and Tech - The Independent

It's going to take me a long time to adjust my mental model. Pluto is blue in my head, probably from some bit of pop fiction sometime in the past.

It's going to take 18 months for all the data about Pluto to trickle back, so even though the flyby is just under 3 days away, we're going to be getting new information about our favorite dwarf planet all through the next year.

Drones Stymie Rhino Poachers

Poaching is a threat to the survival of rhinos worldwide, and anti-poaching efforts have always been one step behind. Now, park rangers in South Africa have a leg up. John Petersen from the Air Shepherd program tells host Steve Curwood how the power of predictive analytics combined with drone technology could help to rescue the rhinos.

Source: Living on Earth: Drones Stymie Rhino Poachers

Very cool effort to repurpose predictive analytics systems that were designed to find roadside bombs: figure out where poachers are likely to be, then fly drones to find them. Initial results are really promising. No rhinos were taken in the protected area during the 6-month trial, down from 12 to 14 a month previously.

To Build a Better Keyboard

I can't tell you how excited I am that Jesse and Kaia's Kickstarter launched yesterday and met its funding goal in the first 12 hours. A couple of years ago I caught up with Jesse at reunion and heard about the beginnings of this project. He was experimenting with building his own keyboard, and wasn't really sure where the project was headed. He commented that it was interesting that every other part of Doug Engelbart's Mother of All Demos has become a normal part of our technology landscape, except his chorded keyboard. Maybe we were missing something. Maybe there were good ideas about keyboards that we just left on the drawing board.

Over the last couple of years he and Kaia went on this journey to build a better keyboard: moving out to the West Coast, doing a 4-month stint at a hardware incubator, making many trips to Shenzhen. All the interim steps have been amazing to watch. And the thing they created at the end of the day just looks outstanding.

They are now running a Kickstarter to fund the production of their first consumer unit, the Model 01. It just looks amazing. If you spend 8 hours a day at a computer, you owe it to yourself to take a look.

Python Design Patterns

A friend pointed me to this talk by Brandon Rhodes on Python design patterns, from PyOhio a couple of years ago.

The talk asks an interesting question: why aren't design patterns seen and talked about much in the Python community? He walks through the patterns in Design Patterns: Elements of Reusable Object-Oriented Software one by one, and points out which ones are features of the language itself, which are used in the standard library, and which are still genuinely applicable, all with nice, small code examples.
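
To give a flavor of the first category, here's my own minimal sketch (not code from the talk): the Iterator pattern in Python is simply part of the language, via the iterator protocol and the for loop.

    class Countdown:
        """The classic Iterator participant: the aggregate knows how to traverse itself."""

        def __init__(self, start):
            self.start = start

        def __iter__(self):
            # A generator satisfies the iterator protocol for us.
            n = self.start
            while n > 0:
                yield n
                n -= 1

    # The "pattern" disappears into ordinary language syntax:
    for value in Countdown(3):
        print(value)  # 3, 2, 1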

The thing that got me thinking, though, was a comment he makes at both the beginning and the end of the talk: the reason you don't see these patterns in Python is that Python developers tend not to write the kind of software where they're needed. They focus on small tools that connect other components, or that live within a framework.

I'm a newcomer to the community; I've only been doing Python full time for a few years, on OpenStack, so I can't be sure whether or not that's true. However, there are times when I'm surprised by things I would have expected to be solved in the language already, or by incompatibilities in the Python 2 to 3 transition that didn't need to be there, and I wonder whether these come from the community not having a ton of experience with large, long-lived code bases and the kinds of deprecation and upgrade guarantees they need.

The Nova API in Kilo and Beyond

Over the past couple of years we've been trying to find a path forward with the Nova API. The Nova v2.0 API defined a very small core set of interfaces that were basically unchangeable, copied from Rackspace early in the history of the project. We then added a way to extend it with extensions, both in the upstream community and as vendor extensions out of tree.

This created a user experience that was... suboptimal. We had 80+ extensions in tree, of varying quality and documentation. Example: floating IPs were an extension (officially outside the core API), but were used so extensively that they became a de facto part of the core API. Except, of course, they weren't, so people trying to use the API had a few things they could count on and a ton of things that might or might not be there.

This was a disaster for the promise of interoperable clouds.

Even figuring out what a cloud could do was pretty terrible. You could approximate it by listing the API's extensions, then writing a bunch of logic in your code to work out which extensions turned certain features on or off, or added new data to payloads.
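
A rough sketch of what that probing looked like in practice (my own illustration; the extensions resource path and the os-floating-ips alias are assumptions for the example, not a supported recipe):

    import requests

    def cloud_capabilities(compute_url, token):
        """Return the set of extension aliases a cloud advertises."""
        resp = requests.get(
            compute_url + "/extensions",
            headers={"X-Auth-Token": token},
        )
        resp.raise_for_status()
        return {ext["alias"] for ext in resp.json()["extensions"]}

    def can_use_floating_ips(compute_url, token):
        # Application code ends up as a pile of conditionals like this one.
        return "os-floating-ips" in cloud_capabilities(compute_url, token)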

We took a very long journey to come up with a better way, with lots of wrong turns and dead ends. That's a story for another day. Today I'd just like to explain where we got to, and why.

A User-First Perspective

Let's take a step back and think about some stakeholders involved in OpenStack, and what they need and want out of it.

Jackson the Absent

Last year Jackson wrote an application that works against OpenStack; it's been deployed in production and is part of the workflow at his organization.

Jackson has decided to leave tech and take up goat farming. His application should continue to work without changes even after the OpenStack cloud it's running against has been upgraded multiple times.

Emma the Active

Emma is the kind of person who loves new features and eagerly awaits her OpenStack cloud upgrades to see the new things she can do. She should be able to have her application work fine across upgrades, and then be able to modify it to take advantage of new features exposed in the API.

Sophia the Multi-Cloud Integrator

Sophia has an application that spans multiple OpenStack clouds run by different organizations. These clouds will be at different versions of OpenStack, and thus will expose different API features.

Sophia needs to be able to talk to all these clouds, know what features they expose, and have a single program that can talk to them all simultaneously.

(Note: this is what our own nodepool, which runs all the OpenStack upstream tests, does.)

Aiden the Cloud Operator

Aiden knows he has a lot of users of the OpenStack API at his location. He'd really like to know who the Jacksons and Emmas of the world are, so that he can keep an eye on whether it will ever be safe, far in the future, to disable really old features in his cloud.

Olivia the Contributor

Olivia wants to get her feature added to Nova which needs exposing through the API. She'd like to be able to get that landed during a release, and not have to wait 3 years for an eventual API rewrite.

Requirements

Considering the needs of these various users helps to determine the key requirements for an API for something like OpenStack: an Open Source project that's deployed at many different companies and in many different environments, with possibly years of difference between the versions of code deployed at any of these locations.

  • The API should be as common as possible between deploys. Every optional feature is a feature that can't be depended on by an application writer. Or worse: if it's used, it's a lock-in to that cloud. That means software has to be rewritten for every cloud, or written with a horrible kluge layer.
  • It needs to be really clear exactly what a particular cloud's API supports.
  • Older applications must not be broken by new features, or need to be rewritten after their OpenStack cloud is upgraded.
  • We have to have a way to get new features out in a timely basis.
  • We have to be able to evolve the API one piece at a time, as the Nova API is sufficiently large that a major version bump is no longer possible (we learned this the hard way).

The Backwards Compatibility Fallacy

Nearly every conversation with a developer around this issue starts with "why can't we just add more data to structures in backwards-compatible ways?" That is how service providers like Amazon, Meetup, and others work.

The problem is we aren't a proprietary company with one revision of our API stack in the wild at a time. We are an Open Source project with thousands of deployments all over the world, all on different code revisions and upgrade cadences.

Let's play out the exercise where we assume additive changes are good enough.

A great example exists right now: IPv6 filtering on server lists. The Nova client erroneously says it's supported. It's not; the parameter is actually completely ignored on the server. The suggestion is that adding it would be a backwards-compatible change, so we should just do it, and we don't need to signal to the user that the API changed.

However, that assumes time is monotonically moving forward; it's not. Sophia might run across a cloud that has this version of the code and realize she can filter by IPv6 address. Great! She writes her code to use and depend on that feature.

Then she runs her code against another cloud, which runs a version of Nova that predates this change. She's now effectively gone back in time. Her code returns thousands of records instead of one, and she's terribly confused why. She also has no way to figure out whether random cloud Z is going to support this feature or not. So the only safe thing to do is to implement the filtering client side instead, which means the server-side filtering actually gained her very little. It's not something she can ever determine will work ahead of time. It's an API that is untrustworthy, so it's best avoided.
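
Her client-side fallback ends up looking roughly like the sketch below (my illustration; the exact layout of the addresses payload is an assumption here): fetch everything, then filter locally, because you can't trust that server-side IPv6 filtering exists on a given cloud.

    import requests

    def servers_with_ipv6(compute_url, token, wanted_addr):
        """Fetch all servers and filter by IPv6 address on the client side."""
        resp = requests.get(
            compute_url + "/servers/detail",
            headers={"X-Auth-Token": token},
        )
        resp.raise_for_status()
        matches = []
        for server in resp.json()["servers"]:
            for addrs in server.get("addresses", {}).values():
                if any(a.get("version") == 6 and a.get("addr") == wanted_addr
                       for a in addrs):
                    matches.append(server)
                    break
        return matches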

We've seen this problem before on the World Wide Web. JavaScript APIs in browsers differ, and while parts of them converge towards standardization over time, you end up writing a lot of shim compatibility layers to take advantage of new features without breaking on old implementations. This is the way of decentralized API evolution and deployment. Once you get past one implementation controlled by a single organization, you have to deal with it.

API Microversions

There are some amazing things in the HTTP specification, really great ideas that I'm amazed were thought through back in the early days of the web. One of them is Content Negotiation. A resource is addressed by a URL (Uniform Resource Locator), but that resource might be available in multiple representations: there might be a text version, an HTML version, and a PDF version. The HTTP spec provides a header that lets you tell the server what kind of representation you would like for your resource. The server can say "that's not possible", and then you try again with something different, but it gives you as the client a lot of control over what you are going to get.

What if APIs worked like that? It's always a server, but I'd really like the 2.253 representation of it, which has some fields that are really handy for me.

Microversions are like Content Negotiation for the API.

Like Content Negotiation, the requested microversion is passed as an HTTP header. Unlike Content Negotiation, we don't support ranges, as the complexity for client programming gets out of control. Like Content Negotiation, if nothing is provided, we do a sane thing: serve the minimum supported version.
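
On the wire the two look very similar. A minimal sketch of the analogy, assuming X-OpenStack-Nova-API-Version as the Nova microversion header:

    import requests

    def get_report_as_pdf(url):
        # HTTP Content Negotiation: ask for a particular representation of a resource.
        return requests.get(url, headers={"Accept": "application/pdf"})

    def list_servers(compute_url, token, microversion="2.3"):
        # Nova microversions: ask for a particular version of the compute API.
        # Omit the version header and the server falls back to the minimum it supports.
        return requests.get(
            compute_url + "/servers",
            headers={
                "X-Auth-Token": token,
                "X-OpenStack-Nova-API-Version": microversion,
            },
        )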

Nova v2.1

Nova v2.1 is a new, cleaner backend implementation of the Nova v2.0 API. The one thing it adds is consistent input validation, so we catch bad requests at the API layer and return a sane error to the user. This is much more straightforward than our old model of trying to translate a stack trace (possibly triggered by a database violation) into a meaningful error message.
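
The validation idea, very roughly (an illustrative sketch using the jsonschema library with a made-up schema, not Nova's actual one): check the request body against a schema at the API boundary and turn violations into a clear 400, instead of letting a bad value travel down to the database.

    from jsonschema import ValidationError, validate

    # Illustrative schema only, made up for this sketch.
    server_create_schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string", "minLength": 1, "maxLength": 255},
            "flavorRef": {"type": "string"},
        },
        "required": ["name", "flavorRef"],
        "additionalProperties": False,
    }

    def create_server(body):
        try:
            validate(body, server_create_schema)
        except ValidationError as err:
            # Bad input never makes it past the API layer.
            return 400, {"badRequest": {"message": err.message}}
        return 202, {"server": {"name": body["name"]}}  # hand off to the real work

    print(create_server({"flavorRef": "small"}))  # 400, because 'name' is required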

Applications that work on v2.0 can be pointed to v2.1, and will just work. It should be transparent enough to the application authors that they'll never notice the transition.

And onto this Nova v2.1 API endpoint, we start adding microversion features. If you want features in the 2.3 microversion, you specify that fact in your header. Then you'll get the v2.3 versions of all the resources.

If you specify nothing, you get the minimum supported version, which rolls back to v2.1, which is the same as the v2.0 API. So all existing applications just work without doing anything. Only when an application wants to opt into new features does it need to make changes.
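
Discovering what a given cloud supports before opting in looks something like this (a sketch; the field names in the version document, like min_version and version, are my assumption):

    import requests

    def supported_range(compute_root):
        """Return the (min, max) microversions advertised in a cloud's version document."""
        doc = requests.get(compute_root).json()
        v21 = next(v for v in doc["versions"] if v["id"] == "v2.1")
        return v21["min_version"], v21["version"]

    # e.g. supported_range("https://cloud.example.com/") -> ("2.1", "2.4")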

Solving for Stakeholders

Let's look at how this solves things for our stakeholders:

  • Jackson: his application keeps running, because v2.1 is v2.0. His application needed zero changes to run as it did before.
  • Emma: she can poll the Versions endpoint and discover that her cloud now supports 2.4. So she can start coding to those features in her application and put a 2.4 version request into all of her code.
  • Sophia: she can now probe all of the clouds she's working with to find out what feature levels they support, based on the information provided by the Versions endpoint. Because the version is requested per request, she can either figure out an API version that intersects all her clouds and write to that, or write client-side paths for the different versions she's sure she can support and has tested (a 2.1 path, a 2.4 path, a 2.52 path) and dynamically use the best path supported on a particular cloud (see the sketch after this list). This approach works even after BleedingEdgeCo's cloud has set a minimum supported version of 2.50, even though ImSlowWithUpgradesCo's cloud is still only up to 2.4. Sophia's job was never going to be fun, but it's now possible, without building a giant autoconf-like system to probe and detect what clouds actually support, or worse: trying to piece it together from a range of service provider and product documentation.
  • Aiden: he's now collecting client request version information on inbound requests, which means he can figure out which of his users are still running older code. That gives him the ability to know when, if ever, it's safe to move forward, or even to have a chat with the folks using Jackson's ancient tools to figure out what their long-term support strategy should be.
  • Olivia: she can now help evolve the Nova API in a safe way, knowing that she's not going to break existing users, but will still be able to support new things needed by OpenStack.
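
Here's the sketch promised above for the Sophia case (my illustration; version parsing is simplified): given each cloud's advertised minimum and maximum microversion, pick the newest version every cloud can speak, or fall back to per-cloud code paths when there's no overlap.

    def parse(v):
        major, minor = v.split(".")
        return int(major), int(minor)

    def common_version(ranges):
        """ranges: a list of (min_version, max_version) strings, one pair per cloud."""
        lowest_max = min(parse(hi) for _lo, hi in ranges)
        highest_min = max(parse(lo) for lo, _hi in ranges)
        if highest_min <= lowest_max:
            return "%d.%d" % lowest_max  # newest version everyone can speak
        return None                      # no overlap: need per-cloud code paths

    print(common_version([("2.1", "2.4"), ("2.1", "2.52")]))   # 2.4
    print(common_version([("2.1", "2.4"), ("2.50", "2.52")]))  # None: the BleedingEdgeCo case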

Nova v2.1/v2.0 Forever (nearly)

Because of some details of how we implement microversions internally in Nova, our assumption is that we're supporting the base v2.1 API forever. We have the facility to raise the minimum version; however, we've stated that the first time we're even going to have that conversation is in Barcelona in the fall of 2016. That doesn't mean we'll raise the minimum in Orzo, but we'll have our first conversation then, with lots of data from operators and application developers, to see how things are going and what a realistic path forward looks like.

One API - no more extensions

There were multiple ways we could have gone about microversioning; one of the original suggestions was versions per resource. But the moment you start thinking about what the client code would look like to talk to that, you want to throw up a little bit. It was clear that to provide the best user experience we needed to draw a line in the sand and stop thinking about the Nova API as a toolkit to extend, and declare it as a solid thing that all users can expect from their clouds.

The extensions mechanism is deprecated. All the extensions in the Nova tree are now in the Nova API. Over the next couple of cycles we'll be consolidating some of the code to make this more consistent, and eventually remove the possibility of out-of-tree extensions working at all. This allows the API to have a meaningful monotonically increasing API version that will mean the same thing across all deploys.

This is also a signal to people working on Nova that "innovating on the API out of tree" is something we not only don't find valuable, but consider fundamentally hostile to the creation of an application ecosystem for OpenStack.

If you need new things, come talk to us. Let's figure out how to do it together, in tree, in an interop-friendly way. And yes, this means some features won't be welcomed, or will be delayed as we consider a way to make them something that could work for a majority of deployers / hypervisors, and that could be a contract we could support long-term.

Never another API endpoint

You should never expect another API endpoint from Nova. Our versioning mechanism is no longer about the endpoint name; it's the Nova API with a microversion header on the request. Applications will never need to think about a big rewrite because they have to jump API endpoints. They can instead take advantage of new features on a timetable that makes sense to them.

Next Steps

Microversioning is a new thing, but it's already shown quite a bit of promise. It's also been implemented by other projects like Ironic to address the same kinds of concerns that we saw. It's very exciting to see this spread across OpenStack.

This mechanism will also let us bring in things like JSON Home to expose the resource tree in Nova as a resource itself. And work on concepts like the Tasks API to provide a better workflow for creating multi-part resources in Nova (a server with networks and volumes that should get built as an atomic unit).

Discoverability is not yet fully solved, as the policy that applies to a user is still hidden. We hope that with some of the upcoming work on Dynamic Policy in Keystone we can get that built into the API as well. That will give us a complete story where an application can know what it can do against a given cloud before it gets started (as a combination of supported microversion, and exposed policy).

And there is a lot of internal cleaning up, documentation, and testing still to do. The bulk of the Liberty cycle API work is going to be there, to put some polish on what we've got.

A big thanks to Chris Yeoh

The journey that got us here was a long and hard one. It started five cycles ago at the Grizzly summit. A year ago no one on the team was convinced we had a path forward. Really hard things are sometimes really hard.

Chris Yeoh was the point person for the API work through this whole time, and without his amazing patience and perseverance as we realized some terrible corners we'd painted ourselves into, and work that had to be dropped on the floor, we probably wouldn't have come out the other side with anything nearly as productive as what we now have. We lost him this spring at far too young an age. This work is part of his legacy, and will be foundational for much of OpenStack for a long time to come.

Thank you Chris, and we miss you.

Do you want to learn more?

If you somehow made it this far and want even more details about what's going on, the following is great follow-up reading:

Facebook PGP Email

Facebook just added optional PGP support for all their email notifications to users:

To enhance the privacy of this email content, today we are gradually rolling out an experimental new feature that enables people to add OpenPGP public keys to their profile; these keys can be used to "end-to-end" encrypt notification emails sent from Facebook to your preferred email accounts. People may also choose to share OpenPGP keys from their profile, with or without enabling encrypted notifications.

Source: Securing Email Communications from Facebook

Serious kudos to Facebook for building PGP into their infrastructure. The fact that email is transported in the clear is something most people forget. I tried it this morning, and it works well.
