Tag Archives: git

Splitting up Git Commits

Human review of code takes a bunch of time. It takes even longer if the proposed code has a bunch of unrelated things going on in it. A very common piece of review commentary is “this is unrelated, please put it in a different patch”. You may be thinking to yourself “gah, so much work”, but it turns out git has built-in tools to do this. Let me introduce you to git add -p.

Let’s look at this Grenade review – https://review.openstack.org/#/c/109122/1. This was the result of a day’s worth of hacking to get some things in order. Joe correctly pointed out there was at least 1 unrelated change in that patch (I think he was being nice, there were probably at least 4 things going on that should have been separate). Those things are:

  • The quiesce time for shutdown, which actually fixes bug 1285323 all on its own.
  • The reordering of the directory creation so it works on a system without /opt/stack
  • The conditional upgrade function
  • The removal of the stop short circuits (which probably shouldn’t have been done)

So how do I turn this 1 patch, which is at the bottom of a patch series, into 3 patches, plus drop out the bit that I did wrong?

Step 1: rebase -i master

Start by running git rebase -i master on your tree to put yourself into interactive rebase mode. In this case I want to edit the first commit so I can split it up, so I change its line in the todo list from pick to edit.

screenshot_171
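If you want to try this step without touching a real tree, here is a minimal sketch. The throwaway repo, the topic branch, and the commit messages are all made up for illustration, and a sed one-liner stands in for the editor only so the example runs unattended. In normal use you would just edit the todo list by hand.

```shell
# Throwaway repo: one commit on master, then a two-commit series on "topic".
# All names here are illustrative, not taken from the Grenade tree.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt && git add base.txt && git commit -qm "base"
git checkout -qb topic
echo one > one.txt && git add one.txt && git commit -qm "big mixed change"
echo two > two.txt && git add two.txt && git commit -qm "follow-up change"

# Normally `git rebase -i master` opens the todo list in your editor and you
# change "pick" to "edit" on the first line. GIT_SEQUENCE_EDITOR lets a
# command do that edit instead.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i master

# The rebase is now paused with HEAD sitting on the commit marked "edit".
stopped_at=$(git log -1 --format=%s)
echo "paused at: $stopped_at"
```

At this point git is waiting for you to amend or replace that commit, and nothing further happens until you run git rebase --continue.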

Step 2: reset the changes

git reset ##### moves the branch back to the referenced commit and unstages everything after it, while leaving the changes themselves in the working tree, so I’ll be working from a blank slate to add the changes back in. So in this case I need to figure out the last commit before the one I want to change, and do a git reset to that hash.

screenshot_173
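Here is a small sketch of what that reset does, using a throwaway repo. The file names are borrowed from this post, but their contents are invented. It assumes the commit being split is the current HEAD, so the commit before it is HEAD^.

```shell
# Throwaway repo with a base commit and one "big mixed change" on top.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
echo base > grenade.sh && git add grenade.sh && git commit -qm "base"
printf 'base\nchange\n' > grenade.sh && echo new > stop-base
git add grenade.sh stop-base && git commit -qm "big mixed change"

# Find the commit just before the one being split ...
parent=$(git rev-parse HEAD^)

# ... and reset to it. A plain (mixed) reset moves HEAD and clears the index
# but leaves the working tree alone, so the edits survive as uncommitted work.
git reset -q "$parent"

# grenade.sh shows up as modified, stop-base as untracked.
git status --short
```

Nothing is lost here: the commit is gone from the branch, but every line it changed is still sitting in the working tree waiting to be re-added.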

Step 3: commit in whole files

Unrelated change #1 was fully isolated in a whole file (stop-base), so that’s easy enough to do a git add stop-base and then git commit to build a new commit with those changes. When splitting commits always do the easiest stuff first to get it out of the way for tricky things later.

Step 4: git add -p 

In this change grenade.sh needs to be split up all by itself, so I ran git add -p to start the interactive git add process. You will be presented with a series of patch hunks and a prompt about what to do with them. y = yes add it, n = no don’t, and lots of other options to be trickier.

screenshot_176
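To get a feel for the y/n prompt, here is a hedged sketch on a throwaway repo. The file contents are invented, and the answers are piped in on stdin only so the example can run unattended. Interactively you would type them at the prompt.

```shell
# Throwaway repo with a 20-line file.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
seq 1 20 > grenade.sh && git add grenade.sh && git commit -qm "base"

# Two unrelated edits, far enough apart to land in separate hunks.
sed -i.bak -e '2s/.*/quiesce change/' -e '18s/.*/upgrade change/' grenade.sh
rm grenade.sh.bak

# Answer "y" to the first hunk and "n" to the second.
printf 'y\nn\n' | git add -p grenade.sh

# Staged: only the first edit. Still unstaged: the second edit.
git diff --cached
git diff
```

After this, git commit would record only the first edit, and the second stays in the working tree for a later patch.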

In my particular case the first hunk is actually 2 different pieces of function, so y/n isn’t going to cut it. In that case I can type ‘e’ (edit), and I’m dumped into my editor staring at the patch, which I can modify to be the patch I want.

screenshot_177

I can then delete the pieces I don’t want in this commit. Those deleted pieces will still exist in the uncommitted work, so I’m not losing any work, I’m just not yet dealing with it.

screenshot_178

Ok, that looks like just the part I want, as I’ll come back to the upgrade_service function in patch #3. So save it, and find all the other hunks in the file that are related to that change to add them to this patch as well.

screenshot_179

Yes, to both of these, as well as one other towards the end, and this commit is ready to be ‘git commit’ed.
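The hunk-editing step can be sketched the same way. In the editor, deleting a line that starts with + keeps that addition out of the staged patch (to keep a removal out, you change its leading - to a space instead). The repo and file contents below are invented, and a sed command plays the part of the editor so the example runs on its own.

```shell
# Throwaway repo: one hunk that adds two lines, only one of which we want now.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
printf 'line1\nline2\nline3\n' > grenade.sh
git add grenade.sh && git commit -qm "base"
printf 'line1\nwanted line\nunwanted line\nline2\nline3\n' > grenade.sh

# Answer "e" to the hunk. The stand-in editor deletes the "+unwanted line"
# row from the patch, which is what you would do by hand.
printf 'e\n' | GIT_EDITOR="sed -i.bak '/^+unwanted/d'" git add -p grenade.sh

# Staged: only "wanted line". The working tree still has both lines.
git diff --cached
cat grenade.sh
```

The line you cut out of the patch is still in grenade.sh, so it will show up again the next time you run git add -p.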

Now what’s left is basically just the upgrade_service function changes, which means I can git add grenade.sh as a whole. I actually decided to fix up the stop calls before doing that just by editing grenade.sh before adding the final changes. After it’s done, git rebase --continue rebases the rest of the changes on this, giving me a new shiny 5 patch series that’s a lot clearer than the 3 patch one I had before.
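Putting the whole flow together, here is a compact sketch on a throwaway repo: pause on a commit, undo it, re-commit it as two pieces, and let the rebase replay the rest. The file names come from this post, but the contents, branch name, and commit messages are invented, and whole-file adds stand in for the git add -p work.

```shell
# Throwaway repo: a two-commit series on "topic" whose first commit mixes
# two unrelated changes.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt && git add base.txt && git commit -qm "base"
git checkout -qb topic
echo a > stop-base && echo b > grenade.sh
git add stop-base grenade.sh && git commit -qm "mixed change"
echo c > later.txt && git add later.txt && git commit -qm "later change"

# Pause on the first commit of the series, then undo it but keep its edits.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i master
git reset -q HEAD^

# Re-commit the work as two separate patches.
git add stop-base && git commit -qm "first half"
git add grenade.sh && git commit -qm "second half"

# Replay the rest of the series on top of the split commits.
git rebase --continue

# The series is now three commits instead of two (newest first).
git log --format=%s master..topic
```

The later commit is untouched apart from getting a new parent, which is the same thing that happened to the rest of my patch series.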

Step 5: Don’t forget the idempotent ID

One last important thing. This was a patch to gerrit before, which means when I started I had an idempotent ID (the Change-Id line in the commit message) on every change. In splitting 1 change into 3, I added that ID back to patch #3 so that reviewers would understand this was an update to something they had reviewed before.
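One way to carry the ID across is to copy the Change-Id footer out of the original commit message before splitting, then paste it into the message of whichever new commit continues the old review. A minimal sketch, in which the Change-Id value, file contents, and commit messages are all hypothetical:

```shell
# Throwaway repo with a commit carrying a made-up Gerrit Change-Id footer.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
echo v1 > grenade.sh && git add grenade.sh
git commit -qm "Original patch" \
    -m "Change-Id: I0123456789abcdef0123456789abcdef01234567"

# Before splitting, save the footer from the commit under review.
change_id=$(git log -1 --format=%B | grep '^Change-Id: ')

# Later, reuse it as the last paragraph of the commit that continues
# the original review.
echo v2 > grenade.sh
git commit -qam "Make service upgrade conditional" -m "$change_id"
git log -1 --format=%B
```

The other commits that came out of the split get fresh IDs, so they show up in gerrit as new changes.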

It’s almost magic

As a git user, git add -p is one of those things like git rebase -i that you really need in your toolkit to work with anything more than trivial patches. It takes practice to have the right intuition here, but once you do, you can really slice up patches in a way that is much easier for reviewers to work with, even if that wasn’t how the code was written the first time.

Code that is easier for reviewers to review wins you lots of points, and will help with landing your patches in OpenStack faster. So taking the time upfront to get used to this is well worth it.

Prettier fonts for Git Gui on Ubuntu

The default fonts for git gui and gitk in Ubuntu are downright horrible.  Even Ubuntu 10.04 defaults to tk8.4, which doesn’t support font smoothing.  Fortunately there is a simple way to fix this and make a whole bunch of applications look prettier all at once.

# sudo update-alternatives --config wish
There are 3 choices for the alternative wish (providing /usr/bin/wish).

  Selection    Path                    Priority   Status
------------------------------------------------------------
* 0            /usr/bin/wish-default    10000     auto mode
  1            /usr/bin/wish-default    10000     manual mode
  2            /usr/bin/wish8.4         841       manual mode
  3            /usr/bin/wish8.5         840       manual mode

Then type ‘3’ and hit enter.  Now you’ll be using tk8.5 by default, and miracle of miracles your eyes won’t be scarred by jagged ugly fonts in gitk anymore.

A Git epiphany, a journey in 3 acts

There has been a series of posts about Git in the last week over at The Reinvigorated Programmer. It’s fascinating to watch someone who is also a brilliant writer come to terms with Git, and get to the heart of the challenges so quickly. If you’ve ever struggled getting over that hump with Git, I encourage you to read the three posts, in order, and see if it helps you on your journey.

Act 1: Git is a Harrier Jump Jet. And not in a good way

Act 2: Still hatin’ on git: now with added Actual Reasons!

Act 3: You could have invented git (and maybe you already have!)

Streaming talk on Git

Tomorrow night (Wed, Jan 6th) at 6pm EST I’ll be presenting at MHVLUG on Git, the distributed source code management system.  New and notable on this talk is that I’ll be streaming the talk live on ustream, and, assuming the tech doesn’t horribly break down in the middle of it, will be taking questions from the viewers online as well.

For those software folks that read this blog, or the facebook / twitter posts it generates, this might be of interest to you.  Getting your head around merges in git takes some work early on, and with any luck the diagrams and explanations I pulled together will help quite a bit.

OpenSim moves to git

Yesterday we completed the transition of OpenSim from Subversion to git as our primary source code system.  This had actually been kicked around as an idea for nearly a year and a half, but our gating factor had always been that git support on Windows was lacking.  Recent dramatic improvements with TortoiseGit took away that blocking element.

One of the reasons for this move is to make it easier for more people to participate in the project (I’ve written about this in the past).  The OpenSim core team has now grown past 20, and even coordinating changes among ourselves has become challenging.  Subversion is fine as long as only a couple of people are working in a particular area; past that it doesn’t do you any favors in merging in complex changes.  A number of complex refactorings in the OpenSim tree have been on more or less perpetual hold because of some of these svn challenges.  Hopefully this will help grease the wheels there.

With git the hope is to give us some tools that help us in a number of ways.  The first is to make it easier to collaborate on more complex work.  The second is to make it easier for non core contributors to contribute substantial work.  The ability to have an opensim clone with changes in it staged for upstream inclusion, and have a core member be able to directly pull those changes, should be a big help.

All changes come with challenges.  The most visible is the lack of a monotonically increasing version number.  Git changes are stored differently, so the version identifier is a SHA1 hash.  That’s going to be the first big mental change people will need to get past.  It seems like a deal breaker before you’ve used it, but don’t worry, it will be ok once you have gotten used to it.  It’s just different.
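For anyone wondering what these identifiers look like in practice, here is a small sketch on a throwaway repo. The tag name is hypothetical and is not an actual OpenSim release tag.

```shell
# Throwaway repo with one tagged commit and one commit after the tag.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
echo code > README && git add README && git commit -qm "initial"
git tag 0.6.5                          # hypothetical release tag
echo more >> README && git commit -qam "a later change"

full=$(git rev-parse HEAD)             # the full 40-character SHA-1
short=$(git rev-parse --short HEAD)    # abbreviated form, usually enough to type
label=$(git describe --tags)           # nearest tag, commits since it, short hash
echo "$full"
echo "$short"
echo "$label"
```

The git describe form is the closest thing to the old revision number: it tells you roughly where a build sits relative to the last tag, while still naming one exact commit.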

We’ve got an ever evolving set of instructions at the OpenSim wiki.  I also expect we’ll spend a lot of time in the OSGrid Office hours discussing the transition.

Software in the era of drive by contribution

I love git.  I’ll state that up front.  I also love github, which I’ve expressed in the past.  Both are making me look at software in a new way.  I also think the pair of them are changing some of the rules we know for how open source projects emerge and move forward.

Recently I was working on building a Rails based Event Calendar for MHVLUG.  This gave me a chance to dig in on ical, which has fascinated me since a set of talks at YAPC a decade ago.  There were 2 ruby ical libraries out there (icalendar.rb and vpim.rb), neither did quite what I wanted, and both projects were more or less dormant (the mailing lists were lots of “is anyone alive?” posts).  Ugh, I was stuck, and if I had to start from scratch on ical, that was all I’d end up doing, never getting to my application.

I googled some more… and lo and behold found a github.com fork of icalendar.rb, and forks of that.  Those forks implemented about 50% of the fixes I needed to get ical generation with timezones to work.  So I forked from one of those and 6 changesets later, had what I needed.  I then built my application, and life was good.

A few days later I decided to collect up all the changes in all the github icalendar trees, and merge them into my tree.  While git itself can be somewhat confusing, github adds this really slick web interface on top of git trees, that makes the merge process pretty painless.  This is one of their key innovations, and it’s just incredible.  I selected all the outstanding changes that would merge cleanly, pulled them in, and now had a tree which largely encompassed the 8 existing forks on github.com.  I posted back to the dead mailing list and let people know there was this now living github tree where the project had seemed dead.  I got a couple of new patches people wanted in, and 2 months later the maintainer actually showed up again and gave me admin access to the icalendar project so I could publish official versions.

This pattern repeated a few more times on the project.  I found a piece of code on github that did 90% of what I needed, but I needed a change.  I created my fork, added my feature, and pushed it back out (with a pull request).  A few days later the maintainer pulled my changes in, and now they are officially part of the project.  I’m not vested in those projects, but I had relevant fixes, and because we were all using a tool that makes it easy to be a casual contributor, they are now part of the open source projects in the sky.

Casual Contributions

If you haven’t seen the paper on participation inequality, go and read it… now!  Previously most of the studies on open source community participation focussed on big projects like the Linux Kernel, or Apache.  That’s sort of like trying to understand patterns of home construction by looking at Frank Lloyd Wright’s houses.  Those projects are outliers in how communities work.  This study did a much broader look at online communities and found the striking 1-9-90 pattern.

This is how communities work.  1% of the population does most of the work, 9% are casual contributors, and 90% are just consumers.  Your user base is a silent majority.  In an open source world the 1% are the core contributors, and possibly the heavy power users.  The 9% are the people that file a bug now and then, maybe a patch or two; everyone else just downloads your code and you never hear from them.  This pattern more or less holds true for all volunteer efforts.

In open source we’ve got an issue, which is that getting code from the 9% is hard.  The 1% typically has access to a central source management repository, and can merge code fixes as soon as they see them.  The 9% has to follow a completely different process, posting patches to trackers or mailing lists, many of which get lost because there are a bunch more manual steps to pull them into the main tree.  If any process requires more effort by the 1%, it typically won’t happen, they are full up on time as it is.

And this is where git and github start making things interesting.  While I run a number of open source efforts, I end up in the 9% all the time.  If you are now using git for your main tree the 9% and the 1% are now using the same tools, which allow seamless inclusion of code.  The merge algorithm on git is really wondrous.  I’ve had instances of massive renaming of files while trying to integrate external fixes in those files, and everything just worked.  It actually surprised the hell out of me.

The 9% just want to casually contribute something; they aren’t signing up for a lifestyle.  Get my fix out there, if other people want it, great, if not, so be it.  The fact that integration is 2 mouse clicks and 10 seconds of effort makes the chance of capturing those changes much more likely.

Recovering the Brown Field

Ever look at sourceforge.net?  Or any of the clones?  50% of those projects never got off the ground.  Another 40% have died out for other reasons: the contributors had a family, started working for a company that doesn’t let them work on OSS, got bored with the project, died, or became inactive for any number of other reasons.  When open source software exploded in 2000, there was a lot of greenfield.  Everyone was out there building new stuff that no one had done before.  But now we have a lot of brown field.  A lot of 1/2 planned, 1/2 finished pieces of code that have useful bits in them, but have been abandoned by their original creators.

Tools like git and github help you recover that brown field.  In the last couple of months I’ve run into project after project that petered out in 2006, but has a bunch of good code.  That means they are about 2 critical bug fixes away from being useful on modern systems.  It’s really not much work, but in the old system, with the projects locked up in a forge with an SVN or CVS source management system, they were dead.  You had to start over.  With github you can import that tree and keep working.

It’s a new pattern for how the open source community is going to function.  While it could be built on any distributed SCM, the fact that git has a really good two-way svn bridge, and that github made itself “person oriented” vs. “project oriented”, really makes me believe that it’s creating a unique new pattern for both recovering the brown field of open source, and enabling the 9% to be much more effective with their output.

Software in the era of drive by contribution

Now that we’ve got a set of tools that really were designed for helping the 1% and the 9% work together, I think we’re going to see a whole new blossoming of open source software.  The rules of what it means to be a project contributor are changing, in really exciting ways.  Forking used to be cheap, and merging expensive, which is why forking was considered an insult.  But with tools like git merging is cheap, so the offensiveness of forking goes away.  It opens up for more experimentation, and more complex contributions happening outside the 1% group.  All this increases the velocity of contribution, and thus the volume of open source software out there.

I really think distributed source control is changing a lot of assumptions for how software gets developed.  So if you haven’t yet dug into the space, do it.

OpenSim Infrastructure Updates: fresh os, git mirror, and automated release building

Yesterday I upgraded the opensimulator.org machine (kindly provided by Adam Frisby) to the latest version of Debian.  The upgrade went seamlessly.  Now that we are on Debian 5.0 we’ve got some fresher software on the machine to make it possible to provide a few new things as part of the basic OpenSim infrastructure.

OpenSim via Git

We are now mirroring the experimental upstream code (aka subversion trunk) via git.  At least 5 of us on the OpenSim core team have been using git personally with the git-svn bridge for our own OpenSim work (I started doing this nearly a year ago).  Git provides some advantages in making it easy to try things out in a local tree, and throw away branches if things go wrong.  If you read my blog, you know, I love git. 🙂

While subversion remains our main tree, this git mirror will make it easy for developers (or budding developers) to experiment with this alternative source system.  You can use viewgit to see the git mirror, or clone this via:

git clone http://opensimulator.org/git/opensim

In addition, the viewgit system provides a very handy rss feed for changes, which is another way you can keep tabs on what’s changing in trunk.  There is an up to 10 minute lag in changes getting into the git mirror from svn, but hopefully that won’t bother anyone.

Automated Release Building for OpenSim

Something else I threw together last night was an automated release builder for OpenSim.  One of the challenges we had was that getting all the parts of the release sorted out once a release tag was made was sometimes onerous, which meant that a release might only be a Subversion tag for days or even weeks before source tarballs of it saw their way into the world.

I’ve now got a system in place that looks for all numeric tags in our source tree, checks them out, runs prebuild on them, and bundles them up as both a .zip and a .tar.gz.  This means they should be ready to compile with nant or MSVS.  This is running hourly on the OpenSim machine, and publishing all results to http://dist.opensimulator.org.  One of the immediate things you’ll see is that it now gives us a full set of historically populated releases.
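To give a rough idea of the shape of such a builder, here is a sketch. It is not the script running on the OpenSim machine. It assumes git tags rather than Subversion tags, uses git archive instead of a checkout, leaves out the prebuild step entirely, and the tag names and output file names are made up.

```shell
# Throwaway repo standing in for the source tree, with made-up tags.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" && git init -q
echo code > README && git add README && git commit -qm "initial"
git tag 0.6.4 && git tag 0.6.5 && git tag experimental

# Directory the bundles are published to (hypothetical).
out=$(mktemp -d)

# Keep only purely numeric, dot-separated tags, and bundle each one
# as both a .zip and a .tar.gz.
for tag in $(git tag -l | grep -E '^[0-9]+(\.[0-9]+)*$'); do
    for fmt in zip tar.gz; do
        git archive --format="$fmt" --prefix="opensim-$tag/" \
            -o "$out/opensim-$tag.$fmt" "$tag"
    done
done

ls "$out"
```

Run from cron once an hour, something along these lines picks up a new numeric tag shortly after it is made, and the non-numeric tags are simply ignored.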

I’m hoping you enjoy these extra bits of infrastructure for the project.  Please feel free to drop me a comment here if you have any thoughts or questions on them, feedback is always appreciated.