Tag Archives: git

Splitting up Git Commits

Human review of code takes a bunch of time. It takes even longer if the proposed code has a bunch of unrelated things going on in it. A very common piece of review commentary is “this is unrelated, please put it in a different patch”. You may be thinking to yourself “gah, so much work”, but it turns out git has built-in tools to do this. Let me introduce you to git add -p.

Let’s look at this Grenade review: https://review.openstack.org/#/c/109122/1. This was the result of a day’s worth of hacking to get some things in order. Joe correctly pointed out there was at least 1 unrelated change in that patch (I think he was being nice, there were probably at least 4 things going on that should have been separate). Those things are:

  • The quiesce time for shutdown, which actually fixes bug 1285323 all on its own.
  • The reordering of the directory creation so it works on a system without /opt/stack
  • The conditional upgrade function
  • The removal of the stop short circuits (which probably shouldn’t have been done)

So how do I turn this 1 patch, which is at the bottom of a patch series, into 3 patches, plus drop out the bit that I did wrong?

Step 1: rebase -i master

I start by running git rebase -i master on my tree to put myself into interactive rebase mode. In this case I want to edit the first commit so I can split it up.
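
As a rough sketch of this step, the following builds a throwaway repository and pauses an interactive rebase on the first commit of a series. The branch name, file names and commit messages are made up for illustration, and GIT_SEQUENCE_EDITOR is used here only so the todo list can be changed from "pick" to "edit" without opening an editor; normally you would make that change by hand.

```shell
# Build a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

# A two-patch series on a branch (hypothetical names throughout).
git checkout -q -b series
echo one > one.txt
git add one.txt
git commit -q -m "patch to split"
echo two > two.txt
git add two.txt
git commit -q -m "later patch"

# Change the first todo line from "pick" to "edit" without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i master

# The rebase is now paused with HEAD on the commit to be split.
git log -1 --format=%s
```

When git stops like this, the commit marked "edit" has already been applied, which is why the next step is to reset it away again.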

Step 2: reset the changes

git reset ##### moves the branch back to the referenced commit and leaves everything committed after it as unstaged changes in the working tree, so I’ll be working from a blank slate to add the changes back in. So in this case I need to figure out the last commit before the one I want to change, and do a git reset to that hash.

Step 3: commit in whole files

Unrelated change #1 was fully isolated in a whole file (stop-base), so it’s easy enough to do a git add stop-base and then git commit to build a new commit with those changes. When splitting commits, always do the easiest stuff first to get it out of the way for the tricky things later.
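
Steps 2 and 3 together might look something like the sketch below, run in a scratch repository. The file names grenade.sh and stop-base are borrowed from the review, but their contents and the commit messages here are invented, and HEAD~1 stands in for the hash of the commit before the one being split.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo 'echo start' > grenade.sh
git add grenade.sh
git commit -q -m "base"

# One commit that mixes two unrelated changes.
echo 'sleep 1' > stop-base
echo 'echo upgrade' >> grenade.sh
git add stop-base grenade.sh
git commit -q -m "mixed changes"

# Move back to the commit before the mixed one, keeping its changes on disk.
git reset -q HEAD~1

# The whole-file change goes in first, as its own commit.
git add stop-base
git commit -q -m "quiesce time for shutdown"

# grenade.sh is still modified and unstaged, waiting for the next commit.
git status --short
```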

Step 4: git add -p 

In this change grenade.sh needs to be split up all by itself, so I ran git add -p to start the interactive git add process. You will be presented with a series of patch hunks and a prompt about what to do with them. y = yes add it, n = no don’t, and lots of other options to be trickier.
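
To get a feel for the y/n prompts without risking real work, here is a small sketch in a scratch repository. The file numbers.txt is a made-up example, and the answers are piped in with printf only so the example runs unattended; at a real prompt you would type them yourself.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"
seq 1 20 > numbers.txt
git add numbers.txt
git commit -q -m "base"

# Two edits far enough apart that git shows them as separate hunks.
{ echo "changed-top"; seq 2 19; echo "changed-bottom"; } > numbers.txt

# Answer "y" to the first hunk and "n" to the second.
printf 'y\nn\n' | git add -p numbers.txt

# Staged: only the top change. Unstaged: the bottom change.
git diff --cached --stat
git diff --stat
```

The hunk that got an "n" is not lost; it simply stays in the working tree for a later commit.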

In my particular case the first hunk is actually 2 different pieces of functionality, so y/n isn’t going to cut it. In that case I can type ‘e’ (edit), and I’m dumped into my editor staring at the patch, which I can interactively modify to be the patch I want.

I can then delete the pieces I don’t want in this commit. Those deleted pieces will still exist in the uncommitted work, so I’m not losing any work, I’m just not yet dealing with it.

Once the patch contains just the part I want (I’ll come back to the upgrade_service function in patch #3), I save it, and then find all the other hunks in the file that are related to that change to add them to this patch as well.

I say yes to the next two hunks, as well as one other towards the end, and this commit is ready to be ‘git commit’ed.

Now what’s left is basically just the upgrade_service function changes, which means I can git add grenade.sh as a whole. I actually decided to fix up the stop calls before doing that, just by editing grenade.sh before adding the final changes. After it’s done, git rebase --continue rebases the rest of the changes on this, giving me a shiny new 5 patch series that’s a lot clearer than the 3 patch one I had before.
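
The whole round trip, from pausing the rebase to continuing it, can be sketched end to end in a scratch repository. Everything here (branch name, files, messages) is hypothetical, and the split is done with whole files to keep it short; GIT_SEQUENCE_EDITOR and GIT_EDITOR are set only so nothing waits on an editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Series of two patches; the first mixes two unrelated files.
git checkout -q -b series
echo a > a.txt
echo b > b.txt
git add a.txt b.txt
git commit -q -m "mixed patch"
echo c > c.txt
git add c.txt
git commit -q -m "later patch"

# Pause the rebase on the mixed patch, then undo it but keep its changes.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i master
git reset -q HEAD~1

# Re-commit the pieces separately.
git add a.txt
git commit -q -m "change A"
git add b.txt
git commit -q -m "change B"

# Replay the rest of the series on top of the split commits.
GIT_EDITOR=true git rebase --continue

git log --format=%s master..series
```

The series that started with two patches now has three, with the later patch replayed on top of the two new ones.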

Step 5: Don’t forget the idempotent ID

One last important thing. This was a patch to gerrit before, which means when I started I had an idempotent ID (the Change-Id line in the commit message) on every change. In splitting 1 change into 3, I added that ID back to patch #3 so that reviewers would understand this was an update to something they had reviewed before.
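
One way to carry the ID across, sketched in a scratch repository: read the Change-Id line out of the original commit and append it when amending the commit that replaces it. The Change-Id value, the tag name "original" and the commit messages are all invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

# The original change as uploaded for review; this Change-Id is a made-up value.
echo a > a.txt
echo b > b.txt
git add a.txt b.txt
git commit -q -m "big patch" -m "Change-Id: I0123456789abcdef0123456789abcdef01234567"
git tag original

# Split it into two commits, neither of which carries the footer yet.
git reset -q HEAD~1
git add a.txt
git commit -q -m "unrelated fix"
git add b.txt
git commit -q -m "main change"

# Copy the footer from the original commit onto the commit that replaces it.
change_id=$(git log -1 --format=%B original | grep '^Change-Id:')
git commit -q --amend -m "main change" -m "$change_id"

git log -1 --format=%B
```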

It’s almost magic

As a git user, git add -p is one of those things like git rebase -i that you really need in your toolkit to work with anything more than trivial patches. It takes practice to have the right intuition here, but once you do, you can really slice up patches in a way that is much easier for reviewers to work with, even if that wasn’t how the code was written the first time.

Code that is easier for reviewers to review wins you lots of points, and will help with landing your patches in OpenStack faster. So taking the time upfront to get used to this is well worth your time.

Prettier fonts for Git Gui on Ubuntu

The default fonts for git gui (and gitk) in Ubuntu are downright horrible.  Even Ubuntu 10.04 defaults to tk8.4, which doesn’t support font smoothing.  Fortunately there is a simple way to fix this and make a whole bunch of applications look prettier all at once.

# sudo update-alternatives --config wish
There are 3 choices for the alternative wish (providing /usr/bin/wish).

  Selection    Path                    Priority   Status
------------------------------------------------------------
* 0            /usr/bin/wish-default   10000      auto mode
  1            /usr/bin/wish-default   10000      manual mode
  2            /usr/bin/wish8.4        841        manual mode
  3            /usr/bin/wish8.5        840        manual mode

Then type ‘3’ and hit enter.  Now you’ll be using tk8.5 by default, and miracle of miracles your eyes won’t be scarred by jagged ugly fonts in gitk anymore.

A Git epiphany, a journey in 3 acts

There have been a series of posts about Git in the last week over at The Reinvigorated Programmer. It’s fascinating to watch someone who is also a brilliant writer come to terms with Git and get to the heart of the challenges so quickly. If you’ve ever struggled getting over that hump with Git, I encourage you to read the three posts, in order, and see if it helps you on your journey.

Act 1: Git is a Harrier Jump Jet. And not in a good way

Act 2: Still hatin’ on git: now with added Actual Reasons!

Act 3: You could have invented git (and maybe you already have!)

Streaming talk on Git

Tomorrow night (Wed, Jan 6th) at 6pm EST I’ll be presenting at MHVLUG on Git, the distributed source code management system.  New and notable on this talk is that I’ll be streaming the talk live on ustream, and, assuming the tech doesn’t horribly break down in the middle of it, will be taking questions from the viewers online as well.

For those software folks that read this blog, or the facebook / twitter posts it generates, this might be of interest to you.  Getting your head around merges in git takes some work early on, and with any luck the diagrams and explanations I pulled together will help quite a bit.

OpenSim moves to git

Yesterday we completed the transition of OpenSim from subversion to git as our primary source code system.  This had actually been kicked around as an idea for nearly a year and a half, but our gating factor had always been that git support on windows was lacking.  Recent dramatic improvements with TortoiseGit took away that blocking element.

One of the reasons for this move is to make it easier for more people to participate in the project (I’ve written about this in the past).  The OpenSim core team has now grown past 20, and even coordinating changes among ourselves has become challenging.  Subversion is fine as long as only a couple of people are working in a particular area; past that, it doesn’t do you any favors in merging in complex changes.  A number of complex refactorings in the OpenSim tree have been on more or less perpetual hold because of some of these svn challenges.  Hopefully this will help grease the wheels there.

With git the hope is to give us some tools that help us in a number of ways.  The first is to make it easier to collaborate on more complex work.  The second is to make it easier for non core contributors to contribute substantial work.  The ability to have an opensim clone with changes in it staged for upstream inclusion, and have a core member be able to directly pull those changes, should be a big help.

All changes come with challenges.  The most visible is the lack of a monotonically increasing version number.  Git changes are stored differently, so the version identifier is a SHA1 hash.  That’s going to be the first big mental change people will need to get past.  It seems like a deal breaker before you’ve used it, but don’t worry, it will be ok once you have gotten used to it.  It’s just different.

We’ve got an ever evolving set of instructions at the OpenSim wiki.  I also expect we’ll spend a lot of time in the OSGrid Office hours discussing the transition.

Software in the era of drive by contribution

I love git.  I’ll state that up front.  I also love github, which I’ve expressed in the past.  Both are making me look at software in a new way.  I also think the pair of them are changing some of the rules we know for how open source projects emerge and move forward.

Recently I was working on building a Rails based Event Calendar for MHVLUG.  This gave me a chance to dig in on ical, which has fascinated me since a set of talks at YAPC a decade ago.  There were 2 ruby ical libraries out there (icalendar.rb and vpim.rb), neither did quite what I wanted, and both projects were more or less dormant (the mailing lists were lots of “is anyone alive?” posts).  Ug, I was stuck, and if I had to start from scratch on ical, that was all I’d end up doing, never getting to my application.

I googled some more… and lo and behold found a github.com fork of icalendar.rb, and forks of that.  Those forks implemented about 50% of the fixes I needed to get ical generation with timezones to work.  So I forked from one of those and 6 changesets later, had what I needed.  I then built my application, and life was good.

A few days later I decided to collect up all the changes in all the github icalendar trees, and merge them into my tree.  While git itself can be somewhat confusing, github adds this really slick web interface on top of git trees, that makes the merge process pretty painless.  This is one of their key innovations, and it’s just incredible.  I selected all the outstanding changes that would merge cleanly, pulled them in, and now had a tree which largely encompassed the 8 existing forks on github.com.  I posted back to the dead mailing list and let people know there was this now living github tree where the project had seemed dead.  I got a couple of new patches people wanted in, and 2 months later the maintainer actually showed up again and gave me admin access to the icalendar project so I could publish official versions.

This pattern repeated a few more times on the project.  I found a piece of code on github that did 90% of what I needed, but I needed a change.  I created my fork, added my feature, and pushed it back out (with a pull request).  A few days later the maintainer pulled them back in, and now they are officially part of the project.  I’m not vested in those projects, but I had relevant fixes, and because we were all using a tool that makes it easy to be a casual contributor, they are now part of the open source projects in the sky.

Casual Contributions

If you haven’t seen the paper on participation inequality, go and read it… now!  Previously most of the studies on open source community participation focussed on big projects like the Linux Kernel, or Apache.  That’s sort of like trying to understand patterns of home construction by looking at Frank Lloyd Wright’s houses.  Those projects are outliers in how communities work.  This study did a much broader look at online communities and found the striking 1-9-90 pattern.

This is how communities work.  1% of the population does most of the work, 9% are casual contributors, and 90% are just consumers.  Your user base is a silent majority.  In an open source world the 1% are the core contributors, and possibly the heavy power users.  The 9% are the people that file a bug now and then, maybe a patch or two; everyone else just downloads your code and you never hear from them.  This pattern more or less holds true for all volunteer efforts.

In open source we’ve got an issue, which is that getting code from the 9% is hard.  The 1% typically has access to a central source management repository, and can merge code fixes as soon as they see them.  The 9% has to follow a completely different process, posting patches to trackers or mailing lists, many of which get lost because there are a bunch more manual steps to pull them into the main tree.  If any process requires more effort by the 1%, it typically won’t happen, they are full up on time as it is.

And this is where git and github start making things interesting.  While I run a number of open source efforts, I end up in the 9% all the time.  If you are now using git for your main tree, the 9% and the 1% are now using the same tools, which allow seamless inclusion of code.  The merge algorithm on git is really wondrous.  I’ve had instances of massive renaming of files while trying to integrate external fixes in those files, and everything just worked.  It actually surprised the hell out of me.

The 9% just want to casually contribute something; they aren’t signing up for a lifestyle.  Get my fix out there, if other people want it, great, if not, so be it.  The fact that integration is 2 mouse clicks and 10 seconds of effort makes the chance of capturing those changes much more likely.

Recovering the Brown Field

Ever look at sourceforge.net?  Or any of the clones?  50% of those projects never got off the ground.  Another 40% have died out for other reasons: the contributors had a family, started working for a company that doesn’t let them work on OSS, got bored with the project, died, or became inactive for any number of other reasons.  When open source software exploded in 2000, there was a lot of greenfield.  Everyone was out there building new stuff that no one had done before.  But now we have a lot of brown field.  A lot of 1/2 planned, 1/2 finished pieces of code that have useful bits in them, but have been abandoned by their original creators.

Tools like git and github help you recover that brown field.  In the last couple of months I’ve run into project after project that petered out in 2006, but has a bunch of good code.  That means they are about 2 critical bug fixes away from being useful on modern systems.  It’s really not much work, but in the old system, with the projects locked up in a forge with an SVN or CVS source management system, they were dead.  You had to start over.  With github you can import that tree and keep working.

It’s a new pattern for how the open source community is going to function.  While it could be built on any distributed SCM, the fact that git has a really good 2 way svn bridge, and that github made itself “person oriented” vs. “project oriented”, really makes me believe that it’s creating a new pattern for both recovering the brown field of open source, and enabling the 9% to be much more effective with their output.

Software in the era of drive by contribution

Now that we’ve got a set of tools that really were designed for helping the 1% and the 9% work together, I think we’re going to see a whole new blossoming of open source software.  The rules of what it means to be a project contributor are changing, in really exciting ways.  Forking used to be cheap, and merging expensive, which is why forking was considered an insult.  But with tools like git merging is cheap, so the offensiveness of forking goes away.  It opens up for more experimentation, and more complex contributions happening outside the 1% group.  All this increases the velocity of contribution, and thus the volume of open source software out there.

I really think distributed source control is changing a lot of assumptions for how software gets developed.  So if you haven’t yet dug into the space, do it.

OpenSim Infrastructure Updates: fresh os, git mirror, and automated release building

Yesterday I upgraded the opensimulator.org machine (kindly provided by Adam Frisby) to the latest version of Debian.  The upgrade went seamlessly.  Now that we are on Debian 5.0 we’ve got some fresher software on the machine to make it possible to provide a few new things as part of the basic OpenSim infrastructure.

OpenSim via Git

We are now mirroring the experimental upstream code (aka subversion trunk) via git.  At least 5 of us on the OpenSim core team have been using git personally with the git-svn bridge for our own OpenSim work (I started doing this nearly a year ago).  Git provides some advantages in making it easy to try things out in a local tree, and throw away branches if things go wrong.  If you read my blog, you know, I love git. 🙂

While subversion remains our main tree, this git mirror will make it easy for developers (or budding developers) to experiment with this alternative source system.  You can use viewgit to see the git mirror, or clone this via:

git clone http://opensimulator.org/git/opensim

In addition, the viewgit system provides a very handy rss feed for changes, which is another way you can keep tabs on what’s changing in trunk.  There is an up to 10 minute lag in changes getting into the git mirror from svn, but hopefully that won’t bother anyone.

Automated Release Building for OpenSim

Something else I threw together last night was an automated release builder for OpenSim.  One of the challenges we had was that getting all the parts of a release sorted out once a release tag was made was sometimes onerous, which meant that a release might only be a subversion tag for days or even weeks before source tarballs of it made their way into the world.

I’ve now got a system in place that looks for all numeric tags in our source tree, checks them out, runs prebuild on them, and bundles them up as both a .zip and a .tar.gz.  This means they should be ready to compile with nant or MSVS.  This is running hourly on the OpenSim machine, and publishing all results to http://dist.opensimulator.org.  One of the immediate things you’ll see is that it now gives us a full set of historically populated releases.
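
The shape of such a builder can be sketched roughly as below. This is not the script running on the OpenSim machine: the tag names, the directory layout, the "opensim-" file prefix and the reading of "numeric" as digits and dots are all assumptions, and the checkout and prebuild steps are only marked with comments. It also builds just the .tar.gz half.

```shell
# Keep only tags made of digits and dots (an assumed meaning of "numeric").
numeric_tags() {
    printf '%s\n' "$@" | grep -E '^[0-9]+(\.[0-9]+)*$'
}

work=$(mktemp -d)
out="$work/dist"
mkdir -p "$out"

# Stand-in for checked-out tags; a real run would export each tag from svn.
for tag in 0.6.4 0.6.5 trunk-test; do
    mkdir -p "$work/tags/$tag"
    echo "source for $tag" > "$work/tags/$tag/README.txt"
done

for tag in $(numeric_tags 0.6.4 0.6.5 trunk-test); do
    # (prebuild would run here, inside the checked-out tree)
    tar -czf "$out/opensim-$tag.tar.gz" -C "$work/tags" "$tag"
done

ls "$out"
```

Run from cron, a loop like this rebuilds every tagged release each time, which is how a full set of historical releases appears without extra work.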

I’m hoping you enjoy these extra bits of infrastructure for the project.  Please feel free to drop me a comment here if you have any thoughts or questions on them, feedback is always appreciated.

One step deployment of rails applications with git and passenger

I developed this pattern with mercurial, and have recently adapted it to work with git.

Assumed

  • You want your production application to be deployed at /data/site/myrailssite on your remote system
  • You are running passenger for your rails applications (if you aren’t you should really take a look)

Setting up the production target

First, create a rails user on your production system.  This lets your rails app run under a different id than you, or your webserver.  Privilege isolation is a good thing.

Next, mkdir /data/site/myrailssite and chown rails /data/site/myrailssite.

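Next, su - rails, and cd /data/site/myrailssite && git init

Next, add the following lines to .git/hooks/post-receive, creating the file if it does not already exist.

#!/bin/sh
# Run the migrations against the production database.
export RAILS_ENV=production

# Hooks run inside .git, so strip that suffix to find the work tree.
DIR=`pwd | sed 's/\.git$//'`

# Update the checked out files, migrate, then ask passenger to restart.
cd "$DIR" && git --git-dir="$DIR/.git" --work-tree="$DIR" reset --hard && rake db:migrate && touch tmp/restart.txt

And finally, chmod 755 .git/hooks/post-receive so that git will run it.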

Setting your source repo to push to production

On your source repository, run git remote add production ssh://rails@yourhostname/data/site/myrailssite.

Then, finally git push production master, and you are off.  On any future change the push to production will roll the git tree to the newest revision on master, kick off the migrations, and trigger a passenger restart.  This is a really handy pattern for making life really easy for deployment, and I’m rolling this through all my project sites as I slowly convert them from mercurial to git.
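
The pattern can be tried out locally before touching a real server. The sketch below uses two scratch repositories on one machine in place of the ssh remote, and leaves rake db:migrate out of the hook so it runs without a rails app. One assumption to flag: Git 1.7.0 and later refuse a push to the checked out branch of a non-bare repository by default, so the sketch sets receive.denyCurrentBranch to ignore on the production side.

```shell
top=$(mktemp -d)

# Production side: a non-bare repository whose work tree is the deployed site.
git init -q "$top/prod"
cd "$top/prod"
git symbolic-ref HEAD refs/heads/master
# Allow pushes to the checked out branch (refused by default since Git 1.7.0).
git config receive.denyCurrentBranch ignore
mkdir -p .git/hooks
cat > .git/hooks/post-receive <<'EOF'
#!/bin/sh
# Hooks run inside .git, so strip that suffix to find the work tree.
DIR=`pwd | sed 's/\.git$//'`
cd "$DIR" && git --git-dir="$DIR/.git" --work-tree="$DIR" reset --hard &&
    mkdir -p tmp && touch tmp/restart.txt
EOF
chmod 755 .git/hooks/post-receive

# Source side: commit something and push it to "production".
git init -q "$top/src"
cd "$top/src"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > app.txt
git add app.txt
git commit -q -m "first deploy"
git remote add production "$top/prod"
git push -q production master

# The pushed file and the restart marker should both be in the work tree.
ls "$top/prod"
```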

In praise of github

A few years ago I became sold on distributed source control.  Being able to do offline work, try out new ideas cheaply, and throw them away, all were great things.  I started with mercurial, but over the summer started using git.  A couple of things pushed me over the edge.

  • git appeared more modular, though at the end of the day this wasn’t really true.  The lack of a libgit was actually very disappointing (especially after I had sworn there was one), as I’ve got a number of interesting ideas stalled behind that one.
  • the git-svn plugin, which provides really good 2 way integration between svn and git trees.  I’ve stopped making anon svn clones, I now do a git-svn clone.  If I want to fix something locally, I can now version that fix.
  • github – free social hosting of git trees

Github helps you over the hump in publicly hosting git trees.  Honestly, the hump isn’t very high, but the documentation out there could be a bit more straightforward.  I’d been chugging along using github for all my random open source projects, some that are active, some which are stalled.  But the source code is out there for others to take a look at.  Github provides nice instructions for people to clone the work, and run with it.  It’s definitely a prettier interface.

Github really started to shine for me this past weekend though.  I was looking for ical generation code for ruby to replace an email tool that I wrote in perl for our MHVLUG monthly meeting emails.  There exist 2 ruby ical projects, vpim and icalendar, neither of which supports timezones in the ical generation, and both of which have pretty inactive mailing lists.  Once it became clear that the problem was not solved, I decided to dig in and see if I could come up with something workable.

But once you go social, github really shines

There had been a post on the icalendar devel list a few months back in which someone said he had fixed a couple of timezone issues and provided a github url.  I cloned that project, and realized that while it got closer to what I needed, it still didn’t quite do it.  So I clicked the fork button.

I was now given my own fork of the icalendar source.  But more importantly, it also showed me all the other forks on github, of which there were 5.  I made my fixes, pushed them back public, and then proceeded to start to accumulate some of the other changes out there.  There is even a fork queue which shows all the outstanding changes in other forks out there, as well as odds on whether or not the patches will apply.

While you could figure all this out on your own with the command line, that kind of discovery and view is really a help and a timesaver.

And it’s even better if you are doing ruby

Github is written in ruby, though I’m not sure on the framework behind it.  As an added bonus to people hosting ruby code on the site, the team built a gem build service into github.  You add a specially formatted gem spec file to your github tree, and you’ll get a gem built on each checkin.  My 2 ruby libraries that are there now are configured to build gems, easy for all to install.

If you haven’t checked out git, or github, you should.  While I found the learning curve on git to be higher than I really wanted to deal with, the community is very active, and the number of things that support git now is quite high.  Rails generators even support git now, automatically source managing via git or svn if you ask them to.  Github popped out of nowhere in 2008, and I can’t wait to see where they are going to go in 2009.