Wednesday, September 26

Agile: Branches vs. Streams

Today I watched a Subversion webinar where the presenter from a popular hosting/services company described three general branching strategies: unstable trunk, stable trunk, and agile. After seeing their "solution" for managing agile development with branches, I thought, "I agree with the goal, but what a complete merge mess using branches! Streams are significantly more natural and just plain easier." I'll explain by comparing to AccuRev's implementation of streams...

"Agile" Branching

Quick review... We've all used the unstable trunk pattern. You know... everyone commits to trunk and its stability is not guaranteed; it only works well for a small team. The more pragmatic stable trunk pattern attempts to isolate projects/fixes on branches and to merge only completed, stable changes into the trunk. The advantage here is that new projects or release lines are guaranteed to start from a stable trunk configuration. Still, without locking trunk to fully test the merges, there is a window of potential instability, especially for those nasty runtime bugs.

Now to the point... Their agile branch pattern uses a branch-per-task strategy where tasks are eventually merged into target release branches. At various milestones, the release branches are merged into both trunk and ongoing task branches to keep them up-to-date. I've added the picture that was used during the presentation (though I added the red 'merge' wording). See where I'm going with this? Notice how many merges are present for a trivial 4-task, single major/minor release scenario. I've used this exact type of pattern in a 250+ enterprise web development group with 40-50 parallel tasks contributed by local and remote teams on a two-week release cycle -- it becomes completely and utterly unmanageable, especially when you have to consider the security concerns of who has visibility and control of branches and merge targets. It's a nightmare at best, even with branch naming conventions, and it is exactly how most of the file/branch-based SCM systems will work, fancy graphics or not.

Agile Streaming

Quick review... a stream represents a single configuration of source code. So you might have an "integration" stream, a "Tuesday night" build stream, or a "3.0" official release stream. Any project will have a 'tree' of streams describing mainline development, previous releases, and maintenance work. The trick with streams is that they have a unique property: they automatically inherit changes from their parent. But it doesn't stop there. Any newer versions of files along the entire parent path of streams are inherited. If you're familiar with the OO programming model, it works very similarly -- in the same way that a new method added to a superclass is automatically visible to all subclasses, newer versions of files and directories in parent streams are visible to all child streams.
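
To make the inheritance idea concrete, here is a toy sketch in Python. It is not how AccuRev is implemented; the Stream class, its keep and lookup methods, and the stream and file names are all made up for illustration. It assumes a stream shows its own version of a file if it has one, and otherwise shows whatever its nearest ancestor has.

```python
from typing import Dict, Optional


class Stream:
    """Toy stream (hypothetical model): a named configuration with a parent."""

    def __init__(self, name: str, parent: Optional["Stream"] = None) -> None:
        self.name = name
        self.parent = parent
        # Versions recorded directly in this stream, keyed by file path.
        self.local: Dict[str, int] = {}

    def keep(self, path: str, version: int) -> None:
        """Record a version of a file in this stream."""
        self.local[path] = version

    def lookup(self, path: str) -> Optional[int]:
        """Return the version visible here, walking up the parent path."""
        stream: Optional[Stream] = self
        while stream is not None:
            if path in stream.local:
                return stream.local[path]
            stream = stream.parent
        return None


# Hypothetical three-level tree: mainline -> integration -> task_1
mainline = Stream("mainline")
integration = Stream("integration", parent=mainline)
task = Stream("task_1", parent=integration)

mainline.keep("README.txt", 1)
integration.keep("build.xml", 4)

print(task.lookup("README.txt"))  # 1, inherited from two levels up
print(task.lookup("build.xml"))   # 4, inherited from the direct parent

# A newer version lands in the parent; the child sees it with no merge.
integration.keep("build.xml", 5)
print(task.lookup("build.xml"))   # 5
```

Nothing was copied into task_1 at any point. Like a method lookup on a subclass, the answer is computed by walking up the tree at the moment you ask.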

Now to the point... unlike using branches, streams don't require massive merging all over the place. Why? Built-in inheritance. Let's say you have 4 tasks as streams all working off of a mainline Integration stream (see pic of AccuRev stream browser client). If you promote a single task (i.e. bunch-o-files) to Integration, the other 3 task streams -automatically- have visibility to the newer versions! This allows you to merge-early, merge-often, not by manual error-prone practice but accurately and predictably by stream technology. Translated to branch-speak, only a single merge is required to give every other task complete visibility to the newer versions. Furthermore, this example shows only 4 tasks. Let's say you have 40 or 400 concurrent tasks -- you still only need a single promote of a given task to have it automatically delivered to every other task in progress.
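
A rough back-of-the-envelope count shows how the two models scale. The function names below are made up, and the branch count rests on my own assumption about the pattern described above: delivering one finished task costs one merge into the release branch, one merge from release to trunk, and one merge from release into each of the other ongoing task branches. Real projects will batch some of these, so treat it as a sketch rather than a measurement.

```python
def branch_merges(tasks: int) -> int:
    """Merges to make one finished task visible everywhere (branch-per-task).

    Assumed breakdown: task -> release, release -> trunk, and
    release -> each of the other ongoing task branches.
    """
    other_tasks = tasks - 1
    return 1 + 1 + other_tasks


def stream_promotes(tasks: int) -> int:
    """Promotes needed with streams: one, however many tasks are in flight."""
    return 1


for n in (4, 40, 400):
    print(f"{n} tasks: {branch_merges(n)} merges vs {stream_promotes(n)} promote")
# 4 tasks: 5 merges vs 1 promote
# 40 tasks: 41 merges vs 1 promote
# 400 tasks: 401 merges vs 1 promote
```

The branch number grows with every task in flight, while the stream number stays flat because the other task streams pick up the change through inheritance.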

In summary... Comparing the two pictures, you're probably saying, "How can it be that simple?" Well... this is what a contemporary stream-based architecture gives you. Gone are the days of merge here, merge there... oops, we forgot to merge way over there. In addition, we-the-workhorse-developers don't have to be SCM tool merge experts struggling to determine which of our 9 branches needs to be merged into the release candidate branch on Friday night. If your task is on the mainline stream path, you are guaranteed to be up-to-date with anything you need. No more guesswork! Finally, the best part about inheritance is that if you have a long-lived task, you can simply stay put and automatic inheritance will implicitly keep you up-to-date with the rest of the world!


tea41 said...

Interesting...I'm starting to grok streams. Keep the examples coming and you might win a convert

Mike said...

Good article. Once I 'got' streams I was ruined for all the previous models - particularly 'multi branch merge mess model.'

Kevin said...

My big concern with this method: can a developer control if (and when) his development streams get auto-updated?

This method of auto-rebasing (to use ClearCase terminology) contains the assumption that the code delivered to the integration stream has been adequately tested prior to delivery. While this is not an unreasonable assumption, it's virtually guaranteed that at some point (due to human nature) a developer is going to deliver code that is not stable. With auto-rebasing, this problem would be delivered to all of the offspring streams, resulting in a borking of each developer's development streams.

After having developed in an environment like this, I really appreciate the value of a development stream in which I can make and test all of my changes while ignoring (for the most part) other deliveries to the integration stream. Of course you need to rebase prior to the final delivery, but I'd rather suffer the cost of a single controlled (though larger) rebase than incremental and unexpected rebases along the way.

fepus said...

Thanks for the comments, Kevin. A developer absolutely can control when the update of changes occurs. First off, every developer has a private workspace. This means that you can continue to keep (commit) your changes, all without updating. Furthermore, you perform the update only when ready (i.e. after running a preview). Unlike working with branches, you can continue to commit without updating even if conflicts exist. Huge benefit.

Second, streams in AccuRev act like dynamic views in ClearCase. One of the features of the streams is that you can set a timerule to control or throttle the inheritance of changes. So if your project stream is a child of Integration, you may want to set a timerule on your project stream so you don't automatically get latest-n-greatest from Integration until the sanity tests have completed and blessed the Integration build. Then, you can unset your timerule and get everything from Integration.

Peter Kahn said...

Great post. Thanks for clearly describing what AccuRev offers.

- How well does the automatic propagation work when refactoring?

- Have you heard of how an agile branch/stream model has been used in git or bazaar?