Friday, February 23

SCM Video Humor

I recently did a YouTube search for the popular SCM tools on the market. Results presented In alphabetical order.

AccuRev
- much easier to install than clearcase.
- with a single user interface.
- and flexible stream-based software development models.
- this one is just neurotic.
- no wrappers or scripts here.
- and especially developer friendly.

Bitkeeper
- no match. Apparently all related videos are in the attic.

ClearCase
- better than CVS but requires a super hero to use.
- unrelated, ironic video on not being able to escape.

CVS
- no match. Unrelated video about extracting an alien from a human womb.

Perforce
- no match. Unrelated "talent" show with a singer wearing shades to hide his identity.

Subversion
- no match. Unrelated video game. Great for up to 4 players.

In SCM We Trust

Consider the following SCM tool-agnostic sequence of events:


  1. codefreeze.

  2. fix, fix. label RC1 --> build. package. deploy. test.

  3. fix, fix. label RC2 --> build. package. deploy. test.

  4. fix, fix. label RC3 --> build. package. deploy. test. Verified!

Now what? Do you....

[A] Declare RC3 the official release label and "copy" the verified package to production? The actual verified code is in production but how does someone really know from SCM that RC3 is in production? maybe RC3 was rolled back and RC2 is in production. How does someone know that RC4 isn't being worked on?

[B] Subsequently create a new release label with an official "event describing" name like "Release_X.Y" then rebuild, repackage and deploy to production? The new label accurately describes the official release and demarcates the end of release candidates but the rebuild and repackage were not verified. Do you trust your build system? (topic for a separate post)

[C] Deploy the verified code but subsequently create a new "event describing" label like "Release_X.Y" and trust that the configuration for the new label is identical to that of deployed RC3 configuration? Developers may fix bugs using the new release label but technically it wasn't used to generate the actual production code.

I advise doing [C]. Though, it helps to have a transaction-based SCM system (like AccuRev) where the label operation simply marks the transaction # rather than each-and-every file in the configuration - afterall, tainted files are untested files. This way, as long as two labels refer to the same transaction, they will refer to the same configuration.

Thursday, February 22

Vendor Code Management.... with Streams

Lets say you've just obtained someone else's source code.

Maybe you've obtained this code from an opensource project, a mutual business partner, or a department in your own company. The source code in-hand is nifty indeed and you have mentally crafted some grand new features you'd like to add -- after all, you have the source! But there's one small problem....you don't "own" the code. The original authors continue to develop their new features and bugfixes and have a future roadmap that plans to offer subsequent major, minor, and patch releases.

We can call that "someone else" the "vendor" and their source code the "vendor code." It's that simple.

So vendor code management deals with tracking your changes to someone else's code, tracking their new releases, and tracking the merges between your customizations with theirs. In more technical terms, we'll use the following definition.

Vendor Code Management -- The process of tracking and propagating custom changes to external, evolving 3rd party codelines.

Managing all these moving parts can be tricky indeed unless you have the right tools.

Old School. The solution used by traditional branch-based SCM systems is to use yet-another-branch called a "vendor branch" [clearcase, perforce, subversion, cvs]. The vendor branch was nothing more than a branch off of mainline (typically) where you committed raw vendor code and had the wildly laborious and error prone task of committing new vendor code releases and merging their changes, with your changes, all on... other branches. Without private workspaces you absolutely need a new branch in order to test fragile merges in isolation and commit them for savepoints in event a partial rollback is required. And then at some point merging all your feature branches to mainline, cutting a release candidate branch (branch late, right!) and procuring a stable release. Oh, and then merge the release candidate branch to mainline to share last minute bugfixes. Did you save the whiteboard diagram of all your merges? What diagram? exactly.

Enter Streams-based SCM.
We all read books and articles and continue to preach about software development best practices such as continuous integration, private workspaces, reproducible configurations, promotion levels, named stable baselines, etc. A stream-based SCM system (like AccuRev) introduces a fresh new paradigm for managing the development and maturation of software changes. With a promotion-based workflow model, best practices can be implemented simply using the single, basic building block -- streams. Lets say we wanted to solve a problem like... oh, say managing vendor code...
The adjacent picture is a labelled screenshot of the AccuRev StreamBrowser and shows 3 primary stream motifs that can be applied to achieve an intuitive development model. First, with vendor code rooted in the repository, snapshots that represent successive, raw vendor code imports serve as roots for custom, named stable base development. Second, Each codeline can adopt its own promotion-based workflow -- in this case, both visible codelines employ Integration -> QA -> Release. Third, each custom release (represented as snapshots) can further serve as new roots of development for release-specific patch or feature development. Creating a tree of streams helps tell a story -- some stories are told from the top to bottom, while others start at snapshots representing new roots of development.

Thankfully, a whitepaper has been written all about the subject thus keeping this blog post to a few paragraphs. It talks in gory detail about the pitfalls of the branch-based solution and the benefits the Stream-based solution. Do you have further questions about merging custom changes across codelines using change packages? It's in the whitepaper ;) I'd be interested to hear from folks who currently use a traditional branch-based solution and want to know more about Streams. I've used the old branch-based technique myself in a past life tracking custom, business-rule changes to Bugzilla -- oh the dreaded horror. If you have questions about the blog post or the whitepaper, feel free to drop me a line!