Wednesday, October 26, 2011

Branching in TFS 2010: Part I (Branching Theory)

NOTE: Part I of this article covers the theory behind branching and design patterns associated with it. Future parts will cover the specifics of how to implement this in TFS 2010.

The topic of branching is something most development teams tend to shy away from. It sounds scary and unknown. But it can also be quite useful, when treated with respect. As Eric Sink put it on his excellent “Source Control HOWTO”, “…you need to develop just the right amount of fear of branching.  This delicate balance seems to be very difficult to find.  Most people either have too much fear or not enough.” Eric is absolutely correct.

So what exactly is branching? Branching is taking a snapshot of your code and isolating it for a specific purpose. It essentially creates a “parallel universe” of your code. Think of the main universe as the one we live in and this alternate universe as a child universe that spawned off the main one (keep this in mind, because this parent-child relationship will become important later). From the point in time where the code branches off, that child universe will evolve differently from the main universe.

A simple parent-child branch


In short, branching is for isolation. If you’ve ever found yourself calling for a code freeze, you need branching. If you ever wanted to know what code was included in a release, you need branching. If you want to avoid releasing code that isn't ready, you need branching. If you want to allow developers to have a playground to experiment in without corrupting the real code, you need branching. If you've ever found yourself putting everything on hold to take on a massive refactoring effort, you need branching. Convinced yet? :)

Before we go further, here are some useful terms used in branching:
  • Main Branch – also called a trunk or mainline branch. This is the primary branch that all changes are eventually merged into. In continuous integration environments, continuous builds should usually run against this branch.
  • Development Branch – this is the branch that contains all active development. In many branching strategies, this can be one and the same as the main branch.
  • Release Branches – each release of the product should create a new branch, as a child branch of the main branch, in order to isolate what’s in that release.
  • Forward Integration – the act of merging changes from a parent to a child branch. For example, if you had a branch for a specific feature, merging the latest from the main branch into that feature branch would be of forward integration.
Forward Integration
  • Reverse Integration – the act of merging changes from a child to a parent branch. For example, merging the changes in a feature branch back to the main branch would be reverse integration.
Reverse Integration
  • Baseless Merge – the act of merging changes from one branch to another where the two branches are not in a parent-child relationship. An example would be merging changes from one feature branch into another feature branch. In general practice, it isn’t a good idea to use baseless merges.
Baseless Merge
Branching can be tricky and it does require some skill. Here are some tips to hopefully make it easier on you:

Five Keys to Making Branching Successful
  1. Branching requires discipline. Stick to the correct process, even if you think something will be “quick” or “easy”.
  2. Whenever possible, avoid branches off of other branches (cascading branches).
  3. Merge with forward integration and reverse integration often. The more often you do it, the less problems you will have.
  4. Do not confuse branching/merging with shelving. Shelving doesn’t create a branch, it is for temporary storage for work that you haven’t finished yet but needs to be backed up in TFS. Shelve when appropriate. Branch when appropriate.
  5. Never have code freezes! Create a branch instead.

Seven Behaviors That Indicate Something is Wrong
  1. Merging is put off to the end, so it becomes a nightmare to do properly.
  2. Merging happens too often, so much so that it gets in the way of actual development.
  3. Merging never happens, usually because people are too afraid of the consequences.
  4. The purpose of each branch isn't clear, or nobody remembers why a branch was created.
  5. You find yourself having to freeze development efforts.
  6. Branches are used to isolate developers instead of code changes.
  7. A change is merged to an older release.
Common Branching Patterns
  • Stairstep Release Branching - there are two different variations to this pattern. One is the Stairstep Release, which is the process of working on a single release at a time, and when that release goes into test, a new branch is created for the next release. You continue to "stairstep" down like that from release to release. This is the simplest pattern and requires the least amount of branching. It is also the least flexible pattern, and can require multiple test and build environments, and makes it very difficult to do concurrent release development.
Staircase Release Branching
  • Mainline Release Branching - this pattern consists of a single main branch that continues indefinitely, and each release branches off of it as development starts on each one. This version makes it easier to do concurrent release development, but at a cost. The branching with this pattern can be quite complex and integrating hotfixes can sometimes be a problem. It can also require multiple test environments.
Mainline Release Branching
  • Quality/Environment Branching - this pattern has, usually, three separate branches named something like Development, Test, and Production. Changes are essentially "promoted" from the Development branch into the Test branch, and eventually to the Production branch. This pattern is much more flexible than Release Branching, but can also get very complicated when bugs are discovered in production. For large projects, this pattern may require a dedicated person to manage the branching and merging process.
Quality Branching
  • Feature Branching - this pattern involves creating a separate branch for each major feature. I say "major" because you don't want to do this for correcting a typo on a web page, for example. Ideally, each feature would be completely independent of the other features. This pattern is the most flexible in terms of choosing what code should be released. It can also get complicated and could also require a separate test environment per feature (however, this can be mitigated - we'll see how later).
Feature Branching
Hybrid patterns that combine different aspects of the above are also possible. For example, some teams combine Quality and Feature branching and allow each feature to move through Development, Test, and Production.

Agile and Continuous Integration
In theory, whether you use Agile methodologies, such as continuous integration, or not should have no bearing on branching and merging. However in practice, sometimes it can make a difference. For example, if you use feature branching, technically each feature branch would need its own build and test environment, in order to keep them isolated. Needless to say, this is overkill and completely unnecessary for most development teams.
In reality, once development of your feature branch is complete, you should merge the latest changes from the main branch into your feature branch (forward integration). Then run your unit tests locally, as these should not hit a database. Once you are satisfied that everything is working, merge your feature branch back to the main  branch (reverse integration). At this point, your continuous build should run (because it should only be configured to run on the main branch, not each feature branch). If you have them, you can run integration tests at this point, too (which will hit the database, but only one is needed by this point). This process will keep your continuous build from running unnecessarily, and will also solve the aforementioned problem of needing a separate test environment when using the feature branching pattern.

Shared Code
We all learned in kindergarten that sharing is caring. That may be so, but it can come with its own set of problems. For example, how do you structure TFS to allow for shared code? There are many approaches to solving this problem, but in general trying to share the actual source code isn't worth the effort. Not only does it become complex, it is also less flexible. Instead, it's better to share the compiled binary assemblies.

One way to do this is to keep shared code in a separate Team Project and Visual Studio project. Store the binary assembly that is the output of the project in TFS. Then, when another project needs to reference the assembly, branch the folder that the assembly is in into a "Lib" folder in the referencing project. This method will allow each referencing project a choice in when to upgrade to a later version of the shared assembly. Requested updates to the assembly can then be done through forward integration from the main branch of the shared project.

That's all for Part I, which should give you a working knowledge of the basic principles behind branching and merging. Next time, we'll discuss which solution our team uses (hint: it's a hybrid pattern) and how to set the whole thing up in TFS. Go to Part II here.

Here is a slideshow that covers the information in this article:

No comments:

Post a Comment