Sunday, May 30, 2010

Planning Out Unit Tests

Test-driven development is the practice of writing unit tests before writing the code they exercise. It's very helpful in knowing when to stop writing code (when all the tests pass), so you don't create unnecessary code bloat. TDD is a common practice in Agile and Extreme Programming methodologies.

I'm a fan of test-driven development (TDD) about 99% of the time. The 1% where I don't like it is when I'm writing code that I'm not yet sure how to write. When you're prototyping, it's hard to write unit tests ahead of time, because you're not sure how the code is supposed to work or even what it's supposed to do yet. But this post is about the 99% of the time when it is worthwhile (perhaps I'll talk about the other 1% in more detail at a later time).

So for the 99% of the time that it is beneficial to use TDD, how do you know how many unit tests to write and what to test? A web app I’m currently working on is used to process applications for licenses through a large state agency. When it came time to write a unit test for the application approval method, ApproveApplication(), we sat down and came up with all of the variables that could affect the approval process. A few mathematical calculations later and we discovered that there were roughly 3.2 billion combinations of the variables. Clearly, we were not going to write 3.2 billion unit tests for a single method. This is the first rule to remember when writing unit tests – you do not need to test every possible combination of variables.
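
To see how quickly the count balloons, here is a minimal sketch of the arithmetic. The option counts in Example() are hypothetical, not the real inputs to ApproveApplication(); the only point is that the total is the product of the number of values each variable can take.

```csharp
using System;
using System.Collections.Generic;

public static class CombinationCounter
{
    // Total combinations = product of the number of possible values per variable.
    public static long Count(IEnumerable<int> optionsPerVariable)
    {
        long total = 1;
        foreach (int options in optionsPerVariable)
        {
            // checked() throws on overflow instead of silently wrapping around.
            total = checked(total * options);
        }
        return total;
    }

    // Hypothetical inputs: 10 yes/no flags, 3 variables with 5 values each,
    // and 1 variable with 12 values.
    public static long Example()
    {
        var options = new List<int>();
        for (int i = 0; i < 10; i++) options.Add(2);
        for (int i = 0; i < 3; i++) options.Add(5);
        options.Add(12);
        return Count(options); // 1024 * 125 * 12 = 1,536,000
    }
}
```

Even this small, made-up set of variables produces over a million combinations, which is why exhaustive testing is off the table almost immediately.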

We went into a conference room, wrote out all of the most likely use cases, and came up with 523 tests. If a number like that fits your scope and schedule, then you can stop reading here and go write those tests! But for us, 523 tests for a single approval method was not acceptable, because our business domain consists of 11 different applications, so we were really talking about 5,753 unit tests. We decided to see if any of these tests could be combined to reduce the total number even further. At this point, I bet one of my coworkers, Christine Lambden, that we couldn't get the number down to 80 tests or fewer. She disagreed. But more on this later…

When trying to combine unit tests, avoid combining two tests into one if you just end up with a single test that is twice as long. This won’t gain you anything except a meaningless reduction in the number of tests. The second rule is: do not combine tests unless it gains you something meaningful.

First we looked at combining tests that were testing the same business cases, or where business cases overlapped. This got us down to roughly 140 tests, give or take a few. It also makes use of the third rule: combine tests with overlapping or duplicated use cases.

The next step was to combine tests that exercised the same section of code. This is the fourth rule: since the same piece of code doesn’t need to be tested more than once, combine tests that exercise the same piece of code.
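
As a rough sketch of what combining tests for the same piece of code can look like, the example below folds three inputs into one table-driven test. ApprovalRules, MinimumAge, and MeetsAgeRequirement are hypothetical names made up for illustration; they are not part of our actual system.

```csharp
using System;

// Hypothetical rule, used only for illustration.
public static class ApprovalRules
{
    public const int MinimumAge = 18;

    public static bool MeetsAgeRequirement(int age)
    {
        return age >= MinimumAge;
    }
}

public static class ApprovalRulesTests
{
    // One table-driven test covers the single branch in MeetsAgeRequirement,
    // instead of a separate test method per input.
    public static void MeetsAgeRequirement_CoversBothSidesOfTheBoundary()
    {
        var cases = new[]
        {
            new { Age = 17, Expected = false },
            new { Age = 18, Expected = true },
            new { Age = 19, Expected = true },
        };

        foreach (var c in cases)
        {
            bool actual = ApprovalRules.MeetsAgeRequirement(c.Age);
            if (actual != c.Expected)
            {
                throw new Exception(
                    string.Format("Age {0}: expected {1} but got {2}", c.Age, c.Expected, actual));
            }
        }
    }
}
```

The combined test is not simply three tests pasted together: the table keeps each case to one line, so adding a case costs almost nothing. In NUnit, the same table could likely be expressed with the [TestCase] attribute on a single test method.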

The final step was to improve the performance of the unit tests. In our particular case, the setup for each test is very time-consuming, so combining tests where we could saved us a considerable amount of time on every test run. This is the fifth rule: combine tests if doing so will significantly improve performance.
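
Here is a minimal sketch of that idea, assuming a slow setup step that two checks can share. ApplicationFixture and its fields are hypothetical stand-ins, and the assignment to Status merely stands in for a real call to ApproveApplication().

```csharp
using System;

// Hypothetical stand-in for an application whose test setup is slow to build.
public sealed class ApplicationFixture
{
    // Counts how many times the (pretend) expensive setup has run.
    public static int BuildCount;

    public string Status = "Pending";
    public string ApplicantName = "Test Applicant";

    public ApplicationFixture()
    {
        BuildCount++;
    }
}

public static class CombinedApprovalTest
{
    // One test, one setup: both checks depend on the same approved application,
    // so splitting them into two tests would pay for the slow setup twice.
    public static void Approve_SetsStatusAndKeepsApplicant()
    {
        var application = new ApplicationFixture();
        application.Status = "Approved"; // stand-in for calling ApproveApplication()

        Check(application.Status == "Approved", "status should be Approved");
        Check(application.ApplicantName == "Test Applicant", "applicant should be unchanged");
    }

    private static void Check(bool condition, string message)
    {
        if (!condition)
        {
            throw new Exception(message);
        }
    }
}
```

The trade-off is that a failure in the first check hides the result of the second, so this kind of combining is only worth it when the setup cost really is significant.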

In summary, my rules for planning out unit tests, in order of importance, are:
  1. Do not test every combination of variables.
  2. Combine tests when you can, but not unless it gains you something meaningful.
  3. Combine or eliminate tests with overlapping or duplicated business use cases.
  4. Combine or eliminate tests that exercise the same piece of code.
  5. Combine tests if it will significantly improve the performance of unit test execution.

Oh, and about that bet…our final tally was 43 tests, so I definitely owe my coworker lunch. That should be Rule #6: Never bet against coworkers with more real-world experience than yourself. That's okay, though, because whenever I speak with her, I usually end up learning something new, so I don't think I really lost at all. :)

Friday, May 28, 2010

Posting Source Code to Wordpress.com

I’ve been less than pleased with Wordpress.com’s facilities for posting source code. The built-in shortcode sourcecode isn’t too bad, but it’s hard to control through custom stylesheets. For me, I found the font size too small and hard to read. Also, it doesn’t always highlight the syntax the way I want. So here are a couple of solutions that worked for me.

CopySourceAsHtml – this is an add-in for Visual Studio 2005, 2008, and 2010. It lets you copy code from Visual Studio to your clipboard as HTML. This is nice and easy, but only if your code is coming from Visual Studio.

Windows Live Writer – this isn’t strictly for posting code, but an all-purpose blog editor that works with a lot of different blogging systems (including Wordpress.com). It’s much easier to use than the built-in Wordpress.com editor. You can get it here. It also accepts plug-ins, like the Source Code Formatter.

WLW Plug-in: Source Code Formatter – a plug-in for Windows Live Writer. It lets you customize the style, such as font, color, box outline, and alternating shading. You can also mark selected lines to be highlighted. It’s reliable, easy to use, and doesn’t depend on Visual Studio.

Both of these solutions generate <pre> tags, but it’s possible that your Wordpress.com theme messes with that tag and thus makes your code unreadable. If that’s the case, you can simply append the following to your CSS style sheet to “reset” your <pre> tag style. The exact style you set it to is not that important, because the source code HTML you paste in should override what it needs to. It just needs to “undo” anything your theme’s CSS might be doing. Note that you will need the CSS Upgrade in order to do this.
pre {
  background: #ffffff;
  border: none;
  font-family: consolas, "Courier New", courier, monospace;
  font-size: 12px;
  margin: 0;
  padding: 0;
}

Wednesday, May 26, 2010

Resume Tips

I spent yesterday and today looking through close to a hundred resumes in order to fill 3-4 developer positions. I'm not a big fan of throwing out a resume on a technicality, like a misspelled word, so I actually read through every single one of them (with help). Part way through this ordeal, I came to a conclusion: If I'm going to show people the courtesy of reading through every resume, then I expect them to respect my time and effort. So here are some tips to help people do just that:

1. Keep it short. I lost count of how many resumes had 12-16 pages, and these were for developers with 7-8 years of experience. There's no reason a resume should be more than 2-3 pages, max.

2. Do not list every inane detail of what you did; just give an overview. For example, leave out items like these:

  • Sent a status report to my manager to update him on my status
  • Called methods using C#
  • Compiled code


(these are all real examples, unfortunately)

3. Don't capitalize words randomly. I guess this is a weak effort to emphasize key points:

  • wrote stored Procedures in order to optimize code Performance
  • wrote Business Objects in c++


4. Use proper spelling and grammar. I can overlook a single misspelled word, but I saw literally dozens of resumes that each had at least 4-5 mistakes. Make sure your sentences make sense, are complete, and use consistent verb tenses. Failing to do this shows me that you're not a detail-oriented person, and therefore not someone I want to hire to write code.

and most of all,

5. DO NOT LIE! And if you do, at the very least do it well. Everyone at one point or another has probably done some harmless "resume padding", but outright lying is not acceptable. Especially if you reveal your own lie. A requirement for the job we're hiring for is at least one year of experience with NUnit. One resume put "3 years of experience with NUnit" in the summary section, but later in the detailed experience for one of the jobs, wrote:
I used NUnit for the first time in this position, which gave me an opportunity to learn something new.

The problem was that the position had only started 4 months earlier. So either this person is lying or they can't do simple math. Either way, I don't want to hire them.

Monday, May 24, 2010

Setting an EntityRef in LINQ to SQL

I was debugging a unit test today and came across a behavior in LINQ to SQL that I thought would be worth reviewing. It has to do with setting the EntityRef of a LINQ to SQL object in memory. According to MSDN, EntityRef is a structure that:
Provides for deferred loading and relationship maintenance for the singleton side of a one-to-many relationship in a LINQ to SQL application. 

In the system we're currently developing, most database records have temporary copies that we refer to as shadow records. A shadow record is a temporary copy of a database record that is used for editing until it is "approved", at which point it gets copied to the real record. Shadow records are identical to the actual record, but with a different primary key. They are stored in the same database table as the real records. Those tables have a foreign key relationship to themselves from a shadow source column to the primary key column.

For example, one table is Address. In addition to all the columns that represent the address information, there is the AddressId column and a ShadowSourceId column. The ShadowSourceId is a foreign key to the AddressId in the same table. Rows with a null value for ShadowSourceId are approved records, while rows with a value for ShadowSourceId are shadow records.
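
To make the convention concrete, here is a small sketch using plain in-memory objects and LINQ to Objects. The Address class below is a simplified, hypothetical stand-in for the table (it is not the generated LINQ to SQL entity), and the Street property and the ShadowRecords helper are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for a row in the Address table.
public class Address
{
    public Guid AddressId { get; set; }
    public Guid? ShadowSourceId { get; set; } // null means an approved record
    public string Street { get; set; }
}

public static class ShadowRecords
{
    // Approved records have no shadow source.
    public static IEnumerable<Address> Approved(IEnumerable<Address> rows)
    {
        return rows.Where(a => a.ShadowSourceId == null);
    }

    // Shadow records point back at the approved record they were copied from.
    public static IEnumerable<Address> ShadowsOf(IEnumerable<Address> rows, Guid addressId)
    {
        return rows.Where(a => a.ShadowSourceId == addressId);
    }
}
```

Given a list containing one approved address and one shadow copy of it, Approved() would return only the first, and ShadowsOf() with the approved record's AddressId would return only the second.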

In LINQ, we create a shadow record by first duplicating the original record:
Address duplicateAddress = DuplicateAddress(originalAddress);

Then we modify the duplicate to turn it into a shadow record:
// Give the copy its own primary key, then point it back at the original by ID.
duplicateAddress.AddressId = Guid.NewGuid();
duplicateAddress.ShadowSourceId = originalAddress.AddressId;

In the DBML designer, I named the EntityRef as ShadowSource, so that writing duplicateAddress.ShadowSource will give you originalAddress. This is where the tricky behavior comes in. In the lines above, I set the ShadowSourceId directly. At this point, if you try to access the ShadowSource EntityRef, you will get a null value (even though ShadowSourceId has been set). Instead, if I were to create the shadow record as follows:
// Same as before, but set the EntityRef instead of the ID.
duplicateAddress.AddressId = Guid.NewGuid();
duplicateAddress.ShadowSource = originalAddress;

then the ShadowSource EntityRef will have the expected value. However, now the ShadowSourceId property will contain a null.

The reason for this is that the ID property and the EntityRef, even though they reference the same thing, are not "wired up" by LINQ to SQL until you call SubmitChanges(). Whichever one you set, the other will still be null until you submit your changes and LINQ to SQL creates all the proper connections. If for some reason you don't want to submit your changes, you will have to set both the ID property and the EntityRef for them to have their proper values:
// Set both the EntityRef and the ID so each is available before SubmitChanges().
duplicateAddress.AddressId = Guid.NewGuid();
duplicateAddress.ShadowSource = originalAddress;
duplicateAddress.ShadowSourceId = originalAddress.AddressId;

While this seems redundant, sometimes it's necessary. In my situation, I didn't need to submit my changes because this was a unit test for a small piece of the process. If you need both of these to be available, then you'll need to decide which solution is the most appropriate in your situation - submitting your changes or simply setting both manually.

Sunday, May 23, 2010

Welcome to Code Connection!

Welcome to the Code Connection! In this blog, I will focus on practical applications of programming, as opposed to more theoretical topics. My focus will be primarily on C# and .NET, but other topics and technologies may come up from time to time. Please don't hesitate to share your comments and suggestions - I look forward to them!