Extra Cheese

A Blog


How I Started TDD

Nov 05, 2009

This story is about the first code I ever wrote with proper TDD. I'd been doing test-first for several months, but I didn't understand the design aspect. Fortunately, Corey Haines wanted to learn Python, and I wanted to learn TDD, so we paired up at a Coding Dojo. It went something like this. [1]

Corey: Let's write a test.
def test_fib_of_0_is_0():
        assert fib(0) == 0
1 test failed; 0 tests passed.
Corey: Now let's make it pass.
Me: Well, we could iterate...
Corey: Why?
Me: Because it's Fibonacci...
Corey: The test says it returns zero!
Me: Oh. Well, OK.
def fib(n):
        return 0
1 test passed.
Corey: Let's write another test.
def test_fib_of_1_is_1():
        assert fib(1) == 1
1 test failed; 1 test passed.
Corey: Now let's make it pass.
Me: OK, we need to recursively...

I stop myself. I know what this got me last time.

Me: We can check for which input we got.
Corey: We don't even need that.
def fib(n):
        return n
2 tests passed.
Corey: Let's write another test.
def test_fib_of_2_is_1():
        assert fib(2) == 1
1 test failed; 2 tests passed.
Corey: Now let's make it pass.

I pause while I find the correct answer.

Me: Only the zero case is different.
def fib(n):
        if n == 0:
            return 0
        else:
            return 1
3 tests passed.

(I consider the implications of this. "Only the zero case is different." This is an inductive system, so it needs a basis case. Zero is only half of the basis case of a Fibonacci sequence, but I never had to think about a basis case or recursion to write this code. The tests showed me what the code needed to do.)

Corey: Let's write another test.
def test_fib_of_3_is_2():
        assert fib(3) == 2
1 test failed; 3 tests passed.
Me: Another if?
Corey: Another if.
def fib(n):
        if n == 0:
            return 0
        elif n < 3:
            return 1
        else:
            return 2
4 tests passed.
Corey: Refactor!
Me: I don't know...

My brain hurts for a moment.

def fib(n):
        if n < 2:
            return n
        else:
            return n - 1
4 tests passed.

The full basis case is in place and we don't even need recursion yet. I'm surprised by how many cases we've written without needing recursion or iteration.

Corey: Another test.
def test_fib_of_4_is_3():
        assert fib(4) == 3
5 tests passed.
Me: It passed without changes. Is that OK?
Corey: Another test!
def test_fib_of_5_is_5():
        assert fib(5) == 5
1 test failed; 5 tests passed.

I think I can handle this now.

def fib(n):
        if n < 2:
            return n
        elif n == 5:
            return 5
        else:
            return n - 1
6 tests passed.
Corey: Refactor!
Me: Combine them into... recursion?
Corey: Combine them into recursion.
def fib(n):
        if n <= 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)
6 tests passed.

This isn't a perfect example of TDD, but that's not the point. The first thing you need to understand is the rough process: write the smallest failing test you can; then write the smallest code to make it pass; then refactor without changing behavior.
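For reference, the end state of the session can be collected into one file. This is only a sketch: it assumes a pytest-style runner that discovers functions named `test_*`, which the story above does not actually specify.

```python
def fib(n):
    # Basis cases: fib(0) == 0 and fib(1) == 1.
    if n <= 1:
        return n
    # Every later term is the sum of the two terms before it.
    return fib(n - 1) + fib(n - 2)


# The six tests, in the order they were written during the session.
def test_fib_of_0_is_0():
    assert fib(0) == 0


def test_fib_of_1_is_1():
    assert fib(1) == 1


def test_fib_of_2_is_1():
    assert fib(2) == 1


def test_fib_of_3_is_2():
    assert fib(3) == 2


def test_fib_of_4_is_3():
    assert fib(4) == 3


def test_fib_of_5_is_5():
    assert fib(5) == 5
```

Each intermediate version of fib from the session can be pasted over the final one to see which of these tests it passes and which it fails.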

After getting this lesson from Corey, I went off and TDDed a couple thousand lines of code with almost no outside feedback. I was doing it very poorly, and often became frustrated, but in retrospect it was still the best code I'd ever written.

It takes years to learn how to do this well, and consistently, across a wide variety of situations. I've been doing it for two years, and I still have non-trivial problems, but I can almost always move forward confidently.

Building software without TDD was crushingly stressful, but I couldn't see it at the time. It was only shown to me when I started working one test at a time, one line of code at a time, with verification that the entire system is working in less than two seconds.

[1] In reality, the Coding Dojo probably went only vaguely like this, and this isn't even the problem we solved, but that's not the point. This is what the first true TDD session always looks like.


Showing 20 comments

Posted by Josh Walsh at Thu Nov 5 12:47:17 2009

This is a great fundamental look at TDD where the focus is on the process, not the problem.  So many people I see trying to learn TDD get stuck on the algorithm and lose the process.

The Corey Haines in your head is very Zen-like.  It's as if he speaks with the fewest number of words possible.  Those who know Corey... well....


Posted by Nícolas at Thu Nov 5 18:27:39 2009

Great example! I had heard about TDD, but nobody, not even Wikipedia, explained it to me so simply.

This post encourages me to buy a TDD book and start learning ASAP.

I'm a Ruby on Rails developer, but I haven't started with the TDD practice.

Thanks for the good reading, and motivation!

Nícolas Iensen


Posted by Vincent Manis at Thu Nov 5 18:49:51 2009

I too am a great believer in TDD. Unfortunately, this example could be misleading, as the result is a function that runs in exponential time. As with many outstanding agile practices, TDD is most effective when you use all the information you have. So I think you need a couple more steps, in the Dojo, where you look at the performance, are properly horrified, and then proceed to write an efficient linear time implementation, which magically passes all of the same tests!

I can't speak for the actual problem you guys solved, but I think you could dress up the fib example by adding those two steps to make this point quite straightforwardly.
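(Editorial sketch of the extra step this comment describes. It assumes memoization with the standard library's `functools.lru_cache`, which is a swapped-in technique and not something from the Dojo session. The recursive shape stays the same, but each fib(n) is computed only once, so the work grows linearly with n.)

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    # Same recursive definition as in the post; the cache stores every
    # result, so each n is computed at most once.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


# The six values checked by the tests in the post still hold.
assert [fib(n) for n in range(6)] == [0, 1, 1, 2, 3, 5]
```

This version still recurses, so very large inputs can hit Python's recursion limit; an iterative loop avoids that.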


Posted by Philip Schwarz at Thu Nov 5 19:05:34 2009

Great post.

From Appendix II of Kent Beck's Test Driven Development by Example, 'Fibonacci':

In answer to a question from one of the reviewers of this book, I posted a test-driven Fibonacci. Several reviewers commented that this example turned on the light about how TDD works. However, it is not long enough, nor does it demonstrate enough of TDD techniques, to replace the existing examples. If your lights are still dark after reading the main example, take a look here and see.

Like you, in his derivation of Fibonacci using TDD, Kent Beck works through test_fib_of_0_is_0, test_fib_of_1_is_1, and test_fib_of_2_is_1, at which point he has the same code as you:

def fib(n):
    if n == 0:
        return 0
    elif n < 3:
        return 1
    else:
        return 2

but the next test, test_fib_of_3_is_2, is his last, because that is when he goes 'recursive'. He refactors 'return 2' as follows:

return 1 + 1
return fib(n-1) + 1
return fib(n-1) + fib(n-2)

And ends up with

def fib(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib(n-1)+fib(n-2)

I found the transformation from 2 to fib(n-1) + fib(n-2) a bit hard to follow. I think I would have found it easier to do that kind of refactoring when looking at your code for test_fib_of_5_is_5:

def fib(n):
    if n < 2:
        return n
    elif n == 5:
        return 5
    else:
        return n - 1

I would then find it easier to refactor 'return 5' as follows:

return 3 + 2
return fib(n-1) + 2
return fib(n-1) + fib(n-2)
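
(Editorial sketch: the three small steps above written out as complete functions, so that each can be run against the same expected values. The names `fib_step1` to `fib_step3` and `EXPECTED` are hypothetical and exist only for illustration.)

```python
def fib_step1(n):
    if n < 2:
        return n
    elif n == 5:
        return 3 + 2  # 5 rewritten as a sum of the two previous terms
    else:
        return n - 1


def fib_step2(n):
    if n < 2:
        return n
    elif n == 5:
        return fib_step2(n - 1) + 2  # the 3 is fib(4)
    else:
        return n - 1


def fib_step3(n):
    if n < 2:
        return n
    elif n == 5:
        return fib_step3(n - 1) + fib_step3(n - 2)  # the 2 is fib(3)
    else:
        return n - 1


# Every step keeps the six tested values unchanged, which is what makes
# each one a refactoring and not a change in behavior.
EXPECTED = [0, 1, 1, 2, 3, 5]
for step in (fib_step1, fib_step2, fib_step3):
    assert [step(n) for n in range(6)] == EXPECTED
```

From the last step, the remaining move is to notice that the n == 5 branch also gives the right answer for 2, 3 and 4, so it can replace the `n - 1` branch entirely.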


Posted by Gary Bernhardt at Thu Nov 5 19:37:59 2009

To Nícolas' new excitement about TDD:

That's excellent! I'm glad that my post helped! If you can, I'd recommend pair programming with someone who's experienced with TDD. You'll learn much faster that way, and you won't get frustrated as often by not knowing what to do next.

To Vincent's concern about time complexity:

You're right, but that's really not the point of this post. For someone's very first introduction, you need to focus on the essence of TDD: smallest test; smallest code; refactor. I didn't want to complicate the post with anything else, and I actually removed a lot of content as I refined it.

To Philip's point about the confusing refactor:

I like your multi-step refactor example! I'm not very happy at all with the refactors in my post because they're very discontinuous. As you point out, refactors are easier to understand when expressed as many small steps. But I wanted to keep the post short and dense, so I condensed them significantly even though it's confusing. Hopefully, that doesn't damage the message of simplicity for new TDDers too much. :)


Posted by Andrew Dalke at Fri Nov 6 14:31:00 2009

Here are difficulties I have in understanding and agreeing with the TDD approach.

When you got to test_fib_of_2_is_1, what about the tests drove you to have an if statement in the solution?  I ask because "the smallest code to make it pass" cannot be longer than:

def fib(n): return [0, 1, 1][n]

Handling up to fib(5) still requires less code than your final solution, so under the "smallest code" guideline, why is the solution you give driven by TDD?

Thing is, I agree with you, since this list lookup cannot handle the general solution, but these tests seem to want to push me to that answer.

TDD is prescriptive. You write the test and then change the code to pass the tests. When you made the recursive implementation you went from having code which passed the tests to having code which you think could pass all future tests. This is like developing a scientific theory. Just because a theory post hoc explains something, you still need to construct new tests which probe to see if the code still passes. Nothing I know of in TDD mentions this need, and you wrote no tests that would test if the new fib implementation really worked over a wider range.

Why did the second test check fib(1) instead of, say, fib(70)? That is, what about TDD makes you prefer one test over another? If I had said that I know fib(12) == 144 (which happens to also be 12*12), would that have been a more appropriate test?

There's also nothing in TDD which would drive things to what I would consider the better answer for Python, because it's faster and can handle a wider range of input:

def fib(n):
    i, j = 0, 1
    while n:
        i, j = j, i + j
        n -= 1
    return i


or the even shorter

def fib(n):
    i, j = 0, 1
    for m in range(n):
        i, j = j, i + j
    return i

This last one is also shorter than your final solution, and much shorter than your un-refactored solution for test_fib_of_5_is_5.

Instead, it seems like you do a lot of typing to get to a less than optimal solution (more code, worse time complexity, and a chance of running out of stack space).
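
(Editorial sketch of the wider-range check this comment asks for. It assumes the iterative version is trusted as a reference and compares the recursive version against it; the function names and the upper bound of 20 are hypothetical choices, not part of the original session.)

```python
def fib_recursive(n):
    # Final version from the post.
    if n <= 1:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    # Loop-based version, used here as the reference implementation.
    i, j = 0, 1
    for _ in range(n):
        i, j = j, i + j
    return i


def test_recursive_matches_iterative_over_a_range():
    # The bound is kept small because the recursive version is exponential.
    for n in range(20):
        assert fib_recursive(n) == fib_iterative(n)


def test_fib_of_12_is_144():
    assert fib_recursive(12) == 144
```

A check like this probes values that no single hand-written test covered, though it is only as trustworthy as the reference it compares against.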


Posted by Matt Wilson at Mon Nov 9 08:17:44 2009

I prefer to write out all the possible scenarios at first before I write any code.

I'll aim to only pass the first test, and then a few more, and a few more, until I get everything working right.  But I'm focused on the real solution from the start.

When I follow the one-by-one approach, I often forget a few corner cases, or, I write code that doesn't really turn out to be very useful.

I think I'd go all jihadist for testing if I could see how to work in some proof-by-induction mojo in there or see how to enforce big-oh runtime constraints with tests.

Right now, the thing I like most about tests is that they discourage over-encapsulation.  This is part of what I meant by TDD offsets bad OOP.  I still need to flesh this out.  It's a hunch more than a well-reasoned idea.

Really great article!  It would be a hell of a lot of work, but it would be fun to see TDD applied to a bigger problem, like figuring out how to play tic-tac-toe, or building a toy template language.


Posted by Philip Schwarz at Mon Nov 9 14:39:40 2009

@everyone

If you have not seen it yet, then check out this video of Uncle Bob doing the Bowling Kata BDD style, and these other Kata casts.


Posted by Dan Goldstein at Tue Nov 10 11:11:16 2009

As a non-TDDer, I don't know what to learn from this. You write the "simplest code possible" for each test, even though you know that code won't work for the general case. Then when you've reached a seemingly arbitrary point of complexity, you leap to the final formula. I don't see how Corey knew when it was time to make the leap or why Corey insisted on writing that throw-away code when you knew what the formula was all along.


Posted by Gary Bernhardt at Tue Nov 10 12:09:48 2009

Dan,

The refactor point isn't arbitrary; it's an expression of your definition of "simple". I mention this in my follow-up post:

http://blog.extracheese.org/2009/11/the_limits_of_tdd.html

Yes, we do know the solution all along, but that's not the point. I spend hours making posts like this as short as possible so they won't be boring. If I wrote a post in which I TDDed a real production system, without knowing how to implement it first, it would be a hundred times as long and you'd never finish it. You have to accept that a blog post you can read in five minutes is not going to be an exhaustive treatise on a practice that takes many years to master.

For most of the code you write, you don't know how to solve the problem. Often you don't even know what the problem is! (I mention that in my followup as well.) Sometimes you might know of a solution, but that's not the solution. Because TDD makes you test components in isolation, it forces them to be much more decoupled than they otherwise would be. This, and the fact that it guarantees you an exhaustive test suite, are the two greatest benefits for a beginner.


Posted by xilun at Tue Nov 10 12:14:07 2009

This is the failure of mathematics as a discipline, taken over by magic. And of course, magic failed, as it is supposed to do. Magic is present here, at the last step, when you both transform a false statement into a true one and use magic to do so. I fail to understand how this can lead to working programs (the principal reason being that you failed to explain, the reason for that being that this is impossible to explain, because this just can't happen). The only reason you did not stop at a previous step is that you just did NOT use TDD to solve this problem; you already knew both the result you wanted to get and at each step you already knew if that provisional answer was correct or not.

Supporting TDD by an example that only shows clearly that tests did NOT drive you is strange.


Posted by Gary Bernhardt at Tue Nov 10 13:06:31 2009

xilun,

Building useful software is not a mathematical process. I wish that it were, because I hate the sloppy way that we build software today, and that includes TDD as I know it. Can you please suggest an alternative to TDD that is more rigorous, and is used to build actual systems in the real world? I'd love to hear about it. Until then, I'll stick with the most rigorous, repeatable method I've found so far.

Your statement that TDD can't generate working programs is somewhat insulting. Do you think that the many well-known advocates of TDD, who have written dozens of books about it, are lying? Do you think that the last two years of my life, during which I spent a few thousand hours learning TDD, were spent so that I could write a blog post in which I lie to you, claiming that it has worked when it hasn't?

These things are not the case. It has worked, and it does work. The fact that you don't fully understand how it works after spending five minutes reading my blog is not a failing of TDD. As I said in my last comment: You have to accept that a blog post you can read in five minutes is not going to be an exhaustive treatise on a practice that takes many years to master.

I repeat my request: please tell me what method is less "magical" than TDD. If your method is "I think about it and write down the answer", I assert that that is far more magical.


Posted by Steve at Tue Nov 10 21:51:48 2009

Yes, TDD is easy on trivial examples such as Fibonacci, bowling or a Stack class.  Then you try to apply it in the real world and you end up with something like this:

http://www.infoq.com/news/2007/05/tdd-sudoku

After that, you realize why TDD is mostly pushed by consultants and you go back to coding the usual way.


Posted by Rudolf O. at Wed Nov 11 00:08:24 2009

This was awful.

It looks like you guys don't even understand how functions work!

They have inputs and outputs. The input variables have certain values that they can be. Generally we talk about them as sets. Integers are a HUGE set of numbers. In the case of the fib function, the set is the natural numbers. That's a pre-condition.

The same goes for the output. The output must be a number and it must be a natural number as well.

Thus, when you are writing your tests, you test a whole class of numbers. You don't test fib(1) or fib(2) or fib(5). You test fib(i), where 0 < i <= MAX, where MAX is whatever number you like. Your tests fail because they're too particular.
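
(Editorial sketch of a test over a whole range of inputs, as this comment suggests. It checks the defining recurrence for every i from 2 up to a bound. The value of `MAX` is a hypothetical choice, and the loop-based fib is assumed so that the check runs quickly.)

```python
def fib(n):
    # Loop-based version, assumed here so the range check is fast.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


MAX = 200  # hypothetical upper bound for the check


def test_fib_satisfies_recurrence_up_to_max():
    # The two basis values have to be pinned down explicitly.
    assert fib(0) == 0
    assert fib(1) == 1
    for i in range(2, MAX + 1):
        result = fib(i)
        # Post-condition: the output is a natural number.
        assert isinstance(result, int) and result >= 0
        # Defining property: each term is the sum of the previous two.
        assert result == fib(i - 1) + fib(i - 2)
```

A range test like this complements the specific-value tests; it does not replace them, because the recurrence alone says nothing without the two basis values.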


Posted by Gary Bernhardt at Wed Nov 11 01:04:50 2009

If you have come here from Reddit or Hacker News intending to leave a comment about how bad the TDDed solution is, please don't bother. This post is not about the particular problem or solution presented. Nor is it about correctness. It's about a particular hurdle that novice TDDers must overcome. If you have no interest in learning the practice of TDD, that's fine with me, but please don't express that lack of interest by posting an angry comment on my blog.


Posted by Mark Needham at Wed Nov 11 07:27:12 2009

That's a really good example for showing the flow of how TDD works when it's being done properly although I'm never sure of its value as an approach for problems which are more algorithmic in style.

Often it seems that with an algorithmic problem you either know exactly how to do it or you don't, and in my experience TDD often doesn't help you get a better understanding of how to do it.

Having said that if you know that there are several different aspects to solving the problem and only some of them are algorithmic then it might make sense to TDD the other parts and then just write some unit tests to check that you've got the algorithm correct for various inputs for the algorithmic bits.

I think there might be something around TDD being really useful when we have a reasonable idea of where we're going with a solution but we don't know exactly how we're getting there but I think you do need some high level idea of the direction otherwise it's gonna be a real struggle.

Always good to read about people's experiences of this type of thing so thanks for writing.

Cheers, Mark


Posted by Jon at Wed Nov 11 21:05:11 2009

Performance is not just a "nice to have"; in any real production code, it's a requirement. This example only serves to show how TDD leads you to the wrong solution, with no way of knowing it's wrong. Had it ended up with the iterative solution instead, it would probably have been a better example -- especially if you had added a test for performance (say, calculating fib(some_large_number) and realizing it takes too long).

This is a very sore point, because the vast majority of open source code bases perform terribly, and the authors either don't seem to care about performance, or don't even know that it's generally a requirement for production quality code.

As for the questions about "why write the throw-away code" -- the point of TDD is not only to "grow" the code, but also that you have a great set of tests in the end, which you can re-apply to any other solution (including a re-factor that solves the performance problem). This article needs to be extended if it is to deliver on that point, though.
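
(Editorial sketch of the performance test mentioned in this comment. The input size of 1000 and the half-second budget are hypothetical numbers chosen for illustration, and the loop-based fib is assumed as the implementation under test.)

```python
import time


def fib(n):
    # Loop-based version: linear in n and no recursion.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


BUDGET_SECONDS = 0.5  # hypothetical performance target


def test_fib_of_1000_is_fast_enough():
    start = time.perf_counter()
    fib(1000)
    elapsed = time.perf_counter() - start
    assert elapsed < BUDGET_SECONDS
```

The naive recursive version from the post cannot pass a test like this for an input of that size, which is how the test would push a refactor toward the loop. Timing-based tests can be sensitive to the machine they run on, so the budget needs to be generous.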


Posted by Gary Bernhardt at Wed Nov 11 22:51:04 2009

Jon,

The thing is, if you wrote the code without TDD, and came up with the recursive solution, you'd still be screwed! You have to know the recursive solution is slow with or without TDD. No process frees you from thinking. Regardless of how you arrive at the code, you must have the skills to understand its implications.

You're right that the post doesn't deliver a full explanation of how to build production quality software with TDD, but that wasn't the goal. Perhaps, if I follow this up with another twenty articles, the series as a whole will begin to approach that point. But I don't have the stamina to write such a thing as a single blog post, and I doubt most people have the stamina to read it as such. :)


Posted by Dale at Wed Jan 6 13:00:54 2010

Answer to the complaints about the recursion solution being slow:

You can encode performance targets into a test as well. Here, getting that test to pass would likely drive you towards making sure that a recursion is tail recursion, then converting that to an iterative algorithm. (Something any CS Major should be able to do in their sleep.)
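
(Editorial sketch of the conversion described in this comment: first a tail-recursive form that carries two accumulators, then the equivalent loop. The names `fib_tail` and `fib_loop` are hypothetical.)

```python
def fib_tail(n, a=0, b=1):
    # a and b hold two consecutive Fibonacci numbers; the recursive call
    # is the last thing the function does.
    if n == 0:
        return a
    return fib_tail(n - 1, b, a + b)


def fib_loop(n):
    # The same computation with the tail call replaced by a loop that
    # updates the accumulators in place.
    a, b = 0, 1
    while n > 0:
        a, b = b, a + b
        n -= 1
    return a


# Both forms agree with each other on a range of inputs.
for n in range(30):
    assert fib_tail(n) == fib_loop(n)
```

CPython does not eliminate tail calls, so `fib_tail` is still limited by the recursion depth; only the loop form removes that limit.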


Posted by Jeff Langr at Thu Jan 21 14:31:39 2010

I think this is an ok example, but only to show TDD technique. I got slammed for showing a trivial (stack) example some years ago--even with a disclaimer ("this is about technique, and you can only show so much in a short time"), people will whine that it's not real-world. So when demoing, I go for things I've actually had to do in real life, and have always been able to demonstrate several benefits of the TDD approach over the haphazard, code & fix non-process. A recent example involves comparing properties files across different environments for mismatches.

TDD works easily and very well in the predominance of apps, where 99.8% of the system requires devising/implementing algorithms that are not extremely complex. Even in some of the most complex systems (e.g. Sabre airline shopping), TDD is a great choice for most of the work--except maybe that relatively tiny core where we need an exceptionally creative algorithm. And even there, you can use the tests generated by TDD to help you massage and improve the algorithm once a solution's been built.

(Sudoku? Here's a TDD solution with some additional discourse: http://www.reddit.com/r/programming/comments/9sdcm/tdd_sudoku_i_took_a_stab_at_it_see_inside_part_1/)

I enjoy TDD and what it gives me (rapid verification, documentation, and better design), but mostly I just do it because it allows me to continually keep the code and design clean. Now that's real fun.

But I don't recommend TDD for everyone--it sucks if you think your sh*t don't stink.

