How do you refactor? - unit-testing

I was wondering how other developers begin refactoring. What is your first step? How this process (refactoring) differ if you refactor code which is not yours? Do you write tests while refactoring?

do not refactor anything non-trivial that does not already have unit tests
write unit tests, then refactor
refactor small pieces and re-run the tests frequently
stop refactoring when the code is DRY* clean
* DRY = Don't Repeat Yourself

What is your first step?
The first step is to run the unit tests to make sure they all pass. Indeed, you can waste a large amount of time looking for which of your changes broke the a test if it was already broken before your modify the code.
How this process differ if you refactor code which is not yours ?
I'm certainly doing smaller steps when refactoring code I didn't write (or code I wrote a long time ago). I may also verify the test coverage before to proceed, to avoid relying on unit tests that always pass ... but that do no test the area I'm working on.
Do you write tests while refactoring ?
I usually don't, but I may add new tests in the following circumstances (list not exhaustive) :
idea of a new test sparkles in my mind ("what happen if ... ?" - write
a test to know)
discover hole in the test coverage
it also depends on the refactoring being performed. When extracting a function, I may create a new test if it can be invoked a different way as it was before.
Here are some general advice:
First thing is to maintain a list of the code smells noticed while working on the code. This allows freeing one's mind from the burden of remembering what was seen in the code. Also,
The golden rule is never refactor when the the unit tests do not pass completely.
Refactor when the code is stable, before adding something you know will be impacted by a future refactoring, before to integrate and above all before to say your done.
In the absence of unit tests, you'll have to put the part of code you want to refactor under test. If unit tests are too hard to retrofit, as it usually is the case, then you can create characterization tests, as recommended per Michael Feathers in Working Effectively with Legacy Code. In short they are end-to-end tests that allow you to pin down the current behavior of the code (which is not assumed to be working perfectly all the time).
Don't be afraid to do baby steps. Do not do two things at the same time. If you remark something requiring refactoring, note it down, do not fix it right now, even if it seem very easy.
Check in very often, when the tests pass. So that you can revert a bad refactoring without losing what was done before.
Keep on mind refactoring does not add value to your customer (this can be discussed), but the customer does not pay you to refactor. One rule of thumb is to refactor prior making changes or adding new capabilities to the code.

Read Martin Fowler's book "Refactoring"
BTW - that's Martin Fowler's own Amazon exec link, if you're wondering :)

I take crap and make it less crappy. :-)
Seriously. I don't refactor to create new functionality. Refactoring occurs before new stuff. If there are no tests I write tests to make sure that I'm not breaking anything with my refactoring. If there are tests, I use those. If the tests are insufficient, I might write more tests but I would consider this separate from the refactoring and do it first.
The first step for me is to notice that I can abstract something and make it more general (and useful in other places that need the functionality now), or I notice that something is bad and could be better (subjective). I don't refactor to generality without a reason. YAGNI principle applies.
We have the concept of shared ownership so the code is always mine -- I may not have written it, but I don't consider that when refactoring. I may seek to understand things before I decide it needs refactoring if the purpose isn't clear -- although that's almost always a reason to refactor in and of itself.

Depends very much on my goals. As has been said, you need unit tests to make sure your refactoring hasn't broken anything, and if it has you have to commit time to fixing it. For many situations, I test the existing solution, and if it works, wrap it rather than refactoring it, as this minimises the possibility of a break.
If I have to refactor, for example I recently had to port a bunch of ASCII based C++ to UNICODE, I tend to make sure I have good regression tests that work at end-user as well as unit level. Again, I try to use tools rather than manually refactoring, as this is less prone to error, and the errors you do get are systematic rather than random.

For me first thing is make sure the code hits all of the best practices of our office.
For example, using strict, warnings and taint for our Perl scripts.
If there are efficienry or speed troubles focus on them.
Things like find a better algorithm, or find a better way to to do what the quadruply nested for loop is doing.
And lastly see if there is a way to make the code more readable. This is normally accomplished by by turning 5 little scripts that do similar things into 1 module(class).

I refactor whilst writing new code, using unit tests. I'll also refactor old code, either mine or someone else's, if methods are too long, or variables named badly, or I spot duplication etc.

Start with getting unit tests, and then use automated refactoring tools. If the refactoring can't be automated then it's not truly a mechanical transformation of the code and so isn't a refactoring. The unit tests are to make sure that you are truly just performing mechanical transformations from one codebase to an equivalent one.

Refactoring without Unit Test is dangerous. Always have unit test. If you change something without having good testing you might be safe for some portion of the code but something elsewhere might not have the same behavior. With Unit Testing you protect any change.
Refactoring other code is fine too but the extreme is not. It's normal that someone else do not program like you do. It's not "kind" to change stuff because you would have did it in the other way. Just refactoring if it's really necessary.

I remove duplication, which unifies the thought patterns inherent in the code. Refactoring needs to achieve those two things. If you have code that does the same thing twice, refactor it to a common location, unifying the abstraction. If you have the same literal in three places, put it in a constant, unifying the purpose. If you have the same group of arguments, ensure they're always used in the same order or, better yet, put them in a common structure, unifying information groups.

I'm a lot more reluctant to refactor code written by others than to refactor my own.
If it's been written by one of my predecessors, I generally refactor only within a function. E.g. I might replace an if statement with a switch. Anything much bigger than that is usually out of scope and not within budget.
For my own code, I usually refactor as I'm writing, whenever something looks ugly or starts to smell. It's a lot easier to fix it now instead of waiting for it to cause problems down the road.

I agree with the other posters when you're refactoring code you wrote.
If it's code you didn't write, and especially if there's lots of it, I would start by using tools like fxCop, Visual Studio's Code Analysis, DevPartner -- I'm sure there are other good ones. They would give you ideas about where to start and what the most common coding issues are. I would also do stress testing to see where the bottlenecks are, therefore the greatest return on your effort at improving the code.
I love to refactor my code, but it is possible to overdo it. If you aren't really improving the app's performance, or seriously improving code readability, you should probably stop. There is always the possibility of introducing new bugs when you refactor, especially if you're working without unit tests.

First step: Identify a code smell.
Second step: Consider alternative implementations and what the trade offs are and which do I accept in terms of which is "better."
Third step: Implement better solution.
This doesn't differ if the code is mine or not as sometimes I may go back over code I wrote months or years ago and it'll look like code from someone else. I may write tests if I'm making new methods or there aren't adequate tests to the code, IMO.

General approach
I may start out by peeking at the software (component) from bird-view. Dependency analysis and graphing tools are a great help here (see later sections). I look for a circle in either the package-level or the class-level dependencies, and also alternatively for classes with too many dependencies. These are good candidates for refactoring.
Tools of the trade
Dependency analysis tools:
Python
Java

Related

When doing TDD, why should I do "just enough" to get a test passing?

Looking at posts like this and others, it seems that the correct way to do TDD is to write a test for a feature, get just that feature to pass, and then add another test and refactor as necessary until it passes, then repeat.
My question is: why is this approach used? I completely understand the write tests first idea, because it helps your design. But why wouldn't I create all tests for a specific function, and then implement that function all at once until all tests pass?
The approach comes from the Extreme Programming principal of You Aren't Going to Need It. If you actually write a single test and then the code that makes it pass then repeating that process you usually find that you write just enough to get things working. You don't invent new features that are not needed. You don't handle corner cases that don't exist.
Try an experiment. Write out the list of tests you think you need. Set it aside. Then go with the one test at a time approach. See if the lists differ and why. When I do that I almost always end up with fewer tests. I almost always find that I invented a case that I didn't need if I do it the all the tests first way.
For me, it is about "thought burden." If I have all of the possible behaviors to worry about at once, my brain is strained. If I approach them one at a time, I can give full attention to solving the immediate problem.
I believe this derives from the principle of "YAGNI" ("You're Ain't Gonna Need It")(*), which states that classes should be as simple as necessary, with no extra features. Hence when you need a feature, you write a test for it, then you write the feature, then you stop. If you wrote a number of tests first, clearly you would be merely speculating on what your API would need to be at some point in the future.
(*) I generally translate that as "You are too stupid to know what will be needed in the future", but that's another topic......
imho it reduces the chance of over engineering the piece of code you are writing.
Its just easier to add unnecessary code when you are looking at different usage scenarios.
Dan North has suggested that there is no such thing as test-driven design because the design is not really driven out by testing -- that these unit tests only become tests once functionality is implemented, but during the design phase you are really designing by example.
This makes sense -- your tests are setting up a range of sample data and conditions with which the system under test is going to operate, and you drive out design based on these example scenarios.
Some of the other answers suggest that this is based on YAGNI. This is partly true.
Beyond that, though, there is the issue of complexity. As is often stated, programming is about managing complexity -- breaking things down into comprehensible units.
If you write 10 tests to cover cases where param1 is null, param2 is null, string1 is empty, int1 is negative, and the current day of the week is a weekend, and then go to implement that, you are having to juggle a lot of complexity at once. This opens up space to introduce bugs, and it becomes very difficult to sort out why tests are failing.
On the other hand, if you write the first test to cover an empty string1, you barely have to think about the implementation. Once the test is passing, you move on to a case where the current day is a weekend. You look at the existing code and it becomes obvious where the logic should go. You run tests and if the first test is now failing, you know that you broke it while implementing the day-of-the-week thing. I'd even recommend that you commit source between tests so that if you break something you can always revert to a passing state and try again.
Doing just a little at a time and then verifying that it works dramatically reduces the space for the introduction of defects, and when your tests fail after implementation you have changed so little code that it is very easy to identify the defect and correct it, because you know that the existing code was already working properly.
This is a great question. You need to find a balance between writing all tests in the universe of possible tests, and the most likely user scenarios. One test is, IMHO, not enough, and I typically like to write 3 or 4 tests which represent the most common uses of the feature. I also like to write a best case test and a worst case test as well.
Writing many tests helps you to anticipate and understand the potential use of your feature.
I believe TDD advocates writing one test at a time because it forces you to think in terms of the principle of doing the simplest thing that could possibly work at each step of development.
I think the article you sent is exactly the answer. If you write all the tests first and all of the scenarios first, you will probably write your code to handle all of those scenarios at once and most of the time you probably end up with code that is fairly complex to handle all of these.
On the other hand, if you go one at a time, you will end up refactoring your existing code each time to end up with code probably as simple as it can be for all the scenarios.
Like in the case of the link you gave in your question, had they written all the tests first, I am pretty sure they would have not ended up with a simple if/else statement, but probably a fairly complex recursive piece of code.
The reason behind the principle is simple. How practical it is to stick to is a separate question.
The reason is that if you are writing more code that what is needed to pass the current test you are writing code that is, by definition, untested. (It's nothing to do with YAGNI.)
If you write the next test to "catch up" with the production code then you've just written a test that you haven't seen fail. The test may be called "TestNextFeature" but it may as well return true for all the evidence you have on it.
TDD is all about making sure that all code - production and tests - is tested and that all those pesky "but I'm sure I wrote it right" bugs don't get into the code.
I would do as you suggest. Write several tests for a specific function, implement the function, and ensure that all of the tests for this function pass. This ensures that you understand the purpose and usage of the function separately from your implementation of it.
If you need to do a lot more implementation wise than what is tested by your unit tests, then your unit tests are likely not comprehensive enough.
I think part of that idea is to keep simplicity, keep to designed/planned features, and make sure that your tests are sufficient.
Lots of good answers above - YAGNI is the first answer that jumps to mind.
The other important thing about the 'just get the test passing' guideline though, is that TDD is actually a three stage process:
Red > Green > Refactor
Frequently revisiting the final part, the refactoring, is where a lot of the value of TDD is delivered in terms of cleaner code, better API design, and more confidence in the software. You need to refactor in really small short blocks though lest the task become too big.
It is hard to get into this habit, but stick with it, as it's an oddly satisfying way to work once you get into the cycle.

TDD with unclear requirements

I know that TDD helps a lot and I like this method of development when you first create a test and then implement the functionality. It is very clear and correct way.
But due to some flavour of my projects it often happens that when I start to develop some module I know very little about what I want and how it will look at the end. The requirements appear as I develop, there may be 2 or 3 iterations when I delete all or part of the old code and write new.
I see two problems:
1. I want to see the result as soon as possible to understand are my ideas right or wrong. Unit tests slow down this process. So it often happens that I write unit tests after the code is finished what is known to be a bad pattern.
2. If I first write the tests I need to rewrite not only the code twice or more times but also the tests. It takes much time.
Could someone please tell me how can TDD be applied in such situation?
Thanks in advance!
I want to see the result as soon as possible to understand are my ideas right or wrong. Unit tests slow down this process.
I disagree. Unit tests and TDD can often speed up getting results because they force you to concentrate on the results rather than implementing tons of code that you might never need. It also allows you to run the different parts of your code as you write them so you can constantly see what results you are getting, rather than having to wait until your entire program is finished.
I find that TDD works particularly well in this kind of situation; in fact, I would say that having unclear and/or changing requirements is actually very common.
I find that the best uses of TDD is ensuring that your code is doing what you expect it to do. When you're writing any code, you should know what you want it to do, whether the requirements are clear or not. The strength of TDD here is that if there is a change in the requirements, you can simply change one or more of your unit tests to reflect the changed requirements, and then update your code while being sure that you're not breaking other (unchanged) functionality.
I think that one thing that trips up a lot of people with TDD is the assumption that all tests need to be written ahead of time. I think it's more effective to use the rule of thumb that you never write any implementation code while all of your tests are passing; this simply ensures that all code is covered, while also ensuring that you're checking that all code does what you want it to do without worrying about writing all your tests up front.
IMHO, your main problem is when you have to delete some code. This is waste and this is what shall be addressed first.
Perhaps you could prototype, or utilize "spike solutions" to validate the requirements and your ideas then apply TDD on the real code, once the requirements are stable.
The risk is to apply this and to have to ship the prototype.
Also you could test-drive the "sunny path" first and only implement the remaining such as error handling ... after the requirements have been pinned down. However the second phase of the implementation will be less motivating.
What development process are you using ? It sounds agile as you're having iterations, but not in an environment that fully supports it.
TDD will, for just about anybody, slow down initial development. So, if initial development speed is 10 on a 1-10 scale, with TDD you might get around an 8 if you're proficient.
It's the development after that point that gets interesting. As projects get larger, development efficiency typically drops - often to 3 on the same scale. With TDD, it's very possible to still stay in the 7-8 range.
Look up "technical debt" for a good read. As far as I'm concerned, any code without unit tests is effectively technical debt.
TDD helps you to express the intent of your code. This means that writing the test, you have to say what you expect from your code. How your expectations are fulfilled is then secondary (this is the implementation). Ask yourself the question: "What is more important, the implementation, or what the provided functionality is?" If it is the implementation, then you don't have to write the tests. If it is the functionality provided then writing the tests first will help you with this.
Another valuable thing is that by TDD, you will not implement functionality that will not be needed. You only write code that needs to satisfy the intent. This is also called YAGNI (You aint gonna need it).
There's no getting away from it - if you're measuring how long it takes to code just by how long it takes you to write classes, etc, then it'll take longer with TDD. If you're experienced it'll add about 15%, if you're new it'll take at least 60% longer if not more.
BUT, overall you'll be quicker. Why?
by writing a test first you're specifying what you want and delivering just that and nothing more - hence saving time writing unused code
without tests, you might think that the results are so obvious that what you've done is correct - when it isn't. Tests demonstrate that what you've done is correct.
you will get faster feedback from automated tests than by doing manual testing
with manual testing the time taken to test everything as your application grows increases rapidly - which means you'll stop doing it
with manual tests it's easy to make mistakes and 'see' something passing when it isn't, this is especially true if you're running them again and again and again
(good) unit tests give you a second client to your code which often highlights design problems that you might miss otherwise
Add all this up and if you measure from inception to delivery and TDD is much, much faster - you get fewer defects, you're taking fewer risks, you progress at a steady rate (which makes estimation easier) and the list goes on.
TDD will make you faster, no question, but it isn't easy and you should allow yourself some space to learn and not get disheartened if initially it seems slower.
Finally you should look at some techniques from BDD to enhance what you're doing with TDD. Begin with the feature you want to implement and drive down into the code from there by pulling out stories and then scenarios. Concentrate on implementing your solution scenario by scenario in thin vertical slices. Doing this will help clarify the requirements.
Using TDD could actually make you write code faster - not being able to write a test for a specific scenario could mean that there is an issue in the requirements.
When you TDD you should find these problematic places faster instead of after writing 80% of your code.
There are a few things you can do to make your tests more resistant to change:
You should try to reuse code inside
your tests in a form of factory
methods that creates your test
objects along with verify methods
that checks the test result. This
way if some major behavior change
occurs in your code you have less
code to change in your test.
Use IoC container instead of passing
arguments to your main classes -
again if the method signature
changes you do not need to change
all of your tests.
Make your unit tests short and Isolated - each test should check only one aspect of your code and use Mocking/Isolation framework to make the test independent of external objects.
Test and write code for only the required feature (YAGNI). Try to ask yourself what value my customer will receive from the code I'm writing. Don't create overcomplicated architecture instead create the needed functionality piece by piece while refactoring your code as you go.
Here's a blog post I found potent in explaining the use of TDD on a very iterative design process scale: http://blog.extracheese.org/2009/11/how_i_started_tdd.html.
Joshua Block commented on something similar in the book "Coders at work". His advice was to write examples of how the API would be used (about a page in length). Then think about the examples and the API a lot and refactor the API. Then write the specification and the unit tests. Be prepared, however, to refactor the API and rewrite the spec as you implement the API.
When I deal with unclear requirements, I know that my code will need to change. Having solid tests helps me feel more comfortable changing my code. Practising TDD helps me write solid tests, and so that's why I do it.
Although TDD is primarily a design technique, it has one great benefit in your situation: it encourages the programmer to consider details and concrete scenarios. When I do this, I notice that I find gaps or misunderstandings or lack of clarity in requirements quite quickly. The act of trying to write tests forces me to deal with the lack of clarity in the requirements, rather than trying to sweep those difficulties under the rug.
So when I have unclear requirements, I practise TDD both because it helps me identify the specific requirements issues that I need to address, but also because it encourages me to write code that I find easier to change as I understand more about what I need to build.
In this early prototype-phase I find it to be enough to write testable code. That is, when you write your code, think of how to make it possible to test, but for now, focus on the code itself and not the tests.
You should have the tests in place when you commit something though.

Writing Unit Tests Later

I know that TDD style is writing the test first, see it fails then go and make it green, which is good stuff. Sometimes it really works for me.
However especially when I was experimenting with some stuff (i.e. not sure about the design, not sure if it's going to work) or frantically writing code, I don't want to write unit tests, it breaks my flow.
I tend to write unit tests later on and especially just before stuff getting too complicated. Also there is another problem writing them later is generally more boring.
I'm not quite sure if this is a good approach (definitely not the best).
What do you think? Do you code write your unit tests later? Or how do you deal this flow problem or experimental design / code stage.
What I've learned is that there is no experimental code, at least not working in production environments and/or tight deadlines. Experiments are generally carried out until something "works" at which point that becomes the production code.
The other side of this is that TDD from the start will result in better design of your code. You'll be thinking more about it, reworking it, refactoring it more frequently than if you write the tests after the fact.
I've written tests after the fact. Better late then never. They are always worth having.
However, I have to say, the first time I wrote them before writing the tested code, it was extremely satisfying. No more fiddling around with manual testing. I was surprised just how good it felt.
Also, I tend to write unit tests before refactoring legacy code - which, almost by definition, means that I'm writing tests to test code that's already written. Provides a security blanket that makes me more comfortable with getting into big blocks of code written by others.
"I'm not quite sure if this is a good approach (definitely not the best)."
Not good? Why not?
Are you designing for testability? In that case, your design is test-driven. What more can anyone ask for?
Whether the tests come first, in the middle or last doesn't matter as much as designing for testability. In the end, changes to the design will make tests fail, and you can fix things. Changes to the tests in anticipation of design changes will make the tests fail, also. Both are fine.
If you get to the end of your design work, and there's something hard to test in the middle, then you failed to do TDD. You'll have to refactor your design to make it testable.
I often take the same approach you're talking about. What seems to work well is to treat the exerimental code exactly as such, and then start a proper design based on what you've learned. From here you can write your tests first. Otherwise, you're left with lots of code that was written as temporary or experimental, and probably won't get around to writing tests for all of it.
I would say that for normal development, TDD works extremely well. There are cases where you may not need to write the tests first (or even at all), but those are infrequent. Sometimes, however, you need to do some exploration to see what will work. I would consider this to be a "spike", and I don't necessarily think that TDD is absolutely necessary in this case. I would probably not use the actual "spike" code in my project. After all, it was just an exploration and now that I have a better idea of how it ought to work, I can probably write better code (and tests) than my "spike" code. If I did decide to use my "spike" code, I'd probably go back and write tests for it.
Now, if you find that you've violated TDD and written some production code before your tests - it happens - then, too, I'd go back and write the tests. In fact, on the occasions where this has happened to me I've often found things that I've neglected once I start writing the tests because more tests come to mind that aren't handled by the code. Eventually, you get back in the TDD rythym (and vow never to do that again).
Consider the psychological tendencies associated with sunk cost. That is, when you get to the second part of the equation, that laziness gene we all have makes us want to protect the work we have already done. The consequences?
If you write the tests first...
You tend to write the code to fit the tests. This encourages the "simplest thing that solves the problem" type development and keeps you focused on solving the problem not working on meta-problems.
If you write the code first...
You will be tempted to write the tests to fit the code. In effect this is the equivalent of writing the problem to fit your answer, which is kind of backwards and will quite often lead to tests that are of lesser value.
Although I'd be surprised if 1 programmer out of 50 ALWAYS writes tests first, I'd still argue that it is something to strive for if you want to write good software.
I usually write my tests first but sometime while experimenting I write the code after. Once I get an idea of what my code is supposed to do, I stop the code and start the tests.
Writing the code first is natural when you're trying to figure out how your code is going to work. Writing the test first helps you determine what your code show do (not how it should do it). If you're writing the code first, you're trying to solve the problem without completely defining the problem. This isn't necessarily "bad", but you are using unit tests as a regression tool rather than a development tool (again, not "bad" - just not TDD).
VS 2008 has a nice feature that will generate test classes for an object, the tests needs to me tweaked but it dose a lot of the grunt work for you. Its really nice for crating tests for your less then diligent co-workers.
Another good point for this is it help to prevent you from missing something, expectantly when your working on code that isn't yours.
if your using a different testing framework then MSUnitTest, it's fairly simple to convert then tests from MSUnit to Nunit, etc. just do some copy and past.
I would like to say that I always write Unit tests first but of course I don't (for numerous reasons well known to any real programmer :-)). What I (ok, also not always...) do is to convert every bug which takes me more than five minutes to find into a unit test. Even before I fix it. This has the following advantages:
It documents the bug and alerts me if it shows up again at a later point of time.
It helps in finding the bug, since I have a well-defined place to put debugging code into (setting up my data structures, call the right methods, set breakpoints on etc.) Before I discovered unit testing I modified the main() function for this testing code resulting in strange results when I forgot to remove it afterwards ...
Usually it gives me good ideas what else could go wrong, so it quite often evolves in a whole bunch of unit tests and resulting in more than one bug getting discovered resp. fixed.

Should unit tests be written before the code is written?

I know that one of the defining principles of Test driven development is that you write your Unit tests first and then write code to pass those unit tests, but is it necessary to do it this way?
I've found that I often don't know what I am testing until I've written it, mainly because the past couple of projects I've worked on have more evolved from a proof of concept rather than been designed.
I've tried to write my unit tests before and it can be useful, but it doesn't seem natural to me.
Some good comments here, but I think that one thing is getting ignored.
writing tests first drives your design. This is an important step. If you write the tests "at the same time" or "soon after" you might be missing some design benefits of doing TDD in micro steps.
It feels really cheesy at first, but it's amazing to watch things unfold before your eyes into a design that you didn't think of originally. I've seen it happen.
TDD is hard, and it's not for everybody. But if you already embrace unit testing, then try it out for a month and see what it does to your design and productivity.
You spend less time in the debugger and more time thinking about outside-in design. Those are two gigantic pluses in my book.
There have been studies that show that unit tests written after the code has been written are better tests. The caveat though is that people don't tend to write them after the event. So TDD is a good compromise as at least the tests get written.
So if you write tests after you have written code, good for you, I'd suggest you stick at it.
I tend to find that I do a mixture. The more I understand the requirements, the more tests I can write up front. When the requirements - or my understanding of the problem - are weak, I tend to write tests afterwards.
TDD is not about the tests, but how the tests drive your code.
So basically you are writing tests to let an architecture evolve naturally (and don't forget to refactor !!! otherwise you won't get much benefit out of it).
That you have an arsenal of regression tests and executable documentation afterwards is a nice sideeffect, but not the main reason behind TDD.
So my vote is:
Test first
PS: And no, that doesn't mean that you don't have to plan your architecture before, but that you might rethink it if the tests tell you to do so !!!!
I've lead development teams for the past 6-7 years. What I can tell for sure is that as a developer and the developers I have worked with, it makes a phenomenal difference in the quality of the code if we know where our code fits into the big picture.
Test Driven Development (TDD) helps us answer "What?" before we answer "How?" and it makes a big difference.
I understand why there may be apprehensions about not following it in PoC type of development/architect work. And you are right it may not make a complete sense to follow this process. At the same time, I would like to emphasize that TDD is a process that falls in the Development Phase (I know it sounds obsolete, but you get the point :) when the low level specification are clear.
I think writing the test first helps define what the code should actually do. Too many times people don't have a good definition of what the code is supposed to do or how it should work. They simply start writing and make it up as they go along. Creating the test first makes you focus on what the code will do.
Not always, but I find that it really does help when I do.
I tend to write them as I write my code. At most I will write the tests for if the class/module exists before I write it.
I don't plan far enough ahead in that much detail to write a test earlier than the code it is going to test.
I don't know if this is a flaw in my thinking or method's or just TIMTOWTDI.
I start with how I would like to call my "unit" and make it compile.
like:
picker = Pick.new
item=picker.pick('a')
assert item
then I create
class Pick
def pick(something)
return nil
end
end
then I keep on using the Pick in my "test" case so I could see how I would like it to be called and how I would treat different kinds of behavior. Whenever I realize I could have trouble on some boundaries or some kind of error/exception I try to get it to fire and get an new test case.
So, in short. Yes.
The ratio doing test before is a lot higher than not doing it.
Directives are suggestion on how you could do things to improve the overall quality or productivity or even both of the end product. They are in no ways laws to be obeyed less you get smitten in a flash by the god of proper coding practice.
Here's my compromise on the take and I found it quite useful and productive.
Usually the hardest part to get right are the requirements and right behind it the usability of your class, API, package... Then is the actual implementation.
Write your interfaces (they will change, but will go a long way in knowing WHAT has to be done)
Write a simple program to use the interfaces (them stupid main). This goes a long way in determining the HOW it is going to be used (go back to 1 as often as needed)
Write tests on the interface (The bit I integrated from TDD, again go back to 1 as often as needed)
write the actual code behind the interfaces
write tests on the classes and the actual implementation, use a coverage tool to make sure you do not forget weid execution paths
So, yes I write tests before coding but never before I figured out what needs to be done with a certain level of details. These are usually high level tests and only treat the whole as a black box. Usually will remain as integration tests and will not change much once the interfaces have stabilized.
Then I write a bunch of tests (unit tests) on the implementation behind it, these will be much more detailed and will change often as the implementation evolves, as it get's optimized and expanded.
Is this strictly speaking TDD ? Extreme ? Agile...? whatever... ? I don't know, and frankly I don't care. It works for me. I adjust it as needs go and as my understanding of software development practice evolve.
my 2 cent
I've been programming for 20 years, and I've virtually never written a line of code that I didn't run some kind of unit test on--Honestly I know people do it all the time, but how someone can ship a line of code that hasn't had some kind of test run on it is beyond me.
Often if there is no test framework in place I just write a main() into each class I write. It adds a little cruft to your app, but someone can always delete it (or comment it out) if they want I guess. I really wish there was just a test() method in your class that would automatically compile out for release builds--I love my test method being in the same file as my code...
So I've done both Test Driven Development and Tested development. I can tell you that TDD can really help when you are a starting programmer. It helps you learn to view your code "From outside" which is one of the most important lessons a programmer can learn.
TDD also helps you get going when you are stuck. You can just write some very small piece that you know your code has to do, then run it and fix it--it gets addictive.
On the other hand, when you are adding to existing code and know pretty much exactly what you want, it's a toss-up. Your "Other code" often tests your new code in place. You still need to be sure you test each path, but you get a good coverage just by running the tests from the front-end (except for dynamic languages--for those you really should have unit tests for everything no matter what).
By the way, when I was on a fairly large Ruby/Rails project we had a very high % of test coverage. We refactored a major, central model class into two classes. It would have taken us two days, but with all the tests we had to refactor it ended up closer to two weeks. Tests are NOT completely free.
I'm not sure, but from your description I sense that there might be a misunderstanding on what test-first actually means. It does not mean that you write all your tests first. It does mean that you have a very tight cycle of
write a single, minimal test
make the test pass by writing the minimal production code necessary
write the next test that will fail
make all the existing tests pass by changing the existing production code in the simplest possible way
refactor the code (both test and production!) so that it doesn't contain duplication and is expressive
continue with 3. until you can't think of another sensible test
One cycle (3-5) typically just takes a couple of minutes. Using this technique, you actually evolve the design while you write your tests and production code in parallel. There is not much up front design involved at all.
On the question of it being "necessary" - no, it obviously isn't. There have been uncountable projects successfull without doing TDD. But there is some strong evidence out there that using TDD typically leads to significantly higher quality, often without negative impact on productivity. And it's fun, too!
Oh, and regarding it not feeling "natural", it's just a matter of what you are used to. I know people who are quite addicted to getting a green bar (the typical xUnit sign for "all tests passing") every couple of minutes.
There are so many answers now and they are all different. This perfectly resembles the reality out there. Everyone is doing it differently. I think there is a huge misunderstanding about unit testing. It seems to me as if people heard about TDD and they said it's good. Then they started to write unit tests without really understanding what TDD really is. They just got the part "oh yeah we have to write tests" and they agree with it. They also heard about this "you should write your tests first" but they do not take this serious.
I think it's because they do not understand the benefits of test-first which in turn you can only understand once you've done it this way for some time. And they always seem to find 1.000.000 excuses why they don't like writing the tests first. Because it's too difficult when figuring out how everything will fit together etc. etc. In my opinion, it's all excuses for them to hide away from their inability to once discipline themselve, try the test-first approach and start to see the benefits.
The most ridicoulous thing if they start to argue "I'm not conviced about this test-first thing but I've never done it this way" ... great ...
I wonder where unit testing originally comes from. Because if the concept really originates from TDD then it's just ridicoulous how people get it wrong.
Writing the tests first defines how your code will look like - i.e. it tends to make your code more modular and testable, so you do not create a "bloat" methods with very complex and overlapping functionality. This also helps to isolate all core functionality in separate methods for easier testing.
Personally, I believe unit tests lose a lot of their effectiveness if not done before writing the code.
The age old problem with testing is that no matter how hard we think about it, we will never come up with every possibly scenario to write a test to cover.
Obviously unit testing itself doesn't prevent this completely, as it restrictive testing, looking at only one unit of code not covering the interactions between this code and everything else, but it provides a good basis for writing clean code in the first place that should at least restrict the chances for issues of interaction between modules. I've always worked to the principle of keeping code as simple as it possibly can be - infact I believe this is one of the key principles of TDD.
So starting off with a test that basically says you can create a class of this type and build it up, in theory, writing a test for every line of code or at least covering every route through a particular piece of code. Designing as you go! Obviously based on a rough-up-front design produced initially, to give you a framework to work to.
As you say it is very unnatural to start with and can seem like a waste of time, but I've seen myself first hand that it pays off in the long run when defects stats come through and show the modules that were fully written using TDD have far lower defects over time than others.
Before, during and after.
Before is part of the spec, the contract, the definition of the work
During is when special cases, bad data, exceptions are uncovered while implementing.
After is maintenance, evolution, change, new requirements.
I don't write the actual unit tests first, but I do make a test matrix before I start coding listing all the possible scenarios that will have to be tested. I also make a list of cases that will have to be tested when a change is made to any part of the program as part of regression testing that will cover most of the basic scenarios in the application in addition to fully testing the bit of code that changed.
Remember with Extreme programming your tests effectly are you documenation. So if you don't know what you're testing, then you don't know what you want your application is going to do?
You can start off with "Stories" which might be something like
"Users can Get list of Questions"
Then as you start writing code to solve the unit tests. To solve the above you'll need at least a User and question class. So then you can start thinking about the fields:
"User Class Has Name DOB Address TelNo Locked Fields"
etc.
Hope it helps.
Crafty
Yes, if you are using true TDD principles. Otherwise, as long as you're writing the unit-tests, you're doing better than most.
In my experience, it is usually easier to write the tests before the code, because by doing it that way you give yourself a simple debugging tool to use as you write the code.
I write them at the same time. I create the skeleton code for the new class and the test class, and then I write a test for some functionality (which then helps me to see how I want the new object to be called), and implement it in the code.
Usually, I don't end up with elegant code the first time around, it's normally quite hacky. But once all the tests are working, you can refactor away until you end up with something pretty neat, tidy and proveable to be rock solid.
It helps when you are writing something that you are used writing to write first all the thing you would regularly check for and then write those features. More times then not those features are the most important for the piece of software you are writing. Now , on the other side there are not silver bullets and thing should never be followed to the letter. Developer judgment plays a big role in the decision of using test driven development versus test latter development.

Is duplicated code more tolerable in unit tests?

I ruined several unit tests some time ago when I went through and refactored them to make them more DRY--the intent of each test was no longer clear. It seems there is a trade-off between tests' readability and maintainability. If I leave duplicated code in unit tests, they're more readable, but then if I change the SUT, I'll have to track down and change each copy of the duplicated code.
Do you agree that this trade-off exists? If so, do you prefer your tests to be readable, or maintainable?
Readability is more important for tests. If a test fails, you want the problem to be obvious. The developer shouldn't have to wade through a lot of heavily factored test code to determine exactly what failed. You don't want your test code to become so complex that you need to write unit-test-tests.
However, eliminating duplication is usually a good thing, as long as it doesn't obscure anything, and eliminating the duplication in your tests may lead to a better API. Just make sure you don't go past the point of diminishing returns.
Duplicated code is a smell in unit test code just as much as in other code. If you have duplicated code in tests, it makes it harder to refactor the implementation code because you have a disproportionate number of tests to update. Tests should help you refactor with confidence, rather than be a large burden that impedes your work on the code being tested.
If the duplication is in fixture set up, consider making more use of the setUp method or providing more (or more flexible) Creation Methods.
If the duplication is in the code manipulating the SUT, then ask yourself why multiple so-called “unit” tests are exercising the exact same functionality.
If the duplication is in the assertions, then perhaps you need some Custom Assertions. For example, if multiple tests have a string of assertions like:
assertEqual('Joe', person.getFirstName())
assertEqual('Bloggs', person.getLastName())
assertEqual(23, person.getAge())
Then perhaps you need a single assertPersonEqual method, so that you can write assertPersonEqual(Person('Joe', 'Bloggs', 23), person). (Or perhaps you simply need to overload the equality operator on Person.)
As you mention, it is important for test code to be readable. In particular, it is important that the intent of a test is clear. I find that if many tests look mostly the same, (e.g. three-quarters of the lines the same or virtually the same) it is hard to spot and recognise the significant differences without carefully reading and comparing them. So I find that refactoring to remove duplication helps readability, because every line of every test method is directly relevant to the purpose of the test. That's much more helpful for the reader than a random combination of lines that are directly relevant, and lines that are just boilerplate.
That said, sometimes tests are exercising complex situations that are similiar but still significantly different, and it is hard to find a good way to reduce the duplication. Use common sense: if you feel the tests are readable and make their intent clear, and you're comfortable with perhaps needing to update more than a theoretically minimal number of tests when refactoring the code invoked by the tests, then accept the imperfection and move on to something more productive. You can always come back and refactor the tests later, when inspiration strikes!
Implementation code and tests are different animals and factoring rules apply differently to them.
Duplicated code or structure is always a smell in implementation code. When you start having boilerplate in implementation, you need to revise your abstractions.
On the other hand, testing code must maintain a level of duplication. Duplication in test code achieves two goals:
Keeping tests decoupled. Excessive test coupling can make it hard to change a single failing test that needs updating because the contract has changed.
Keeping the tests meaningful in isolation. When a single test is failing, it must be reasonably straightforward to find out exactly what it is testing.
I tend to ignore trivial duplication in test code as long as each test method stays shorter than about 20 lines. I like when the setup-run-verify rhythm is apparent in test methods.
When duplication creeps up in the "verify" part of tests, it is often beneficial to define custom assertion methods. Of course, those methods must still test a clearly identified relation that can be made apparent in the method name: assertPegFitsInHole -> good, assertPegIsGood -> bad.
When test methods grow long and repetitive I sometimes find it useful to define fill-in-the-blanks test templates that take a few parameters. Then the actual test methods are reduced to a call to the template method with the appropriate parameters.
As for a lot of things in programming and testing, there is no clear-cut answer. You need to develop a taste, and the best way to do so is to make mistakes.
You can reduce repetition using several different flavours of test utility methods.
I'm more tolerant of repetition in test code than in production code, but I have been frustrated by it sometimes. When you change a class's design and you have to go back and tweak 10 different test methods that all do the same setup steps, it's frustrating.
I agree. The trade off exists but is different in different places.
I'm more likely to refactor duplicated code for setting up state. But less likely to refactor the part of the test that actually exercises the code. That said, if exercising the code always takes several lines of code then I might think that is a smell and refactor the actual code under test. And that will improve readability and maintainability of both the code and the tests.
Jay Fields coined the phrase that "DSLs should be DAMP, not DRY", where DAMP means descriptive and meaningful phrases. I think the same applies to tests, too. Obviously, too much duplication is bad. But removing duplication at all costs is even worse. Tests should act as intent-revealing specifications. If, for example, you specify the same feature from several different angles, then a certain amount of duplication is to be expected.
"refactored them to make them more DRY--the intent of each test was no longer clear"
It sounds like you had trouble doing the refactoring. I'm just guessing, but if it wound up less clear, doesn't that mean you still have more work to do so that you have reasonably elegant tests which are perfectly clear?
That's why tests are a subclass of UnitTest -- so you can design good test suites that are correct, easy to validate and clear.
In the olden times we had testing tools that used different programming languages. It was hard (or impossible) to design pleasant, easy-to-work with tests.
You have the full power of -- whatever language you're using -- Python, Java, C# -- so use that language well. You can achieve good-looking test code that's clear and not too redundant. There's no trade-off.
I LOVE rspec because of this:
It has 2 things to help -
shared example groups for testing common behaviour.
you can define a set of tests, then 'include' that set in your real tests.
nested contexts.
you can essentially have a 'setup' and 'teardown' method for a specific subset of your tests, not just every one in the class.
The sooner that .NET/Java/other test frameworks adopt these methods, the better (or you could use IronRuby or JRuby to write your tests, which I personally think is the better option)
I feel that test code requires a similar level of engineering that would normally be applied to production code. There can certainly be arguments made in favor of readability and I would agree that's important.
In my experience, however, I find that well-factored tests are easier to read and understand. If there's 5 tests that each look the same except for one variable that's changed and the assertion at the end, it can be very difficult to find what that single differing item is. Similarly, if it is factored so that only the variable that's changing is visible and the assertion, then it's easy to figure out what the test is doing immediately.
Finding the right level of abstraction when testing can be difficult and I feel it is worth doing.
I don't think there is a relation between more duplicated and readable code. I think your test code should be as good as your other code. Non-repeating code is more readable then duplicated code when done well.
Ideally, unit tests shouldn't change much once they are written so I would lean towards readability.
Having unit tests be as discrete as possible also helps to keep the tests focused on the specific functionality that they are targeting.
With that said, I do tend to try and reuse certain pieces of code that I wind up using over and over, such as setup code that is exactly the same across a set of tests.