Recently, I took ownership of some C++ code. I am going to maintain this code, and add new features later on.
I know many people say that it is usually not worth adding unit tests to existing code, but I would still like to add some tests which will at least partially cover the code. In particular, I would like to add tests which reproduce bugs which I fixed.
Some of the classes are constructed with some pretty complex state, which can make it more difficult to unit-test.
I am also willing to refactor the code to make it easier to test.
Is there any good article you recommend on guidelines which help to identify classes which are easier to unit-test? Do you have any advice of your own?
While Martin Fowler's book on refactoring is a treasure trove of information, why not take a look at "Working Effectively with Legacy Code."
Also, if you're going to be dealing with classes where there's a ton of global variables or huge amounts of state transitions, I'd put in a lot of integration checks. Separate out as much as you can of the code which interacts with the code you're refactoring, to make sure that all expected inputs, in the order they are received, continue to produce the same outputs. This is critical, as it's very easy to "fix" a subtle bug that might have been addressed somewhere else.
Take notes too. If you do find that there is a bug which another function/class expects and handles properly you'll want to change both at the same time. That's difficult unless you keep thorough records.
Presumably the code was written for a purpose, and a unit test will check if the purpose is met, i.e. the pre-conditions and post-conditions hold for the methods.
If the public class methods are such that you can externally check the state it can be unit tested easily enough (black-box test). If the class state is invisible or if you have to test tricky private methods, your test class may need to be a friend (white-box test).
A class that is hard to unit test will be one that
Has enormous dependencies, i.e. tightly coupled
Is intended to work in a high-volume or multi-threaded environment. There you would use a system test rather than a unit test and the actual output may not be totally determinate.
I've written a fair number of blog posts about unit testing non-trivial C++ code: http://www.lenholgate.com/blog/2004/05/practical-testing.html
I've also written quite a lot about adding tests to existing code: http://www.lenholgate.com/blog/testing/
Almost everything can and should be unit tested. If not directly, then by using mock classes.
Since you decided to refactor your classes, try to use BDD or TDD approach.
To prevent breaking existing functionality, the only way is to have good integration tests, but usually it takes time to execute them all for a complex system.
Without more details on what you do, it is not easy to give more specific advice. Some general suggestions:
use MVP or presenter first for developing gui
use design patterns where appropriate
use function and member pointers, or observer design pattern to break dependencies
I think that if you're having to come up with some "measure" to test if a class is testable, you're already fscked. You should be able to tell just by looking at it: can you write an independent program that links to this class alone and makes sure it works?
If a class is too huge so that you can't be sure just by looking at it...chances are it probably isn't testable. People that don't know how to make small, distinct interfaces generally don't know how to adhere to any other principle either.
In the end though, the way to find out if a class is testable is to try to put it in a harness. If you end up having to pull in half your program to do it, try refactoring. If you find that you can't even perform the most basic refactor without having to rewrite the entire program, analyze the expense of doing so.
We at IPL published a paper, "It's testing Jim, but not as we know it", which explores the practical problems of testing C++ and suggests some techniques to address them that may well be of use given your question. These techniques are also well supported in Cantata++, our C/C++ unit and integration testing tool.
Since a few days ago I've started to feel interested in Unit Testing and TDD in C# and VS2010. I've read blog posts, watched youtube tutorials, and plenty more stuff that explains why TDD and Unit Testing are so good for your code, and how to do it.
But the biggest problem I find is, that I don't know what to check in my tests and what not to check.
I understand that I should check all the logical operations, problems with references and dependencies, but for example, should I create a unit test for a string formatting that's supposed to be user input? Or is it just wasting my time when I can just check it in the actual code?
Is there any guide to clarify this problem?
In TDD every line of code must be justified by a failing test-case written before the code.
This means that you cannot develop any code without a test-case. If you have a line of code (condition, branch, assignment, expression, constant, etc.) that can be modified or deleted without causing any test to fail, it means this line of code is useless and should be deleted (or you have a missing test to support its existence).
That is a bit extreme, but this is how TDD works. That being said if you have a piece of code and you are wondering whether it should be tested or not, you are not doing TDD correctly. But if you have a string formatting routine or variable incrementation or whatever small piece of code out there, there must be a test case supporting it.
UPDATE (use-case suggested by Ed.):
Like for example, adding an object to a list and creating a test to see if it is really inside or there is a duplicate when the list shouldn't allow them.
Here is a counterexample, you would be surprised how hard it is to spot copy-paste errors and how common they are:
    private Set<String> inclusions = new HashSet<String>();
    private Set<String> exclusions = new HashSet<String>();

    public void include(String item) {
        inclusions.add(item);
    }

    public void exclude(String item) {
        inclusions.add(item); // copy-paste error: adds to inclusions, not exclusions
    }
On the other hand, testing the include() and exclude() methods alone is overkill because they do not represent any use-cases by themselves. However, they are probably part of some business use-case, which is what you should test instead.
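As a rough sketch of what testing the use-case rather than the individual methods could look like: the `ItemFilter` class, its `accepts()` method and the rule "included and not excluded" are assumptions made up for illustration, not part of the original snippet. With the copy-paste error in `exclude()` still present, the check below would return false.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class built around the snippet above, with exclude() corrected.
class ItemFilter {
    private final Set<String> inclusions = new HashSet<>();
    private final Set<String> exclusions = new HashSet<>();

    void include(String item) {
        inclusions.add(item);
    }

    void exclude(String item) {
        exclusions.add(item);
    }

    // Assumed business rule: an item is accepted if it was included and not excluded.
    boolean accepts(String item) {
        return inclusions.contains(item) && !exclusions.contains(item);
    }
}

class ItemFilterCheck {
    // Exercises the use-case as a whole; returns true when it behaves as expected.
    static boolean excludedItemIsRejected() {
        ItemFilter filter = new ItemFilter();
        filter.include("a");
        filter.include("b");
        filter.exclude("b");
        return filter.accepts("a") && !filter.accepts("b");
    }
}
```

The check never calls include() or exclude() for their own sake, yet it still fails if either of them writes to the wrong set.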
Obviously you shouldn't test whether x in x = 7 is really 7 after assignment. Testing generated getters/setters is also overkill. But it is the easiest code that often breaks, all too often due to copy-and-paste errors or typos (especially in dynamic languages).
See also:
Mutation testing
Your first few TDD projects are going to probably result in worse design/redesign and take longer to complete as you are learning (at least in my experience). This is why you shouldn't jump into using TDD on a large critical project.
My advice is to use "pure" TDD (acceptance/unit test everything test-first) on a few small projects (100-10,000 LOC). Either do the side projects on your own or if you don't code in your free time, use TDD on small internal utility programs for your job.
After you do "pure" TDD on about 6-12 projects, you will start to understand how TDD affects design and learn how to design for testability. Once you know how to design for testability, you will need to TDD less and maximize the ROI of unit, regression, acceptance, etc. tests rather than test everything up front.
For me, TDD is more of a teaching method for good code design than a practical methodology. However, I still TDD logic code and unit test instead of debug.
There is no simple answer to this question. There is the law of diminishing returns in action, so achieving perfect coverage is seldom worth it. Knowing what to test is a thing of experience, not rules. It’s best to consciously evaluate the process as you go. Did something break? Was it feasible to test? If not, is it possible to rewrite the code to make it more testable? Is it worth it to always test for such cases in the future?
If you split your code into models, views and controllers, you’ll find that most of the critical code is in the models, and those should be fairly testable. (That’s one of the main points of MVC.) If a piece of code is critical, I test it, even if it means that I would have to rewrite it to make it more testable. If a piece of code is easy to get wrong or get broken by future updates, it gets a test. I seldom test controllers and views, as it’s not proving worth the trouble for me.
The way I see it all of your code falls into one of three buckets:
Code that is easy to test: This includes your own deterministic public methods.
Code that is difficult to test: This includes GUI, non-deterministic methods, private methods, and methods with complex setup.
Code that you don't want to test: This includes 3rd party code, and code that is difficult to test and not worth the effort.
Of the three, you should focus on testing the easy code. The difficult-to-test code should be refactored into two parts: code that you don't want to test and easy code. And of course, you should test the refactored easy code.
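A minimal sketch of that split, using a made-up `Greeter` class (the names and the greeting rules are purely illustrative assumptions): the non-deterministic part that reads the system clock is reduced to a one-line wrapper, and everything worth testing moves into a deterministic method.

```java
import java.time.LocalTime;

class Greeter {
    // Easy code: deterministic, so it can be tested with plain inputs.
    static String greetingFor(int hour) {
        if (hour < 0 || hour > 23) {
            throw new IllegalArgumentException("hour out of range: " + hour);
        }
        if (hour < 12) {
            return "Good morning";
        }
        if (hour < 18) {
            return "Good afternoon";
        }
        return "Good evening";
    }

    // Code we choose not to test: a thin wrapper around the system clock.
    static String greetingNow() {
        return greetingFor(LocalTime.now().getHour());
    }
}
```

The wrapper is small enough to verify by reading it, which is what makes leaving it untested a reasonable trade.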
I think you should only unit test entry points to behavior of the system. This includes public methods, public accessors and public fields, but not constants (constant fields, enums, methods, etc.). It also includes any code which directly deals with IO; I explain why further below.
My reasoning is as follows:
Everything that's public is basically an entry point to a behavior of the system. A unit test should therefore be written that guarantees that the expected behavior of that entry point works as required. You shouldn't test all possible ways of calling the entry point, only the ones that you explicitly require. Your unit tests are therefore also the specs of what behavior your system supports and your documentation of how to use it.
Things that are not public can basically be deleted/re-factored at will with no impact to the behavior of the system. If you were to test those, you'd create a hard dependency from your unit test to that code, which would prevent you from doing refactoring on it. That's why you should not test anything else but public methods, fields and accessors.
Constants by design are not behavior, but axioms. A unit test that verifies a constant is itself a constant, so it would only be duplicated code and useless effort to write a test for constants.
So to answer your specific example:
should I create a unit test for a string formatting that's supposed to be user input?
Yes, absolutely. All methods which receive or send external input/output (which can be summed up as receiving IO), should be unit tested. This is probably the only case where I'd say non-public things that receive IO should also be unit tested. That's because I consider IO to be a public entry. Anything that's an entry point to an external actor I consider public.
So unit test public methods, public fields, public accessors, even when those are static constructs and also unit test anything which receives or sends data from an external actor, be it a user, a database, a protocol, etc.
NOTE: You can write temporary unit tests on non public things as a way for you to help make sure your implementation works. This is more of a way to help you figure out how to implement it properly, and to make sure your implementation works as you intend. After you've tested that it works though, you should delete the unit test or disable it from your test suite.
Kent Beck, in Extreme Programming Explained, said you only need to test the things that need to work in production.
That's a brusque way of encapsulating both test-driven development, where every change in production code is supported by a test that fails when the change is not present; and You Ain't Gonna Need It, which says there's no value in creating general-purpose classes for applications that only deal with a couple of specific cases.
I think you have to change your point of view.
In a pure form TDD requires the red-green-refactor workflow:
write test (it must fail) RED
write code to satisfy test GREEN
refactor your code
So the question "What do I have to test?" has an answer like: "You have to write a test that corresponds to a feature or a particular requirement".
In this way you also get code coverage and a better code design (remember that TDD also stands for Test Driven "Design").
Generally speaking, you have to test ALL public methods/interfaces.
should I create a unit test for a string formatting that's supposed to be user input? Or is it just wasting my time when I can just check it in the actual code?
Not sure I understand what you mean, but the tests you write in TDD are supposed to test your production code. They aren't tests that check user input.
To put it another way, there can be TDD unit tests that test the user input validation code, but there can't be TDD unit tests that validate the user input itself.
I'm fairly green to unit testing and TDD, so please bear with me as I ask what some may consider newbie questions, or if this has been debated before. If this turns out to be considered a "bad question" (too subjective and open for debate), I will happily close it. However, I've searched for a couple of days and am not getting a definitive answer, and I need a better understanding of this, so I know no better way to get more info than to post here.
I've started reading an older book on unit testing (because a colleague had it on hand), and its opening chapter talks about why to unit test. One of the points it makes is that in the long run, your code is much more reliable and cleaner, and less prone to bugs. It also points out that effective unit testing will make tracking and fixing bugs much easier. So it seems to focus quite a bit on the overall prevention/reduction of bugs in your code.
On the other hand, I also found an article about writing great unit tests, and it states that the goal of unit testing is to make your design more robust, and conversely, finding bugs is the goal of manual testing, not unit testing.
So being the newbie to TDD that I am, I'm a little confused as to the state of mind with which I should go into TDD and building my unit tests. I'll admit that part of the reason I'm taking this on now with my recently started project is because I'm tired of my changes breaking previously existing code. And admittedly, the linked article above does at least point this out as an advantage to TDD. But my hope is that by going back in and adding unit tests to my existing code (and then continuing TDD from this point forward) is to help prevent these bugs in the first place.
Are this book and this article really saying the same thing in different tones, or is there some subjectivity on this subject, and what I'm seeing is just two people having somewhat different views on how to approach TDD?
Thanks in advance.
Unit tests and automated tests generally are for both better design and verified code.
A unit test should test some execution path in a very small unit. This unit is usually a public method or an internal method exposed on your object. The method itself can still use many other protected or private methods of the same object instance. You can have a single method and several unit tests for it that cover different execution paths. (By execution path I mean something controlled by if, switch, etc.) Writing unit tests this way validates that your code really does what you expect. This can be especially important in corner cases, for example where you expect an exception to be thrown in some rare scenario. You can also test how the method behaves if you pass different parameters, for example null instead of an object instance, or a negative value for an integer used for indexing. That is especially useful for a public API.
Now suppose that your tested method also uses instances of other classes. How to deal with it? Should you still test your single method and believe that class works? What if the class is not implemented yet? What if the class has some complex logic inside? Should you test these execution paths as well on your current method? There are two approaches to deal with this:
For some cases you will simply let the real class instance be tested together with your method. This is, for example, very common in the case of logging (it is not bad to have logs available for tests as well).
For other scenarios you would like to remove these dependencies from your method, but how? The solution is dependency injection and programming against an abstraction instead of an implementation. What does that mean? It means that your method / class will not create instances of these dependencies; instead it will get them through method parameters, the class constructor or class properties. It also means that you will not expect a concrete implementation but an abstract base class or an interface. This allows you to pass a fake, dummy or mock implementation to your tested object. These special implementations don't do any real processing; they just receive some data and return the expected result. This allows you to test your method without its dependencies and leads to a much better and more extensible design.
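To make the second approach concrete, here is a small sketch of constructor injection with a hand-written fake. `MessageSender`, `WelcomeService` and `RecordingSender` are hypothetical names invented for this example; a mocking framework could replace the fake, but nothing beyond the standard library is assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction that the tested class depends on (hypothetical).
interface MessageSender {
    void send(String recipient, String text);
}

// Class under test: it receives its dependency instead of creating it.
class WelcomeService {
    private final MessageSender sender;

    WelcomeService(MessageSender sender) {
        this.sender = sender;
    }

    void welcome(String user) {
        if (user == null || user.isEmpty()) {
            throw new IllegalArgumentException("user must not be empty");
        }
        sender.send(user, "Welcome, " + user + "!");
    }
}

// Fake implementation: records calls instead of sending anything.
class RecordingSender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String text) {
        sent.add(recipient + ": " + text);
    }
}
```

A test would pass a `RecordingSender` into `WelcomeService`, call `welcome("ann")`, and then inspect `sent`; no real delivery mechanism is ever touched.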
What is the disadvantage? Once you start using fakes / mocks you are testing a single method / class, but you don't have a test which grabs all the real implementations and puts them together to check that the whole system really works. You can have thousands of unit tests validating that each of your methods works, but that doesn't mean they will work together. This is the scenario for more complex tests: integration or end-to-end tests.
Unit tests should usually be very easy to write. If they are not, your design is probably complicated and you should think about refactoring. They should also be very fast to execute so you can run them very often. Other kinds of tests can be more complex and very slow, and they should run mostly on the build server.
How does this fit into the software development process? The worst part of the development process is stabilization and bug fixing, because it is very hard to estimate. To estimate how long a bug fix takes you must know what causes the bug, but that investigation cannot be estimated. You can have a bug that takes one hour to fix after you have spent two weeks debugging your application to find it. With good code coverage you will most probably find such a bug early during development.
Automated testing doesn't say that the software contains no bugs. It only says that you did your best to find and fix them during development, and because of that your stabilization can be much less painful and much shorter. It also doesn't say that your software does what it should. That is more about the application logic itself, which must be tested by separate tests going through each use case / user story: acceptance tests (they can also be automated).
How does this fit with TDD? TDD takes it to the extreme, because in TDD you write your test first to drive your quality, code coverage and design.
It's a false choice. "Find/minimize bugs" OR improve design.
TDD, in particular (and as opposed to "just" unit testing) is all about giving you better design.
And when your design is better, what are the consequences?
Your code is easier to read
Your code is easier to understand
Your code is easier to test
Your code is easier to reuse
Your code is easier to debug
Your code has fewer bugs in the first place
With well-designed code, you spend less time finding and fixing bugs, and more time adding features and polish. So TDD gives you a savings on bugs and bug-hunting, by giving you better design. These things are not separate; they are dependent and interrelated.
There can be many different reasons why you might want to test your code. Personally, I test for a number of reasons:
I usually design API using a combination of the normal design patterns (top-down) and test-driven development (TDD; bottom-up) to ensure that I have a sound API both from a best practices point-of-view as well as from an actual usage point-of-view. The focus of the tests is both on the major use-cases for the API, but also on the completeness of the API and the behavior - so they are primary "black box" tests. The development sequence is often:
main API based on design patterns and "gut feeling"
TDD tests for the major use-cases according to the high-level specification for the API - primary in order to make sure the API is "natural" and easy to use
fleshed out API and behavior
all the needed test cases to ensure the completeness and correct behavior
Whenever I fix an error in my code, I try to write a test to make sure it stays fixed. Somehow, the error got into my original design and passed my original testing of the code, so it is probably not all that trivial. I have noticed that many of these tests are "white box" tests.
In order to be able to make any sort of major re-factoring of the code, you need an extensive set of API tests to make sure the behavior of the code stays the same after the re-factoring. For any non-trivial API, I want the test suite to be in place and working for a long time before the re-factoring, to be sure that all the major use-cases are covered in a good way. As often as not, you are forced to throw away most of your "white box" tests as they, by their very definition, make too many assumptions about the internals. I usually try to "translate" as many as possible of these tests, as the same non-trivial problems tend to survive re-factoring of the code.
In order to transfer any code between developers, I usually also want a good test suite with focus on the API and the major use-cases. So basically the tests from the initial TDD...
I think that answer to your question is: both.
You will improve design because there is one particular thing about TDD that is great: while you write tests you put yourself in the position of the client code that will be using the system under test - and this alone makes you think about certain design choices.
For example: UI. When you start writing the tests, you will see that those God-Forms are impossible to test, so you separate the logic behind the screens to a presenter/controller, and you get MVP/MVC/whatever.
The idea of unit testing a class and mocking its dependencies leads you to the Single Responsibility Principle. A similar point can be made about each of the SOLID principles.
As for bugs, well, if you unit test every method of every class you write (except properties, very simple methods and such) you will catch most bugs at the start. Add integration tests and you cover almost all of them.
I'll take my stab at this using a remix of a previous answer I wrote. In short, I don't see this as a dichotomy between driving good design and minimizing bugs. I see it more as one (good design) leading to the other (minimizing bugs).
I tend towards saying TDD is a design process that happens to involve unit testing. It's a design process because within each Red-Green-Refactor iteration, you write the test first for code that doesn't exist. You're designing as you're going.
The first beauty of TDD is that the design of your code is guaranteed to be testable. Testable code tends to have loose coupling and high cohesion. Loose coupling and high cohesion are important because they make the code easy to change when requirements change. The second beauty of TDD is that after you're done implementing your system, you happen to have a huge regression suite to catch any bugs and changes in assumptions. Thus, TDD makes your code easy to change because of the design it creates and it makes your code safe to change because of the test harness it creates.
Trying to retrospectively add unit tests can be quite painful and expensive. If the code doesn't support unit testing, you may be better off looking at integration tests to test your code.
Don't mix Unit Testing with TDD.
Unit Testing is just the fact of "testing" your code to ensure quality and maintainability.
TDD is a full blown development methodology in which you first write your tests (based on requirements), and only then you write the needed code (and just the needed code) to make that test pass. This means that you only write code to repair a broken test.
Once that is done, you write another test, and the code needed to make it pass. Along the way, you may be forced to refactor the code so that a new test can run without breaking another. This way, the "design" arises from the tests.
The purpose of this methodology is of course to reduce bugs and improve design, but its main goal is to improve productivity, because you write exactly the code you need. And you don't write documentation: the tests are the documentation. If a requirement changes, then you change the tests and the code afterwards. If new requirements appear, just add new tests.
Is it generally accepted that you cannot test code unless the code is setup to be tested?
A hypothetical bit of code:
    public void QueueOrder(SalesOrder order)
    {
        if (order.Date < DateTime.Now-20)
            throw new Exception("Order is too old to be processed");
        ...
    }
Some would consider refactoring it into:
    protected DateTime MinOrderAge()
    {
        return DateTime.Now-20;
    }
    public void QueueOrder(SalesOrder order)
    {
        if (order.Date < MinOrderAge())
            throw new Exception("Order is too old to be processed");
        ...
    }
Note: You can come up with even more complicated solutions; involving an IClock interface and factory. It doesn't affect my question.
The issue with changing the above code is that the code has changed. The code has changed without the customer asking for it to be changed. And any change requires meetings and conference calls. And so I'm at the point where it's easier not to test anything.
If I'm not willing/able to make changes: does it make me not able to perform testing?
Note: The above pseudo-code might look like C#, but that's only so it's readable. The question is language agnostic.
Note: The hypothetical code snippet, problem, need for refactoring, and refactoring are hypothetical. You can insert your own hypothetical code sample if you take umbrage with mine.
Note: The above hypothetical code is hypothetical. Any relation to any code, either living or dead, is purely coincidental.
Note: The code is hypothetical, but any answers are not. The question is not subjective, as I believe there is an answer.
Update: The problem here, of course, is that I cannot guarantee that the change in the above example didn't break anything. Sure, I refactored one piece of code out to a separate method, and the code is logically identical.
But I cannot guarantee that adding a new protected method didn't offset the Virtual Method Table of the object, and if this class is in a DLL then I've just introduced an access violation.
The answer is yes, some code will need to change to make it testable.
But there is likely lots of code that can be tested without having to change it. I would focus on writing tests for that stuff first, then writing tests for the rest when other customer requirements give you the opportunity to refactor it in a testable way.
Code can be written from the start to be testable. If it is not written from the start with testability in mind, you can still test it, you may just run into some difficulties.
In your hypothetical code, you could test the original code by creating a SalesOrder with a date far in the past, or you could mock out DateTime.Now. Having the code refactored as you showed is nicer for testing, but it isn't absolutely necessary.
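A sketch of the first option, testing the unmodified method by feeding it an order dated far in the past. This is a Java rendering of the question's pseudocode; the `SalesOrder` and `OrderQueue` shapes are assumptions, and so is reading the "20" as 20 days, since the question does not give a unit.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical Java rendering of the question's pseudocode.
class SalesOrder {
    final Instant date;

    SalesOrder(Instant date) {
        this.date = date;
    }
}

class OrderQueue {
    // Assumption: the "20" in the pseudocode means 20 days.
    void queueOrder(SalesOrder order) {
        if (order.date.isBefore(Instant.now().minus(Duration.ofDays(20)))) {
            throw new IllegalStateException("Order is too old to be processed");
        }
        // ... rest of the processing is elided, as in the question
    }
}

class OrderQueueCheck {
    // An order dated a year ago is far enough in the past to be rejected.
    static boolean rejectsOldOrder() {
        SalesOrder old = new SalesOrder(Instant.now().minus(Duration.ofDays(365)));
        try {
            new OrderQueue().queueOrder(old);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // An order dated right now must pass the age check.
    static boolean acceptsFreshOrder() {
        new OrderQueue().queueOrder(new SalesOrder(Instant.now()));
        return true;
    }
}
```

Choosing dates well away from the cutoff keeps the checks stable even though the code under test still reads the real clock. Testing the exact boundary is where a mocked or injected clock starts to pay off.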
If your code is not designed to be tested then it is more difficult to test it. In your example you would have to override the DateTime.Now method, which is probably no easy task.
If you think it adds little value to add tests to your code, or changing existing code is not allowed, then you should not do it.
However, if you believe in TDD then you should write new code with tests.
You can unit test your original example using a Mock object framework. In this case I would mock the SalesOrder object several times, configuring a different Date value each time, and test. This avoids changing any code that ships and allows you to validate the algorithm in question that the order date is not too far in the past.
For a better overall view of what's possible given the dependencies you're dealing with, and the language features you have at your disposal, I recommend Working Effectively with Legacy Code.
This is easy to accomplish in some dynamic languages. For example I can hook inside the import/using statements and replace an actual dependency with a stub one, even if the SUT (System Under Test) uses it as an implicit dependency. Or I can redefine those symbols (classes, methods, functions, etc.). I'm not saying this is the way to go. Things should be refactored, but it is easier to write some characterization tests.
The problem with this sort of code is always, that it's creating and depending on a lot of static classes, framework types, etc. etc. ...
A very good solution to 'inject' fakes for all these objects is Typemock Isolator (which is commercial, but worth every penny). So yes, you certainly can test legacy code, which was written without testability in mind. I've done it on a big project with Typemock and had very good results.
As an alternative to Typemock, you may use the free MS Moles framework, which does basically the same thing. It just has a rather unintuitive API and is much harder to learn and use.
HTH.
Thomas
Mockito + PowerMock for Mockito.
You'll be able to test almost everything without dramatically changing your code. But some setters will be needed to inject the mocks.
What kind of practices do you use to make your code more unit testing friendly?
TDD -- write the tests first, forces you to think about testability and helps write the code that is actually needed, not what you think you may need
Refactoring to interfaces -- makes mocking easier
Public methods virtual if not using interfaces -- makes mocking easier
Dependency injection -- makes mocking easier
Smaller, more targeted methods -- tests are more focused, easier to write
Avoidance of static classes
Avoid singletons, except where necessary
Avoid sealed classes
Dependency injection seems to help.
Write the tests first - that way, the tests drive your design.
Use TDD
When writing your code, utilise dependency injection wherever possible
Program to interfaces, not concrete classes, so you can substitute mock implementations.
Make sure all of your classes follow the Single Responsibility Principle. Single responsibility means that each class should have one and only one responsibility. That makes unit testing much easier.
I'm sure I'll be down voted for this, but I'm going to voice the opinion anyway :)
While many of the suggestions here have been good, I think it needs to be tempered a bit. The goal is to write more robust software that is changeable and maintainable.
The goal is not to have code that is unit testable. There's a lot of effort put into making code more "testable" despite the fact that testable code is not the goal. It sounds really nice and I'm sure it gives people the warm fuzzies, but the truth is all of those techniques, frameworks, tests, etc, come at a cost.
They cost time in training, maintenance, productivity overhead, etc. Sometimes it's worth it, sometimes it isn't, but you should never put the blinders on and charge ahead with making your code more "testable".
When writing tests (as with any other software task) Don't Repeat Yourself (DRY principle). If you have test data that is useful for more than one test then put it someplace where both tests can use it. Don't copy the code into both tests. I know this seems obvious but I see it happen all the time.
I use Test-Driven Development whenever possible, so I don't have any code that cannot be unit tested. It wouldn't exist unless the unit test existed first.
The easiest way is don't check in your code unless you check in tests with it.
I'm not a huge fan of writing the tests first. But one thing I believe very strongly in is that code must be checked in with tests. Not even an hour or so before, together. I think the order in which they are written is less important as long as they come in together.
Small, highly cohesive methods. I learned this the hard way. Imagine you have a public method that handles authentication. Maybe you did TDD, but if the method is big, it will be hard to debug. Instead, if that #authenticate method does stuff in a more pseudo-codish kind of way, calling other small methods (maybe protected), when a bug shows up, it's easy to write new tests for those small methods and find the faulty one.
And something that you learn first thing in OOP, but so many seem to forget: Code Against Interfaces, Not Implementations.
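The authentication example might look roughly like this in C++. The class, the helper names, and the hard-coded rules are all hypothetical; the point is only the shape: a short public method that reads like pseudocode, and small protected helpers a test can reach through a thin subclass.

```cpp
#include <cassert>
#include <string>

// Hypothetical authenticator; the rules below are invented for illustration.
class Authenticator {
public:
    // Reads like pseudocode: each step is a small, separately testable helper.
    bool authenticate(const std::string& user, const std::string& password) const {
        return isKnownUser(user) && isWellFormed(password) &&
               passwordMatches(user, password);
    }

protected:
    bool isKnownUser(const std::string& user) const { return user == "alice"; }
    bool isWellFormed(const std::string& password) const { return password.size() >= 8; }
    bool passwordMatches(const std::string& user, const std::string& password) const {
        return user == "alice" && password == "correct-horse";
    }
};

// Test-only subclass that makes the protected helpers callable from tests.
class AuthenticatorProbe : public Authenticator {
public:
    using Authenticator::isKnownUser;
    using Authenticator::isWellFormed;
    using Authenticator::passwordMatches;
};

void testHelpersIndividually() {
    AuthenticatorProbe probe;
    assert(probe.isKnownUser("alice"));
    assert(!probe.isKnownUser("bob"));
    assert(!probe.isWellFormed("short"));
    assert(probe.passwordMatches("alice", "correct-horse"));
}

void testAuthenticate() {
    Authenticator auth;
    assert(auth.authenticate("alice", "correct-horse"));
    assert(!auth.authenticate("alice", "wrong-password"));
}
```

When `authenticate` misbehaves, a failing assertion in `testHelpersIndividually` points at the specific step rather than at the whole method.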
Spend some time refactoring untestable code to make it testable. Write the tests and get 95% coverage. Doing that taught me all I need to know about writing testable code. I'm not opposed to TDD, but learning the specifics of what makes code testable or untestable helps you to think about testability at design time.
Don't write untestable code
1. Using a framework/pattern like MVC to separate your UI from your business logic will help a lot.
2. Use dependency injection so you can create mock test objects.
3. Use interfaces.
Check out this talk: Automated Testing Patterns and Smells.
One of the main takeaways for me was to make sure that the unit test code is of high quality. If the code is well documented and well written, everyone will be motivated to keep this up.
No Statics - you can't mock out statics.
Also Google has a tool that will measure the testability of your code...
I'm continually trying to find a process where unit testing is less of a chore and something that I actually WANT to do. In my experience, a pretty big factor is your tools. I do a lot of ActionScript work and sadly, the tools are somewhat limited, such as no IDE integration and lack of more advanced mocking frameworks (but good things are a-coming, so no complaints here!). I've done test driven development before with more mature testing frameworks and it was definitely a more pleasurable experience, but still felt like somewhat of a chore.
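To illustrate the point about statics: a call made directly to a static or global function is fixed at compile time, so a test cannot substitute it. One common workaround is to route the call through something injectable. The sketch below assumes a hypothetical `Session` class and uses `std::function` as the seam around `std::time`.

```cpp
#include <cassert>
#include <ctime>
#include <functional>
#include <utility>

// Hypothetical seam: anything that returns the current time.
using Clock = std::function<std::time_t()>;

// Production default simply forwards to the global function.
inline std::time_t systemClock() { return std::time(nullptr); }

class Session {
public:
    explicit Session(std::time_t lifetimeSeconds, Clock clock = systemClock)
        : clock_(std::move(clock)), expiresAt_(clock_() + lifetimeSeconds) {}

    bool expired() const { return clock_() >= expiresAt_; }

private:
    Clock clock_;            // declared first, so it is initialized before expiresAt_
    std::time_t expiresAt_;
};

// The test controls time instead of sleeping.
void testSessionExpires() {
    std::time_t now = 1000;
    Session session(60, [&now] { return now; });
    assert(!session.expired());
    now = 1059;
    assert(!session.expired());
    now = 1060;
    assert(session.expired());
}
```

Production code keeps calling `Session(60)` unchanged, while tests pass a fake clock.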
Recently however I started writing code in a different manner. I used to start with writing the test, watching them fail, writing code to make the test succeed, rinse and repeat and all that.
Now however, I start with writing interfaces, almost no matter what I'm going to do. At first I of course try to identify the problem and think of a solution. Then I start writing the interfaces to get a sort of abstract feel for the code and the communication. At that point, I usually realize that I haven't really figured out a proper solution to the problem at all as a result of me not fully understanding the problem. So I go back, revise the solution and revise my interfaces. When I feel that the interfaces reflect my solution, I actually start with writing the implementation, not the tests. When I have something implemented (draft implementations, usually baby steps), I start testing it. I keep going back between testing and implementing, a few steps forward at a time. Since I have interfaces for everything, it's incredibly easy to inject mocks.
I find working like this, with classes having very little knowledge of other implementation and only talking to interfaces, is extremely liberating. It frees me from thinking about the implementation of another class and I can focus on the current unit. All I need to know is the contract that the interface provides.
But yeah, I'm still trying to work out a process that works super-fantastically-awesomely-well every time.
Oh, I also wanted to add that I don't write tests for everything. Vanilla properties that don't do much but get/set variables are useless to test. They are guaranteed by the language contract to work. If they don't, I have way worse problems than my units not being testable.
To prepare your code to be testable:
Document your assumptions and exclusions.
Avoid large complex classes that do more than one thing - keep the single responsibility principle in mind.
When possible, use interfaces to decouple interactions and allow mock objects to be injected.
When possible, make public methods virtual to allow mock objects to emulate them.
When possible, use composition rather than inheritance in your designs - this also encourages (and supports) encapsulation of behaviors into interfaces.
When possible, use dependency injection libraries (or DI practices) to provide instances with their external dependencies.
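The virtual-method item in the list above can be sketched in C++ as follows. `ReportGenerator` and its methods are hypothetical names; the stand-in body of `fetchBalance` represents whatever slow or external call the real class would make.

```cpp
#include <cassert>
#include <string>

// Hypothetical class with no separate interface. The expensive call is virtual,
// so a test subclass can replace it.
class ReportGenerator {
public:
    virtual ~ReportGenerator() = default;

    std::string buildReport(int customerId) {
        return "Balance: " + std::to_string(fetchBalance(customerId));
    }

    // In production this would query a database; here it is only a stand-in.
    virtual int fetchBalance(int /*customerId*/) { return -1; }
};

// Test double: overrides only the method that touches the outside world.
class StubbedReportGenerator : public ReportGenerator {
public:
    int fetchBalance(int /*customerId*/) override { return 42; }
};

void testReportUsesFetchedBalance() {
    StubbedReportGenerator generator;
    assert(generator.buildReport(7) == "Balance: 42");
}
```

This is usually a stepping stone for existing code: it gives tests a seam without changing callers, and the overridden method can later move behind an injected interface.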
To get the most out of your unit tests, consider the following:
Educate yourself and your development team about the capabilities of the unit testing framework, mocking libraries, and testing tools you intend to use. Understanding what they can and cannot do will be essential when you actually begin writing your tests.
Plan out your tests before you begin writing them. Identify the edge cases, constraints, preconditions, postconditions, and exclusions that you want to include in your tests.
Fix broken tests as near to when you discover them as possible. Tests help you uncover defects and potential problems in your code. If your tests are broken, you open the door to having to fix more things later.
If you follow a code review process in your team, code review your unit tests as well. Unit tests are as much a part of your system as any other code - reviews help to identify weaknesses in the tests just as they would for system code.
You don't necessarily need to "make your code more unit testing friendly".
Instead, a mocking toolkit can be used to make testability concerns go away.
One such toolkit is JMockit.
Having just read the first four chapters of Refactoring: Improving the Design of Existing Code, I embarked on my first refactoring and almost immediately came to a roadblock. It stems from the requirement that before you begin refactoring, you should put unit tests around the legacy code. That allows you to be sure your refactoring didn't change what the original code did (only how it did it).
So my first question is this: how do I unit-test a method in legacy code? How can I put a unit test around a 500 line (if I'm lucky) method that doesn't do just one task? It seems to me that I would have to refactor my legacy code just to make it unit-testable.
Does anyone have any experience refactoring using unit tests? And, if so, do you have any practical examples you can share with me?
My second question is somewhat hard to explain. Here's an example: I want to refactor a legacy method that populates an object from a database record. Wouldn't I have to write a unit test that compares an object retrieved using the old method, with an object retrieved using my refactored method? Otherwise, how would I know that my refactored method produces the same results as the old method? If that is true, then how long do I leave the old deprecated method in the source code? Do I just whack it after I test a few different records? Or, do I need to keep it around for a while in case I encounter a bug in my refactored code?
Lastly, since a couple people have asked...the legacy code was originally written in VB6 and then ported to VB.NET with minimal architecture changes.
For instructions on how to refactor legacy code, you might want to read the book Working Effectively with Legacy Code. There's also a short PDF version available here.
Good example of theory meeting reality. Unit tests are meant to test a single operation and many pattern purists insist on Single Responsibility, so we have lovely clean code and tests to go with it. However, in the real (messy) world, code (especially legacy code) does lots of things and has no tests. What this needs is a dose of refactoring to clean the mess.
My approach is to build tests, using the Unit Test tools, that test lots of things in a single test. In one test, I may be checking the DB connection is open, changing lots of data, and doing a before/after check on the DB. I inevitably find myself writing helper classes to do the checking, and more often than not those helpers can then be added into the code base, as they have encapsulated emergent behaviour/logic/requirements. I don't mean I have a single huge test; what I do mean is that many tests are doing work which a purist would call an integration test - does such a thing still exist? Also I've found it useful to create a test template and then create many tests from that, to check boundary conditions, complex processing etc.
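One way to read the "test template" idea is a table-driven test: the checking logic is written once and each boundary condition is a row of data. This is a sketch of that interpretation in C++, not necessarily what the answer's author does; `clampPercent` is a hypothetical function under test.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function under test.
int clampPercent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

// One row per case; the loop below is the reusable "template".
struct Case {
    int input;
    int expected;
};

// Returns the number of failing cases so the caller can report or assert on it.
int runClampCases() {
    const std::vector<Case> cases = {
        {-1, 0}, {0, 0}, {1, 1}, {99, 99}, {100, 100}, {101, 100},
    };
    int failures = 0;
    for (const Case& c : cases) {
        if (clampPercent(c.input) != c.expected) ++failures;
    }
    return failures;
}

void testClampBoundaries() {
    assert(runClampCases() == 0);
}
```

Adding another boundary condition is a one-line change to the table.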
BTW which language environment are we talking about? Some languages lend themselves to refactoring better than others.
From my experience, I'd write tests not for particular methods in the legacy code, but for the overall functionality it provides. These might or might not map closely to existing methods.
Write tests at whatever level of the system you can (if you can); if that means running a database etc. then so be it. You will need to write a lot more code to assert what the code is currently doing, as a 500 line+ method is possibly going to have a lot of behaviour wrapped up in it. As for comparing the old versus the new: if you write the tests against the old code, they pass, and they cover everything it does, then when you run them against the new code you are effectively checking the old against the new.
I did this to test a complex sql trigger I wanted to refactor, it was a pain and took time but a month later when we found another issue in that area it was worth having the tests there to rely on.
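The approach in this answer (pin down what the old code does, then run the same tests against the new code) can be sketched in C++ like this. Both `legacyFormatName` and `formatName` are hypothetical, standing in for an old routine and its refactored replacement; the expected values would be recorded by observing the old code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical legacy routine whose current behaviour we want to pin down.
std::string legacyFormatName(const std::string& first, const std::string& last) {
    std::string out;
    if (!last.empty()) out += last;
    if (!first.empty()) {
        if (!out.empty()) out += ", ";
        out += first;
    }
    return out;
}

// Hypothetical refactored replacement.
std::string formatName(const std::string& first, const std::string& last) {
    if (last.empty()) return first;
    if (first.empty()) return last;
    return last + ", " + first;
}

using Formatter = std::string (*)(const std::string&, const std::string&);

// The same recorded expectations are run against either implementation.
bool matchesRecordedBehaviour(Formatter format) {
    struct Case {
        std::string first, last, expected;
    };
    const std::vector<Case> cases = {
        {"Ada", "Lovelace", "Lovelace, Ada"},
        {"", "Lovelace", "Lovelace"},
        {"Ada", "", "Ada"},
        {"", "", ""},
    };
    for (const Case& c : cases) {
        if (format(c.first, c.last) != c.expected) return false;
    }
    return true;
}

void testOldAndNewAgree() {
    assert(matchesRecordedBehaviour(legacyFormatName));  // written first, against the old code
    assert(matchesRecordedBehaviour(formatName));        // then reused for the refactored code
}
```

Once the new implementation passes the same cases, the tests themselves carry the old behaviour forward, so the old function does not need to stay in the code base for comparison.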
In my experience this is the reality when working on legacy code. The book (Working with Legacy..) mentioned by Esko is an excellent work which describes various approaches that can take you there.
I have seen similar issues with our unit tests themselves, which have grown to become system/functional tests. The most important thing when developing tests for legacy or existing code is to define the term "unit". It can even be a functional unit like "reading from database" etc. Identify key functional units and maintain tests which add value.
As an aside, there was a recent talk between Joel S. and Martin F. on TDD/unit-tests. My take is that it is important to define the unit and keep focus on it! URLS: Open Letter, Joel's transcript and podcast
That really is one of the key problems of trying to refit legacy code. Are you able to break the problem domain down to something more granular? Does that 500+ line method make anything other than system calls to JDK/Win32/.NET Framework JARs/DLLs/assemblies? I.e. Are there more granular function calls within that 500+ line behemoth that you could unit test?
The following book: The Art of Unit Testing contains a couple of chapters with some interesting ideas on how to deal with legacy code in terms of developing Unit Tests.
I found it quite helpful.