Unit tests in C++

Is it okay to break all dependencies using interfaces just to make a class testable? It involves significant overhead at runtime because of many virtual calls instead of plain method invocations.
How does test driven development work in real world C++ applications? I read Working Effectively With Legacy Code and found it quite useful, but I am not getting up to speed with practising TDD.
When I do a refactoring, it very often happens that I have to completely rewrite the unit tests because of massive logic changes. My code changes very often alter the fundamental logic of how the data is processed. I do not see a way to write unit tests which do not have to change in a large refactoring.
Maybe someone can point me to an open source C++ application which uses TDD, so I can learn by example.

Update: See this question too.
I can only answer some parts here:
Is it okay to break all dependencies using interfaces just to make a class testable? It involves significant overhead at runtime because of many virtual calls instead of plain method invocations.
If your performance will suffer too much because of it, no (benchmark!). If your development suffers too much, no (estimate extra effort). If it seems like it won't matter much and help in the long run and helps you with quality, yes.
You could always 'friend' your test classes, or add a TestAccessor object through which your tests can inspect the internals. That avoids making everything dynamically dispatchable just for testing. (It does sound like quite a bit of work.)
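As a rough sketch of the friend / TestAccessor idea (the Widget and WidgetTestAccessor names here are hypothetical), assuming the tests only need read access to the internals:

    // Hypothetical class under test: internals exposed to tests without virtual dispatch.
    class Widget {
    public:
        void process(int input) { cached_ = input * 2; }
    private:
        friend class WidgetTestAccessor;  // grants the test helper access to internals
        int cached_ = 0;
    };

    // Lives only in test code; production code never uses it.
    class WidgetTestAccessor {
    public:
        static int cached(const Widget& w) { return w.cached_; }
    };

    // In a test:
    //   Widget w;
    //   w.process(21);
    //   assert(WidgetTestAccessor::cached(w) == 42);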
Designing testable interfaces isn't easy. Sometimes you have to add some extra methods that access the innards just for testing. It makes you cringe a bit but it's good to have and more often than not those functions are useful in the actual application as well, sooner or later.
When I do a refactoring, it very often happens that I have to completely rewrite the unit tests because of massive logic changes. My code changes very often alter the fundamental logic of how the data is processed. I do not see a way to write unit tests which do not have to change in a large refactoring.
Large refactorings by definition change a lot, including the tests. Be happy you have them, as they will test things after the refactoring too.
If you spend more time refactoring than making new features, perhaps you should consider thinking a bit more before coding to find better interfaces that can withstand more changes. Also, writing unit-tests before interfaces are stable is a pain, no matter what you do.
The more code you have against an interface that changes much, the more code you will have to change each time. I think your problem lies there. I've managed to have sufficiently stable interfaces in most places, and refactor only parts now and then.
Hope it helps.

I've routinely used macros, #if and other preprocessor tricks to "mock-out" dependencies for the purpose of unit testing in C and C++, exactly because with such macros I don't have to pay any run-time cost when the code is compiled for production rather than testing. Not elegant, but reasonably effective.
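For example, a compile-time switch along these lines (UNIT_TESTING, read_sensor and sensor_driver_read are made-up names for this sketch, not from the original answer):

    // Compile-time "mocking": the UNIT_TESTING build gets a canned stand-in.
    #ifdef UNIT_TESTING
    inline int read_sensor() { return 42; }            // test double with a fixed value
    #else
    int sensor_driver_read();                          // real driver (hypothetical), linked only in production builds
    inline int read_sensor() { return sensor_driver_read(); }
    #endif

    // The code under test calls read_sensor() directly; no virtual dispatch in either build.
    bool temperature_alarm() { return read_sensor() > 40; }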
As for refactorings, they may well require changing the tests when they are as overwhelmingly large and intrusive as you describe. I don't find myself refactoring so drastically all that often, though.

The obvious answer would be to factor out dependencies using templates rather than interfaces. Of course that might hurt compile-times (depending on exactly how you implement it), but it should eliminate the runtime overhead at least. A slightly simpler solution might be to just rely on a set of typedefs, which can be swapped out with a few macros or similar.
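A minimal sketch of the template approach, assuming a hypothetical SessionTimer that needs a clock: the dependency becomes a template parameter, so tests substitute a fake without any virtual calls.

    #include <ctime>

    struct RealClock {
        long now() const { return static_cast<long>(std::time(nullptr)); }
    };

    // The dependency is a template parameter instead of an interface.
    template <typename Clock = RealClock>
    class SessionTimer {
    public:
        explicit SessionTimer(Clock& clock) : clock_(clock), start_(clock.now()) {}
        long elapsed() const { return clock_.now() - start_; }
    private:
        Clock& clock_;
        long start_;
    };

    // In a test, a fake clock controls time explicitly:
    struct FakeClock {
        long t = 0;
        long now() const { return t; }
    };
    //   FakeClock fc;
    //   SessionTimer<FakeClock> timer(fc);
    //   fc.t = 5;
    //   assert(timer.elapsed() == 5);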

As to your first question - it's rarely worthwhile to break things just for the sake of testing, although sometimes you might have to break things before making them better as part of your refactoring. The most important criterion for a software product is that it works, not that it's testable. Testability is only important insofar as it helps you make a product that is more stable and works better for your end-users.
A big part of test-driven development is selecting small, atomic parts of your code that aren't likely to change for unit testing. If you're having to rewrite a lot of unit tests because of massive logic changes, you might need to test at a finer-grained level, or re-design your code so that it's more stable. A stable design shouldn't change drastically over time, and testing won't help you avoid massive refactoring if that becomes required. However, if done right, testing can make it so that when you refactor things you can be more confident that your refactoring was successful, assuming that there are some tests that don't need to be changed.

Is it okay to break all dependencies using interfaces just to make a class testable? It involves significant overhead at runtime because of many virtual calls instead of plain method invocations.
I think it is OK to break the dependencies, since that will lead into better interfaces.
When I do a refactoring, it very often happens that I have to completely rewrite the unit tests because of massive logic changes. My code changes very often alter the fundamental logic of how the data is processed. I do not see a way to write unit tests which do not have to change in a large refactoring.
You won't get rid of these large refactorings in any language, since your tests should express the real intent of your code. So if the logic changes, your tests must change.
Maybe you aren't really doing TDD, which goes like this:
Create a test that fails
Create the code to pass the test
Create another test that fails
Fix the code to pass both tests
Rinse and repeat until you think you have enough tests that show what your code should be doing
These steps say that you should be doing minor changes, and not big ones. If you stay with the latter, you can't escape big refactors. No language will save you from that, and C++ will be the worst of them because of the compile times, link times, bad error messages, etc. etc.
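For illustration, one red-green iteration might look like this in plain C++; parse_port and the tests are invented for this sketch, and a bare assert stands in for a test framework.

    #include <cassert>
    #include <cstdlib>
    #include <string>

    // Step 1 (red): write the failing test first. parse_port() does not exist yet,
    // so this does not even compile:
    //   assert(parse_port("not-a-number") == -1);

    // Step 2 (green): write just enough code to make that one test pass.
    int parse_port(const std::string& s) {
        char* end = nullptr;
        long v = std::strtol(s.c_str(), &end, 10);
        return (*end != '\0' || v < 0 || v > 65535) ? -1 : static_cast<int>(v);
    }

    int main() {
        assert(parse_port("not-a-number") == -1);  // the original failing test, now passing
        assert(parse_port("8080") == 8080);        // the next test, added in a later iteration
        return 0;
    }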
I'm actually working in a real world software written in C++ with a HUGE legacy code under it. We're using TDD and it is really helping evolve the design of the software.

When I do a refactoring, it very often happens that I have to completely rewrite the unit tests because of massive logic changes. ... I do not see a way to write unit tests which do not have to change in a large refactoring.
There are multiple layers of testing, and some of those layers won't break even after big logic changes. Unit tests, on the other hand, are meant to test the internals of methods and objects, and will need to change more often. There's nothing necessarily wrong with that; it's just how things are.
Is [it] okay to break all dependencies using interfaces just to make a class testable?
It's definitely OK to design classes to be more testable. That's part of the purpose of TDD, after all.
It involves significant overhead at runtime because of many virtual calls instead of plain method invocations.
Just about every company has some list of rules that all employees are supposed to follow. The clueless companies simply list every good quality they can think of ("our employees are efficient, responsible, ethical, and never cut corners"). More intelligent companies actually rank their priorities. If somebody comes up with an unethical way to be efficient, does the company do it? The best companies not only print up brochures saying how the priorities are ranked, but they also make sure management follows the ranking.
It is entirely possible for a program to be efficient and easily testable. However there are times when you need to choose which is more important. This is one of those times. I don't know how important efficiency is to you and your program, but you do. So "would you rather have a slow, well-tested program, or a fast program without total test coverage?"

It involves significant overhead at runtime because of many virtual calls instead of plain method invocations.
Remember that it is only a virtual call overhead if you access the method through a pointer (or reference) to your interface or object. If you access the method through a concrete object on the stack, there is no virtual overhead, and the call can even be inlined.
Also, never assume that this overhead is big before profiling your code. Almost always, the cost of a virtual call is negligible compared to what your method is doing. (Most of the penalty comes from losing the ability to inline a one-line method, not from the extra indirection of the call.)
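To make the distinction concrete, here is a small hypothetical example: the same virtual method, called once through a reference (dynamic dispatch) and once on a concrete stack object, where the compiler can typically devirtualize and inline the call.

    struct Codec {
        virtual ~Codec() = default;
        virtual int encode(int x) const { return x + 1; }
    };

    int through_interface(const Codec& c, int x) {
        return c.encode(x);   // dynamic dispatch: usually cannot be inlined
    }

    int through_concrete_object(int x) {
        Codec c;              // concrete object on the stack, static type known
        return c.encode(x);   // can typically be devirtualized and inlined
    }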

Related

Goal of unit testing and TDD: find/minimize bugs or improve design?

I'm fairly green to unit testing and TDD, so please bear with me as I ask what some may consider newbie questions, or if this has been debated before. If this turns out to be considered a "bad question" (too subjective and open for debate), I will happily close it. However, I've searched for a couple of days and am not getting a definitive answer, and I need a better understanding of this, so I know no better way to get more information than to post here.
I've started reading an older book on unit testing (because a colleague had it on hand), and its opening chapter talks about why to unit test. One of the points it makes is that in the long run, your code is much more reliable and cleaner, and less prone to bugs. It also points out that effective unit testing will make tracking and fixing bugs much easier. So it seems to focus quite a bit on the overall prevention/reduction of bugs in your code.
On the other hand, I also found an article about writing great unit tests, and it states that the goal of unit testing is to make your design more robust, and conversely, finding bugs is the goal of manual testing, not unit testing.
So being the newbie to TDD that I am, I'm a little confused as to the state of mind with which I should go into TDD and building my unit tests. I'll admit that part of the reason I'm taking this on now with my recently started project is because I'm tired of my changes breaking previously existing code. And admittedly, the linked article above does at least point this out as an advantage to TDD. But my hope is that by going back in and adding unit tests to my existing code (and then continuing TDD from this point forward) is to help prevent these bugs in the first place.
Are this book and this article really saying the same thing in different tones, or is there some subjectivity on this subject, and what I'm seeing is just two people having somewhat different views on how to approach TDD?
Thanks in advance.
Unit tests and automated tests generally are for both better design and verified code.
A unit test should exercise some execution path in some very small unit. This unit is usually a public method, or an internal method exposed on your object. The method itself can still use many other protected or private methods from the same object instance. You can have a single method and several unit tests for that method, each covering a different execution path. (By execution path I mean something controlled by if, switch, etc.) Writing unit tests this way will validate that your code really does what you expect. This can be especially important in corner cases where you expect an exception to be thrown in some rare scenario. You can also test how the method behaves if you pass different parameters - for example null instead of an object instance, a negative value for an integer used for indexing, etc. That is especially useful for a public API.
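For illustration, here is a minimal C++ sketch of one method with several execution paths and one test per path; safe_divide and the test names are hypothetical, and plain assert stands in for whatever test framework you use.

    #include <cassert>
    #include <stdexcept>

    // Hypothetical unit under test: one method, several execution paths.
    int safe_divide(int a, int b) {
        if (b == 0) throw std::invalid_argument("division by zero");
        return a / b;
    }

    // One test per execution path, so a failure points at a specific branch.
    void test_divides_normally()      { assert(safe_divide(10, 2) == 5); }
    void test_truncates_toward_zero() { assert(safe_divide(7, 2) == 3); }
    void test_throws_on_zero_divisor() {
        bool threw = false;
        try { safe_divide(1, 0); } catch (const std::invalid_argument&) { threw = true; }
        assert(threw);
    }

    int main() {
        test_divides_normally();
        test_truncates_toward_zero();
        test_throws_on_zero_divisor();
        return 0;
    }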
Now suppose that your tested method also uses instances of other classes. How to deal with it? Should you still test your single method and believe that class works? What if the class is not implemented yet? What if the class has some complex logic inside? Should you test these execution paths as well on your current method? There are two approaches to deal with this:
For some cases you will simply let the real class instance be tested together with your method. This is, for example, very common in the case of logging (it is not bad to have logs available for the test as well).
For other scenarios you want to take these dependencies away from your method, but how do you do it? The solution is dependency injection and implementing against an abstraction instead of an implementation. What does that mean? It means that your method / class will not create instances of these dependencies; instead it will receive them through method parameters, the class constructor, or class properties. It also means that you will not depend on a concrete implementation but on an abstract base class or interface. This allows you to pass a fake, dummy, or mock implementation to your tested object. These special kinds of implementation don't do any real processing; they take some data and return an expected result. This allows you to test your method without its dependencies and leads to a much better and more extensible design.
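A minimal sketch of constructor injection against an abstraction, in C++; IMailer, WelcomeService, and FakeMailer are hypothetical names invented for this example.

    #include <string>
    #include <vector>

    // Abstraction the class under test depends on, instead of a concrete mailer.
    struct IMailer {
        virtual ~IMailer() = default;
        virtual void send(const std::string& to, const std::string& body) = 0;
    };

    // Class under test: the dependency arrives through the constructor.
    class WelcomeService {
    public:
        explicit WelcomeService(IMailer& mailer) : mailer_(mailer) {}
        void greet(const std::string& user) { mailer_.send(user, "Welcome, " + user + "!"); }
    private:
        IMailer& mailer_;
    };

    // Fake used only by the tests: records calls instead of sending anything.
    struct FakeMailer : IMailer {
        std::vector<std::string> sent_to;
        void send(const std::string& to, const std::string&) override { sent_to.push_back(to); }
    };

    // In a test:
    //   FakeMailer fake;
    //   WelcomeService svc(fake);
    //   svc.greet("alice");
    //   assert(fake.sent_to.size() == 1 && fake.sent_to[0] == "alice");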
What is the disadvantage? Once you start using fakes / mocks, you are testing a single method / class, but you don't have a test which grabs all the real implementations and puts them together to check that the whole system really works. You can have thousands of unit tests validating that each of your methods works, but that doesn't mean they will work together. This is the scenario for more complex tests - integration or end-to-end tests.
Unit tests should usually be very easy to write - if they are not, it means your design is probably too complicated and you should think about refactoring. They should also be very fast to execute so that you can run them very often. Other kinds of tests can be more complex and much slower, and they should run mostly on the build server.
How does this fit into the software development process? The worst part of the development process is stabilization and bug fixing, because that part is very hard to estimate. To estimate how much time a bug fix will take, you must know what causes the bug, but that investigation cannot be estimated. You can have a bug which takes one hour to fix, yet spend two weeks debugging your application and searching for it. With good code coverage you will most probably find such a bug early during development.
Automated testing doesn't mean that the software contains no bugs. It only means that you did your best to find and solve them during development, and because of that your stabilization phase should be much less painful and much shorter. It also doesn't prove that your software does what it should - that is more about the application logic itself, which must be covered by separate tests going through each use case / user story - acceptance tests (which can also be automated).
How does this fit with TDD? TDD takes it to the extreme, because in TDD you write your tests first to drive your quality, code coverage, and design.
It's a false choice. "Find/minimize bugs" OR improve design.
TDD, in particular (and as opposed to "just" unit testing) is all about giving you better design.
And when your design is better, what are the consequences?
Your code is easier to read
Your code is easier to understand
Your code is easier to test
Your code is easier to reuse
Your code is easier to debug
Your code has fewer bugs in the first place
With well-designed code, you spend less time finding and fixing bugs, and more time adding features and polish. So TDD gives you a savings on bugs and bug-hunting, by giving you better design. These things are not separate; they are dependent and interrelated.
There can be many different reasons why you might want to test your code. Personally, I test for a number of reasons:
I usually design an API using a combination of the usual design patterns (top-down) and test-driven development (TDD; bottom-up) to ensure that I have a sound API both from a best-practices point of view and from an actual usage point of view. The focus of the tests is both on the major use-cases for the API and on the completeness and behavior of the API - so they are primarily "black box" tests. The development sequence is often:
main API based on design patterns and "gut feeling"
TDD tests for the major use-cases according to the high-level specification for the API - primarily to make sure the API is "natural" and easy to use
fleshed out API and behavior
all the needed test cases to ensure the completeness and correct behavior
Whenever I fix an error in my code, I try to write a test to make sure it stays fixed. Somehow the error got into my original design and past my original testing of the code, so it is probably not all that trivial. I have noticed that many of these tests are "white box" tests.
In order to be able to make any sort of major re-factoring of the code, you need an extensive set of API tests to make sure the behavior of the code stays the same after the re-factoring. For any non-trivial API, I want the test suite to be in place and working for a long time before the re-factoring, to be sure that all the major use-cases are covered in a good way. As often as not, you are forced to throw away most of your "white box" tests, as they - by their very definition - make too many assumptions about the internals. I usually try to "translate" as many of these tests as possible, since the same non-trivial problems tend to survive re-factoring of the code.
In order to transfer any code between developers, I usually also want a good test suite with focus on the API and the major use-cases. So basically the tests from the initial TDD...
I think that answer to your question is: both.
You will improve design because there is one particular thing about TDD that is great: while you write tests you put yourself in the position of the client code that will be using the system under test - and this alone makes you think about certain design choices.
For example: UI. When you start writing the tests, you will see that those God-Forms are impossible to test, so you separate the logic behind the screens to a presenter/controller, and you get MVP/MVC/whatever.
Having the concept of unit testing a class and mocking its dependencies brings you to the Single Responsibility Principle. There is a point to be made about every one of the SOLID principles.
As for bugs, well, if you unit test every method of every class you write (except properties, very simple methods and such), you will catch most bugs from the start. Add integration tests and you cover almost all of them.
I'll take my stab at this using a remix of a previous answer I wrote. In short, I don't see this as a dichotomy between driving good design and minimizing bugs. I see it more as one (good design) leading to the other (minimizing bugs).
I tend towards saying TDD is a design process that happens to involve unit testing. It's a design process because within each Red-Green-Refactor iteration, you write the test first for code that doesn't exist. You're designing as you're going.
The first beauty of TDD is that the design of your code is guaranteed to be testable. Testable code tends to have loose coupling and high cohesion. Loose coupling and high cohesion are important because they make the code easy to change when requirements change. The second beauty of TDD is that after you're done implementing your system, you happen to have a huge regression suite to catch any bugs and changes in assumptions. Thus, TDD makes your code easy to change because of the design it creates and it makes your code safe to change because of the test harness it creates.
Trying to retrospectively add unit tests can be quite painful and expensive. If the code doesn't support unit testing, you may be better off looking at integration tests to cover it.
Don't mix Unit Testing with TDD.
Unit Testing is just the act of "testing" your code to ensure quality and maintainability.
TDD is a full blown development methodology in which you first write your tests (based on requirements), and only then you write the needed code (and just the needed code) to make that test pass. This means that you only write code to repair a broken test.
Once that is done, you write another test, and the code needed to make it pass. Along the way, you may be forced to refactor the code to allow a new test to run without breaking another. This way, the "design" arises from the tests.
The purpose of this methodology is of course to reduce bugs and improve design, but its main goal is to improve productivity, because you write exactly the code you need. And you don't write documentation: the tests are the documentation. If a requirement changes, you change the tests and then the code. If new requirements appear, just add new tests.

TDD with unclear requirements

I know that TDD helps a lot and I like this method of development, where you first create a test and then implement the functionality. It is a very clear and correct way to work.
But due to the flavour of my projects, it often happens that when I start to develop a module I know very little about what I want and how it will look in the end. The requirements appear as I develop; there may be 2 or 3 iterations in which I delete all or part of the old code and write new code.
I see two problems:
1. I want to see the result as soon as possible to understand whether my ideas are right or wrong. Unit tests slow down this process. So it often happens that I write unit tests after the code is finished, which is known to be a bad pattern.
2. If I write the tests first, I need to rewrite not only the code two or more times but also the tests. That takes a lot of time.
Could someone please tell me how can TDD be applied in such situation?
Thanks in advance!
I want to see the result as soon as possible to understand whether my ideas are right or wrong. Unit tests slow down this process.
I disagree. Unit tests and TDD can often speed up getting results because they force you to concentrate on the results rather than implementing tons of code that you might never need. It also allows you to run the different parts of your code as you write them so you can constantly see what results you are getting, rather than having to wait until your entire program is finished.
I find that TDD works particularly well in this kind of situation; in fact, I would say that having unclear and/or changing requirements is actually very common.
I find that the best uses of TDD is ensuring that your code is doing what you expect it to do. When you're writing any code, you should know what you want it to do, whether the requirements are clear or not. The strength of TDD here is that if there is a change in the requirements, you can simply change one or more of your unit tests to reflect the changed requirements, and then update your code while being sure that you're not breaking other (unchanged) functionality.
I think that one thing that trips up a lot of people with TDD is the assumption that all tests need to be written ahead of time. I think it's more effective to use the rule of thumb that you never write any implementation code while all of your tests are passing; this simply ensures that all code is covered, while also ensuring that you're checking that all code does what you want it to do without worrying about writing all your tests up front.
IMHO, your main problem is that you have to delete some code. That is waste, and it is what should be addressed first.
Perhaps you could prototype, or utilize "spike solutions" to validate the requirements and your ideas then apply TDD on the real code, once the requirements are stable.
The risk is that you do this and then end up having to ship the prototype.
You could also test-drive the "sunny path" first and only implement the rest, such as error handling, after the requirements have been pinned down. However, the second phase of the implementation will be less motivating.
What development process are you using? It sounds agile, since you're working in iterations, but not in an environment that fully supports it.
TDD will, for just about anybody, slow down initial development. So, if initial development speed is 10 on a 1-10 scale, with TDD you might get around an 8 if you're proficient.
It's the development after that point that gets interesting. As projects get larger, development efficiency typically drops - often to 3 on the same scale. With TDD, it's very possible to still stay in the 7-8 range.
Look up "technical debt" for a good read. As far as I'm concerned, any code without unit tests is effectively technical debt.
TDD helps you to express the intent of your code. This means that writing the test, you have to say what you expect from your code. How your expectations are fulfilled is then secondary (this is the implementation). Ask yourself the question: "What is more important, the implementation, or what the provided functionality is?" If it is the implementation, then you don't have to write the tests. If it is the functionality provided then writing the tests first will help you with this.
Another valuable thing is that with TDD you will not implement functionality that is not needed. You only write code that is needed to satisfy the intent. This is also called YAGNI (You Ain't Gonna Need It).
There's no getting away from it - if you're measuring how long it takes to code just by how long it takes you to write the classes, etc., then it'll take longer with TDD. If you're experienced it'll add about 15%; if you're new it'll take at least 60% longer, if not more.
BUT, overall you'll be quicker. Why?
by writing a test first you're specifying what you want and delivering just that and nothing more - hence saving time writing unused code
without tests, you might think that the results are so obvious that what you've done is correct - when it isn't. Tests demonstrate that what you've done is correct.
you will get faster feedback from automated tests than by doing manual testing
with manual testing the time taken to test everything as your application grows increases rapidly - which means you'll stop doing it
with manual tests it's easy to make mistakes and 'see' something passing when it isn't, this is especially true if you're running them again and again and again
(good) unit tests give you a second client to your code which often highlights design problems that you might miss otherwise
Add all this up and if you measure from inception to delivery and TDD is much, much faster - you get fewer defects, you're taking fewer risks, you progress at a steady rate (which makes estimation easier) and the list goes on.
TDD will make you faster, no question, but it isn't easy and you should allow yourself some space to learn and not get disheartened if initially it seems slower.
Finally you should look at some techniques from BDD to enhance what you're doing with TDD. Begin with the feature you want to implement and drive down into the code from there by pulling out stories and then scenarios. Concentrate on implementing your solution scenario by scenario in thin vertical slices. Doing this will help clarify the requirements.
Using TDD could actually make you write code faster - not being able to write a test for a specific scenario could mean that there is an issue in the requirements.
When you TDD you should find these problematic places faster instead of after writing 80% of your code.
There are a few things you can do to make your tests more resistant to change:
You should try to reuse code inside your tests in the form of factory methods that create your test objects, along with verify methods that check the test results. That way, if some major behaviour change occurs in your code, you have less code to change in your tests (see the sketch after this list).
Use an IoC container instead of passing arguments to your main classes - again, if a method signature changes you do not need to change all of your tests.
Make your unit tests short and isolated - each test should check only one aspect of your code, and use a mocking/isolation framework to make the test independent of external objects.
Test and write code only for the required feature (YAGNI). Try to ask yourself what value your customer will receive from the code you're writing. Don't create an overcomplicated architecture; instead, create the needed functionality piece by piece while refactoring your code as you go.
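Here is a rough C++ sketch of the factory-method and verify-method idea from the first point above; Order, make_default_order, and verify_order_is_valid are hypothetical names, and plain assert stands in for a test framework.

    #include <cassert>
    #include <string>

    // Hypothetical type under test.
    struct Order {
        std::string customer;
        int quantity = 0;
    };

    // Factory method: one place that builds a "typical" test object.
    // If construction details change, only this function has to change.
    Order make_default_order(int quantity = 1) {
        Order o;
        o.customer = "test-customer";
        o.quantity = quantity;
        return o;
    }

    // Verify method: one place that encodes what "a valid order" means.
    void verify_order_is_valid(const Order& o) {
        assert(!o.customer.empty());
        assert(o.quantity > 0);
    }

    int main() {
        Order o = make_default_order(3);
        verify_order_is_valid(o);
        assert(o.quantity == 3);
        return 0;
    }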
Here's a blog post I found potent in explaining the use of TDD on a very iterative design process scale: http://blog.extracheese.org/2009/11/how_i_started_tdd.html.
Joshua Bloch commented on something similar in the book "Coders at Work". His advice was to write examples of how the API would be used (about a page in length). Then think about the examples and the API a lot, and refactor the API. Then write the specification and the unit tests. Be prepared, however, to refactor the API and rewrite the spec as you implement the API.
When I deal with unclear requirements, I know that my code will need to change. Having solid tests helps me feel more comfortable changing my code. Practising TDD helps me write solid tests, and so that's why I do it.
Although TDD is primarily a design technique, it has one great benefit in your situation: it encourages the programmer to consider details and concrete scenarios. When I do this, I notice that I find gaps or misunderstandings or lack of clarity in requirements quite quickly. The act of trying to write tests forces me to deal with the lack of clarity in the requirements, rather than trying to sweep those difficulties under the rug.
So when I have unclear requirements, I practise TDD both because it helps me identify the specific requirements issues that I need to address, but also because it encourages me to write code that I find easier to change as I understand more about what I need to build.
In this early prototype-phase I find it to be enough to write testable code. That is, when you write your code, think of how to make it possible to test, but for now, focus on the code itself and not the tests.
You should have the tests in place when you commit something though.

Writing "unit testable" code?

What kind of practices do you use to make your code more unit testing friendly?
TDD -- writing the tests first forces you to think about testability and helps you write the code that is actually needed, not what you think you may need
Refactoring to interfaces -- makes mocking easier
Public methods virtual if not using interfaces -- makes mocking easier
Dependency injection -- makes mocking easier
Smaller, more targeted methods -- tests are more focused and easier to write
Avoidance of static classes
Avoid singletons, except where necessary
Avoid sealed classes
Dependency injection seems to help.
Write the tests first - that way, the tests drive your design.
Use TDD
When writing you code, utilise dependency injection wherever possible
Program to interfaces, not concrete classes, so you can substitute mock implementations.
Make sure all of your classes follow the Single Responsibility Principle. Single responsibility means that each class should have one and only one responsibility. That makes unit testing much easier.
I'm sure I'll be down voted for this, but I'm going to voice the opinion anyway :)
While many of the suggestions here have been good, I think it needs to be tempered a bit. The goal is to write more robust software that is changeable and maintainable.
The goal is not to have code that is unit testable. There's a lot of effort put into making code more "testable" despite the fact that testable code is not the goal. It sounds really nice and I'm sure it gives people the warm fuzzies, but the truth is all of those techniques, frameworks, tests, etc, come at a cost.
They cost time in training, maintenance, productivity overhead, etc. Sometimes it's worth it, sometimes it isn't, but you should never put the blinders on and charge ahead with making your code more "testable".
When writing tests (as with any other software task), Don't Repeat Yourself (the DRY principle). If you have test data that is useful for more than one test, then put it someplace where both tests can use it. Don't copy the code into both tests. I know this seems obvious, but I see it happen all the time.
I use Test-Driven Development whenever possible, so I don't have any code that cannot be unit tested. It wouldn't exist unless the unit test existed first.
The easiest way is don't check in your code unless you check in tests with it.
I'm not a huge fan of writing the tests first. But one thing I believe in very strongly is that code must be checked in with tests. Not even an hour or so apart - together. I think the order in which they are written is less important, as long as they come in together.
Small, highly cohesive methods. I learned this the hard way. Imagine you have a public method that handles authentication. Maybe you did TDD, but if the method is big, it will be hard to debug. Instead, if that #authenticate method does its work in a more pseudo-codish kind of way, calling other small methods (maybe protected), then when a bug shows up it's easy to write new tests for those small methods and find the faulty one.
And something you learn on the first day of OOP, but so many seem to forget: Code Against Interfaces, Not Implementations.
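As a rough C++ sketch of both points - small, cohesive methods plus coding against an interface - here is what such an authenticate method might look like; Authenticator, IUserStore, and the helper methods are hypothetical, and the hash function is just a placeholder.

    #include <string>

    // Hypothetical collaborator, referenced only through an interface.
    struct IUserStore {
        virtual ~IUserStore() = default;
        virtual bool exists(const std::string& user) const = 0;
        virtual std::string password_hash(const std::string& user) const = 0;
    };

    class Authenticator {
    public:
        explicit Authenticator(const IUserStore& store) : store_(store) {}

        // Reads like pseudo-code: each step is a small method that can be tested on its own.
        bool authenticate(const std::string& user, const std::string& password) const {
            return is_known_user(user) && password_matches(user, password);
        }

    protected:  // small steps, reachable from tests via a thin derived helper if needed
        bool is_known_user(const std::string& user) const { return store_.exists(user); }
        bool password_matches(const std::string& user, const std::string& password) const {
            return hash(password) == store_.password_hash(user);
        }
        static std::string hash(const std::string& s) { return s; }  // placeholder, not a real hash

    private:
        const IUserStore& store_;
    };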
Spend some time refactoring untestable code to make it testable. Write the tests and get 95% coverage. Doing that taught me all I need to know about writing testable code. I'm not opposed to TDD, but learning the specifics of what makes code testable or untestable helps you to think about testability at design time.
Don't write untestable code:
1. Using a framework/pattern like MVC to separate your UI from your business logic will help a lot.
2. Use dependency injection so you can create mock test objects.
3. Use interfaces.
Check out this talk: Automated Testing Patterns and Smells.
One of the main takeaways for me was to make sure that the unit test code is of high quality. If the test code is well documented and well written, everyone will be motivated to keep it that way.
No Statics - you can't mock out statics.
Also, Google has a tool that will measure the testability of your code...
I'm continually trying to find a process where unit testing is less of a chore and something that I actually WANT to do. In my experience, a pretty big factor is your tools. I do a lot of ActionScript work and sadly, the tools are somewhat limited, such as no IDE integration and a lack of more advanced mocking frameworks (but good things are a-coming, so no complaints here!). I've done test-driven development before with more mature testing frameworks and it was definitely a more pleasurable experience, but it still felt like somewhat of a chore.
Recently however I started writing code in a different manner. I used to start with writing the test, watching them fail, writing code to make the test succeed, rinse and repeat and all that.
Now, however, I start with writing interfaces, almost no matter what I'm going to do. At first I of course try to identify the problem and think of a solution. Then I start writing the interfaces to get a sort of abstract feel for the code and the communication. At that point, I usually realize that I haven't really figured out a proper solution to the problem at all, as a result of not fully understanding the problem. So I go back, revise the solution and revise my interfaces. When I feel that the interfaces reflect my solution, I actually start with writing the implementation, not the tests. When I have something implemented (draft implementations, usually baby steps), I start testing it. I keep going back and forth between testing and implementing, a few steps forward at a time. Since I have interfaces for everything, it's incredibly easy to inject mocks.
I find working like this, with classes having very little knowledge of other implementation and only talking to interfaces, is extremely liberating. It frees me from thinking about the implementation of another class and I can focus on the current unit. All I need to know is the contract that the interface provides.
But yeah, I'm still trying to work out a process that works super-fantastically-awesomely-well every time.
Oh, I also wanted to add that I don't write tests for everything. Vanilla properties that don't do much but get/set variables are useless to test. They are guaranteed by the language contract to work. If they don't, I have far worse problems than my units not being testable.
To prepare your code to be testable:
Document your assumptions and exclusions.
Avoid large complex classes that do more than one thing - keep the single responsibility principle in mind.
When possible, use interfaces to decouple interactions and allow mock objects to be injected.
When possible, make public methods virtual to allow mock objects to emulate them.
When possible, use composition rather than inheritance in your designs - this also encourages (and supports) encapsulation of behaviors into interfaces.
When possible, use dependency injection libraries (or DI practices) to provide instances with their external dependencies.
To get the most out of your unit tests, consider the following:
Educate yourself and your development team about the capabilities of the unit testing framework, mocking libraries, and testing tools you intend to use. Understanding what they can and cannot do will be essential when you actually begin writing your tests.
Plan out your tests before you begin writing them. Identify the edge cases, constraints, preconditions, postconditions, and exclusions that you want to include in your tests.
Fix broken tests as near to when you discover them as possible. Tests help you uncover defects and potential problems in your code. If your tests are broken, you open the door to having to fix more things later.
If you follow a code review process in your team, code review your unit tests as well. Unit tests are as much a part of your system as any other code - reviews help to identify weaknesses in the tests just as they would for system code.
You don't necessarily need to "make your code more unit testing friendly".
Instead, a mocking toolkit can be used to make testability concerns go away.
One such toolkit is JMockit.

How do you refactor?

I was wondering how other developers begin refactoring. What is your first step? How does this process differ when you refactor code which is not yours? Do you write tests while refactoring?
do not refactor anything non-trivial that does not already have unit tests
write unit tests, then refactor
refactor small pieces and re-run the tests frequently
stop refactoring when the code is DRY* clean
* DRY = Don't Repeat Yourself
What is your first step?
The first step is to run the unit tests to make sure they all pass. Indeed, you can waste a large amount of time looking for which of your changes broke a test if it was already broken before you modified the code.
How does this process differ if you refactor code which is not yours?
I certainly take smaller steps when refactoring code I didn't write (or code I wrote a long time ago). I may also verify the test coverage before proceeding, to avoid relying on unit tests that always pass ... but that do not test the area I'm working on.
Do you write tests while refactoring?
I usually don't, but I may add new tests in the following circumstances (list not exhaustive):
the idea for a new test sparks in my mind ("what happens if ... ?" - write a test to find out)
I discover a hole in the test coverage
It also depends on the refactoring being performed. When extracting a function, I may create a new test if it can be invoked in a different way than it could before.
Here are some general advice:
The first thing is to maintain a list of the code smells you notice while working on the code. This frees your mind from the burden of remembering everything you saw in the code.
The golden rule is: never refactor when the unit tests do not pass completely.
Refactor when the code is stable: before adding something you know will be impacted by a future refactoring, before integrating, and above all before saying you're done.
In the absence of unit tests, you'll have to put the part of the code you want to refactor under test. If unit tests are too hard to retrofit, as is usually the case, you can create characterization tests, as recommended by Michael Feathers in Working Effectively with Legacy Code. In short, these are end-to-end tests that allow you to pin down the current behavior of the code (which is not assumed to be working perfectly all the time).
Don't be afraid to take baby steps. Do not do two things at the same time. If you notice something else that requires refactoring, note it down; do not fix it right now, even if it seems very easy.
Check in very often, whenever the tests pass, so that you can revert a bad refactoring without losing what was done before.
Keep in mind that refactoring does not add value to your customer (this can be debated); the customer does not pay you to refactor. One rule of thumb is to refactor prior to making changes or adding new capabilities to the code.
Read Martin Fowler's book "Refactoring"
BTW - that's Martin Fowler's own Amazon exec link, if you're wondering :)
I take crap and make it less crappy. :-)
Seriously. I don't refactor to create new functionality. Refactoring occurs before new stuff. If there are no tests I write tests to make sure that I'm not breaking anything with my refactoring. If there are tests, I use those. If the tests are insufficient, I might write more tests but I would consider this separate from the refactoring and do it first.
The first step for me is to notice that I can abstract something and make it more general (and useful in other places that need the functionality now), or I notice that something is bad and could be better (subjective). I don't refactor to generality without a reason. YAGNI principle applies.
We have the concept of shared ownership so the code is always mine -- I may not have written it, but I don't consider that when refactoring. I may seek to understand things before I decide it needs refactoring if the purpose isn't clear -- although that's almost always a reason to refactor in and of itself.
Depends very much on my goals. As has been said, you need unit tests to make sure your refactoring hasn't broken anything, and if it has you have to commit time to fixing it. For many situations, I test the existing solution, and if it works, wrap it rather than refactoring it, as this minimises the possibility of a break.
If I have to refactor, for example I recently had to port a bunch of ASCII based C++ to UNICODE, I tend to make sure I have good regression tests that work at end-user as well as unit level. Again, I try to use tools rather than manually refactoring, as this is less prone to error, and the errors you do get are systematic rather than random.
For me, the first thing is to make sure the code meets all of the best practices of our office.
For example, using strict, warnings and taint for our Perl scripts.
If there are efficiency or speed troubles, focus on them.
Things like finding a better algorithm, or finding a better way to do what the quadruply nested for loop is doing.
And lastly, see if there is a way to make the code more readable. This is normally accomplished by turning 5 little scripts that do similar things into 1 module (class).
I refactor whilst writing new code, using unit tests. I'll also refactor old code, either mine or someone else's, if methods are too long, or variables named badly, or I spot duplication etc.
Start with getting unit tests, and then use automated refactoring tools. If the refactoring can't be automated then it's not truly a mechanical transformation of the code and so isn't a refactoring. The unit tests are to make sure that you are truly just performing mechanical transformations from one codebase to an equivalent one.
Refactoring without unit tests is dangerous. Always have unit tests. If you change something without good test coverage, you might be safe for some portion of the code, but something elsewhere might not keep the same behavior. With unit tests you protect yourself against any change.
Refactoring other people's code is fine too, but don't take it to the extreme. It's normal that someone else does not program the way you do. It's not "kind" to change things just because you would have done them another way. Only refactor if it's really necessary.
I remove duplication, which unifies the thought patterns inherent in the code. Refactoring needs to achieve those two things. If you have code that does the same thing twice, refactor it to a common location, unifying the abstraction. If you have the same literal in three places, put it in a constant, unifying the purpose. If you have the same group of arguments, ensure they're always used in the same order or, better yet, put them in a common structure, unifying information groups.
I'm a lot more reluctant to refactor code written by others than to refactor my own.
If it's been written by one of my predecessors, I generally refactor only within a function. E.g. I might replace an if statement with a switch. Anything much bigger than that is usually out of scope and not within budget.
For my own code, I usually refactor as I'm writing, whenever something looks ugly or starts to smell. It's a lot easier to fix it now instead of waiting for it to cause problems down the road.
I agree with the other posters when you're refactoring code you wrote.
If it's code you didn't write, and especially if there's lots of it, I would start by using tools like fxCop, Visual Studio's Code Analysis, DevPartner -- I'm sure there are other good ones. They would give you ideas about where to start and what the most common coding issues are. I would also do stress testing to see where the bottlenecks are, therefore the greatest return on your effort at improving the code.
I love to refactor my code, but it is possible to overdo it. If you aren't really improving the app's performance, or seriously improving code readability, you should probably stop. There is always the possibility of introducing new bugs when you refactor, especially if you're working without unit tests.
First step: Identify a code smell.
Second step: Consider alternative implementations and what the trade offs are and which do I accept in terms of which is "better."
Third step: Implement better solution.
This doesn't differ if the code is mine or not as sometimes I may go back over code I wrote months or years ago and it'll look like code from someone else. I may write tests if I'm making new methods or there aren't adequate tests to the code, IMO.
General approach
I may start out by looking at the software (component) from a bird's-eye view. Dependency analysis and graphing tools are a great help here (see later sections). I look for cycles in either the package-level or the class-level dependencies, and also for classes with too many dependencies. These are good candidates for refactoring.
Tools of the trade
Dependency analysis tools are available for both Python and Java.

Is duplicated code more tolerable in unit tests?

I ruined several unit tests some time ago when I went through and refactored them to make them more DRY--the intent of each test was no longer clear. It seems there is a trade-off between tests' readability and maintainability. If I leave duplicated code in unit tests, they're more readable, but then if I change the SUT, I'll have to track down and change each copy of the duplicated code.
Do you agree that this trade-off exists? If so, do you prefer your tests to be readable, or maintainable?
Readability is more important for tests. If a test fails, you want the problem to be obvious. The developer shouldn't have to wade through a lot of heavily factored test code to determine exactly what failed. You don't want your test code to become so complex that you need to write unit-test-tests.
However, eliminating duplication is usually a good thing, as long as it doesn't obscure anything, and eliminating the duplication in your tests may lead to a better API. Just make sure you don't go past the point of diminishing returns.
Duplicated code is a smell in unit test code just as much as in other code. If you have duplicated code in tests, it makes it harder to refactor the implementation code because you have a disproportionate number of tests to update. Tests should help you refactor with confidence, rather than be a large burden that impedes your work on the code being tested.
If the duplication is in fixture set up, consider making more use of the setUp method or providing more (or more flexible) Creation Methods.
If the duplication is in the code manipulating the SUT, then ask yourself why multiple so-called “unit” tests are exercising the exact same functionality.
If the duplication is in the assertions, then perhaps you need some Custom Assertions. For example, if multiple tests have a string of assertions like:
assertEqual('Joe', person.getFirstName())
assertEqual('Bloggs', person.getLastName())
assertEqual(23, person.getAge())
Then perhaps you need a single assertPersonEqual method, so that you can write assertPersonEqual(Person('Joe', 'Bloggs', 23), person). (Or perhaps you simply need to overload the equality operator on Person.)
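For instance, in C++ the same idea might look like the following; Person and assert_person_equal are hypothetical, with plain assert standing in for your test framework's assertion macros.

    #include <cassert>
    #include <string>

    // Hypothetical Person, used only to illustrate the custom assertion.
    struct Person {
        std::string first_name;
        std::string last_name;
        int age = 0;
    };

    // One intent-revealing assertion instead of three repeated ones in every test.
    void assert_person_equal(const Person& expected, const Person& actual) {
        assert(expected.first_name == actual.first_name);
        assert(expected.last_name == actual.last_name);
        assert(expected.age == actual.age);
    }

    // In a test:
    //   assert_person_equal(Person{"Joe", "Bloggs", 23}, person);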
As you mention, it is important for test code to be readable. In particular, it is important that the intent of a test is clear. I find that if many tests look mostly the same, (e.g. three-quarters of the lines the same or virtually the same) it is hard to spot and recognise the significant differences without carefully reading and comparing them. So I find that refactoring to remove duplication helps readability, because every line of every test method is directly relevant to the purpose of the test. That's much more helpful for the reader than a random combination of lines that are directly relevant, and lines that are just boilerplate.
That said, sometimes tests are exercising complex situations that are similar but still significantly different, and it is hard to find a good way to reduce the duplication. Use common sense: if you feel the tests are readable and make their intent clear, and you're comfortable with perhaps needing to update more than a theoretically minimal number of tests when refactoring the code invoked by the tests, then accept the imperfection and move on to something more productive. You can always come back and refactor the tests later, when inspiration strikes!
Implementation code and tests are different animals and factoring rules apply differently to them.
Duplicated code or structure is always a smell in implementation code. When you start having boilerplate in implementation, you need to revise your abstractions.
On the other hand, testing code must maintain a level of duplication. Duplication in test code achieves two goals:
Keeping tests decoupled. Excessive test coupling can make it hard to change a single failing test that needs updating because the contract has changed.
Keeping the tests meaningful in isolation. When a single test is failing, it must be reasonably straightforward to find out exactly what it is testing.
I tend to ignore trivial duplication in test code as long as each test method stays shorter than about 20 lines. I like when the setup-run-verify rhythm is apparent in test methods.
When duplication creeps up in the "verify" part of tests, it is often beneficial to define custom assertion methods. Of course, those methods must still test a clearly identified relation that can be made apparent in the method name: assertPegFitsInHole -> good, assertPegIsGood -> bad.
When test methods grow long and repetitive I sometimes find it useful to define fill-in-the-blanks test templates that take a few parameters. Then the actual test methods are reduced to a call to the template method with the appropriate parameters.
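A small C++ sketch of such a fill-in-the-blanks template; is_valid_username and check_username are made-up names, with plain assert standing in for a test framework.

    #include <cassert>
    #include <string>

    // Hypothetical function under test.
    bool is_valid_username(const std::string& name) {
        return !name.empty() && name.size() <= 16;
    }

    // Fill-in-the-blanks template: the repetitive arrange/act/assert lives here once.
    void check_username(const std::string& input, bool expected) {
        assert(is_valid_username(input) == expected);
    }

    // Each actual test reduces to one call with the interesting parameters.
    int main() {
        check_username("alice", true);
        check_username("", false);
        check_username(std::string(17, 'x'), false);
        return 0;
    }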
As for a lot of things in programming and testing, there is no clear-cut answer. You need to develop a taste, and the best way to do so is to make mistakes.
You can reduce repetition using several different flavours of test utility methods.
I'm more tolerant of repetition in test code than in production code, but I have been frustrated by it sometimes. When you change a class's design and you have to go back and tweak 10 different test methods that all do the same setup steps, it's frustrating.
I agree. The trade off exists but is different in different places.
I'm more likely to refactor duplicated code for setting up state. But less likely to refactor the part of the test that actually exercises the code. That said, if exercising the code always takes several lines of code then I might think that is a smell and refactor the actual code under test. And that will improve readability and maintainability of both the code and the tests.
Jay Fields coined the phrase that "DSLs should be DAMP, not DRY", where DAMP means descriptive and meaningful phrases. I think the same applies to tests, too. Obviously, too much duplication is bad. But removing duplication at all costs is even worse. Tests should act as intent-revealing specifications. If, for example, you specify the same feature from several different angles, then a certain amount of duplication is to be expected.
"refactored them to make them more DRY--the intent of each test was no longer clear"
It sounds like you had trouble doing the refactoring. I'm just guessing, but if it wound up less clear, doesn't that mean you still have more work to do to end up with reasonably elegant tests that are perfectly clear?
That's why tests are a subclass of UnitTest -- so you can design good test suites that are correct, easy to validate and clear.
In the olden times we had testing tools that used different programming languages. It was hard (or impossible) to design pleasant, easy-to-work with tests.
You have the full power of -- whatever language you're using -- Python, Java, C# -- so use that language well. You can achieve good-looking test code that's clear and not too redundant. There's no trade-off.
I LOVE rspec because of this:
It has two things that help:
shared example groups for testing common behaviour - you can define a set of tests, then 'include' that set in your real tests.
nested contexts - you can essentially have a 'setup' and 'teardown' method for a specific subset of your tests, not just every one in the class.
The sooner that .NET/Java/other test frameworks adopt these methods, the better (or you could use IronRuby or JRuby to write your tests, which I personally think is the better option)
I feel that test code requires a similar level of engineering that would normally be applied to production code. There can certainly be arguments made in favor of readability and I would agree that's important.
In my experience, however, I find that well-factored tests are easier to read and understand. If there are 5 tests that each look the same except for one changed variable and the assertion at the end, it can be very difficult to spot the single differing item. Similarly, if the tests are factored so that only the changing variable and the assertion are visible, it's easy to figure out what each test is doing immediately.
Finding the right level of abstraction when testing can be difficult and I feel it is worth doing.
I don't think there is a relation between more duplicated code and more readable code. I think your test code should be as good as your other code. Non-repeating code is more readable than duplicated code when done well.
Ideally, unit tests shouldn't change much once they are written so I would lean towards readability.
Having unit tests be as discrete as possible also helps to keep the tests focused on the specific functionality that they are targeting.
With that said, I do tend to try and reuse certain pieces of code that I wind up using over and over, such as setup code that is exactly the same across a set of tests.