I know what the advantages are and I use fake data when I am working with more complex systems.
What if I am developing something simple, where I can easily set up my environment in a real database, the data being accessed is so small that access time is not a factor, and I am only running a few tests?
Is it still important to create fake data or can I forget the extra coding and skip right to the real thing?
When I said real database I do not mean a production database, I mean a test database, but using a real live DBMS and the same schema as the real database.
The reasons to use fake data instead of a real DB are:
Speed. If your tests are slow you aren't going to run them. Mocking the DB can make your tests run much faster than they otherwise might.
Control. Your tests need to be the sole source of your test data. When you use fake data, your tests choose which fakes you will be using. So there is no chance that your tests are spoiled because someone left the DB in an unfamiliar state.
Order Independence. We want our tests to be runnable in any order at all. The input of one test should not depend on the output of another. When your tests control the test data, the tests can be independent of each other.
Environment Independence. Your tests should be runnable in any environment. You should be able to run them while on the train, or in a plane, or at home, or at work. They should not depend on external services. When you use fake data, you don't need an external DB.
Now, if you are building a small little application, and by using a real DB (like MySQL) you can achieve the above goals, then by all means use the DB. I do. But make no mistake, as your application grows you will eventually be faced with the need to mock out the DB. That's OK, do it when you need to. YAGNI. Just make sure you DO do it WHEN you need to. If you let it go, you'll pay.
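To make the trade concrete, here is a minimal sketch of what an in-memory fake can look like; the names (ICustomerRepository, Customer) are hypothetical, not from the question:

using System.Collections.Generic;

// Hypothetical repository abstraction; the app codes against this interface.
public interface ICustomerRepository
{
    Customer GetById(int id);
    void Add(Customer customer);
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// In-memory fake: fast (no I/O), fully controlled by the test,
// order-independent (each test news up a fresh instance), and it
// needs no external DB, so it satisfies all four goals above.
public class FakeCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<int, Customer> _store =
        new Dictionary<int, Customer>();

    public Customer GetById(int id)
    {
        Customer customer;
        return _store.TryGetValue(id, out customer) ? customer : null;
    }

    public void Add(Customer customer)
    {
        _store[customer.Id] = customer;
    }
}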
It sort of depends what you want to test. Often you want to test the actual logic in your code not the data in the database, so setting up a complete database just to run your tests is a waste of time.
Also consider the amount of work that goes into maintaining your tests and test database. Testing your code with a database often means you are testing your application as a whole instead of its parts in isolation, which often results in a lot of work keeping both the database and the tests in sync.
And the last problem is that the test should run in isolation so each test should either run on its own version of the database or leave it in exactly the same state as it was before the test ran. This includes the state after a failed test.
Having said that, if you really want to test against your database you can. There are tools that help with setting up and tearing down a database, like DbUnit.
I've seen people try to create unit tests like this, but it almost always turns out to be much more work than it is actually worth. Most abandoned it halfway through the project, and many abandoned TDD completely, assuming that this experience transfers to unit testing in general.
So I would recommend keeping tests simple and isolated, and encapsulating your code well enough that it becomes possible to test it in isolation.
As long as the real DB does not get in your way, and you can go faster that way, I would be pragmatic and go for it.
In unit-test, the "test" is more important than the "unit".
I think it depends on whether your queries are fixed inside the repository (the better option, IMO), or whether the repository exposes composable queries; for example - if you have a repository method:
IQueryable<Customer> GetCustomers() {...}
Then your UI could request:
var foo = GetCustomers().Where(x=>SomeUnmappedFunction(x));
bool SomeUnmappedFunction(Customer customer) {
    return customer.RegionId == 12345 && customer.Name.StartsWith("foo");
}
This will pass for an object-based fake repo, but will fail for actual db implementations. Of course, you can nullify this by having the repository handle all queries internally (no external composition); for example:
Customer[] GetCustomers(int? regionId, string nameStartsWith, ...) {...}
Because this can't be composed, you can check the DB and the UI independently. With composable queries, you are forced to use integration tests throughout if you want it to be useful.
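Here is a hedged, self-contained sketch of that failure mode: LINQ to Objects stands in for the fake repo, and the real-provider behavior is described in the comments rather than executed (the Customer stand-in and the Main demo are mine, not from the answer):

using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int RegionId { get; set; }
    public string Name { get; set; }
}

public static class CompositionDemo
{
    // Fake repository: LINQ to Objects over an in-memory list.
    static IQueryable<Customer> GetCustomers()
    {
        return new List<Customer>
        {
            new Customer { RegionId = 12345, Name = "foobar" }
        }.AsQueryable();
    }

    static bool SomeUnmappedFunction(Customer customer)
    {
        return customer.RegionId == 12345 && customer.Name.StartsWith("foo");
    }

    public static void Main()
    {
        // Passes against the in-memory fake: LINQ to Objects can invoke
        // any .NET method. A real provider (EF, LINQ to SQL) cannot
        // translate SomeUnmappedFunction into SQL and throws at runtime,
        // which is why only an integration test would catch it.
        var matches = GetCustomers().Where(x => SomeUnmappedFunction(x)).ToList();
        Console.WriteLine(matches.Count); // prints 1
    }
}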
It rather depends on whether the DB is automatically set up by the test, also whether the database is isolated from other developers.
At the moment it may not be a problem (e.g. only one developer). However (for manual database setup) setting up the database is an extra impediment for running tests, and this is a very bad thing.
If you're just writing a simple one-off application that you absolutely know will not grow, I think a lot of "best practices" just go right out the window.
You don't need to use DI/IOC or have unit tests or mock out your db access if all you're writing is a simple "Contact Us" form. However, where to draw the line between a "simple" app and a "complex" one is difficult.
In other words, use your best judgment, as there is no hard-and-fast answer to this.
It is OK to do that for this scenario, as long as you don't see them as "unit" tests; those would be integration tests. You also want to consider whether you will be manually testing through the UI again and again, as you might just automate your smoke tests instead. Given that, you might even consider not doing the integration tests at all, and just work at the functional/UI test level (as they will already be covering the integration).
As others have pointed out, it is hard to draw the line between complex and not complex, and you usually only know when it is too late :(. If you are already used to writing these tests, I am sure you won't incur much overhead. If that is not the case, you can learn from it :)
Assuming that you want to automate this, the most important thing is that you can programmatically generate your initial condition. It sounds like that's the case, and even better you're testing real world data.
However, there are a few drawbacks:
Your real database might not cover certain conditions in your code. With fake data, you can force those conditions to occur.
And as you point out, you have a simple application; when it becomes less simple, you'll want to have tests that you can categorize as unit tests and system tests. The unit tests should target a simple piece of functionality, which will be much easier to do with fake data.
One advantage of fake repositories is that your regression / unit testing is consistent since you can expect the same results for the same queries. This makes it easier to build certain unit tests.
There are several disadvantages if your code is not read-only and modifies data:
- If you have an error in your code (which is probably why you're testing), you could end up breaking the production database; even if you don't break it outright, you may leave bad data behind.
- If the production database changes over time, and especially while your code is executing, you may lose track of the test records that you added and have a hard time later cleaning them out of the database.
- Production queries from other systems accessing the database may treat your test data as real data and this can corrupt results of important business processes somewhere down the road. For example, even if you marked your data with a certain flag or prefix, can you assure that anyone accessing the database will adhere to this schema?
Also, some databases are regulated by privacy laws, so depending on your contract and who owns the main DB, you may or may not be legally allowed to access real data.
If you need to run on a production database, I would recommend running on a copy, which you can easily create during off-peak hours.
If it's a really simple application and you can't see it growing, I see no problem running your tests on a real DB. If, however, you think this application will grow, it's important that you account for that in your tests.
Keep everything as simple as you can, and if you require more flexible testing later on, make it so. Plan ahead though, because you don't want to have a huge application in 3 years that relies on old and hacky (for a large application) tests.
The downsides to running tests against your database are the lack of speed and the complexity of setting up your database state before running tests.
If you have control over this there is no problem in running the tests directly against the database; it's actually a good approach because it simulates your final product better than running against fake data. The key is to have a pragmatic approach and see best practice as guidelines and not rules.
This is a very strange request for advice, for which I truly feel there is no real answer. In my project I have archiving routines on various objects that have been consumed for logical calculations; I archive these items for the sake of an audit trail and to check up on calculation errors or prove correctness at a later stage. I am working with Entity Framework, and things are perhaps slightly different from your own project.
I consume the original object, modify it directly, create a clone of the modified item, revert the original item from store and save changes accordingly. An object is not reverted to original if never consumed by a calculation, in these instances, I save directly over that object along with the various relationships that exist with further objects.
This may sound long winded, but I assure you - it seems the easiest so far in terms of my workings with EF in my situation.
My trouble with these archiving routines is, that over time as I introduce further functionality - I sometimes, without knowing, break critical code to a point where I have to regression test the entire solution over, from beginning to end, to ensure that the archiving requirements remain intact.
Is there any unit test approach or automated methodology for testing these sorts of requirements? It would speed up deployment of packages by cutting down on my own manual testing.
Any advice or links to similar situations appreciated.
I think there are two pieces to this problem you are describing:
First you need some unit tests that you can build which will represent technical requirements of the system. Think of the unit tests as the rules which you have set up to technically accomplish the goal that the end user desires. In this way, I would craft unit tests that you can feel confident will break if a technical assumption you had made about the system fails because of a code change. Remember to keep the unit tests at the unit level so that you don't have a large amount of dependencies interacting to fail a test. A unit test should test exactly one thing. If you do this, when you make code changes you can run all your unit tests and immediately know what assumptions you had made about the system which are now not being met.
I would also set up some sort of integration functional tests which are automated. I think in your problem domain it would make sense to set up integrated tests which are similar to unit tests (you can use the same tool.) Here you will want to take bigger pieces of functionality, perhaps pipes which data flows through the system and test that the correct series of transformations occur on the data.
One best practice is to make sure the tests can be run in any order. You could separate the producing routines from the archiving routines, perhaps by using "gold" data for the archive routines.
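As a sketch of the unit-level piece, assuming a hypothetical Item entity and Archiver (your EF types will differ), a rule-style test might pin down the snapshot behavior:

using NUnit.Framework;

// Hypothetical entity and archiver; adapt to your EF model.
public class Item
{
    public int Id { get; set; }
    public int Value { get; set; }
}

public class Archiver
{
    // Clone the consumed object so the archive holds a snapshot.
    public Item Archive(Item original)
    {
        return new Item { Id = original.Id, Value = original.Value };
    }
}

[TestFixture]
public class ArchiverTests
{
    [Test]
    public void Archive_PreservesOriginalValues_WhenObjectWasConsumed()
    {
        var original = new Item { Id = 1, Value = 100 };
        var archived = new Archiver().Archive(original);

        Assert.AreNotSame(original, archived);  // snapshot, not a live reference
        Assert.AreEqual(100, archived.Value);   // values preserved for the audit trail
        original.Value = 200;                   // a later mutation...
        Assert.AreEqual(100, archived.Value);   // ...must not leak into the archive
    }
}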
The number one best practice for unit tests is just do it! Beyond that, I'd like to recommend xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.
I have a .NET application with a web front-end, WCF Windows service back-end. The application is fairly simple - it takes some user input, sending it to the service. The service does this - takes the input (Excel spreadsheet), extracts the data items, checks SQL DB to make sure the items are not already existing - if they do not exist, we make a real-time request to a third party data vendor and retrieve the results, inserting them into the database. It does some logging along the way.
I have a Job class with a single public ctor and public Run() method. The ctor takes all the params, and the Run() method does all of the above logic. Each logical piece of functionality is split into a separate class - IParser does file parsing, IConnection does the interaction with the data vendor, IDataAccess does the data access, etc. The Job class has private instances of these interfaces, and uses DI to construct the actual implementations by default, but allows the class user to inject any interface.
In the real code, I use the default ctor. In my unit tests for the Run() method, I use all mock objects created via NMock 2.0. This Run() method is essentially the 'top level' function of this application.
Now here's my issue / question: the unit tests for this Run() method are crazy. I have three mock objects I'm sending into the ctor, and expectations are set on each mock object; at the end I verify. I have a few different flows that the Run method can take, each flow having its own test: it could find everything is already in the database and not make a request to the vendor... or an exception could be thrown and the job status set to 'failed'... or we could have the case where we didn't have the data and needed to make the vendor request (so all those function calls would need to be made).
Now - before you yell at me and say 'your Run() method is too complicated!': this Run method is a mere 50 lines of code (it does make calls to some private functions, but the entire class is only 160 lines), since all the 'real' logic is done in the interfaces declared on this class. However, the biggest unit test on this function is 80 lines of code, with 13 calls to Expect.BLAH().
This makes re-factoring a huge pain. If I want to change this Run() method around, I have to go edit my three unit tests and add/remove/update Expect() calls. When I need to refactor, I end up spending more time creating my mock calls than I did actually writing the new code. And doing real TDD on this function makes it even more difficult if not impossible. It's making me think that it's not even worth unit testing this top level function at all, since really this class isn't doing much logic, it's just passing around data to its composite objects (which are all fully unit tested and don't require mocking).
So - should I even bother testing this high level function? And what am I gaining by doing this? Or am I completely misusing mock/stub objects here? Perhaps I should scrap the unit tests on this class, and instead just make an automated integration test, which uses the real implementations of the objects and Asserts() against SQL Queries to make sure the right end-state data exists? What am I missing here?
EDIT: Here's the code - the first function is the actual Run() method - then my five tests which test all five possible code paths. I changed it some for NDA reasons but the general concept is still there. Anything you see wrong with how I'm testing this function, any suggestions on what to change to make it better? Thanks.
I guess my advice echos most of what is posted here.
It sounds as if your Run method needs to be broken down more. If its design is forcing you into tests that are more complicated than it is, something is wrong. Remember this is TDD we're talking about, so your tests should dictate the design of your routine. If that means testing private functions, so be it. No technological philosophy or methodology should be so rigid that you can't do what feels right.
Additionally, I agree with some of the other posters that your tests should be broken down into smaller segments. Ask yourself this: if you were going to be writing this app for the first time and your Run function didn't yet exist, what would your tests look like? That response is probably not what you have currently (otherwise you wouldn't be asking the question). :)
The one benefit you do have is that there isn't a lot of code in the class, so refactoring it shouldn't be very painful.
EDIT
Just saw you posted the code and had some thoughts (no particular order).
Way too much code (IMO) inside your SyncLock block. The general rule is to keep the code inside a SyncLock to a minimum. Does it ALL have to be locked?
Start breaking code out into functions that can be tested independently. Example: The ForLoop that removes ID's from the List(String) if they exist in the DB. Some might argue that the m_dao.BeginJob call should be in some sort of GetID function that can be tested.
Can any of the m_dao procedures be turned into functions that can be tested on their own? I would assume that the m_dao class has its own tests somewhere, but by looking at the code it appears that might not be the case. It should, along with the functionality in the m_Parser class. That will relieve some of the burden of the Run tests.
If this were my code, my goal would be to get the code to a place where all the individual procedure calls inside Run are tested on their own, and the Run tests just test the final outcome. Given input A, B, C: expect outcome X. Given input E, F, G: expect Y. The detail of how Run gets to X or Y is already tested in the other procedures' tests.
These were just my initial thoughts. I'm sure there are a bunch of different approaches one could take.
Two thoughts: first you should have an integration test anyway to make sure everything hangs together. Second, it sounds to me like you're missing intermediate objects. In my world, 50 lines is a long method. It's hard to say anything more precise without seeing the code.
The first thing I would try would be refactoring your unit tests to share the setup code between tests, by extracting a method that sets up the mocks and expectations. Parameterize the method so your expectations are configurable. You may need one or more of these setup methods, depending on how much the setup varies from test to test.
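A hedged sketch of such a helper, reusing the NMock 2.0 expectation chain the question already uses; this is a fragment that would live in the test class next to the existing mockDao field, and the parameter names are hypothetical:

// Each test configures its own flow by passing different values,
// instead of repeating the raw Expect chain thirteen times.
private void ExpectBeginJob(string submittedBy, int clientID,
                            DateTime runDate, string status,
                            int jobIdToReturn)
{
    Expect.Once.On(mockDao)
        .Method("BeginJob")
        .With(submittedBy, clientID, runDate, status)
        .Will(Return.Value(jobIdToReturn));
}

A test for one flow then reads as a short list of intention-revealing calls instead of 80 lines of raw Expect noise.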
So - should I even bother testing this high level function?
Yes. If there are different code-paths, you should be.
And what am I gaining by doing this? Or am I completely mis-using mock/stub objects here?
As J.B. pointed out (nice seeing you at AgileIndia2010!), Fowler's article is recommended reading. As a gross simplification: use stubs when you don't care about the values returned by the collaborators. If the return values from collaborator.call_method() change the behavior (or you need non-trivial checks on args, or computation for return values), you need mocks.
Suggested refactorings:
Try moving the creation and injection of mocks into a common Setup method. Most unit testing frameworks support this; it will be called before each test.
Your LogMessage calls are beacons, calling out once again for intention-revealing methods, e.g. SubmitBARRequest(). This will shorten your production code.
Try and move each Expect.Blah1(..) call into an intention-revealing method of its own.
This will shorten your test code and make it immensely readable and easier to modify. e.g.
Replace all instances of

Expect.Once.On(mockDao) _
    .Method("BeginJob") _
    .With(New Object() {submittedBy, clientID, runDate, "Sent For Baring"}) _
    .Will([Return].Value(0));

with

ExpectBeginJobOnDAO_AndReturnZero(); // you can name it better
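For completeness, a C# rendering of what that helper might wrap (the VB-style line continuations and the [Return] keyword escape dropped; a sketch in the NMock 2.0 style shown above, not verified against your build):

private void ExpectBeginJobOnDAO_AndReturnZero()
{
    Expect.Once.On(mockDao)
        .Method("BeginJob")
        .With(submittedBy, clientID, runDate, "Sent For Baring")
        .Will(Return.Value(0));
}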
On whether to test such a function: you said in a comment,

"the tests read just like the actual function, and since im using mocks, its only asserting the functions are called and sent params (i can check this by eyeballing the 50 line function)"
IMHO eyeballing the function isn't enough; haven't you heard "I can't believe I missed that!"? You have a fair amount of scenarios that could go wrong in that Run method; covering that logic is a good idea.
On tests being brittle: try having some shared methods that you can use in the test class for the common scenarios. If you are concerned about a later change breaking all the tests, put the pieces that concern you in specific methods that can be changed if needed.
On tests being too long / hard to know what's in there: don't test a single scenario with every assertion related to it. Break it up; test things like: it should log x messages when y happens (one test), it should save to the db when y happens (another, separate test), it should send a request to a third party when z happens (yet another test), etc.
On doing integration/system tests instead of these unit tests: you can see from your current situation that there are plenty of scenarios and little variations involved in that part of your system, and that's even with the shield of mocks replacing the remaining logic and the ease of simulating different conditions they give you. Doing the same against the whole thing will add a whole new level of complexity to your scenario, something that is surely unmanageable if you want to cover a wide set of scenarios.
IMHO you should minimize the combinations that you leave to your system tests; exercising some main scenarios should already tell you that a lot of the system is working correctly, as those are mostly about everything being hooked up correctly.
That said, I do recommend adding focused integration tests for all the integration code you have that might not currently be covered by your tests, since by definition unit tests don't get there. These exercise specifically the integration code with all the variations you expect from it; the corresponding tests are much simpler than trying to reach those behaviors in the context of the whole system, and they tell you very quickly whether any assumptions in those pieces are causing trouble.
If you think unit tests are too hard, do this instead: add post-conditions to the Run method. A post-condition is an assertion you make about the state of the code at a given point; for example, at the end of that method, you may want some variable to hold a particular value, or one value out of some set of possible choices.
After, you can derive your pre-conditions for the method. This is basically the data type of each parameter and the limits and constraints on each of those parameters (and on any other variable initialized at the beginning of the method).
In this way, you can be sure both the input and output are what is desired.
That probably still won't be enough so you will have to look at the code of the method line by line and look for large sections that you want to make assertions about. If you have an If statement, you should check for some conditions before and after it.
You won't need any mock objects if you know how to check if the arguments to the object are valid and you know what range of outputs are desired.
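A minimal sketch of the idea, using Debug.Assert on a hypothetical Job.Run shape (the status values and parameters are invented for illustration):

using System.Diagnostics;

public class Job
{
    private string _status = "pending";

    public void Run(int[] itemIds)
    {
        // Pre-conditions: constrain the inputs before any work happens.
        Debug.Assert(itemIds != null, "itemIds must not be null");
        Debug.Assert(itemIds.Length > 0, "Run requires at least one item");

        // ... check the DB, call the vendor, insert results ...
        _status = "succeeded";

        // Post-condition: Run must always end in a terminal state.
        Debug.Assert(_status == "succeeded" || _status == "failed",
                     "Run must leave the job in a terminal status");
    }
}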
Your tests are too complicated.
You should test aspects of your class rather than writing a unit test for each member of your class. A single unit test should not cover the entire functionality of a member.
I'm going to guess that each test for Run() sets expectations on every method it calls on the mocks, even when that test doesn't focus on checking every such method invocation. I strongly recommend you Google "mocks aren't stubs" and read Fowler's article.
Also, 50 lines of code is pretty complex. How many codepaths through the method? 20+? You might benefit from a higher level of abstraction. I'd need to see code to judge more certainly.
I'm Currently reading two excellent books "Working Effectively with Legacy Code" and "Clean Code".
They are making me think about the way I write and work with code in completely new ways, but one theme common to both is test-driven development and the idea of smothering everything with tests, having tests in place before you make a change or implement a new piece of functionality.
This has led to two questions:
Question 1:
If I am working with legacy code, according to the books I should put tests in place to ensure I'm not breaking anything. Consider that I have a method 500 lines long; I would assume I'll have a set of equivalent testing methods to test that method. When I split this function up, do I create new tests for each new method/class that results?
According to "Clean Code" any test that takes longer than 1/10th of a second is a test that takes too long. Trying to test a 500 long line legacy method that goes to databases and does god knows what else could well take longer than 1/10th of a second. While I understand you need to break dependencies what I'm having trouble with is the initial test creation.
Question 2:
What happens when the code is refactored so much that structurally it no longer resembles the original code (new parameters added to or removed from methods, etc.)? It would follow that the tests need refactoring also. In that case you could potentially be altering the functionality of the system while allowing the tests to keep passing. Is refactoring tests an appropriate thing to do in this circumstance?
While it's OK to plod on with assumptions, I was wondering whether there are any thoughts/suggestions on such matters from collective experience.
That's the deal when working with legacy code, legacy meaning a system with no tests and which is tightly coupled. When adding tests for that code, you are effectively adding integration tests. When you refactor and add the more specific test methods that avoid the network calls, etc., those become your unit tests. You want to keep both, just have them separate; that way most of your unit tests will run fast.
You do that in really small steps. You actually switch continually between tests and code, and you are correct, if you change a signature (small step) related tests need to be updated.
Also check my "update 2" on How can I improve my junit tests. It isn't about legacy code and dealing with the coupling it already has, but on how you go about writing logic + tests where external systems are involved i.e. databases, emails, etc.
The 0.1s unit test run time is fairly silly. There's no reason unit tests shouldn't use a network socket, read a large file or other hefty operations if they have to. Yes it's nice if the tests run quickly so you can get on with the main job of writing the application but it's much nicer to end up with the best result at the end and if that means running a unit test that takes 10s then that's what I'd do.
If you're going to refactor the key is to spend as much time as you need to understand the code you are refactoring. One good way of doing that would be to write a few unit tests for it. As you grasp what certain blocks of code are doing you could refactor it and then it's good practice to write tests for each of your new methods as you go.
Yes, create new tests for new methods.
I'd see the 1/10 of a second as a goal you should strive for. A slower test is still much better than no test.
Try not to change the code and the test at the same time. Always take small steps.
When you've got a lengthy legacy method that does X (and maybe Y and Z because of its size), the real trick is not breaking the app by 'fixing' it. The tests on the legacy app have preconditions and postconditions and so you've got to really know those before you go breaking it up. The tests help to facilitate that. As soon as you break that method into two or more new methods, obviously you need to know the pre/post states for each of those and so tests for those 'keep you honest' and let you sleep better at night.
I don't tend to worry too much about the 1/10th of a second assertion. Rather, the goal when I'm writing unit tests is to cover all my bases. Obviously, if a test takes a long time, it might be because what is being tested is simply way too much code doing way too much.
The bottom line is that you definitely don't want to take what is presumably a working system and 'fix' it to the point that it works sometimes and fails under certain conditions. That's where the tests can help. Each of them expects the world to be in one state at the beginning of the test and a new state at the end. Only you can know if those two states are correct. All the tests can 'pass' and the app can still be wrong.
Anytime the code gets changed, the tests will possibly change and new ones will likely need to be added to address changes made to the production code. Those tests work with the current code - doesn't matter if the parameters needed to change, there are still pre/post conditions that have to be met. It isn't enough, obviously, to just break up the code into smaller chunks. The 'analyst' in you has to be able to understand the system you are building - that's job one.
Working with legacy code can be a real chore depending on the 'mess' you start with. I really find that knowing what you've got and what it is supposed to do (and whether it actually does it at step 0 before you start refactoring it) is key to a successful refactoring of the code. One goal, I think, is that I ought to be able to toss out the old stuff, stick my new code in its place and have it work as advertised (or better). Depending on the language it was written in, the assumptions made by the original author(s) and the ability to encapsulate functionality into containable chunks, it can be a real trick.
Best of luck!
Here's my take on it:
No and yes. First things first is to have a unit test that checks the output of that 500 line method. And then that's only when you begin thinking of splitting it up. Ideally the process will go like this:
Write a test for the original legacy 500-line behemoth
Figure out, marking first with comments, what blocks of code you could extract from that method
Write a test for each block of code. All will fail.
Extract the blocks one by one. Concentrate on getting all the tests to go green, one at a time.
Rinse and repeat until you've finished the whole thing
After this long process you will realize that it might make sense that some methods be moved elsewhere, or are repetitive and several can be reduced to a single function; this is how you know that you succeeded. Edit tests accordingly.
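Step 1 is often called a characterization test; here is a hedged sketch (LegacyOrderProcessor is a stand-in so the example compiles; your behemoth replaces it):

using NUnit.Framework;

// Stand-in for the real legacy class, so the sketch is self-contained.
public class LegacyOrderProcessor
{
    public string ProcessOrder(int orderId) { return "SHIPPED:" + orderId; }
}

[TestFixture]
public class LegacyOrderProcessorCharacterization
{
    [Test]
    public void ProcessOrder_PinsDownCurrentBehaviour()
    {
        // Record whatever the method produces today, right or wrong;
        // the goal is to detect change during extraction,
        // not to judge correctness yet.
        var result = new LegacyOrderProcessor().ProcessOrder(42);
        Assert.AreEqual("SHIPPED:42", result);
    }
}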
Go ahead and refactor, but as soon as you need to change signatures make the changes in your test first before you make the change in your actual code. That way you make sure that you're still making the correct assertions given the change in method signature.
Question 1: "When I split this function up, do I create new tests for each new method/class that results?"
As always the real answer is it depends. If it is appropriate, when refactoring a gigantic monolithic method into smaller methods that handle different component parts, it may be simpler to mark the new methods private/protected and leave your existing API intact, so you can continue to use your existing unit tests. If you need to test your newly split-off methods, it is sometimes advantageous to mark them as package private so that your unit testing classes can get at them but other classes cannot.
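Package private is Java's mechanism; the rough C# analogue, sketched here, is internal members plus InternalsVisibleTo (the assembly and type names are hypothetical):

using System.Runtime.CompilerServices;

// Usually placed in AssemblyInfo.cs of the production assembly:
// it grants the named test assembly access to internal members.
[assembly: InternalsVisibleTo("MyApp.Tests")]

public class OrderCalculator
{
    // Extracted helper: invisible to normal consumers of the API,
    // but still directly testable from MyApp.Tests.
    internal decimal ApplyDiscount(decimal price, decimal rate)
    {
        return price - price * rate;
    }
}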
Question 2: "What happens when the code is re-factored so much that structurally it no longer resembles the original code?"
My first piece of advice here is that you need to get a good IDE and have a good knowledge of regular expressions: try to do as much of your refactoring using automated tools as possible. This can save time, if you are cautious enough not to introduce new problems. As you said, you will have to change your unit tests, but if you used good OOP principles with the refactor (you did, right?), then it shouldn't be so painful.
Overall, it is important to ask yourself with regards to the refactor do the benefits outweigh the costs? Am I just fiddling around with architectures and designs? Am I doing a refactor in order to understand the code and is it really needed? I would consult a coworker who is familiar with the code base for their opinion on the cost/benefits of your current task.
Also remember that the theoretical ideal you read in books needs to be balanced with real world business needs and time schedules.
Anti-pattern: there must be at least two key elements present to formally distinguish an actual anti-pattern from a simple bad habit, bad practice, or bad idea:
Some repeated pattern of action, process or structure that initially appears to be beneficial, but ultimately produces more bad consequences than beneficial results, and
A refactored solution that is clearly documented, proven in actual practice and repeatable.
Vote for the TDD anti-pattern that you have seen "in the wild" one time too many.
See the blog post by James Carr and the related discussion on the testdrivendevelopment Yahoo group.
If you've found an 'unnamed' one.. post 'em too. One post per anti-pattern please to make the votes count for something.
My vested interest is to find the top-n subset so that I can discuss 'em in a lunchbox meet in the near future.
Second Class Citizens - test code isn't as well refactored as production code, containing a lot of duplicated code, making it hard to maintain tests.
The Free Ride / Piggyback -- James Carr, Tim Ottinger
Rather than write a new test case method to test another/distinct feature/functionality, a new assertion (and its corresponding actions i.e. Act steps from AAA) rides along in an existing test case.
Happy Path
The test stays on happy paths (i.e. expected results) without testing for boundaries and exceptions.
JUnit Antipatterns
The Local Hero
A test case that is dependent on something specific to the development environment it was written on in order to run. The result is the test passes on development boxes, but fails when someone attempts to run it elsewhere.
The Hidden Dependency
Closely related to the local hero, a unit test that requires some existing data to have been populated somewhere before the test runs. If that data wasn’t populated, the test will fail and leave little indication to the developer what it wanted, or why… forcing them to dig through acres of code to find out where the data it was using was supposed to come from.
Sadly seen this far too many times with ancient .dlls which depend on nebulous and varied .ini files which are constantly out of sync on any given production system, let alone extant on your machine without extensive consultation with the three developers responsible for those dlls. Sigh.
Chain Gang
A couple of tests that must run in a certain order, i.e. one test changes the global state of the system (global variables, data in the database) and the next test(s) depends on it.
You often see this in database tests. Instead of doing a rollback in teardown(), tests commit their changes to the database. Another common cause is that changes to the global state aren't wrapped in try/finally blocks which clean up should the test fail.
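One common remedy, sketched here with NUnit and TransactionScope (this assumes your DB connection enlists in ambient transactions):

using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class DatabaseTests
{
    private TransactionScope _scope;

    [SetUp]
    public void OpenTransaction()
    {
        // Every test runs inside its own transaction.
        _scope = new TransactionScope();
    }

    [TearDown]
    public void RollBack()
    {
        // Disposing without calling Complete() rolls everything back,
        // even when the test failed, so no state leaks between tests.
        _scope.Dispose();
    }

    [Test]
    public void InsertCustomer_WritesRow()
    {
        // ... insert rows and assert against the DB here ...
    }
}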
The Mockery
Sometimes mocking can be good, and handy. But sometimes developers lose themselves in their effort to mock out what isn't being tested. In this case, a unit test contains so many mocks, stubs, and/or fakes that the system under test isn't even being tested at all; instead, data returned from mocks is what is being tested.
Source: James Carr's post.
The Silent Catcher -- Kelly?
A test that passes if an exception is thrown... even if the exception that actually occurs is different from the one the developer intended.
See Also: Secret Catcher
[Test]
[ExpectedException(typeof(Exception))]
public void ItShouldThrowDivideByZeroException()
{
    // some code that throws another exception yet passes the test
}
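A hedged sketch of the fix, asserting the exact exception type with NUnit's Assert.Throws (DivideByNumber is a hypothetical method under test):

[Test]
public void ItShouldThrowDivideByZeroException()
{
    // Fails unless DivideByZeroException specifically is thrown;
    // any other exception fails the test instead of silently passing it.
    Assert.Throws<DivideByZeroException>(() => DivideByNumber(10, 0));
}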
The Inspector
A unit test that violates encapsulation in an effort to achieve 100% code coverage, but knows so much about what is going on in the object that any attempt to refactor will break the existing test and require any change to be reflected in the unit test.
'how do I test my member variables without making them public... just for unit-testing?'
Excessive Setup -- James Carr
A test that requires a huge setup in order to even begin testing. Sometimes several hundred lines of code are used to prepare the environment for one test, with several objects involved, which can make it difficult to really ascertain what is tested due to the “noise” of all of the setup going on. (Src: James Carr's post)
Anal Probe
A test which has to use insane, illegal or otherwise unhealthy ways to perform its task like: Reading private fields using Java's setAccessible(true) or extending a class to access protected fields/methods or having to put the test in a certain package to access package global fields/methods.
If you see this pattern, the classes under test use too much data hiding.
The difference between this and The Inspector is that the class under test tries to hide even the things you need to test. So your goal is not to achieve 100% test coverage but to be able to test anything at all. Think of a class that has only private fields, a run() method without arguments and no getters at all. There is no way to test this without breaking the rules.
Comment by Michael Borgwardt: This is not really a test antipattern, it's pragmatism to deal with deficiencies in the code being tested. Of course it's better to fix those deficiencies, but that may not be possible in the case of 3rd party libraries.
Aaron Digulla: I kind of agree. Maybe this entry is really better suited for a "JUnit HOWTO" wiki and not an antipattern. Comments?
The Test With No Name -- Nick Pellow
The test that gets added to reproduce a specific bug in the bug tracker, and whose author thought it did not warrant a name of its own. Instead of enhancing an existing, lacking test, a new test is created called testForBUG123.
Two years later, when that test fails, you may need to first try and find BUG-123 in your bug tracker to figure out the test's intent.
The Slow Poke
A unit test that runs incredibly slow. When developers kick it off, they have time to go to the bathroom, grab a smoke, or worse, kick the test off before they go home at the end of the day. (Src: James Carr's post)
a.k.a. the tests that won't get run as frequently as they should
The Butterfly
You have to test something which contains data that changes all the time, like a structure which contains the current date, and there is no way to nail the result down to a fixed value. The ugly part is that you don't care about this value at all. It just makes your test more complicated without adding any value.
The flap of its wings can cause a hurricane on the other side of the world. -- Edward Lorenz, the butterfly effect
The Flickering Test (Source : Romilly Cocking)
A test which just occasionally fails, not at specific times, and is generally due to race conditions within the test. Typically occurs when testing something that is asynchronous, such as JMS.
Possibly a super set to the 'Wait and See' anti-pattern and 'The Sleeper' anti-pattern.
The build failed, oh well, just run the build again. -- Anonymous Developer
Wait and See
A test that runs some set up code and then needs to 'wait' a specific amount of time before it can 'see' if the code under test functioned as expected. A testMethod that uses Thread.sleep() or equivalent is most certainly a "Wait and See" test.
Typically, you may see this if the test is testing code which generates an event external to the system such as an email, an http request or writes a file to disk.
Such a test may also be a Local Hero since it will FAIL when run on a slower box or an overloaded CI server.
The Wait and See anti-pattern is not to be confused with The Sleeper.
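A common remedy is to poll with a timeout instead of sleeping for a fixed period; a minimal sketch (the helper names are hypothetical):

using System;
using System.Diagnostics;
using System.Threading;

public static class Wait
{
    // Poll a condition with a timeout instead of a blind Thread.Sleep:
    // returns as soon as the condition holds, yet tolerates slow boxes.
    public static bool Until(Func<bool> condition, TimeSpan timeout)
    {
        var watch = Stopwatch.StartNew();
        while (watch.Elapsed < timeout)
        {
            if (condition()) return true;
            Thread.Sleep(50); // short poll interval, not the whole wait
        }
        return condition();
    }
}

A test then asserts, e.g., Assert.IsTrue(Wait.Until(() => File.Exists(path), TimeSpan.FromSeconds(5))); it runs fast on fast machines instead of always paying the worst-case sleep.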
Inappropriately Shared Fixture -- Tim Ottinger
Several test cases in the test fixture do not even use or need the setup / teardown. Partly due to developer inertia to create a new test fixture... easier to just add one more test case to the pile
The Giant
A unit test that, although it is validly testing the object under test, can span thousands of lines and contain many, many test cases. This can be an indicator that the system under test is a God Object (James Carr's post).
A sure sign for this one is a test that spans more than a few lines of code. Often, the test is so complicated that it starts to contain bugs of its own or flaky behavior.
I'll believe it when I see some flashing GUIs
An unhealthy fixation/obsession with testing the app via its GUI 'just like a real user'
Testing business rules through the GUI
is a terrible form of coupling. If
you write thousands of tests through
the GUI, and then change your GUI,
thousands of tests break.
Rather, test only GUI things through the GUI, and couple the
GUI to a dummy system instead of the
real system, when you run those tests.
Test business rules through an API
that doesn't involve the GUI. -- Bob Martin
“You must understand that seeing is believing, but also know that believing is seeing.” -- Denis Waitley
The Sleeper, aka Mount Vesuvius -- Nick Pellow
A test that is destined to FAIL at some specific time and date in the future. This often is caused by incorrect bounds checking when testing code which uses a Date or Calendar object. Sometimes, the test may fail if run at a very specific time of day, such as midnight.
'The Sleeper' is not to be confused with the 'Wait And See' anti-pattern.
That code will have been replaced long before the year 2000 -- Many developers in 1960
The Dead Tree
A test where a stub was created, but the test wasn't actually written.
I have actually seen this in our production code:
class TD_SomeClass {
    public void testAdd() {
        assertEquals(1 + 1, 2);
    }
}
I don't even know what to think about that.
Got bitten by this today:
Wet Floor:
The test creates data that is persisted somewhere, but the test does not clean up when finished. This causes tests (the same test, or possibly other tests) to fail on subsequent test runs.
In our case, the test left a file lying around in the "temp" dir, with permissions from the user that ran the test the first time. When a different user tried to test on the same machine: boom. In the comments on James Carr's site, Joakim Ohlrogge referred to this as the "Sloppy Worker", and it was part of the inspiration for "Generous Leftovers". I like my name for it better (less insulting, more familiar).
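The standard mop, sketched with NUnit: a unique temp file per run, plus a teardown that always deletes it, even after a failure (names are hypothetical):

using System.IO;
using NUnit.Framework;

[TestFixture]
public class ExportTests
{
    private string _tempFile;

    [SetUp]
    public void CreateTempFile()
    {
        // Unique per run, so parallel or previous runs can't collide.
        _tempFile = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
    }

    [TearDown]
    public void MopTheFloor()
    {
        // Runs even when the test fails, so nothing is left behind.
        if (File.Exists(_tempFile))
            File.Delete(_tempFile);
    }

    [Test]
    public void Export_WritesFile()
    {
        File.WriteAllText(_tempFile, "data");
        Assert.IsTrue(File.Exists(_tempFile));
    }
}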
The Cuckoo -- Frank Carver
A unit test which sits in a test case with several others, and enjoys the same (potentially lengthy) setup process as the other tests in the test case, but then discards some or all of the artifacts from the setup and creates its own.
Advanced Symptom of : Inappropriately Shared Fixture
The Secret Catcher -- Frank Carver
A test that at first glance appears to be doing no testing, due to absence of assertions. But "The devil is in the details".. the test is really relying on an exception to be thrown and expecting the testing framework to capture the exception and report it to the user as a failure.
[Test]
public void ShouldNotThrow()
{
DoSomethingThatShouldNotThrowAnException();
}
The Environmental Vandal
A 'unit' test which for various 'requirements' starts spilling out into its environment, using and setting environment variables / ports. Running two of these tests simultaneously will cause 'unavailable port' exceptions etc.
These tests will be intermittent, and leave developers saying things like 'just run it again'.
One solution I've seen is to randomly select a port number to use. This reduces the possibility of a conflict, but clearly doesn't solve the problem. So if you can, always mock the code so that it doesn't actually allocate the unsharable resource.
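If the resource truly must be allocated, a hedged alternative to picking a random port is to bind to port 0 and let the OS hand out a free one (a tiny race remains between releasing the port and reusing it):

using System.Net;
using System.Net.Sockets;

public static class PortAllocator
{
    // Binding to port 0 makes the OS assign a currently free port,
    // so two tests running at once won't request the same number.
    public static int GetFreePort()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;
        listener.Stop();
        return port;
    }
}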
The Turing Test
A testcase automagically generated by some expensive tool that has many, many asserts gleaned from the class under test using some too-clever-by-half data flow analysis. Lulls developers into a false sense of confidence that their code is well tested, absolving them from the responsibility of designing and maintaining high quality tests. If the machine can write the tests for you, why can't it pull its finger out and write the app itself!
Hello stupid. -- World's smartest computer to new apprentice (from an old Amiga comic).
The Forty Foot Pole Test
Afraid of getting too close to the class they are trying to test, these tests act at a distance, separated by countless layers of abstraction and thousands of lines of code from the logic they are checking. As such they are extremely brittle, and susceptible to all sorts of side-effects that happen on the epic journey to and from the class of interest.
Doppelgänger
In order to test something, you have to copy parts of the code under test into a new class with the same name and package and you have to use classpath magic or a custom classloader to make sure it is visible first (so your copy is picked up).
This pattern indicates an unhealthy amount of hidden dependencies which you can't control from a test.
I looked at his face ... my face! It was like a mirror but made my blood freeze.
The Mother Hen -- Frank Carver
A common setup which does far more than the actual test cases need. For example creating all sorts of complex data structures populated with apparently important and unique values when the tests only assert for presence or absence of something.
Advanced Symptom of: Inappropriately Shared Fixture
I don't know what it does ... I'm adding it anyway, just in case. -- Anonymous Developer
The Test It All
I can't believe this hasn't been mentioned till now, but tests should not break the Single Responsibility Principle.
I have come across this so many times, tests that break this rule are by definition a nightmare to maintain.
Line Hitter
At first glance, the tests cover everything, and code coverage tools confirm it with 100%, but in reality the tests only hit the code, without analysing any of the outputs.
coverage-vs-reachable-code