I'm having trouble getting my unit tests to stay independent of each other. For instance, I have a linked list with two append methods, one that takes a single element and appends it to the list, and one that takes another list and appends the whole thing; but I can't test the second append method (the one that takes a whole list) without using the first append method to populate the list I'm passing in. How do I keep the unit tests for these two methods separate from each other?
The situation you describe happens everywhere in testing: you have some class or library to test, that class or library has certain methods/functions that need to be tested, and to test some of them you have to call other methods/functions of the same library.
In other words, when breaking down the test according to the four-phase test pattern (setup, exercise, evaluate, cleanup), you want to call your class/lib in the exercise phase. It can feel wrong, however, that you also have to call parts of it in the setup phase, and possibly in the evaluate and/or cleanup phases as well.
This is simply unavoidable: you mentioned that in the setup for the list-append test you had to use the single-element append function. It is actually even worse: you also had to use the constructor of your list class - there is no way around that one - and the constructor could also be buggy...
What can certainly happen is that tests fail (or mistakenly pass) because the functions called in the setup are defective. A proper test suite, however, should (as was mentioned in the comments) also have tests for those other - call them lower-level - functions.
For example, you should have a number of tests which check that the constructor of your class works correctly. If at some point you modify the constructor so it becomes defective, all tests that use the constructor in the setup phase are no longer trustworthy. But some of the tests that test the constructor itself (and thus call it in the exercise phase) should now fail.
From the overview of the test results you will then be able to identify the root cause of the test failures. This requires some understanding about the dependencies: Which of the tests focus on the lower-level aspects and which are higher-level in the sense that they depend on some lower-level functionality to work.
There are some ways to make these dependencies more apparent and therefore make it easier to analyse test failures later - but none of these are essential:
In the test code, put the tests for the lower-level aspects at the top of the file and the more dependent tests further down. Then, when several tests fail, look first at the failing test that is closest to the top of the file. Note that the order of tests in the test code does not necessarily imply an execution order: JUnit, for example, does not care in which order the test methods are written in the test class.
As was suggested in the comments, you may in addition configure the test framework to run the lower-level tests before the others.
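For example, if the framework happened to be NUnit 3, its OrderAttribute can be used to run the lower-level tests first within a fixture. This is only a sketch; the LinkedList type, Append methods and Count property stand in for whatever your own class provides:

[TestFixture]
public class LinkedListTests
{
    // Lower-level aspect: runs first within this fixture.
    [Test, Order(1)]
    public void AppendElement_AddsSingleElement()
    {
        var list = new LinkedList();
        list.Append(1);
        Assert.That(list.Count, Is.EqualTo(1));
    }

    // Higher-level aspect: relies on the constructor and the single-element Append.
    [Test, Order(2)]
    public void AppendList_AppendsAllElements()
    {
        var source = new LinkedList();
        source.Append(1);
        source.Append(2);
        var target = new LinkedList();

        target.Append(source);

        Assert.That(target.Count, Is.EqualTo(2));
    }
}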
You can create one method which itself is not a unit test method but instead creates the conditions for multiple tests, then performs verification of the results. Your actual unit test methods will call into this other method. So you can use the same data set for multiple tests, and not introduce dependencies between test methods.
I don't know what language you are using, but here is an example for Objective-C in Xcode 5 with the new XCTest framework. I would do something like this:
- (void)performTestWithArray:(NSArray *)list
{
    NSMutableArray *initialList = ...; // create the initial list you will use with multiple tests
    [initialList addObjectsFromArray:list];
    XCTAssertTrue(testCondition, @"message"); // replace testCondition with whatever you want to verify
}

- (void)testAddSingleElement
{
    NSArray *array = @[ @"one element" ];
    [self performTestWithArray:array];
}

- (void)testAddList
{
    NSArray *array = @[ @"first element", @"second element", @"third element" ];
    [self performTestWithArray:array];
}
Related
Consider unit testing a dictionary object. The first unit tests you might write are a few that simply add items to the dictionary and check for exceptions. The next tests may be something like testing that the count is accurate, or that the dictionary returns a correct list of keys or values.
However, each of these later cases requires that the dictionary can first reliably add items. If the tests which add items fail, we have no idea whether our later tests fail because what they're testing is implemented incorrectly, or because the assumption that we can reliably add items no longer holds.
Can I declare a set of unit tests which cause a given unit test to be inconclusive if any of them fail? If not, how should I best work around this? Have I set up my unit tests wrong, that I'm running into this predicament?
It's not as hard as it might seem. Let's rephrase the question a bit:
If I test my piece of code which requires System.Collections.Generic.List<T>.Add to work, what should I do when one day Microsoft decides to break .Add on List<T>? Do I make the tests that depend on it inconclusive?
The answer to the above is obvious: you don't. You let them fail for one simple reason - your assumptions have failed, and the test should fail. It's the same here. Once you get your add tests to pass, from that point on you assume add works. You shouldn't treat your own tested code any differently than third-party tested code: once it's proven to work, you assume it indeed does.
On a different note, you can utilize a concept called guard assertions. In your remove test, after the arrange phase you introduce an additional assert step which verifies your initial assumptions (in this case, that add is working). More information about this technique can be found here.
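As a sketch of what a guard assertion could look like (NUnit-style syntax; MyDictionary is a stand-in for the dictionary class from the question):

[Test]
public void Remove_DeletesAPreviouslyAddedItem()
{
    // Arrange
    var dictionary = new MyDictionary();
    dictionary.Add("key", "value");

    // Guard assertion: verify the initial assumption (Add works) before exercising Remove.
    Assert.That(dictionary.Count, Is.EqualTo(1), "Guard: Add() is broken, so this Remove() test is meaningless");

    // Act
    dictionary.Remove("key");

    // Assert
    Assert.That(dictionary.Count, Is.EqualTo(0));
}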
As another example, NUnit uses the concept above under the name Theory. It does exactly what you proposed (though it seems aimed more at data-driven testing than at general utility):
The theory itself is responsible for ensuring that all data supplied meets its assumptions. It does this by use of the Assume.That(...) construct, which works just like Assert.That(...) but does not cause a failure. If the assumption is not satisfied for a particular test case, that case returns an Inconclusive result, rather than a Success or Failure.
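A sketch of how that reads in practice, loosely following the square-root example from NUnit's documentation:

[TestFixture]
public class SqrtTheories
{
    [Datapoints]
    public double[] values = { -1.0, 0.0, 1.0, 4.0 };

    [Theory]
    public void SquareRootDefinition(double num)
    {
        // Cases that violate this assumption come out Inconclusive rather than failed.
        Assume.That(num >= 0.0);

        double sqrt = Math.Sqrt(num);

        Assert.That(sqrt >= 0.0);
        Assert.That(sqrt * sqrt, Is.EqualTo(num).Within(0.000001));
    }
}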
However, I think what Mark Seemann states in an answer to the question I linked makes the most sense:
There may be many preconditions that need to be satisfied for a given test case, so you may need more than one Guard Assertion. Instead of repeating those in all tests, having one (and only one) test for each precondition keeps your test code more maintainable, since you will have less repetition that way.
Nice question, I often ponder this and had this problem the other day. What I did was get the basics of our collection working using a dictionary behind the scenes. For example:
public class MyCollection
{
    private IDictionary<string, Item> _backingStore;

    public MyCollection(IDictionary<string, Item> backingStore)
    {
        _backingStore = backingStore;
    }
}
Then we test drove the addition implementation. As we had the dictionary by reference we could assert that after adding items our business logic was correct.
For example, the pseudo code for the addition was something like:
public void Add(Item item)
{
    // Check we have not added before
    // More business logic...
    // Add
}
Then the test could be written such as:
var itemThatWeAdded = new Item();
var subject = new MyCollection(backingStore);

subject.Add(itemThatWeAdded);

Assert.That(backingStore.Values, Contains.Item(itemThatWeAdded));
We then went on to drive out the other methods such as retrieval, and deletion.
Your question is what you should do when the addition breaks and, in turn, breaks the retrieval. This is a catch-22 scenario. Personally I'd rather ditch the injected backing store and treat it as an implementation detail, and that is what we did. We refactored the tests to use the system under test, rather than the backing store, for the asserts. The great thing about the backing store being public initially is that it allows you to test-drive small parts of the codebase, rather than having to implement both addition and retrieval in one go.
The test for addition then looked like the following after we refactored the collection to not expose the backing store.
var subject = new MyCollection();
var item = new Item();

subject.Add(item);

Assert.That(subject.Has(item), Is.True);
In this case I think this is fine. If you cannot add items successfully, then you certainly cannot retrieve anything, because you never added them. As long as your tests are named well, a failing test such as "CanOnlyAddUniqueItemsToCollection" will point future developers in the right direction - in other words, it tells them the addition is broken. Name your tests well and you will be giving as much help as possible.
I don't see this as too much of a problem. If your Dictionary class is not too big, and the unit test for that class is the only unit test testing that code, then when your add method is broken and multiple tests fail, you still know the problem is in the Dictionary class and can identify it, debug and fix it easily.
Where it becomes a problem is when you have other code smells or design problems such as:
unit tests are testing many application classes; using mocks instead can help here.
unit tests are actually system tests, creating and testing many application classes at once.
the Dictionary class is too big and complex, so when it breaks and tests fail it's difficult to figure out which part is broken.
This is very interesting. We use NUnit and, as best I can tell, it runs test methods alphabetically. That might be an overly artificial way to order your tests, but if you built your test classes so that alphabetically/numerically named prerequisite methods came first, you might accomplish what you want.
I find myself writing a test method, running just it to watch it fail, and then writing the code to make it pass. When I'm all done I can run the whole class and everything passes - it doesn't matter what order the tests run in, because everything 'works' thanks to the incremental development I did.
Now, later on, if I break something in the thing I'm testing, who knows what will fail in the harness. I guess it doesn't really matter to me - I've got a long list of failures and I can tease out what went wrong.
I've got a complex part of a web application, and we're now starting to unit test it in order to ensure that everything works, and so that if any changes are made, the tests will be there to check whether we broke anything.
This portion of our app is a sort of wizard: You go from step 1 to step N. Each step can fork in different ways depending on what the user chooses. Each step can also contain either only 1 item or a collection of items, like this:
Step 1: Are there items of type X? If yes, how many?
For each item declared -> form to input item data
Step 2: Are there items of type Y? If yes, how many?
For each item declared -> form to input item data (may contain references to items declared in step 1)
etc. It's not all like this, there are exceptions, but it's just to give an idea of how it is. Now this procedure isn't forward-only. The user must be able to jump back to previous nodes and apply changes, add items or remove items from collections, etc. and the software must remember what was the last step he completed before jumping back, so when he's done he can go on.
For unit testing purposes, I am thinking that I can't have standalone tests: if you haven't completed previous steps, you don't have the data for successive steps. Thus I was thinking of writing ordered tests.
I also read that a best practice is to "have the tests be independent from one another", and what I'm thinking of doing goes against this.
The sample tests I wrote are green if run as an ordered test, but only the first one is green if run as standalone tests.
Now I'd like to hear opinions, and whether anyone has a correct way to approach this situation.
Each test should indeed be independent of the others. If Test B depends on Test A executing before it, then you potentially have very flaky tests on your hands. In your situation, I'd prefer to set up Test B by pre-configuring a context.
What I mean is: whatever state Test A leaves the system in after it has completed, recreate that state to set up the context for Test B, e.g.:
public void TestB()
{
    // Arrange.
    SetupSystemLikeTestAHadJustRun();

    // Act.
    // Do your tests.

    // Assert.
}
By having a known, fixed, context at the start of the test, you stand a much better chance of having a good suite of tests.
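As for what SetupSystemLikeTestAHadJustRun might contain in your wizard scenario: it builds the post-step-1 state directly instead of replaying Test A. A rough sketch - the Wizard type, Step values and ItemX class are placeholders for whatever your application actually uses:

private Wizard wizard;

private void SetupSystemLikeTestAHadJustRun()
{
    // Recreate the state Test A would have left behind:
    // step 1 completed, with one item of type X already entered.
    wizard = new Wizard();
    wizard.AddItem(Step.One, new ItemX("sample data"));
    wizard.CompleteStep(Step.One);
}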
Alternatively, if you have heard of BDD (Behaviour Driven Development), you could use a BDD tool like SpecFlow or NBehave (for .NET) or Cucumber (for Ruby). Using BDD allows you to be more expressive in your testing.
Do's and Don'ts
DO
Name tests with both their expected outcome and relevant details of the state or input being tested
DON'T
Give tests names that say nothing beyond the name of the method being tested except in trivial cases
STRUCTURE
Structure tests in three distinct blocks - arrange, act, and assert.
Unit tests tend to have a very regular structure. A common way to refer to this structure is arrange, act, assert: every test must arrange the state of the world to test, act on the class under test by calling methods, and assert that the world is in the expected state afterward.
The arrange block is for setting up details of the external world specific to the situation under test. This involves things like creating local variables that will be reused in the test, and sometimes instantiating the object under test with specific arguments. This step should not involve any calls to the object under test (do that during the act block) or verifications of initial state (do that during assert, maybe in another test). Note that general setup required by all or many tests should be done in the test's setUp method. If your test doesn't depend on any specific external state, you can skip the arrange block.
The act block is where you actually make calls to the class under test to trigger the behavior that is being tested. Frequently this block will be a single method call, but if the behavior you're testing spans several methods then they will each be called here. Simple arguments can be inlined as part of the method call, but more complex argument expressions are sometimes better off extracted to the arrange block to avoid distracting from the intent of the block. The act block may also assign a method's return value to a local variable so that it can be asserted on later.
The assert block is the place to make assertions on the return values collected and to verify any interactions with mock objects. It can also build values required for the assertions and verifications. In very simple tests, the act and assert blocks are sometimes combined by inlining calls on the class under test into an assert statement.
These blocks should be distinct from one another - the test should not perform any additional setup or stubbing once it makes calls to the class under test in the act block, and it should not make further calls to the class under test once verification begins in the assert block.
It should be clear when glancing at the test where each block starts and ends. Usually this can be done by adding a single blank line between each block (though this isn't necessary in simple tests where the blocks are only one or two lines each). In particularly complex tests, especially ones where you have to set up several different objects, you might want to use blank lines within blocks to make them more readable. In this case, one option to distinguish the blocks is to label each with a comment like // Arrange, // Act, and // Assert.
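For instance, a minimal test laid out this way might read as follows (NUnit syntax; the Calculator class is invented purely for illustration):

[Test]
public void Add_TwoPositiveNumbers_ReturnsTheirSum()
{
    // Arrange
    var calculator = new Calculator();

    // Act
    int result = calculator.Add(2, 2);

    // Assert
    Assert.That(result, Is.EqualTo(4));
}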
Tests that emphasize this structure are clearer since they make it easy to navigate different parts of the test, and more likely to be complete since the regular structure helps ensure that the details of the behavior being tested aren't hidden or omitted.
Mocking frameworks interact with this structure in different ways. Most modern frameworks like Mockito allow stubs to be configured in the arrange block along with defining local variables, and mocks to be verified in the assert block along with performing assertions. Some older frameworks like EasyMock unfortunately require the expected behaviors of mocks to be specified before invoking the code under test - this requires a fourth "expect" block before the act block which works in a similar way to the assert block.
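To sketch how that plays out with a Mockito-style framework, here is the same structure using Moq on .NET - the IMailSender, OrderProcessor and Order names are invented for illustration. The stub is configured in the arrange block and the mock is verified in the assert block:

[Test]
public void Process_ValidOrder_SendsConfirmationMail()
{
    // Arrange: stub the collaborator and build the class under test.
    var mailSender = new Mock<IMailSender>();
    var order = new Order { IsValid = true };
    var processor = new OrderProcessor(mailSender.Object);

    // Act: exercise the behaviour under test.
    processor.Process(order);

    // Assert: verify the interaction with the mock.
    mailSender.Verify(m => m.SendConfirmation(order), Times.Once());
}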
I would use a tool like Cucumber or Selenium to test the graphical flow you describe. You can use a unit test framework like JUnit or NUnit to write this kind of test, but they don't really support running ordered tests.
I'm new to test-driven development and this is the first time I'm trying to use it, in a simple project.
I have a class, and I need to test creation, insertion and deletion of objects of this class. If I write three separate test functions, I need to duplicate the initialization code in each of them. On the other hand, if I put all the tests in one test function, that contradicts "one test per function". What should I do?
Here's the situation:
tst_create()
{
    createHead(head);
    createBody(body);
    createFoot(foot);
}

tst_insert()
{
    createHead(head);
    createBody(body);
    createFoot(foot);
    obj_id = insert(obj); // Also I need to delete obj_id somehow in order to preserve the old state
}

tst_delete()
{
    createHead(head);
    createBody(body);
    createFoot(foot);
    obj_id = insert(obj);
    delete(obj_id);
}
vs
tstCreateInsertDelete()
{
    createHead(head);
    createBody(body);
    createFoot(foot);
    obj_id = insert(obj);
    delete(obj_id);
}
Rather than "One test per function", try thinking about it as, "One aspect of behaviour per function".
What does inserting an object give you? How about deleting an object? Why are these valuable? How can you tell you've done them? Write an example of how the code might be used, and why that behaviour is valuable. That then becomes your test.
When you've worked out what the behaviour is that you're interested in, extract out the duplication only if it makes the test more readable. TDD isn't just about testing; it's also about providing documentation, and helping you think about the responsibility of each element of code and the design of that code. The tests will probably be read far more than they're written, so readability has to come first.
If necessary, put all the behaviour you're interested in in one method, and just make sure it's readable. You can add comments if required.
Factor out the duplication in your tests.
Depending on your test framework, there may be support for defining a setup method that's called before each test execution and a teardown method that's called after each test.
Regardless, you can extract the common stuff so that you only have to repeat a call to a single shared setup.
If you tell us what language and test framework you use, we might be able to give more specific advice.
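For instance, if you happened to be using NUnit (the framework is an assumption here, and the names are kept from your pseudocode, so this is a sketch rather than compilable code), the shared setup could look like this:

[TestFixture]
public class ObjTests
{
    [SetUp]
    public void CreateParts()
    {
        // Runs before every test, so the duplicated construction code lives here exactly once.
        createHead(head);
        createBody(body);
        createFoot(foot);
    }

    [Test]
    public void tst_create()
    {
        // assert on whatever "created correctly" means for your objects
    }

    [Test]
    public void tst_insert()
    {
        obj_id = insert(obj);
        // assert on obj_id
    }

    [Test]
    public void tst_delete()
    {
        obj_id = insert(obj);
        delete(obj_id);
        // assert that the object is gone
    }

    [TearDown]
    public void RestoreOldState()
    {
        // undo any insert here so every test starts from the same state
    }
}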
Suppose I have several unit tests in a test class ([TestClass] in VSUnit in my case). I'm trying to test just one thing in each test (which doesn't mean just one Assert, though). Imagine there's one test (e.g. Test_MethodA()) that tests a method used in other tests as well. I do not want to put an assert on this method in the other tests that use it, to avoid duplication/maintainability issues, so I have the assert in only this one test. Now when this test fails, all tests that depend on correct execution of that tested method fail as well. I want to be able to locate the problem faster, so I want to be somehow pointed to Test_MethodA. It would help, for example, if I could make some of the tests in the test class execute in a particular order; when they fail, I'd start looking for the cause of the failure in the first failing test. Do you have any idea how to do this?
Edit: By suggesting that a solution would be to execute the tests in a particular order, I have probably gone too far and in the wrong direction. I don't care about the order of the tests. It's just that some of the tests will always fail if a prerequisite isn't valid. E.g. I have a test class that tests a DAO class (ok, probably not a unit test, but there's logic in the database stored procedures that needs to be tested, and that's not the point here I think). I need to insert some records into a table in order to test that a method responsible for retrieving the records (let's call it GetAll()) gets them all in the correct order. I do the insert by using a method on the DAO class; let's call it Insert(). I have tests in place that verify that the Insert() method works as expected. Now I want to test the GetAll() method. In order to get the database into the desired state I use the Insert() method. If Insert() doesn't work, most tests for GetAll() will fail. I'd prefer to mark the tests that can't pass because Insert() doesn't work as inconclusive rather than failed. It would ease finding the cause of the problem if I knew which method/test to look into first.
You can't (and shouldn't) execute unit tests in a specific order. The underlying reason for this is to prevent Interacting Tests - I realize that your motivation for requesting such a feature is different, but that's the reason why unit test frameworks don't allow you to order tests. In fact, last time I checked, xUnit.net even randomizes the order.
One could argue that the fact that some of your tests depend on a different method call on the same class is a symptom of tight coupling, but that's not always the case (state machines come to mind).
However, if possible, consider using a Back Door instead of the other method in question.
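For the DAO scenario from your edit, a Back Door would mean putting the rows into the table directly (for instance with plain SQL over a separate connection) rather than through the DAO's Insert(), so a GetAll() test no longer depends on Insert() being correct. A rough sketch in NUnit-style syntax - the Records table, RecordDao class and testConnectionString are made-up names, and it assumes System.Data.SqlClient and System.Linq are available:

[Test]
public void GetAll_ReturnsRecordsInTheCorrectOrder()
{
    // Back-door setup: rows go in via raw SQL, not via the DAO under test.
    using (var connection = new SqlConnection(testConnectionString))
    {
        connection.Open();
        using (var command = new SqlCommand(
            "INSERT INTO Records (Name, SortOrder) VALUES ('b', 2), ('a', 1)", connection))
        {
            command.ExecuteNonQuery();
        }
    }

    var dao = new RecordDao(testConnectionString);

    var records = dao.GetAll();

    Assert.That(records.Select(r => r.Name), Is.EqualTo(new[] { "a", "b" }));
}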
If you can't do either that or decouple the interdependency (e.g. by making the first method virtual and using the Extract and Override technique), you will have to live with it.
Here's an example:
public class MyClass
{
    public virtual void FirstMethod() { /* do something... */ }

    public void SecondMethod() { }
}
Since FirstMethod is virtual, you can derive from MyClass and override its behavior. You can also use a dynamic mock to do that for you. With Moq, it would look like this:
var sutStub = new Mock<MyClass>();
// by default, Moq overrides all virtual methods without calling base

// Now invoke both methods in sequence:
sutStub.Object.FirstMethod(); // overridden by Moq, so it does nothing
sutStub.Object.SecondMethod();
I think I would indeed have the assertion on the method_A() result in every test relying on its result, even if this introduces some duplication. Then I would use the assertion message to point to the method_A() failure.
assert("method_A() returned true", true, rc);
Perhaps I will end up extracting the method_A() call and the assertion into a helper function to remove the duplication.
Now let's imagine method_A() queries an object and returns it, or NULL when no object is found. Then this assertion is a guard; and it is necessary in languages such as C or C++ that do not have a NullPointerException.
I'm afraid you can't do this. The only solution is to redesign your code and break it up into smaller methods so that unit tests can call these one by one. Of course this isn't always desirable.
With Visual Studio you can order your tests: see here. But I'd advise you to stay away from this technique as much as possible: unit tests are meant to be run anywhere, anytime and in any order.
EDIT: why is this a problem for you? All failing tests point to the same method anyway...
I have a method that checks some assumptions and either follows the happy path or terminates along one of the unhappy paths. I've either designed it poorly, or I'm missing the technique for testing that control flow.
if (this.officeInfo.OfficeClosed)
{
    this.phoneCall.InformCallerThatOfficeIsClosedAndHangUp();
    return;
}

if (!this.operators.GetAllOperators().Any())
{
    this.phoneCall.InformCallerThatNoOneIsAvailableAndSendToVoicemail();
    return;
}

Call call = null;
foreach (var op in this.operators.GetAllOperators())
{
    call = op.Call();
    if (call != null) { break; }
}
and so on. I've got my dependencies injected. I've got my mocks moq'd. I can make sure that this or that is called, but I don't know how to test that the "return" happens. If TDD means I don't write a line until I have a test that fails without it, I'm stuck.
How would you test it? Or is there a way to write it that makes it more testable?
Update: Several answers have been posted saying that I should test the resultant calls, not the flow control. The problem I have with this approach is that every test is required to set up and test the state and results of the other tests. This seems really unwieldy and brittle. Shouldn't I be able to test the first if clause alone, and then test the second one alone? Do I really need to have a combinatorially expanding set of tests that start looking like Method_WithParameter_DoesntInvokeMethod8IfMethod7IsTrueandMethod6IsTrueAndMethod5IsTrueAndMethod4IsTrueAndMethod3IsFalseAndMethod2IsTrueAndMethod1isAaaaccck()?
I think you want to test the program's outputs: for example, that when this.officeInfo.OfficeClosed is true, the program does invoke this.phoneCall.InformCallerThatOfficeIsClosedAndHangUp() and does not invoke other methods such as this.operators.GetAllOperators().
I think that your test does this by asking its mock objects (phoneCall, etc.) which of their methods was invoked, or by getting them to throw an exception if any of their methods are invoked unexpectedly.
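With Moq (which you mention you're already using), that check might look roughly like this; the IOfficeInfo, IPhoneCall, IOperators and CallRouter names are guesses based on your snippet:

[Test]
public void IncomingCall_WhenOfficeIsClosed_InformsCallerAndHangsUp()
{
    // Arrange: the office reports closed.
    var officeInfo = new Mock<IOfficeInfo>();
    officeInfo.Setup(o => o.OfficeClosed).Returns(true);
    var phoneCall = new Mock<IPhoneCall>();
    var operators = new Mock<IOperators>();

    var router = new CallRouter(officeInfo.Object, phoneCall.Object, operators.Object);

    // Act
    router.HandleIncomingCall();

    // Assert: the "closed" branch ran and the method returned early.
    phoneCall.Verify(p => p.InformCallerThatOfficeIsClosedAndHangUp(), Times.Once());
    operators.Verify(o => o.GetAllOperators(), Times.Never());
}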
One way to do it is to make a log file of the program's inputs (e.g. 'OfficeClosed returns true') and outputs: then run the test, let the test generate the log file, and then assert that the contents of the generated log file match the expected log file contents for that test.
I'm not sure that's really the right approach. You care about whether or not the method produced the expected result, not necessarily how control "flowed" through the particular method. For example, if phoneCall.InformCallerThatOfficeIsClosedAndHangUp is called, then I assume some result is recorded somewhere. So in your unit test, you would be asserting that result was indeed recorded (either by checking a database record, file, etc.).
With that said, it's important to ensure that your unit tests indeed cover your code. For that, you can use a tool like NCover to ensure that all of your code is being exercised. It'll generate a coverage report which will show you exactly which lines were executed by your unit tests and, more importantly, which ones weren't.
You could go ballistic and use a strategy pattern. Something along the lines of having an interface IHandleCall, with a single void method DoTheRightThing(), and 3 classes HandleOfficeIsClosed, HandleEveryoneIsBusy, HandleGiveFirstOperatorAvailable, which implement the interface. And then have code like:
IHandleCall handleCall;
if (this.officeInfo.OfficeClosed)
{
    handleCall = new HandleOfficeIsClosed();
}
else if (/* other condition */)
{
    handleCall = new OtherImplementation();
}

handleCall.DoTheRightThing();
return;
That way you can get rid of the multiple return points in your method. Note that this is a very dirty outline, but essentially at that point you should extract the if/else into some factory, and then the only thing you have to test is that your class calls the factory, and that handleCall.DoTheRightThing() is called - (and of course that the factory returns the right strategy).
In any case, because you have already guarded against no operator available, you could simplify the end to:
var op = this.operators.FindFirst();
call = op.Call();
Don't test the flow control, just test the expected behavior. That is, unit testing does not care about the implementation details, only that the behavior of the method matches the specifications of the method. So if Add(int x, int y) should produce the result 4 on input x = 2, y = 2, then test that the output is 4 but don't worry about how Add produced the result.
To put it another way, unit testing should be invariant under implementation details and refactoring. But if you're testing implementation details in your unit tests, then you can't refactor without breaking the unit tests. For example, if you implement a method GetPrime(int k) to return the kth prime, then check that GetPrime(10) returns 29, but don't test the flow control inside the method. If you implement GetPrime using the Sieve of Eratosthenes, test the flow control inside the method, and later refactor to use the Sieve of Atkin, your unit tests will break. Again, all that matters is that GetPrime(10) returns 29, not how it does it.
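In test form, that is simply (GetPrime being the hypothetical method from the paragraph above):

[Test]
public void GetPrime_TenthPrime_Returns29()
{
    // Asserts only the observable result; the sieve used internally
    // can be swapped out without breaking this test.
    Assert.That(GetPrime(10), Is.EqualTo(29));
}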
If you are stuck while doing TDD, that's a good thing: it means that TDD is driving your design and you are looking into how to change it so you can test it.
You can either:
1) verify state: check SUT state after SUT execution or
2) verify behavior: check that mock object calls complied with test expectations
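A rough sketch of the two options, using an invented Account / IAccountRepository pair and Moq for the behaviour check:

// 1) Verify state: exercise the SUT, then assert on its observable state.
[Test]
public void Deposit_IncreasesBalance()
{
    var account = new Account(Mock.Of<IAccountRepository>());
    account.Deposit(100m);
    Assert.That(account.Balance, Is.EqualTo(100m));
}

// 2) Verify behaviour: assert that the SUT interacted with its collaborator as expected.
[Test]
public void Deposit_PersistsTheNewBalance()
{
    var repository = new Mock<IAccountRepository>();
    var account = new Account(repository.Object);
    account.Deposit(100m);
    repository.Verify(r => r.Save(account), Times.Once());
}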
If you don't like how either of these approaches looks in your test, it's time to refactor the code.
The pattern described by Aaron Feng and K. Scott Allen would solve my problem and its testability concerns. The only issue I see is that it requires all the computation to be performed up front. The decision data object needs to be populated before all of the conditionals. That's great unless it requires successive round trips to persistent storage.