Value of Step-by-Step Asserts in Unit Tests - unit-testing

When writing unit tests, there are cases where one can create an Assert for each condition that could fail or an Assert that would catch all such conditions. C# Example:
Dictionary<string, string> dict = LoadDictionary();
// Optional Asserts:
Assert.IsNotNull(dict, "LoadDictionary() returned null");
Assert.IsTrue(dict.Count > 0, "Dictionary is empty");
Assert.IsTrue(dict.ContainsKey("ExpectedKey"), "'ExpectedKey' not in dictionary");
// Condition actually interested in testing:
Assert.IsTrue(dict["ExpectedKey"] == "ExpectedValue", "'ExpectedKey' is present but value is not 'ExpectedValue'");
Is there value to a large, multi-person project in this kind of situation to add the "Optional Asserts"? There's more work involved (if you have lots of unit tests) but it will be more immediately clear where the problem lies.
I'm using VS 2010 and the integrated testing tools but intend the question to be generic.

I think there is value in doing something like this, but you have to be careful how you use it. I also work on a large, multi-person project, and recently we started to use a similar approach in our unit testing strategy.
We try to have one test per "execution path", and we have test cases with multiple asserts. However, we distinguish fatal and non-fatal asserts, and the multiple asserts in such a test case are all non-fatal. Each of these test cases also has one fatal assert, which validates a condition whose failure makes any further assertion pointless. This approach helps us localize errors faster, as sometimes more than one assert can fail.
Combining it with custom logs to provide additional information on failures - testing and debugging is much faster and actually more efficient.
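The fatal/non-fatal split can be sketched in a few lines of Python. This is only an illustration under assumptions: the `SoftAsserts` helper and `check_dictionary` are hypothetical names, not part of any framework mentioned here.

```python
class SoftAsserts:
    """Collects non-fatal failures so one test run reports all of them."""

    def __init__(self):
        self.failures = []

    def check(self, condition, message):
        # Non-fatal: record the failure and keep going.
        if not condition:
            self.failures.append(message)

    def require(self, condition, message):
        # Fatal: nothing after this point is meaningful if it fails.
        if not condition:
            raise AssertionError(message)

    def verify(self):
        # Fail once at the end, listing every recorded problem.
        if self.failures:
            raise AssertionError("; ".join(self.failures))


def check_dictionary(d):
    soft = SoftAsserts()
    soft.require(d is not None, "dictionary is None")
    soft.check(len(d) > 0, "dictionary is empty")
    soft.check(d.get("ExpectedKey") == "ExpectedValue",
               "wrong value for 'ExpectedKey'")
    soft.verify()
```

With an empty dictionary both non-fatal checks are reported in a single failure, while `None` stops at the fatal check.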
However, looking at your example, I am not sure that "multiple/optional asserts" is really a good approach, as most likely you do not want to test these basic functionalities over and over again (LoadDict(), not empty etc). I think that in your case, the "test case setup" should ensure that the Dictionary is "not empty" and that LoadDictionary() performs as expected (already tested with specific TCs). The goal of this test case seems to be validating the lookup method, and it should focus on testing that thing only. Everything else is setup or other functionality and does not really belong in this TC.

It is advisable to only have one assert per test, that way, you are clear in your own mind exactly what you are testing.
In TDD, this will make it easier to isolate exactly what it is you are about to implement after a failing test.
The reports generated by your testing tools will be more accurate.
Each of those "optional" Asserts strikes me as being a separate test, but there is no need to assert each of them in each test.
Have a test for whether LoadDictionary(); returns null,
Have a test for whether LoadDictionary(); returns an empty dictionary
etc.
Don't have a single test containing all of these asserts. Definitely don't have several tests containing most of these asserts followed by the actual thing each test is meant to examine.
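As a rough sketch of that split, here is how it might look with Python's `unittest`. The `load_dictionary` function is a hypothetical stand-in for the question's LoadDictionary(), so treat the names as assumptions.

```python
import unittest


def load_dictionary():
    # Hypothetical stand-in for the question's LoadDictionary().
    return {"ExpectedKey": "ExpectedValue"}


class LoadDictionaryTests(unittest.TestCase):
    def test_returns_a_dictionary(self):
        self.assertIsNotNone(load_dictionary())

    def test_is_not_empty(self):
        self.assertGreater(len(load_dictionary()), 0)

    def test_expected_key_has_expected_value(self):
        # The only test that cares about the value; no guard asserts here.
        self.assertEqual("ExpectedValue", load_dictionary()["ExpectedKey"])
```

Each condition now fails under its own test name, so the report itself says which one broke.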

Personally, I don't think there's a lot of value to pre-emptively revisiting existing tests to insert the optional Asserts.
If this is an existing project, hopefully all your tests are passing. If that's the case, they'll only start breaking when you refactor and/or change or add something to your code.
I think it'd be easier to deal with the tests that break if and when they break. Sure, you'll be getting a generic failure for your test, but by throwing in a breakpoint (or generally just investigating) you'll pretty easily discover the true source of your failure.
Alternatively, when a test fails, you could add the optional Asserts then to clarify the error. That way you won't be using time upfront to add additional Asserts to tests that aren't going to fail, but you still get the benefit when you do.
If, however, this is a proactive move (as you've suggested) that outlines guidelines for testing, I think it really depends on the test itself and the benefit you receive from the additional Asserts. How much time will you really save by knowing dict is null rather than just missing a key? Ultimately, you should only be testing one thing, so if you start to find a lot of assertions in the one test, there's probably something wrong.
Personally, I don't think a global policy dictating the assertions required is something worth implementing. I think it should be decided on a case-by-case basis. For some tests, like the example you've given, there's probably a lot of value in some of the additional assertions. For something simpler that's unlikely to fail, there's probably not.
Forcing developers to catch and describe every possible discrete failure is a bit negative. Like you're expecting it to fail frequently enough that it's worthwhile saving a few minutes diagnosing it.

What's important is to be able to quickly understand the origin of the failure when the bar goes red.
Let's suppose there's only one assertion:
Assert.IsTrue(dict["ExpectedKey"] == "ExpectedValue",
"'ExpectedKey' is present but value is not 'ExpectedValue'");
What happens when LoadDictionary() returns null? If this crashes the entire unit test application, as will happen with C/C++, then an assertion guard is necessary. If the failure message clearly indicates that dict was null, then the optional assertions are pointless.
The second question is what happens when the expected key does not appear in the dictionary. Once again, the error message should make the distinction. If the assertion that tests the value associated with the key throws an exception (such as a missing key exception), then it's fine: no need to test first that the key exists, and then its value. One test is enough here.
Testing whether the dictionary is empty is useless, since the objective of this test seems to be verifying that a certain key is present with a certain value.
Last, using some sort of equality assertion would give a more accurate error message:
Assert.Equal("ExpectedValue", dict["ExpectedKey"],
"Incorrect value for 'ExpectedKey'");
The error should be something like "expected: ExpectedValue, actual: ". Knowing the incorrect value for the key can be helpful.
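To make the difference concrete, here is a small Python `unittest` sketch that captures the message each style of assertion produces. The helper `failure_messages` and the `_Probe` class are hypothetical and exist only for this illustration.

```python
import unittest


class _Probe(unittest.TestCase):
    def runTest(self):  # lets us instantiate the case outside a test runner
        pass


def failure_messages(actual):
    """Return the messages produced by a boolean and an equality assertion."""
    probe = _Probe()
    messages = {}
    try:
        probe.assertTrue(actual == "ExpectedValue")
    except AssertionError as error:
        messages["boolean"] = str(error)
    try:
        probe.assertEqual("ExpectedValue", actual)
    except AssertionError as error:
        messages["equality"] = str(error)
    return messages
```

For an actual value of "Other", the boolean assertion only reports `False is not true`, while the equality assertion's message names both the expected and the actual string.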

Related

Unit testing for either/or conditions

The module I'm working on holds a list of items and has a method to locate and return an item from that list based on certain criteria. The specification states that "...if several matching values are found, any one may be returned".
I'm trying to write some tests with NUnit, and I can't find anything that allows me to express this condition very well (i.e. the returned object must be either A or B but I don't mind which).
Of course I could quite easily write code that sets a boolean to whether the result is as expected and then just do a simple assert on that boolean, but this whole question is making me wonder whether this is a "red flag" for unit testing and whether there's a better solution.
How do experienced unit testers generally handle the case where there are a range of acceptable outputs and you don't want to tie the test down to one specific implementation?
Since your question is in rather general form, I can only give a rather general answer, but for example...
Assert.That(someObject, Is.TypeOf<A>().Or.TypeOf<B>());
Assert.That(someObject, Is.EqualTo(objectA).Or.EqualTo(objectB));
Assert.That(listOfValidObjects, Contains.Item(someObject));
It depends on the details of what you are testing.
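The same idea carries over to other languages. Below is a minimal Python sketch, assuming a hypothetical `find_match` lookup that may return any matching item; the test checks membership in the set of acceptable answers rather than equality with one of them.

```python
def find_match(items, predicate):
    # Hypothetical lookup: returns any one matching item (here, the first).
    for item in items:
        if predicate(item):
            return item
    return None


def test_any_matching_item_is_acceptable():
    items = ["apple", "avocado", "banana"]
    acceptable = {"apple", "avocado"}
    result = find_match(items, lambda s: s.startswith("a"))
    # Membership in the set of valid answers, not equality with one of them.
    assert result in acceptable, f"{result!r} is not one of {sorted(acceptable)}"
```

If the implementation later starts returning the last match instead of the first, this test still passes, which is what the specification allows.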
I am coming from Java, JUnit and parameterized tests, but it seems that NUnit supports those as well (see here).
One could use that to generate values for your different variables (and the "generator" could keep track of the expected overall result, too).
Using that approach you might find ways to avoid "hard-coding" all the potential input value combinations (as said: by really generating them); but at least you should be able to write code where that information of different input values together with the expected result is more nicely "colocated" in your source code.

Unit testing checking for nulls

This is a very basic question but I still cannot find the appropriate answer. In my test there is a possibility to have null values and because of that the last stage (Act) starts looking a little bit strange (it is no longer act only). What I mean is the following:
Assert.IsNotNull(variable);
var newVariable = variable.Property;
Assert.IsNotNull(newVariable);
var finalVariable = newVariable.AnotherProperty;
Assert.AreEqual(3, finalVariable.Count);
Now they are obviously related and I have to be sure that the values are not null, but also there are three asserts in one test and the act part starts to look not right.
So what is the general solution in such cases? Is there anything smarter than 3 tests with one assert each and checks for null before the asserts of the last 2?
Basically there are two ways of dealing with your problem:
Guard assertions: extra asserts making sure data is in known state before proper test takes place (that's what you're doing now).
Moving guard assertions to their own tests.
Which option to choose largely depends on the code under test. If the preconditions would be duplicated in other tests, that's a hint for the separate-test approach. If a precondition is reflected in production code, that's again a hint for the separate-test approach.
On the other hand, if it's only something you do to boost your confidence, maybe a separate test is too much (yet as noted in other answers, it might be a sign that you're not in full control of your test or that you're testing too many things at once).
I think you should split this test into three tests and name them according to what's happening. It's perfectly sensible even if your acts in those tests are the same; you are testing different scenarios by checking the return value of the method.
Nulls are a royal pain. The question is, can they legitimately exist?
Let's separate our discussion to code and tests.
If nulls shouldn't exist, then the code itself, not the tests, should check and verify that the values are not null. For this reason each and every method of my code is built using a snippet that checks the arguments:
public VideoPosition(FrameRate theFrameRate, TimeSpan theAirTime)
{
    Logger.LogMethod("theVideoMovie", theFrameRate, "theAirTime", theAirTime);
    try
    {
        #region VerifyInputs
        Validator.Verify(theFrameRate);
        Validator.Verify(theAirTime);
        Validator.VerifyTrue(theAirTime.Ticks >= 0, "theAirTime.Ticks >= 0");
If nulls ARE legitimate in the code, but you are testing a scenario where the returned values shouldn't be null, then of course you have to verify this in your testing code.
In your Unit Test you should be able to control every input to your class under test. This means that you control if your variable has a value or not.
So you would have one unit test that forces your variable to be null and then asserts this.
You will then have another test where you can be sure that your variable has a value and you only need the other asserts.
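A rough Python sketch of that arrangement follows. The `item_count` function and the `prop` / `another_prop` attribute names are hypothetical stand-ins for the question's property chain.

```python
from types import SimpleNamespace


def item_count(variable):
    # Hypothetical code under test: walks variable.prop.another_prop.
    if variable is None or variable.prop is None:
        return None
    return len(variable.prop.another_prop)


def test_missing_property_yields_none():
    variable = SimpleNamespace(prop=None)   # the test forces the null case
    assert item_count(variable) is None


def test_populated_property_is_counted():
    inner = SimpleNamespace(another_prop=[1, 2, 3])
    variable = SimpleNamespace(prop=inner)  # the test guarantees a value
    assert item_count(variable) == 3
```

Because each test builds its own input, the second test needs no null guards before its single assertion.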
I wrote a blog about this some time ago. Maybe it can help: Unit Testing, hell or heaven?

Unit testing with random data

I've read that generating random data in unit tests is generally a bad idea (and I do understand why), but testing on random data and then constructing a fixed unit test case from random tests which uncovered bugs seems nice. However I don't understand how to organize it nicely. My question is not related to a specific programming language or to a specific unit test framework actually, so I'll use python and some pseudo unit test framework. Here's how I see coding it:
def random_test_cases():
    datasets = [
        dataset1,
        dataset2,
        ...
        datasetn
    ]
    for dataset in datasets:
        assertTrue(...)
        assertEquals(...)
        assertRaises(...)
        # and so on
The problem is: when this test case fails I can't figure out which dataset caused the failure. I see two ways of solving it:
Create a single test case per dataset; the problem is the sheer number of test cases and the code duplication.
Usually the test framework lets us pass a message to assert functions (in my example I could do something like assertTrue(..., message = str(dataset))). The problem is that I would have to pass such a message to each assert, which does not look elegant either.
Is there a simpler way of doing it?
I still think it's a bad idea.
Unit tests need to be straightforward. Given the same piece of code and the same unit test, you should be able to run it infinitely and never get a different response unless there's an external factor coming in to play. A goal contrary to this will increase maintenance cost of your automation, which defeats the purpose.
Outside of the maintenance aspect, to me it seems lazy. If you put thought into your functionality and understand the positive as well as the negative test cases, developing unit tests is straightforward.
I also disagree with the user who shows how to do multiple tests cases inside of the same test case. When a test fails, you should be able to tell immediately which test failed and know why it failed. Tests should be as simple as you can make them and as concise/relevant to the code under test as possible.
You could define tests by extension instead of enumeration, or you could call multiple test cases from a single case.
Calling multiple test cases from a single test case:
MyTest()
{
    MyTest(1, "A")
    MyTest(1, "B")
    MyTest(2, "A")
    MyTest(2, "B")
    MyTest(3, "A")
    MyTest(3, "B")
}
And there are sometimes elegant ways to achieve this with some testing frameworks. Here is how to do it in NUnit:
[Test, Combinatorial]
public void MyTest(
    [Values(1,2,3)] int x,
    [Values("A","B")] string s)
{
    ...
}
I also think it's a bad idea.
Mind you, not throwing random data at your code, but having unit tests doing that. It all boils down to why you unit test in the first place. The answer is "to drive the design of the code". Random data doesn't drive the design of the code, because it depends on a very rigid public interface. Mind you, you can find bugs with it, but that's not what unit tests are about. And let me note that I'm talking about unit tests, and not tests in general.
That being said, I strongly suggest taking a look at QuickCheck. It's Haskell, so it's a bit dodgy on presentation and a bit PhD-ish on documentation, but you should be able to figure it out. I'm going to summarize how it works, though.
After you pick the code you want to test (let's say the sort() function), you establish invariants which should hold. In this example, you can have the following invariants if result = sort(input):
Every element in result should be smaller than or equal to the next one.
Every element in input should be present in result the same number of times.
result and input should have the same length (this repeats the previous one, but let's have it for illustration).
You encode each invariant in a simple function that takes the input and the result and checks whether the invariant holds.
Then, you tell QuickCheck how to generate input. Since this is Haskell and the type system kicks ass, it can see that the function takes a list of integers and it knows how to generate those. It basically generates random lists of random integers and random length. Of course, it can be more fine-grained if you have a more complex data type (for example, only positive integers, only squares, etc.).
Finally, when you have those two, you just run QuickCheck. It generates all that stuff randomly and checks the invariants. If some fail, it will show you exactly which ones. It would also tell you the random seed, so you can rerun this exact failure if you need to. And as an extra bonus, whenever it gets a failed invariant, it will try to reduce the input to the smallest possible subset that fails the invariant (if you think of a tree structure, it will reduce it to the smallest subtree that fails the invariant).
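For readers who want to see the shape of this outside Haskell, here is a hand-rolled Python sketch of the same workflow. It is not QuickCheck: the names `check_sort`, `is_ordered` and `same_elements` are made up for illustration, it uses only the standard `random` module with a fixed seed, and it does not attempt to shrink failing inputs.

```python
import random
from collections import Counter


def is_ordered(result):
    # Invariant 1: every element is <= the next one.
    return all(a <= b for a, b in zip(result, result[1:]))


def same_elements(data, result):
    # Invariant 2: every element appears the same number of times.
    return Counter(data) == Counter(result)


def check_sort(sort, seed=0, runs=200):
    """Return the first failing input, or None if every run passes."""
    rng = random.Random(seed)  # fixed seed makes any failure reproducible
    for _ in range(runs):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = sort(list(data))
        if not (is_ordered(result) and same_elements(data, result)):
            return data
    return None
```

Calling `check_sort(sorted)` should find nothing, whereas a sort that silently drops duplicates, such as `lambda xs: sorted(set(xs))`, gets caught and returns the offending list, the same one on every run because of the seed.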
And there you have it. In my opinion, this is how you should go about testing stuff with random data. It's definitely not unit tests and I even think you should run it differently (say, have CI run it every now and then, as opposed to running it on every change (since it will quickly get slow)). And let me repeat, it's a different benefit from unit testing - QuickCheck finds bugs, while unit testing drives design.
Usually the unit test frameworks support 'informative failures' as long as you pick the right assertion method.
However, if all else fails, you could easily trace the dataset to the console or an output file. Low tech, but it should work.
[TestCaseSource("GetDatasets")]
public Test.. (Dataset d)
{
    Console.WriteLine(PrettyPrintDataset(d));
    // proceed with checks
    Console.WriteLine("Worked!");
}
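Since the question uses Python for its pseudo-code, it may be worth noting that Python's own `unittest` has a built-in answer to "which dataset failed": `subTest`. The sketch below assumes made-up datasets and uses `sorted` as the code under test.

```python
import unittest


class DatasetTests(unittest.TestCase):
    # Hypothetical datasets; each is (input, expected sorted output).
    datasets = [
        ([3, 1, 2], [1, 2, 3]),
        ([], []),
        ([2, 2, 1], [1, 2, 2]),
    ]

    def test_sorted_matches_expected(self):
        for data, expected in self.datasets:
            # Any failure inside this block is reported with `data` attached.
            with self.subTest(data=data):
                self.assertEqual(expected, sorted(data))
```

A failing dataset is reported with its `data` value in the failure header, and the loop continues with the remaining datasets instead of stopping at the first failure.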
In quickcheck for R we tried to solve this problem as follows:
the tests are actually pseudo-random (the seed is fixed) so you can always reproduce your test results (barring external factors, of course)
the test function returns enough data to reproduce the error, including the assertion that failed and the data that made it fail. A convenience function, repro, called on the return value of test will land you in the debugger at the beginning of the failing assertion, with arguments set to the witnesses of the failure. If the tests are executed in batch mode, equivalent information is stored in a file and the command to retrieve it is printed to stderr. Then you can call repro as before. Whether or not you program in R, I would love to know if this starts to address your requirements. Some aspects of this solution may be hard to implement in languages that are less dynamic or don't have first-class functions.

Designing a robust unit test - testing the same logic in several different ways?

In unit test design, it is very easy to fall into the trap of actually just calling your implementation logic.
For example, if testing an array of ints which should all be two higher than the other (2, 4, 6, 8, etc), is it really enough to get the return value from the method and assert that this pattern is the case?
Am I missing something? It does seem like a single unit test method needs to be made more robust by testing the same expectation in several ways. So the above expectation can be asserted by checking the increase of two is happening but also the next number is divisible by 2. Or is this just redundant logic?
So in short, should a unit test test the one expectation in several ways? For example, if I wanted to test that my trousers fit me, I would/could measure the length, put it next to my leg and see the comparison, etc. Is this the sort of logic needed for unit testing?
Thanks
Your unit tests should check all of your assumptions. Whether you do that in 1 test or multiple tests is a personal preference.
In the example you stated above, you had two different assumptions: (1) Each value should increment by 2. (2) All values should be even.
Should (-8,-6,-4,-2) pass/fail?
Remember, ensuring your code fails when it's supposed to is just as important, if not more important, than making sure it passes when it's supposed to.
If you assert that your array contains 2,4,6,8 -- then your testing logic might be flawed because your test would pass if you just returned an array with those elements, but not with, say, 6,8,10,12. You need to test that the calculation is correct. So you need to test it with multiple arrays, in this particular case.
I find that making sure the test fails, then making the test pass, in the true spirit of TDD, helps flush out what the correct test is...
The array you are testing must be generated by some sort of logic. Isn't it better to test this logic to ensure that the resulting array always meets your requirements?
"For example, if testing an array of ints which should all be two higher than the other (2, 4, 6, 8, etc), is it really enough to get the return value from the method and assert that this pattern is the case?"
Perhaps you need to think a little more about how the function would be used. Will it be used with very large numbers? If so, then you may want to try some tests with very large numbers. Will it be used with negative numbers?
"Am I missing something? It does seem like a single unit test method needs to be made more robust by testing the same expectation in several ways. So the above expectation can be asserted by checking the increase of two is happening but also the next number is divisible by 2. Or is this just redundant logic?"
Hmm... well 1,3,5,7 would pass the assertEachValueIncrementsByTwo test, but it would not pass the assertValuesDivisibleByTwo test. Does it matter that they are divisible by 2? If so, then you really should test that. If not, then it's a pointless redundant test.
You should try to find more than 1 test for your methods, but redundant tests for the sake of more testing is not going to help you. Adding the assertValuesDivisibleByTwo test when that is not really required will just confuse later developers who are trying to modify your code.
If you can't think of any more tests, try writing a random input function that will generate 100 random test arrays each time you run your tests. You'd be surprised how many bugs escape under the radar when you only check one or two input sets.
I'd recommend multiple tests. If you ever need to change the behaviour you'd like to have as few tests to change as possible. This also makes it easier to find what the problem is. If you really blow the implementation and get [1,3,4,5], your one test will fail, but you'll only get one failure for the first thing you test when there are actually two different problems.
Try naming your tests. If you can't say in one clear method name what you're testing, break up the test.
testEntriesStepByTwo
testEntriesAllEven
Also don't forget the edge cases. The empty list will likely pass the 'each entry is 2 more than the previous' one and 'all entries are even' tests, but should it?
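Here is a minimal Python sketch of the two named tests plus the edge case. The `step_by_two` function is a hypothetical implementation under test, and the starting values are chosen arbitrarily (including a negative one).

```python
def step_by_two(start, count):
    # Hypothetical function under test: `count` values rising by 2 from `start`.
    return [start + 2 * i for i in range(count)]


def entries_step_by_two(values):
    return all(b - a == 2 for a, b in zip(values, values[1:]))


def entries_all_even(values):
    return all(v % 2 == 0 for v in values)


def test_entries_step_by_two():
    for start in (2, 6, -8):
        assert entries_step_by_two(step_by_two(start, 4))


def test_entries_all_even():
    for start in (2, 6, -8):
        assert entries_all_even(step_by_two(start, 4))


def test_checks_reject_bad_sequences():
    assert not entries_step_by_two([1, 3, 4, 5])
    assert entries_step_by_two([1, 3, 5, 7]) and not entries_all_even([1, 3, 5, 7])
    # Edge case: both checks pass vacuously for an empty list.
    assert entries_step_by_two([]) and entries_all_even([])
```

The last test makes the empty-list behaviour explicit, so whoever reads it has to decide whether passing vacuously is actually the desired outcome.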

Unit testing specific values

Consider the following code (from a requirement that says that 3 is special for some reason):
bool IsSpecial(int value)
    if (value == 3)
        return true
    else
        return false
I would unit test this with a couple of functions - one called TEST(3IsSpecial) that asserts that when passed 3 the function returns true and another that passes some random value other than 3 and asserts that the function returns false.
When the requirement changes and say it now becomes 3 and 20 are special, I would write another test that verifies that when called with 20 this function returns true as well. That test would fail and I would then go and update the if condition in the function.
Now, what if there are people on my team who do not believe in unit testing and they make this change? They will go straight to the code and change it, and my second unit test might not test for 20 (it could be randomly picking an int or have some other int hardcoded), so nothing fails. Now my tests aren't in sync with the code. How do I ensure that when they change the code some unit test or the other fails?
I could be doing something grossly wrong here so any other techniques to get around this are also welcome.
That's a good question. As you note a Not3IsNotSpecial test picking a random non-3 value would be the traditional approach. This wouldn't catch a change in the definition of "special".
In a .NET environment you can use the new code contracts capability to write the test predicate (the postcondition) directly in the method. The static analyzer would catch the defect you proposed. For example:
Contract.Ensures(value == 3 || Contract.Result<Boolean>() == false);
I think anybody that's a TDD fan is experimenting with contracts now to see use patterns. The idea that you have tools to prove correctness is very powerful. You can even specify these predicates for an interface.
The only testing approach I've seen that would address this is Model Based Testing. The idea is similar to the contracts approach. You set up the Not3IsNotSpecial condition abstractly (e.g., IsSpecial(x => x != 3) == false) and let a model execution environment generate concrete tests. I'm not sure but I think these environments do static analysis as well. Anyway, you let the model execution environment run continuously against your SUT. I've never used such an environment, but the concept is interesting.
Unfortunately, that specific scenario is something that is difficult to guard against. With a function like IsSpecial, it's unrealistic to test all four billion negative test cases, so, no, you're not doing something grossly wrong.
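A partial compromise, sketched here in Python under assumptions, is to sweep a bounded range instead of the whole integer domain. The helper `unexpected_specials` and the range of -1000 to 1000 are arbitrary choices for illustration; a value outside the range would still slip through.

```python
def is_special(value):
    # Hypothetical implementation from the question.
    return value == 3


def unexpected_specials(predicate, expected, low=-1000, high=1000):
    """Return values in [low, high] whose result disagrees with `expected`."""
    return [v for v in range(low, high + 1) if predicate(v) != (v in expected)]
```

A test can then assert that `unexpected_specials(is_special, {3})` is empty. If someone quietly makes 20 special as well, the same call returns `[20]` and the test fails.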
Here's what comes to me off the top of my head. Many repositories have hooks that allow you to run some process on each check-in, such as running the unit tests. It's possible to set a criterion that newly checked in code must reach some threshold of code coverage under unit tests. If the commit does not meet certain metrics, it is rejected.
I've never had to set one of these systems up, so I don't know what is involved, but I do know it's possible.
And believe me, I feel your pain. I work with people who are similarly resistant to unit testing.
One thing you need to think about is why 3 is special and other values are not. If it is defining some aspect of your application, you can take that aspect out and make an enum out of it.
Now you can write a test that fails if the value doesn't exist in the enum. And for the enum itself, write a test that checks its possible values. If a new possible value is added, that test should fail.
So your method will become:
bool IsSpecial(int value)
    if (SpecialValues.has(value))
        return true
    else
        return false
and your SpecialValues will be an enum like:
enum SpecialValues {
    Three(3), Twenty(20);
    public final int value;
    SpecialValues(int value) { this.value = value; }
}
Now you should write tests for the enum's possible values. One simple test can check the total number of possible values, and another can check the possible values themselves.
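A Python rendering of this idea might look like the sketch below. The `SpecialValues` enum mirrors the one above; the `is_special` function and the test names are assumptions made for illustration.

```python
from enum import Enum


class SpecialValues(Enum):
    THREE = 3
    TWENTY = 20


def is_special(value):
    return value in {member.value for member in SpecialValues}


def test_special_values_are_exactly_the_known_set():
    # Fails as soon as someone adds or removes a member without updating this test.
    assert {member.value for member in SpecialValues} == {3, 20}


def test_is_special():
    assert is_special(3) and is_special(20)
    assert not is_special(4)
```

The first test is the tripwire: a teammate who adds a new special value has to touch the enum, and that change breaks the test until it is updated too.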
The other point to make is that in a less contrived example:
20 might have been some valid condition to test for based on knowledge of the business domain. Writing tests in a BDD style based on knowledge of the business problem might have helped you explicitly catch it.
4 might have been a good value to test for due to its status as a boundary condition. This may have been more likely to change in the real world so would more likely show up in a full test case.