Should I have failing tests? - unit-testing

Please note: I'm not asking for your opinion. I'm asking about conventions.
I was just wondering whether I should have both passing and failing tests, with appropriate method names such as Should_Fail_When_UsageQuantityIsNegative(), Should_Fail_When_UsageQuantityMoreThan50(), and Should_Pass_When_UsageQuantityIs50().
Or should I instead code them all to pass, so that every test stays in the Passed state?

When you create unit tests, they should all pass. That doesn't mean that you shouldn't test the "failing" cases. It just means that the test should pass when the code under test "fails" in the expected way.
This way, you don't have to go through your (preferably) large number of tests and manually check which ones were supposed to pass and which were supposed to fail; that would pretty much defeat the purpose of automation.
As Mark Rotteveel points out in the comments, just testing that something failed isn't always enough. Make sure that the failure is the correct failure. For example, if you are using error codes, where error_code equal to 0 indicates success, and you want to make sure that there is a failure, don't just test that error_code != 0; instead, test that error_code == 19, or whatever the correct failing error code is.
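As a minimal JUnit 4 sketch of that advice (the chargeUsage method and its error codes are invented here purely for illustration):
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class UsageQuantityTest {

    // Hypothetical system under test: returns 0 on success, a specific code on failure.
    static final int ERR_NEGATIVE_QUANTITY = 19;

    static int chargeUsage(int quantity) {
        if (quantity < 0) {
            return ERR_NEGATIVE_QUANTITY;
        }
        return 0;
    }

    @Test
    public void Should_Fail_When_UsageQuantityIsNegative() {
        // Assert the specific failure, not merely "some failure" (error_code != 0).
        assertEquals(ERR_NEGATIVE_QUANTITY, chargeUsage(-1));
    }
}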
Edit
There is one additional point that I would like to add. While the final version of your code that you deploy should not have failing tests, the best way to make sure that you are writing correct code is to write your tests before you write the rest of the code. Before making any change to your source code, write a unit test (or ideally, a few unit tests) that should fail (or fail to compile) now, but pass after your change has been made. That's a good way to make sure that the tests that you write are testing the correct thing. So, to summarize, your final product should not have failing unit tests; however, the software development process should include periods where you have written unit tests that do not yet pass.

You should not have failing tests unless your program is acting in a way that it is not meant to.
If the intended behavior of your program is for something to fail, and it fails, that should trigger the test to pass.
If the program passes in a place where it should be failing, the test for that portion of code should fail.
In summary, a program is not working properly unless all tests are passing.

You should never have failing tests; as others have pointed out, this defeats the purpose of automation. What you might want are tests that verify your code works as expected when inputs are incorrect. Looking at your examples, Should_Fail_When_UsageQuantityIsNegative() is a test that should pass, but the assertions you make depend on what "fail" means. For example, if your code should throw an IllegalArgumentException when usage quantity is negative, then you might have a test like this:
@Test(expected = IllegalArgumentException.class)
public void Should_Fail_When_UsageQuantityIsNegative() {
    // code to set usage quantity to a negative value,
    // which is expected to throw IllegalArgumentException
}

There are a few different ways to interpret the question of whether tests should fail.
A test like Should_Fail_When_UsageQuantityMoreThan50() should instead be a passing test which checks that the appropriate error is thrown: Throws_Exception_When_UsageQuantityMoreThan50() or the like. Many test suites have special facilities for testing exceptions, such as JUnit's expected parameter and Perl modules such as Test::Exception, which can even test for warnings.
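For instance, a sketch of such a renamed, passing test using assertThrows (available in JUnit 4.13+ and, as Assertions.assertThrows, in JUnit 5; the setUsageQuantity helper and its message are assumptions made up for illustration):
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertThrows;

import org.junit.Test;

public class UsageQuantityLimitTest {

    // Hypothetical validation logic, invented for this example.
    static void setUsageQuantity(int quantity) {
        if (quantity > 50) {
            throw new IllegalArgumentException("usage quantity must be at most 50");
        }
    }

    @Test
    public void Throws_Exception_When_UsageQuantityMoreThan50() {
        IllegalArgumentException ex = assertThrows(
                IllegalArgumentException.class,
                () -> setUsageQuantity(51));

        // Pin down that it is the right failure, not just any exception.
        assertEquals("usage quantity must be at most 50", ex.getMessage());
    }
}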
Tests should fail during the course of development; it means they're doing their job. You should be suspicious of a test suite which never fails, as it probably has bad coverage. The failing tests will catch changes to public behavior, bugs, and other mistakes, whether in the code or in the tests themselves. But when committed and pushed, the tests should be back to passing.
Finally, there are legitimate cases where you have a known bug or missing feature which cannot be fixed or implemented at this time. Sometimes such bugs get fixed incidentally, so it's still good to write a test for the bug: you want to be notified when the test starts passing, because that means the bug has been fixed. Some testing systems let you write tests which are expected to fail and which will only be flagged if they pass. In Perl this is the TODO (expected failure) mechanism; POSIX testing defines results such as UNRESOLVED, UNSUPPORTED and UNTESTED to cover cases like this.

Related

Verify Unit test by breaking Code on purpose

I want to verify (prove) that my unit tests actually test everything they need to. Specifically, how do I check whether I missed certain asserts?
Take for instance this code:
int AddPositives(int a, int b)
{
    if (a > 0 && b > 0)
        return a + b;
    return -1;
}
And someone wrote a Unit test like so:
[Test]
public void TestAddPositives()
{
    Assert.AreEqual(3, AddPositives(1, 2));
    AddPositives(0, 1);
}
Clearly an assert was missed here, which you might catch in a code-review. But how would you catch this automatically?
So is there something which breaks tested code on purpose to detect missing Asserts? Something which inspects the bytecode and changes constants and deletes code to check whether things can be changed without the Unit test failing.
There are several approaches that can help avoid the problem you have described:
1) The approach you mention (to 'break' the code) is known as mutation testing: create 'mutants' of the system under test and see how many of the mutants are detected by the test suite. A mutant is a modification of the SUT, created for example by replacing operators in the code: a + could be replaced by a - or a *. There are many more ways to create mutants. The English Wikipedia has an article about mutation testing; there you will also find a number of references, some of which list tools that support mutation testing.
Mutation testing may help you detect 'inactive' test cases, but only if you have some reference that indicates which mutants should have been detected.
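To make the idea concrete, here is a hand-written illustration in Java, mirroring the AddPositives example above (real mutation tools, such as PIT for Java, generate and run mutants like this automatically):
public class MutationExample {

    // Original code under test (a Java mirror of the example above).
    static int addPositives(int a, int b) {
        if (a > 0 && b > 0) {
            return a + b;
        }
        return -1;
    }

    // A hand-written "mutant": the + operator has been flipped to -.
    static int addPositivesMutant(int a, int b) {
        if (a > 0 && b > 0) {
            return a - b;
        }
        return -1;
    }

    // A test that merely calls addPositives(0, 1) without asserting on the result
    // behaves identically against both versions, so it cannot "kill" this mutant;
    // the surviving mutant is what the tool reports as a gap in the test suite.
}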
2) Test-first approaches / test-driven development (TDD) also help to avoid the problem you have described: in a test-first scenario, you write the test before you write the code that makes the test succeed. Therefore, after writing the test, the test suite should fail because of the new test.
Your scenario, namely that you forget to add an assertion, would be detected, because after adding your (not yet complete) test your test suite would not fail, but would simply continue to succeed.
However, after the code is implemented, usually additional tests are implemented, for example to also address boundary cases. In these cases, the code is already there and you would then have to temporarily 'break' it to also see the additional tests fail.
3) As was already pointed out by others, coverage analysis can help you detect the lack of tests covering a specific part of the code. There are different types of coverage, like statement coverage, branch coverage, etc. But with a good quality test suite, a piece of code is often covered many times to address boundary cases and other scenarios of interest, so leaving out one test case may still go undetected.
To summarize: while all these approaches can help you somewhat, none of them is bulletproof. Neither is a review, because reviewers also miss things. A review may, however, bring additional benefits, such as suggestions for improving the set of tests or the test code.
Some code coverage tools such as NCrunch (excellent but not free) will annotate your code lines to show whether a test hits them.
In the example you gave NCrunch would show a small black dot next to the "return -1;" line. This would indicate that no existing test passes through that line of code and therefore it is untested.
This is not perfect, however, since you could still write a test that hits that line of code without asserting that it returned -1, so you can't assume that 100% coverage means you have written all the meaningful tests. It can tell you that return -1 is definitely not unit-tested, but it cannot tell you that you failed to test a boundary condition (such as checking what happens when a = 0).

Should my unit tests for existing methods fail?

In TDD or BDD, we start from failing our unit tests, then fix the methods under test to let the unit tests pass.
Often times, at a new job, we need to write unit tests for existing methods. Probably not a good practice, but this does happen. That's the situation I am in now.
So, here is my question: Should I let my unit tests for existing methods fail? Thank you.
You're not dealing with TDD when you're adding unit tests for working code. However, it is still a good idea to make the tests fail when you first write them (for example, in an extremely simple case, if the actual output will be abc, you write the test to expect abd) so that you know that the tests do fail when the output is different from what the test says is expected. Once you've proved that the tests can fail, you can make them pass by fixing the expected output.
The worst situation to be in is to add unit tests to working code that pass when they're first written. Then you modify the code, changing the output, and the unit tests still pass — when they shouldn't. So, make sure your new unit tests do detect problems — which really does mean writing them so they fail at first (but that may mean you have to write them with known-to-be-bogus expected results).
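A minimal sketch of that technique (the format method is made up; the point is the temporarily bogus expectation):
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class FormatterTest {

    // Hypothetical, already-working method that the new test is being added for.
    static String format(String s) {
        return s.toLowerCase();
    }

    @Test
    public void formatsToLowerCase() {
        // Step 1: write the test with a known-bogus expectation ("abd") and watch it fail.
        // Step 2: having proved the test can fail, correct the expectation and watch it pass.
        assertEquals("abc", format("ABC"));
    }
}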
No, you should not try to make your tests fail.
Why does TDD make the test fail first?
The first reason is to ensure that you really are writing your tests before your code. If you write a test and it passes right away, you are writing code before your test. We write failing tests to be sure that we really are writing the tests first. In your case, its too late for that, so this reason doesn't apply.
A second reason is to verify that the tests are correct. The danger of a test is that it will not be testing the functionality that you think it is. Having the test fail for the correct reason gives confidence that the test is actually working. However, in your case you cannot have the test fail for the correct reason: the code works, and the test is supposed to detect whether or not the code is working. So there is no way to write a test that actually fails for the correct reason.
You can, as the other answer suggested, write test code that is wrong, watch it fail, and then correct it. But that test fails because the test itself is wrong; it doesn't really show that your test correctly catches actual errors. At best it shows that your assertions work, but generally we are quite confident that the assertion functions work, and we don't need to constantly retest them.
I don't think you gain much by trying to get your tests to fail when you are adding tests to already working code. So I wouldn't do it.

Unit testing: Is it a good practice to have assertions in setup methods?

In unit testing, the setup method is used to create the objects needed for testing.
In those setup methods, I like using assertions: I know what values I want to see in those objects, and I like to document that knowledge via an assertion.
In a recent post on unit tests calling other unit tests here on stackoverflow, the general feeling seems to be that unit tests should not call other tests:
The answer to that question seems to be that you should refactor your setup, so that test cases do not depend on each other.
But there isn't much difference between a "setup-with-asserts" and a unit test calling other unit tests.
Hence my question: Is it good practice to have assertions in setup methods?
EDIT:
The answer turns out to be: this is not a good practice in general. If the setup results need to be tested, it is recommended to add a separate test method with the assertions (the answer I ticked); for documenting intent, consider using Java asserts.
Instead of assertions in the setup to check the result, I used a simple test (a test method alongside the others, but positioned as the first test method).
I have seen several advantages:
The setup stays short and focused, for readability.
The assertions are run only once, which is more efficient.
Usage and discussion:
For example, I name the method testSetup().
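A minimal JUnit 4 sketch of that layout (the fixture contents are invented for illustration):
import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.Before;
import org.junit.Test;

public class OrderProcessingTest {

    private List<String> orders;

    @Before
    public void setUp() {
        // Keep the setup short, focused and assertion-free.
        orders = new ArrayList<>();
        orders.add("order-1");
        orders.add("order-2");
    }

    // Positioned first in the class: if this fails, fix it before the other failures.
    @Test
    public void testSetup() {
        assertEquals(2, orders.size());
    }

    @Test
    public void testFirstOrderComesFirst() {
        assertEquals("order-1", orders.get(0));
    }
}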
To use it: when I have some test errors in that class, I know that if testSetup() has an error, I don't need to bother with the other errors; I need to fix this one first.
If someone is bothered by this, and wants to make this dependency explicit, the testSetup() could be called in the setup() method. But I don't think it matters. My point is that, in JUnit, you can already have something similar in the rest of your tests:
some tests that test local code,
and some tests that call more global code, which indirectly calls the same code as the previous tests.
When you read the test results and both fail, you already have to deal with this dependency, which is not in the tests but in the code being called: you have to fix the simple test first and then rerun the global test to see if it still fails.
This is the reason why I'm not bothered by the implicit dependency I explained before.
Having assertions in the Setup/TearDown methods is not advisable. It makes the test less readable if the user needs to "understand" that some of the test logic is not in the test method.
There are times when you do not have a choice but to use the setup/teardown methods for something other than what they were intended for.
There is a bigger issue in this question: a test that calls another test is a smell for some problem in your tests.
Each test should test a specific aspect of your code and should only have one or two assertions in it, so if your test calls another test you might be testing too many things in that test.
For more information read: Unit Testing: One Test, One Assertion - Why It Works
They're different scenarios; I don't see the similarity.
Setup methods should contain code that is common to (ideally) all tests in a fixture. As such, there's nothing inherently wrong with putting asserts in a test setup method if certain things must be true before the rest of the test code executes. The setup is an extension of the test; it is part of the test as a whole. If the assert trips, people will discover which pre-requisite failed.
On the other hand, if the setup is complicated enough that you feel the need to assert it is correct, it may be a warning sign. Furthermore, if all tests do not require the setup's full output, then it is a sign that the fixture has poor cohesion and should be split up based on scenarios and/or refactored.
It's partly because of this that I tend to stay away from using Setup methods. Where possible, I use private factory methods or similar to set things up. It makes the test more readable and avoids confusion. Sometimes this is not practical (e.g. working with tightly coupled classes and/or when writing integration tests), but for the majority of my tests it does the job.
Follow your heart / blink decisions. Asserts within a Setup method can document intent and improve readability. So personally I'd back you up on this.
It is different from a test calling other tests - which is bad. No test isolation. A test should not influence the outcome of another test.
Although it is not a frequent use case, I sometimes use asserts inside a Setup method so that I know when the test setup has not taken place as I intended, usually when I'm dealing with components that I didn't write myself. An assertion failure which reads 'Setup failed!' in the errors tab quickly helps me zero in on the setup code instead of having to look at a bunch of failed tests.
A Setup failure should usually cause all tests in that fixture to fail, which is a smell that your nose should soon pick up: 'all tests failed' usually implies the Setup broke. So assertions are not always needed. That said, be pragmatic: look at your specific context and 'add to taste.'
I use Java asserts, rather than JUnit ones, in the cases where something like this is necessary, e.g. when you use some other utility class to set up test data:
byte[] pkt = pktFactory.makePacket(TIME, 12, "23, F2");
assert pkt.length == 15;
Failing has the implication 'system is not in a state to even try to run this test'.

Should failing tests make the continuous build fail?

If one has a project with tests that are executed as part of the build procedure on a build machine, and a set of tests fails, should the entire build fail?
What are the things one should consider when answering that question? Does it matter which tests are failing?
Background information that prompted this question:
Currently I am working on a project that has NUnit tests that are run as part of the build procedure and are executed on our CruiseControl.NET build machine.
The project used to be set up so that if any tests fail, the build fails. The reasoning was that if the tests fail, the product is not working / not complete / the project has failed, and hence the build should fail.
We have added some tests that fail but are not crucial to the project (see below for more details). So if those tests fail, the project is not a complete failure, and we would still want it to build.
One of the tests that passes verifies that incorrect arguments result in an exception; the test that does not pass is the one that checks that all the allowed arguments do not result in an exception. So the class rejects all invalid cases, but also rejects some valid ones. This is not a problem for the project, since the rejected valid arguments are fringe cases on which the application will not rely.
If it's in any way doable, then do it: fail the build. It greatly reduces the broken-windows problem:
In a system with no (visible) flaws, introducing a small flaw is usually seen as a very bad idea. So if you've got a project with a green status (no unit test fails) and you introduce the first failing test, then you (and/or your peers) will be motivated to fix the problem.
If, on the other hand, there are known-failing tests, then adding just another broken test is seen as keeping the status quo.
Therefore you should always strive to keep all tests running (and not just "most of them"). And treating every single failing test as reason for failing the build goes a long way towards that goal.
If a unit test fails, some code is not behaving as expected. So the code shouldn't be released.
Although you can still produce the build for testing/bugfixing purposes.
If you felt that a case was important enough to write a test for, then if that test is failing, the software is failing. Based on that alone, yes, it should consider the build a failure and not continue. If you don't use that reasoning, then who decides what tests are not important? Where is the line between "if this fails it's ok, but if that fails it's not"? Failure is failure.
I think a nice setup like yours should always build successfully, with all unit tests passing.
Like Gamecat said, the build itself succeeds, but this code should never go to production.
Imagine one of your team members introducing a bug in the code which that one unit test (which always fails) covers. It won't be discovered by the test since you allow that one test to always fail.
In our team we have a simple policy: if all tests don't pass, we don't go to production with the code. This is also very simple for our project manager to understand.
In my opinion it really depends on your unit tests...
if your unit tests are really UNIT tests (like they should be => "reference to endless books ;)" )
then the build should fail, because something is not behaving as it should...
but too often (unfortunately), in so many projects these unit tests only cover some 'edges' and/or are integration tests, and then the build should go on
(yes, this is a subjective answer ;)
in short:
do you know the unit tests to be fine: fail;
else: build on
The real problem is with your failing tests. You should not have a unit test where it's OK to fail because it's an edge case. Decide whether the edge case is important or not - if not then delete the unit test; if yes then fix the code.
Like some of the other answers implied, it's definitely a code smell when unit tests fail. If you live with the smell, then you're less likely to spot the next problem.
All the answers have been great, here is what I decided to do:
Make the tests that are not crucial (or, if need be, split a failing test) be ignored by NUnit (I remembered this feature after asking the question). This allows:
The build can fail if any tests fail, hence reducing the smelliness
The tests that are ignored have to be defended to the project manager (whoever is in charge)
Any tests that are ignored are marked in a special way
I think that is the best compromise, forcing people to fix the tests, but not necessarily right away (but they have to defend their decision of not fixing it now since everyone knows what they did).
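(This project used NUnit, which has an Ignore attribute for exactly this; for reference, the same idea in JUnit 4 looks roughly like the sketch below, with a made-up test name and reason.)
import org.junit.Ignore;
import org.junit.Test;

public class FringeCaseTest {

    // Skipped rather than failing: the reason is recorded and shows up in the
    // test report, so the decision to postpone the fix stays visible to the team.
    @Ignore("Valid fringe-case arguments are currently rejected; see the tracker before removing")
    @Test
    public void acceptsAllValidArguments() {
        // ... the currently failing assertions would live here ...
    }
}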
What I actually did: fixed the broken tests.

Unit Testing without Assertions

Occasionally I come across a unit test that doesn't assert anything. The particular example I came across this morning was testing that a log file got written to when a condition was met. The assumption was that if no error was thrown, the test passed.
I personally don't have a problem with this, however it seems to be a bit of a "code smell" to write a unit test that doesn't have any assertions associated with it.
Just wondering what people's views on this are?
It's simply a very minimal test, and should be documented as such. It only verifies that it doesn't explode when run. The worst part about tests like this is that they present a false sense of security. Your code coverage will go up, but it's illusory. Very bad odor.
This would be the official way to do it (with xUnit.net's Record.Exception):
// Act
Exception ex = Record.Exception(() => someCode());
// Assert
Assert.Null(ex);
If there is no assertion, it isn't a test.
Quit being lazy -- it may take a little time to figure out how to get the assertion in there, but well worth it to know that it did what you expected it to do.
These are known as smoke tests and are common. They're basic sanity checks. But they shouldn't be the only kinds of tests you have. You'd still need some kind of verification in another test.
Such a test smells. It should check that the file was written to, or at the very least that the modified time was updated.
I've seen quite a few tests written this way that ended up not testing anything at all, i.e. the code didn't work, but it didn't blow up either.
If you have some explicit requirement that the code under test doesn't throw an exception and you want to explicitly call out this fact (tests as requirements docs) then I would do something like this:
try
{
    unitUnderTest.DoWork();
}
catch
{
    Assert.Fail("code should never throw exceptions but failed with ...");
}
... but this still smells a bit to me, probably because it's trying to prove a negative.
In some sense, you are making an implicit assertion - that the code doesn't throw an exception. Of course it would be more valuable to actually grab the file and find the appropriate line, but I suppose something's better than nothing.
It can be a good pragmatic solution, especially if the alternative is no test at all.
The problem is that the test would pass if all the functions called were no-ops. But sometimes it just isn't feasible to verify the side effects are what you expected. In the ideal world there would be enough time to write the checks for every test ... but I don't live there.
The other place I've used this pattern is for embedding some performance tests in with unit tests because that was an easy way to get them run every build. The tests don't assert anything, but measure how long the test took and log that.
The name of the test should document this.
void TestLogDoesNotThrowException(void) {
    log("blah blah");
}
How does the test verify that the log is written if there is no assertion?
In general, I see this occurring in integration testing, where just the fact that something ran to completion is good enough. In that case I'm cool with it.
I guess if I saw it over and over again in unit tests I would be curious as to how useful the tests really were.
EDIT: In the example given by the OP, there is some testable outcome (the logfile result), so assuming it worked just because no error was thrown is lazy.
We do this all the time. We mock our dependencies using JMock, so I guess in a sense the JMock framework is doing the assertion for us... but it goes something like this. We have a controller that we want to test:
class Controller {
    private Validator validator;

    public void control() {
        validator.validate();
    }

    public void setValidator(Validator validator) {
        this.validator = validator;
    }
}
Now, when we test Controller we don't want to test Validator, because it has its own tests. So we have a test with JMock just to make sure we call validate:
public void testControlShouldCallValidate() {
    mockValidator.expects(once()).method("validate");
    controller.control();
}
And that's all; there is no visible "assertion", but if you call control and the "validate" method is not called, the JMock framework throws an exception (something like "expected method not invoked").
We have those all over the place. It's a little backwards, since you basically set up your assertion and THEN make the call to the tested method.
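For comparison, with a more recent mocking library such as Mockito the verification comes after the call rather than before. This is only a sketch, reusing the Controller from the snippet above and assuming Validator is a plain interface:
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.Test;

public class ControllerTest {

    @Test
    public void controlShouldCallValidate() {
        Validator validator = mock(Validator.class);
        Controller controller = new Controller();
        controller.setValidator(validator);

        controller.control();

        // The assertion is implicit in the verification: the test fails
        // if validate() was never called on the mock.
        verify(validator).validate();
    }
}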
I've seen something like this before and I think this was done just to prop up code coverage numbers. It's probably not really testing code behaviour. In any case, I agree that it (the intention) should be documented in the test for clarity.
I sometimes use my unit testing framework of choice (NUnit) to build methods that act as entry points into specific parts of my code. These methods are useful for profiling performance, memory consumption and resource consumption of a subset of the code.
These methods are definitely not unit tests (even though they're marked with the [Test] attribute) and are always flagged to be ignored and explicitly documented when they're checked into source control.
I also occasionally use these methods as entry points for the Visual Studio debugger. I use Resharper to step directly into the test and then into the code that I want to debug. These methods either don't make it as far as source control, or they acquire their very own asserts.
My "real" unit tests are built during normal TDD cycles, and they always assert something, although not always directly - sometimes the assertions are part of the mocking framework, and sometimes I'm able to refactor similar assertions into a single method. The names of those refactored methods always start with the prefix "Assert" to make it obvious to me.
I have to admit that I have never written a unit test that verified I was logging correctly. But I did think about it and came across this discussion of how it could be done with JUnit and Log4J. It's not too pretty, but it looks like it would work.
Tests should always assert something, otherwise what are you proving and how can you consistently reproduce evidence that your code works?
I would say that a test with no assertions indicates one of two things:
a test that isn't testing the code's important behavior, or
code without any important behaviors, that might be removed.
Thing 1
Most of the comments in this thread are about thing 1, and I would agree that if code under test has any important behavior, then it should be possible to write tests that make assertions about that behavior, either by
asserting on a function/method return value,
asserting on calls to 'test double' dependencies, or
asserting on changes to visible state.
If the code under test has important behavior, but there aren't assertions on the correctness of that behavior, then the test is deficient.
Your question appears to belong in this category. The code under test is supposed to log when a condition is met. So there are at least two tests:
Given that the condition is met, when we call the method, then does the logging occur?
Given that the condition is not met, when we call the method, then does the logging not occur?
The test would need a way to arrange the state of the code so that the condition was or was not met, and it would need a way to confirm that the logging either did or did not occur, probably with some logging 'test double' that just recorded the logging calls (people often use mocking frameworks for this.)
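One way to sketch those two tests with a hand-rolled logging 'test double' (every class and method name here is invented; a mocking framework would do the same job):
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

public class AuditLoggingTest {

    // Hypothetical collaborator and system under test.
    interface Logger {
        void info(String message);
    }

    static class RecordingLogger implements Logger {
        final List<String> messages = new ArrayList<>();

        @Override
        public void info(String message) {
            messages.add(message);
        }
    }

    static class Processor {
        private final Logger logger;

        Processor(Logger logger) {
            this.logger = logger;
        }

        void process(int amount) {
            if (amount > 100) {  // "the condition"
                logger.info("large amount: " + amount);
            }
        }
    }

    @Test
    public void logsWhenConditionIsMet() {
        RecordingLogger logger = new RecordingLogger();
        new Processor(logger).process(150);
        assertEquals(1, logger.messages.size());
    }

    @Test
    public void doesNotLogWhenConditionIsNotMet() {
        RecordingLogger logger = new RecordingLogger();
        new Processor(logger).process(5);
        assertTrue(logger.messages.isEmpty());
    }
}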
Thing 2
So how about those other tests, that lack assertions, but it's because the code under test doesn't do anything important? I would say that a judgment call is required. In large code bases with high code velocity (many commits per day) and with many simultaneous contributors, it is necessary to deliver code incrementally in small commits. This is so that:
your code reviewers are not overwhelmed by large complicated commits
you avoid merge conflicts
it is easy to revert your commit if it causes a fault.
In these situations, I have added 'placeholder' classes, which don't do anything interesting, but which provide the structure for the implementation that will follow. Adding this class now, and even using it from other classes, can help show reviewers how the pieces will fit together even if the important behavior of the new class is not yet implemented.
So, if we assume that such placeholders are appropriate to add, should we test them? It depends. At the least, you will want to confirm that the class is syntactically valid, and perhaps that none of its incidental behaviors cause uncaught exceptions.
For example:
Python is an interpreted language, and so your continuous build may not have a way to confirm that your placeholder class is syntactically valid unless it executes the code as part of a test.
Your placeholder may have incidental behavior, such as logging statements. These behaviors are not important enough to assert on because they are not an essential part of the class's behavior, but they are potential sources of exceptions. Most test frameworks treat uncaught exceptions as errors, and so by executing this code in a test, you are confirming that the incidental behavior does not cause uncaught exceptions.
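A sketch of what such a temporary, deliberately assertion-free test might look like (the placeholder class is invented for illustration):
import org.junit.Test;

public class ReportGeneratorPlaceholderTest {

    // Hypothetical placeholder: structure only, no important behavior yet.
    static class ReportGenerator {
        void generate() {
            // Real implementation to follow in a later commit.
        }
    }

    // Deliberately assertion-free: it only confirms the placeholder can be
    // constructed and invoked without an uncaught exception, and it should be
    // replaced with real assertions once the behavior is implemented.
    @Test
    public void smokeTest() {
        new ReportGenerator().generate();
    }
}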
Personally I believe that this reasoning supports the temporary inclusion of assertion-free tests in a code base. That said, the situation should be temporary, and the placeholder class should soon receive a more complete implementation, or it should be removed.
As a final note, I don't think it's a good idea to include asserts on incidental behavior just to satisfy a formalism that 'all tests must have assertions'. You or another author may forget to remove these formalistic assertions, and then they will clutter the tests with assertions of non-essential behavior, distracting focus from the important assertions. Many of us are probably familiar with the situation where you come upon a test, and you see something that looks like it doesn't belong, and we say, "I'd really like to remove this...but it makes no sense why it's there. So it might be there for some potentially obscure and important reason that the original author forgot to document. I should probably just leave it so that I 1) respect the intentions of the original author, and 2) don't end up breaking anything and making my life more difficult." (See Chesterton's fence.)