How to handle scenarios that "should not happen" in Spock? - unit-testing

I am wondering what the best practice is in Spock for scenarios that are exceptional or undefined situations, ones that should normally not happen.
EXAMPLE:
Let's say that in our API there are two mutually exclusive fields. When both are provided, the result of some calculation is undefined, which in practice boils down to some arbitrary handling, like giving precedence to one of them.
On a higher level this will eventually lead to an exception and a proper HTTP response. But in some local (unit) area a piece of code will still be executed against both fields.
When writing unit tests, I'd also like to cover (record) the current behaviour for this (at the moment) impossible corner case.
So far I have just resorted to a proper comment, either in the test case name or somewhere else. But the question is: what would be the best/idiomatic Spock practice for documenting those scenarios? I mean something similar to using || to separate the expected values in a where: clause.
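For illustration, here is a minimal sketch of the kind of thing meant above (the FieldPrecedenceSpec class, the resolve method and the field values are invented, not taken from any real code): an extra, purely descriptive column in the where: table that labels the undefined case and shows up in the unrolled test names.

import spock.lang.Specification
import spock.lang.Unroll

class FieldPrecedenceSpec extends Specification {

    // Hypothetical unit under test: field A takes precedence when both are set.
    static String resolve(String fieldA, String fieldB) {
        fieldA != null ? fieldA : fieldB
    }

    @Unroll
    def "resolve(#fieldA, #fieldB) == #expected [#note]"() {
        expect:
        resolve(fieldA, fieldB) == expected

        where:
        fieldA | fieldB || expected | note
        'a'    | null   || 'a'      | 'normal: only A provided'
        null   | 'b'    || 'b'      | 'normal: only B provided'
        'a'    | 'b'    || 'a'      | 'should not happen: both provided, current code lets A win'
    }
}

The note column adds no verification; it only records intent in the data table and in the reported test names, which seems close to what the question is after.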

Related

Is it ok to use spec/valid? for function input validation at runtime?

I am writing a Clojure library, and I am thinking of using clojure.spec. Is it good practice to use spec/valid? on function inputs at runtime, or should I use destructuring along with type hints? I am concerned that there might be a performance penalty, that it might be considered a bad use of spec, and that it might generally be bad practice.
It is always appropriate to check for valid inputs "where appropriate". What does that mean? That is an open question.
In general, the most "dangerous" inputs will be coming from the outside world, at the boundaries of your program. This means input from the web/browser, or perhaps from another service.
The "safest" inputs are deep inside your program, where (presumably) you have more control and confidence. For example, you would (probably) never check the inputs to the + function to ensure each arg was a number. And, even if an invalid arg slipped by (a string, perhaps), the JVM would throw an exception right away.
For sure, you should have maximal checking when running unit tests, so at least you know your code is working properly in the "normal" path. I like to use Plumatic Schema for this as it is easy to have validation always run during unit tests, but is off by default during "production" runs.
Having said that, I often place assert statements at the beginning of functions where a bad input would be difficult to recognize or would really cause hard-to-diagnose problems later on.
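A minimal sketch of that second technique, here in Groovy rather than Clojure (the monthlyRate function and its bounds are invented for illustration):

// Guard against an input that would otherwise only surface as a confusing
// failure much further downstream.
BigDecimal monthlyRate(BigDecimal annualRatePercent) {
    assert annualRatePercent != null : 'annual rate must be provided'
    assert annualRatePercent >= 0.0 : "rate must not be negative, got ${annualRatePercent}"
    assert annualRatePercent <= 100.0 : "rate is a percentage, got ${annualRatePercent}"
    return annualRatePercent / 100.0 / 12.0
}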
Both of these techniques have saved me many times!
Note that even a strongly typed language like Java won't save you here. Typed variables may ensure that an input is a number, but it could be invalid like sending a negative value to the sqrt function. Other functions may have even more restrictive inputs, such as only being valid for prime numbers, etc. Types cannot capture these detailed requirements. Only good design and good testing can prevent these problems.
Types & asserts cannot prevent all problems, but they are like guard rails on the road (or seatbelts & airbags). They may not be able to prevent a crash, but they can provide early warning and reduce the severity of the consequences.
P.S. You can see how I like to balance the trade-off between safety & cost of checking in a library I wrote. Be sure to see the file _bootstrap.clj, which is a trick to turn on the Plumatic Schema tests only during unit tests.

Should we unit test what a method actually does, or what it should do?

The question might seem a bit weird, but I'll explain it.
Consider the following:
We have a service FirstNameValidator, which I created for other developers so they have a consistent way to validate a person's first name. I want to test it, but because the full set of possible inputs is infinite (or very, very big), I only test a few cases:
Assert.IsTrue(FirstNameValidator.Validate("John"));
Assert.IsFalse(FirstNameValidator.Validate("$$123"));
I also have LastNameValidator, which is 99% identical, and I wrote a test for it too:
Assert.IsTrue(LastNameValidator.Validate("Doe"));
Assert.IsFalse(LastNameValidator.Validate("__%%"));
But later a new structure appeared - PersonName, which consists of a first name and a last name. We want to validate it too, so I create a PersonNameValidator. Obviously, for reusability I just call FirstNameValidator and LastNameValidator. Everything is fine until I want to write a test for it.
What should I test?
The fact that FirstNameValidator.Validate was actually called with the correct argument?
Or do I need to create a few cases and test them?
That is actually the question - should we test what the service is expected to do? It is expected to validate PersonName; how it does that we don't actually care about. So we pass a few valid and invalid inputs and expect the corresponding return values.
Or, maybe, what it actually does? It actually just calls the other validators, so test that (a .NET mocking framework allows it).
Unit tests should be acceptance criteria for a properly functioning unit of code...
They should test what the code should and shouldn't do; you will often find corner cases when you are writing tests.
If you refactor code, you often will have to refactor tests... This should be viewed as part of the original effort, and should bring glee to your soul as you have made the product and process an improvement of such magnitude.
Of course, if this is a library with outside (or internal, depending on company culture) consumers, you have documentation to consider before you are completely done.
Edit: also, those tests are pretty weak. You should have a definition of what is legal in each, and actually test inclusion and exclusion of at least all of the classes of glyphs... they can still use related code for testing... i.e. isValidUsername(name, allowsSpace) could work for both first name and whole name, depending on whether spaces are allowed.
You have formulated your question a bit strangely: Both options that you describe would test that the function behaves as it should - but in each case on a different level of granularity. In one case you would test the behaviour based on the API that is available to a user of the function. Whether and how the function implements its functionality with the help of other functions/components is not relevant. In the second case you test the behaviour in isolation, including the way the function interacts with its depended-on components.
On a general level it is not possible to say which is better - depending on the circumstances, each option may be the best. In general, isolating a piece of software usually requires more effort to implement the tests and makes the tests more fragile against implementation changes. That means going for isolation should only be done in situations where there are good reasons for it. Before getting to your specific case, I will describe some situations where isolation is recommendable.
With the original depended-on component (DOC), you may not be able to test everything you want. Assume your code does error handling for the case where the DOC returns an error code. But, if the DOC can not easily be made to return an error code, it is difficult to test your error handling. In this case, if you double the DOC you could make the double return an error code, and thus also test your error handling.
The DOC may have non-deterministic or system-specific behaviour. Some examples are random number generators or date and time functions. If this makes testing your functions difficult, it would be an argument to replace the DOC with some double, so you have control over what is provided to your functions.
The DOC may require a very complex setup. Imagine a complex data base or some complex xml document that needs to be provided. For one thing, this can make your setup quite complicated, for another, your tests get fragile and will likely break if the DOC changes (think that the XML schema changes...).
The setup of the DOC or the calls to the DOC are very time consuming (imagine reading data from a network connection, computing the next chess move, solving the TSP, ...). Or, the use of the DOC prolongs compilation or linking significantly. With a double you can possibly shorten the execution or build time significantly, which is the more interesting the more often you are executing / building the tests.
You may not have a working version of the DOC - possibly the DOC is still under development and is not yet available. Then, with doubles you can start testing nevertheless.
The DOC may be immature, such that with the version you have, your tests are unstable. In such a case it is likely that you lose trust in your test results and start ignoring failing tests.
The DOC itself may have other dependencies which have some of the problems described above.
These criteria can help you come to an informed decision about whether isolation is necessary. Considering your specific example: the way you have described the situation, I get the impression that none of the above criteria is fulfilled. That leads me to the conclusion that I would not isolate the function PersonNameValidator from its DOCs FirstNameValidator and LastNameValidator.
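To make the non-isolated option concrete, here is a rough Spock sketch of how PersonNameValidator could be exercised purely through its public API (the stub validators and the validate(first, last) signature are assumed for the example, not taken from the question):

import spock.lang.Specification

// Tiny stand-ins for the validators from the question, just so the sketch runs.
class FirstNameValidator { static boolean validate(String s) { s ==~ /[A-Za-z]+/ } }
class LastNameValidator  { static boolean validate(String s) { s ==~ /[A-Za-z]+/ } }
class PersonNameValidator {
    static boolean validate(String first, String last) {
        FirstNameValidator.validate(first) && LastNameValidator.validate(last)
    }
}

class PersonNameValidatorSpec extends Specification {
    def "validates the whole person name through the public API"() {
        expect:
        PersonNameValidator.validate(first, last) == valid

        where:
        first   | last   || valid
        'John'  | 'Doe'  || true
        '$$123' | 'Doe'  || false
        'John'  | '__%%' || false
    }
}

The interaction-based alternative would instead inject mock validators and assert that they are called, at the cost of coupling the test to the implementation.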

Is a postcondition a (type of) unit test?

I'm trying to incorporate some design-by-contract techniques into my coding style. Postconditions look a lot to me like embedded unit tests and I'm wondering if my thinking here is on the right track or way off-base.
Wikipedia defines a postcondition as "a condition or predicate that must always be true just after the execution of some section of code or after an operation in a formal specification. Postconditions are sometimes tested using assertions within the code itself".
Is that not very similar to what you do in a unit test that verifies state directly (doesn't use mocks)?
If that's the case:
1) By using post-conditions, aren't I now sort of embedding testing code in my production code, and isn't that frowned upon?
2) Should using postconditions change the structure of my unit tests? My first thought is that the assertion logic is moved from the tests to the postconditions. That is, tests will use the same inputs and I'm still testing everything I was testing before, but now instead of making assertions in the unit tests I'm making a simple binary assertion about the postconditions passing or not.
3) My second thought is that postcondition code might have control flow and is therefore not ideal for test code, which is supposed to be simple and avoid control flow. But, if I test the postconditions, can I then rely on them in my unit tests?
4) It seems difficult to test postconditions because if I understand them correctly they basically pass or fail and you would have to repeat the logic of the postcondition itself to check that it did the right thing. So, how do you test a postcondition? Do you check them by not utilizing them in your unit testing and ensuring your unit tests and postconditions pass or fail together?
5) My unit tests sometimes verify that a method has caused changes to state in collaborators. In standard practice, do postconditions cover collaborator state or just the state of the class they are defined on?
You are on the right track.
It is true that post-conditions serve a similar purpose to unit tests. The key difference is that the post-condition always runs, while the unit test only runs against a known set of data. This means that the post-condition is less likely to overlook the corner case you didn't think of, but is more expensive at run time.
Here are answers to your specific questions.
There is a run-time penalty to post-conditions. However (depending on your environment), it may be possible to drop assertions for speed. (In C you can use an #ifdef, in Java look up AOP, in Python assert statements are skipped when you run with the -O flag, etc.) Should you get a performance problem from your assertions, it is solvable. However, my preference would be to leave them on until you have a reason not to.
Some of your logic will naturally move from the unit test to the post-condition. However it is worthwhile to make sure that you have unit tests that run through all of the cases of interest for your post-condition. This is particularly true if you are dropping assertions in production for speed.
Post-conditions are not unit tests. Write them in whatever way makes sense for what they do. (In general they should be somewhat simple.)
In general you test post-conditions as described in #2, by passing in a set of inputs of interest where the post-condition might possibly be violated, and check that it isn't. If you want to test the logic of the post-condition itself, then you can set up code that can violate the post-condition, but which will only run during tests. For instance have a global variable that tests can set which, if it is set, replaces the data to be returned with whatever you want. Now you can cause the post-condition to receive any input you want.
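A rough Groovy sketch of that test-only hook (the Sorter class and the override field are invented for illustration):

import spock.lang.Specification

class Sorter {
    // Test-only hook: when set, it replaces the real result so a test can
    // exercise the post-condition itself.
    static List<Integer> overrideResultForTests = null

    static List<Integer> sortAscending(List<Integer> input) {
        List<Integer> result = overrideResultForTests ?: input.sort(false)
        // Post-condition: the output really is in ascending order.
        assert result == result.sort(false) : "post-condition violated: ${result}"
        return result
    }
}

class SorterSpec extends Specification {
    def "the post-condition fires when the result is deliberately corrupted"() {
        given:
        Sorter.overrideResultForTests = [3, 1, 2]

        when:
        Sorter.sortAscending([1, 2, 3])

        then:
        thrown(AssertionError)

        cleanup:
        Sorter.overrideResultForTests = null
    }
}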
I'm not going to give you a hard and fast rule. They are your contracts. They should say what makes sense for what the function is doing. That said, what you are describing can lead to tight coupling between those objects. Tight coupling is something you should only do with good reason.
Contracts aren't a form of unit-testing. Rather they're a way of specifying (in an executable format) what conditions should hold before and after a particular function or method is called, and may also specify invariants of objects.
You still need tests when you have contracts, since just because you've specified what the functions are supposed to do doesn't mean they'll actually do it. But you'll find that your contracts will help you debug: having code that checks that what's happening at run-time is what was expected means that any logic or programming error will cause a failure near the code that contains the error.
You may find that with contracts you're happy to have fewer smaller tests and more larger-scale tests since the contracts will let you narrow down the source of an error even if the test is broad. Also, there's less need for unit tests to play the role of a specification of how the logic is supposed to work, further limiting the value of the smaller tests.
Contracts are like assertions in that you may choose to or choose not to have them enabled in production code. My opinion is that contracts tend to be more expensive than assertions and so you'll tend to have them disabled in production.
As with any methodology or coding style - there is no single correct answer. However, one thing I found to be true so far is that there is never a 'one size fits all' solution.
So, if you implement these assertions in the logic of every single postcondition in your design, I'd consider that wrong.
My own opinion is that such assertions should be used only if failure to meet a postcondition leads the entire system into a dangerously inconsistent state. So, if something like that happens, I'd definitely like the system to do something like send an email/SMS to an admin, halt production execution, run diagnostics, or whatever should be done for that particular system. Note that this would be an actual feature whose purpose is increased safety; it is not unit test code.
On the other hand, if you're coding assertions after every single method call, then, as you noticed, the only thing you are doing is hardcoding test cases into production code. That doesn't serve any real purpose, other than to make your codebase a big mess.

Does YAGNI also apply when writing tests?

When I write code I only write the functions I need as I need them.
Does this approach also apply to writing tests?
Should I write a test in advance for every use-case I can think of just to play it safe or should I only write tests for a use-case as I come upon it?
I think that when you write a method you should test both expected and potential error paths. This doesn't mean that you should expand your design to encompass every potential use -- leave that for when it's needed, but you should make sure that your tests have defined the expected behavior in the face of invalid parameters or other conditions.
YAGNI, as I understand it, means that you shouldn't develop features that are not yet needed. In that sense, you shouldn't write a test that drives you to develop code that's not needed. I suspect, though, that's not what you are asking about.
In this context I'd be more concerned with whether you should write tests that cover unexpected uses -- for example, errors due to passing null or out-of-range parameters -- or repeating tests that only differ with respect to the data, not the functionality. In the former case, as I indicated above, I would say yes. Your tests will document the expected behavior of your method in the face of errors. This is important information for people who use your method.
In the latter case, I'm less able to give you a definitive answer. You certainly want your tests to remain DRY -- don't write a test that simply repeats another test even if it has different data. On the other hand, you may not discover potential design issues unless you exercise the edge cases of your data. A simple example is a method that computes the sum of two integers: what happens if you pass it maxint as both parameters? If you only have one test, then you may miss this behavior. Obviously, this is related to the previous point. Only you can be sure when a test is really needed or not.
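For instance, a small Spock sketch of that maxint case (the add method is a stand-in written for the example):

import groovy.transform.CompileStatic
import spock.lang.Specification

class AdderSpec extends Specification {
    // Trivial unit under test, invented for the example; @CompileStatic keeps
    // plain Java int semantics, so overflow wraps around silently.
    @CompileStatic
    static int add(int a, int b) { a + b }

    def "edge cases raise design questions a single happy-path test would miss"() {
        expect:
        add(1, 2) == 3
        add(Integer.MAX_VALUE, Integer.MAX_VALUE) == -2   // silent overflow: is that acceptable?
    }
}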
Yes YAGNI absolutely applies to writing tests.
As an example, I, for one, do not write tests to check any Properties. I assume that properties work a certain way, and until I come to one that does something different from the norm, I won't have tests for them.
You should always consider the validity of writing any test. If there is no clear benefit to you in writing the test, then I would advise that you don't. However, this is clearly very subjective, since what you might think is not worth it someone else could think is very worth the effort.
Also, would I write tests to validate input? Absolutely. However, I would do it only up to a point. Say you have a function with 3 parameters that are ints and it returns a double. How many tests are you going to write around that function? I would use YAGNI here to determine which tests are going to get you a good ROI, and which are useless.
Write the test as you need it. Tests are code. Writing a bunch of (initially failing) tests up front breaks the red/fix/green cycle of TDD, and makes it harder to identify valid failures vs. unwritten code.
You should write the tests for the use cases you are going to implement during this phase of development.
This gives the following benefits:
Your tests help define the functionality of this phase.
You know when you've completed this phase because all of your tests pass.
You should write tests that cover all your code, ideally. Otherwise, the rest of your tests lose value, and you will in the end debug that piece of code repeatedly.
So, no. YAGNI does not include tests :)
There is of course no point in writing tests for use cases you're not sure will get implemented at all - that much should be obvious to anyone.
For use cases you know will get implemented, test cases are subject to diminishing returns, i.e. trying to cover each and every possible obscure corner case is not a useful goal when you can cover all important and critical paths with half the work - assuming, of course, that the cost of overlooking a rarely occurring error is endurable; I would certainly not settle for anything less than 100% code and branch coverage when writing avionics software.
You'll probably get some variance here, but generally, the goal of writing tests (to me) is to ensure that all your code is functioning as it should, without side effects, in a predictable fashion and without defects. In my mind, then, the approach you discuss of only writing tests for use cases as you come upon them does you no real good, and may in fact cause harm.
What if the particular use case for the unit under test that you ignore causes a serious defect in the final software? Has the time spent developing tests bought you anything in this scenario beyond a false sense of security?
(For the record, this is one of the issues I have with using code coverage to "measure" test quality -- it's a measurement that, if low, may give an indication that you're not testing enough, but if high, should not be used to assume that you are rock-solid. Get the common cases tested, the edge cases tested, then consider all the ifs, ands and buts of the unit and test them, too.)
Mild Update
I should note that I'm coming from possibly a different perspective than many here. I often find that I'm writing library-style code, that is, code which will be reused in multiple projects, for multiple different clients. As a result, it is generally impossible for me to say with any certainty that certain use cases simply won't happen. The best I can do is either document that they're not expected (and hence may require updating the tests afterward), or -- and this is my preference :) -- just write the tests. I often find option #2 is far more livable on a day-to-day basis, simply because I have much more confidence when I'm reusing component X in new application Y. And confidence, in my mind, is what automated testing is all about.
You should certainly hold off writing test cases for functionality you're not going to implement yet. Tests should only be written for existing functionality or functionality you're about to put in.
However, use cases are not the same as functionality. You only need to test the valid use cases that you've identified, but there's going to be a lot of other things that might happen, and you want to make sure those inputs get a reasonable response (which could well be an error message).
Obviously, you aren't going to get all the possible use cases; if you could, there'd be no need to worry about computer security. You should get at least the more plausible ones, and as problems come up you should add them to the use cases to test.
I think the answer here is, as it is in so many places, it depends. If the contract that a function presents states that it does X, and I see that it's got associated unit tests, etc., I'm inclined to think it's a well-tested unit and use it as such, even if I don't use it that exact way elsewhere. If that particular usage pattern is untested, then I might get confusing or hard-to-trace errors. For this reason, I think a test should cover all (or most) of the defined, documented behavior of a unit.
If you choose to test more incrementally, I might add to the doc comments that the function is "only tested for [certain kinds of input], results for other inputs are undefined".
I frequently find myself writing tests, TDD, for cases that I don't expect the normal program flow to invoke. The "fake it 'til you make it" approach has me starting, generally, with a null input - just enough to have an idea in mind of what the function call should look like, what types its parameters will have and what type it will return. To be clear, I won't just send null to the function in my test; I'll initialize a typed variable to hold the null value; that way when Eclipse's Quick Fix creates the function for me, it already has the right type. But it's not uncommon that I won't expect the program normally to send a null to the function. So, arguably, I'm writing a test that I AGN. But if I start with values, sometimes it's too big a chunk. I'm both designing the API and pushing its real implementation from the beginning. So, by starting slow and faking it 'til I make it, sometimes I write tests for cases I don't expect to see in production code.
If you're working in a TDD or XP style, you won't be writing anything "in advance" as you say, you'll be working on a very precise bit of functionality at any given moment, so you'll be writing all the necessary tests in order make sure that bit of functionality works as you intend it to.
Test code is similar to "code" itself: you won't be writing code in advance for every use case your app has, so why would you write test code in advance?

Unit Testing without Assertions

Occasionally I come across a unit test that doesn't assert anything. The particular example I came across this morning was testing that a log file got written to when a condition was met. The assumption was that if no error was thrown, the test passed.
I personally don't have a problem with this, however it seems to be a bit of a "code smell" to write a unit test that doesn't have any assertions associated with it.
Just wondering what people's views on this are?
It's simply a very minimal test, and should be documented as such. It only verifies that it doesn't explode when run. The worst part about tests like this is that they present a false sense of security. Your code coverage will go up, but it's illusory. Very bad odor.
This would be the official way to do it:
// Act
Exception ex = Record.Exception(() => someCode());
// Assert
Assert.Null(ex);
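In Spock, the same intent can be expressed with the built-in noExceptionThrown() condition; a minimal sketch (someCode() is just a stand-in for whatever is under test):

import spock.lang.Specification

class SomeCodeSpec extends Specification {
    // Stand-in for the code under test.
    static void someCode() { }

    def "someCode completes without throwing"() {
        when:
        someCode()

        then:
        noExceptionThrown()
    }
}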
If there is no assertion, it isn't a test.
Quit being lazy -- it may take a little time to figure out how to get the assertion in there, but well worth it to know that it did what you expected it to do.
These are known as smoke tests and are common. They're basic sanity checks. But they shouldn't be the only kinds of tests you have. You'd still need some kind of verification in another test.
Such a test smells. It should check that the file was written to, or at the very least, perhaps, that the modified time was updated.
I've seen quite a few tests written this way that ended up not testing anything at all i.e. the code didn't work, but it didn't blow up either.
If you have some explicit requirement that the code under test doesn't throw an exception and you want to explicitly call out this fact (tests as requirements docs) then I would do something like this:
try
{
    unitUnderTest.DoWork();
}
catch
{
    Assert.Fail("code should never throw exceptions but failed with ...");
}
... but this still smells a bit to me, probably because it's trying to prove a negative.
In some sense, you are making an implicit assertion - that the code doesn't throw an exception. Of course it would be more valuable to actually grab the file and find the appropriate line, but I suppose something's better than nothing.
It can be a good pragmatic solution, especially if the alternative is no test at all.
The problem is that the test would pass if all the functions called were no-ops. But sometimes it just isn't feasible to verify the side effects are what you expected. In the ideal world there would be enough time to write the checks for every test ... but I don't live there.
The other place I've used this pattern is for embedding some performance tests in with unit tests because that was an easy way to get them run every build. The tests don't assert anything, but measure how long the test took and log that.
The name of the test should document this.
void TestLogDoesNotThrowException(void) {
log("blah blah");
}
How does the test verify that the log is written without an assertion?
In general, I see this occurring in integration testing; just the fact that something ran to completion is good enough. In this case I'm cool with that.
I guess if I saw it over and over again in unit tests I would be curious as to how useful the tests really were.
EDIT: In the example given by the OP, there is some testable outcome (the logfile result), so assuming that it worked because no error was thrown is lazy.
We do this all the time. We mock our dependencies using JMock, so I guess in a sense the JMock framework is doing the assertion for us... but it goes something like this. We have a controller that we want to test:
class Controller {
    private Validator validator;
    public void control() {
        validator.validate();
    }
    public void setValidator(Validator validator) { this.validator = validator; }
}
Now, when we test Controller we don't want to test Validator, because it has its own tests. So we have a test with JMock just to make sure we call validate:
public void testControlShouldCallValidate() {
    mockValidator.expects(once()).method("validate");
    controller.control();
}
And that's all; there is no "assertion" to see, but when you call control and the "validate" method is not called, the JMock framework throws an exception (something like "expected method not invoked").
We have those all over the place. It's a little backwards, since you basically set up your assertion and THEN make the call to the tested method.
I've seen something like this before and I think this was done just to prop up code coverage numbers. It's probably not really testing code behaviour. In any case, I agree that it (the intention) should be documented in the test for clarity.
I sometimes use my unit testing framework of choice (NUnit) to build methods that act as entry points into specific parts of my code. These methods are useful for profiling performance, memory consumption and resource consumption of a subset of the code.
These methods are definitely not unit tests (even though they're marked with the [Test] attribute) and are always flagged to be ignored and explicitly documented when they're checked into source control.
I also occasionally use these methods as entry points for the Visual Studio debugger. I use Resharper to step directly into the test and then into the code that I want to debug. These methods either don't make it as far as source control, or they acquire their very own asserts.
My "real" unit tests are built during normal TDD cycles, and they always assert something, although not always directly - sometimes the assertions are part of the mocking framework, and sometimes I'm able to refactor similar assertions into a single method. The names of those refactored methods always start with the prefix "Assert" to make it obvious to me.
I have to admit that I have never written a unit test that verified I was logging correctly. But I did think about it and came across this discussion of how it could be done with JUnit and Log4J. It's not too pretty, but it looks like it would work.
Tests should always assert something, otherwise what are you proving and how can you consistently reproduce evidence that your code works?
I would say that a test with no assertions indicates one of two things:
a test that isn't testing the code's important behavior, or
code without any important behaviors, that might be removed.
Thing 1
Most of the comments in this thread are about thing 1, and I would agree that if code under test has any important behavior, then it should be possible to write tests that make assertions about that behavior, either by
asserting on a function/method return value,
asserting on calls to 'test double' dependencies, or
asserting on changes to visible state.
If the code under test has important behavior, but there aren't assertions on the correctness of that behavior, then the test is deficient.
Your question appears to belong in this category. The code under test is supposed to log when a condition is met. So there are at least two tests:
Given that the condition is met, when we call the method, then does the logging occur?
Given that the condition is not met, when we call the method, then does the logging not occur?
The test would need a way to arrange the state of the code so that the condition was or was not met, and it would need a way to confirm that the logging either did or did not occur, probably with some logging 'test double' that just recorded the logging calls (people often use mocking frameworks for this.)
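A rough Spock sketch of that pair of tests, with the logger replaced by a mock (the Logger interface, the Service class and the condition are all invented for illustration):

import spock.lang.Specification

// Assumed collaborator and unit under test, invented for the sketch.
interface Logger { void warn(String message) }

class Service {
    Logger logger
    void process(int value) {
        if (value < 0) {
            logger.warn("negative value: ${value}")
        }
    }
}

class ServiceLoggingSpec extends Specification {
    Logger logger = Mock()
    Service service = new Service(logger: logger)

    def "logs when the condition is met"() {
        when:
        service.process(-1)

        then:
        1 * logger.warn(_)
    }

    def "does not log when the condition is not met"() {
        when:
        service.process(1)

        then:
        0 * logger.warn(_)
    }
}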
Thing 2
So how about those other tests, that lack assertions, but it's because the code under test doesn't do anything important? I would say that a judgment call is required. In large code bases with high code velocity (many commits per day) and with many simultaneous contributors, it is necessary to deliver code incrementally in small commits. This is so that:
your code reviewers are not overwhelmed by large complicated commits
you avoid merge conflicts
it is easy to revert your commit if it causes a fault.
In these situations, I have added 'placeholder' classes, which don't do anything interesting, but which provide the structure for the implementation that will follow. Adding this class now, and even using it from other classes, can help show reviewers how the pieces will fit together even if the important behavior of the new class is not yet implemented.
So, if we assume that such placeholders are appropriate to add, should we test them? It depends. At the least, you will want to confirm that the class is syntactically valid, and perhaps that none of its incidental behaviors cause uncaught exceptions.
For examples:
Python is an interpreted language, and so your continuous build may not have a way to confirm that your placeholder class is syntactically valid unless it executes the code as part of a test.
Your placeholder may have incidental behavior, such as logging statements. These behaviors are not important enough to assert on because they are not an essential part of the class's behavior, but they are potential sources of exceptions. Most test frameworks treat uncaught exceptions as errors, and so by executing this code in a test, you are confirming that the incidental behavior does not cause uncaught exceptions.
Personally I believe that this reasoning supports the temporary inclusion of assertion-free tests in a code base. That said, the situation should be temporary, and the placeholder class should soon receive a more complete implementation, or it should be removed.
As a final note, I don't think it's a good idea to include asserts on incidental behavior just to satisfy a formalism that 'all tests must have assertions'. You or another author may forget to remove these formalistic assertions, and then they will clutter the tests with assertions of non-essential behavior, distracting focus from the important assertions. Many of us are probably familiar with the situation where you come upon a test, and you see something that looks like it doesn't belong, and we say, "I'd really like to remove this...but it makes no sense why it's there. So it might be there for some potentially obscure and important reason that the original author forgot to document. I should probably just leave it so that I 1) respect the intentions of the original author, and 2) don't end up breaking anything and making my life more difficult." (See Chesterton's fence.)