Should I refer to the implementation while writing unit tests? - unit-testing

I was assigned a task regarding writing unit tests for some already-implemented classes.
Currently I do not really understand the purpose of each class. My concern is whether I should (1) dive deeply into the implementation, or (2) simply look at the input and expected output of each class (as well as its methods).
IMO:
(1) has the advantage that I can spot potential bugs, so I can design unit tests that cover those cases.
On the other hand, (1) has the disadvantages that (a) I can get biased by the implementation and (b) it takes time.

Actually, you do both.
First, and foremost, you focus on the public contract of the classes under test. So yes, looking at input/output should be your "first stop".
You see, ideally your unit tests do not need any kind of mocking. You create some instance under test; you call a method; you observe/verify some behavior (for example, asserting an expected value against the actual return value).
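For illustration, a minimal sketch of such a test in JUnit; the Calculator class here is hypothetical:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CalculatorTest {
    // hypothetical class under test
    static class Calculator {
        int add(int a, int b) { return a + b; }
    }

    @Test
    public void addReturnsTheSumOfItsOperands() {
        // create some instance under test, call a method,
        // and assert the expected value against the actual return value
        Calculator underTest = new Calculator();
        assertEquals(5, underTest.add(2, 3));
    }
}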
But of course, sometimes it is worth looking into the details of the implementation.
So, a reasonable procedure is:
write "black box" tests that don't know about implementation details (where possible)
use coverage to understand how well your tests cover the class under test
From the coverage numbers you can draw further conclusions, like:
your black box tests aren't sufficient, and you need to add certain "white box" tests in order to get into specific methods. Meaning: you look into the class under test to understand what it is doing, and how to reach its important corners.
your tests are actually great, but there is unused code in those classes under test. Maybe that code can then be deleted (deleting source code is the second best thing you can do as a software engineer!)
Finally: it is an anti-pattern to create unit tests that (more or less) re-program the implementation. You really want to avoid unit tests that do nothing but configure mocks for the calls your production code makes. The thing is: when you test implementation details, any implementation change (like a simple refactoring or reordering) can break your unit tests.
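To make the anti-pattern concrete, here is a hedged sketch (OrderService and its collaborators are hypothetical) of a test that is little more than a call-for-call transcript of the implementation:

import static org.easymock.EasyMock.*;
import org.junit.Test;

public class OrderServiceTest {
    // hypothetical collaborators and service, shown only to make the point
    interface PriceCalculator { int priceOf(String item); }
    interface AuditLog { void record(String item); }

    static class OrderService {
        private final PriceCalculator calculator;
        private final AuditLog log;
        OrderService(PriceCalculator calculator, AuditLog log) {
            this.calculator = calculator;
            this.log = log;
        }
        void order(String item) {
            calculator.priceOf(item);
            log.record(item);
        }
    }

    // anti-pattern: the test is a call-for-call transcript of order()
    @Test
    public void reProgramsTheImplementation() {
        PriceCalculator calculator = createStrictMock(PriceCalculator.class);
        AuditLog log = createStrictMock(AuditLog.class);
        expect(calculator.priceOf("book")).andReturn(10);
        log.record("book");
        replay(calculator, log);

        new OrderService(calculator, log).order("book");

        // a refactoring that changes these calls breaks the test,
        // even if the observable behavior stays the same
        verify(calculator, log);
    }
}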

Related

Unit Testing Classes VS Methods

When unit testing, is it better practice to test a class or individual methods?
Most of the examples I've seen test the class apart from other classes, mocking dependencies between classes. Another method I've played around with is mocking methods you're not testing (by overriding) so that you're only testing the code in one method. Thus one bug breaks one test, since the methods are isolated from each other.
I was wondering if there is a standard method and if there are any big disadvantages to isolating each method for testing as opposed to isolating classes.
The phrase unit testing comes from hardware systems testing, and is more or less semantics-free when applied to software. It can get used for anything from isolation testing of a single routine to testing a complete system in headless mode with an in-memory database.
So don't trust anyone who argues that the definition implies there is only one way to do things independently of context; there are a variety of ways, some of which are sometimes more useful than others. And presumably every approach a smart person would argue for has at least some value somewhere.
The smallest unit of hardware is the atom, or perhaps some subatomic particle. Some people test software like they were scanning each atom to see if the laws of quantum mechanics still held. Others take a battleship and see if it floats.
Something in between is very likely better. Once you know something about the kind of thing you are producing beyond 'it is software', you can start to come up with a plan that is appropriate to what you are supposed to be doing.
The point of unit testing is to test a unit of code, i.e. a class.
This gives you confidence that that part of the code on its own is doing what is expected.
This is also the first part of the testing process. It helps to catch those pesky bugs as early as possible, and having a unit test to demonstrate a bug makes it easier to fix further down the line.
Unit testing by definition is testing the smallest piece of written code you can. "Units" are not classes; they are methods.
Every public method should have at least 1 unit test, that tests that method specifically.
If you follow the rule above, you will eventually get to where class interactions are being covered. As long as you write 1 test per method, you will cover class interaction as well.
There is probably no one standard answer. Unit tests are for the developer (or they should be), do what is most helpful to you.
One downside of testing individual methods is that you may not test the actual behavior of the object: if the mocking of some methods is not accurate, that may go undetected. Also, mocks are a lot of work, and they tend to make tests very fragile, because they make the tests care a lot about which specific method calls take place.
In my own code I try whenever possible to separate infrastructure-type dependencies from business logic so that I can write tests of the business logic classes entirely without mocks. If you have a nasty legacy code base it probably makes more sense to test individual methods and mock any collaborating methods of the object, in order to insulate the parts from each other.
Theoretically objects are supposed to be cohesive so it would make sense to test them as a whole. In practice a lot of things are not particularly object-oriented. In some cases it is easier to mock collaborator methods than it is to mock injected dependencies that get called by the collaborators.
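As a sketch of that separation (all names hypothetical): keep the business rule in a pure class that touches no infrastructure, and its tests need no mocks at all:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class DiscountPolicyTest {
    // pure business logic: no database, no clock, no network
    static class DiscountPolicy {
        double discountedPrice(double price, int loyaltyYears) {
            double rate = loyaltyYears >= 5 ? 0.10 : 0.0;
            return price * (1.0 - rate);
        }
    }

    @Test
    public void longTimeCustomersGetTenPercentOff() {
        // plain input in, plain output out: no mocks required
        assertEquals(90.0, new DiscountPolicy().discountedPrice(100.0, 5), 0.0001);
    }
}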

Black Box Unit Testing

In my last project, we had unit tests with almost 100% code coverage, and as a result we almost didn't have any bugs.
However, since unit testing must be white box (you have to mock inner functions to get the result you want, so your tests need to know about the inner structure of your code), any time we changed the implementation of a function, we had to change the tests as well.
Note that we didn't change the logic of the functions, just the implementation.
It was very time-consuming, and it felt as if we were working the wrong way.
Since we followed all the proper OOP guidelines (specifically encapsulation), every time we changed the implementation we didn't have to change the rest of our code, but we did have to change the unit tests.
It felt as if we were serving the tests, instead of them serving us.
To prevent this, some of us argued that unit tests should be Black Box Testing.
That would be possible if we created one big mock of our entire domain, created a stub for every function in every class in one place, and used it in every unit test.
Of course, if a specific test needs a specific inner function to be called (like making sure we write to the DB), we can override our stub.
So, every time we change the implementation of a function (like adding or replacing a call to a helper function), we will only need to change our main big mock. Even if we do need to change some unit tests, it will still be much less work than before.
Others argue that unit tests must be white box, since not only do you want to make sure your app writes to the DB in a specific place, you also want to make sure it does not write to the DB anywhere else unless you specifically expect it to. While this is a valid point, I don't think it is worth the time of writing white box tests instead of black box tests.
So in conclusion, two questions:
What do you think about the concept of Black Box Unit Testing?
What do you think about the way we want to implement that concept? Do you have better ideas?
You need different types of tests.
Unit tests, which should be white-box tests, as you did.
Integration tests (or system tests), which test the ability to use the actual implementations of your system and its communication with external layers (external systems, database, etc.). These should be black-box styled, but each one for a specific feature (CRUD tests, for example).
Acceptance tests, which should be completely black-box and driven by functional requirements (as your users would phrase them): end-to-end as much as possible, without knowing the internals of your chosen implementations. The textbook definition of black-box tests.
And remember: code coverage is meaningless in most cases. You need high line coverage (or method coverage, whatever your counting method is), but that's usually not sufficient. The concept you need to think about is functional coverage: making sure all your requirements and logical paths are covered.
and as a result we almost didn’t have any bugs
If you were really able to achieve this, then I don't think you should change anything.
Black box testing might sound appealing on paper, but the truth is you almost always need to know part of the inner workings of a tested class. The "provide input, verify output" approach in reality works only for simple cases. Most of the time your tests need at least some knowledge of the tested method: how it interacts with external collaborators, which methods it calls, in what order, and so forth.
The whole idea behind mocking and SOLID design is to avoid the situation where a change in a dependency's implementation causes other classes' tests to change or fail. By contrast, if you change the implementation details of the tested method, the implementation details of its tests should change too. That's nothing too uncommon.
Overall, if you were really able to achieve almost no bugs, then I would stick to that approach.
tl;dr version:
Black Box unit testing is exactly how unit testing should be done. Proper TDD practice does exactly this.
Full version.
There is absolutely no need to test the private methods of your objects. It will have no impact on code coverage either, since private methods are exercised through the public ones.
When you TDD a class, you write tests that check the behavior of that class. Behavior is expressed through the public methods of that class. You should never bother with how those methods are really implemented. The Google testing people described this a lot better than I ever will: http://googletesting.blogspot.ru/2013/08/testing-on-toilet-test-behavior-not.html
If you make the usual mistake and statically depend on other entity classes or, worse, on classes from a different layer of the application, you will inevitably find yourself in a situation where you need to check a lot of things in your test and prepare a lot of stuff for it. The Dependency Injection principle and the Law of Demeter exist to solve exactly this.
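A brief sketch of what dependency injection buys you in a test (all names are hypothetical): the class under test depends on an interface, so the test can hand it a trivial stub instead of dragging in another layer of the application:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class GreeterTest {
    // the dependency is an interface, injected through the constructor
    interface UserDirectory {
        String nameOf(int userId);
    }

    static class Greeter {
        private final UserDirectory directory;
        Greeter(UserDirectory directory) { this.directory = directory; }
        String greet(int userId) { return "Hello, " + directory.nameOf(userId) + "!"; }
    }

    @Test
    public void greetsTheUserByName() {
        // a hand-rolled stub stands in for the real directory;
        // no other layer of the application is touched
        Greeter greeter = new Greeter(id -> "Alice");
        assertEquals("Hello, Alice!", greeter.greet(42));
    }
}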
I think you should continue writing unit tests - just make them less fragile.
Unit tests should be low-level, but they should test the result, not how things are done. When an implementation change causes a lot of test changes, it means that instead of testing requirements you're actually testing the implementation.
There are several rules of thumb, such as "don't test private methods" and "use mock objects".
Mocking/simulating the entire domain usually results in the opposite of what you're trying to accomplish: when the code's behavior changes, you need to update the tests to make sure your "simulated objects" behave the same, and that becomes really hard really fast as the complexity of the project increases.
I suggest that you continue writing unit tests - just learn how to make them more robust and less fragile.
"as a result we almost didn’t have any bugs" -- so keep it that way.
The sole cause of your frustration is the necessity of maintaining unit tests, which actually is not such a bad thing (the alternative is much worse). Just make them more maintainable. "The Art of Unit Testing" by Roy Osherove gave me a good start down this path.
So
1) Not an option. (The idea itself contradicts the principles of TDD, for instance.)
2) You'll have much more maintenance trouble with such an approach. The unit testing philosophy is to isolate the SUT from the rest of the system and test it using stubs as inputs and mocks as outputs, simulating real-life situations (or maybe I just don't get the "one big mock of our entire Domain" idea).
For detailed information about black-, white- and grey-box testing and decision tables, refer to the following article, which explains everything:
Testing Web-based applications: The state of the art and future trends (PDF)

Unit Testing: what to test / what not to test?

A few days ago I started getting interested in unit testing and TDD in C# and VS2010. I've read blog posts, watched YouTube tutorials, and plenty more that explains why TDD and unit testing are so good for your code, and how to do them.
But the biggest problem I find is, that I don't know what to check in my tests and what not to check.
I understand that I should check all the logical operations and problems with references and dependencies, but, for example, should I create a unit test for a string formatting routine that's supposed to handle user input? Or is that just wasting my time when I can check it in the actual code?
Is there any guide to clarify this problem?
In TDD every line of code must be justified by a failing test-case written before the code.
This means that you cannot develop any code without a test-case. If you have a line of code (condition, branch, assignment, expression, constant, etc.) that can be modified or deleted without causing any test to fail, it means this line of code is useless and should be deleted (or you have a missing test to support its existence).
That is a bit extreme, but this is how TDD works. That being said, if you have a piece of code and you are wondering whether it should be tested or not, you are not doing TDD correctly. Whether it is a string formatting routine, a variable increment, or whatever small piece of code, there must be a test case supporting it.
UPDATE (use-case suggested by Ed.):
Like for example, adding an object to a list and creating a test to see if it is really inside or there is a duplicate when the list shouldn't allow them.
Here is a counterexample; you would be surprised how hard it is to spot copy-paste errors and how common they are:
private Set<String> inclusions = new HashSet<String>();
private Set<String> exclusions = new HashSet<String>();

public void include(String item) {
    inclusions.add(item);
}

public void exclude(String item) {
    inclusions.add(item); // copy-paste bug: should be exclusions.add(item)
}
On the other hand, testing the include() and exclude() methods alone is overkill, because they do not represent any use cases by themselves. However, they are probably part of some business use case, which you should test instead.
Obviously you shouldn't test whether x in x = 7 is really 7 after the assignment. Testing generated getters/setters is also overkill. But it is the easiest code that breaks most often, all too often due to copy-paste errors or typos (especially in dynamic languages).
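For instance, a test as small as the following sketch would catch that error immediately (ItemFilter and isExcluded() are hypothetical names, assuming the fields and methods above belong to such a class):

import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class ItemFilterTest {
    @Test
    public void excludedItemIsReportedAsExcluded() {
        // ItemFilter is assumed to wrap the include()/exclude() methods above
        ItemFilter filter = new ItemFilter();
        filter.exclude("apple");
        // fails with the copy-paste bug, because "apple" went into inclusions
        assertTrue(filter.isExcluded("apple"));
    }
}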
See also:
Mutation testing
Your first few TDD projects will probably result in worse design/redesign and take longer to complete while you are learning (at least in my experience). This is why you shouldn't jump into using TDD on a large critical project.
My advice is to use "pure" TDD (acceptance/unit test everything test-first) on a few small projects (100-10,000 LOC). Either do the side projects on your own or if you don't code in your free time, use TDD on small internal utility programs for your job.
After you do "pure" TDD on about 6-12 projects, you will start to understand how TDD affects design and learn how to design for testability. Once you know how to design for testability, you will need to TDD less and maximize the ROI of unit, regression, acceptance, etc. tests rather than test everything up front.
For me, TDD is more of a teaching method for good code design than a practical methodology. However, I still TDD logic code, and I unit test instead of debugging.
There is no simple answer to this question. There is the law of diminishing returns in action, so achieving perfect coverage is seldom worth it. Knowing what to test is a thing of experience, not rules. It’s best to consciously evaluate the process as you go. Did something break? Was it feasible to test? If not, is it possible to rewrite the code to make it more testable? Is it worth it to always test for such cases in the future?
If you split your code into models, views and controllers, you’ll find that most of the critical code is in the models, and those should be fairly testable. (That’s one of the main points of MVC.) If a piece of code is critical, I test it, even if it means that I would have to rewrite it to make it more testable. If a piece of code is easy to get wrong or get broken by future updates, it gets a test. I seldom test controllers and views, as it’s not proving worth the trouble for me.
The way I see it all of your code falls into one of three buckets:
Code that is easy to test: This includes your own deterministic public methods.
Code that is difficult to test: This includes GUI, non-deterministic methods, private methods, and methods with complex setup.
Code that you don't want to test: This includes 3rd party code, and code that is difficult to test and not worth the effort.
Of the three, you should focus on testing the easy code. The difficult-to-test code should be refactored into two parts: code that you don't want to test, and easy code. And of course, you should test the refactored easy code.
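One common way to do that split, sketched below with hypothetical names: pull the deterministic decision out of the clock-dependent method, test the pure part, and leave only a thin shell behind:

public class TokenPolicy {
    // Token is a hypothetical type carrying an expiry timestamp
    interface Token {
        long expiryMillis();
    }

    // before (hard to test): the decision was buried next to the system clock
    //   boolean isExpired(Token t) { return t.expiryMillis() < System.currentTimeMillis(); }

    // after: the decision itself is pure, deterministic, and easy to test
    static boolean isExpired(long expiryMillis, long nowMillis) {
        return expiryMillis < nowMillis;
    }

    // the remaining shell is so thin it falls into the "don't want to test" bucket
    static boolean isExpired(Token token) {
        return isExpired(token.expiryMillis(), System.currentTimeMillis());
    }
}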
I think you should only unit test entry points to the behavior of the system. This includes public methods, public accessors and public fields, but not constants (constant fields, enums, methods, etc.). It also includes any code which directly deals with IO; I explain why further below.
My reasoning is as follows:
Everything that's public is basically an entry point to a behavior of the system. A unit test should therefore be written that guarantees that the expected behavior of that entry point works as required. You shouldn't test all possible ways of calling the entry point, only the ones that you explicitly require. Your unit tests are therefore also the specs of what behavior your system supports and your documentation of how to use it.
Things that are not public can basically be deleted or refactored at will with no impact on the behavior of the system. If you were to test those, you'd create a hard dependency from your unit tests to that code, which would prevent you from refactoring it. That's why you should not test anything but public methods, fields and accessors.
Constants by design are not behavior, but axioms. A unit test that verifies a constant is itself a constant, so it would only be duplicated code and useless effort to write a test for constants.
So to answer your specific example:
should I create a unit test for a string formatting routine that's supposed to handle user input?
Yes, absolutely. All methods which receive or send external input/output (which can be summed up as performing IO) should be unit tested. This is probably the only case where I'd say non-public things that perform IO should also be unit tested, because I consider IO to be a public entry point: anything that's an entry point for an external actor, I consider public.
So unit test public methods, public fields and public accessors, even when they are static constructs, and also unit test anything which receives or sends data to or from an external actor, be it a user, a database, a protocol, etc.
NOTE: You can write temporary unit tests on non-public things as a way to help make sure your implementation works. This is more of a way to help you figure out how to implement things properly and to make sure your implementation works as you intend. After you've verified that it works, though, you should delete the unit test or disable it in your test suite.
Kent Beck, in Extreme Programming Explained, said you only need to test the things that need to work in production.
That's a brusque way of encapsulating both test-driven development, where every change in production code is supported by a test that fails when the change is not present; and You Ain't Gonna Need It, which says there's no value in creating general-purpose classes for applications that only deal with a couple of specific cases.
I think you have to change your point of view.
In a pure form TDD requires the red-green-refactor workflow:
write test (it must fail) RED
write code to satisfy test GREEN
refactor your code
So the question "What I have to test?" has a response like: "You have to write a test that correspond to a feature or a particular requirements".
In this way you get must code coverage and also a better code design (remember that TDD stands also for Test Driven "Design").
Generally speaking you have to test ALL public method/interfaces.
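To make the cycle concrete, here is a minimal sketch of one red-green-refactor pass in JUnit; the Slug class and its behavior are hypothetical:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

// RED: this test is written first and fails, because slugify() does not exist yet
public class SlugTest {
    @Test
    public void lowercasesAndReplacesSpacesWithDashes() {
        assertEquals("hello-world", Slug.slugify("Hello World"));
    }
}

// GREEN: the simplest code that makes the test pass
class Slug {
    static String slugify(String title) {
        return title.toLowerCase().replace(' ', '-');
    }
}
// REFACTOR: improve the structure while the test stays green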
should I create a unit test for a string formatting routine that's supposed to handle user input? Or is that just wasting my time when I can check it in the actual code?
Not sure I understand what you mean, but the tests you write in TDD are supposed to test your production code. They aren't tests that check user input.
To put it another way, there can be TDD unit tests that test the user input validation code, but there can't be TDD unit tests that validate the user input itself.

Am I doing something fundamentally wrong in my unit tests?

After reading an interesting article about unit testing behavior instead of state, I came to realize that my unit tests often are tightly coupled to my code because I am using mocks.
I cannot imagine writing unit tests without mocks, but the fact is that these mocks couple my unit tests very tightly to my code, because of the expect/andReturn calls.
For example when I create a test that uses a mock, I record all calls to the specific mock and assign return values.
Now when I change the implementation of the actual code for whatever reason, a lot of tests break because calls were not expected by the mocks, forcing me to update the unit tests as well, and effectively forcing me to implement every change twice...
This happens a lot.
Is this issue intrinsic to using mocks, and should I learn to live with it, or am I doing something fundamentally wrong?
Please enlighten me :)
Clear examples coming with the explanation are most welcome of course.
when I create a test that uses a mock, I record all calls to the specific mock and assign return values
It sounds like you may be over-specifying expectations.
Try to build as little setup code as possible into your tests: stub (rather than expect) all behavior that doesn't pertain to the current test and only specify return values that are absolutely needed to make your test work.
This answer includes a concise example (as well as an alternative, more detailed explanation).
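Since expect/andReturn suggests EasyMock, here is a hedged sketch of the difference (the Repository and Service types are hypothetical): andStubReturn supplies a value without demanding the call, whereas expect/andReturn plus verify turns the call itself into a requirement:

import static org.easymock.EasyMock.*;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class ServiceTest {
    interface Repository { String load(int id); }

    static class Service {
        private final Repository repo;
        Service(Repository repo) { this.repo = repo; }
        String describe(int id) { return "item: " + repo.load(id); }
    }

    @Test
    public void describesAnItem() {
        Repository repo = createMock(Repository.class);
        // a stub: supplies a value, but places no demand on how often
        // (or whether) the call actually happens
        expect(repo.load(1)).andStubReturn("data");
        replay(repo);

        // assert on the outcome rather than verifying which calls were made
        assertEquals("item: data", new Service(repo).describe(1));
    }
}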
My experience is to use mocks only at the boundaries of (sub)systems. If I have two classes that are strongly related, I do not mock them apart but test them together. An example might be a composite and a visitor: if I test a concrete visitor, I do not use a mock for the composite but create real composites. One might argue that this is not a unit test (it depends on the definition of a unit). But that doesn't matter much. What I try to achieve is:
Write readable tests (tests without mocks are most of the time easier to read).
Test only a focused area of code (in the example, the concrete visitor and the relevant part of the composite).
Write fast tests (as long as I instantiate only a few classes, in the example the concrete composites, this is not a concern... watch out for transitive creations).
Only when I encounter the boundary of a subsystem do I use mocks. Example: if I have a composite that can render itself to a renderer, I would mock out the renderer when testing the render logic of the composite.
Testing behavior instead of state looks promising at first, but in general I would test state, as the resulting tests are easier to maintain. Mocks are a heavy tool; don't crack a nut with a sledgehammer.
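A sketch of that renderer example (all names hypothetical, and the composite deliberately simplified): the composite and its children are real objects, and only the renderer at the subsystem boundary is mocked:

import static org.easymock.EasyMock.*;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;

public class GroupRenderTest {
    interface Renderer { void draw(String shape); }

    // a deliberately simplified "composite" holding its children directly
    static class Group {
        private final List<String> shapes;
        Group(List<String> shapes) { this.shapes = shapes; }
        void render(Renderer renderer) {
            for (String shape : shapes) {
                renderer.draw(shape);
            }
        }
    }

    @Test
    public void rendersEveryChild() {
        // the composite and its children are real objects...
        Group group = new Group(Arrays.asList("circle", "square"));
        // ...only the renderer, at the subsystem boundary, is mocked
        Renderer renderer = createMock(Renderer.class);
        renderer.draw("circle");
        renderer.draw("square");
        replay(renderer);

        group.render(renderer);

        verify(renderer);
    }
}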
If you are fixing the tests because they break, you are not using them as intended.
If the behaviour of a method changes, in test driven development you would first change the test to expect the new behaviour, then implement the new behaviour.
Several good answers here already, but for me a good rule of thumb is to test the requirements of the method, not the implementation. Sometimes that may mean using a mock object because the interaction is the requirement, but you're usually better off testing the return value of the method or the change in state of the object.

Testing for required behaviour vs. TDD

In the article Test for Required Behavior, not Incidental Behavior, Kevlin Henney advises us that:
"[...] a common pitfall in testing is to hardwire tests to the specifics of an implementation, where those specifics are incidental and have no bearing on the desired functionality."
However, when using TDD, I often end up writing tests for incidental behaviour. What do I do with these tests? Throwing them away seems wrong, but the advice in the article is that these tests can reduce agility.
What about separating them into a separate test suite? That sounds like a start, but seems impractical intuitively. Does anyone do this?
In my experience, implementation-dependent tests are brittle and will fail massively at the very first refactoring. What I try to do is focus on deriving a proper interface for a class while writing the tests, effectively keeping such implementation details out of the interface. This not only avoids the brittle tests, it also promotes cleaner design.
This still allows for extra tests that check for the risky parts of my selected implementation, but only as extra protection to a good coverage of the "normal" interface of my class.
For me the big paradigm shift came when I started writing tests before even thinking about the implementation. My initial surprise was that it became much easier to generate "extreme" test cases. Then I recognized that the improved interface in turn helped shape the implementation behind it. The result is that my code nowadays doesn't do much more than the interface exposes, effectively reducing the need for most "implementation" tests.
During refactoring of the internals of a class, all tests will hold. Only in cases where the exposed interface changes, the test set may need to be extended or modified.
The problem you describe is very real and very easy to run into when doing TDD. In general, it isn't testing incidental behavior in itself that is the problem, but rather having tons of tests depend on that incidental behavior.
The DRY principle applies to test code as well as to production code. That can often be a good guideline when writing test code. The goal should be that all the 'incidental' behavior you specify along the way is isolated so that only a few tests out of the entire test suite use them. In that way, if you need to refactor that behavior, you only need to modify a few tests instead of a large fraction of the entire test suite.
This is best achieved by copious use of interfaces or abstract classes as collaborators, because this means that you get low class coupling.
Here's an example of what I mean. Assume that you have some kind of MVC implementation where a Controller should return a View. Assume that we have a method like this on a BookController:
public View DisplayBookDetails(int bookId)
The implementation should use an injected IBookRepository to get the book from the database and then convert that to a View of that book. You could write a lot of tests to cover all aspects of the DisplayBookDetails method, but you could also do something else:
Define an additional IBookMapper interface and inject that into the BookController in addition to the IBookRepository. The implementation of the method could then be something like this:
public View DisplayBookDetails(int bookId)
{
    return this.mapper.Map(this.repository.GetBook(bookId));
}
Obviously this is an overly simplistic example, but the point is that now you can write one set of tests for your real IBookMapper implementation, which means that when you test the DisplayBookDetails method, you can just use a stub (best generated by a dynamic mock framework) to implement the mapping, instead of trying to define a brittle and complex relationship between a Book domain object and how it is mapped.
The use of an IBookMapper is definitely an incidental implementation detail, but if you use a SUT Factory, or better yet an auto-mocking container, the definition of that incidental behavior is isolated, which means that if you later decide to refactor the implementation, you can do so by changing the test code in only a few places.
"What about separating them into a separate test suite?"
What would you do with that separate suite?
Here's the typical use case.
You wrote some tests which test implementation details they shouldn't have tested.
You factor those tests out of the main suite into a separate suite.
Someone changes the implementation.
Your implementation suite now fails (as it should).
What now?
Fix the implementation tests? I think not. The point was to not test the implementation, because it leads to way too much maintenance work.
Have tests that can fail while the overall unit test run is still considered good? If a test fails but the failure doesn't matter, what does that even mean? (See the question "Non-critical unittest failures" for an example.) An ignored or irrelevant test is just costly.
You have to discard them.
Save yourself some time and aggravation by discarding them now, not when they fail.
If you really do TDD, the problem is not as big as it may seem at first, because you write the tests before the code. You should not even think about any possible implementation before writing a test.
Such problems of testing incidental behavior are much more common when you write tests after the implementation code. Then the easy way is to just check that the function's output is OK and does what you want, and write the test against that output. Really, that's cheating, not TDD, and the cost of cheating is tests that break when the implementation changes.
The good thing is that such tests break even more easily than good tests (a good test here meaning one that depends only on the wanted feature, not on the implementation). Having tests so generic that they never break is far worse.
Where I work what we do is simply fix such tests when we stumble upon them. How we fix them depends on the kind of incidental test performed.
The most common such test is probably one that expects results to occur in some definite order, overlooking that this order is really not guaranteed. The easy fix is simple enough: sort both the actual and the expected results (see the sketch after this list). For more complex structures, use a comparator that ignores that kind of difference.
Every so often we test the innermost function, while it's some outermost function that performs the feature. That's bad because refactoring away the innermost function becomes difficult. The solution is to write another test covering the same feature at the outermost function level, then remove the old test, and only then refactor the code.
When such tests break and we see an easy way to make them implementation-independent, we do it. Yet if it's not easy, we may choose to fix them so they are still implementation-dependent, but depend on the new implementation. The tests will break again at the next implementation change, but that's not necessarily a big deal. If it is a big deal, then definitely throw away that test and find another one to cover the feature, or change the code to make it easier to test.
Another bad case is when we have written tests using some mocked object (used as a stub) and the mocked object's behavior then changes (an API change). This one is bad because it does not break the code when it should: changing the real object's behavior won't change the mock mimicking it. The fix here is to use the real object instead of the mock if possible, or to fix the mock for the new behavior. In that case both the mock's behavior and the real object's behavior are incidental, but we believe tests that do not fail when they should are a bigger problem than tests that break when they shouldn't. (Admittedly, such cases can also be taken care of at the integration test level.)
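A sketch of that first fix, assuming JUnit; the hardcoded lists stand in for a real call whose result order is not guaranteed:

import static org.junit.Assert.assertEquals;
import org.junit.Test;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OrderInsensitiveTest {
    @Test
    public void returnsAllMatchesRegardlessOfOrder() {
        // stands in for a real call whose result order is not guaranteed
        List<String> actual = new ArrayList<>(Arrays.asList("b", "a"));
        List<String> expected = new ArrayList<>(Arrays.asList("a", "b"));
        // sort both the actual and the expected results before comparing
        Collections.sort(actual);
        Collections.sort(expected);
        assertEquals(expected, actual);
    }
}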