I'm talking about Uncle Bob's three rules of TDD:
You are not allowed to write any production code unless it is to make a failing unit test pass.
You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
My problem is: what happens when you expect to create a feature that can generate more than one result, and in the first iteration you implement code that satisfies every scenario?
I wrote such code once, because that was the only solution that came to my mind first.
I'd say I didn't violate any of these 3 rules.
I wrote a test with the fewest possible conditions needed to fail.
Then I implemented the feature with just enough code to pass the test (and that was the only solution I came up with, so I'd say it was the least code I could have written).
And then I wrote the next test, only to discover that it was already passing.
Now what about the rules?
Am I not allowed to write this test even if the feature is a super important one? Or should I roll back and start over?
I'd also mention that this method cannot be refactored based on the result or the data fed to it. The example situation I can think of right now is a little silly, but please bear with me and take a look at this, for example:
I want to create a method to add numbers; this is pretty much all I can do. Write a failing test:
public function it_can_add_numbers()
{
$this->add(2, 3)->shouldReturn(5);
}
Then make it pass:
public function add($numberOne, $numberTwo)
{
return $numberOne + $numberTwo;
}
Now one could argue that I should have returned 5 in the first iteration, because that was enough to pass the test and would have introduced the regression scenario as well, but this is not an actual problem here, so please bear with me and suppose this is the only solution one could think of.
Now my company wants me to make sure that it is possible to add 12 and 13, because these are some internal magic numbers we'd be using quite a lot. I go ahead and write another test, because this is how I'm supposed to verify a feature.
public function it_can_add_twelve_and_thirteen()
{
$this->add(12, 13)->shouldReturn(25);
}
It turns out the test is already passing.
At this point, I can choose not to write the test, but what if at a later time someone changes the actual code and turns it into:
public function add($numberOne, $numberTwo)
{
return 5;
}
The test will still pass but the feature is not there.
So what about those situations where you cannot immediately think of a possible flaw to introduce in the first iteration before making improvements? Should I leave it as it is and wait for someone to come over and screw it up? Should I leave this case for regression tests?
To be true to rule number 3 of Uncle Bob's rules, after writing the test:
public function it_can_add_numbers()
{
$this->add(2, 3)->shouldReturn(5);
}
The correct code would be:
public function add($numberOne, $numberTwo)
{
return 5;
}
Now, when you add the second test, it will fail, which will make you change the code to comply with both tests, and then refactor it to be DRY, resulting in:
public function add($numberOne, $numberTwo)
{
return $numberOne + $numberTwo;
}
I won't say that it is the "only true way" to code, and @Oli and @Leo have a point that you should not stop thinking like a programmer just because you are doing TDD, but the above process is an example of following the 3 rules of TDD you stated...
If you write a passing test, you are not doing TDD.
That is not to say that the test has no value, but it says very clearly that your new test is not "driving design" (or "driving development", the "other" DD). You may need the new test for regression or to satisfy management or to bring up some code coverage metric, but you don't need it to drive your design or your development.
You did violate Uncle Bob's third rule, insofar as you wrote more logic than was required to pass the test you had.
It's OK to break the rules; it's just not OK to break the rules and say you're not breaking them. If you want to rigidly adhere to TDD as Uncle Bob defines it, you need to follow his rules.
If the feature is really simple, like the addition example, I would not worry about implementing it with just one test. In more complex, real-world situations, implementing the entire solution at once may leave you with poorly structured code because you don't get the chance to evolve the design and refactor the code. TDD is a design technique that helps you write better code and requires you to be constantly THINKING, not just following rules by rote.
TDD is a best practice, not a religion. If your case is trivial/obvious, just write the right code instead of doing some artificial, useless refactoring.
Consider unit testing a dictionary object. The first unit tests you might write are a few that simply add items to the dictionary and check exceptions. The next test may be something like testing that the count is accurate, or that the dictionary returns a correct list of keys or values.
However, each of these later cases requires that the dictionary can first reliably add items. If the tests which add items fail, we have no idea whether our later tests fail because what they're testing is implemented incorrectly, or because the assumption that we can reliably add items is incorrect.
Can I declare a set of unit tests which cause a given unit test to be inconclusive if any of them fail? If not, how should I best work around this? Have I set up my unit tests wrong, that I'm running into this predicament?
It's not as hard as it might seem. Let's rephrase the question a bit:
If I test my piece of code which requires System.Collections.Generic.List<T>.Add to work, what should I do when one day Microsoft decides to break .Add on List<T>? Do I make my tests depending on this to work inconclusive?
The answer to the above is obvious; you don't. You let them fail for one simple reason: your assumptions have failed, and the test should fail. It's the same here. Once you get your add tests to work, from that point on you assume add works. You shouldn't treat your tested code any differently than 3rd-party tested code. Once it's proven to work, you assume it indeed does.
On a different note, you can utilize a concept called guard assertions. In your remove test, after the arrange phase you introduce an additional assert phase, which verifies your initial assumptions (in this case, that the add is working). More information about this technique can be found here.
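To illustrate, here is a minimal sketch of a guard assertion in an NUnit-style remove test; MyDictionary and its members are hypothetical names for the class under test:
[Test]
public void Remove_DeletesPreviouslyAddedKey()
{
    // Arrange
    var dictionary = new MyDictionary();        // hypothetical class under test
    dictionary.Add("answer", 42);

    // Guard assertion: re-check the assumption that Add works before
    // exercising Remove. If this fails, the real problem is in Add.
    Assert.That(dictionary.ContainsKey("answer"), Is.True);

    // Act
    dictionary.Remove("answer");

    // Assert
    Assert.That(dictionary.ContainsKey("answer"), Is.False);
}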
To add an example, NUnit uses the concept above disguised under the name Theory. This does exactly what you proposed (yet it seems to be more related to data-driven testing than to general utility):
The theory itself is responsible for ensuring that all data supplied meets its assumptions. It does this by use of the Assume.That(...) construct, which works just like Assert.That(...) but does not cause a failure. If the assumption is not satisfied for a particular test case, that case returns an Inconclusive result, rather than a Success or Failure.
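A rough sketch of what an NUnit Theory with an assumption can look like (the datapoints and the arithmetic here are just placeholders):
[Datapoints]
public double[] Values = { -2.0, 0.0, 3.5 };

[Theory]
public void DividingAndMultiplyingRestoresOriginal(double value, double divisor)
{
    // If the assumption fails for a case, that case is reported as
    // Inconclusive rather than Failed.
    Assume.That(divisor != 0.0);

    double roundTrip = (value / divisor) * divisor;

    Assert.That(roundTrip, Is.EqualTo(value).Within(1e-9));
}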
However, I think what Mark Seemann states in an answer to the question I linked makes the most sense:
There may be many preconditions that need to be satisfied for a given test case, so you may need more than one Guard Assertion. Instead of repeating those in all tests, having one (and only one) test for each precondition keeps your test code more maintainable, since you will have less repetition that way.
Nice question, I often ponder this and had this problem the other day. What I did was get the basics of our collection working using a dictionary behind the scenes. For example:
public class MyCollection
{
    private IDictionary<string, int> _backingStore;

    public MyCollection(IDictionary<string, int> backingStore)
    {
        _backingStore = backingStore;
    }
}
Then we test drove the addition implementation. As we had the dictionary by reference we could assert that after adding items our business logic was correct.
For example, the pseudo code for the addition was something like:
public void Add(Item item)
{
// Check we have not added before
// More business logic...
// Add
}
Then the test could be written such as:
var subject = new MyCollection(backingStore);
subject.Add(new Item());
Assert.That(backingStore.Contains(itemThatWeAdded));
We then went on to drive out the other methods such as retrieval, and deletion.
Your question is what should you do with regards the addition breaking, in turn breaking the retrieval. This is a catch 22 scenario. Personally I'd rather ditch the backing store and use this as an implementation detail. So this is what we did. We refactored the tests to use the system under test, rather than the backing store for the asserts. The great thing about the backing store being public initially is it allows you test drive small parts of the codebase, rather than having to implement both addition and retrieval in one go.
The test for addition then looked like the following after we refactored the collection to not expose the backing store.
var subject = new MyCollection();
var item = new Item();
subject.Add(item);
Assert.That(subject.Has(item), Is.True);
In this case I think this is fine. If you cannot add items successfully, then you sure as hell cannot retrieve anything, because you've not added them. As long as your tests are named well, any failing test such as "CanOnlyAddUniqueItemsToCollection" will point future developers in the right direction; in other words, the addition is broken. Just make sure your tests are named well and you will be giving as much help as possible.
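For completeness, a minimal sketch of what the refactored collection might look like once the backing store is an implementation detail; the Item type, its Key property, and the uniqueness rule are assumptions taken from the test name:
using System;
using System.Collections.Generic;

public class Item
{
    public string Key { get; set; }             // assumed identifying property
}

public class MyCollection
{
    // The dictionary is now a private implementation detail.
    private readonly IDictionary<string, Item> _backingStore = new Dictionary<string, Item>();

    public void Add(Item item)
    {
        // Assumed business rule from the test name: only unique items may be added.
        if (_backingStore.ContainsKey(item.Key))
            throw new InvalidOperationException("Item has already been added.");

        _backingStore.Add(item.Key, item);
    }

    public bool Has(Item item)
    {
        return _backingStore.ContainsKey(item.Key);
    }
}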
I don't see this as too much of a problem. If your Dictionary class is not too big, and the unit test for that class is the only unit test testing that code, then when your add method is broken and multiple tests fail, you still know the problem is in the Dictionary class and can identify it, debug and fix it easily.
Where it becomes a problem is when you have other code smells or design problems such as:
unit tests are testing many application classes; using mocks instead can help here.
unit tests are actually system tests creating and testing many application classes at once.
the Dictionary class is too big and complex so when it breaks and tests fail it's difficult to figure out what part is broken.
This is very interesting. We use NUnit, and as best I can tell it runs test methods alphabetically. That might be an overly artificial way to order your tests, but if you built up your test classes such that alphabetically/numerically named pre-req methods came first, you might accomplish what you want.
I find myself writing a test method, running just it to watch it fail, and then writing the code to make it pass. When I'm all done I can run the whole class and everything passes; it doesn't matter what order the tests ran in, because everything 'works' because of the incremental dev I did.
Now later on, if I break something in the thing I'm testing, who knows what all will fail in the harness. I guess it doesn't really matter to me; I've got a long list of failures and I can tease out what went wrong.
Suppose you have a method:
public void Save(Entity data)
{
this.repositoryIocInstance.EntitySave(data);
}
Would you write a unit test at all?
public void TestSave()
{
// arrange
Mock<EntityRepository> repo = new Mock<EntityRepository>();
repo.Setup(m => m.EntitySave(It.IsAny<Entity>()));
// act
MyClass c = new MyClass(repo.Object);
c.Save(new Entity());
// assert
repo.Verify(m => m.EntitySave(It.IsAny<Entity>()), Times.Once());
}
Because later on if you do change method's implementation to do more "complex" stuff like:
public void Save(Entity data)
{
if (this.repositoryIocInstance.Exists(data))
{
this.repositoryIocInstance.Update(data);
}
else
{
this.repositoryIocInstance.Create(data);
}
}
...your unit test would fail but it probably wouldn't break your application...
Question
Should I even bother creating unit tests on methods that don't have any return types or don't change anything outside of an internal mock?
Don't forget that unit testing isn't just about testing code. It's about allowing you to determine when behaviour changes.
So you may have something that's trivial. However, your implementation changes and you may have a side effect. You want your regression test suite to tell you.
e.g. Often people say you shouldn't test setters/getters since they're trivial. I disagree, not because they're complicated methods, but because someone may inadvertently change them through ignorance, fat-finger scenarios etc.
Given all that I've just said, I would definitely implement tests for the above (via mocking, and/or perhaps it's worth designing your classes with testability in mind and having them report status etc.)
It's true that your test depends on your implementation, which is something you should avoid (though it is not really that simple sometimes) and which is not necessarily bad. But these kinds of tests are expected to break even when your change doesn't break the code.
You could have many approaches to this:
Create a test that really goes to the database and checks whether the state was changed as expected (it won't be a unit test anymore).
Create a test object that fakes a database and does the operations in memory (another implementation of your repositoryIocInstance), and verify the state was changed as expected; see the sketch after this list. Changes to the repository interface would incur changes to this object as well, but your interfaces shouldn't be changing much, right?
Regard all of this as too expensive, and use your approach, which may lead to unnecessarily breaking tests later (but since the chance is low, it is OK to take the risk).
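A rough sketch of the second option, assuming a repository interface along the lines of the calls shown in the question (the interface and class names here are made up):
using System.Collections.Generic;

// Hypothetical repository interface, modelled on the calls in the question.
public interface IEntityRepository
{
    void EntitySave(Entity data);
}

// In-memory fake: another implementation of the same interface, so the
// test can assert on resulting state instead of on which calls were made.
public class InMemoryEntityRepository : IEntityRepository
{
    public readonly List<Entity> Saved = new List<Entity>();

    public void EntitySave(Entity data)
    {
        Saved.Add(data);
    }
}

// Usage in a test:
// var repo = new InMemoryEntityRepository();
// new MyClass(repo).Save(entity);
// Assert.That(repo.Saved, Does.Contain(entity));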
Ask yourself two questions. "What is the manual equivalent of this unit test?" and "is it worth automating?". In your case it would be something like:
What is manual equivalent?
- start debugger
- step into "Save" method
- step into next, make sure you're inside IRepository.EntitySave implementation
Is it worth automating? My answer is "no". It is 100% obvious from the code.
Out of hundreds of similar wasteful tests, I haven't seen a single one that turned out to be useful.
The general rule of thumb is that you test everything that could plausibly break. If you are sure that the method is simple enough (and stays simple enough) not to be a problem, then leave it out of testing.
The second thing is that you should test the contract of the method, not the implementation. If the test fails after a change, but the application doesn't break, then your test isn't testing the right thing. The test should cover the cases that are important for your application. This should ensure that every change to the method that doesn't break the application also doesn't fail the test.
A method that does not return any result still changes the state of your application. Your unit test, in this case, should be testing whether the new state is as intended.
"your unit test would fail but it probably wouldn't break your application"
This is -- actually -- really important to know. It may seem annoying and trivial, but when someone else starts maintaining your code, they may have made a really bad change to Save and (improbably) broken the application.
The trick is to prioritize.
Test the important stuff first. When things are slow, add tests for trivial stuff.
When there isn't an assertion in a method, you are essentially asserting that exceptions aren't thrown.
I'm also struggling with the question of how to test public void myMethod(). I guess if you do decide to add a return value for testability, the return value should represent all salient facts necessary to see what changed about the state of the application.
public void myMethod()
becomes
public ComplexObject myMethod()
{
    DoLotsOfSideEffects();
    // return all salient facts: rows changed, primary key, value of each column, etc.
    return new ComplexObject { ... };
}
and not
public bool myMethod()
{
    DoLotsOfSideEffects();
    return true;
}
The short answer to your question is: Yes, you should definitely test methods like that.
I assume that it is important that the Save method actually saves the data. If you don't write a unit test for this, then how do you know?
Someone else may come along and remove that line of code that invokes the EntitySave method, and none of the unit tests will fail. Later on, you are wondering why items are never persisted...
In the case of your method, you could argue that anyone deleting that line would only be doing so with malign intent, but the thing is: simple things don't necessarily stay simple, and you had better write the unit tests before things get complicated.
It is not an implementation detail that the Save method invokes EntitySave on the Repository - it is part of the expected behavior, and a pretty crucial part, if I may say so. You want to make sure that data is actually being saved.
Just because a method does not return a value doesn't mean that it isn't worth testing. In general, if you observe good Command/Query Separation (CQS), any void method should be expected to change the state of something.
Sometimes that something is the class itself, but other times, it may be the state of something else. In this case, it changes the state of the Repository, and that is what you should be testing.
This is called testing Indirect Outputs, as opposed to the more usual Direct Outputs (return values).
The trick is to write unit tests so that they don't break too often. When using Mocks, it is easy to accidentally write Overspecified Tests, which is why most Dynamic Mocks (like Moq) default to Stub mode, where it doesn't really matter how many times you invoke a given method.
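As a rough illustration of that last point, using the question's own names: with Moq's default (loose) behaviour only the interaction you explicitly Verify can fail the test, whereas a strict mock turns every call without a matching Setup into a failure and is much easier to over-specify:
// Loose (the default): calls you did not set up just return defaults and are
// ignored; only the interaction you explicitly Verify can fail the test.
var loose = new Mock<EntityRepository>();
new MyClass(loose.Object).Save(new Entity());
loose.Verify(m => m.EntitySave(It.IsAny<Entity>()), Times.Once());

// Strict: every call must match a Setup, so any extra or changed interaction
// fails the test, which is easy to over-specify.
var strict = new Mock<EntityRepository>(MockBehavior.Strict);
strict.Setup(m => m.EntitySave(It.IsAny<Entity>()));
new MyClass(strict.Object).Save(new Entity());
strict.VerifyAll();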
All this, and much more, is explained in the excellent xUnit Test Patterns.
What Makes a Good Unit Test? says that a test should test only one thing. What is the benefit from that?
Wouldn't it be better to write slightly bigger tests that test a bigger block of code? Investigating a test failure is hard anyway, and I don't see how smaller tests help with that.
Edit: The word "unit" is not that important. Let's say I consider the unit a bit bigger. That is not the issue here. The real question is: why write one or more tests for every method, when a few tests that cover many methods would be simpler?
An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.
Testing only one thing will isolate that one thing and prove whether or not it works. That is the idea behind unit testing. There is nothing wrong with tests that test more than one thing, but that is generally referred to as integration testing. Both have merits, depending on context.
To use an example: if your bedside lamp doesn't turn on, and you replace the bulb and switch the extension cord, you don't know which change fixed the issue. You should have done unit testing, and separated your concerns, to isolate the problem.
Update: I read this article and linked articles and I gotta say, I'm shook: https://techbeacon.com/app-dev-testing/no-1-unit-testing-best-practice-stop-doing-it
There is substance here and it gets the mental juices flowing. But I reckon that it jibes with the original sentiment that we should be doing the test that context demands. I suppose I'd just append that to say that we need to get closer to knowing for sure the benefits of different testing on a system and less of a cross-your-fingers approach. Measurements/quantifications and all that good stuff.
I'm going to go out on a limb here and say that the "only test one thing" advice isn't actually as helpful as it's sometimes made out to be.
Sometimes tests take a certain amount of setting up. Sometimes they may even take a certain amount of time to set up (in the real world). Often you can test two actions in one go.
Pro: all that setup occurs only once. Your tests after the first action will prove that the world is how you expect it to be before the second action. Less code, faster test run.
Con: if either action fails, you'll get the same result: the same test will fail. You'll have less information about where the problem is than if you only had a single action in each of two tests.
In reality, I find that the "con" here isn't much of a problem. The stack trace often narrows things down very quickly, and I'm going to make sure I fix the code anyway.
A slightly different "con" here is that it breaks the "write a new test, make it pass, refactor" cycle. I view that as an ideal cycle, but one which doesn't always mirror reality. Sometimes it's simply more pragmatic to add an extra action and check (or possibly just another check to an existing action) in a current test than to create a new one.
Tests that check for more than one thing aren't usually recommended because they are more tightly coupled and brittle. If you change something in the code, it'll take longer to change the test, since there are more things to account for.
[Edit:]
Ok, say this is a sample test method:
[TestMethod]
public void TestSomething() {
// Test condition A
// Test condition B
// Test condition C
// Test condition D
}
If your test for condition A fails, then B, C, and D will appear to fail as well, and won't provide you with any usefulness. What if your code change would have caused C to fail as well? If you had split them out into 4 separate tests, you would know this.
Haaa... unit tests.
Push any "directives" too far and it rapidly becomes unusable.
A single unit test testing a single thing is just as good a practice as a single method doing a single task. But IMHO that does not mean a single test can only contain a single assert statement.
Is
@Test
public void checkNullInputFirstArgument(){...}
@Test
public void checkNullInputSecondArgument(){...}
@Test
public void checkOverInputFirstArgument(){...}
...
better than
@Test
public void testLimitConditions(){...}
is a question of taste in my opinion rather than of good practice. I personally much prefer the latter.
But
@Test
public void doesWork(){...}
is actually what the "directive" wants you to avoid at all cost and what drains my sanity the fastest.
As a final conclusion, group together things that are semantically related and easily testable together, so that a failed test message, by itself, is meaningful enough for you to go directly to the code.
Rule of thumb for a failed test report: if you have to read the test's code first, then your tests are not structured well enough and need to be split into smaller tests.
My 2 cents.
Think of building a car. If you were to apply your theory of just testing big things, then why not make a test that drives the car through a desert? It breaks down. OK, so tell me what caused the problem. You can't. That's a scenario test.
A functional test may be to turn on the engine. It fails. But that could be because of a number of reasons. You still couldn't tell me exactly what caused the problem. We're getting closer though.
A unit test is more specific, and will firstly identify where the code is broken, but it will also (if doing proper TDD) help architect your code into clear, modular chunks.
Someone mentioned using the stack trace. Forget it. That's a second resort. Going through the stack trace, or using the debugger, is a pain and can be time consuming, especially on larger systems with complex bugs.
Good characteristics of a unit test:
Fast (milliseconds)
Independent. It's not affected by or dependent on other tests
Clear. It shouldn't be bloated, or contain a huge amount of setup.
Using test-driven development, you would write your tests first, then write the code to pass the test. If your tests are focused, this makes writing the code to pass the test easier.
For example, I might have a method that takes a parameter. One of the things I might think of first is: what should happen if the parameter is null? It should throw an ArgumentNullException (I think). So I write a test that checks whether that exception is thrown when I pass a null argument. Run the test. Okay, it throws NotImplementedException. I go and fix that by changing the code to throw an ArgumentNullException. Run my test; it passes. Then I think: what happens if it's too small or too big? Ah, that's two tests. I write the too-small case first.
The point is I don't think of the behavior of the method all at once. I build it incrementally (and logically) by thinking about what it should do, then implement code and refactoring as I go to make it look pretty (elegant). This is why tests should be small and focused because when you are thinking about the behavior you should develop in small, understandable increments.
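For instance, that first null-argument step might start with nothing more than a test like this (the OrderParser class and its Parse method are invented for the example):
[Test]
public void Parse_NullInput_ThrowsArgumentNullException()
{
    var parser = new OrderParser();   // hypothetical class under test

    // At this point in the cycle, this is the entire specified behaviour:
    // a null argument must be rejected explicitly.
    Assert.Throws<ArgumentNullException>(() => parser.Parse(null));
}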
Having tests that verify only one thing makes troubleshooting easier. It's not to say you shouldn't also have tests that do test multiple things, or multiple tests that share the same setup/teardown.
Here should be an illustrative example. Let's say that you have a stack class with queries:
getSize
isEmpty
getTop
and methods to mutate the stack
push(anObject)
pop()
Now, consider the following test case for it (I'm using Python-like pseudo-code for this example):
class TestCase():
    def setup(self):
        self.stack = Stack()

    def test(self):
        self.stack.push(1)
        self.stack.push(2)
        self.stack.pop()
        assert self.stack.top() == 1, "top() isn't showing correct object"
        assert self.stack.getSize() == 1, "getSize() call failed"
From this test case, you can determine if something is wrong, but not whether it is isolated to the push() or pop() implementations, or the queries that return values: top() and getSize().
If we add individual test cases for each method and its behavior, things become much easier to diagnose. Also, by doing fresh setup for each test case, we can guarantee that the problem is completely within the methods that the failing test method called.
    def test_size(self):
        assert self.stack.getSize() == 0
        assert self.stack.isEmpty()

    def test_push(self):
        self.stack.push(1)
        assert self.stack.top() == 1, "top returns wrong object after push"
        assert self.stack.getSize() == 1, "getSize wrong after push"

    def test_pop(self):
        self.stack.push(1)
        self.stack.pop()
        assert self.stack.getSize() == 0, "getSize wrong after pop"
As far as test-driven development is concerned. I personally write larger "functional tests" that end up testing multiple methods at first, and then create unit tests as I start to implement individual pieces.
Another way to look at it is unit tests verify the contract of each individual method, while larger tests verify the contract that the objects and the system as a whole must follow.
I'm still using three method calls in test_push, however both top() and getSize() are queries that are tested by separate test methods.
You could get similar functionality by adding more asserts to the single test, but then later assertion failures would be hidden.
If you are testing more than one thing then it is called an Integration test...not a unit test. You would still run these integration tests in the same testing framework as your unit tests.
Integration tests are generally slower, unit tests are fast because all dependencies are mocked/faked, so no database/web service/slow service calls.
We run our unit tests on commit to source control, and our integration tests only get run in the nightly build.
If you test more than one thing and the first thing you test fails, you will not know if the subsequent things you are testing pass or fail. It is easier to fix when you know everything that will fail.
Smaller unit tests make it much clearer where the issue is when they fail.
The glib, but hopefully still useful, answer is that unit = one. If you test more than one thing, then you aren't unit testing.
Regarding your example: If you are testing add and remove in the same unit test, how do you verify that the item was ever added to your list? That is why you need to add and verify that it was added in one test.
Or to use the lamp example: If you want to test your lamp and all you do is turn the switch on and then off, how do you know the lamp ever turned on? You must take the step in between to look at the lamp and verify that it is on. Then you can turn it off and verify that it turned off.
I support the idea that unit tests should only test one thing. I also stray from it quite a bit. Today I had a test where expensive setup seemed to be forcing me to make more than one assertion per test.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
[Test]
public void ShouldHaveCorrectValues()
{
var fees = CallSlowRunningFeeService();
Assert.AreEqual(6.50m, fees.ConvenienceFee);
Assert.AreEqual(2.95m, fees.CreditCardFee);
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
At the same time, I really wanted to see all my assertions that failed, not just the first one. I was expecting them all to fail, and I needed to know what amounts I was really getting back. But, a standard [SetUp] with each test divided would cause 3 calls to the slow service. Suddenly I remembered an article suggesting that using "unconventional" test constructs is where half the benefit of unit testing is hidden. (I think it was a Jeremy Miller post, but can't find it now.) Suddenly [TestFixtureSetUp] popped to mind, and I realized I could make a single service call but still have separate, expressive test methods.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
Fees fees;
[TestFixtureSetUp]
public void FetchFeesMessageFromService()
{
fees = CallSlowRunningFeeService();
}
[Test]
public void ShouldHaveCorrectConvenienceFee()
{
Assert.AreEqual(6.50m, fees.ConvenienceFee);
}
[Test]
public void ShouldHaveCorrectCreditCardFee()
{
Assert.AreEqual(2.95m, fees.CreditCardFee);
}
[Test]
public void ShouldHaveCorrectChangeFee()
{
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
There is more code in this test, but it provides much more value by showing me all the values that don't match expectations at once.
A colleague also pointed out that this is a bit like Scott Bellware's specunit.net: http://code.google.com/p/specunit-net/
Another practical disadvantage of very granular unit testing is that it breaks the DRY principle. I have worked on projects where the rule was that each public method of a class had to have a unit test (a [TestMethod]). Obviously this added some overhead every time you created a public method but the real problem was that it added some "friction" to refactoring.
It's similar to method level documentation, it's nice to have but it's another thing that has to be maintained and it makes changing a method signature or name a little more cumbersome and slows down "floss refactoring" (as described in "Refactoring Tools: Fitness for Purpose" by Emerson Murphy-Hill and Andrew P. Black. PDF, 1.3 MB).
Like most things in design, there is a trade-off that the phrase "a test should test only one thing" doesn't capture.
When a test fails, there are three options:
The implementation is broken and should be fixed.
The test is broken and should be fixed.
The test is not anymore needed and should be removed.
Fine-grained tests with descriptive names help the reader to know why the test was written, which in turn makes it easier to know which of the above options to choose. The name of the test should describe the behaviour which is being specified by the test - and only one behaviour per test - so that just by reading the names of the tests the reader will know what the system does. See this article for more information.
On the other hand, if one test is doing lots of different things and it has a non-descriptive name (such as tests named after methods in the implementation), then it will be very hard to find out the motivation behind the test, and it will be hard to know when and how to change the test.
Here is what it can look like (with GoSpec), when each test tests only one thing:
func StackSpec(c gospec.Context) {
stack := NewStack()
c.Specify("An empty stack", func() {
c.Specify("is empty", func() {
c.Then(stack).Should.Be(stack.Empty())
})
c.Specify("After a push, the stack is no longer empty", func() {
stack.Push("foo")
c.Then(stack).ShouldNot.Be(stack.Empty())
})
})
c.Specify("When objects have been pushed onto a stack", func() {
stack.Push("one")
stack.Push("two")
c.Specify("the object pushed last is popped first", func() {
x := stack.Pop()
c.Then(x).Should.Equal("two")
})
c.Specify("the object pushed first is popped last", func() {
stack.Pop()
x := stack.Pop()
c.Then(x).Should.Equal("one")
})
c.Specify("After popping all objects, the stack is empty", func() {
stack.Pop()
stack.Pop()
c.Then(stack).Should.Be(stack.Empty())
})
})
}
The real question is: why write one or more tests for every method, when a few tests that cover many methods would be simpler?
Well, so that when some test fails, you know which method is failing.
When you have to repair a non-functioning car, it is easier when you know which part of the engine is failing.
An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.
Let's suppose that the addition method is broken and does not add, and that the removal method is broken and does not remove. Your test would check that the list, after an addition and a removal, has the same size as initially. Your test would pass, even though both of your methods would be broken.
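To make that concrete, here is a small C# sketch with a deliberately broken list: the combined test stays green even though both methods are wrong, while a focused test catches the bug.
using System.Collections.Generic;

public class BrokenList
{
    private readonly List<int> _items = new List<int>();

    public int Count => _items.Count;

    public void Add(int item) { /* bug: forgets to add */ }
    public void Remove(int item) { /* bug: forgets to remove */ }
}

[Test]
public void AddThenRemove_LeavesCountUnchanged()   // passes even though nothing works
{
    var list = new BrokenList();
    int countBefore = list.Count;

    list.Add(42);
    list.Remove(42);

    Assert.That(list.Count, Is.EqualTo(countBefore));
}

[Test]
public void Add_IncreasesCountByOne()              // this one catches the broken Add
{
    var list = new BrokenList();

    list.Add(42);

    Assert.That(list.Count, Is.EqualTo(1));
}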
Disclaimer: This is an answer highly influenced by the book "xUnit Test Patterns".
Testing only one thing at each test is one of the most basic principles that provides the following benefits:
Defect Localization: If a test fails, you immediately know why it failed (ideally without further troubleshooting, if you've done a good job with the assertions used).
Test as a specification: the tests are not only there as a safety net, but can easily be used as specification/documentation. For instance, a developer should be able to read the unit tests of a single component and understand the API/contract of it, without needing to read the implementation (leveraging the benefit of encapsulation).
Feasibility of TDD: TDD is based on having small chunks of functionality and completing progressive iterations of (write failing test, write code, verify the test succeeds). This process gets highly disrupted if a test has to verify multiple things.
Lack of side effects: Somewhat related to the first one, but when a test verifies multiple things, it's more likely that it will be tied to other tests as well. So, these tests might need to have a shared test fixture, which means that one will be affected by the other one. So, eventually you might have a test failing, but in reality another test is the one that caused the failure, e.g. by changing the fixture data.
I can only see a single reason why you might benefit from having a test that verifies multiple things, but this should be seen as a code smell actually:
Performance optimisation: There are some cases where your tests are not running only in memory, but also depend on persistent storage (e.g. databases). In some of these cases, having a test verify multiple things might help decrease the number of disk accesses, thus decreasing the execution time. However, unit tests should ideally be executable only in memory, so if you stumble upon such a case, you should reconsider whether you are going down the wrong path. All persistent dependencies should be replaced with mock objects in unit tests. End-to-end functionality should be covered by a different suite of integration tests. That way, you do not need to care about execution time anymore, since integration tests are usually executed by build pipelines and not by developers, so a slightly higher execution time has almost no impact on the efficiency of the software development lifecycle.
Consider code like this:
#include <vector>

using std::vector;

struct Contained { /* element type; details omitted */ };

class ToBeTested {
public:
    void doForEach() {
        for (vector<Contained>::iterator it = m_contained.begin(); it != m_contained.end(); ++it) {
            doOnce(*it);
            doTwice(*it);
            doTwice(*it);
        }
    }

    void doOnce(Contained& c) {
        // do something
    }

    void doTwice(Contained& c) {
        // do something
    }

    // other methods
private:
    vector<Contained> m_contained;
};
I want to test that if I fill the vector with 3 values, my functions will be called in the proper order and quantity. For example, my test could look something like this:
tobeTested.AddContained(one);
tobeTested.AddContained(two);
tobeTested.AddContained(three);
BEGIN_PROC_TEST()
SHOULD_BE_CALLED(doOnce, 1)
SHOULD_BE_CALLED(doTwice, 2)
SHOULD_BE_CALLED(doOnce, 1)
SHOULD_BE_CALLED(doTwice, 2)
SHOULD_BE_CALLED(doOnce, 1)
SHOULD_BE_CALLED(doTwice, 2)
tobeTested.doForEach()
END_PROC_TEST()
How do you recommend testing this? Is there any way to do this with the CppUnit or GoogleTest frameworks? Maybe some other unit test framework allows performing such tests?
I understand that this is probably impossible without calling some debug functions from these functions, but can it at least be done automatically in some test framework? I don't want to scan trace logs and check their correctness.
UPD: I'm trying to check not only the state of an object, but also the execution order, to catch performance issues at the earliest possible stage (and in general I want to know that my code is executed exactly as I expected).
You should be able to use any good mocking framework to verify that calls to a collaborating object are done in a specific order.
However, you don't generally test that one method makes some calls to other methods on the same class... why would you?
Generally, when you're testing a class, you only care about testing its publicly visible state. If you test
anything else, your tests will prevent you from refactoring later.
I could provide more help, but I don't think your example is consistent (Where is the implementation for the AddContained method?).
If you're interested in performance, I recommend that you write a test that measures performance.
Check the current time, run the method you're concerned about, then check the time again. Assert that the total time taken is less than some value.
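A minimal sketch of that idea in C# with NUnit; the class name, the way it gets its test data, and the 50 ms budget are all assumptions:
[Test]
public void DoForEach_StaysWithinTimeBudget()
{
    var subject = new ToBeTested();              // assumes some way to load test data
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();

    subject.DoForEach();

    stopwatch.Stop();
    // The 50 ms budget is arbitrary; pick a threshold that reflects a real requirement.
    Assert.That(stopwatch.ElapsedMilliseconds, Is.LessThan(50));
}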
The problem with checking that methods are called in a certain order is that your code is going to have to change, and you don't want to have to update your tests when that happens. You should focus on testing the actual requirement instead of testing the implementation detail that meets that requirement.
That said, if you really want to test that your methods are called in a certain order, you'll need to do the following:
Move them to another class, call it Collaborator
Add an instance of this other class to the ToBeTested class
Use a mocking framework to set the instance variable on ToBeTested to be a mock of the Collaborator class
Call the method under test
Use your mocking framework to assert that the methods were called on your mock in the correct order.
I'm not a native cpp speaker so I can't comment on which mocking framework you should use, but I see some other commenters have added their suggestions on this front.
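For what it's worth, in C# with Moq those steps could look roughly like the sketch below; the ICollaborator interface and the simplified Processor class are stand-ins for the extracted collaborator and the class under test:
public interface ICollaborator
{
    void DoOnce(int value);
    void DoTwice(int value);
}

public class Processor                       // simplified stand-in for the class under test
{
    private readonly ICollaborator _collaborator;
    public Processor(ICollaborator collaborator) { _collaborator = collaborator; }

    public void Run(int value)
    {
        _collaborator.DoOnce(value);
        _collaborator.DoTwice(value);
    }
}

[Test]
public void Run_CallsDoOnceBeforeDoTwice()
{
    var collaborator = new Mock<ICollaborator>(MockBehavior.Strict);
    var sequence = new MockSequence();

    // With a strict mock, a call that arrives out of order (or was never
    // set up at all) fails the test immediately.
    collaborator.InSequence(sequence).Setup(c => c.DoOnce(It.IsAny<int>()));
    collaborator.InSequence(sequence).Setup(c => c.DoTwice(It.IsAny<int>()));

    new Processor(collaborator.Object).Run(42);

    collaborator.VerifyAll();
}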
You could check out mockpp.
Instead of trying to figure out how many functions were called, and in what order, find a set of inputs that can only produce an expected output if you call things in the right order.
Some mocking frameworks allow you to set up ordered expectations, which lets you say exactly which function calls you expect in a certain order. For example, RhinoMocks for C# allows this.
I am not a C++ coder so I'm not aware of what's available for C++, but that's one type of tool that might allow what you're trying to do.
http://msdn.microsoft.com/en-au/magazine/cc301356.aspx
This is a good article about Context Bound Objects. It contains some fairly advanced stuff, but if you are not lazy and really want to understand this kind of thing, it will be really helpful.
At the end you will be able to write something like:
[CallTracingAttribute()]
public class TraceMe : ContextBoundObject
{...}
You could use ACE (or similar) debug frameworks, and in your test, configure the debug object to stream to a file. Then you just need to check the file.