OK, I know what a unit test is, but I use it in some projects and not others... some clients don't know how it's done and follow one convention or another... blah blah.
So here I am asking: how EXACTLY is a unit testing process created?
I hear, and read, that you write the tests first, then the functionality, then tests for that new functionality, and that you use code coverage to identify any "slips" where code has not been covered by tests.
So, lets use a simple example:
Requirement: "application must return the result of 2 numbers combined."
You and I know we would have a class, something like "Addition", and a method "Add" which returns an integer, like so:
public class Addition
{
public int Add(int num1, int num2)
{
return num1 + num2;
}
}
But even before writing this class, how do you write tests first? What is your process? What do you do? What would the process be when you have that spec doc and are going into development?
Many thanks,
The process you're referring to is called Test-Driven Development. The idea is simple and close to what you described: given a piece of functionality, you start writing code by writing a test for that functionality. In your Add example, before any working code is written you should have a simple test - a test that fails.
Failing Test
[Test]
public void TestAdd()
{
var testedClass = new Addition();
var result = testedClass.Add(1, 2);
Assert.AreEqual(3, result);
}
This is a simple test for your .Add method, stating your expectations of the soon-to-be working code. Since you don't have any code just yet, this test will naturally fail (as it is supposed to - which is good).
Passing test
The next step is to write the most basic code that makes the test pass (strictly speaking, the most basic code is return 3;, but for this simple example that level of detail is not necessary):
public int Add(int num1, int num2)
{
return num1 + num2;
}
This works and the test passes. What you have at this point is basic proof that your method works the way you stated in your assumptions/expectations (the test).
However, you might notice that this is not a good test; it covers only one simple input out of many. Not to mention that in some cases one test might not be enough: even though you had initial requirements, testing might reveal more is needed (for example argument validation or logging). This is the point where you go back to reviewing your requirements and writing tests, which leads us to...
Refactor
At this point you should refactor the code you just wrote - and I'm talking about both the unit test code and the tested implementation. Since the Add method is fairly simple and there's not much you can improve in terms of adding two numbers, you can focus on making the test better. For example:
add more test cases (or consider data driven testing)
make test name more descriptive
improve variables naming
extract magic numbers to constants
Like this:
[TestCase(0, 0, 0)]
[TestCase(1, 2, 3)]
[TestCase(1, 99, 100)]
public void Add_ReturnsSumOfTwoNumbers(int first, int second, int expectedSum)
{
var testedClass = new Addition();
var actualSum = testedClass.Add(first, second);
Assert.That(actualSum, Is.EqualTo(expectedSum));
}
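If reviewing the requirements later turned up something new - say, rejecting negative inputs (a purely hypothetical rule for this Add example) - it would enter the cycle the same way, as a failing test first. A minimal sketch, assuming NUnit:
[Test]
public void Add_ThrowsOnNegativeInput()
{
    var testedClass = new Addition();

    // Hypothetical requirement: negative numbers are rejected.
    // This stays red until Add is extended with argument validation.
    Assert.Throws<ArgumentOutOfRangeException>(() => testedClass.Add(-1, 5));
}
Only then would the validation code be written to turn it green, followed by another refactor pass.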
Refactoring is a topic worth its own book (and there are many), so I won't go into details. The process we just went through is often referred to as Red-Green-Refactor (red indicating a failing test, green a passing one), and it's part of TDD. Remember to rerun the tests one more time to make sure the refactoring didn't accidentally break anything.
This is how the basic cycle goes: you start with a requirement, write a failing test for it, write code to make the test pass, refactor both code and tests, review the requirements, and proceed with the next task.
Where to go from here?
A few useful resources that make a good natural follow-up once you know the idea behind TDD (even from a brief explanation like the one in this post):
Test-Driven Development by Example - an introduction to TDD by Kent Beck himself
Test-Driven Development in Microsoft .NET - a great walkthrough, especially if you're new to TDD. It explains the concepts really well and contains a comprehensive example of application development with TDD (including data access, web services and more)
Roy Osherove's TDD Kata - videos of people developing a simple calculator API with TDD; many different languages, many different frameworks (definitely worth watching)
Related
As far as I understand, the TDD and BDD cycle is something like this:
Start by writing tests
See them fail
Write code
Pass the tests
Repeat
The question is how do you write tests before you have any code? Should I create some kind of class skeletons or interfaces? Or have I misunderstood something?
You have the essence of it, but I would change one part of your description. You don't write tests before you write code - you write a test before you write code. Then - before writing any more tests - you write just enough code to get your test to pass. When it's passing, you look for opportunities to improve the code, and make the improvements while keeping your tests passing - and then you write your second test. The point is, you're focusing on one tiny bit of functionality at any given time. What is the next thing you want your program to do? Write a test for that, and nothing more. Get that test passing. Clean the code. What's the next thing you want it to do? Iterate until you're happy.
The thing is, if you write all your tests before writing any code, you don't have that focus. It's one test at a time.
Yes, that is correct. If you check out Michael Hartl's book on Ruby on Rails (free for HTML viewing), you will see how he does this specifically. So to add on to what lared said, let's say your first job is to add a new button to a web page. Your process would look like this:
Write a test to look for the button on the page visually.
Verify that the test fails (there should not be a button present, therefore it should fail).
Write code to place button on the page.
Verify test passes.
TDD will save your bacon when you accidentally do something to your code that breaks an old test. For example, you change the button to a link accidentally. The test will fail and alert you to the problem.
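In C# the same idea might be sketched with Selenium WebDriver and NUnit (a rough sketch, not the book's approach; the URL, the "submit-button" id and the page itself are made-up assumptions):
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class ButtonPresenceTest
{
    [Test]
    public void PageShouldContainSubmitButton()
    {
        // Hypothetical page and element id - adjust to the app under test.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl("http://localhost:5000/");

            // Red while the button doesn't exist yet; green once step 3 adds it.
            var buttons = driver.FindElements(By.Id("submit-button"));
            Assert.That(buttons.Count, Is.EqualTo(1));
        }
    }
}
Strictly speaking this is a UI/integration test rather than a unit test, but it illustrates the red-green rhythm of the four steps above.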
If you are using a real programming language (you know, with a compiler and all), then yes, of course you have to write class skeletons or interfaces; otherwise your tests will not even compile.
If you are using a scripting language, then you do not even have to write skeletons or interfaces, because your test script will happily begin to run and will fail on the first non-existent class or method that it encounters.
The question is how do you write tests before you have any code? Should I create some kind of class skeletons or interfaces? Or have I misunderstood something?
To expand on a point that lared made in his comment:
Then you write tests, which fail because the classes/whatever doesn't exist, and then you write the minimal amount of code which makes them pass
One thing to remember with TDD is that the test you are writing is the first client of your code. Therefore I wouldn't worry about not having the classes or interfaces defined already - because as he pointed out, simply by writing code referencing classes that don't exist, you will get your first "Red" in the cycle - namely, your code won't compile! That's a perfectly valid test.
TDD can also mean Test Driven Design
Once you embrace this idea, you will find that writing the test first serves less as a simple "is this code correct" and more of a "is this code right" guideline, so you'll find that you actually end up producing production code that is not only correct, but well structured as well.
Now a video showing this process would be super, but I don't have one but I'll make a stab at an example. Note this is a super simple example and ignores up-front pencil and paper planning / real world requirements from business, which will often be the driving force behind your design process.
Anyway suppose we want to create a simple Person object that can store a person's name and age. And we'd like to do this via TDD so we know it's correct.
So we think about it for a minute, and write our first test (note: example using pseudo C# / pseudo test framework)
public void GivenANewPerson_TheirNameAndAgeShouldBeAsExpected()
{
var sut = new Person();
Assert.Empty(sut.Name);
Assert.Zero(sut.Age);
}
Straight away we have a failing test, this won't compile because the Person class doesn't exist. So you use your IDE to auto-create the class for you:
public class Person
{
public int Age {get;set;}
public string Name {get;set;}
}
OK, now you have a first passing test. But now as you look at that class, you realise that there is nothing to ensure a person's age is always positive (>0). Let's assert that this is the case:
public void GivenANegativeAgeValue_PersonWillRejectIt()
{
var sut = new Person();
Assert.CausesException(sut.Age = -100);
}
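(As an aside: with a real framework such as NUnit, that pseudo-assertion would typically wrap the assignment in a lambda - a sketch, assuming NUnit:)
[Test]
public void GivenANegativeAgeValue_PersonWillRejectIt()
{
    var sut = new Person();

    // Real-framework equivalent of the pseudo Assert.CausesException above.
    Assert.Throws<InvalidOperationException>(() => sut.Age = -100);
}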
Well, that test fails so let's fix up the class:
public class Person
{
protected int age;
public int Age
{
get{return age;}
set{
if(value<=0)
{
throw new InvalidOperationException("Age must be a positive number");
}
age=value;
}
}
public string Name {get;set;}
}
But now you might say to yourself: OK, since I know that a person's age can never be <= 0, why even bother creating a writable property? Do I always want to have to write two statements, one to create a Person and another to set their Age? What if I forgot to do it in one part of my code? What if I created a Person in one part of my code, and then later on, in another module, tried to assign a negative value to Age? Surely Age must be an invariant of Person, so let's fix this up:
public class Person
{
public Person(int age){
if (age<=0){
throw new InvalidOperationException("Age must be a positive number");
}
this.Age = age;
}
public int Age {get;protected set;}
public string Name {get;set;}
}
And of course you have to fix your tests because they won't compile any more - and in fact now you realise that the second test is redundant and can be dropped!
public void GivenANewPerson_TheirNameAndAgeShouldBeAsExpected()
{
var sut = new Person(42);
Assert.Empty(sut.Name);
Assert.Equal(42, sut.Age);
}
And you will then probably go through a similar process with Name, and so on.
Now I know this seems like a terribly long-winded way of creating a class, but consider that you have basically designed this class from scratch with built-in defences against invalid state - for example you will never ever have to debug code like this:
//A Person instance, 6,000 lines and 3 modules away from where it was instantiated
john.Age = x; //Crash because x is -42
or
//A Person instance, reserialised from a message queue in another process
var someValue = 2015/john.Age; //DivideByZeroException because we forgot to assign john's age
For me, this is one of the key benefits of TDD: using it not only as a testing tool but as a design tool that makes you think about the production code you are implementing, forces you to consider how the classes you create could end up in invalid, application-killing states and how to guard against that, and helps you write objects that are easy to use and don't require their consumers to understand how they work, only what they do.
Since any modern IDE worth its salt will let you create missing classes / interfaces with a couple of keystrokes or mouse clicks, I believe it's well worth trying this approach.
TDD and BDD are different things that share a common mechanism: you write something that 'tests' something before you write the thing that does it. You then use the failures to guide/drive the development.
You write the tests by thinking about the problem you are trying to solve, and fleshing out the details by pretending that you have an ideal solution that you can test. You write your test to use your ideal solution. Doing this does all sorts of things like:
Discover names of things you need for your solution
Uncover interfaces for your things to make them easy to use
Experience failures with your things
...
A difference between BDD and TDD is that BDD is much more focused on the 'what' and the 'why', rather than the 'how'. BDD is very concerned with the appropriate use of language to describe things. BDD starts at a higher level of abstraction. When you get to areas where the detail overwhelms the language, TDD is used as a tool to implement the detail.
This idea that you can choose to think of things and write about them at different levels of abstraction is key.
You write the 'tests' you need by choosing:
the appropriate language for your problem
the appropriate level of abstraction to explain your problem simply and clearly
an appropriate mechanism to call your functionality.
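To make the BDD/TDD difference concrete, here is a rough, hypothetical C# sketch (NUnit assumed; the Account class and its members are invented for illustration). The first test describes behaviour in domain language, BDD-style; the second pins down a single implementation detail, TDD-style:
using System;
using NUnit.Framework;

// Minimal, made-up domain class, only here to make the sketch self-contained.
public class Account
{
    public decimal Balance { get; private set; }
    public Account(decimal initialBalance) { Balance = initialBalance; }
    public void Withdraw(decimal amount)
    {
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds");
        Balance -= amount;
    }
}

public class AccountSpecs
{
    // BDD-flavoured: the name describes behaviour in domain language (the "what" and "why").
    [Test]
    public void GivenAnAccountInCredit_WhenTheCustomerWithdraws_ThenTheBalanceIsReduced()
    {
        var account = new Account(initialBalance: 100m);
        account.Withdraw(30m);
        Assert.That(account.Balance, Is.EqualTo(70m));
    }

    // TDD-flavoured: the name pins down one implementation detail (the "how").
    [Test]
    public void Withdraw_ThrowsWhenAmountExceedsBalance()
    {
        var account = new Account(initialBalance: 10m);
        Assert.Throws<InvalidOperationException>(() => account.Withdraw(50m));
    }
}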
I'm talking about Uncle Bob's rules of TDD:
You are not allowed to write any production code unless it is to make a failing unit test pass.
You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
My problem is: what happens when you need to build a feature that can produce more than one result, and in the first iteration you implement code that already satisfies every scenario?
I wrote code like that once, because it was the only solution that came to mind.
I'd say I didn't violate any of these 3 rules.
I wrote a test with the least possible conditions needed to fail.
Then I implemented the feature with just enough code to pass the test (and that was the only solution I could come up with, so I'd say it was the least code I could have written).
And then I wrote the next test, only to discover that it was already passing.
Now what about the rules ?
Am I not allowed to write this test even if the feature is a soooper important one? Or should I roll back and start over?
I'd also mention that this method can't really be refactored based on the result or the data fed to it. Right now the example situation I can think of is a little stupid, but please bear with me. Take a look at this, for example:
I want to create a method to add numbers; this is pretty much all I can do. Write a failing test:
public function it_can_add_numbers()
{
$this->add(2, 3)->shouldReturn(5);
}
Then make it pass:
public function add($numberOne, $numberTwo)
{
return $numberOne + $numberTwo;
}
Now one could argue that I should have returned 5 in the first iteration, because that was enough to pass the test and to introduce a regression check as well; but this is not the actual problem, so please bear with me and suppose this is the only solution one could think of.
Now my company wants me to make sure that we are able to add 12 and 13, because these are internal magic numbers that we'll be using quite a lot. I go ahead and write another test, because this is how I'm supposed to verify a feature.
public function it_can_add_twelve_and_thirteen()
{
$this->add(12, 13)->shouldReturn(25);
}
It turns out the test is already passing.
At this point, I can choose not to write the test, but what if at a later time someone changes the actual code to this:
public function add($numberOne, $numberTwo)
{
return 5;
}
The test will still pass but the feature is not there.
So what about those situations where you can't immediately think of a possible flaw to introduce in the first iteration before making improvements? Should I leave it as is and wait for someone to come over and screw it up? Should I leave this case for regression tests?
To be true to rule number 3 of Uncle Bob's rules, after writing the test:
public function it_can_add_numbers()
{
$this->add(2, 3)->shouldReturn(5);
}
The correct code would be:
public function add($numberOne, $numberTwo)
{
return 5;
}
Now, when you add the second test, it will fail, which will make you change the code to comply with both tests, and then refactor it to be DRY, resulting in:
public function add($numberOne, $numberTwo)
{
return $numberOne + $numberTwo;
}
I won't say that it is the "only true way" to code, and @Oli and @Leo have a point that you should not stop thinking like a programmer because you are TDDing, but the above process is an example of following the 3 TDD rules you stated...
If you write a passing test, you are not doing TDD.
That is not to say that the test has no value, but it says very clearly that your new test is not "driving design" (or "driving development", the "other" DD). You may need the new test for regression or to satisfy management or to bring up some code coverage metric, but you don't need it to drive your design or your development.
You did violate Uncle Bob's third rule, insofar as you wrote more logic than was required to pass the test you had.
It's OK to break the rules; it's just not OK to break the rules and say you're not breaking them. If you want to rigidly adhere to TDD as Uncle Bob defines it, you need to follow his rules.
If the feature is really simple, like the addition example, I would not worry about implementing it with just one test. In more complex, real-world situations, implementing the entire solution at once may leave you with poorly structured code because you don't get the chance to evolve the design and refactor the code. TDD is a design technique that helps you write better code and requires you to be constantly THINKING, not just following rules by rote.
TDD is a best practice, not a religion. If your case is trivial/obvious, just write the right code instead of doing some artificial, useless refactoring.
Given the following SUT, would you consider this unit test to be unnecessary?
** edit: we cannot assume the names will match, so reflection wouldn't work.
** edit 2: in actuality, this class would implement an IMapper interface and there would be full-blown behavioral (mock) testing at the business logic layer of the application. This test just happens to be the lowest level of testing that must be state-based. I question whether this test is truly necessary, because the test code is almost identical to the source code itself, and based on actual experience I don't see how this unit test makes maintenance of the application any easier.
//SUT
public class Mapper
{
public void Map(DataContract from, DataObject to)
{
to.Value1 = from.Value1;
to.Value2 = from.Value2;
....
to.Value100 = from.Value100;
}
}
//Unit Test
public class MapperTest
{
    [Test]
    public void Map_CopiesAllValues()
    {
        DataContract contract = new DataContract() { ... };
        DataObject dataObject = new DataObject() { ... };
        Mapper mapper = new Mapper();
        mapper.Map(contract, dataObject);
        Assert.AreEqual(contract.Value1, dataObject.Value1);
        ...
        Assert.AreEqual(contract.Value100, dataObject.Value100);
    }
}
I would question the construct itself, not the need to test it
[reflection would be far less code]
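For what it's worth, a reflection-based version might look roughly like the sketch below. Note that it assumes the property names match on both sides - which the question's first edit rules out for the real code - and that BuildFullyPopulatedContract() is a hypothetical helper, so treat this purely as an illustration of "far less code":
[Test]
public void Map_CopiesAllSameNamedProperties()
{
    var contract = BuildFullyPopulatedContract(); // hypothetical helper that fills every property
    var dataObject = new DataObject();

    new Mapper().Map(contract, dataObject);

    // Compare every readable property on the contract with the
    // identically named property on the target, via reflection.
    foreach (var sourceProp in typeof(DataContract).GetProperties())
    {
        var targetProp = typeof(DataObject).GetProperty(sourceProp.Name);
        Assert.That(targetProp.GetValue(dataObject, null),
                    Is.EqualTo(sourceProp.GetValue(contract, null)),
                    "Mismatch on " + sourceProp.Name);
    }
}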
I'd argue that it is necessary.
However, it would be better as 100 separate unit tests, each checking one value.
That way, when something goes wrong with Value65, you can run the tests and immediately find that Value65 and Value66 are being transposed.
Really, it's this kind of simple code where you switch your brain off and forget that errors happen. Having tests in place means you pick them up, not your customers.
However, if you have a class with 100 properties all named ValueXXX, then you might be better off using an Array or a List.
It is not excessive. I'm just not sure it fully focuses on what you want to test.
"Under the strict definition, for QA purposes, the failure of a UnitTest implicates only one unit. You know exactly where to search to find the bug."
The power of a unit test is in having a known, correct resulting state; the focus should be the values assigned to DataContract. Those are the bounds we want to push, to ensure that all possible values for DataContract can be successfully copied into DataObject. DataContract must be populated with edge-case values.
PS. David Kemp is right: 100 well-designed tests would be the most true to the concept of unit testing.
Note: for this test we must assume that DataContract populates perfectly when built (that requires separate tests).
It would be better if you could test at a higher level, i.e. the business logic that requires you to create the Mapper.Map() function.
Not if this was the only unit test of this kind in the entire app. However, the second another like it showed up, you'd see me scrunch my eyebrows and start thinking about reflection.
Not excessive.
I agree the code looks strange but that said:
The beauty of a unit test is that once it's done it's there forever, so if anyone for any reason decides to change that implementation to something more "clever", the test should still pass - so it's not a big deal.
I personally would probably have a Perl script generate the code, as I would get bored replacing the numbers in each assert and would probably make some mistakes along the way; the Perl script (or whatever script) would be faster for me.
I'm basically trying to teach myself how to code and I want to follow good practices. There are obvious benefits to unit testing. There is also much zealotry when it comes to unit-testing and I prefer a much more pragmatic approach to coding and life in general. As context, I'm currently writing my first "real" application which is the ubiquitous blog engine using asp.net MVC. I'm loosely following the MVC Storefront architecture with my own adjustments. As such, this is my first real foray into mocking objects. I'll put the code example at the end of the question.
I'd appreciate any insight or outside resources that I could use to increase my understanding of the fundamentals of testing and mocking. The resources I've found on the net are typically geared towards the "how" of mocking and I need more understanding of the where, why and when of mocking. If this isn't the best place to ask this question, please point me to a better place.
I'm trying to understand the value that I'm getting from the following tests. The UserService is dependent upon the IUserRepository. The value of the service layer is to separate your logic from your data storage, but in this case most of the UserService calls are just passed straight to IUserRepository. The fact that there isn't much actual logic to test could be the source of my concerns as well. I have the following concerns.
It feels like the code is just testing that the mocking framework is working.
Mocking out the dependencies forces my tests to have too much knowledge of the IUserRepository implementation. Is this a necessary evil?
What value am I actually gaining from these tests? Is the simplicity of the service under test causing me to doubt their value?
I'm using NUnit and Rhino.Mocks, but it should be fairly obvious what I'm trying to accomplish.
[SetUp]
public void Setup()
{
userRepo = MockRepository.GenerateMock<IUserRepository>();
userSvc = new UserService(userRepo);
theUser = new User
{
ID = null,
UserName = "http://joe.myopenid.com",
EmailAddress = "joe@joeblow.com",
DisplayName = "Joe Blow",
Website = "http://joeblow.com"
};
}
[Test]
public void UserService_can_create_a_new_user()
{
// Arrange
userRepo.Expect(repo => repo.CreateUser(theUser)).Return(true);
// Act
bool result = userSvc.CreateUser(theUser);
// Assert
userRepo.VerifyAllExpectations();
Assert.That(result, Is.True,
"UserService.CreateUser(user) failed when it should have succeeded");
}
[Test]
public void UserService_can_not_create_an_existing_user()
{
// Arrange
userRepo.Stub(repo => repo.IsExistingUser(theUser)).Return(true);
userRepo.Expect(repo => repo.CreateUser(theUser)).Return(false);
// Act
bool result = userSvc.CreateUser(theUser);
// Assert
userRepo.VerifyAllExpectations();
Assert.That(result, Is.False,
"UserService.CreateUser() allowed multiple copies of same user to be created");
}
Essentially what you are testing here is that the methods are getting called, not whether or not they actually work. Which is what mocks are supposed to do. Instead of calling the method, they just check to see if the method got called, and return whatever is in the Return() statement. So in your assertion here:
Assert.That(result, Is.False, "error message here");
This assertion will ALWAYS succeed because your expectation will ALWAYS return false, because of the Return statement:
userRepo.Expect(repo => repo.CreateUser(theUser)).Return(false);
I'm guessing this isn't that useful in this case.
Where mocking is useful is when you want to, for example, make a database call somewhere in your code, but you don't want to actually call the database. You want to pretend that the database got called, but you want to set up some fake data for it to return, and then (here's the important part) test the logic that does something with the fake data your mock returned. In the above examples you are omitting that last step. Imagine you had a method that displayed a message to the user saying whether the new user was created:
public string displayMessage(bool userWasCreated) {
if (userWasCreated)
return "User created successfully!";
return "User already exists";
}
then your test would be
userRepo.Expect(repo => repo.CreateUser(theUser)).Return(false);
Assert.AreEqual("User already exists", displayMessage(userSvc.CreateUser(theUser)));
Now this has some value, because you are testing some actual behavior. Of course, you could also just test this directly by passing in "true" or "false." You don't even need a mock for that test. Testing expectations is fine, but I've written plenty of tests like that, and have come to the same conclusion that you are reaching - it just isn't that useful.
So in short, mocking is useful when you want to abstract away externalities, like databases, or webservice calls, etc, and inject known values at that point. But it's not often useful to test mocks directly.
You are right: the simplicity of the service makes these tests uninteresting. It is not until you get more business logic in the service, that you will gain value from the tests.
You might consider some tests like these:
CreateUser_fails_if_email_is_invalid()
CreateUser_fails_if_username_is_empty()
Another comment: it looks like a code smell that your methods return booleans to indicate success or failure. You might have a good reason to do it, but usually you should let exceptions propagate out. It also makes it harder to write good tests, since you will have problems detecting whether your method failed for the "right reason"; for example, you might write the CreateUser_fails_if_email_is_invalid() test like this:
[Test]
public void CreateUser_fails_if_email_is_invalid()
{
bool result = userSvc.CreateUser(userWithInvalidEmailAddress);
Assert.That(result, Is.False);
}
and it would probably work with your existing code. Using the TDD Red-Green-Refactor cycle would mitigate this problem, but it would be even better to be able to actually detect that the method failed because of the invalid email, and not because of some other problem.
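For illustration, if CreateUser threw instead of returning a bool, the failure reason would be explicit in the test. A sketch only - InvalidEmailException is a made-up exception type, and userSvc / userWithInvalidEmailAddress are the fixtures from the question:
[Test]
public void CreateUser_fails_if_email_is_invalid()
{
    // Passes only when the method rejects the user for this specific reason.
    Assert.Throws<InvalidEmailException>(
        () => userSvc.CreateUser(userWithInvalidEmailAddress));
}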
If you write your tests before you write your code, you'll gain much more from your unit tests. One of the reasons that it feels like your tests aren't worth much is that you're not deriving the value of having your tests drive the design. Writing your tests afterwards is mostly just an exercise in seeing if you can remember everything that can go wrong. Writing your tests first causes you to think about how you would actually implement the functionality.
These tests aren't all that interesting because the functionality being implemented is pretty basic. The way you are going about mocking seems pretty standard - mock the things the class under test depends on, not the class under test. Testability (or good design sense) has already led you to implement interfaces and use dependency injection to reduce coupling. You might want to think about changing the error handling, as others have suggested. It would be nice to know why CreateUser failed, for instance, if only to improve the quality of your tests. You could do this with exceptions or with an out parameter (which is how MembershipProvider works, if I remember correctly).
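A sketch of the out-parameter flavour (the status enum and the CreateUser overload are hypothetical, loosely modelled on the MembershipProvider style; userRepo, userSvc and theUser are the fixtures from the question):
// Hypothetical redesign: the service reports *why* creation failed.
public enum UserCreateStatus { Success, DuplicateUser, InvalidEmail }

[Test]
public void UserService_reports_duplicate_when_user_already_exists()
{
    userRepo.Stub(repo => repo.IsExistingUser(theUser)).Return(true);

    UserCreateStatus status;
    bool result = userSvc.CreateUser(theUser, out status);  // hypothetical overload

    Assert.That(result, Is.False);
    Assert.That(status, Is.EqualTo(UserCreateStatus.DuplicateUser));
}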
You are facing the question of "classical" vs. "mockist" approaches to testing. Or "state-verification" vs. "behaviour-verification" as described by Martin Fowler: http://martinfowler.com/articles/mocksArentStubs.html#ClassicalAndMockistTesting
Another most excellent resource is Gerard Meszaros' book "xUnit Test Patterns: Refactoring Test Code"
What Makes a Good Unit Test? says that a test should test only one thing. What is the benefit of that?
Wouldn't it be better to write somewhat bigger tests that test bigger blocks of code? Investigating a test failure is hard anyway, and I don't see how smaller tests help with that.
Edit: The word unit is not that important. Let's say I consider the unit a bit bigger. That is not the issue here. The real question is: why make one or more tests for every method, when a few tests that cover many methods would be simpler?
An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.
Testing only one thing will isolate that one thing and prove whether or not it works. That is the idea with unit testing. Nothing wrong with tests that test more than one thing, but that is generally referred to as integration testing. They both have merits, based on context.
To use an example, if your bedside lamp doesn't turn on, and you replace the bulb and switch the extension cord, you don't know which change fixed the issue. Should have done unit testing, and separated your concerns to isolate the problem.
Update: I read this article and linked articles and I gotta say, I'm shook: https://techbeacon.com/app-dev-testing/no-1-unit-testing-best-practice-stop-doing-it
There is substance here and it gets the mental juices flowing. But I reckon that it jibes with the original sentiment that we should be doing the test that context demands. I suppose I'd just append that to say that we need to get closer to knowing for sure the benefits of different testing on a system and less of a cross-your-fingers approach. Measurements/quantifications and all that good stuff.
I'm going to go out on a limb here, and say that the "only test one thing" advice isn't as actually helpful as it's sometimes made out to be.
Sometimes tests take a certain amount of setting up. Sometimes they may even take a certain amount of time to set up (in the real world). Often you can test two actions in one go.
Pro: only have all that setup occur once. Your tests after the first action will prove that the world is how you expect it to be before the second action. Less code, faster test run.
Con: if either action fails, you'll get the same result: the same test will fail. You'll have less information about where the problem is than if you only had a single action in each of two tests.
In reality, I find that the "con" here isn't much of a problem. The stack trace often narrows things down very quickly, and I'm going to make sure I fix the code anyway.
A slightly different "con" here is that it breaks the "write a new test, make it pass, refactor" cycle. I view that as an ideal cycle, but one which doesn't always mirror reality. Sometimes it's simply more pragmatic to add an extra action and check (or possibly just another check to an existing action) in a current test than to create a new one.
Tests that check for more than one thing aren't usually recommended because they are more tightly coupled and brittle. If you change something in the code, it'll take longer to change the test, since there are more things to account for.
[Edit:]
Ok, say this is a sample test method:
[TestMethod]
public void TestSomething() {
// Test condition A
// Test condition B
// Test condition C
// Test condition D
}
If your test for condition A fails, then B, C, and D will appear to fail as well, and won't provide you with any useful information. What if your code change would have caused C to fail as well? If you had split them out into 4 separate tests, you would know.
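Split out, the same checks might look like this (a sketch in the same style as the example above; each test now fails or passes on its own):
[TestMethod]
public void TestConditionA() { /* assert only condition A */ }

[TestMethod]
public void TestConditionB() { /* assert only condition B */ }

[TestMethod]
public void TestConditionC() { /* assert only condition C */ }

[TestMethod]
public void TestConditionD() { /* assert only condition D */ }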
Haaa... unit tests.
Push any "directives" too far and it rapidly becomes unusable.
A single unit test testing a single thing is just as good a practice as a single method doing a single task. But IMHO that does not mean a single test can only contain a single assert statement.
Is
@Test
public void checkNullInputFirstArgument(){...}
@Test
public void checkNullInputSecondArgument(){...}
@Test
public void checkOverInputFirstArgument(){...}
...
better than
@Test
public void testLimitConditions(){...}
is a question of taste, in my opinion, rather than good practice. I personally much prefer the latter.
But
@Test
public void doesWork(){...}
is actually what the "directive" wants you to avoid at all costs, and what drains my sanity the fastest.
As a final conclusion, group together things that are semantically related and easily testable together, so that a failed test message, by itself, is meaningful enough for you to go directly to the code.
Rule of thumb on a failed test report: if you have to read the test's code first, then your tests are not structured well enough and need splitting into smaller tests.
My 2 cents.
Think of building a car. If you were to apply your theory of just testing big things, then why not make a single test that drives the car through a desert? It breaks down. OK, so tell me what caused the problem. You can't. That's a scenario test.
A functional test may be to turn on the engine. It fails. But that could be because of a number of reasons. You still couldn't tell me exactly what caused the problem. We're getting closer though.
A unit test is more specific, and will firstly identify where the code is broken, but it will also (if doing proper TDD) help architect your code into clear, modular chunks.
Someone mentioned using the stack trace. Forget it; that's a second resort. Going through the stack trace, or using the debugger, is a pain and can be time-consuming, especially on larger systems with complex bugs.
Good characteristics of a unit test:
Fast (milliseconds)
Independent. It's not affected by or dependent on other tests
Clear. It shouldn't be bloated, or contain a huge amount of setup.
Using test-driven development, you would write your tests first, then write the code to pass the test. If your tests are focused, this makes writing the code to pass the test easier.
For example, I might have a method that takes a parameter. One of the things I might think of first is: what should happen if the parameter is null? It should throw an ArgumentNullException (I think). So I write a test that checks whether that exception is thrown when I pass a null argument. Run the test. Okay, it throws NotImplementedException. I go and fix that by changing the code to throw an ArgumentNullException. Run my test: it passes. Then I think, what happens if it's too small or too big? Ah, that's two tests. I write the too-small case first.
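That first null-argument step might look roughly like this (NUnit assumed; OrderParser and its Parse method are made up for the sketch):
[Test]
public void Parse_ThrowsArgumentNullException_WhenInputIsNull()
{
    var parser = new OrderParser();  // hypothetical class under test

    // Red: the method currently throws NotImplementedException, so this fails
    // until the null check is added.
    Assert.Throws<ArgumentNullException>(() => parser.Parse(null));
}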
The point is I don't think of the behavior of the method all at once. I build it incrementally (and logically) by thinking about what it should do, then implement code and refactoring as I go to make it look pretty (elegant). This is why tests should be small and focused because when you are thinking about the behavior you should develop in small, understandable increments.
Having tests that verify only one thing makes troubleshooting easier. It's not to say you shouldn't also have tests that do test multiple things, or multiple tests that share the same setup/teardown.
Here should be an illustrative example. Let's say that you have a stack class with queries:
getSize
isEmpty
getTop
and methods to mutate the stack
push(anObject)
pop()
Now, consider the following test case for it (I'm using Python-like pseudo-code for this example):
class TestCase():
def setup():
self.stack = new Stack()
def test():
stack.push(1)
stack.push(2)
stack.pop()
assert stack.top() == 1, "top() isn't showing correct object"
assert stack.getSize() == 1, "getSize() call failed"
From this test case, you can determine if something is wrong, but not whether it is isolated to the push() or pop() implementations, or the queries that return values: top() and getSize().
If we add individual test cases for each method and its behavior, things become much easier to diagnose. Also, by doing fresh setup for each test case, we can guarantee that the problem is completely within the methods that the failing test method called.
def test_size():
assert stack.getSize() == 0
assert stack.isEmpty()
def test_push():
self.stack.push(1)
assert stack.top() == 1, "top returns wrong object after push"
assert stack.getSize() == 1, "getSize wrong after push"
def test_pop():
stack.push(1)
stack.pop()
assert stack.getSize() == 0, "getSize wrong after pop"
As far as test-driven development is concerned, I personally write larger "functional tests" that end up testing multiple methods at first, and then create unit tests as I start to implement the individual pieces.
Another way to look at it is unit tests verify the contract of each individual method, while larger tests verify the contract that the objects and the system as a whole must follow.
I'm still using three method calls in test_push; however, both top() and getSize() are queries that are tested by separate test methods.
You could get similar functionality by adding more asserts to the single test, but then later assertion failures would be hidden.
If you are testing more than one thing then it is called an Integration test...not a unit test. You would still run these integration tests in the same testing framework as your unit tests.
Integration tests are generally slower, unit tests are fast because all dependencies are mocked/faked, so no database/web service/slow service calls.
We run our unit tests on commit to source control, and our integration tests only get run in the nightly build.
If you test more than one thing and the first thing you test fails, you will not know if the subsequent things you are testing pass or fail. It is easier to fix when you know everything that will fail.
Smaller unit tests make it clearer where the issue is when they fail.
The glib, but hopefully still useful, answer is that unit = one. If you test more than one thing, then you aren't unit testing.
Regarding your example: If you are testing add and remove in the same unit test, how do you verify that the item was ever added to your list? That is why you need to add and verify that it was added in one test.
Or to use the lamp example: If you want to test your lamp and all you do is turn the switch on and then off, how do you know the lamp ever turned on? You must take the step in between to look at the lamp and verify that it is on. Then you can turn it off and verify that it turned off.
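In list terms, that might look something like the sketch below (NUnit assumed, with the built-in List<int> standing in for the list class under discussion):
using System.Collections.Generic;
using NUnit.Framework;

public class ListTests
{
    [Test]
    public void Add_PutsItemInList()
    {
        var list = new List<int>();

        list.Add(42);

        // Look at the lamp: verify the item really made it in.
        Assert.That(list.Contains(42), Is.True);
        Assert.That(list.Count, Is.EqualTo(1));
    }

    [Test]
    public void Remove_TakesItemOutOfList()
    {
        var list = new List<int> { 42 };   // arrange the "lamp on" state directly

        list.Remove(42);

        Assert.That(list.Count, Is.EqualTo(0));
    }
}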
I support the idea that unit tests should only test one thing. I also stray from it quite a bit. Today I had a test where expensive setup seemed to be forcing me to make more than one assertion per test.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
[Test]
public void ShouldHaveCorrectValues()
{
var fees = CallSlowRunningFeeService();
Assert.AreEqual(6.50m, fees.ConvenienceFee);
Assert.AreEqual(2.95m, fees.CreditCardFee);
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
At the same time, I really wanted to see all my assertions that failed, not just the first one. I was expecting them all to fail, and I needed to know what amounts I was really getting back. But, a standard [SetUp] with each test divided would cause 3 calls to the slow service. Suddenly I remembered an article suggesting that using "unconventional" test constructs is where half the benefit of unit testing is hidden. (I think it was a Jeremy Miller post, but can't find it now.) Suddenly [TestFixtureSetUp] popped to mind, and I realized I could make a single service call but still have separate, expressive test methods.
namespace Tests.Integration
{
[TestFixture]
public class FeeMessageTest
{
Fees fees;
[TestFixtureSetUp]
public void FetchFeesMessageFromService()
{
fees = CallSlowRunningFeeService();
}
[Test]
public void ShouldHaveCorrectConvenienceFee()
{
Assert.AreEqual(6.50m, fees.ConvenienceFee);
}
[Test]
public void ShouldHaveCorrectCreditCardFee()
{
Assert.AreEqual(2.95m, fees.CreditCardFee);
}
[Test]
public void ShouldHaveCorrectChangeFee()
{
Assert.AreEqual(59.95m, fees.ChangeFee);
}
}
}
There is more code in this test, but it provides much more value by showing me all the values that don't match expectations at once.
A colleague also pointed out that this is a bit like Scott Bellware's specunit.net: http://code.google.com/p/specunit-net/
Another practical disadvantage of very granular unit testing is that it breaks the DRY principle. I have worked on projects where the rule was that each public method of a class had to have a unit test (a [TestMethod]). Obviously this added some overhead every time you created a public method but the real problem was that it added some "friction" to refactoring.
It's similar to method level documentation, it's nice to have but it's another thing that has to be maintained and it makes changing a method signature or name a little more cumbersome and slows down "floss refactoring" (as described in "Refactoring Tools: Fitness for Purpose" by Emerson Murphy-Hill and Andrew P. Black. PDF, 1.3 MB).
Like most things in design, there is a trade-off that the phrase "a test should test only one thing" doesn't capture.
When a test fails, there are three options:
The implementation is broken and should be fixed.
The test is broken and should be fixed.
The test is no longer needed and should be removed.
Fine-grained tests with descriptive names help the reader to know why the test was written, which in turn makes it easier to know which of the above options to choose. The name of the test should describe the behaviour which is being specified by the test - and only one behaviour per test - so that just by reading the names of the tests the reader will know what the system does. See this article for more information.
On the other hand, if one test is doing lots of different things and it has a non-descriptive name (such as tests named after methods in the implementation), then it will be very hard to find out the motivation behind the test, and it will be hard to know when and how to change the test.
Here is what it can look like (with GoSpec) when each test tests only one thing:
func StackSpec(c gospec.Context) {
stack := NewStack()
c.Specify("An empty stack", func() {
c.Specify("is empty", func() {
c.Then(stack).Should.Be(stack.Empty())
})
c.Specify("After a push, the stack is no longer empty", func() {
stack.Push("foo")
c.Then(stack).ShouldNot.Be(stack.Empty())
})
})
c.Specify("When objects have been pushed onto a stack", func() {
stack.Push("one")
stack.Push("two")
c.Specify("the object pushed last is popped first", func() {
x := stack.Pop()
c.Then(x).Should.Equal("two")
})
c.Specify("the object pushed first is popped last", func() {
stack.Pop()
x := stack.Pop()
c.Then(x).Should.Equal("one")
})
c.Specify("After popping all objects, the stack is empty", func() {
stack.Pop()
stack.Pop()
c.Then(stack).Should.Be(stack.Empty())
})
})
}
The real question is: why make one or more tests for every method, when a few tests that cover many methods would be simpler?
Well, so that when some test fails you know which method fails.
When you have to repair a non-functioning car, it is easier when you know which part of the engine is failing.
An example: a list class. Why should I make separate tests for addition and removal? One test that first adds and then removes sounds simpler.
Let's suppose that the addition method is broken and does not add, and that the removal method is broken and does not remove. Your test would check that the list, after an addition and a removal, has the same size as initially. Your test would succeed, although both of your methods would be broken.
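A sketch of that false positive (BrokenList is deliberately made up so that neither method does anything; NUnit assumed):
using System.Collections.Generic;
using NUnit.Framework;

// Deliberately broken, made-up implementation: neither method does anything.
public class BrokenList
{
    private readonly List<int> items = new List<int>();
    public int Count { get { return items.Count; } }
    public void Add(int item) { /* bug: never adds */ }
    public void Remove(int item) { /* bug: never removes */ }
}

public class BrokenListTests
{
    [Test]
    public void AddThenRemove_LeavesSizeUnchanged()
    {
        var list = new BrokenList();
        int sizeBefore = list.Count;

        list.Add(42);
        list.Remove(42);

        // Passes (green) even though both Add and Remove are broken.
        Assert.That(list.Count, Is.EqualTo(sizeBefore));
    }

    [Test]
    public void Add_IncreasesSize()
    {
        var list = new BrokenList();

        list.Add(42);

        // Fails (red): the single-purpose test exposes the broken Add.
        Assert.That(list.Count, Is.EqualTo(1));
    }
}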
Disclaimer: This is an answer highly influenced by the book "xUnit Test Patterns".
Testing only one thing at each test is one of the most basic principles that provides the following benefits:
Defect Localization: If a test fails, you immediately know why it failed (ideally without further troubleshooting, if you've done a good job with the assertions used).
Test as a specification: the tests are not only there as a safety net, but can easily be used as specification/documentation. For instance, a developer should be able to read the unit tests of a single component and understand the API/contract of it, without needing to read the implementation (leveraging the benefit of encapsulation).
Feasibility of TDD: TDD is based on having small chunks of functionality and completing progressive iterations of (write a failing test, write code, verify the test succeeds). This process gets highly disrupted if a test has to verify multiple things.
Lack of side effects: Somewhat related to the first one, but when a test verifies multiple things, it's more likely to be tied to other tests as well. So, those tests might need a shared test fixture, which means that one will be affected by the other. Eventually you might have a test failing when in reality another test caused the failure, e.g. by changing the fixture data.
I can only see a single reason why you might benefit from having a test that verifies multiple things, but this should be seen as a code smell actually:
Performance optimisation: There are some cases where your tests are not running only in memory but also depend on persistent storage (e.g. databases). In some of these cases, having a test verify multiple things might help decrease the number of disk accesses, and thus the execution time. However, unit tests should ideally be executable only in memory, so if you stumble upon such a case you should reconsider whether you are going down the wrong path. All persistent dependencies should be replaced with mock objects in unit tests. End-to-end functionality should be covered by a different suite of integration tests. That way, you no longer need to care about execution time, since integration tests are usually executed by build pipelines and not by developers, so a slightly higher execution time has almost no impact on the efficiency of the software development lifecycle.