I have always been taught that maximizing code coverage with unit tests is good. I also hear developers from big companies such as Microsoft say that they write more lines of test code than executable code.
Now, is it really that great? Doesn't it sometimes seem like a complete waste of time whose only effect is to make maintenance more difficult?
For example, let's say I have a method DisplayBooks() which populates a list of books from a database. The product requirements say that if there are more than one hundred books in the store, only one hundred must be displayed.
So, with TDD,
I will start by writing a unit test, BooksLimit(), which will save two hundred books in the database, call DisplayBooks(), and do an Assert.AreEqual(100, DisplayedBooks.Count) (sketched below).
Then I will run the test to check that it fails,
Then I'll change DisplayBooks() by setting the limit of results to 100, and
Finally I will rerun the test to see if it succeeds.
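A minimal sketch of what that first test might look like (NUnit-style; the BookStore class and the CreateFakeBooks helper are hypothetical names invented for the illustration, not part of the original example):

[Test]
public void BooksLimit()
{
    // Hypothetical setup: put more books in the database than the limit.
    var store = new BookStore();
    store.AddBooks(CreateFakeBooks(200));

    var displayedBooks = store.DisplayBooks();

    // Requirement: no more than one hundred books are displayed.
    Assert.AreEqual(100, displayedBooks.Count);
}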
Well, isn't it much easier to go directly to the third step and never write the BooksLimit() unit test at all? And isn't it more Agile, when the requirement changes from a 100-book limit to 200, to change only one character, instead of changing the tests, running them to check that they fail, changing the code, and running the tests again to check that they succeed?
Note: let's assume that the code is fully documented. Otherwise, some may say, and they would be right, that doing full unit tests helps to understand code which lacks documentation. In fact, having a BooksLimit() unit test shows very clearly that there is a maximum number of books to display, and that this maximum is 100. Stepping through the non-unit-tested code would be much more difficult, since such a limit may be implemented through for (int bookIndex = 0; bookIndex < 100; ... or foreach ... if (count >= 100) break;.
Well, isn't it much easier to go directly to the third step and never write the BooksLimit() unit test at all?
Yes... If you don't spend any time writing tests, you'll spend less time writing tests. Your project might take longer overall, because you'll spend a lot of time debugging, but maybe that's easier to explain to your manager? If that's the case... get a new job! Testing is crucial to improving your confidence in your software.
Unit testing gives the most value when you have a lot of code. It's easy to debug a simple homework assignment using a few classes without unit tests. Once you get out in the world and you're working in codebases of millions of lines, you're going to need them. You simply can't single-step your debugger through everything. You simply can't understand everything. You need to know that the classes you're depending on work. You need to know when someone says "I'm just going to make this change to the behavior... because I need it" but has forgotten that there are two hundred other uses that depend on that behavior. Unit testing helps prevent that.
With regard to making maintenance harder: NO WAY! I can't capitalize that enough.
If you're the only person that ever worked on your project, then yes, you might think that. But that's crazy talk! Try to get up to speed on a 30k line project without unit tests. Try to add features that require significant changes to code without unit tests. There's no confidence that you're not breaking implicit assumptions made by the other engineers. For a maintainer (or new developer on an existing project), unit tests are key. I've leaned on unit tests for documentation, for behavior, for assumptions, for telling me when I've broken something (that I thought was unrelated). Sometimes a poorly written API has poorly written tests and can be a nightmare to change, because the tests suck up all your time. Eventually you're going to want to refactor this code and fix that, but your users will thank you for that too - your API will be far easier to use because of it.
A note on coverage:
To me, it's not about 100% test coverage. 100% coverage doesn't find all the bugs; consider a function with two if statements:
// Will return a number less than or equal to 3
int Bar(bool cond1, bool cond2) {
  int b = 0;
  if (cond1) {
    b++;
  } else {
    b += 2;
  }
  if (cond2) {
    b += 2;
  } else {
    b++;
  }
  return b;
}
Now consider I write a test that tests:
EXPECT_EQ(3, Bar(true, true));
EXPECT_EQ(3, Bar(false, false));
That's 100% coverage. That's also a function that doesn't meet the contract - Bar(false, true); fails, because it returns 4. So "complete coverage" is not the end goal.
Honestly, I would skip tests for BooksLimit(). It returns a constant, so it probably isn't worth the time to write them (and it should be tested when writing DisplayBooks()). I might be sad when someone decides to (incorrectly) calculate that limit from the shelf size, and it no longer satisfies our requirements. I've been burned by "not worth testing" before. Last year I wrote some code that I said to my coworker: "This class is mostly data, it doesn't need to be tested". It had a method. It had a bug. It went to production. It paged us in the middle of the night. I felt stupid. So I wrote the tests. And then I pondered long and hard about what code constitutes "not worth testing". There isn't much.
So, yes, you can skip some tests. 100% test coverage is great, but it doesn't magically mean your software is perfect. It all comes down to confidence in the face of change.
If I put class A, class B and class C together, and I find something that doesn't work, do I want to spend time debugging all three? No. I want to know that A and B already met their contracts (via unit tests) and my new code in class C is probably broken. So I unit test it. How do I even know it's broken, if I don't unit test it? By clicking some buttons and trying the new code? That's good, but not sufficient. Once your program scales up, it'll be impossible to rerun all your manual tests to check that everything works right. That's why people who unit test usually automate running their tests too. Tell me "Pass" or "Fail", don't tell me "the output is ...".
OK, gonna go write some more tests...
100% unit test coverage is generally a code smell, a sign that someone has come over all OCD over the green bar in the coverage tool, instead of doing something more useful.
Somewhere around 85% is the sweet spot, where a test failing more often than not indicates an actual or potential problem, rather than simply being an inevitable consequence of any textual change not inside comment markers. You are not documenting any useful assumptions about the code if your assumptions are 'the code is what it is, and if it were in any way different it would be something else'. That's a problem solved by a comment-aware checksum tool, not a unit test.
I wish there was some tool that would let you specify the target coverage. And then if you accidentally go over it, show things in yellow/orange/red to push you towards deleting some of the spurious extra tests.
When looking at an isolated problem, you're completely right. But unit tests are about covering all the intentions you have for a certain piece of code.
Basically, the unit tests formulate your intentions. With a growing number of intentions, the behavior of the code under test can always be checked against all intentions made so far. Whenever a change is made, you can prove that there is no side effect which breaks existing intentions. Newly found bugs are nothing but an (implicit) intention which is not held by the code, so you formulate your intention as a new test (which fails at first) and then fix the code.
For one-time code, unit tests are indeed not worth the effort because no major changes are expected. However, for any block of code which is to be maintained or which serves as a component for other code, guaranteeing that all intentions hold for any new version is worth a lot (in terms of less effort spent manually checking for side effects).
The tipping point where unit tests actually save you time, and therefore money, depends on the complexity of the code, but there always is a tipping point, and it is usually reached after only a few iterations of changes. Also, last but not least, it allows you to ship fixes and changes much faster without compromising the quality of your product.
There is no explicit relation between code coverage and good software. You can easily imagine a piece of code that has 100% (or close to it) code coverage and still contains a lot of bugs. (Which does not mean that tests are bad!)
Your question about the agility of the 'no tests at all' approach only holds in the short term (which means it is most likely not good if you plan to maintain your program for a long time). I know from my experience that such simple tests are very useful when your project gets bigger and bigger and at some stage you need to make significant changes. That can be the moment when you say to yourself, "It was a good decision to spend some extra minutes writing that tiny test that spotted the bug I just introduced!"
I used to be a big fan of code coverage, but it has (luckily) turned into something more like a 'problem coverage' approach for me. It means that your tests should cover all the problems and bugs that were spotted, not just 'lines of code'. There is no need for a 'code coverage race'.
I understand the word 'Agile', in terms of the number of tests, as 'the number of tests that helps me build good software without wasting time writing unnecessary code', rather than '100% coverage' or 'no tests at all'. It's very subjective, and it's based on your experience, team, technology and many other factors.
The psychological side effect of '100% code coverage' is that you may think your code has no bugs, which is never true. :)
100% code coverage is a managerial psychological fixation used to artificially impress stakeholders. We do testing because there is complex code out there which can lead to defects, so we need to make sure that the complex code has test cases, that it is tested, and that the defects are fixed before it goes live.
We should aim at testing things that are complex, not just everything. This "complex" needs to be expressed in terms of some metric, which can be cyclomatic complexity, lines of code, aggregation, coupling, etc., or probably a combination of all of these. If we find that the metric is high, we need to ensure that that part of the code is covered. Below is my article which covers what the best percentage is for code coverage.
Is 100% code coverage really needed?
I agree with @soru: 100% test coverage is not a rational goal.
I do not believe that any tool or metric can exist that can tell you the "right" amount of coverage. When I was in grad school, my thesis advisor's work was on designing test coverage for "mutated" code. He'd take a suite of tests, and then run an automated program to introduce errors into the source code under test. The idea was that the mutated code contained errors that would be found in the real world, and thus the test suite that found the highest percentage of broken code was the winner.
While his thesis was accepted, and he is now a Professor at a major engineering school, he did not find either:
1) a magic number of test coverage that is optimal
2) any suite that could find 100% of the errors.
Note, the goal is to find 100% of the errors, not to find 100% coverage.
Whether @soru's 85% is right or not is a subject for discussion. I have no means to assess whether a better number would be 80% or 90% or anything else. But as a working assessment, 85% feels about right to me.
First, 100% is hard to get, especially on big projects! And even if you do get there, a block of code being covered doesn't mean it is doing what it is supposed to, unless your tests assert every possible input and output (which is almost impossible).
So I wouldn't consider a piece of software to be good simply because it has 100% code coverage, but code coverage is still a good thing to have.
Well, isn't it much more easier to go directly to the third step, and do never make BooksLimit() unit test at all?
Well, having that test there makes you pretty confident that if someone changes the code and the test fails, you will notice that something is wrong with the new code; therefore you avoid a potential bug in your application.
When the client decides to change the limit to 200, good luck finding bugs related to that seemingly trivial test. Especially when you have 100 other variables in your code, and there are 5 other developers working on code that relies on that tiny piece of information.
My point: if it's valuable to the business (or, if you dislike that name, to the very important core of the project), test it. Only skip it when there is no possible (or cheap) way of testing it, like UI or user interaction, or when you are sure the impact of not writing that test is minimal. This holds truer for projects with vague or quickly changing requirements [as I painfully discovered].
For the other example you present, the recommended approach is to test boundary values. So you can limit your tests to only four values: 0, some number between 0 and BooksLimit, BooksLimit itself, and some number higher.
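As a rough illustration of those boundary cases (NUnit-style sketch; the BookStore class and the CreateFakeBooks helper are invented names for this example):

[TestCase(0, 0)]       // empty store
[TestCase(42, 42)]     // somewhere between 0 and the limit
[TestCase(100, 100)]   // exactly at the limit
[TestCase(150, 100)]   // above the limit: display is capped at 100
public void DisplayBooks_RespectsLimit(int booksInStore, int expectedDisplayed)
{
    var store = new BookStore();
    store.AddBooks(CreateFakeBooks(booksInStore));

    Assert.AreEqual(expectedDisplayed, store.DisplayBooks().Count);
}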
And, as other people said, write tests, but be 100% positive that something else can still fail.
I am writing a fairly complicated machine learning program for my thesis in computer vision. It's working fairly well, but I need to keep trying out new things and adding new functionality. This is problematic because I sometimes introduce bugs when I am extending the code or trying to simplify an algorithm.
Clearly the correct thing to do is to add unit tests, but it is not clear how to do this. Many components of my program produce a somewhat subjective answer, and I cannot automate sanity checks.
For example, I had some code that approximated a curve with a lower-resolution curve, so that I could do computationally intensive work on the lower-resolution curve. I accidentally introduced a bug into this code, and only found it through a painstaking search when the results of my entire program got slightly worse.
But, when I tried to write a unit-test for it, it was unclear what I should do. If I make a simple curve that has a clearly correct lower-resolution version, then I'm not really testing out everything that could go wrong. If I make a simple curve and then perturb the points slightly, my code starts producing different answers, even though this particular piece of code really seems to work fine now.
You may not appreciate the irony, but basically what you have there is legacy code: a chunk of software without any unit tests. Naturally you don't know where to begin. So you may find it helpful to read up on handling legacy code.
The definitive thought on this is Michael Feathers' book, Working Effectively with Legacy Code. There used to be a helpful summary of it on the ObjectMentor site, but alas the website has gone the way of the company. However, WELC has left a legacy in reviews and other articles. Check them out (or just buy the book), although the key lessons are the ones which S.Lott and tvanfosson cover in their replies.
2019 update: I have fixed the link to the WELC summary with a version from the Wayback Machine web archive (thanks @milia).
Also - and despite knowing that answers which comprise mainly links to other sites are low quality answers :) - here is a link to a new (2019 new) Google tutorial on Testing and Debugging ML code. I hope this will be of illumination to future Seekers who stumble across this answer.
"then I'm not really testing out everything that could go wrong."
Correct.
The job of unit tests is not to test everything that could go wrong.
The job of unit tests is to test that what you have does the right thing, given specific inputs and specific expected results. The important part here is the specific visible, external requirements are satisfied by specific test cases. Not that every possible thing that could go wrong is somehow prevented.
Nothing can test everything that could go wrong. You can write a proof, but you'll be hard-pressed to write tests for everything.
Choose your test cases wisely.
Further, the job of unit tests is to test that each small part of the overall application does the right thing -- in isolation.
Your "code that approximated a curve with a lower-resolution curve" for example, probably has several small parts that can be tested as separate units. In isolation. The integrated whole could also be tested to be sure that it works.
Your "computationally intensive work on the lower-resolution curve" for example, probably has several small parts that can be tested as separate units. In isolation.
The point of unit testing is to create small, correct units that are later assembled.
Without seeing your code, it's hard to tell, but I suspect that you are attempting to write tests at too high a level. You might want to think about breaking your methods down into smaller components that are deterministic and testing these. Then test the methods that use these methods by providing mock implementations that return predictable values from the underlying methods (which are probably located on a different object). Then you can write tests that cover the domain of the various methods, ensuring that you have coverage of the full range of possible outcomes. For the small methods you do so by providing values that represent the domain of inputs. For the methods that depend on these, by providing mock implementations that return the range of outcomes from the dependencies.
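To illustrate that decomposition (a hedged C# sketch, not the question's actual code; ICurveSimplifier and CurveAnalyzer are invented names): the small deterministic piece sits behind an interface, and the higher-level method can then be tested against a mock implementation of that interface.

// Small, deterministic unit: easy to test over its whole input domain.
public interface ICurveSimplifier
{
    double[] Simplify(double[] curve, int targetResolution);
}

// The higher-level method depends only on the interface, so a test can
// substitute a mock implementation that returns predictable values.
public class CurveAnalyzer
{
    private readonly ICurveSimplifier simplifier;

    public CurveAnalyzer(ICurveSimplifier simplifier)
    {
        this.simplifier = simplifier;
    }

    public double AverageOfSimplifiedCurve(double[] curve)
    {
        double[] lowRes = simplifier.Simplify(curve, 32);
        double sum = 0.0;
        foreach (double y in lowRes)
        {
            sum += y;
        }
        return sum / lowRes.Length;   // deterministic given the mock's output
    }
}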
Your unit tests need to employ some kind of fuzz factor, either by accepting approximations, or using some kind of probabilistic checks.
For example, if you have some function that returns a floating point result, it is almost impossible to write a test that works correctly across all platforms. Your checks would need to perform the approximation.
TEST_ALMOST_EQ(result, 4.0);
Above TEST_ALMOST_EQ might verify that result is between 3.9 and 4.1 (for example).
Alternatively, if your machine learning algorithms are probabilistic, your tests will need to accommodate for it by taking the average of multiple runs and expecting it to be within some range.
double x = 0.0;
for (int i = 0; i < 100; i++) {
    x += result_probabilistic_test();
}
double avg = x / 100;
TEST_RANGE(avg, 10.0, 15.0);
Of course, the tests are non-deterministic, so you will need to tune them such that you get non-flaky tests with a high probability (e.g., increase the number of trials, or increase the range of error).
You can also use mocks for this (e.g., a mock random number generator for your probabilistic algorithms); they usually help for deterministically testing specific code paths, but they are a lot of effort to maintain. Ideally, you would use a combination of fuzzy testing and mocks.
HTH.
Generally, for statistical measures you would build an epsilon into your answer, i.e., the mean square difference of your points would be < 0.01 or some such. Another option is to run the test several times, and if it fails "too often" then you have an issue.
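For example, something like this (NUnit-style sketch; SampleCurve and Simplify are placeholder names for your own fixture data and code under test, and the comparison assumes both arrays have the same length):

[Test]
public void SimplifiedCurve_StaysCloseToOriginal()
{
    double[] expected = SampleCurve();      // hypothetical reference points
    double[] actual = Simplify(expected);   // code under test, resampled to the same length

    double meanSquareError = 0.0;
    for (int i = 0; i < expected.Length; i++)
    {
        double diff = expected[i] - actual[i];
        meanSquareError += diff * diff;
    }
    meanSquareError /= expected.Length;

    // Accept any answer within the chosen epsilon rather than an exact match.
    Assert.Less(meanSquareError, 0.01);
}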
Get an appropriate test dataset (maybe a subset of what you usually use)
Calculate some metric on this dataset (e.g. the accuracy)
Note down the value obtained (cross-validated)
This should give an indication of what to set the threshold to
Of course, it can happen that when you make changes to your code the performance on the dataset increases a little, but if it ever decreases by a large amount, that is an indication something is going wrong.
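Put together, such a regression check might look roughly like this (a hedged sketch; LoadReferenceDataset and EvaluateModel stand in for your own code, and the 0.85 threshold is just the kind of cross-validated value you would have noted down):

[Test]
public void Accuracy_DoesNotRegressOnReferenceDataset()
{
    var dataset = LoadReferenceDataset();       // fixed, versioned subset of your data
    double accuracy = EvaluateModel(dataset);   // metric computed by the code under test

    // Threshold taken from the previously noted (cross-validated) value,
    // minus a small margin for run-to-run noise.
    Assert.GreaterOrEqual(accuracy, 0.85);
}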
As a programmer, I have bought whole-heartedly into the TDD philosophy and take the effort to make extensive unit tests for any nontrivial code I write. Sometimes this road can be painful (behavioral changes causing cascading multiple unit test changes; high amounts of scaffolding necessary), but on the whole I refuse to program without tests that I can run after every change, and my code is much less buggy as a result.
Recently, I've been playing with Haskell and its resident testing library, QuickCheck. In a fashion distinctly different from TDD, QuickCheck has an emphasis on testing invariants of the code, that is, certain properties that hold over all (or substantive subsets) of inputs. A quick example: a stable sorting algorithm should give the same answer if we run it twice, should have increasing output, should be a permutation of the input, etc. Then, QuickCheck generates a variety of random data in order to test these invariants.
It seems to me, at least for pure functions (that is, functions without side effects -- and if you do mocking correctly you can convert dirty functions into pure ones), that invariant testing could supplant unit testing as a strict superset of those capabilities. Each unit test consists of an input and an output (in imperative programming languages, the "output" is not just the return value of the function but also any changed state, but this can be encapsulated). One could conceivably create a random input generator that is good enough to cover all of the unit test inputs that you would have manually created (and then some, because it would generate cases that you wouldn't have thought of); if you find a bug in your program due to some boundary condition, you improve your random input generator so that it generates that case too.
The challenge, then, is whether or not it's possible to formulate useful invariants for every problem. I'd say it is: it's a lot simpler once you have an answer to see if it's correct than it is to calculate the answer in the first place. Thinking about invariants also helps clarify the specification of a complex algorithm much better than ad hoc test cases, which encourage a kind of case-by-case thinking of the problem. You could use a previous version of your program as a model implementation, or a version of a program in another language. Etc. Eventually, you could cover all of your former test-cases without having to explicitly code an input or an output.
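For a concrete feel of the idea outside Haskell, here is a hand-rolled, QuickCheck-flavoured sketch in C# (MySort is the hypothetical function under test; a real property-testing library would generate and shrink the inputs for you):

[Test]
public void Sort_InvariantsHoldForRandomInputs()
{
    var rng = new Random(12345);   // fixed seed so failures are reproducible
    for (int trial = 0; trial < 1000; trial++)
    {
        int[] input = new int[rng.Next(0, 50)];
        for (int i = 0; i < input.Length; i++)
        {
            input[i] = rng.Next(-100, 100);
        }

        int[] output = MySort(input);   // code under test

        // Invariant 1: output is non-decreasing.
        for (int i = 1; i < output.Length; i++)
        {
            Assert.LessOrEqual(output[i - 1], output[i]);
        }

        // Invariant 2: output is a permutation of the input.
        CollectionAssert.AreEquivalent(input, output);

        // Invariant 3: sorting is idempotent.
        CollectionAssert.AreEqual(output, MySort(output));
    }
}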
Have I gone insane, or am I on to something?
A year later, I now think I have an answer to this question: No! In particular, unit tests will always be necessary and useful for regression tests, in which a test is attached to a bug report and lives on in the codebase to prevent that bug from ever coming back.
However, I suspect that any unit test can be replaced with a test whose inputs are randomly generated. Even in the case of imperative code, the “input” is the order of imperative statements you need to make. Of course, whether or not it’s worth creating the random data generator, and whether or not you can make the random data generator have the right distribution is another question. Unit testing is simply a degenerate case where the random generator always gives the same result.
What you've brought up is a very good point - when only applied to functional programming. You stated a means of accomplishing this all with imperative code, but you also touched on why it's not done - it's not particularly easy.
I think that's the very reason it won't replace unit testing: it doesn't fit for imperative code as easily.
Doubtful
I've only heard of (not used) these kinds of tests, but I see two potential issues. I would love to have comments about each.
Misleading results
I've heard of tests like:
reverse(reverse(list)) should equal list
unzip(zip(data)) should equal data
It would be great to know that these hold true for a wide range of inputs. But both these tests would pass if the functions just return their input.
It seems to me that you'd want to verify that, e.g., reverse([1 2 3]) equals [3 2 1] to prove correct behavior in at least one case, and then add some testing with random data.
Test complexity
An invariant test that fully describes the relationship between the input and output might be more complex than the function itself. If it's complex, it could be buggy, but you don't have tests for your tests.
A good unit test, by contrast, is too simple to screw up or misunderstand as a reader. Only a typo could create a bug in "expect reverse([1 2 3]) to equal [3 2 1]".
What you wrote in your original post reminded me of this problem, which is an open question as to what the loop invariant is to prove the loop correct...
Anyway, I am not sure how much you have read about formal specification, but you are heading down that line of thought. David Gries's book is one of the classics on the subject; I still haven't mastered the concept well enough to use it rapidly in my day-to-day programming. The usual response to formal specification is that it's hard and complicated, and only worth the effort if you are working on safety-critical systems. But I think there are back-of-the-envelope techniques similar to what QuickCheck exposes that can be used.
I'm just getting into unit testing, and have written some short tests to check whether a function called isPrime() works correctly.
I've got a test that checks that the function works, and have some test data in the form of some numbers and the expected return value.
How many should I test? How do I decide which to test? What are the best practices here?
One approach would be to generate 1000 primes and loop through them all; another would be to select just 4 or 5 and test those. What's the correct thing to do?
I've also been informed that every time a bug is found, you should write a test to verify that it is fixed. It seems reasonable to me, anyway.
You'd want to check edge cases. How big a prime number is your method supposed to be able to handle? This will depend on what representation (type) you used. If you're only interested in small primes (a very relative term when used in number theory), you're probably using int or long. Test a handful of the biggest primes you can in the representation you've chosen. Make sure you check some non-prime numbers too. (These are much easier to verify independently.)
Naturally, you'll also want to test a few small numbers (primes and non-primes) and a few in the middle of the range. A handful of each should be plenty. Also make sure you throw an exception (or return an error code, whichever is your preference) for numbers that are out of range of your valid inputs.
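Concretely, a first pass might look like this (NUnit-style sketch; IsPrime is the function under test, and the out-of-range/exception cases mentioned above would be added on top of it):

[TestCase(0, false)]
[TestCase(1, false)]
[TestCase(2, true)]        // smallest prime, and the only even one
[TestCase(3, true)]
[TestCase(4, false)]
[TestCase(25, false)]      // composite: square of a prime
[TestCase(7919, true)]     // a known larger prime
[TestCase(7917, false)]    // a nearby composite
public void IsPrime_ReturnsExpectedResult(int candidate, bool expected)
{
    Assert.AreEqual(expected, IsPrime(candidate));
}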
in general, test as many cases as you need to feel comfortable/confident
also in general, test the base/zero case, the maximum case, and at least one median/middle case
also test expected-exception cases, if applicable
if you're unsure of your prime algorithm, then by all means test it with the first 1000 primes or so, to gain confidence
"Beware of bugs. I have proven the above algorithm correct, but have not tested it yet."
Some people don't understand the above quote (paraphrase?), but it makes perfect sense when you think about it. Tests will never prove an algorithm correct; they only help indicate whether you've coded it right. Write tests for mistakes you expect might appear and for boundary conditions to achieve good code coverage. Don't just pick values out of the blue to see if they work, because that might lead to lots of tests which all test exactly the same thing.
For your example, just hand-select a few primes and non-primes to test specific conditions in the implementation.
Ask yourself: what EXACTLY do I want to test? Then test the most important things. Test to make sure it basically does what you expect it to do in the expected cases.
Testing all those nulls and edge cases - I don't think that's realistic: it's too time-consuming, and someone needs to maintain all of it later!
And...your test code should be simple enough so that you do not need to test your test code!
If you want to check that your function correctly applies the algorithm and works in general, a few primes will probably be enough.
If you want to prove that the method for finding primes is CORRECT - 100,000 primes will not be enough. But you don't want to test the latter (probably).
Only you know what you want to test!
P.S. I think using loops in unit tests is not always wrong, but I would think twice before doing it. Test code should be VERY simple. What if something goes wrong and there is a bug in your test? However, you should try to avoid duplication in test code just as you avoid it in regular code. Someone has to maintain the test code.
A few questions, the answers may inform your decision:
How important is the correct functioning of this code?
Is the implementation of this code likely to be changed in the future? (if so, test more to support the future change)
Is the public contract of this code likely to change in the future? (if so, test less - to reduce the amount of throw-away test code)
How is the code coverage, are all branches visited?
Even if the code doesn't branch, are boundary considerations tested?
Do the tests run quickly?
Edit: Hmm, so to advise in your specific scenario. Since you started writing unit tests yesterday, you might not have the experience to decide among all these factors. Let me help you:
This code is probably not too important (no one dies, no one goes to war, no one is sued), so a smattering of tests will be fine.
The implementation probably won't change (prime number techniques are well known), so we don't need tests to support this. If the implementation does change, it's probably due to an observed failing value. That can be added as a new test at the time of change.
The public contract of this won't change.
Get 100% code coverage on this. There's no reason to write code that a test doesn't visit in this circumstance. You should be able to do this with a small number of tests.
If you care what the code does when it is called with zero, test that.
The small number of tests should run quickly. This will allow them to be run frequently (both by developers and by automation).
I would test 1, 2, 3, 21, 23, a "large" prime (5 digits), a "large" non-prime and 0 if you care what this does with 0.
To be really sure, you're going to have to test them all. :-)
Seriously though, for this kind of function you're probably using an established and proven algorithm. The main thing you need to do is verify that your code correctly implements the algorithm.
The other thing is to make sure you understand the limits of your number representation, whatever that is. At the very least, this will put an upper limit on the size the number you can test. E.g., if you use a 32-bit unsigned int, you're never going to be able to test values larger than 4G. Possibly your limit will be lower than that, depending on the details of your implementation.
Just as an example of something that could go wrong with an implementation:
A simple algorithm for testing primes is to try dividing the candidate by all known primes up to the square root of the candidate. The square root function will not necessarily give an exact result, so to be safe you should go a bit past that. How far past would depend on specifically how the square root function is implemented and how much it could be off.
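One way to sidestep the square-root precision issue entirely is to never call the square-root function, and instead stop once the divisor squared passes the candidate (a hedged sketch, not necessarily the fastest approach):

// Trial division without a square-root call: test odd divisors while
// divisor * divisor <= candidate (written as a division so it cannot
// overflow for candidates near long.MaxValue).
static bool IsPrime(long candidate)
{
    if (candidate < 2) return false;
    if (candidate % 2 == 0) return candidate == 2;

    for (long divisor = 3; divisor <= candidate / divisor; divisor += 2)
    {
        if (candidate % divisor == 0) return false;
    }
    return true;
}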
Another note on testing: In addition to testing known primes to see if your function correctly identifies them as prime, also test known composite numbers to make sure you're not getting "false positives." To make sure you get that square root function thing right, pick some composite numbers that have a prime factor as close as possible to their square root.
Also, consider how you're going to "generate" your list of primes for testing. Can you trust that list to be correct? How were those numbers tested, and by whom?
You might consider coding two functions and testing them against each other. One could be a simple but slow algorithm that you can be more sure of having coded correctly, and the other a faster but more complex one that you really want to use in your app, but is more likely to have a coding mistake.
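A sketch of that cross-check (FastIsPrime stands in for the implementation you actually want to ship; the naive version is deliberately slow but easy to verify by inspection):

[Test]
public void FastIsPrime_AgreesWithNaiveImplementation()
{
    for (int n = 0; n <= 10000; n++)
    {
        Assert.AreEqual(NaiveIsPrime(n), FastIsPrime(n), "Implementations disagree for n = " + n);
    }
}

// Trial division by every smaller number: slow, but hard to get wrong.
static bool NaiveIsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; d < n; d++)
    {
        if (n % d == 0) return false;
    }
    return true;
}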
"If it's worth building, it's worth testing"
"If it's not worth testing, why are you wasting your time working on it?"
I'm guessing you didn't subscribe to test first, where you write the test before you write the code?
Not to worry, I'm sure you are not alone.
As has been said above, test the edges - a great place to start. Also, you must test bad cases; if you only test what you know works, you can be sure that a bad case will happen in production at the worst time.
Oh and "I've just found the last bug" - HA HA.
The thing I've found about TDD is that it takes time to get your tests set up, and being naturally lazy I always want to write as little code as possible. The first thing I seem to do is test that my constructor has set all the properties, but is this overkill?
My question is: to what level of granularity do you write your unit tests?
...and is there such a thing as testing too much?
I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don't typically make a kind of mistake (like setting the wrong variables in a constructor), I don't test for it. I do tend to make sense of test errors, so I'm extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.
Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. Ten or twenty years from now we'll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference. In the meantime, experimentation seems in order.
Write unit tests for things you expect to break, and for edge cases. After that, test cases should be added as bug reports come in - before writing the fix for the bug. The developer can then be confident that:
The bug is fixed;
The bug won't reappear.
Per the comment attached - I guess this approach to writing unit tests could cause problems, if lots of bugs are, over time, discovered in a given class. This is probably where discretion is helpful - adding unit tests only for bugs that are likely to re-occur, or where their re-occurrence would cause serious problems. I've found that a measure of integration testing in unit tests can be helpful in these scenarios - testing code higher up codepaths can cover the codepaths lower down.
Everything should be made as simple as
possible, but not simpler. - A. Einstein
One of the most misunderstood things about TDD is the first word in it. Test. That's why BDD came along. Because people didn't really understand that the first D was the important one, namely Driven. We all tend to think a little bit too much about the Testing, and a little bit too little about the driving of the design. And I guess that this is a vague answer to your question, but you should probably consider how to drive your code, instead of what you actually are testing; that is something a coverage tool can help you with. Design is a much bigger and more problematic issue.
To those who propose testing "everything": realise that "fully testing" a method like int square(int x) requires about 4 billion test cases in common languages and typical environments.
In fact, it's even worse than that: a method void setX(int newX) is also obliged not to alter the values of any other members besides x -- are you testing that obj.y, obj.z, etc. all remain unchanged after calling obj.setX(42);?
It's only practical to test a subset of "everything." Once you accept this, it becomes more palatable to consider not testing incredibly basic behaviour. Every programmer has a probability distribution of bug locations; the smart approach is to focus your energy on testing regions where you estimate the bug probability to be high.
The classic answer is "test anything that could possibly break". I interpret that as meaning that testing setters and getters that don't do anything except set or get is probably too much testing, no need to take the time. Unless your IDE writes those for you, then you might as well.
If your constructor not setting properties could lead to errors later, then testing that they are set is not overkill.
I write tests to cover the assumptions of the classes I will write. The tests enforce the requirements. Essentially, if x can never be 3, for example, I'm going to ensure there is a test that covers that requirement.
Invariably, if I don't write a test to cover a condition, it'll crop up later during "human" testing. I'll certainly write one then, but I'd rather catch them early. I think the point is that testing is tedious (perhaps) but necessary. I write enough tests to be complete but no more than that.
Part of the problem with skipping simple tests now is that in the future, refactoring could make that simple property very complicated, with lots of logic. I think the best idea is that you can use tests to verify the requirements for the module. If when you pass X you should get Y back, then that's what you want to test. Then when you change the code later on, you can verify that X gives you Y, and you can add a test for A gives you B, when that requirement is added later on.
I've found that the time I spend during initial development writing tests pays off in the first or second bug fix. The ability to pick up code you haven't looked at in 3 months and be reasonably sure your fix covers all the cases, and "probably" doesn't break anything is hugely valuable. You also will find that unit tests will help triage bugs well beyond the stack trace, etc. Seeing how individual pieces of the app work and fail gives huge insight into why they work or fail as a whole.
In most instances, I'd say, if there is logic there, test it. This includes constructors and properties, especially when more than one thing gets set in the property.
With respect to too much testing, it's debatable. Some would say that everything should be tested for robustness, others say that for efficient testing, only things that might break (i.e. logic) should be tested.
I'd lean more toward the second camp, just from personal experience, but if somebody did decide to test everything, I wouldn't say it was too much... a little overkill maybe for me, but not too much for them.
So, No - I would say there isn't such a thing as "too much" testing in the general sense, only for individuals.
Test Driven Development means that you stop coding when all your tests pass.
If you have no test for a property, then why should you implement it? If you do not test/define the expected behaviour in case of an "illegal" assignment, what should the property do?
Therefore I'm totally for testing every behaviour a class should exhibit. Including "primitive" properties.
To make this testing easier, I created a simple NUnit TestFixture that provides extension points for setting/getting the value and takes lists of valid and invalid values and has a single test to check whether the property works right. Testing a single property could look like this:
[TestFixture]
public class Test_MyObject_SomeProperty : PropertyTest<int>
{
    private MyObject obj = null;

    public override void SetUp() { obj = new MyObject(); }
    public override void TearDown() { obj = null; }

    public override int Get() { return obj.SomeProperty; }
    public override void Set(int value) { obj.SomeProperty = value; }

    public override IEnumerable<int> SomeValidValues() { return new List<int> { 1, 3, 5, 7 }; }
    public override IEnumerable<int> SomeInvalidValues() { return new List<int> { 2, 4, 6 }; }
}
Using lambdas and attributes this might even be written more compactly. I gather MBUnit even has some native support for things like this. The point, though, is that the above code captures the intent of the property.
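The base class itself isn't shown above; a minimal version of what it might look like (a sketch only - the assumption that invalid values should be rejected with an ArgumentException is mine, not part of the original answer):

[TestFixture]
public abstract class PropertyTest<T>
{
    public abstract void SetUp();
    public abstract void TearDown();
    public abstract T Get();
    public abstract void Set(T value);
    public abstract IEnumerable<T> SomeValidValues();
    public abstract IEnumerable<T> SomeInvalidValues();

    [Test]
    public void Property_RoundTripsValidValues_AndRejectsInvalidOnes()
    {
        foreach (T value in SomeValidValues())
        {
            SetUp();
            Set(value);
            Assert.AreEqual(value, Get());   // setter followed by getter round-trips
            TearDown();
        }

        foreach (T value in SomeInvalidValues())
        {
            SetUp();
            // Assumption: an invalid assignment is rejected with an exception.
            Assert.Throws<ArgumentException>(() => Set(value));
            TearDown();
        }
    }
}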
P.S.: Probably the PropertyTest should also have a way of checking that other properties on the object didn't change. Hmm .. back to the drawing board.
I write unit tests to reach the maximum feasible coverage. If I cannot reach some code, I refactor until the coverage is as full as possible.
Once I've finished writing tests blindly, I usually write one test case reproducing each bug.
I usually separate code testing from integration testing. During integration testing (which are also unit tests, but on groups of components, so not exactly what unit tests are for), I test that the requirements are implemented correctly.
So the more I drive my programming by writing tests, the less I worry about the level of granularity of the testing. Looking back, it seems I am doing the simplest thing possible to achieve my goal of validating behaviour. This means I am generating a layer of confidence that my code is doing what I ask it to do; however, this is not an absolute guarantee that my code is bug-free. I feel that the correct balance is to test standard behaviour and maybe an edge case or two, then move on to the next part of my design.
I accept that this will not cover all bugs and use other traditional testing methods to capture these.
Generally, I start small, with inputs and outputs that I know must work. Then, as I fix bugs, I add more tests to ensure the things I've fixed are tested. It's organic, and works well for me.
Can you test too much? Probably, but it's probably better to err on the side of caution in general, though it'll depend on how mission-critical your application is.
I think you must test everything in the "core" of your business logic. Getters and setters too, because they could accept a negative or null value that you might not want to accept. If you have time (it always depends on your boss), it's good to test the other business logic and all the controllers that call these objects (you go slowly from unit tests to integration tests).
I don't unit test simple setter/getter methods that have no side effects. But I do unit test every other public method. I try to create tests for all the boundary conditions in my algorithms and check the coverage of my unit tests.
It's a lot of work, but I think it's worth it. I would rather write code (even test code) than step through code in a debugger. I find the code-build-deploy-debug cycle very time-consuming, and the more exhaustive the unit tests I have integrated into my build, the less time I spend going through that cycle.
You didn't say which architecture you are coding for, but for Java I use Maven 2, JUnit, DbUnit, Cobertura and EasyMock.
The more I read about it, the more I think some unit tests are just like some patterns: a smell of insufficient languages.
When you need to test whether your trivial getter actually returns the right value, it is because you may mix up the getter name and the member variable name. Enter Ruby's 'attr_reader :name', and this can't happen any more. It's just not possible in Java.
If your getter ever gets nontrivial you can still add a test for it then.
Test the source code that you are worried about.
It is not useful to test portions of code that you are very, very confident in, as long as you don't make mistakes in them.
Test bugfixes, so that it is the first and last time you fix a bug.
Test obscure portions of code to gain confidence in them, so that you create knowledge.
Test before heavy and medium refactoring, so that you don't break existing features.
This answer is more about figuring out how many unit tests to use for a given method you know you want to unit test because of its criticality/importance. Using McCabe's Basis Path Testing technique, you can do the following to get quantitatively better code-coverage confidence than simple "statement coverage" or "branch coverage":
Determine the cyclomatic complexity value of the method you want to unit test (Visual Studio 2010 Ultimate, for example, can calculate this for you with its static analysis tools; otherwise, you can calculate it by hand via the flowgraph method - http://users.csc.calpoly.edu/~jdalbey/206/Lectures/BasisPathTutorial/index.html)
List the basis set of independent paths that flow through your method - see the link above for a flowgraph example
Prepare unit tests for each independent basis path determined in step 2 (a small worked example follows below)
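As a small worked illustration (invented for this answer, not taken from the tutorial above): the method below has two decision points, so its cyclomatic complexity is 2 + 1 = 3, the basis set contains three independent paths, and three tests are enough to exercise them all.

// Two independent decisions => cyclomatic complexity = 2 + 1 = 3.
static int TicketPrice(int age, bool hasDiscountCard)
{
    int price = 100;
    if (age < 18)            // decision 1
    {
        price -= 50;
    }
    if (hasDiscountCard)     // decision 2
    {
        price -= 10;
    }
    return price;
}

// One unit test per independent basis path:
[TestCase(30, false, 100)]   // baseline path: both conditions false
[TestCase(10, false, 50)]    // path flipping decision 1
[TestCase(30, true, 90)]     // path flipping decision 2
public void TicketPrice_CoversBasisPaths(int age, bool hasDiscountCard, int expected)
{
    Assert.AreEqual(expected, TicketPrice(age, hasDiscountCard));
}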