Do your unit tests constitute 100% code coverage? Yes or no, and why or why not.
No, for several reasons:
Reaching 100% coverage is much more expensive than reaching 90% or 95%, for a benefit that is not obvious.
Even with 100% coverage, your code is not perfect (and in fact it depends on which type of coverage you are talking about: branch coverage, line coverage, and so on). Take a look at this method:
public static String foo(boolean someCondition) {
    String bar = null;
    if (someCondition) {
        bar = "blabla";
    }
    return bar.trim();
}
and the unit test:
assertEquals("blabla", foo(true));
The test will succeed, and your code coverage is 100%. However, if you add another test:
assertEquals("blabla", foo(false));
then you will get a NullPointerException. And since you were already at 100% with the first test, you would not necessarily have written the second one!
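For illustration, here is roughly how that second case could be pinned down with JUnit 5 (the StringUtil class name is assumed; the original only shows the method):

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// Sketch only: assumes foo() lives on a hypothetical StringUtil class.
class FooCoverageTest {

    @Test
    void trueBranchReturnsTrimmedValue() {
        // This single test already reports 100% line coverage of foo().
        assertEquals("blabla", StringUtil.foo(true));
    }

    @Test
    void falseBranchBlowsUp() {
        // The test that 100% line coverage never forced anyone to write:
        // bar stays null and trim() throws.
        assertThrows(NullPointerException.class, () -> StringUtil.foo(false));
    }
}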
Generally, I consider that critical code must be covered at close to 100%, while the other code can be covered at 85-90%.
To all the 90% coverage testers:
The problem with doing so is that the 10% of hard-to-test code is also the non-trivial code that contains 90% of the bugs! This is the conclusion I reached empirically after many years of TDD.
And after all, this is a pretty straightforward conclusion: this 10% of hard-to-test code is hard to test because it reflects a tricky business problem, a tricky design flaw, or both. These are exactly the reasons that often lead to buggy code.
But also:
100% covered code whose coverage drops below 100% over time often pinpoints a bug, or at least a flaw.
100% covered code used in conjunction with contracts is the ultimate weapon for living close to bug-free code. Code contracts and automated testing are pretty much the same thing.
When a bug is discovered in 100% covered code, it is easier to fix. Since the code responsible for the bug is already covered by tests, it shouldn't be hard to write new tests to cover the bug fix.
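As a rough illustration of what "contracts" means here (the answer alludes to .NET Code Contracts, but the idea can be sketched in any language; the Greeter class below is invented), a precondition states the rule that tests then exercise:

import java.util.Objects;

// Hypothetical example: the method states its contract explicitly instead of
// letting a null slip through to a NullPointerException somewhere else later.
public final class Greeter {

    public static String greet(String name) {
        Objects.requireNonNull(name, "name must not be null");   // the contract, made explicit
        if (name.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        return "Hello, " + name.trim();
    }
}

A test suite that drives the happy path plus each violation of the precondition restates the same rules in executable form, which is the sense in which contracts and automated tests overlap.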
No, because there is a practical trade-off between perfect unit tests and actually finishing a project :)
It is seldom practical to get 100% code coverage in a non-trivial system. Most developers who write unit tests shoot for the mid to high 90's.
An automated testing tool like Pex can help increase code coverage. It works by searching for hard-to-find edge cases.
Yes we do.
It depends on what language and framework you're using as to how easy that is to achieve though.
We're using Ruby on Rails for my current project. Ruby is very "mockable" in that you can stub/mock out large chunks of your code without having to build in overly complicated class composition and construction designs that you would have to do in other languages.
That said, we only have 100% line coverage (basically what rcov gives you). You still have to think about testing all the required branches.
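Language aside, the gap between line coverage and branch coverage can be as small as a single conditional expression; a Java sketch for illustration:

// One line, two branches: a single test with a non-empty list reports 100% line
// coverage, yet the empty-list branch of the ternary has never executed.
public static int firstOrDefault(java.util.List<Integer> items) {
    return items.isEmpty() ? -1 : items.get(0);
}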
This is only really possible if you include it from the start as part of your continuous integration build, and break the build if coverage drops below 100% - prompting developers to fix it immediately. Of course you could choose some other number as a target, but if you're starting fresh, there isn't much extra effort in getting from 90% to 100%.
We've also got a bunch of other metrics that break the build if they cross a given threshold (cyclomatic complexity and duplication, for example); these all go together and help reinforce each other.
Again, you really have to have this stuff in place from the start to keep working at a strict level - either that or set some target you can hit, and gradually ratchet it up till you get to a level you're happy with.
Does doing this add value? I was skeptical at first, but I can honestly say that yes it does. Not primarily because you have thoroughly tested code (although that is definitely a benefit), but more in terms of writing simple code that is easy to test and reason about. If you know you have to have 100% test coverage, you stop writing overly complex if/else/while/try/catch monstrosities and Keep It Simple Stupid.
What I do when I get the chance is to insert statements on every branch of the code that can be grepped for and that record if they've been hit, so that I can do some sort of comparison to see which statements have not been hit. This is a bit of a chore, so I'm not always good about it.
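The same idea can be done in plain code rather than with a debugger; a hypothetical Java version of such grep-able markers might look like this:

// Crude, illustrative branch markers: every call site is grep-able ("Touched.hit")
// and the set of labels actually reached can be compared with the set found by grep.
final class Touched {
    private static final java.util.Set<String> hits =
            java.util.concurrent.ConcurrentHashMap.newKeySet();

    static void hit(String label) {
        hits.add(label);
    }

    static java.util.Set<String> report() {
        return java.util.Set.copyOf(hits);
    }
}

// Usage inside production code, one marker per branch:
//     if (order.isEmpty()) { Touched.hit("total-empty"); return 0; }
//     Touched.hit("total-nonempty");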
I just built a small UI app to use in charity auctions, that uses MySQL as its DB. Since I really, really didn't want it to break in the middle of an auction, I tried something new.
Since it was in VC6 (C++ + MFC) I defined two macros:
#define TCOV ASSERT(FALSE)
#define _COV ASSERT(TRUE)
and then I sprinkled
TCOV;
throughout the code, on every separate path I could find, and in every routine.
Then I ran the program under the debugger, and every time it hit a TCOV, it would halt.
I would look at the code for any obvious problems, and then edit it to _COV, then continue. The code would recompile on the fly and move on to the next TCOV.
In this way, I slowly, laboriously, eliminated enough TCOV statements so it would run "normally".
After a while, I grepped the code for TCOV, and that showed what code I had not tested. Then I went back and ran it again, making sure to test more branches I had not tried earlier.
I kept doing this until there were no TCOV statements left in the code.
This took a few hours, but in the process I found and fixed several bugs. There is no way I could have had the discipline to make and follow a test plan that would have been that thorough.
Not only did I know I had covered all branches, but it had made me look at every branch while it was running - a very good kind of code review.
So, whether or not you use a coverage tool, this is a good way to root out bugs that would otherwise lurk in the code until a more embarrassing time.
I personally find 100% test coverage to be problematic on multiple levels. First and foremost, you have to make sure you are gaining a tangible, cost-saving benefit from the unit tests you write. In addition, unit tests, like any other code, are CODE. That means they, just like any other code, must be verified for correctness and maintained. The additional time spent verifying that extra code for correctness, maintaining it, and keeping those tests valid in response to changes in business code adds cost. Achieving 100% test coverage and ensuring you test your code as thoroughly as possible is a laudable endeavor, but achieving it at any cost... well, is often too costly.
Error and validity checks that guard against fringe or extremely rare, but definitely possible, exceptional cases are an example of code that does not necessarily need to be covered. The amount of time, effort (and ultimately money) that must be invested to achieve coverage of such rare fringe cases is often wasteful in light of other business needs. Properties are often a part of code that, especially with C# 3.0, does not need to be tested, as most, if not all, properties behave exactly the same way and are excessively simple (a single-statement return or set). Investing tremendous amounts of time wrapping unit tests around thousands of properties could quite likely be better invested somewhere else, where a greater, more valuable, tangible return on that investment can be realized.
Beyond simply achieving 100% test coverage, there are similar problems with trying to set up the "perfect" unit test. Mocking frameworks have progressed to an amazing degree these days, and almost anything can be mocked (if you are willing to pay money, TypeMock can actually mock anything and everything, but it does cost a lot). However, there are often times when the dependencies of your code were not written in a mockable way (this is actually a core problem with the vast bulk of the .NET framework itself). Investing time to achieve the proper scope of a test is useful, but putting in excessive amounts of time to mock away everything and anything under the sun, adding layers of abstraction and interfaces to make it possible, is again most often a waste of time, effort, and ultimately money.
The ultimate goal with testing shouldn't really be to achieve the ultimate in code coverage. The ultimate goal should be achieving the greatest value per unit time invested in writing unit tests, while covering as much as possible in that time. The best way to achieve this is to take the BDD approach: Specify your concerns, define your context, and verify the expected outcomes occur for any piece of behavior being developed (behavior...not unit.)
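A minimal sketch of what that looks like in practice, assuming JUnit 5 and entirely made-up domain names: the test is named after a behaviour and organised as context, action, outcome, rather than mirroring one method of one class.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Behaviour-level sketch, not a prescribed structure.
class OverdraftBehaviourTest {

    // Minimal domain stand-ins so the sketch compiles; real code would live elsewhere.
    enum WithdrawalResult { ACCEPTED, REJECTED }

    static final class Account {
        private int balance;
        Account(int openingBalance) { this.balance = openingBalance; }
        int balance() { return balance; }
        WithdrawalResult withdraw(int amount) {
            if (amount > balance) {
                return WithdrawalResult.REJECTED;   // no overdraft in this toy model
            }
            balance -= amount;
            return WithdrawalResult.ACCEPTED;
        }
    }

    @Test
    void withdrawalBeyondBalanceIsRejectedWhenThereIsNoOverdraft() {
        Account account = new Account(100);                    // given: the context
        WithdrawalResult result = account.withdraw(150);       // when: the behaviour
        assertEquals(WithdrawalResult.REJECTED, result);       // then: the expected outcomes
        assertEquals(100, account.balance());
    }
}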
On a new project I practice TDD and maintain 100% line coverage. It mostly occurs naturally through TDD. Coverage gaps are usually worth the attention and are easily filled. If the coverage tool I'm using provided branch coverage or something else I'd pay attention to that, although I've never seen branch coverage tell me anything, probably because TDD got there first.
My strongest argument for maintaining 100% coverage (if you care about coverage at all) is that it's much easier to maintain 100% coverage than to manage less than 100% coverage. If you have 100% coverage and it drops, you immediately know why and can easily fix it, because the drop is in code you've just been working on. But if you settle for 95% or whatever, it's easy to miss coverage regressions and you're forever re-reviewing known gaps. It's the exact reason why current best practice requires one's test suite to pass completely. Anything less is harder, not easier, to manage.
My attitude is definitely bolstered by having worked in Ruby for some time, where there are excellent test frameworks and test doubles are easy. 100% coverage is also easy in Python. I might have to lower my standards in an environment with less amenable tools.
I would love to have the same standards on legacy projects, but I've never found it practical to bring a large application with mediocre coverage up to 100% coverage; I've had to settle for 95-99%. It's always been just too much work to go back and cover all the old code. This does not contradict my argument that it's easy to keep a codebase at 100%; it's much easier when you maintain that standard from the beginning.
No, because I spend my time adding new features that help the users rather than on tricky-to-write, obscure tests that deliver little value. I say unit test the big things, the subtle things, and the things that are fragile.
I generally write unit tests just as a regression-prevention method. When a bug is reported that I have to fix, I create a unit test to ensure that it doesn't resurface in the future. I may create a few tests for sections of functionality I have to make sure stay intact (or for complex inter-part interactions), but I usually wait for a bug fix to tell me one is necessary.
I usually manage to hit 93-100% coverage, but I don't aim for 100% anymore. I used to, and while it's doable, it's not worth the effort beyond a certain point, because testing the blindingly obvious usually isn't needed. A good example of this is the true branch of the following code snippet:
public void method(boolean someBoolean) {
    if (someBoolean) {
        return;
    } else {
        /* do lots of stuff */
    }
}
However, what is important is to get as close to 100% coverage as possible on the functional parts of the class, since those are the dangerous waters of your application: the misty bog of creeping bugs and undefined behaviour, and of course the money-making flea circus.
From Ted Neward's blog:
By this point in time, most developers have at least heard of, if not considered adoption of, the Masochistic Testing meme. Fellow NFJS'ers Stuart Halloway and Justin Gehtland have founded a consultancy firm, Relevance, that sets a high bar as a corporate cultural standard: 100% test coverage of their code.
Neal Ford has reported that ThoughtWorks makes similar statements, though it's my understanding that clients sometimes put accidental obstacles in their way of achieving said goal. It's ambitious, but as the ancient American Indian proverb is said to state,
If you aim your arrow at the sun, it will fly higher and farther than if you aim it at the ground.
In many cases it's not worth getting 100% statement coverage, but in some cases, it is worth it. In some cases 100% statement coverage is far too lax a requirement.
The key question to ask is, "what's the impact if the software fails (produces the wrong result)?". In most cases, the impact of a bug is relatively low. For example, maybe you have to go fix the code within a few days and rerun something. However, if the impact is "someone might die in 120 seconds", then that's a huge impact, and you should have a lot more test coverage than just 100% statement coverage.
I lead the Core Infrastructure Initiative Best Practices Badge for the Linux Foundation. We do have 100% statement coverage, but I wouldn't say it was strictly necessary. For a long time we were very close to 100%, and just decided to do that last little percent. We couldn't really justify the last few percent on engineering grounds, though; those last few percent were added purely as "pride of workmanship". I do get a very small extra peace of mind from having 100% coverage, but really it wasn't needed. We were over 90% statement coverage just from normal tests, and that was fine for our purposes. That said, we want the software to be rock-solid, and having 100% statement coverage has helped us get there. It's also easier to get 100% statement coverage today.
It's still useful to measure coverage, even if you don't need 100%. If your tests don't have decent coverage, you should be concerned. A bad test suite can have good statement coverage, but if you don't have good statement coverage, then by definition you have a bad test suite. How much you need is a trade-off: what are the risks (probability and impact) from the software that is totally untested? By definition it's more likely to have errors (you didn't test it!), but if you and your users can live with those risks (probability and impact), it's okay. For many lower-impact projects, I think 80%-90% statement coverage is okay, with better being better.
On the other hand, if people might die from errors in your software, then 100% statement coverage isn't enough. I would at least add branch coverage, and maybe more, to check on the quality of your tests. Standards like DO-178C (for airborne systems) take this approach - if a failure is minor, no big deal, but if a failure could be catastrophic, then much more rigorous testing is required. For example, DO-178C requires MC/DC coverage for the most critical software (the software that can quickly kill people if it makes a mistake). MC/DC is way more strenuous than statement coverage or even branch coverage.
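To make the difference concrete, here is a small illustrative decision (the method and names are invented) and what each criterion demands of it:

// For the decision armed && (altitudeLow || manualOverride):
//  - statement coverage: any one test that executes the return line is enough;
//  - branch coverage: one test where the whole decision is true and one where it is false;
//  - MC/DC: each condition must be shown to independently flip the outcome while the
//    others are held fixed, which needs at least four vectors here, e.g.
//    (T,T,F), (F,T,F), (T,F,F), (T,F,T).
public static boolean deployWarning(boolean armed, boolean altitudeLow, boolean manualOverride) {
    return armed && (altitudeLow || manualOverride);
}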
I only have 100% coverage on new pieces of code that have been written with testability in mind. With proper encapsulation, each class and function can have functional unit tests that simultaneously give close to 100% coverage. It's then just a matter of adding some additional tests that cover some edge cases to get you to 100%.
You shouldn't write tests just to get coverage. You should be writing functional tests that test correctness/compliance. With a good functional specification that covers all the ground, and a good software design, you can get good coverage for free.
Yes, I have had projects that have had 100% line coverage. See my answer to a similar question.
You can get 100% line coverage, but as others have pointed out here on SO and elsewhere on the internet, it's maybe only a minimum. When you consider path and branch coverage, there's a lot more work to do.
The other way of looking at it is to try to make your code so simple that it's easy to get 100% line coverage.
There's a lot of good information here; I just wanted to add a few more benefits that I've found when aiming for 100% code coverage in the past.
It helps reduce code complexity
Since it is easier to remove a line than it is to write a test case, aiming for 100% coverage forces you to justify every line, every branch, and every if statement, often leading you to discover a much simpler way to do things that requires fewer tests.
It helps develop good test granularity
You can achieve high test coverage by writing lots of small tests testing tiny bits of implementation as you go. This can be useful for tricky bits of logic, but doing it for every piece of code, no matter how trivial, can be tedious, slow you down, and become a real maintenance burden, also making your code harder to refactor. On the other hand, it is very hard to achieve good test coverage with very high-level, end-to-end behavioural tests, because typically the thing you are testing involves many components interacting in complicated ways, and the permutations of possible cases become very large very quickly. Therefore, if you are practical and also want to aim for 100% test coverage, you quickly learn to find a level of granularity for your tests at which you can achieve a high level of coverage with a few good tests. You can achieve this by testing components at a level where they are simple enough that you can reasonably cover all the edge cases, but also complicated enough that you can test meaningful behaviour. Such tests end up being simple, meaningful, and useful for identifying and fixing bugs. I think this is a good skill and improves code quality and maintainability.
A while ago I did a little analysis of coverage in the JUnit implementation, code written and tested by, among others, Kent Beck and David Saff.
From the conclusions:
Applying line coverage to one of the best tested projects in the world, here is what we learned:
Carefully analyzing coverage of code affected by your pull request is more useful than monitoring overall coverage trends against thresholds.
It may be OK to lower your testing standards for deprecated code, but do not let this affect the rest of the code. If you use coverage thresholds on a continuous integration server, consider setting them differently for deprecated code.
There is no reason to have methods with more than 2-3 untested lines of code.
The usual suspects (simple code, dead code, bad weather behavior, …) correspond to around 5% of uncovered code.
In summary, should you monitor line coverage? Not all development teams do, and even in the JUnit project it does not seem to be a standard practice. However, if you want to be as good as the JUnit developers, there is no reason why your line coverage would be below 95%. And monitoring coverage is a simple first step to verify just that.
Related
I'm strongly considering adding unit testing to an existing project that is in production. It was started 18 months ago, before I could really see any benefit of TDD (face palm), so now it's a rather large solution with a number of projects and I haven't the foggiest idea where to start adding unit tests. What's making me consider this is that occasionally an old bug seems to resurface, or a bug is checked in as fixed without really being fixed. Unit testing would reduce or prevent these issues from occurring.
By reading similar questions on SO, I've seen recommendations such as starting at the bug tracker and writing a test case for each bug to prevent regression. However, I'm concerned that I'll end up missing the big picture and end up missing fundamental tests that would have been included if I'd used TDD from the get go.
Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in? How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?
So I guess what I'm also asking is:
Is it worth the effort for an existing solution that's in production?
Would it be better to ignore the testing for this project and add it in a possible future re-write?
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?
(Obviously the answer to the third point is entirely dependent on whether you're speaking to management or a developer.)
Reason for Bounty
Adding a bounty to try to attract a broader range of answers that not only confirm my existing suspicion that this is a good thing to do, but also give some good reasons against it.
I'm aiming to write this question up later with pros and cons to try and show management that it's worth spending the man hours on moving the future development of the product to TDD. I want to approach this challenge and develop my reasoning without my own biased point of view.
I've introduced unit tests to code bases that did not have them previously. On the last big project where I did this, the product was already in production with zero unit tests when I arrived on the team. When I left, two years later, we had 4,500 or so tests yielding about 33% code coverage in a code base of 230,000+ production LOC (a real-time financial WinForms application). That may sound low, but the result was a significant improvement in code quality and defect rate - plus improved morale and profitability.
It can be done when you have both an accurate understanding and commitment from the parties involved.
First of all, it is important to understand that unit testing is a skill in itself. You can be a very productive programmer by "conventional" standards and still struggle to write unit tests in a way that scales in a larger project.
Also, and specifically for your situation, adding unit tests to an existing code base that has no tests is also a specialized skill in itself. Unless you or somebody on your team has successful experience with introducing unit tests to an existing code base, I would say reading Feathers' book is a requirement (not optional or strongly recommended).
Making the transition to unit testing your code is an investment in people and skills just as much as in the quality of the code base. Understanding this is very important in terms of mindset and managing expectations.
Now, for your comments and questions:
However, I'm concerned that I'll end up missing the big picture and end up missing fundamental tests that would have been included if I'd used TDD from the get go.
Short answer: Yes, you will miss tests and yes they might not initially look like what they would have in a green field situation.
The deeper answer is this: It does not matter. You start with no tests. Start adding tests, and refactor as you go. As skill levels get better, start raising the bar for all newly written code added to your project. Keep improving, etc.
Now, reading between the lines here, I get the impression that this is coming from the mindset of "perfection as an excuse for not taking action". A better mindset is to focus on self-trust: you may not know how to do it yet, but you will figure it out as you go and fill in the blanks. Therefore, there is no reason to worry.
Again, it's a skill. You cannot go from zero tests to TDD perfection in one linear "process" or "step-by-step" cookbook approach. It will be a process. Your expectations must be to make gradual and incremental progress and improvement. There is no magic pill.
The good news is that as the months (and even years) pass, your code will gradually start to become "proper" well factored and well tested code.
As a side note: you will find that the primary obstacle to introducing unit tests in an old code base is lack of cohesion and excessive dependencies. You will therefore probably find that the most important skill will become how to break existing dependencies and decouple code, rather than writing the actual unit tests themselves.
Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in?
Unless you already have it, set up a build server and set up a continuous integration build that runs on every checkin including all unit tests with code coverage.
Train your people.
Start somewhere and start adding tests while you make progress from the customer's perspective (see below).
Use code coverage as a guiding reference of how much of your production code base is under test.
Build time should always be FAST. If your build time is slow, your unit testing skills are lagging. Find the slow tests and improve them (decouple production code and test in isolation). Well written, you should easily be able to have several thousand unit tests and still complete a build in under 10 minutes (roughly one to a few ms per test is a good but very rough guideline; a few exceptions may apply, like code using reflection, etc.).
Inspect and adapt.
How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?
Your own judgement must be your primary source of reality. There is no metric that can replace skill.
If you don't have that experience or judgement, consider contracting someone who does.
Two rough secondary indicators are total code coverage and build speed.
Is it worth the effort for an existing solution that's in production?
Yes. The vast majority of the money spent on a custom built system or solution is spent after it is put in production. And investing in quality, people and skills should never be out of style.
Would it be better to ignore the testing for this project and add it in a possible future re-write?
You would have to take into consideration, not only the investment in people and skills, but most importantly the total cost of ownership and the expected life time of the system.
My personal answer would be "yes, of course" in the majority of cases, because I know it's just so much better, but I recognize that there might be exceptions.
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?
Neither. Your approach should be to add tests to your code base WHILE you are making progress in terms of functionality.
Again, it is an investment in people, skills AND the quality of the code base, and as such it will require time. Team members need to learn how to break dependencies, write unit tests, learn new habits, improve discipline and quality awareness, learn how to better design software, etc. It is important to understand that when you start adding tests, your team members likely don't have these skills yet at the level they need to be for that approach to be successful, so stopping progress to spend all your time adding a lot of tests simply won't work.
Also, adding unit tests to an existing code base of any sizeable project is a LARGE undertaking which requires commitment and persistence. You can't change something fundamental, expect a lot of learning along the way, and ask your sponsor not to expect any ROI while you halt the flow of business value. That won't fly, and frankly it shouldn't.
Thirdly, you want to instill sound business focus values in your team. Quality never comes at the expense of the customer and you can't go fast without quality. Also, the customer is living in a changing world, and your job is to make it easier for him to adapt. Customer alignment requires both quality and the flow of business value.
What you are doing is paying off technical debt. And you are doing so while still serving your customers' ever-changing needs. Gradually, as debt is paid off, the situation improves, and it is easier to serve the customer better and deliver more value. This positive momentum is what you should aim for, because it underlines the principle of sustainable pace and will maintain and improve morale - for your development team, your customer, and your stakeholders alike.
Is it worth the effort for an existing solution that's in production?
Yes!
Would it be better to ignore the testing for this project and add it in a possible future re-write?
No!
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?
Adding testing (especially automated testing) makes it much easier to keep the project working in the future, and it makes it significantly less likely that you'll ship stupid problems to the user.
Tests to put in a priori are ones that check whether the public interface to your code (and each module in it) is working the way you think it does. If you can, try to also induce each isolated failure mode that your code modules should have (note that this can be non-trivial, and you should be careful not to check too closely how things fail; e.g., you don't really want to do things like counting the number of log messages produced on failure, since verifying that it is logged at all is enough).
Then put in a test for each current bug in your bug database that induces exactly the bug and which will pass when the bug is fixed. Then fix those bugs! :-)
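For example, such a bug-pinning test might look like this (the bug number and classes are invented); it fails against the broken code and becomes a permanent regression guard once the fix is in:

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Hypothetical example of a bug-pinning test. The tracker id in the name makes it
// easy to trace back to the report.
class OrderTotalRegressionTest {

    @Test
    void bug4711_emptyOrderHasZeroTotalInsteadOfThrowing() {
        Order order = new Order();          // the exact situation from the bug report
        assertEquals(0, order.total());     // expected behaviour once fixed
    }

    // Minimal stand-in so the sketch compiles; the real class lives in production code.
    static final class Order {
        private final java.util.List<Integer> lines = new java.util.ArrayList<>();
        int total() {
            return lines.stream().mapToInt(Integer::intValue).sum();   // fixed version shown
        }
    }
}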
It does cost time up front to add tests, but you get paid back many times over at the back end as your code ends up being of much higher quality. That matters enormously when you're trying to ship a new version or carry out maintenance.
The problem with retrofitting unit tests is you'll realise you didn't think of injecting a dependency here or using an interface there, and before long you'll be rewriting the entire component. If you have the time to do this, you'll build yourself a nice safety net, but you could have introduced subtle bugs along the way.
I've been involved with many projects which really needed unit tests from day one, and there is no easy way to get them in there, short of a complete rewrite, which cannot usually be justified when the code is working and already making money. Recently, I have resorted to writing powershell scripts that exercise the code in a way that reproduces a defect as soon as it is raised and then keeping these scripts as a suite of regression tests for further changes down the line. That way you can at least start to build up some tests for the application without changing it too much, however, these are more like end to end regression tests than proper unit tests.
I do agree with what most everyone else has said. Adding tests to existing code is valuable. I will never disagree with that point, but I would like to add one caveat.
Although adding tests to existing code is valuable, it does come at a cost. It comes at the cost of not building out new features. How these two things balance out depends entirely on the project, and there are a number of variables.
How long will it take you to put all that code under test? Days? Weeks? Months? Years?
Who are you writing this code for? Paying customers? A professor? An open source project?
What is your schedule like? Do you have hard deadlines you must meet? Do you have any deadlines at all?
Again, let me stress, tests are valuable and you should work to put your old code under test. This is really more a matter of how you approach it. If you can afford to drop everything and put all your old code under test, do it. If that's not realistic, here's what you should do at the very least:
Any new code you write should be completely under unit test
Any old code you happen to touch (bug fix, extension, etc.) should be put under unit test
Also, this is not an all or nothing proposition. If you have a team of, say, four people, and you can meet your deadlines by putting one or two people on legacy testing duty, by all means do that.
Edit:
I'm aiming to write this question up later with pros and cons to try and show management that it's worth spending the man hours on moving the future development of the product to TDD.
This is like asking "What are the pros and cons to using source control?" or "What are the pros and cons to interviewing people before hiring them?" or "What are the pros and cons to breathing?"
Sometimes there is only one side to the argument. You need to have automated tests of some form for any project of any complexity. No, tests don't write themselves, and, yes, it will take a little extra time to get things out the door. But in the long run it will take more time and cost more money to fix bugs after the fact than write tests up front. Period. That's all there is to it.
When we started adding tests, it was to a ten-year-old, approximately million-line codebase, with far too much logic in the UI and in the reporting code.
One of the first things we did (after setting up a continuous build server) was to add regression tests. These were end-to-end tests.
Each test suite starts by initializing the database to a known state. We actually have dozens of regression datasets that we keep in Subversion (in a separate repository from our code, because of the sheer size). Each test's FixtureSetUp copies one of these regression datasets into a temp database, and then runs from there.
The test fixture setup then runs some process whose results we're interested in. (This step is optional -- some regression tests exist only to test the reports.)
Then each test runs a report, outputs the report to a .csv file, and compares the contents of that .csv to a saved snapshot. These snapshot .csvs are stored in Subversion next to each regression dataset. If the report output doesn't match the saved snapshot, the test fails.
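A stripped-down sketch of just the snapshot-comparison step (paths, helpers, and the report itself are stand-ins; the real setup restores a regression dataset and runs the actual report):

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import org.junit.jupiter.api.Test;

// Illustrative only: file names and the report-producing helper are hypothetical.
class SalesReportRegressionTest {

    @Test
    void salesReportMatchesApprovedSnapshot() throws Exception {
        Path actual = runReportToCsv();                                // produce fresh output
        Path expected = Path.of("snapshots", "sales_report.csv");      // checked-in snapshot

        // Any difference fails the test: either a regression, or an intentional change
        // that should be reviewed and then copied over the snapshot.
        assertEquals(Files.readString(expected), Files.readString(actual));
    }

    private Path runReportToCsv() throws Exception {
        // Placeholder: in the real suite this would restore a regression dataset,
        // run the report, and write its output to a temp file.
        Path out = Files.createTempFile("sales_report", ".csv");
        Files.writeString(out, "region,total\nNorth,100\n");
        return out;
    }
}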
The purpose of regression tests is to tell you if something changes. That means they fail if you broke something, but they also fail if you changed something on purpose (in which case the fix is to update the snapshot file). You don't know that the snapshot files are even correct -- there might be bugs in the system (and then when you fix those bugs, the regression tests will fail).
Nevertheless, regression tests were a huge win for us. Just about everything in our system has a report, so by spending a few weeks getting a test harness around the reports, we were able to get some level of coverage over a huge part of our code base. Writing the equivalent unit tests would have taken months or years. (Unit tests would have given us far better coverage, and would have been far less fragile; but I'd rather have something now, rather than waiting years for perfection.)
Then we went back and started adding unit tests when we fixed bugs, or added enhancements, or needed to understand some code. Regression tests in no way remove the need for unit tests; they're just a first-level safety net, so that you get some level of test coverage quickly. Then you can start refactoring to break dependencies, so you can add unit tests; and the regression tests give you a level of confidence that your refactoring isn't breaking anything.
Regression tests have problems: they're slow, and there are too many reasons why they can break. But at least for us, they were so worth it. They've caught countless bugs over the last five years, and they catch them within a few hours, rather than waiting for a QA cycle. We still have those original regression tests, spread over seven different continuous-build machines (separate from the one that runs the fast unit tests), and we even add to them from time to time, because we still have so much code that our 6,000+ unit tests don't cover.
It's absolutely worth it. Our app has complex cross-validation rules, and we recently had to make significant changes to the business rules. We ended up with conflicts that prevented the user from saving. I realized it would take forever to sort it out in the application (it takes several minutes just to get to the point where the problems were). I'd wanted to introduce automated unit tests and had the framework installed, but I hadn't done anything beyond a couple of dummy tests to make sure things were working. With the new business rules in hand, I started writing tests. The tests quickly identified the conditions that caused the conflicts, and we were able to get the rules clarified.
If you write tests that cover the functionality you're adding or modifying, you'll get an immediate benefit. If you wait for a re-write, you may never have automated tests.
You shouldn't spend a lot of time writing tests for existing things that already work. Most of the time, you don't have a specification for the existing code, so the main thing you're testing is your reverse-engineering ability. On the other hand, if you're going to modify something, you need to cover that functionality with tests so you'll know you made the changes correctly. And of course, for new functionality, write tests that fail, then implement the missing functionality.
I'll add my voice and say yes, it's always useful!
There are some distinctions you should keep in mind, though: black-box vs white-box, and unit vs functional. Since definitions vary, here's what I mean by these:
Black-box = tests that are written without special knowledge of the implementation, typically poking around at the edge cases to make sure things happen as a naive user would expect.
White-box = tests that are written with knowledge of the implementation, which often try to exercise well-known failure points.
Unit tests = tests of individual units (functions, separable modules, etc). For example: making sure your array class works as expected, and that your string comparison function returns the expected results for a wide range of inputs.
Functional tests = tests of the entire system all at once. These tests will exercise a big chunk of the system all at once. For example: init, open a connection, do some real-world stuff, close down, terminate. I like to draw a distinction between these and unit tests, because they serve a different purpose.
When I've added tests to a shipping product late in the game, I found that I got the most bang for the buck from white-box and functional tests. If there's any part of the code that you know is especially fragile, write white-box tests to cover the problem cases to help make sure it doesn't break the same way twice. Similarly, whole-system functional tests are a useful sanity check that helps you make sure you never break the 10 most common use cases.
Black-box and unit tests of small units are useful too, but if your time is limited, it's better to add them early. By the time you're shipping, you've generally found (the hard way) the majority of the edge cases and problems that these tests would have found.
Like the others, I'll also remind you of the two most important things about TDD:
Creating tests is a continuous job. It never stops. You should try to add new tests every time you write new code, or modify existing code.
Your test suite is never infallible! Don't let the fact that you have tests lull you into a false sense of security. Just because it passes the test suite doesn't mean it's working correctly, or that you haven't introduced a subtle performance regression, etc.
You don't mention the implementation language, but if in Java then you could try this approach:
In a separate source tree, build regression or 'smoke' tests, using a tool to generate them, which might get you close to 80% coverage. These tests execute all the code logic paths and verify from that point on that the code still does exactly what it does currently (even if a bug is present). This gives you a safety net against inadvertently changing behaviour when doing the necessary refactoring to make the code easily unit-testable by hand.
Product suggestion - I used to use the free web based product Junit Factory, but sadly it's closed now. However the product lives on in the commercially licenced AgitarOne JUnit Generator at http://www.agitar.com/solutions/products/automated_junit_generation.html
For each bug you fix, or feature you add from now on, use a TDD approach to ensure new code is designed to be testable and place these tests in a normal test source tree.
Existing code will also likely need to be changed, or refactored to make it testable as part of adding new features; your smoke tests will give you a safety net against regressions or inadvertent subtle changes to behaviour.
When making changes (bug fixes or features) via TDD, it's likely that the companion smoke test will fail once you're done. Verify that the failures are expected given the changes made, and then remove the less readable smoke test, since your hand-written unit test has full coverage of that improved component. Ensure that your test coverage does not decline; it should only stay the same or increase.
When fixing bugs write a failing unit test that exposes the bug first.
Whether it's worth adding unit tests to an app that's in production depends on the cost of maintaining the app. If the app has few bugs and enhancement requests, then maybe it's not worth the effort. OTOH, if the app is buggy or frequently modified then unit tests will be hugely beneficial.
At this point, remember that I'm talking about adding unit tests selectively, not trying to generate a suite of tests similar to those that would exist if you had practiced TDD from the start. Therefore, in response to the second half of your second question: make a point of using TDD on your next project, whether it's a new project or a re-write (apologies, but here is a link to another book that you really should read: Growing Object-Oriented Software, Guided by Tests).
My answer to your third question is the same as the first: it depends on the context of your project.
Embedded within your post is a further question about ensuring that any retrofitted testing is done properly. The important thing to ensure is that unit tests really are unit tests, and this (more often than not) means that retrofitting tests requires refactoring existing code to allow decoupling of your layers/components (cf. dependency injection; inversion of control; stubbing; mocking). If you fail to enforce this, your tests become integration tests, which are useful, but less targeted and more brittle than true unit tests.
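As a small illustration of that decoupling (all names are invented): hiding a hard-wired dependency behind an interface lets the test substitute a stub, which is what turns the test into a true unit test.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// The hard-wired dependency (imagine new HttpRateClient() inside the constructor)
// is replaced by an interface that the test can stub.
interface ExchangeRateSource {
    double rate(String from, String to);
}

final class PriceConverter {
    private final ExchangeRateSource rates;

    PriceConverter(ExchangeRateSource rates) {   // dependency injected, not constructed here
        this.rates = rates;
    }

    double convert(double amount, String from, String to) {
        return amount * rates.rate(from, to);
    }
}

// A true unit test: no network, no container, just the class under test and a lambda stub.
class PriceConverterTest {
    @Test
    void convertsUsingTheSuppliedRate() {
        PriceConverter converter = new PriceConverter((from, to) -> 2.0);
        assertEquals(20.0, converter.convert(10.0, "EUR", "USD"));
    }
}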
I would like to start this answer by saying that unit testing is really important because it will help you arrest bugs before they creep into production.
Identify the projects/modules where bugs have been re-introduced and start writing tests for those. It makes perfect sense to write tests for new functionality and for bug fixes.
Is it worth the effort for an existing solution that's in production?
Yes. You will see the number of bugs coming down and maintenance becoming easier.
Would it be better to ignore the testing for this project and add it in a possible future re-write?
I would recommend starting now.
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?
You are asking the wrong question. Definitely, functionality is more important than anything else. But rather, you should ask whether spending a few weeks adding tests will make the system more stable. Will it help my end users? Will it help a new developer on the team to understand the project, and ensure that he or she doesn't introduce a bug due to a lack of understanding of the overall impact of a change?
I'm very fond of Refactor the Low-hanging Fruit as an answer to the question of where to begin refactoring. It's a way to ease into better design without biting off more than you can chew.
I think the same logic applies to TDD - or just unit tests: write the tests you need, as you need them; write tests for new code; write tests for bugs as they appear. You're worried about neglecting harder-to-reach areas of the code base, and it's certainly a risk, but as a way to get started: get started! You can mitigate the risk down the road with code coverage tools, and the risk isn't (in my opinion) that big, anyway: if you're covering the bugs, covering the new code, covering the code you're looking at, then you're covering the code that has the greatest need for tests.
Yes, it is. When you start adding new functionality, it can require modifying old code, and as a result it is a source of potential bugs.
(See the first one.) Before you start adding new functionality, all (or almost all) of the code should ideally be covered by unit tests.
(See the first and second ones.) :) A grandiose new piece of functionality can "destroy" the old, working code.
Yes, it can. Just try to make sure all the code you write from now on has a test in place.
If the code that is already in place needs to be modified and can be tested, then do so, but it is better not to be too vigorous in trying to get tests in place for stable code. That sort of thing tends to have a knock-on effect and can spiral out of control.
Is it worth the effort for an existing solution that's in production?
Yes. But you don't have to write all unit tests to get started. Just add them one by one.
Would it be better to ignore the testing for this project and add it in a possible future re-write?
No. The first time you add code that breaks the functionality, you will regret it.
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?
For new functionality (code) it is simple. You write the unit test first and then the functionality.
For old code, you decide along the way. You don't have to have all the unit tests in place... Add the ones it hurts most not to have... Time (and errors) will tell which ones you have to focus on ;)
Update
6 years after the original answer, I have a slightly different take.
I think it makes sense to add unit tests to all new code you write - and then refactor places where you make changes to make them testable.
Writing tests in one go for all your existing code will not help - but not writing tests for new code you write (or areas you modify) also doesn't make sense. Adding tests as you refactor/add things is probably the best way to add tests and make the code more maintainable in an existing project with no tests.
Earlier answer
I'm going to raise a few eyebrows here :)
First of all, what is your project? If it is a compiler or a language or a framework or anything else that is not going to change functionally for a long time, then I think it's absolutely fantastic to add unit tests.
However, if you are working on an application that is probably going to require changes in functionality (because of changing requirements) then there is no point in taking that extra effort.
Why?
Unit tests only test the code - whether it does what it was designed to do - and they are not a replacement for manual testing, which has to be done anyway (to uncover functional bugs, usability issues, and all other kinds of issues).
Unit tests cost time! Now where I come from, that's a precious commodity - and business generally picks better functionality over a complete test suite.
If your application is even remotely useful to users, they are going to request changes - so you will have versions that will do things better, faster and probably do new things - there may also be a lot of refactoring as your code grows. Maintaining a full grown unit test suite in a dynamic environment is a headache.
Unit tests are not going to affect the perceived quality of your product - the quality that the user sees. Sure, your methods might work exactly as they did on day 1, the interface between presentation layer and business layer might be pristine - but guess what? The user does not care! Get some real testers to test your application. And more often than not, those methods and interfaces have to change anyways, sooner or later.
What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality? - There are a hell of a lot of things you can do better than writing tests: write new functionality, improve performance, improve usability, write better help manuals, resolve pending bugs, etc.
Now don't get me wrong - if you are absolutely positive that things are not going to change for the next 100 years, go ahead, knock yourself out and write those tests. Automated tests are also a great idea for APIs, where you absolutely do not want to break third-party code. Everywhere else, it's just something that makes me ship later!
It's unlikely you'll ever have significant test coverage, so you must be tactical about where you add tests:
As you mentioned, when you find a bug, it's a good time to write a test (to reproduce it), and then fix the bug. If you see the test reproduce the bug, you can be sure it's a good, valid test. Given that such a large portion of bugs are regressions (50%?), it's almost always worth writing regression tests.
When you dive into an area of code to modify it, it's a good time to write tests around it. Depending on the nature of the code, different tests are appropriate. One good set of advice is found here.
OTOH, it's not worth just sitting around writing tests around code that people are happy with - especially if nobody is going to modify it. It just doesn't add value (except maybe understanding the behavior of the system).
Good luck!
You say you don't want to buy another book. So just read Michael Feathers' article on working effectively with legacy code. Then buy the book :)
If I were in your place, I would probably take an outside-in approach, starting with functional tests that exercise the whole system. I would try to re-document the system's requirements using a BDD specification language like RSpec, and then write tests to verify those requirements by automating the user interface.
Then I would do defect driven development for newly discovered bugs, writing unit tests to reproduce the problems, and work on the bugs until the tests pass.
For new features, I would stick with the outside-in approach: Start with features documented in RSpec and verified by automating the user interface (which will of course fail initially), then add more finely-grained unit tests as the implementation moves along.
I'm no expert on the process, but from what little experience I have I can tell you that BDD via automated UI testing is not easy, but I think it's worth the effort, and probably would yield the most benefit in your case.
I'm not a seasoned TDD expert by any means, but of course I would say that it's incredibly important to unit test as much as you can. Since the code is already in place, I would start by getting some sort of unit test automation in place. I use TeamCity to exercise all of the tests in my projects, and it gives you a nice summary of how the components did.
With that in place, I'd move on to those really critical business-logic-like components that can't fail. In my case, there are some basic trigonometry problems that need to be solved for various inputs, so I test the heck out of those. The reason I do this is that when I'm burning the midnight oil, it's very easy to waste time digging down to depths of code that really don't need to be touched, because you know they are tested for all of the possible inputs (in my case, there is a finite number of inputs).
Ok, so now you hopefully feel better about those critical pieces. Instead of sitting down and banging out all of the tests, I would attack them as they come up. If you hit a bug that's a real PITA to fix, write the unit tests for it and get them out of the way.
There are cases where you'll find that testing is tough because you can't instantiate a particular class from the test, so you have to mock it. Oh, but maybe you can't mock it easily because you didn't write to an interface. I take these "whoops" scenarios as an opportunity to implement said interface, because, well, it's a Good Thing.
From there, I'd get your build server or whatever automation you have in place configured with a code coverage tool. They create nasty bar graphs with big red zones where you have poor coverage. Now 100% coverage isn't your goal, nor would 100% coverage necessarily mean your code is bulletproof, but the red bar definitely motivates me when I have free time. :)
There are so many good answers that I will not repeat their content. I checked your profile and it seems you are a C# .NET developer, so I'm adding a reference to the Microsoft Pex and Moles project, which can help you with auto-generating unit tests for legacy code. I know that auto-generation is not the best way, but at least it is a way to start. Check this very interesting article from MSDN Magazine about using Pex for legacy code.
I suggest reading a brilliant article by a TopTal Engineer, that explains where to start adding tests: it contains a lot of maths, but the basic idea is:
1) Measure your code's Afferent Coupling (CA) (how much a class is used by other classes, meaning breaking it would cause widespread damage)
2) Measure your code's Cyclomatic Complexity (CC) (higher complexity = higher chance of breaking)
You need to identify classes with high CA and CC, i.e., define a function f(CA, CC), and the classes with the smallest differences between the two metrics should be given the highest priority for test coverage.
Why? Because classes with high CA but very low CC are very important but unlikely to break. On the other hand, classes with low CA but high CC are likely to break but will cause less damage. So you want to balance the two.
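A toy sketch of one possible reading of that prioritisation (the scoring function below is my own simplification, not the article's exact formula): normalise both metrics to [0, 1] and rank highest the classes where the smaller of the two normalised values is largest, which rewards classes that are both heavily depended upon and complex.

import java.util.Comparator;
import java.util.List;

// Illustrative only; real metrics would come from a static-analysis tool.
record ClassMetrics(String name, double afferentCoupling, double cyclomaticComplexity) {}

class TestPriority {

    // Score in [0, 1]: high only when BOTH normalised CA and CC are high.
    static double score(ClassMetrics m, double maxCa, double maxCc) {
        double ca = m.afferentCoupling() / maxCa;
        double cc = m.cyclomaticComplexity() / maxCc;
        return Math.min(ca, cc);
    }

    static List<ClassMetrics> byPriority(List<ClassMetrics> all) {
        double maxCa = all.stream().mapToDouble(ClassMetrics::afferentCoupling).max().orElse(1);
        double maxCc = all.stream().mapToDouble(ClassMetrics::cyclomaticComplexity).max().orElse(1);
        return all.stream()
                .sorted(Comparator.comparingDouble((ClassMetrics m) -> score(m, maxCa, maxCc)).reversed())
                .toList();
    }
}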
It depends...
It's great to have unit tests but you need to consider who your users are and what they are willing to tolerate in order to get a more bug-free product. Inevitably by refactoring your code which has no unit tests at present, you will introduce bugs and many users will find it hard to understand that you are making the product temporarily more defective to make it less defective in the long run. Ultimately it's the users who will have the final say...
Yes.
No.
Adding tests.
Going towards a more TDD approach will actually better inform your efforts to add new functionality and make regression testing much easier. Check it out!
I have a largish, complex app of around 27k lines. It's essentially a rule-driven, multithreaded processing engine; without giving too much away, it's been partially tested as it's been built - certain components only.
The question I have is: what are the pros and cons of doing unit testing after the fact, so to speak - after it's been implemented? It is clear that traditional testing is going to take 2-3+ months to test every facet, and it all needs to work, and that time is not really available.
I've done a fair bit of unit testing in the past, but generally it's been on desktop automation or LOB apps, which are fairly simple. The app itself is highly componentized internally - interface-driven, really. I've not decided on which particular framework to use. Any advice would be appreciated.
What say you?
I think there are several advantages to unit testing existing code
Regression management
Better understanding of the code. Testing it will reveal cases you did not anticipate and will help define the behavior of the code
It will point out design deficiencies in the code as you struggle to test poorly defined methods.
But I think it's more interesting to consider the cons of unit testing code. AFAIK, there are no cons. All of the time spent adding tests will pay for itself in anything but the shortest of time cycles.
There are many reasons to unit test code. The main reason I would advocate unit testing after the fact is simple. Your code is broken, you just don't know it yet.
There is a very simple rule in software. If the code is not tested, it's broken. This may not be immediately obvious at first, but as you begin testing, you will find bugs. It's up to you to determine how much you care about finding these bugs.
Besides this, there are several other important benefits of unit testing:
regression testing will be made simpler
other developers, that are less knowledgeable, can't break your desired behavior
the tests are a form of self documentation
can reduce time in future modifications (no more manual testing?, less bugs?)
The list can go on and on. The only real drawback is the time it takes to write these tests. I believe that drawback will always be offset by the time it would have taken you to debug problems you could have found while unit testing!
Depending on how many bugs "manual testing" turns up, you could simply do test-driven bug fixing which in my experience is far more effective than simply driving up code coverage by writing "post-mortem" unit tests.
(Which is not to say writing unit tests afterwards is a bad idea, it's just that TDD is almost always a better idea.)
Here's a few of each to my mind:
Pro:
Time is saved in not having to test methods that have been removed as the design evolved over time. What is left is what really has to get tested.
By adding tests, this allows an opportunity to review all the aspects in the app and determine what other optimizations one could add now that a working prototype is ready.
Con:
Large time investment to get the tests written, new functionality may be delayed for some time to generate all the tests.
Bugs may have been introduced that the tests will discover that may cause this to be longer than initially planned.
The main point would be that adding unit tests allows for refactoring and putting more polish on the application.
I think one of the biggest cons of testing "after the fact" is that you will probably have a harder time testing. If you write code without tests, you usually don't have testability in mind and end up writing code that is hard to test.
But after you spend this extra time writing tests and changing your code for better testability, you'll be much more confident about making changes, since you won't need a lot of time to debug and check that everything is OK.
Finally, you might find new bugs which weren't caught before, and spend some time fixing them. But hey, that's what tests are for =)
Pro post facto unit testing:
Get documentation you can trust.
Improve understanding of the code.
Push toward refactoring and improving the code itself.
Fix bugs that lurk in the code.
Con post facto unit testing:
Waste time fixing bugs you can live with. (If you wrote 27KLOC, we hope it does something, right?)
Spend time understanding and refactoring code you don't need to understand.
Lose time that could go into the next project.
The unasked question is just how important an asset is this code to your organization, long term? The answer to this question determines how much you should invest. I have plenty of (successful) competitors where the major purpose of their code is to get out numbers to evaluate some new technique or idea. Once they have the numbers, the code is of little marginal value. They (rightly) test very carefully to make sure the numbers are meaningful. After that, if there are fifty open bugs that don't affect the numbers, they don't care. And why should they? The code has served its purpose.
If you are doing any refactoring, those tests will help you detect any bugs that will appear in the process.
Unit testing "after the fact" is still valuable, and provides most of the same advantages of unit testing during development.
That being said, I find it's more work to test after the fact (if you want to get the same level of testing). It's still valuable, and still worth while.
Personally, when trying to tackle something with limited time, I try to focus my testing efforts as much as possible. Any time you fix a bug, add tests to help prevent it in the future. Any time you're going to refactor, try to put enough testing in place to feel confident you're not going to break something.
The only con of adding unit testing is that it does take some development time. Personally, I find that the development time spent on testing is far outweighed by the time saved in maintenance, but this is something you need to determine on your own.
Unit testing is still definitely useful. Check out http://en.wikipedia.org/wiki/Unit_testing for a full list and explanation of the benefits.
The main benefits you will gain are documentation, making change easier, and it simplifies future integration.
There are really no costs to adding unit testing except your time. Realize though that the time you spend adding unit testing will reduce the amount of time you will need to spend in other areas of development by at least the same amount and most likely more.
Unit testing doesn't prove that a system works. It proves that each unit works as an independent unit. It doesn't prove that the integrated system will work.
Unit testing "after the fact" is useful for two things - finding bugs that you've missed so far and won't find using any other kind of testing (especially for rare conditions - there's huge numbers of rare conditions that can happen in particular units for any real world system), and as regression tests during maintenance.
Neither of these is going to help much in your situation - you need to do other forms of testing either way. If you don't have time to do what you need to do, taking on even more work is unlikely to help.
That said, without unit testing, I guarantee you will have nasty surprises when the customers start using the code. It's all those rare conditions - there are so many of them that some of them are bound to occur soon. Black-box testers tend to get into habitual patterns, which means they only test so many rare cases - and they have no way of knowing what rare cases there are in particular units and how to trigger them anyway. More users means more variations in usage patterns.
I'm with those who say unit tests should be written as part of the programming process - one of the programmer's responsibilities. As a rule, code gets written faster that way, as you get fewer and less complex bugs to track down as you go, and you tend to find out about them while you're still familiar with the code that has the bug.
If development is "done" I would say that there is not too much point in unit testing.
This is one of these difficult value judgement types of questions.
I would mostly agree with Epaga, that writing new tests as you fix bugs (perhaps with a couple of extra tests thrown in) is a good approach.
I would add two further comments:
Doing some backed-off, black-box testing of a unit before making large changes can be a good idea
Consistency testing isn't unit testing, but certain types of program lend themselves to the easy generation of consistency tests. This might be one approach to making sure you don't break things.
I'm looking for real world examples of some bad side effects of code coverage.
I noticed this happening at work recently because of a policy to achieve 100% code coverage. Code quality has been improving for sure but conversely the testers seem to be writing more lax test plans because 'well the code is fully unit tested'. Some logical bugs managed to slip through as a result. They were a REALLY BIG PAIN to debug because 'well the code is fully unit tested'.
I think that was partly because our tool did statement coverage only. Still, it could have been time better spent.
If anyone has other negative side effects of having a code coverage policy please share. I'd like to know what kind of other 'problems' are happening out there in the real-world.
Thanks in advance.
EDIT: Thanks for all the really good responses. There are a few which I would mark as the answer but I can only mark one unfortunately.
In a sentence: Code coverage tells you what you definitely haven't tested, not what you have.
Part of building a valuable unit test suite is finding the most important, high-risk code and asking hard questions of it. You want to make sure the tough stuff works as a priority. Coverage figures have no notion of the 'importance' of code, nor the quality of tests.
In my experience, many of the most important tests you will ever write are the tests that barely add any coverage at all (edge cases that add a few extra % here and there, but find loads of bugs).
The problem with setting hard (and potentially counter-productive) coverage targets is that developers may have to start bending over backwards to test their code. There's making code testable, and then there's just torture. If you hit 100% coverage with great tests then that's fantastic, but in most situations the extra effort is just not worth it.
Furthermore, people start obsessing/fiddling with numbers rather than focussing on the quality of the tests. I've seen badly written tests that have 90+% coverage, just as I've seen excellent tests that only have 60-70% coverage.
Again, I tend to look at coverage as an indicator of what definitely hasn't been tested.
Just because there's code coverage doesn't mean you're actually testing all paths through the function.
For example, this code has four paths:
if (A) { ... } else { ... }
if (B) { ... } else { ... }
However just two tests (e.g. one with A and B true, one with A and B false) would give "100% code coverage."
This is a problem because the tendency is to stop testing once you've achieved the magic 100% number.
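To make that concrete, here is a small JUnit-style sketch (the method and both flags are invented for illustration): two tests execute every statement, so a statement-coverage tool reports 100%, yet two of the four paths never run.
public static int shippingCost(boolean heavy, boolean international) {
    int cost = 0;
    if (heavy) {
        cost += 10;      // surcharge for heavy parcels
    } else {
        cost += 2;       // standard handling
    }
    if (international) {
        cost += 20;      // customs fee
    } else {
        cost += 5;       // domestic delivery
    }
    return cost;
}

// Two tests are enough to execute every statement above:
@Test public void heavyInternational() { assertEquals(30, shippingCost(true, true)); }
@Test public void lightDomestic()      { assertEquals(7, shippingCost(false, false)); }
// ...yet the (heavy, domestic) and (light, international) paths never ran,
// so a bug in how those two decisions interact would go unnoticed.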
In my experience, the biggest problem with code coverage tools is the risk that somebody will fall victim to the belief that "high code coverage" equals "good testing." Most coverage tools just offer statement coverage metrics, as opposed to condition, data path or decision coverage. That means that it's possible to get 100% coverage on a bit of code like this:
for (int i = 0; i < MAX_RETRIES; ++i) {
    if (someFunction() == MAGIC_NUMBER) {
        break;
    }
}
... without ever testing the termination condition on the for loop.
Worse, it's possible to get very high "coverage" from a test that simply invokes your application, without bothering to validate the output, or validating it incorrectly.
Simply put, low code coverage levels are certainly an indication of insufficient testing, but high coverage levels are not an indication of sufficient or correct testing.
Sometimes corner cases are so rare they're not worth testing, yet a strict code-coverage rule requires you to test them anyway.
For example, in Java the MD5 algorithm is built-in, but technically it's possible that an "unsupported algorithm" type exception is thrown. It's never thrown and your test would have to go through significant gyrations to test that path.
It would be a lot of work wasted.
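For reference, the code in question is roughly the following. MessageDigest.getInstance declares a checked NoSuchAlgorithmException even though the platform documentation requires MD5 to be supported, so the catch block is effectively unreachable without faking out the security provider.
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public static byte[] md5(byte[] input) {
    try {
        return MessageDigest.getInstance("MD5").digest(input);
    } catch (NoSuchAlgorithmException e) {
        // In practice this branch never runs; a strict 100% rule would still demand a test for it.
        throw new IllegalStateException("MD5 not available", e);
    }
}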
In my opinion, the greatest danger a team runs from measuring code coverage is that it rewards large tests, and penalizes small ones. If you have the choice between writing a single test that covers a large portion of your application's functionality, and writing ten small tests which test a single method, only measuring code coverage implies that you should write the large test.
However, writing the set of 10 small tests will give you much less brittle tests, and will test your application much more thoroughly than the one large test will. Thus, by measuring code coverage, particularly in an organization with still evolving testing habits, you can often set up the wrong incentives.
I know this isn't a direct answer to your question, but...
Any testing, regardless of what type, is insufficient by itself. Unit testing/code coverage is for developers. QA still needs to test the system as a whole. Business users still need to test the system as a whole as well.
The converse, that QA tests the code completely so developers shouldn't test, is equally bad. Testing is complementary, and different tests provide different things. Each test type can miss things that another might find.
Just like the rest of development, don't take shortcuts with testing, it'll only let bugs through.
Writing overly targeted test cases.
Insufficient input-variability testing of the code.
Large number of artificial test cases executed.
Not concentrating on the important test failures due to noise.
Difficulty in assigning defects because many conditions from many components must interact for a line to execute.
The worst side effect of having a 100% coverage goal is spending a large part of the testing development cycle (75%+) hitting corner cases. Another poor effect of such a policy is concentrating on hitting a particular line of code rather than addressing the range of inputs. I don't really care that the strcpy function ran at least once. I really care that it ran against a wide variety of input. Having a policy is good. But having an extremely draconian policy is bad. The 100% metric of code coverage is neither necessary nor sufficient for code to be considered solid.
One of the largest pitfalls of code coverage is that people just talk about code coverage without actually specifying what type of code coverage they are talking about. The characteristics of C0, C1, C2 and even higher levels of code coverage are very different, so just talking about "code coverage" doesn't even make sense.
For example, achieving 100% full path coverage is pretty much impossible. If your program has n decision points, you need 2^n tests (and depending on the definition, every single bit in a value is a decision point, so to achieve 100% full path coverage for an extremely simple function that just adds two ints, you need 18446744073709551616 tests). If you only have one loop, you already need infinitely many tests.
OTOH, achieving 100% C0 (statement) coverage is trivial.
Another important thing to remember is that code coverage does not tell you what code was tested. It only tells you what code was run! You can try it out yourself: take a codebase that has 100% code coverage and remove all the assertions from the tests. Now the codebase still has 100% coverage, but it does not test a single thing! So code coverage does not tell you what's tested, only what's not tested.
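A hedged illustration (InvoiceCalculator, total and someInvoice() are invented names): both tests below produce identical coverage numbers, but only the second one actually tests anything.
// InvoiceCalculator and someInvoice() are hypothetical helpers for this sketch.
// Runs the code, so the coverage tool is satisfied, but asserts nothing.
@Test
public void coverageOnly() {
    new InvoiceCalculator().total(someInvoice());
}

// Runs the same code and pins down the expected behaviour.
@Test
public void totalIncludesTaxAndDiscount() {
    assertEquals(107.50, new InvoiceCalculator().total(someInvoice()), 0.001);
}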
There are tools out there, Jumble for one, that perform analysis through branch coverage, by mutating your code to see if your test fails for all different permutations.
Directly from their website:
Jumble is a class level mutation testing tool that works in conjunction with JUnit. The purpose of mutation testing is to provide a measure of the effectiveness of test cases. A single mutation is performed on the code to be tested, the corresponding test cases are then executed. If the modified code fails the tests, then this increases confidence in the tests. Conversely, if the modified code passes the tests this indicates a testing deficiency.
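Independent of any particular tool, the idea is easy to sketch by hand (hypothetical code; a mutation tester applies changes like this automatically and re-runs your suite):
// Original production code:
public static boolean isAdult(int age) {
    return age >= 18;
}

// A typical mutation flips the operator to:  return age > 18;
// A suite that only checks isAdult(30) and isAdult(5) still passes, so the
// mutant "survives" and exposes a missing boundary test. This test kills it:
@Test
public void eighteenCountsAsAdult() {
    assertTrue(isAdult(18));
}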
Nothing wrong with code coverage - what I see wrong is the 100% figure. At some point the law of diminishing returns kicks in and it becomes more expensive to test the last 1% than the other 99%. Code coverage is a worthy goal, but common sense goes a long way.
That 100% code coverage means well-tested code is a complete myth. As developers we know the hard/complex/delicate parts of a system, and I would much rather see those areas properly tested, and only get 50% coverage, than the meaningless figure that every line has been run at least once.
In terms of a real-world example, the only team I was on that had 100% coverage wrote some of the worst code I've ever seen. 100% coverage was used to replace code review - the result was predictably awful, to the extent that most code was thrown away, even though it passed the tests.
We have good tools for measuring code-coverage from unit tests. So it's tempting to rely on code-coverage of 100% to represent that you're "done testing." This is not true.
As other folks have mentioned, 100% code coverage doesn't prove that you have tested adequately, nor does 50% code coverage necessarily mean that you haven't tested adequately.
Measuring lines of code executed by tests is just one metric. You also have to test for a reasonable variety of function inputs, and also how the function or class behaves depending on some other external state. For example, some code functions differently based on the data in a database or in a file.
I've also blogged about this recently: http://karwin.blogspot.com/2009/02/unit-test-coverage.html
100% code coverage doesn't mean you're done with unit tests:
public static int divide(int a, int b) {
    return a / b;
}
With just 1 unit test, I get 100% code coverage for this function:
assertEquals(2, divide(4, 2));
Now, nobody would argue that this unit test with 100% coverage indicates that the feature works just fine.
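For instance, one more call is enough to expose the gap hiding behind the 100% figure (same hypothetical divide function):
// Also "covered" by the test above, yet completely unspecified:
// does the caller expect an ArithmeticException here, or is this a bug?
@Test(expected = ArithmeticException.class)
public void divisionByZeroBehaviourWasNeverDecided() {
    divide(1, 0);
}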
I think code coverage is a good element to know if you are missing any obvious code path, but I would use it carefully.
Ok let me be honest, I haven't written more than 10 unit tests in my life probably.
I am embarking on a new project, and being the sole programmer means I should be scared ... very scared.
The idea that I can pseudo guarantee that my software works brings about a sense of joy.
Sure I will miss a ton of cases where I should have tested, but that's where I will learn as time goes on.
Unit testing will help me sleep better at night, which is better for my health.
My code will fail, but at least I will have a better idea when it will.
How has unit testing made your life better (or has it?), despite the rest of your team not jumping on the bandwagon?
By far the biggest value that unit tests have on my project is confidence. With that confidence it's much easier to add new features that weren't planned at the beginning and to tear code apart to change something or turn it around.
With tests I know that I (or anyone else!) haven't broken something that already worked.
Without tests you are brave (or stupid) when you make big changes and deploy them to production the next minute.
By unit testing, I've reduced the number of "stupid-bugs" that get reported during the testing stage. It's also given me a higher level of confidence in my code.
As Elie said, unit-testing is a great way to flush out "stupid" or simple bugs very early.
For me, it changes the way I think about code; making my code testable makes it less brittle and more flexible (e.g. dependency injection/inversion of control came quite naturally to me because it's something I did for testing purposes anyway).
The greatest benefit I personally get from a thorough test-suite is the confidence to change complicated code even months after I wrote it without fear of accidentally breaking something.
I'm not yet there, but with some discipline while writing them, unit-tests are a great way to document your code, also.
Okay, as nobody else has stepped up to the plate to play devil's advocate, I'll do it.
Automated unit testing can have good benefits for some projects, but can also have many of the following issues:
It sucks down a ton of engineering resources.
It can take many man-hours to remove environment and setup issues from the equation.
It has a low ROI for some project types, especially GUIs.
It won't catch many errors, because it's impossible to evaluate all execution paths for all but the most trivial programs.
It doesn't catch integration errors.
It doesn't catch broader system-level errors such as functions performed across multiple units, or non-functional areas such as performance.
Test coverage and coverage gates can become an increasingly useless mantra
It requires a sustainable process for ensuring that test case failures are reviewed and addressed immediately. Otherwise the app will evolve out of sync with the unit test suite.
It has a significant opportunity cost - there are other activities such as code reviews that have an equally good or even better ROI.
It can involve a significant culture change.
So developers shouldn't adopt a dogmatic (yes or no) approach towards unit testing, but instead do an ROI calculation for every project.
Using unit testing in conjunction with TDD provides me with motivation and drive to complete the task at hand. Without the small progress of writing test, fixing test, writing test, fixing test I can become unmotivated.
Unit tests are really helpful when you start debugging as well. If your tests fail, then you know where the bug is almost immediately. If they all pass, then you know where the bugs aren't (most of the time).
Another area where unit tests help: migrating software. I'm finding that it's a lot easier to prepare Python code that has unit tests to be migrated to Python 3 than code that doesn't have unit tests.
Sure I will miss a ton of cases where I should have tested, but that's where I will learn as time goes on.
This is where Test Driven Development really shines. You don't have to worry quite as much about having the proper tests because that question will be answered for you beforehand.
Of course, just to make sure we're on the same page, "test driven development" means the process of coding where you write the test, verify that the test fails, and then write the code.
For me, it wasn't just unit tests that changed my life, but Test Driven Development (TDD). I liken it to a religious experience in my blog post (shameless, I know) My Year with TDD.
Getting into testing has been a career changing experience for me. I write less bugs, I write more readable code, I write more cohesive code, I know when something is broken (usually), etc, etc. I owe it all to Test Driven Development.
Try it, you'll like it :)
I've been a TDD(eveloper) since my 2nd* project at uni (back in 1988). I don't know if the term was even in use then.
The best thing is the ability to change things and very quickly check you haven't broken anything else. Easy regression testing.
They are good documentation of object/method usage as well.
*and that was directly because of how the first project went....
The biggest thing unit tests give you is confidence in your code. You know stuff works at a certain level of quality, and you know you can go in and change or rewrite something and not have things fail all over the place where you don't expect it. Verification is only a test run away.
Now that I strive to test-drive my code, I know when I'm done with a function, a component, or a feature. Therefore I can report progress accurately.
I know it's not bug free code, but it is functional enough to be integrated, built and delivered to QA. I'm confident they'll be able to start testing without being blocked by a segmentation error, or any other silly problem.
I also have an environment ready so that I can quickly write a new test to reproduce any problem that will be reported, and I have a safety net to detect side effects and regressions when I modify or fix the code.
Can't speak for everyone, but I started writing tests because of a fellow developer that wrote tests that were a big help to me when I was learning the system.
I've also found that tests are a good way to verify assumptions when working with a new code base.
Unit testing is not a silver bullet. But it does have benefits, and it is very satisfying.
I find it means my code gets executed a lot earlier, since before I wrote unit tests, it took a lot of coding before there was enough functionality to try out in the application.
I'm also very grateful for my suite of unit tests when I come to refactor. I rewrote a whole date handling module a while ago, and there's no way I would have had the confidence to do that without regression tests.
I must say that I think the improvements in VS 2010, such as Ctrl+Enter (I think that was what it was), which lets you quickly stub the interface of a class while writing the test (first), are going to make this a lot easier for me.
Unit tests have multiple advantages. To be more specific about how they make my life better: they increase my confidence in making changes and allow me to change code a lot faster later. They improve my life in the long run.
Of course, in the short term it's a little more pain because it requires more time, but that's rapidly forgotten once the tests start validating your work for you.
I'm a big fan of unit testing, though my tests don't provide complete coverage nowadays... mostly because I'm working on a web site and much of my code just grabs data from the database, manipulates it, and spits it out. The manipulation code is generally well tested, but it's a real pain to test the database code.
That being said, I can point out a case where unit tests saved me weeks of work...
I was working on a smallish project (4-6 devs) a while back and, after months of work we had reached a state of near completion. At this point, the folks in charge of the product decided that, instead of storing dates (and generating reports using them) in GMT, they wanted everything in EST. Given the product was built to handle large amounts of data/logs and generate information about that data based on timeframes, this was a fairly major change.
Over the course of the next few days, the development team went in and changed everything to deal with EST timestamps. What would have taken us weeks to do had we not had such extensive automated tests took us just 3 days, allowing us to meet an aggressive schedule. We were able to jump into the code and start changing whatever we needed to; the unit tests gave us courage by letting us know the system would complain quickly if we broke something. To this day, I use that experience as an example of how you can never truly understand the benefits of automated testing until it saves you... and it certainly did that for my team.
I'm currently in the process of trying to jump on the bandwagon. Work mates are already doing it before they've written a line of functional code. I'm still writing a full program before I've even run it through the main, let alone unit test it :/
I'll get there in the end I'm sure. But at the moment, I am of ill health through spending 90% of my life debugging :(
I've done quite a bit of work in java, and a year in Ruby.
In Ruby we used extensive testing (TDD). This was ABSOLUTELY REQUIRED. You can write nearly complete garbage into a Ruby file, and if you don't hit that specific line of code, you'll never know, so your tests need near 100% coverage.
In Java, I've never needed much more than a single, simple success path test--and that can usually be thrown away after the code runs. It's really the static type checking, strong typing and using coding patterns like strong encapsulation and parameter checking that makes this possible. You can actually get very close to proving that a small class can't be broken (is bug free) without tests, and when correctly designed, all classes should be small.
Another point of interest: On the Ruby project, we had a refactor that took us 2 days of real code work (split a prime model class into two classes) and 2 weeks of test repairs.
At some point all those tests have a price, they are still code you have to maintain.
That said, I find TDD fun and a good way to get things started, even in Java, and I also would reiterate that I ALWAYS have some success path testing at the very least (even if it's just a quick main method) in virtually every class I write.
Good unit tests that provide sufficient coverage can make you sleep better at night.
If you use assertions, you can find potential bugs that are missed by the unit tests (perhaps when the tests aren't good enough), and you can sleep even better at night.
It saves me time, because when I run code TDD, it usually just works when it comes to integration time, so no need to spend agonizing hours debugging.
It also gives me confidence when having a conversation with other developers who claim that an API I created has bugs.
When you get to some critical mass of tests, a nice side-effect is often that if you are about to introduce a bug, it is likely to make some test fail, even if the tests aren't directly against the new, buggy code you are writing.
So you will be alerted to the bug you are about to commit before doing so, and you can then write tests against it and fix it right away.
(This is of course only true if you have tests that are not "just" very narrow test-one-thing-only tests, but I think that is more often the case than not.)
If you were to mandate a minimum percentage code-coverage for unit tests, perhaps even as a requirement for committing to a repository, what would it be?
Please explain how you arrived at your answer (since if all you did was pick a number, then I could have done that all by myself ;)
This prose by Alberto Savoia answers precisely that question (in a nicely entertaining manner at that!):
http://www.artima.com/forums/flat.jsp?forum=106&thread=204677
Testivus On Test Coverage
Early one morning, a programmer asked the great master:
“I am ready to write some unit tests. What code coverage should I aim for?”
The great master replied:
“Don’t worry about coverage, just write some good tests.”
The programmer smiled, bowed, and left.
...
Later that day, a second programmer asked the same question.
The great master pointed at a pot of boiling water and said:
“How many grains of rice should I put in that pot?”
The programmer, looking puzzled, replied:
“How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.”
“Exactly,” said the great master.
The second programmer smiled, bowed, and left.
...
Toward the end of the day, a third programmer came and asked the same question about code coverage.
“Eighty percent and no less!” replied the master in a stern voice, pounding his fist on the table.
The third programmer smiled, bowed, and left.
...
After this last reply, a young apprentice approached the great master:
“Great master, today I overheard you answer the same question about code coverage with three different answers. Why?”
The great master stood up from his chair:
“Come get some fresh tea with me and let’s talk about it.”
After they filled their cups with smoking hot green tea, the great master began to answer:
“The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later.”
“The second programmer, on the other hand, is quite experienced both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple answer, and she’s smart enough to handle the truth and work with that.”
“I see,” said the young apprentice, “but if there is no single simple answer, then why did you answer the third programmer ‘Eighty percent and no less’?”
The great master laughed so hard and loud that his belly, evidence that he drank more than just green tea, flopped up and down.
“The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway.”
The young apprentice and the grizzled great master finished drinking their tea in contemplative silence.
Code Coverage is a misleading metric if 100% coverage is your goal (instead of 100% testing of all features).
You could get 100% by hitting all the lines once. However, you could still miss out on testing a particular sequence (logical path) in which those lines are hit.
You could fall short of 100% but still have tested all of your 80% most frequently used code paths. Having tests that exercise every 'throw ExceptionTypeX' or similar defensive-programming guard you've put in is a 'nice to have', not a 'must have'.
So trust yourself or your developers to be thorough and cover every path through their code. Be pragmatic and don't chase the magical 100% coverage. If you TDD your code you should get 90%+ coverage as a bonus. Use code coverage to highlight chunks of code you have missed (which shouldn't happen if you TDD, though, since you write code only to make a test pass; no code can exist without its partner test).
Jon Limjap makes a good point - there is not a single number that is going to make sense as a standard for every project. There are projects that just don't need such a standard. Where the accepted answer falls short, in my opinion, is in describing how one might make that decision for a given project.
I will take a shot at doing so. I am not an expert in test engineering and would be happy to see a more informed answer.
When to set code coverage requirements
First, why would you want to impose such a standard in the first place? In general, when you want to introduce empirical confidence in your process. What do I mean by "empirical confidence"? Well, the real goal is correctness. For most software, we can't possibly know this across all inputs, so we settle for saying that code is well-tested. This is more knowable, but is still a subjective standard: it will always be open to debate whether or not you have met it. Those debates are useful and should occur, but they also expose uncertainty.
Code coverage is an objective measurement: once you see your coverage report, there is no ambiguity about whether standards have been met. Does it prove correctness? Not at all, but it has a clear relationship to how well-tested the code is, which in turn is our best way to increase confidence in its correctness. Code coverage is a measurable approximation of immeasurable qualities we care about.
Some specific cases where having an empirical standard could add value:
To satisfy stakeholders. For many projects, there are various actors who have an interest in software quality who may not be involved in the day-to-day development of the software (managers, technical leads, etc.) Saying "we're going to write all the tests we really need" is not convincing: They either need to trust entirely, or verify with ongoing close oversight (assuming they even have the technical understanding to do so.) Providing measurable standards and explaining how they reasonably approximate actual goals is better.
To normalize team behavior. Stakeholders aside, if you are working on a team where multiple people are writing code and tests, there is room for ambiguity for what qualifies as "well-tested." Do all of your colleagues have the same idea of what level of testing is good enough? Probably not. How do you reconcile this? Find a metric you can all agree on and accept it as a reasonable approximation. This is especially (but not exclusively) useful in large teams, where leads may not have direct oversight over junior developers, for instance. Networks of trust matter as well, but without objective measurements, it is easy for group behavior to become inconsistent, even if everyone is acting in good faith.
To keep yourself honest. Even if you're the only developer and only stakeholder for your project, you might have certain qualities in mind for the software. Instead of making ongoing subjective assessments about how well-tested the software is (which takes work), you can use code coverage as a reasonable approximation, and let machines measure it for you.
Which metrics to use
Code coverage is not a single metric; there are several different ways of measuring coverage. Which one you might set a standard upon depends on what you're using that standard to satisfy.
I'll use two common metrics as examples of when you might use them to set standards:
Statement coverage: What percentage of statements have been executed during testing? Useful to get a sense of the physical coverage of your code: How much of the code that I have written have I actually tested?
This kind of coverage supports a weaker correctness argument, but is also easier to achieve. If you're just using code coverage to ensure that things get tested (and not as an indicator of test quality beyond that) then statement coverage is probably sufficient.
Branch coverage: When there is branching logic (e.g. an if), have both branches been evaluated? This gives a better sense of the logical coverage of your code: How many of the possible paths my code may take have I tested?
This kind of coverage is a much better indicator that a program has been tested across a comprehensive set of inputs. If you're using code coverage as your best empirical approximation for confidence in correctness, you should set standards based on branch coverage or similar.
There are many other metrics (line coverage is similar to statement coverage, but yields different numeric results for multi-line statements, for instance; conditional coverage and path coverage is similar to branch coverage, but reflect a more detailed view of the possible permutations of program execution you might encounter.)
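A short sketch of the difference between the two metrics (the method is invented for illustration): one test executes every statement, so statement coverage reports 100%, but the implicit else of the if is never taken, so branch coverage reports only 50%.
public static String orderLabel(int amount) {
    String label = "regular";
    if (amount > 1000) {
        label = "bulk";
    }
    return label;
}

// Statement coverage: this single test executes every statement above (100%).
// Branch coverage: only the true side of the if is exercised (50%),
// so the behaviour for amount <= 1000 is never verified.
@Test
public void bigOrdersAreLabelledBulk() {
    assertEquals("bulk", orderLabel(5000));
}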
What percentage to require
Finally, back to the original question: If you set code coverage standards, what should that number be?
Hopefully it's clear at this point that we're talking about an approximation to begin with, so any number we pick is going to be inherently approximate.
Some numbers that one might choose:
100%. You might choose this because you want to be sure everything is tested. This doesn't give you any insight into test quality, but does tell you that some test of some quality has touched every statement (or branch, etc.) Again, this comes back to degree of confidence: If your coverage is below 100%, you know some subset of your code is untested.
Some might argue that this is silly, and you should only test the parts of your code that are really important. I would argue that you should also only maintain the parts of your code that are really important. Code coverage can be improved by removing untested code, too.
99% (or 95%, other numbers in the high nineties.) Appropriate in cases where you want to convey a level of confidence similar to 100%, but leave yourself some margin to not worry about the occasional hard-to-test corner of code.
80%. I've seen this number in use a few times, and don't entirely know where it originates. I think it might be a weird misappropriation of the 80-20 rule; generally, the intent here is to show that most of your code is tested. (Yes, 51% would also be "most", but 80% is more reflective of what most people mean by most.) This is appropriate for middle-ground cases where "well-tested" is not a high priority (you don't want to waste effort on low-value tests), but is enough of a priority that you'd still like to have some standard in place.
I haven't seen numbers below 80% in practice, and have a hard time imagining a case where one would set them. The role of these standards is to increase confidence in correctness, and numbers below 80% aren't particularly confidence-inspiring. (Yes, this is subjective, but again, the idea is to make the subjective choice once when you set the standard, and then use an objective measurement going forward.)
Other notes
The above assumes that correctness is the goal. Code coverage is just information; it may be relevant to other goals. For instance, if you're concerned about maintainability, you probably care about loose coupling, which can be demonstrated by testability, which in turn can be measured (in certain fashions) by code coverage. So your code coverage standard provides an empirical basis for approximating the quality of "maintainability" as well.
Code coverage is great, but functionality coverage is even better. I don't believe in covering every single line I write. But I do believe in writing 100% test coverage of all the functionality I want to provide (even for the extra cool features I came up with myself and which were not discussed during the meetings).
I don't care if I would have code which is not covered in tests, but I would care if I would refactor my code and end up having a different behaviour. Therefore, 100% functionality coverage is my only target.
My favorite code coverage is 100% with an asterisk. The asterisk comes because I prefer to use tools that allow me to mark certain lines as lines that "don't count". If I have covered 100% of the lines which "count", I am done.
The underlying process is:
I write my tests to exercise all the functionality and edge cases I can think of (usually working from the documentation).
I run the code coverage tools
I examine any lines or paths not covered; any that I consider unimportant or unreachable (due to defensive programming) I mark as not counting
I write new tests to cover the missing lines and improve the documentation if those edge cases are not mentioned.
This way if I and my collaborators add new code or change the tests in the future, there is a bright line to tell us if we missed something important - the coverage dropped below 100%. However, it also provides the flexibility to deal with different testing priorities.
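In Java, one way to implement the "doesn't count" marker is a small annotation of your own, on the assumption that your coverage tool can be configured to skip annotated code (check its documentation; the annotation name here is made up):
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical marker; the coverage tool is assumed to be configured to ignore annotated code.
@Retention(RetentionPolicy.CLASS)
public @interface ExcludedFromCoverage {
    String reason();   // force every exclusion to be justified
}

// Usage, inside some class:
@ExcludedFromCoverage(reason = "defensive guard, unreachable in practice")
private void handleImpossibleState() {
    throw new IllegalStateException("should never happen");
}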
I have another anecdote on test coverage I'd like to share.
We have a huge project wherein, on Twitter, I noted that, with 700 unit tests, we only have 20% code coverage.
Scott Hanselman replied with words of wisdom:
Is it the RIGHT 20%? Is it the 20% that represents the code your users hit the most? You might add 50 more tests and only add 2%.
Again, it goes back to my Testivus on Code Coverage Answer. How much rice should you put in the pot? It depends.
Many shops don't value tests, so if you are above zero at least there is some appreciation of worth - so arguably non-zero isn't bad as many are still zero.
In the .NET world people often quote 80% as reasonable. But they say this at solution level. I prefer to measure at project level: 30% might be fine for a UI project if you've got Selenium, etc. or manual tests, 20% for the data layer project might be fine, but 95%+ might be quite achievable for the business rules layer, if not wholly necessary. So the overall coverage may be, say, 60%, but the critical business logic may be much higher.
I've also heard this: aspire to 100% and you'll hit 80%; but aspire to 80% and you'll hit 40%.
Bottom line: Apply the 80:20 rule, and let your app's bug count guide you.
For a well designed system, where unit tests have driven the development from the start, I would say 85% is quite a low number. Small classes designed to be testable should not be hard to cover better than that.
It's easy to dismiss this question with something like:
Covered lines do not equal tested logic and one should not read too much into the percentage.
True, but there are some important points to be made about code coverage. In my experience this metric is actually quite useful, when used correctly. Having said that, I have not seen all systems and I'm sure there are tons of them where it's hard to see code coverage analysis adding any real value. Code can look so different and the scope of the available test framework can vary.
Also, my reasoning mainly concerns quite short test feedback loops. For the product that I'm developing the shortest feedback loop is quite flexible, covering everything from class tests to inter process signalling. Testing a deliverable sub-product typically takes 5 minutes and for such a short feedback loop it is indeed possible to use the test results (and specifically the code coverage metric that we are looking at here) to reject or accept commits in the repository.
When using the code coverage metric you should not just have a fixed (arbitrary) percentage which must be fulfilled. Doing this does not give you the real benefits of code coverage analysis in my opinion. Instead, define the following metrics:
Low Water Mark (LWM), the lowest number of uncovered lines ever seen in the system under test
High Water Mark (HWM), the highest code coverage percentage ever seen for the system under test
New code can only be added if we don't go above the LWM and we don't go below the HWM. In other words, code coverage is not allowed to decrease, and new code should be covered. Notice how I say should and not must (explained below).
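A minimal sketch of such a gate, assuming your coverage tool can report the current totals and you persist the two water marks between builds (all names here are invented):
// Fails the build when coverage regresses relative to the recorded water marks.
public final class CoverageGate {

    public static void check(int uncoveredLines, double coveragePercent,
                             int lowWaterMark, double highWaterMark) {
        if (uncoveredLines > lowWaterMark) {
            throw new AssertionError("More uncovered lines than ever before: "
                    + uncoveredLines + " > LWM " + lowWaterMark);
        }
        if (coveragePercent < highWaterMark) {
            throw new AssertionError("Coverage dropped below the best level seen: "
                    + coveragePercent + "% < HWM " + highWaterMark + "%");
        }
        // Otherwise the build passes; afterwards, tighten the stored marks so the
        // ratchet only ever moves toward better coverage.
    }
}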
But doesn't this mean that it will be impossible to clean away old, well-tested rubbish that you have no use for anymore? Yes, and that's why you have to be pragmatic about these things. There are situations when the rules have to be broken, but for your typical day-to-day integration my experience is that these metrics are quite useful. They give the following two implications.
Testable code is promoted.
When adding new code you really have to make an effort to make the code testable, because you will have to try and cover all of it with your test cases. Testable code is usually a good thing.
Test coverage for legacy code is increasing over time.
When adding new code and not being able to cover it with a test case, one can try to cover some legacy code instead to get around the LWM rule. This sometimes necessary cheating at least gives the positive side effect that the coverage of legacy code will increase over time, making the seemingly strict enforcement of these rules quite pragmatic in practice.
And again, if the feedback loop is too long it might be completely impractical to set up something like this in the integration process.
I would also like to mention two more general benefits of the code coverage metric.
Code coverage analysis is part of the dynamic code analysis (as opposed to the static one, i.e. Lint). Problems found during the dynamic code analysis (by tools such as the purify family, http://www-03.ibm.com/software/products/en/rational-purify-family) are things like uninitialized memory reads (UMR), memory leaks, etc. These problems can only be found if the code is covered by an executed test case. The code that is the hardest to cover in a test case is usually the abnormal cases in the system, but if you want the system to fail gracefully (i.e. error trace instead of crash) you might want to put some effort into covering the abnormal cases in the dynamic code analysis as well. With just a little bit of bad luck, a UMR can lead to a segfault or worse.
People take pride in keeping 100% for new code, and people discuss testing problems with a similar passion as other implementation problems. How can this function be written in a more testable manner? How would you go about trying to cover this abnormal case, etc.
And a negative, for completeness.
In a large project with many involved developers, everyone is not going to be a test-genius for sure. Some people tend to use the code coverage metric as proof that the code is tested and this is very far from the truth, as mentioned in many of the other answers to this question. It is ONE metric that can give you some nice benefits if used properly, but if it is misused it can in fact lead to bad testing. Aside from the very valuable side effects mentioned above a covered line only shows that the system under test can reach that line for some input data and that it can execute without hanging or crashing.
If this were a perfect world, 100% of code would be covered by unit tests. However, since this is NOT a perfect world, it's a matter of what you have time for. As a result, I recommend focusing less on a specific percentage, and focusing more on the critical areas. If your code is well-written (or at least a reasonable facsimile thereof) there should be several key points where APIs are exposed to other code.
Focus your testing efforts on these APIs. Make sure that the APIs are 1) well documented and 2) have test cases written that match the documentation. If the expected results don't match up with the docs, then you have a bug in either your code, documentation, or test cases. All of which are good to vet out.
Good luck!
Code coverage is just another metric. In and of itself, it can be very misleading (see www.thoughtworks.com/insights/blog/are-test-coverage-metrics-overrated). Your goal should therefore not be to achieve 100% code coverage but rather to ensure that you test all relevant scenarios of your application.
I prefer to do BDD, which uses a combination of automated acceptance tests, possibly other integration tests, and unit tests. The question for me is what the target coverage of the automated test suite as a whole should be.
That aside, the answer depends on your methodology, language and testing and coverage tools. When doing TDD in Ruby or Python it's not hard to maintain 100% coverage, and it's well worth doing so. It's much easier to manage 100% coverage than 90-something percent coverage. That is, it's much easier to fill coverage gaps as they appear (and when doing TDD well coverage gaps are rare and usually worth your time) than it is to manage a list of coverage gaps that you haven't gotten around to and miss coverage regressions due to your constant background of uncovered code.
The answer also depends on the history of your project. I've only found the above to be practical in projects managed that way from the start. I've greatly improved the coverage of large legacy projects, and it's been worth doing so, but I've never found it practical to go back and fill every coverage gap, because old untested code is not well understood enough to do so correctly and quickly.
85% would be a good starting place for checkin criteria.
I'd probably chose a variety of higher bars for shipping criteria - depending on the criticality of the subsystems/components being tested.
Code coverage is great but only as long as the benefits that you get from it outweigh the cost/effort of achieving it.
We had been working to a standard of 80% for some time; however, we have just made the decision to abandon this and instead be more focused in our testing, concentrating on the complex business logic, etc.
This decision was taken due to the increasing amount of time we spent chasing code coverage and maintaining existing unit tests. We felt we had got to the point where the benefit we were getting from our code coverage was deemed to be less than the effort that we had to put in to achieve it.
I use cobertura, and whatever the percentage, I would recommend keeping the values in the cobertura-check task up-to-date. At the minimum, keep raising totallinerate and totalbranchrate to just below your current coverage, but never lower those values. Also tie in the Ant build failure property to this task. If the build fails because of lack of coverage, you know someone's added code but hasn't tested it. Example:
<cobertura-check linerate="0"
                 branchrate="0"
                 totallinerate="70"
                 totalbranchrate="90"
                 failureproperty="build.failed" />
When I think my code isn't unit tested enough, and I'm not sure what to test next, I use coverage to help me decide what to test next.
If I increase coverage with a unit test, I know that unit test is worth something.
This goes for code that is not covered, 50% covered or 97% covered.
Short answer: 60-80%
Long answer:
I think it totally depends on the nature of your project. I typically start a project by unit testing every practical piece. By the first "release" of the project you should have a pretty good base percentage based on the type of programming you are doing. At that point you can start "enforcing" a minimum code coverage.
If you've been doing unit testing for a decent amount of time, I see no reason for it not to be approaching 95%+. However, at a minimum, I've always worked with 80%, even when new to testing.
This number should only include code written in the project (excluding frameworks, plugins, etc.) and maybe even exclude certain classes composed entirely of calls to outside code. This sort of call should be mocked/stubbed.
Generally speaking, from the several engineering excellence best practices papers that I have read, 80% for new code in unit tests is the point that yields the best return. Going above that CC% yields a lower amount of defects for the amount of effort exerted. This is a best practice that is used by many major corporations.
Unfortunately, most of these results are internal to companies, so there are no public literatures that I can point you to.
My answer to this conundrum is to have 100% line coverage of the code you can test and 0% line coverage of the code you can't test.
My current practice in Python is to divide my .py modules into two folders: app1/ and app2/ and when running unit tests calculate the coverage of those two folders and visually check (I must automate this someday) that app1 has 100% coverage and app2 has 0% coverage.
When/if I find that these numbers differ from the standard, I investigate and alter the design of the code so that coverage conforms to the standard.
This does mean that I can recommend achieving 100% line coverage of library code.
I also occasionally review app2/ to see if I could possibly test any code there, and if I can, I move it into app1/.
Now I'm not too worried about the aggregate coverage because that can vary wildly depending on the size of the project, but generally I've seen 70% to over 90%.
With Python, I should be able to devise a smoke test which automatically runs my app while measuring coverage, and hopefully get an aggregate of 100% when the smoke test is combined with the unittest figures.
Check out Crap4j. It's a slightly more sophisticated approach than straight code coverage. It combines code coverage measurements with complexity measurements, and then shows you what complex code isn't currently tested.
Viewing coverage from another perspective: Well-written code with a clear flow of control is the easiest to cover, the easiest to read, and usually the least buggy code. By writing code with clearness and coverability in mind, and by writing the unit tests in parallel with the code, you get the best results IMHO.
In my opinion, the answer is "It depends on how much time you have". I try to achieve 100% but I don't make a fuss if I don't get it with the time I have.
When I write unit tests, I wear a different hat compared to the hat I wear when developing production code. I think about what the tested code claims to do and what are the situations that can possible break it.
I usually follow the following criteria or rules:
That the unit test should be a form of documentation of the expected behavior of my code, i.e. the expected output given a certain input and the exceptions it may throw that clients may want to catch. (What should the users of my code know?)
That the unit test should help me discover the what-if conditions that I may not yet have thought of. (How do I make my code stable and robust?)
If these two rules don't produce 100% coverage then so be it. But once I have the time, I analyze the uncovered blocks and lines and determine whether there are still test cases without unit tests or whether the code needs to be refactored to eliminate the unnecessary code.
It depends greatly on your application. For example, some applications consist mostly of GUI code that cannot be unit tested.
I don't think there can be such a B/W rule.
Code should be reviewed, with particular attention to the critical details.
However, if it hasn't been tested, it has a bug!
Depending on the criticality of the code, anywhere from 75%-85% is a good rule of thumb.
Shipping code should definitely be tested more thoroughly than in house utilities, etc.
This has to be dependent on what phase of your application development lifecycle you are in.
If you've been at development for a while, have a lot of implemented code already, and are just now realizing that you need to think about code coverage, then you have to check your current coverage (if it exists) and use that baseline to set milestones for each sprint (or an average rise over a period of sprints). This means taking on code debt while continuing to deliver end-user value (at least in my experience, the end user doesn't care one bit whether you've increased test coverage if they don't see new features).
Depending on your domain it's not unreasonable to shoot for 95%, but I'd have to say on average you're going to be looking at a case of 85% to 90%.
I think the best sign of correct code coverage is that the number of concrete problems the unit tests help to fix corresponds reasonably to the size of the unit test code you created.
I think that what may matter most is knowing what the coverage trend is over time and understanding the reasons for changes in the trend. Whether you view the changes in the trend as good or bad will depend upon your analysis of the reason.
We were targeting >80% until a few days back, but after we started using a lot of generated code, we no longer care about the percentage; rather, we have the reviewer take a call on the coverage required.
From the Testivus posting I think the answer context should be the second programmer.
Having said this, from a practical point of view we need parameters/goals to strive for.
I consider that this can be "tested" in an Agile process by analyzing the code we have, the architecture, and the functionality (user stories), and then coming up with a number. Based on my experience in the telecom area, I would say that 60% is a good value to check.