Quantifying Unit Test Coverage


Our company is trying to enforce test-driven development and, as a development manager, I'm trying to define what the acceptance criteria for that really mean. We mostly follow an agile methodology, and each story going to test needs some level of assurance (entrance criteria) of unit test coverage. I'm interested to hear how you guys enforce this (if you do) as a gate within your companies.

What you don't want is to set any code coverage requirements. Any requirement like that can and will be gamed.
Instead, I'd look at measuring RTF: Running, Tested Features. See http://xprogramming.com/articles/jatrtsmetric/

For our Ruby on Rails app, we use a code metric gem called SimpleCov. I am not sure what language your company uses, but I am sure there is a code metric for it. SimpleCov is great for Ruby, because it provides an extensive GUI, highlighting down to the line whether code was covered, skipped (filtered out), or missed.
We started tracking our code coverage two months ago. We began at 30% and are now near 60%. Depending on the age of your company's application, you may want to raise your coverage expectations to 80% or higher. According to SimpleCov, anything 91% or higher is "in the green", and below 80% is "in the red" (for great color analogies).
I feel that the most important thing is to make sure you have your crucial features tested -- such features may have the most lines of code to be tested. Getting those done first will drastically increase coverage.
Another thing to note, if you use a library like SimpleCov, you may be able to skip (filter out) lines of code, or even entire files, that you feel are legacy and may lower your coverage. That is another reason why our coverage almost doubled in 2 months.
Again, we are new to measuring code coverage, but strongly believe in its benefit to our current testing suite and application development.
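To see why filtering can move the number so much, here is a small sketch of the arithmetic in Python. The file names and line counts are made up for illustration, and this is not how SimpleCov computes or stores its data; it only shows how excluding one large untested file changes the percentage.

```python
def coverage_percent(files, excluded=()):
    """Line coverage over all files that are not listed in `excluded`."""
    kept = {name: counts for name, counts in files.items() if name not in excluded}
    covered = sum(c for c, _ in kept.values())
    total = sum(t for _, t in kept.values())
    return 100.0 * covered / total if total else 0.0


# Hypothetical data: file name -> (covered lines, total lines).
files = {
    "orders.rb": (270, 300),
    "billing.rb": (90, 200),
    "legacy_import.rb": (0, 500),
}

print(coverage_percent(files))                                 # all files counted
print(coverage_percent(files, excluded={"legacy_import.rb"}))  # legacy file filtered out
```

With these invented numbers the figure goes from 36% to 72% without a single new test being written, which is worth keeping in mind when comparing coverage before and after filters were introduced.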


How Do I Know That I'm Not Breaking Anything During Refactoring?

I've started my first experience of refactoring a huge system and writing unit tests for it, but I am scared that I'm breaking the code without knowing it.
I studied "The Art of Unit Testing" and "Working Effectively with Legacy Code" to find a solution, and my next plan is to stop refactoring for a while and write some integration tests (I have selected the FitNesse tool for this purpose) to run every time I change something.
Does anyone else have the same experience? Do you think integration testing can be a good solution for this issue? Do you have any better ideas?
I also checked this question (How can I check that I didn't break anything when refactoring?), but my situation is different because there are no unit tests available and I am here to write them.

Integration testing is part of a good solution for refactoring. However some problems introduced by the refactoring will only show up when you have deployed the project.
A better idea would be to incorporate the integration testing into a continuous delivery strategy. This means you should have a clean and practical approach to build and deploy the project as often as possible to a near identical environment while refactoring it. The book Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment is a good resource. Here is one of the antipatterns it describes (Pages 7-9):
Antipattern: Deploying to a Production-like Environment Only after Development Is Complete
In this pattern, the first time the software is deployed to a
production-like environment (for example, staging) is once most of
the development work is done...
Once the application is deployed into staging, it is common for new
bugs to be found...
The remedy is to integrate the testing, deployment, and release
activities into the development process. Make them a normal and
ongoing part of development so that by the time you are ready to
release your system into production there is little to no risk,
because you have rehearsed it on many different occasions in a
progressively more production-like sequence of test environments. Make
sure everybody involved in the software delivery process, from the
build and release team to testers to developers, work together from
the start of the project.

At the end of the day, this is the problem of working with Legacy Code.
Integration Tests are your best bet, but to write those to correctly meet your needs, you would need to know the original intent of the original code, which often isn't as clear, because there are often hidden requirements.
There are no ideal solutions.

Although the previous answers are very good, I'd like to add that unit tests are exactly for this. In our test project, when we refactor each other's components, it's mandatory to run the existing unit tests prepared by the initial developer, plus new ones, before committing to version control. Besides, it's a good approach to have smoke tests running on every check-in. And of course integration, regression, etc. afterwards.
UPDATE
I'm in the exact same situation - chained to maintenance. Tools can vary greatly depending on the needs, from web and unit testing up to SOA and server testing. If you provide more detailed info about your SUT I'll gladly try to help.

How can we decide which testing method can be used?

I have a project in .NET and I want to test it.
But I don't know anything about testing and its methods.
How can I go ahead with testing?
Which method is better for me as a beginner?
Is there anything that helps decide which testing method to use for better results?

There is no "right" or "wrong" in testing. Testing is an art and what you should choose and how well it works out for you depends a lot from project to project and your experience.
But as a professional Tester Expert my suggestion is that you have a healthy mix of automated and manual testing.
AUTOMATED TESTING
Unit Testing
Use NUnit to test your classes, functions and interaction between them.
http://www.nunit.org/index.php
Automated Functional Testing
If possible, you should automate a lot of the functional testing. Some frameworks have functional testing built into them. Otherwise you have to use a tool for it. If you are developing web sites/applications you might want to look at Selenium.
http://www.peterkrantz.com/2005/selenium-for-aspnet/
Continuous Integration
Use CI to make sure all your automated tests run every time someone in your team makes a commit to the project.
http://martinfowler.com/articles/continuousIntegration.html
MANUAL TESTING
As much as I love automated testing it is, IMHO, not a substitute for manual testing. The main reason is that an automated test can only do what it is told and only verify what it has been told to treat as pass/fail. A human can use their intelligence to find faults and raise questions that appear while testing something else.
Exploratory Testing
ET is a very low-cost and effective way to find defects in a project. It takes advantage of the intelligence of a human being and teaches the testers/developers more about the project than any other testing technique I know of. Doing an ET session aimed at every feature deployed in the test environment is not only an effective way to find problems fast, but also a good way to learn, and fun!
http://www.satisfice.com/articles/et-article.pdf

Since the scale of your project is not clear, all you need to do is make sure your tests are:
Trustworthy - you should know they are telling you the truth.
Repeatable.
Consistent - if you repeat a test with the same test data, it should produce the same output.
Comprehensive - they prove you are covering all the problem areas.
To get this you can use:
Standard way: NUnit, MbUnit (my favourite), xUnit (haven't got around to working with it) or MSTest
Quick and dirty: Console app (not cool, not so flexible)

If you are using .Net, I'd recommend checking out NUnit. It's a great testing framework to use.
As far as learning about the "testing method", there are many different ways to test an application. When using a tool like NUnit, for example, you are writing automated tests which run without user interaction. In these types of tests, you typically write tests for each of the public methods in your application, and you ensure that given known inputs, these methods produce the expected outputs. Over time as the application changes (via enhancements, bug fixes, etc.) you have a core set of tests that you can re-run to ensure nothing breaks as a result of the changes. You can also do failure testing to ensure that given an invalid set of inputs to a method, it throws the proper exceptions, etc.
Besides automated testing with a tool like NUnit, it's also important to ensure that your end users test the product. "End users" here could be a Quality Assurance group in your company, or it could be the actual customer. The point is that you need to ensure that someone actually uses your application to make sure it works as expected, because no matter how good the automated tests are, there will still be many things you won't think of that your users will discover. One way to approach this type of testing is to write test scenarios, and have your users execute them to make sure the scenario results in the correct behavior.
I think the best testing approach combines both of the above, namely automated testing and user testing (with documented test scenarios).
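The "known inputs, expected outputs" and failure-testing ideas above can be sketched in a few lines. The answer talks about NUnit; the example below uses Python's built-in unittest module purely as an illustration of the same shape, and `parse_quantity` is a hypothetical function invented for the example.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function under test: parse a positive whole number."""
    value = int(text)  # raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


class ParseQuantityTests(unittest.TestCase):
    def test_known_input_gives_expected_output(self):
        self.assertEqual(parse_quantity("3"), 3)

    def test_non_numeric_input_raises(self):
        # Failure testing: invalid input must raise the proper exception.
        with self.assertRaises(ValueError):
            parse_quantity("three")

    def test_non_positive_input_raises(self):
        with self.assertRaises(ValueError):
            parse_quantity("0")


# Build and run the suite explicitly so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseQuantityTests)
result = unittest.TextTestRunner().run(suite)
```

Once tests like these exist, they can be re-run after every change to confirm that nothing that used to work has broken.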

What's the Point of Selenium?

Ok, maybe I'm missing something, but I really don't see the point of Selenium. What is the point of opening the browser using code, clicking buttons using code, and checking for text using code? I read the website and I see how in theory it would be good to automatically unit test your web applications, but in the end doesn't it just take much more time to write all this code rather than just clicking around and visually verifying things work?
I don't get it...

It allows you to write functional tests in your "unit" testing framework (the issue is the naming of the latter).
When you test your application through the browser you are usually testing the system fully integrated. Given that you already have to test your changes before committing them (smoke tests), you don't want to do that manually over and over.
Something really nice is that you can automate your smoke tests, and QA can augment those. Pretty effective, as it reduces duplication of effort and gets the whole team closer.
PS: like any practice you use for the first time, it has a learning curve, so it usually takes longer the first few times. I also suggest you look at the Page Object pattern; it helps keep the tests clean.
Update 1: Notice that the tests will also run JavaScript on the pages, which helps when testing highly dynamic pages. Also note that you can run it with different browsers, so you can check cross-browser issues (at least on the functional side, as you still need to check the visual).
Also note that as the number of pages covered by tests builds up, you can create tests with complete cycles of interactions quickly. Using the Page Object pattern they look like:
LastPage aPage = somePage
    .SomeAction()
    .AnotherActionWithParams("somevalue")
    //... other actions
    .AnotherOneThatKeepsYouOnthePage();
// add some asserts using methods that give you info
// on LastPage (or that check the info is there).
// you can of course break the statements to add additional
// asserts on the multi-steps story.
It is important to understand that you go about this gradually. If it is an already built system, you add tests for the features/changes you are working on, adding more and more coverage along the way. Going manual instead usually hides what you failed to test. With automated tests, if you make a change that affects every single page and can only check a subset (as time doesn't allow), you know which ones you actually tested, and QA can work from there (hopefully by adding even more tests).
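For anyone who wants to see the Page Object chaining in runnable form, here is a rough Python sketch. All class and method names are hypothetical, and `FakeBrowser` is a stand-in that only records actions; in a real suite each page object would wrap a Selenium driver instead.

```python
class FakeBrowser:
    """Stand-in for a real browser driver: records actions instead of clicking."""

    def __init__(self):
        self.actions = []
        self.page_text = ""


class LoginPage:
    def __init__(self, browser):
        self.browser = browser

    def log_in(self, user, password):
        self.browser.actions.append(("log_in", user))
        return CartPage(self.browser)  # this action navigates to another page


class CartPage:
    def __init__(self, browser):
        self.browser = browser

    def add_item(self, name):
        self.browser.actions.append(("add_item", name))
        return self  # this action keeps you on the same page

    def check_out(self):
        self.browser.actions.append(("check_out",))
        self.browser.page_text = "Thanks for ordering"
        return ConfirmationPage(self.browser)


class ConfirmationPage:
    def __init__(self, browser):
        self.browser = browser

    def message(self):
        return self.browser.page_text


# A complete multi-page story written as one chain of page actions.
browser = FakeBrowser()
confirmation = (
    LoginPage(browser)
    .log_in("alice", "secret")
    .add_item("book")
    .check_out()
)
assert "Thanks" in confirmation.message()
```

The design point is that each action returns the page you end up on, so the test reads like the user's journey and the details of locating elements stay inside the page classes.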

This is a common thing that is said about unit testing in general. "I need to write twice as much code for testing?" The same principles apply here. The payoff is the ability to change your code and know that you aren't breaking anything.

Because you can repeat the SAME test over and over again.

If your application is even 50+ pages and you need to do frequent builds and test it against X number of major browsers it makes a lot of sense.

Imagine you have 50 pages, all with 10 links each, and some with multi-stage forms that require you to go through the forms, putting in about 100 different sets of information to verify that they work properly with all credit card numbers, all addresses in all countries, etc.
That's virtually impossible to test manually. It becomes so prone to human error that you can't guarantee the testing was done right, never mind what the testing proved about the thing being tested.
Moreover, if you follow a modern development model, with many developers all working on the same site in a disconnected, distributed fashion (some working on the site from their laptop while on a plane, for instance), then the human testers won't even be able to access it, much less have the patience to re-test every time a single developer tries something new.
On any decent size of website, tests HAVE to be automated.

The point is the same as for any kind of automated testing: writing the code may take more time than "just clicking around and visually verifying things work", maybe 10 or even 50 times more.
But any nontrivial application will have to be tested far more than 50 times eventually, and manual tests are an annoying chore that will likely be omitted or done shoddily under pressure, which results in bugs remaining undiscovered until just before (or after) important deadlines, which results in stressful all-night coding sessions or even outright monetary loss due to contract penalties.

Selenium (along with similar tools, like Watir) lets you run tests against the user interface of your Web app in ways that computers are good at: thousands of times overnight, or within seconds after every source checkin. (Note that there are plenty of other UI testing pieces that humans are much better at, such as noticing that some odd thing not directly related to the test is amiss.)
There are other ways to involve the whole stack of your app by looking at the generated HTML rather than launching a browser to render it, such as Webrat and Mechanize. Most of these don't have a way to interact with JavaScript-heavy UIs; Selenium has you somewhat covered here.

Selenium will record and re-run all of the manual clicking and typing you do to test your web application. Over and over.
Studying my own habits over time has shown me that I tend to do fewer tests, skipping some or forgetting about them.
Selenium will instead take each test and run it; if it doesn't return what you expect, it can let you know.
There is an upfront cost of time to record all these tests. I would recommend it like unit tests -- if you don't have it already, start using it with the most complex, touchy, or most updated parts of your code.

And if you save those tests as JUnit classes you can rerun them at your leisure, as part of your automated build, or in a poor man's load test using JMeter.

In a past job we used to unit test our web-app. If the web-app changes its look the tests don't need to be re-written. Record-and-replay type tests would all need to be re-done.

Why do you need Selenium? Because testers are human beings. They go home every day, can't always work weekends, take sickies, take public holidays, go on vacation every now and then, and get bored doing repetitive tasks, so you can't always rely on them being around when you need them.
I'm not saying you should get rid of testers, but an automated UI testing tool complements system testers.

The point is the ability to automate what was before a manual and time consuming test. Yes, it takes time to write the tests, but once written, they can be run as often as the team wishes. Each time they are run, they are verifying that behavior of the web application is consistent. Selenium is not a perfect product, but it is very good at automating realistic user interaction with a browser.

If you do not like the Selenium approach, you can try HtmlUnit, I find it more useful and easy to integrate into existing unit tests.

For applications with rich web interfaces (like many GWT projects), Selenium/Windmill/WebDriver/etc. is the way to create acceptance tests. In the case of GWT/GXT, the final user interface code is in JavaScript, so creating acceptance tests using normal JUnit test cases is basically out of the question. With Selenium you can create test scenarios matching real user actions and expected results.
Based on my experience with Selenium it can reveal bugs in the application logic and user interface (in case your test cases are well written). Dealing with AJAX front ends requires some extra effort but it is still feasible.

I use it to test multi page forms as this takes the burden out of typing the same thing over and over again. And having the ability to check if certain elements are present is great. Again, using the form as an example your final selenium test could check if something like say "Thanks Mr. Rogers for ordering..." appears at the end of the ordering process.

How to deal with code coverage?

The other day we had a hard discussion between different developers and project leads, about code coverage tools and the use of the corresponding reports.
Do you use code coverage in your projects, and if not, why not?
Is code coverage a fixed part of your builds or continuous integration, or do you just use it from time to time?
How do you deal with the numbers derived from the reports?

We use code coverage to verify that we aren't missing big parts in our testing efforts. Once a milestone or so we run a full coverage report and spend a few days analyzing the results, adding test coverage for areas we missed.
We don't run it every build because I don't know that we would analyze it on a regular enough basis to justify that.
We analyze the reports for large blocks of unhit code. We've found this to be the most efficient use. In the past we would try to hit a particular code coverage target but after some point, the returns become very diminishing. Instead, it's better to use code coverage as a tool to make sure you didn't forget anything.

1) Yes we do use code coverage
2) Yes it is part of the CI build (why wouldn't it be?)
3) The important part - we don't look for 100% coverage. What we do look for is buggy/complex code, that's easy to find from your unit tests, and the Devs/Leads will know the delicate parts of the system. We make sure the coverage of such code areas is good, and increases with time, not decreases as people hack in more fixes without the requisite tests.

Code coverage tells you how big your "bug catching" net is, but it doesn't tell you how big the holes are in your net.
Use it as an indicator to gauge your testing efforts but not as an absolute metric.
It is possible to write code that will give you 100% coverage and does not test anything at all.

The way to look at code coverage is to see how much is NOT covered and find out why it is not covered. Code coverage simply tells us which lines of code are being hit when the unit tests run. It does not tell us whether the code works correctly. 100% code coverage is a good number, but in medium/large projects it is very hard to achieve.

I like to measure code coverage on any non-trivial project. As has been mentioned, try not to get too caught up in achieving an arbitrary/magical percentage. There are better metrics, such as riskiness based on complexity, coverage by package/namespace, etc.
Take a look at this sample Clover dashboard for similar ideas.

We do it in the build and check that it does not drop below some value, like 85%.
I also automatically generate a list of the top 10 largest uncovered methods, to know where to start covering.
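Roughly, the two checks could look like the Python sketch below. The per-method numbers are hypothetical and the data shape is an assumption; a real setup would read them from whatever report your coverage tool produces.

```python
def total_coverage(methods):
    """Overall line coverage in percent across all methods."""
    total = sum(m["lines"] for m in methods)
    covered = sum(m["covered"] for m in methods)
    return 100.0 * covered / total if total else 100.0


def largest_uncovered(methods, n=10):
    """Names of the methods with the most uncovered lines, biggest gap first."""
    gaps = [(m["lines"] - m["covered"], m["name"]) for m in methods]
    gaps = [g for g in gaps if g[0] > 0]
    gaps.sort(key=lambda g: (-g[0], g[1]))
    return [name for _, name in gaps[:n]]


def gate(methods, minimum=85.0):
    """True when coverage is at or above the minimum, i.e. the build may pass."""
    return total_coverage(methods) >= minimum


# Hypothetical per-method data summarised from a coverage report.
methods = [
    {"name": "Invoice.total", "lines": 40, "covered": 40},
    {"name": "Invoice.render_pdf", "lines": 120, "covered": 60},
    {"name": "Customer.merge", "lines": 80, "covered": 70},
    {"name": "Tax.lookup", "lines": 60, "covered": 30},
]

if not gate(methods):
    print("Coverage below threshold; start with:", largest_uncovered(methods, n=2))
```

Sorting by uncovered lines rather than by percentage puts the big gaps first, which matches the "what to start covering" use described above.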

Many teams switching to Agile/XP use code coverage as an indirect way of gauging the ROI of their test automation efforts.
I think of it as an experiment - there's a hypothesis that "if we start writing unit tests, our code coverage will improve" - and it makes sense to collect the corresponding observation automatically, via CI, report it in a graph etc.
You use the results to detect rough spots: if the trend toward more coverage levels off at some point, for instance, you might stop to ask what's going on. Perhaps the team has trouble writing tests that are relevant.

We use code coverage to assure that we have no major holes in our tests, and it's run nightly in our CI.
Since we also have a full set of selenium-web tests that run all the way through the stack we also do an additional coverage trick:
We set up the web-application with coverage running. Then we run the full automated test battery of selenium tests. Some of these are smoke tests only.
When the full suite of tests has been run, we can identify suspected dead code simply by looking at the coverage and inspecting code. This is really nice when working on large projects, because you can have big branches of dead code after some time.
We don't really have any fixed metrics on how often we do this, but it's all set up to run with a keypress or two.
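The dead-code trick boils down to "run everything, then list what was never executed". Here is a toy Python sketch of that idea using the standard `sys.settrace` hook; the functions and the "suite" are invented, and a real setup would use a proper coverage tool attached to the running web application rather than a hand-written tracer.

```python
import sys


def used_function():
    return "used"


def unused_function():
    return "never called by the tests"


def run_test_suite():
    # Stand-in for the full end-to-end test battery.
    assert used_function() == "used"


def functions_called_during(run):
    """Run `run()` and return the names of all Python functions entered."""
    called = set()

    def tracer(frame, event, arg):
        if event == "call":
            called.add(frame.f_code.co_name)
        return None  # no line-by-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        run()
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return called


candidates = {"used_function", "unused_function"}
suspected_dead = candidates - functions_called_during(run_test_suite)
print(suspected_dead)
```

As in the answer, the output is only a list of suspects: code that no test reaches may be dead, or it may simply be untested, so it still needs a human to inspect it.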

We do use code coverage; it is integrated into our nightly build. There are several tools to analyze the coverage data. Commonly they report:
statement coverage
branch coverage
MC/DC coverage
We expect to reach 90%+ statement and branch coverage. MC/DC coverage, on the other hand, gives the test team a broader picture. For uncovered code, we expect justification records.
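The difference between statement and branch coverage is easy to miss, so here is a small hand-instrumented Python sketch. The function is invented, and the recording of condition outcomes is done by hand purely for illustration; a real coverage tool does this for you.

```python
outcomes_seen = set()


def clamp_to_zero(x):
    is_negative = x < 0
    outcomes_seen.add(is_negative)  # hand-rolled record of which way the `if` went
    if is_negative:
        x = 0
    return x


def branch_coverage():
    # The single `if` has two possible outcomes: taken and not taken.
    return 100.0 * len(outcomes_seen) / 2


# One test with a negative input executes every statement in the function...
assert clamp_to_zero(-5) == 0
after_one_test = branch_coverage()   # ...but only one of the two branch outcomes.

# A second test with a non-negative input exercises the "not taken" outcome.
assert clamp_to_zero(3) == 3
after_two_tests = branch_coverage()
```

After the first test, statement coverage is already 100% while branch coverage is only 50%, which is why it helps to report both numbers.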

I find it depends on the code itself. I won't repeat Joel's statements from SO podcast #38, but the upshot is 'try to be pragmatic'.
Code coverage is great in core elements of the app.
I look at the code as a tree of dependency, if the leaves work (e.g. basic UI or code calling a unit tested DAL) and I've tested them when I've developed them, or updated them, there is a large chance they will work, and if there's a bug, then it won't be difficult to find or fix, so the time taken to mock up some tests will probably be time wasted. Yes there is an issue that updates to code they are dependent on may affect them, but again, it's a case by case thing, and unit tests for the code they are dependent on should cover it.
When it comes to the trunks or branch of the code, yes code coverage of functionality (as opposed to each function), is very important.
For example, I recently was on a team that built an app that required a bundle of calculations to calculate carbon emissions. I wrote a suite of tests that tested each and every calculation, and in doing so was happy to see that the dependency injection pattern was working fine.
Inevitably, due to a government act change, we had to add a parameter to the equations, and all 100+ tests broke.
I realised that in updating them, beyond checking for a typo (which I could test once), I would just be unit/regression testing mathematics, so I ended up spending the time on building another area of the app instead.

1) Yes we do measure simple node coverage, because:
it is easy to do with our current project* (Rails web app)
it encourages our developers to write tests (some come from backgrounds where testing was ad-hoc)
2) Code coverage is part of our continuous integration process.
3) The numbers from the reports are used to:
enforce a minimum level of coverage (95% otherwise the build fails)
find sections of code which should be tested
There are parts of the system where testing is not all that helpful (usually where you need to make use of mock-objects to deal with external systems). But generally having good coverage makes it easier to maintain a project. One knows that fixes or new features do not break existing functionality.
*Details for setting up required coverage for Rails: Min Limit 95 Ahead

What is the code-coverage percentage on your project?

What is the % code-coverage on your project? I'm curious as to reasons why.
Is the dev team happy with it? If not, what stands in the way from increasing it?
Stuart Halloway is one whose projects aim for 100% (or else the build breaks!). Is anyone at that level?
We are at a painful 25% but aspire to 80-90% for new code. We have legacy code that we have decided to leave alone as it evaporates (we are actively re-writing).

We run at 85% code coverage, but falling below it does not break the build. I think using code coverage as an important metric is a dangerous practice. Just because something is covered in a test does not mean the coverage is any good. We try to use it as guidance for the areas we are weakly covered, not as a hard fact.

80% is the exit criterion for the milestone. If we don't make it through the sprint (even though we do plan the time up front), we add it during stabilization. We might make an exception for a particular component or feature, but we open a Pri 1 item for the next milestone.
During coding, code coverage is measured automatically on the daily build and the report is sent to the whole team. Anything that falls under 70% is yellow, under 50% is red. We don't fail the build currently, but we have a plan to add this in the next milestone.
Not sure what dev happiness has to do with unit testing. Devs are hired to build a quality product, and there should be a process to enforce minimum quality and a way to measure it. If somebody is not happy about the process, they are free to suggest another way of validating their code before it is integrated with the rest of the components.
Btw, we measure code coverage on automated scenario tests as well. Thus, we have three numbers - unit, scenario and combined.

Our company goal is 80% statement coverage, including exception handling code. Personally, I like to be above 90% on all of the stuff I check in.

I often use code coverage under our automated test suite, but primarily to look for untested areas. We get about 70% coverage most of the time, and will never hit 100% for two reasons:
1) We typically automate new functionality after the release, so it is manually tested for its first release and hence not included in coverage analysis. Automation is primarily for functional regression in our case and is the best place to execute and tweak code coverage.
2) Fault injection is required to get 100% coverage, as you need to get inside exception handlers. This is difficult and time-consuming to automate. We don't currently do this and hence won't ever get 100%. James Whittaker's books on breaking software cover this subject well for anyone interested.
It is also worth remembering that code coverage does not equate to test coverage, as is regularly discussed in threads such as this and this over on SQAforums. Thus 100% code coverage can be a misleading metric.

A couple of years ago I measured Perl's test coverage. By the end of 250 test cases it had reached 70% of the code and 33% of fully tested branches.

Sadly, still 0% at our workplace.
We will aim to improve that, but telling the bosses that we need it isn't easy, since they see testing != coding, and therefore less money.

A project I did a couple of years ago achieved 100% line coverage but I had total control over it so I could enforce the target.
We've now got an objective to have 50% of new code covered, a figure that will rise in the near future, but no way to measure it. We will soon have tools in place to measure code coverage on every nightly run of the unit tests, so I'm convinced our position will improve.
