Are C++ static code analysis tools worth it? - c++

Our management has recently been talking to some people selling C++ static analysis tools. Of course the sales people say they will find tons of bugs, but I'm skeptical.
How do such tools work in the real world? Do they find real bugs? Do they help more junior programmers learn?
Are they worth the trouble?

Static code analysis is almost always worth it. The issue with an existing code base is that it will probably report far too many errors to make it useful out of the box.
I once worked on a project that had 100,000+ warnings from the compiler... no point in running Lint tools on that code base.
Using Lint tools "right" means buying into a better process (which is a good thing). One of the best jobs I had was working at a research lab where we were not allowed to check in code with warnings.
So, yes the tools are worth it... in the long term. In the short term turn your compiler warnings up to the max and see what it reports. If the code is "clean" then the time to look at lint tools is now. If the code has many warnings... prioritize and fix them. Once the code has none (or at least very few) warnings then look at Lint tools.
So, Lint tools are not going to help a poor code base, but once you have a good code base they can help you keep it good.
Edit:
In the case of the 100,000+ warning product, it was broken down into about 60 Visual Studio projects. As each project had all of its warnings removed, it was changed so that warnings were treated as errors. That prevented new warnings from being added to projects that had been cleaned up (or rather it let my co-worker righteously yell at any developer that checked in code without compiling it first :-)

In my experience with a couple of employers, Coverity Prevent for C/C++ was decidedly worth it, finding some bugs even in good developers’ code, and a lot of bugs in the worst developers’ code. Others have already covered technical aspects, so I’ll focus on the political difficulties.
First, the developers whose code needs static analysis the most are the least likely to use it voluntarily. So I’m afraid you’ll need strong management backing, in practice as well as in theory; otherwise it might end up as just a checklist item, to produce impressive metrics without actually getting bugs fixed. Any static analysis tool is going to produce false positives; you’re probably going to need to dedicate somebody to minimizing the annoyance from them, e.g., by triaging defects, prioritizing the checkers, and tweaking the settings. (A commercial tool should be extremely good at never showing a false positive more than once; that alone may be worth the price.) Even the genuine defects are likely to generate annoyance; my advice on this is not to worry about, e.g., check-in comments grumbling that obviously destructive bugs are “minor.”
My biggest piece of advice is a corollary to my first law, above: Take the cheap shots first, and look at the painfully obvious bugs from your worst developers. Some of these might even have been found by compiler warnings, but a lot of bugs can slip through those cracks, e.g., when they’re suppressed by command-line options. Really blatant bugs can be politically useful, e.g., with a Top Ten List of the funniest defects, which can concentrate minds wonderfully, if used carefully.

As a couple of people remarked, if you run a static analysis tool full bore on most applications, you will get a lot of warnings, and some of them may be false positives or may not lead to an exploitable defect. It is that experience that leads to a perception that these types of tools are noisy and perhaps a waste of time. However, there are warnings that will highlight real and potentially dangerous defects that can lead to security, reliability, or correctness issues. For many teams, those issues are important to fix and may be nearly impossible to discover via testing.
That said, static analysis tools can be profoundly helpful, but applying them to an existing codebase requires a little strategy. Here are a couple of tips that might help you.
1) Don't turn everything on at once, decide on an initial set of defects, turn those analyses on and fix them across your code base.
2) When you are addressing a class of defects, help your entire development team to understand what the defect is, why it's important and how to code to defend against that defect.
3) Work to clear the codebase completely of that class of defects.
4) Once this class of issues has been fixed, introduce a mechanism to stay in that zero-issue state. Luckily, it is much easier to make sure you are not re-introducing an error if you are at a baseline that has no errors.

It does help. I'd suggest taking a trial version and running it through a part of your codebase which you think is neglected. These tools generate a lot of false positives. Once you've waded through these, you're likely to find a buffer overrun or two that can save a lot of grief in near future. Also, try at least two/three varieties (and also some of the OpenSource stuff).

I've used them - PC-Lint, for example, and they did find some things. Typically they are configurable and you can tell them 'stop bothering me about xyz', if you determine that xyz really isn't an issue.
I don't know that they help junior programmers learn a lot, but they can be used as a mechanism to help tighten up the code.
I've found that a second set of (skeptical, probing for bugs) eyes and unit testing is typically where I've seen more bug catching take place.

Those tools do help. lint has been a great tool for C developers.
But one objection that I have is that they're batch processes that run after you've written a fair amount of code and potentially generate a lot of messages.
I think a better approach is to build such a thing into your IDE and have it point out the problem while you're writing it so you can correct it right away. Don't let those problems get into the code base in the first place.
That's the difference between the FindBugs static analysis tool for Java and IntelliJ's Inspector. I greatly prefer the latter.

You are probably going to have to deal with a good amount of false positives, particularly if your code base is large.
Most static analysis tools work using "intra-procedural analysis", which means that they consider each procedure in isolation, as opposed to "whole-program analysis" which considers the entire program.
They typically use "intra-procedural" analysis because "whole-program analysis" has to consider many paths through a program that won't actually ever happen in practice, and thus can often generate false positive results.
Intra-procedural analysis eliminates those problems by just focusing on a single procedure. In order to work, however, they usually need to introduce an "annotation language" that you use to describe meta-data for procedure arguments, return types, and object fields. For C++ those things are usually implemented via macros that you decorate things with. The annotations then describe things like "this field is never null", "this string buffer is guarded by this integer value", "this field can only be accessed by the thread labeled 'background'", etc.
The analysis tool will then take the annotations you supply and verify that the code you wrote actually conforms to the annotations. For example, if you could potentially pass a null off to something that is marked as not null, it will flag an error.
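To make the annotation idea concrete, here is a minimal sketch. The macro name ARG_NONNULL and the function copy_name are made up for illustration; real tools ship their own annotation macros with their own spelling. The only real feature used is the `nonnull` function attribute that GCC and Clang understand, and on other compilers the macro simply expands to nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical annotation macro (the name ARG_NONNULL is an assumption made
// for this sketch). On GCC and Clang it expands to the real `nonnull`
// attribute; elsewhere it expands to nothing, so the code still compiles.
#if defined(__GNUC__) || defined(__clang__)
#  define ARG_NONNULL(...) __attribute__((nonnull(__VA_ARGS__)))
#else
#  define ARG_NONNULL(...)
#endif

// Annotated contract: parameters 1 (dst) and 3 (src) are never null.
// dst_size is the capacity of dst in bytes, including the terminator.
ARG_NONNULL(1, 3)
std::size_t copy_name(char* dst, std::size_t dst_size, const char* src) {
    assert(dst_size > 0);                 // run-time backstop for the size contract
    std::size_t n = std::strlen(src);
    if (n >= dst_size) n = dst_size - 1;  // truncate instead of overrunning dst
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n;                             // number of characters copied
}
```

With the annotation in place, a call such as copy_name(buf, sizeof buf, nullptr) is something a compiler or analyzer can flag at the call site, without having to look inside the function body.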
In the absence of annotations, the tool needs to assume the worst, and so will report a lot of errors that aren't really errors.
Since it appears you are not using such a tool already, you should assume you are going to have to spend a considerable amount of time annotating your code to get rid of all the false positives that will initially be reported. I would run the tool initially, and count the number of errors. That should give you an estimate of how much time you will need to adopt it in your code base.
Whether or not the tool is worth it depends on your organization. What are the kinds of bugs you are bitten by the most? Are they buffer overrun bugs? Are they null-dereference or memory-leak bugs? Are they threading issues? Are they "oops we didn't consider that scenario", or "we didn't test a Chinese version of our product running on a Lithuanian version of Windows 98"?
Once you figure out what the issues are, then you should know if it's worth the effort.
The tool will probably help with buffer overflow, null dereference, and memory leak bugs. There's a chance that it may help with threading bugs if it has support for "thread coloring", "effects", or "permissions" analysis. However, those types of analysis are pretty cutting-edge, and have HUGE notational burdens, so they do come with some expense. The tool probably won't help with any other type of bugs.
So, it really depends on what kind of software you write, and what kind of bugs you run into most frequently.

I think static code analysis is well worth it, if you are using the right tool. Recently, we tried the Coverity tool (a bit expensive). It's awesome; it brought out many critical defects which were not detected by lint or Purify.
We also found that we could have avoided 35% of the customer field defects if we had used Coverity earlier.
Now Coverity is rolled out in my company, and whenever we get a customer TR in an old software version, we run Coverity against it to bring out the possible candidates for the fault before we start the analysis in a subsystem.

Paying for most static analysis tools is probably unnecessary when there are some very good-quality free ones (unless you need some very special or specific feature provided by a commercial version). For example, see this answer I gave on another question about cppcheck.

I guess it depends quite a bit on your programming style. If you are mostly writing C code (with the occasional C++ feature) then these tools will likely be able to help (e.g. memory management, buffer overruns, ...). But if you are using more sophisticated C++ features, then the tools might get confused when trying to parse your source code (or just won't find many issues because C++ facilities are usually safer to use).

As with everything, the answer depends ... if you are the sole developer working on a knitting-pattern-pretty-printer for your grandma, you probably do not want to buy any static analysis tools. If you have a medium-sized project for software that will go into something important, and maybe on top of that you have a tight schedule, you might want to invest a little bit now that saves you much more later on.
I recently wrote a general rant on this: http://www.redlizards.com/blog/?p=29
I should write part 2 as soon as time permits, but in general do some rough calculations whether it is worth it for you:
how much time spent on debugging?
how many resources bound?
what percentage could have been found by static analysis?
costs for tool setup?
purchase price?
peace of mind? :-)
My personal take is also:
get static analysis in early
early in the project
early in the development cycle
early as in really early (before nightly build and subsequent testing)
provide the developer with the ability to use static analysis himself
nobody likes to be told by test engineers or some anonymous tool what they did wrong yesterday
less debugging makes a developer happy :-)
provides a good way of learning about (subtle) pitfalls without embarrassment

This rather amazing result was accomplished using Elsa and Oink.
http://www.cs.berkeley.edu/~daw/papers/fmtstr-plas07.pdf
"Large-Scale Analysis of Format String Vulnerabilities in Debian Linux"
by Karl Chen, David Wagner,
UC Berkeley,
{quarl, daw}@cs.berkeley.edu
Abstract:
Format-string bugs are a relatively common security vulnerability, and can lead to arbitrary code execution. In collaboration with others, we designed and implemented a system to eliminate format string vulnerabilities from an entire Linux distribution, using type-qualifier inference, a static analysis technique that can find taint violations. We successfully analyze 66% of C/C++ source packages in the Debian 3.1 Linux distribution. Our system finds 1,533 format string taint warnings. We estimate that 85% of these are true positives, i.e., real bugs; ignoring duplicates from libraries, about 75% are real bugs. We suggest that the technology exists to render format string vulnerabilities extinct in the near future.
Categories and Subject Descriptors D.4.6 [Operating Systems]: Security and Protection—Invasive Software;
General Terms: Security, Languages;
Keywords: Format string vulnerability, Large-scale analysis, Type-qualifier inference

Static analysis that finds real bugs is worth it regardless of whether it's C++ or not. Some tend to be quite noisy, but if they can catch subtle bugs like signed/unsigned comparisons causing optimizations that break your code or out of bounds array accesses, they are definitely worth the effort.
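As a rough illustration of the signed/unsigned and out-of-bounds cases mentioned above, here is a small sketch. All function names are invented for the example. The first function spells out explicitly the conversion that the compiler applies silently when an int is compared with an unsigned value; the other two show one possible way to write the checks so the analyzer has nothing to complain about.

```cpp
#include <cstddef>
#include <vector>

// What `value < limit` silently does when value is an int and limit is
// unsigned: the int is converted to unsigned first, so -1 becomes a huge
// number. The cast is written out here only to make that conversion visible.
bool less_than_converted(int value, unsigned limit) {
    return static_cast<unsigned>(value) < limit;
}

// A version that matches the mathematical meaning: negative values are
// always below any unsigned limit.
bool less_than_safe(int value, unsigned limit) {
    return value < 0 || static_cast<unsigned>(value) < limit;
}

// Bounds-checked element access with a signed index: returns `fallback`
// instead of reading outside the vector.
int at_or(const std::vector<int>& v, int index, int fallback) {
    if (index < 0 || static_cast<std::size_t>(index) >= v.size()) {
        return fallback;
    }
    return v[static_cast<std::size_t>(index)];
}
```

Here less_than_converted(-1, 10) yields false even though -1 is below 10, which is exactly the kind of surprise a sign-compare warning is pointing at.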

At a former employer we had Insure++.
It helped to pinpoint random behaviour (use of uninitialized stuff) which Valgrind could not find. But most important: it helped to remove mistakes which were not known as errors yet.
Insure++ is good, but pricey, that's why we bought one user license only.

Related

Study of the Consequences of C++ warnings [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I just started work at a job that has very old code. Compiling this code with even basic warnings enabled produces thousands of warnings, and many of them are really scary to me. Much of this code was written in the 80s, so management is not excited to have some new engineer try to fix it.
My opinion is that it should be a high priority to get rid of these warnings, but I don't have any data to back me up. I understand management's thoughts that fixing all the warnings is a serious undertaking, and may not be worth the effort on code that appears to work as is. I'm looking for a study that points to bugs/warning or something like that so I can go to them and say something like: "We have 200,000 warnings, and from study X, it's likely that there are 2,000 bugs hidden in there."
Very similar discussion here: https://softwareengineering.stackexchange.com/questions/111616/handling-false-positives-and-legacy-code-warnings-in-static-analysis-of-c-code
I don't think that this should be closed as off-topic, because it is one of the "practical, answerable problems that are unique to the programming profession."
This was closed due to off-topic, but I found a study: http://institute.lanl.gov/isti/irhpit/presentations/ensuring-sq.pdf
http://collaboration.csc.ncsu.edu/laurie/Papers/TSE-0197-0705-2.pdf
I don't think there are any such studies. I'm also going to go out on a limb here and say that management is right - trying to fix all those warnings, simply to fix all those warnings, is a bad idea.
You WILL break things by fixing the warnings! Right now there's probably a lot wrong with that code, but there's a lot right with it, and for you to go through and cause problems in 'working' code is NOT going to go over well.
In your shoes, I'd get to work on whatever modifications/fixes I need to do. In the process, write unit tests for EVERYTHING you touch, AND clear up the warnings for everything you touch.
I don't know what your build system is, but if it's possible, build everything you touch with warnings on, leaving warnings off for the older stuff, and move code you change into new 'clean compile' files.
You aren't going to get hold of it quickly, but you can't just jump in and change a bunch of stuff. You have to find a way to work that lets you make progress, without breaking things for the end users.
UPDATE:
I started this as a comment, but I think it deserves to be part of the answer.
Clean code (code that has no warnings, code that has unit tests, code that is easy to understand) IS NOT and should NEVER BE the goal of any developer who actually has to get things done in the real world.
Clean code is a TOOL that we use in order to make our products better and our lives (and the lives of those who follow in our footsteps) easier. It's a good tool, nay, a great tool, but it's JUST A TOOL.
If you lose sight of the actual goal (producing software that works), you can bog yourself down in the trivia associated with your process. That may feel good, but it generally doesn't satisfy the users, or ship the product on time.
Warnings do not mean defects. Every product that is not tested thoroughly enough contains defects. If a product is well tested, then there are no defects.
A lot of warnings are bad for another reason: the warnings hide new warnings (which may indicate freshly made defects) from maintainers. So maintenance of code with a lot of warnings is more expensive.
Fixing warnings, and sometimes even real defects like memory leaks, should be done with care. It should never be done mechanically; that will most likely break a working product. I have seen that dozens of times in practice.
I agree with most of what Michael Khone says, but quite a bit of product/development management is involved here and the evaluation depends on this. Some (but not all) major questions you should ask yourself:
What is the development mode of this product? Is it just bugfixes with minor enhancements from time to time, or are you constantly developing new features on it?
Is there decent enough test coverage? Undertaking major rewrites such as this is suicide if you don't have a decent regression suite. Even then a few bugs will still slip through, but it's much better than nothing.
Is there a need to update the toolchain/platform? This is important because many warnings are actually things that will not show any problem as long as you stick to the same environment where you're sure this "problematic" behavior is predictable (and probably correct for your code). If however you want to change one of these drastically, it might be a very good idea to invest time in solving all those warnings.
So warnings -> bugs depends much on your actual product.
The sad truth is that fixing old code is more often than not a fruitless adventure. Old code has often no unit tests and people who wrote the code are long gone from the company. If you see some weird thing in the code it may have been done for a particular reason but so many factors have changed since then that it is just difficult to do any impact analysis. There are just too many things that can go wrong when rewriting old code. Instead what you should focus on is to write new code in a good style.
Another issue is that management has absolutely no understanding of why code should be rewritten; for them code is functionality, period.

Embedded Software Defect Rate

What defect rate can I expect in a C++ codebase that is written for an embedded processor (DSP), given that there have been no unit tests, no code reviews, no static code analysis, and that compiling the project generates about 1500 warnings? Is 5 defects/100 lines of code a reasonable estimate?
Your question is "Is 5 defects/100 lines of code a reasonable estimate?" That question is extremely difficult to answer, and it's highly dependent on the codebase & code complexity.
You also mentioned in a comment "to show the management that there are probably lots of bugs in the codebase" -- that's great, kudos, right on.
In order to open management's figurative eyes, I'd suggest at least a 3-pronged approach:
take specific compiler warnings, and show how some of them can cause undefined / disastrous behavior. Not all warnings will be as weighty. For example, if you have someone using an uninitialized pointer, that's pure gold. If you have someone stuffing an unsigned 16-bit value into an unsigned 8-bit value, and it can be shown that the 16-bit value will always be <= 255, that one isn't gonna help make your case as strongly.
run a static analysis tool. PC-Lint (or Flexelint) is cheap & provides good "bang for the buck". It will almost certainly catch stuff the compiler won't, and it can also run across translation units (lint everything together, even with 2 or more passes) and find more subtle bugs. Again, use some of these as indications.
run a tool that will give other metrics on code complexity, another source of bugs. I'd recommend M Squared's Resource Standard Metrics (RSM) which will give you more information and metrics (including code complexity) than you could hope for. When you tell management that a complexity score over 50 is "basically untestable" and you have a score of 200 in one routine, that should open some eyes.
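For the first prong, the 16-bit-into-8-bit case is a good one to show management side by side with a checked version. The sketch below is only one possible way to write such a check; the name narrow_to_u8 is invented here, and it reports failure through a return value rather than an exception on the assumption that exceptions may be unwelcome on a DSP target.

```cpp
#include <cstdint>

// Checked narrowing: stores the value in `out` and returns true only when it
// fits in 8 bits. On failure `out` is left untouched, so the caller decides
// what to do instead of silently keeping the low byte.
bool narrow_to_u8(std::uint16_t value, std::uint8_t& out) {
    if (value > 0xFFu) {
        return false;  // would lose the high byte
    }
    out = static_cast<std::uint8_t>(value);
    return true;
}
```

If it can be shown that a given call site can never take the failing branch, that warning is one of the harmless ones; if it cannot, you have a concrete example of a value that gets truncated.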
One other point: I require clean compiles in my groups, and clean Lint output too. Usually this can be accomplished solely by writing good code, but occasionally the compiler / lint warnings need to be tweaked to quiet the tool for things that aren't problems (use judiciously).
But the important point I want to make is this: be very careful when going in & fixing compiler & lint warnings. It's an admirable goal, but you can also inadvertently break working code, and/or uncover undefined behavior that accidentally worked in the "broken" code. Yes, this really does happen. So tread carefully.
Lastly, if you have a solid set of tests already in place, that will help you determine if you accidentally break something while refactoring.
Good luck!
Despite my scepticism of the validity of any estimate in this case, I have found some statistics that may be relevant.
In this article, the author cites figures from "a large body of empirical studies", published in Software Assessments, Benchmarks, and Best Practices (Jones, 2000). At SEI CMM Level 1, which sounds like the level of this code, one can expect a defect rate of 0.75 per function point. I'll leave it to you to determine how function points and LOC may relate in your code - you'll probably need a metrics tool to perform that analysis.
Steve McConnell in Code Complete cites a study of 11 projects developed by the same team, 5 without code reviews, 6 with code reviews. The defect rate for the non-reviewed code was 4.5 per 100 LOC, and for the reviewed it was 0.82. So on that basis, your estimate seems fair in the absence of any other information. However I have to assume a level of professionalism amongst this team (just from the fact that they felt the need to perform the study), and that they would have at least attended to the warnings; your defect rate could be much higher.
The point about warnings is that some are benign, and some are errors (i.e. will result in undesired behaviour of the software); if you ignore them on the assumption that they are all benign, you will introduce errors. Moreover some will become errors under maintenance when other conditions change, but if you have already chosen to accept a warning, you have no defence against introduction of such errors.
Take a look at the code quality. It will quickly give you an indication of the amount of problems hiding in the source. If the source is ugly and takes a long time to understand, there will be a lot of bugs in the code.
Well structured code with consistent style that is easy to understand is going to contain fewer problems. Code shows how much effort and thought went into it.
My guess is that if the source contains that many warnings, there are going to be a lot of bugs hiding out in the code.
That also depends on who wrote the code (level of experience), and how big the code base is.
I would treat all warnings as errors.
How many errors do you get when you run a static analysis tool on the code?
EDIT
Run cccc, and check McCabe's cyclomatic complexity. It should tell you how complex the code is.
http://sourceforge.net/projects/cccc/
Run other static analysis tools.
If you want to get an estimate of the number of defects, the usual way of statistical estimation is to subsample the data. I would pick three medium-sized subroutines at random, and check them carefully for bugs (eliminate compiler warnings, run static analysis tool, etc). If you find three bugs in 100 total lines of code selected at random, it seems reasonable that a similar density of bugs is in the rest of the code.
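The extrapolation itself is just a proportion. A minimal sketch, with an invented function name and a made-up code base size used only for the arithmetic:

```cpp
#include <cassert>

// Scale the defects found in a random sample up to the whole code base.
// This assumes the sample is representative, which is the whole point of
// picking the subroutines at random.
double estimate_defects(int bugs_in_sample, int sample_lines, long total_lines) {
    assert(sample_lines > 0);
    return static_cast<double>(bugs_in_sample) *
           static_cast<double>(total_lines) /
           static_cast<double>(sample_lines);
}
```

With three bugs in 100 sampled lines and a hypothetical 50,000-line code base, estimate_defects(3, 100, 50000) gives 1500. A sample this small leaves a wide margin of error, so treat the number as an order of magnitude rather than a prediction.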
The problem mentioned here of introducing new bugs is an important issue, but you don't need to check the modified code back into the production branch to run this test. I would suggest a thorough set of unit tests before modifying any subroutines, and cleaning up all the code followed by very thorough system testing before releasing new code to production.
If you want to demonstrate the benefits of unit tests, code reviews, static analysis tools, I suggest doing a pilot study.
Do some unit tests, code reviews, and run static analysis tools on a portion of the code. Show management how many bugs you find using those methods. Hopefully, the results speak for themselves.
The following article has some numbers based on real-life projects to which static analysis has been applied: http://www.stsc.hill.af.mil/crosstalk/2003/11/0311German.html
Of course the criteria by which an anomaly is counted can affect the results dramatically, leading to the large variation in the figures shown in Table 1. In this table, the number of anomalies per thousand lines of code for C ranges from 500 (!) to about 10 (auto generated).

Which programming technique helps you most to avoid or resolve bugs before they come into production

I don't mean external tools. I think of architectural patterns, language constructs, habits. I am mostly interested in C++
Automated Unit Testing.
There's an oft-unappreciated technique that I like to call The QA Team that can do wonders for weeding out bugs before they reach production.
It's been my experience (and is often quoted in textbooks) that programmers don't make the best testers, despite what they may think, because they tend to test to behaviour they already know to be true from their coding. On top of that, they're often not very good at putting themselves in the shoes of the end user (if it's that kind of app), and so are likely to neglect UI formatting/alignment/usability issues.
Yes, unit testing is immensely important and I'm sure others can give you better tips than I on that, but don't neglect your system/integration testing. :)
..and hey, it's a language independent technique!
Code Review, Unit Testing, and Continuous Integration may all help.
I find the following rather handy.
1) ASSERTs.
2) A debug logger that can output to the debug spew, console or file.
3) Memory tracking tools.
4) Unit testing.
5) Smart pointers.
I'm sure there are tonnes of others but I can't think of them off the top of my head :)
RAII to avoid resource leakage errors.
Strive for simplicity and conciseness.
Never leave cases where your code behavior is undefined.
Look for opportunities to leverage the type system and have the compiler check as much as possible at compile time. Templates and code generation are your friends as long as you keep your common sense.
Minimize the number of singletons and global variables.
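For anyone who has not used RAII with smart pointers, here is a small sketch of why it removes a whole class of leaks. The Connection class and run_job function are invented for the example; the counter exists only so the effect can be observed.

```cpp
#include <memory>

// Stand-in for any resource that must be released exactly once.
// open_count tracks how many instances are currently alive.
class Connection {
public:
    Connection() { ++open_count; }
    ~Connection() { --open_count; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
    static int open_count;
};
int Connection::open_count = 0;

// The unique_ptr owns the Connection, so every way out of the function
// (early return, normal return, or an exception) releases it.
bool run_job(bool fail_early) {
    std::unique_ptr<Connection> conn(new Connection);
    if (fail_early) {
        return false;  // no manual cleanup needed here
    }
    return true;
}
```

After either run_job(true) or run_job(false), Connection::open_count is back at zero, without a single explicit delete in the code.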
Use RAII!
Use assertions!
Automatic testing of some nominal and all corner cases.
Avoid last minute changes like the plague.
I use thinking.
Reducing variable scope to as narrow as possible. Fewer variables in outer scope means fewer chances to plant and hide an error.
I found that the more is done and checked at compile time, the less can possibly go wrong at run time. So I try to leverage techniques that allow stricter checking at compile time. That's one of the reasons I went into template metaprogramming. If you do something wrong, it doesn't compile and thus never leaves your desk (and thus never arrives at the customer's).
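One common way to get that kind of compile-time checking is to wrap plain numbers in distinct types. The sketch below is an illustration only; Quantity, Meters, Seconds and speed are names chosen for this example and do not come from any library.

```cpp
#include <type_traits>

// A double tagged with a unit. Different tags give unrelated types, so the
// compiler refuses to mix them.
template <typename Tag>
struct Quantity {
    double value;
    explicit constexpr Quantity(double v) : value(v) {}
};

struct MetersTag {};
struct SecondsTag {};
using Meters = Quantity<MetersTag>;
using Seconds = Quantity<SecondsTag>;

// Only like quantities can be added; Meters + Seconds has no matching operator.
constexpr Meters operator+(Meters a, Meters b) {
    return Meters(a.value + b.value);
}

// The parameter types document and enforce the argument order.
constexpr double speed(Meters distance, Seconds time) {
    return distance.value / time.value;
}

static_assert(!std::is_same<Meters, Seconds>::value,
              "units are distinct types");
static_assert(!std::is_convertible<double, Meters>::value,
              "a bare double is not silently accepted as Meters");
```

A call like speed(Seconds(4.0), Meters(10.0)) with swapped arguments fails to compile, so that mistake never reaches a test run, let alone a customer.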
I find many problems before I start testing at all by using asserts.
Testing it with actual, realistic data from the start. And testing is necessary not only while writing the code, but it should start early in the design phase. Find out what your worst use cases will be like, and make sure your design can handle it. If your design feels good and elegant even against these use cases, it might actually be good.
Automated tests are great for making sure the code you write is correct. However, before you get to writing code, you have to make sure you're building the right things.
Learning functional programming helps somehow.
Learn You a Haskell for Great Good.
Model-View-Controller, and in general anything with contracts and interfaces that can be unit-tested automatically.
I agree with many of the other answers here.
Specific to C++, the use of 'const' and avoiding raw pointers (in favor of references and smart pointers) when possible has helped me find errors at compile time.
Also, having a "no warnings" policy helps find errors.
Requirements.
From my experience, having full and complete requirements is the number one step in creating bug-free software. You can't write complete and correct software if you don't know what it's supposed to do. You can't write proper tests for software if you don't know what it's supposed to do; you'll miss a fair amount of stuff you should test. Also, the simple process of writing the requirements helps you to flesh them out. You find so many issues and problems before you ever write the first line of code.
I find peer programming tends to help avoid a lot of the silly mistakes, and a lot of the time generates discussions which uncover flaws. Plus, with someone free to think about why you are doing something, it tends to make everything cleaner.
Code reviews; I've personally found lots of bugs in my colleagues' code and they have found bugs in mine.
Code reviews, early and often, will help you to both understand each others' code (which helps for maintenance), and spot bugs.
The sooner you spot a bug the easier it is to fix. So do them as soon as you can.
Of course pair programming takes this to an extreme.
Using an IDE like IntelliJ that inspects my code as I write it and flags dodgy code as I write it.
Unit Testing followed by Continuous Integration.
Book suggestions: "Code Complete" and "Release it" are two must-read books on this topic.
In addition to the already mentioned things, I believe that some features introduced with C++0x will help avoid certain bugs. Features like strongly-typed enums, range-based for loops and deleting standard functions of objects come to mind.
In general, strong typing is the way to go, imho.
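A short sketch of the three features named above, as they ended up in C++11. The type and function names are invented for the example.

```cpp
#include <type_traits>
#include <vector>

// Strongly-typed enums: Color and Fruit do not convert to int or to each
// other, so an expression like `Color::Red == Fruit::Apple` does not compile.
enum class Color { Red, Green };
enum class Fruit { Apple, Orange };

// Deleted functions: copying a Handle is a compile-time error instead of an
// accidental double release of whatever the id refers to.
class Handle {
public:
    explicit Handle(int id) : id_(id) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    int id() const { return id_; }
private:
    int id_;
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "Handle cannot be copied");

// Range-based for: there is no index variable to get wrong.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```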
Coding style consistency across a project.
Not just spaces vs. tab issues, but the way that code is used. There is always more than one way to do things. When the same thing gets done differently in different places, it makes catching common errors more difficult.
It's already been mentioned here, but I'll say it again because I believe this cannot be said enough:
Unnecessary complexity is the arch nemesis of good engineering.
Keep it simple. If things start looking complicated, stop and ask yourself why and what you can do to break the problem down into smaller, simpler chunks.
Hire someone who tests/validates your software.
We have a guy who uses our software before any of our customers do. He finds bugs that our automated test processes do not find, because he thinks as a customer, not as a software developer. This guy also gives support to our customers, because he knows the software very well from the customer's point of view. INVALUABLE.
all kinds of 'trace'.
Something not mentioned yet - when there's even semi-complex logic going on, name your variables and functions as accurately as you can (but not too long). This will make incongruities in their interactions with each other, and with what they're supposed to be doing, stand out better. The 'meaning', or language-parsing part of your brain will have more to grab on to. I find that with vaguely named things, your brain sort of glosses over what's really there and sees what is /supposed to/ be happening rather than what actually is.
Also, keep the code clean; it helps to keep your brain from getting fuzzy.
Test-driven development combined with pair programming seems to work quite well at keeping some bugs down. Getting the tests created early helps work out some of the design as well as giving some confidence should someone else have to work with the code.
Creating a string representation of class state, and printing those out to console.
Note that in some cases a single-line string won't be enough; you will have to code a small printing loop that creates a multi-line representation of the class state.
Once you have "visualized" your program in such a way you can start to search for errors in it. When you know which variable contained the wrong value in the end, it's easy to place asserts everywhere this variable is assigned or modified. This way you can pinpoint the exact place of the error and fix it without step-by-step debugging (which is a rather slow way to find bugs, IMO).
Just yesterday I found a really nasty bug without debugging a single line:
vector<string> vec;
vec.push_back("test1");
vec.push_back(vec[0]); // in my build the second element was an empty string after this, not "test1"
I just kept placing assert statements and restarting the program until the multi-line representation of the program's state was correct.
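A minimal sketch of the technique described above, assuming a made-up Inventory class (the class, its members and the output format are all hypothetical): a toString() method builds a multi-line dump of the state, and an assert guards the place where the state is modified.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical class, used only to illustrate the technique.
class Inventory {
public:
    void add(const std::string& item) {
        // Assert at every place the state is modified, so a bad value is
        // caught where it enters rather than where it is finally noticed.
        assert(!item.empty() && "items must have a name");
        items_.push_back(item);
    }

    // Multi-line representation of the object's state, one element per line.
    std::string toString() const {
        std::ostringstream out;
        out << "Inventory (" << items_.size() << " items)\n";
        for (std::size_t i = 0; i < items_.size(); ++i) {
            out << "  [" << i << "] \"" << items_[i] << "\"\n";
        }
        return out.str();
    }

private:
    std::vector<std::string> items_;
};
```

Printing inv.toString() to the console after each suspicious step shows at a glance which element went wrong. Note that the assert is compiled out when NDEBUG is defined, so this only helps in debug builds.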

Reducing defect injection rates in large software development projects

In most software projects, defects originate from requirements, design, coding and defect corrections. From my experience the majority of defects originate from the coding phase.
I am interested in finding out what practical approaches software developers use to reduce defect injection rates.
I have seen the following approaches used with varying levels of success and associated cost:
code inspections
unit tests
static code analysis tools
use of programming style
peer programming
In my experience it has been the fault of the process, not the developers, that permits defects. See "They Write the Right Stuff" on how the process affects bugs.
Competitive Testing
Software developers should aspire to prevent testers from finding issues with the software they have written. Testers should be rewarded (does not have to be financial) for finding issues with software.
Sign Off
Put a person in charge of the software who has a vested interest in making sure the software is devoid of issues. The software is not shipped until that person is satisfied.
Requirements
Avoid changing requirements. Get time estimates from developers for how long it will take to implement the requirements. If the time does not match the required delivery schedule, do not hire more developers. Instead, eliminate some features.
Task Switching
Allow developers to complete the task they are working on before assigning them to another. After coming back to an interrupted task, much time is spent getting familiar with where the task was abandoned and what remaining items are required to complete it. Along the way, certain technical details can be missed.
Metrics
Gather as many metrics as you can: lines of code per method and per class, dependency relationships, and others.
Standards
Ensure everyone is adhering to corporate standards, including:
Source code formatting. This can be automated, and is not a discussion.
Naming conventions (variables, database entities, URLs, and such). Use tools when possible, and weekly code reviews to enforce.
Code must compile without warnings. Note and review all exceptions.
Consistent (re)use of APIs, both internally and externally developed.
Independent Review
Hire a third-party to perform code reviews.
Competent Programmers
Hire the best programmers you can afford. Let go of the programmers who shirk corporate standards.
Disseminate Information
Hold review sessions where developers can share (with the entire team) their latest changes to the framework(s). Allow them freedom to deprecate old portions of the code in favour of superior methods.
Task Tracking
Have developers log how long (within brackets of 15 minutes) each task has taken them. This is not to be used to measure performance, and must be stressed that it has no relation to review or salary. It is simply a measure of how long it takes certain technical tasks to be implemented. From there you can see, generally, how much time is being spent on different aspects of the system. This will allow you to change focus, if necessary.
Evaluate the Process
If many issues are still finding their way into the software, consider reevaluating the process with which the software is being developed. Metrics will help pinpoint the areas that need to be addressed.
First, bugs injected at requirements time are far, far more costly than coding bugs. A zero-value requirement, correctly implemented is a piece of zero-value, unused (or unusable) functionality.
Two things reduce the incidence of bugs
Agility. You are less likely to inject bugs at every step (requirements, design, etc.) if you aren't doing as much in each step. If you try to write all the requirements, you will make terrible mistakes. If you try to write requirements for the next sprint, odds are better that you will get those few requirements correct.
TDD. You are less likely to struggle with bad requirements or bad design if you have to write a test first. If you can't figure out what you're testing, you have a requirements bug. Stop coding. Step away from the keyboard.
If you can't figure out how to test it, you have a design bug. Again, stop coding. Fix the design so it's testable. Then move forward.
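As a rough sketch of the test-first order described above: the requirement below ("orders of 100 units or more get a 10% discount") and the function discountedTotalCents are invented purely for illustration. The test is written before the implementation, which forces the requirement to be stated precisely.

```cpp
#include <cassert>

// Hypothetical requirement, invented for this sketch:
// "Orders of 100 units or more get a 10% discount on the total."
// Declared first so the test below can be written before the implementation.
int discountedTotalCents(int unitPriceCents, int quantity);

// Step 1: the test comes first. Writing it forces the requirement to be
// precise (is the threshold inclusive? what happens with an empty order?).
inline void testDiscount() {
    assert(discountedTotalCents(200, 99) == 19800);   // below threshold: no discount
    assert(discountedTotalCents(200, 100) == 18000);  // at threshold: 10% off
    assert(discountedTotalCents(200, 0) == 0);        // empty order
}

// Step 2: the simplest implementation that makes the test pass.
int discountedTotalCents(int unitPriceCents, int quantity) {
    int total = unitPriceCents * quantity;
    if (quantity >= 100) {
        total -= total / 10;
    }
    return total;
}
```

If nobody can say whether 100 units is inside or outside the discount, the second assert cannot be written, and that is the requirements bug surfacing before any production code exists.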
I think defect injection can come from a lot of sources, and it varies from environment to environment.
You can use a lot of best practices like TDD, DDD, pair programming, continuous integration, etc. But you will never be free from bugs, because what creates bugs is people, not the processes themselves.
But IMO, using a bug tracker tool could bring you hints of which problem is more recurrent. From there, you can start attacking your main problem.
The majority of defects may occur during coding, but the impact of coding defects is generally much lower than the impact of errors made during the process of understanding requirements and while developing a resilient architecture. Thus the use of short executable-producing iterations focused on
identifying and correcting ambiguous, imprecise, or just plain incorrect requirements
exposing a suboptimal and/or brittle architecture
can save enormous amounts of time and collective stomach lining in a project of significant scope.
Unit testing, scenario testing, and static analysis tools can detect defects after they are created, but to reduce the number of defects created in the first place, reduce the number of interruptions that developers must endure:
reduce, eliminate, and/or consolidate meetings
provide an interrupt-free working environment
allow developers to correct their defects when they find them (while the responsible code is still fresh in their mind) rather than defer them to a later time when context must be re-established
Step 1 - Understand where your defects are being injected.
Use a technique such as Orthogonal Defect Classification (ODC) to measure when in the software lifecycle defects are injected and when they are detected. Once you know when the defects are injected and have identified when they were discovered you can start to understand the gaps in your process with respect to defect injection and removal.
Step 2 - Develop defect "filters" and adapt your process
Once you know when defects are being injected you can devise strategies to prevent them from entering the system. Different strategies are effective at different points in the software lifecycle. For example, static analysis tools don't help with defects that originated in the requirements. Instead you should be looking into some kind of peer review or inspection, maybe even changing the way requirements are specified so that you can use automated analysis or achieve a more meaningful sign-off, etc.
Generally I use a combination of inspection, static analysis, and testing (many different kinds) to filter as many bugs as I can, as soon after they are injected as I am able.
In addition:
Project knowledge base. It says how we do activity X (like 'form validation') in this project. This allows unification and re-use of tested solutions, preventing the bugs injected when re-inventing the wheel.
Production bug monitoring. When a production bug occurs it is investigated. Why was this bug not caught? How can we ensure that this won't happen again? Then we change the process accordingly.

Why do code quality discussions evoke strong reactions? [closed]

Closed 9 years ago.
I like my code being in order, i.e. properly formatted, readable, designed, tested, checked for bugs, etc. In fact I am fanatic about it. (Maybe even more than fanatic...) But in my experience actions helping code quality are hardly implemented. (By code quality I mean the quality of the code you produce day to day. The whole topic of software quality with development processes and such is much broader and not the scope of this question.)
Code quality does not seem popular. Some examples from my experience include
Probably every Java developer knows JUnit, almost all languages implement xUnit frameworks, but in all companies I know, only very few proper unit tests existed (if at all). I know that it's not always possible to write unit tests due to technical limitations or pressing deadlines, but in the cases I saw, unit testing would have been an option. If a developer wanted to write some tests for his/her new code, he/she could do so. My conclusion is that developers do not want to write tests.
Static code analysis is often played around with in small projects, but not really used to enforce coding conventions or find possible errors in enterprise projects. Usually even compiler warnings like potential null pointer access are ignored.
Conference speakers and magazines would talk a lot about EJB3.1, OSGI, Cloud and other new technologies, but hardly about new testing technologies or tools, new static code analysis approaches (e.g. SAT solving), development processes helping to maintain higher quality, how some nasty beast of legacy code was brought under test, ... (I did not attend many conferences and it probably looks different for conferences on agile topics, as unit testing and CI and such has a higher value there.)
So why is code quality so unpopular/considered boring?
EDIT:
Thank you for your answers. Most of them concern unit testing (which has been discussed in a related question). But there are lots of other things that can be used to keep code quality high (see related question). Even if you are not able to use unit tests, you could use a daily build, add some static code analysis to your IDE or development process, try pair programming or enforce reviews of critical code.
One obvious answer for the Stack Overflow part is that it isn't a forum. It is a database of questions and answers, which means that we try to avoid duplicate questions.
How many different questions about code quality can you think of? That is why there aren't 50,000 questions about "code quality".
Apart from that, anyone claiming that conference speakers don't want to talk about unit testing or code quality clearly needs to go to more conferences.
I've also seen more than enough articles about continuous integration.
"There are the common excuses for not writing tests, but they are only excuses. If one wants to write some tests for his/her new code, then it is possible"
Oh really? Even if your boss says "I won't pay you for wasting time on unit tests"?
Even if you're working on some embedded platform with no unit testing frameworks?
Even if you're working under a tight deadline, trying to hit some short-term goal, even at the cost of long-term code quality?
No. It is not "always possible" to write unit tests. There are many many common obstacles to it. That's not to say we shouldn't try to write more and better tests. Just that sometimes, we don't get the opportunity.
Personally, I get tired of "code quality" discussions because they tend to
be too concerned with hypothetical examples, and are far too often the brainchild of some individual who really hasn't considered how applicable it is to other people's projects, or to codebases of different sizes than the one he's working on,
tend to get too emotional, and imbue our code with too many human traits (think of the term "code smell", for a good example),
be dominated by people who write horrible bloated, overcomplicated and verbose code with far too many layers of abstraction, or who'll judge whether code is reusable by "it looks like I can just take this chunk of code and use it in a future project", rather than the much more meaningful "I have actually been able to take this chunk of code and reuse it in different projects".
I'm certainly interested in writing high quality code. I just tend to be turned off by the people who usually talk about code quality.
Code review is not an exact science. The metrics used are somewhat debatable. Somewhere on that page: "You can't control what you can't measure"
Suppose that you have one huge function of 5000 lines with 35 parameters. You can unit test it as much as you want, and it might do exactly what it is supposed to do, whatever the inputs are. So based on unit testing, this function is "perfect". Besides correctness, there are tons of other quality attributes you might want to measure: performance, scalability, maintainability, usability and such. Did you ever wonder why software maintenance is such a nightmare?
Quality control in real software projects goes far beyond simply checking whether the code is correct. If you check the V-Model of software development, you'll notice that coding is only a small part of the whole equation.
Software quality control can account for as much as 60% of the whole cost of your project. This is huge. Instead, people prefer to cut it to 0% and go home thinking they made the right choice. I think the real reason why so little time is dedicated to software quality is that software quality isn't well understood.
What is there to measure?
How do we measure it?
Who will measure it?
What will I gain/lose from measuring it?
Lots of coder sweatshops do not realise the relation between "less bugs now" and "more profit later". Instead, all they see is "time wasted now" and "less profit now". Even when shown pretty graphics demonstrating the opposite.
Moreover, software quality control and software engineering as a whole is a relatively new discipline. A lot of the programming space so far has been taken by cyber cowboys. How many times have you heard that "anyone" can program? Anyone can write code that's for sure, but it's not everyone who can be a programmer.
EDIT:
I've come across this paper (PDF) which is from the guy who said "You can't control what you can't measure". Basically he's saying that controlling everything is not as desirable as he first thought it would be. It is not an exact cooking recipe that you can blindly apply to all projects like the software engineering schools want to make you think. He just adds another parameter to control which is "Do I want to control this project? Will it be needed?"
Laziness / Considered boring
Management feeling it's unnecessary - Ignorant "Just do it right" attitude.
"This small project doesn't need code quality management" turns into "Now it would be too costly to implement code quality management on this large project"
I disagree that it's dull though. A solid unit testing design makes creating tests a breeze and running them even more fun.
Calculating vector flow control - PASSED
Assigning flux capacitor variance level - PASSED
Rerouting superconductors for faster dialing sequence - PASSED
Running Firefly hull checks - PASSED
Unit tests complete. 4/4 PASSED.
Like anything it can get boring if you do too much of it but spending 10 or 20 minutes writing some random tests for some complex functions after several hours of coding isn't going to suck the creative life from you.
Why is code quality so unpopular?
Because our profession is unprofessional.
However, there are people who do care about code quality. You can find like-minded people, for example, in the Software Craftsmanship movement's discussion group. But unfortunately the majority of people in the software business do not understand the value of code quality, or do not even know what makes up good code.
I guess the answer is the same as to the question 'Why is code quality not popular?'
I believe the top reasons are:
Laziness of the developers. Why invest time in preparing unit tests or reviewing the solution if it's already implemented?
Improper management. Why ask the developers to care about code quality if there are thousands of new feature requests and the programmers could simply implement something new instead of taking care of the quality of something already implemented?
Short answer: It's one of those intangibles only appreciated by other, mainly experienced, developers and engineers unless something goes wrong. At which point managers and customers are in an uproar and demand why formal processes weren't in place.
Longer answer: This short-sighted approach isn't limited to software development. The American automotive industry (or what's left of it) is probably the best example of this.
It's also harder to justify formal engineering processes when projects start their life as one-off or throw-away. Of course, long after the project is done, it takes a life of its own (and becomes prominent) as different business units start depending on it for their own business process.
At which point a new solution needs to be engineered; but without practice in using these tools and good-practices, these tools are less than useless. They become a time-consuming hindrance. I see this situation all too often in companies where IT teams are support to the business, where development is often reactionary rather than proactive.
Edit: Of course, these bad habits and many others are the real reason consulting firms like Thought Works can continue to thrive as well as they do.
One big factor that I didn't see mentioned yet is that any process improvement (unit testing, continuous integration, code reviews, whatever) needs to have an advocate within the organization who is committed to the technology, has the appropriate clout within the organization, and is willing to do the work to convince others of the value.
For example, I've seen exactly one engineering organization where code review was taken truly seriously. That company had a VP of Software who was a true believer, and he'd sit in on code reviews to make sure they were getting done properly. They incidentally had the best productivity and quality of any team I've worked with.
Another example is when I implemented a unit-testing solution at another company. At first, nobody used it, despite management insistence. But several of us made a real effort to talk up unit testing, and to provide as much help as possible for anyone who wanted to start unit testing. Eventually, a couple of the most well-respected developers signed on, once they started to see the advantages of unit testing. After that, our testing coverage improved dramatically.
I just thought of another factor - some tools take a significant amount of time to get started with, and that startup time can be hard to come by. Static analysis tools can be terrible this way - you run the tool, and it reports 2,000 "problems", most of which are innocuous. Once you get the tool configured properly, the false-positive problem gets substantially reduced, but someone has to take that time, and be committed to maintaining the tool configuration over time.
Probably every Java developer knows JUnit...
While I believe most or many developers have heard of JUnit/nUnit/other testing frameworks, fewer know how to write a test using such a framework. And from those, very few have a good understanding of how to make testing a part of the solution.
I've known about unit testing and unit test frameworks for at least 7 years. I tried using them in a small project 5-6 years ago, but it is only in the last few years that I've learned how to do it right (i.e., found a way that works for me and my team...).
For me some of those things were:
Finding a workflow that accommodates unit testing.
Integrating unit testing in my IDE, and having shortcuts to run/debug tests.
Learning how to test what. (Like how to test logging in or accessing files. How to abstract yourself from the database. How to do mocking and use a mocking framework. Learn techniques and patterns that increase testability.)
Having some tests is better than having no tests at all.
More tests can be written later when a bug is discovered. Write the test that proves the bug, then fix the bug.
You'll have to practice to get good at it.
So until finding the right way; yeah, it's dull, non rewarding, hard to do, time consuming, etc.
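The "write the test that proves the bug, then fix the bug" item in the list above can be sketched as follows. The trim function and the bug report (an all-spaces input coming back unchanged) are hypothetical, invented only to show the shape of such a regression test.

```cpp
#include <cassert>
#include <string>

// Hypothetical function with a reported bug: an input consisting only of
// spaces used to come back unchanged instead of empty.
// This is the fixed version; the regression test below was written first
// and failed against the old code.
inline std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos) {
        return "";  // the fix: all-space (or empty) input yields an empty string
    }
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Regression test that pins the bug down so it cannot silently return.
inline void testTrimAllSpaces() {
    assert(trim("   ") == "");        // the originally reported case
    assert(trim("  a b ") == "a b");  // normal behaviour still works
    assert(trim("") == "");           // neighbouring edge case
}
```

The test suite then grows exactly where the code has already proven fragile, which is a cheap way to get from "no tests" to "some tests".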
EDIT:
In this blogpost I go in depth on some of the reasons given here against unit testing.
Code Quality is unpopular? Let me dispute that fact.
Conferences such as Agile 2009 have a plethora of presentations on Continuous Integration, and testing techniques and tools. Technical conference such as Devoxx and Jazoon also have their fair share of those subjects.
There is even a whole conference dedicated to Continuous Integration & Testing (CITCON, which takes place 3 times a year on 3 continents).
In fact, my personal feeling is that those talks are so common, that they are on the verge of being totally boring to me.
And in my experience as a consultant, consulting on code quality techniques & tools is actually quite easy to sell (though not very highly paid).
That said, though I think that Code Quality is a popular subject to discuss, I would rather agree with the fact that developers do not (in general) do good, or enough, tests. I do have a reasonably simple explanation to that fact.
Essentially, it boils down to the fact that those techniques are still reasonably new (TDD is 15 years old, CI less than 10) and they have to compete with 1) managers, 2) developers whose ways "have worked well enough so far" (whatever that means).
In the words of Geoffrey Moore, modern Code Quality techniques are still early in the adoption curve. It will take time until the entire industry adopts them.
The good news, however, is that I now meet developers fresh from university that have been taught TDD and are truly interested in it. That is a recent development. Once enough of those have arrived on the market, the industry will have no choice but to change.
It's pretty simple when you consider the engineering adage "Good, Fast, Cheap: pick two". In my experience 98% of the time, it's Fast and Cheap, and by necessity the other must suffer.
It's the basic psychology of pain. When you're running to meet a deadline, code quality takes the last seat. We hate it because it's dull and boring.
It reminds me of this Monty Python skit:
"Exciting? No it's not. It's dull. Dull. Dull. My God it's dull, it's so desperately dull and tedious and stuffy and boring and des-per-ate-ly DULL. "
I'd say for many reasons.
First of all, if the application/project is small or carries no really important data at a large scale, the time needed to write the tests is better used to write the actual application.
There is a threshold where the quality requirements are of such a level that unit testing is required.
There is also the problem of many methods not being easily testable. They may rely on data in a database or similar, which creates the headache of setting up mockup data to be fed to the methods. Even if you set up mockup data - can you be certain the database would behave the same way?
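One common way to reduce the mock-data headache is to put the database behind a small interface and hand the code under test an in-memory stand-in. The sketch below assumes invented names throughout (UserStore, FakeUserStore, greeting). It does not answer the question of whether the real database behaves the same way; that still needs a separate integration test.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface that hides the real database from the code under test.
class UserStore {
public:
    virtual ~UserStore() = default;
    // Returns true and fills `name` if the user exists.
    virtual bool findName(int userId, std::string& name) const = 0;
};

// In-memory stand-in used only by tests; no database required.
class FakeUserStore : public UserStore {
public:
    void add(int userId, const std::string& name) { users_[userId] = name; }

    bool findName(int userId, std::string& name) const override {
        const auto it = users_.find(userId);
        if (it == users_.end()) {
            return false;
        }
        name = it->second;
        return true;
    }

private:
    std::map<int, std::string> users_;
};

// Code under test: depends only on the interface, not on a database.
inline std::string greeting(const UserStore& store, int userId) {
    std::string name;
    if (store.findName(userId, name)) {
        return "Hello, " + name + "!";
    }
    return "Hello, guest!";
}

// Unit test using the fake instead of a real database.
inline void testGreeting() {
    FakeUserStore store;
    store.add(1, "Ada");
    assert(greeting(store, 1) == "Hello, Ada!");
    assert(greeting(store, 2) == "Hello, guest!");
}
```

The production build would supply a second UserStore implementation that talks to the actual database; only that thin class needs the slow, database-backed tests.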
Unit testing is also weak at finding problems that haven't been considered. That is, unit testing is bad at simulating the unexpected. If you haven't considered what could happen in a power outage, or if the network link sends bad data that is still CRC-correct, writing tests for it is futile.
I am all in favour of code inspections, as they let programmers share experience and pick up code style from other programmers.
"There are the common excuses for not writing tests, but they are only excuses."
Are they? Get eight programmers in a room together, ask them a question about how best to maintain code quality, and you're going to get nine different answers, depending on their age, education and preferences. 1970s era Computer Scientists would've laughed at the notion of unit testing; I'm not sure they would've been wrong to.
Management needs to be sold on the value of spending more time now to save time down the road. Since they can't actually measure "bugs not fixed", they're often more concerned about meeting their immediate deadlines and ship date than the long-term quality of the project.
Code quality is subjective. Subjective topics are always tedious.
Since the goal is simply to make something that works, code quality always comes in second. It adds time and cost. (I'm not saying that it should not be considered a good thing though.)
99% of the time, there are no third-party consequences for poor code quality (unless you're making space shuttle or train switching software).
Does it work? = Concrete.
Is it pretty? = In the eye of the beholder.
Read Fred Brooks' The Mythical Man Month. There is no silver bullet.
Unit testing takes extra work. If a programmer sees that his product "works" (e.g., without unit testing), why do any at all? Especially when it is not nearly as interesting as implementing the next feature in the program, etc. Most people just tend to be lazy when it comes down to it, which isn't quite a good thing...
Code quality is context specific and hard to generalize no matter how much effort people try to make it so.
It's similar to the difference between theory and application.
I also have not seen unit tests written on a regular basis. The reason given was that the code changed too extensively at the beginning of the project, so everyone stopped writing unit tests until everything had stabilized. After that everyone was happy and saw no need for unit tests. So a few tests remain as history, but they are not used and are probably not compatible with the current code.
I personally see writing unit tests for big projects as not feasible, although I admit I have not tried it nor talked to people who did. There are so many rules in business logic that if you just change something somewhere a little bit you have no way of knowing which tests to update beyond those that will crash. Who knows, the old tests may now not cover all possibilities and it takes time to recollect what was written five years ago.
The other reason is the lack of time. When you have a task assigned where it says "Completion time: 0.5 man-days", you only have time to implement it and test it shallowly, not to think of all possible cases and relations to other project parts and write all the necessary tests. It may really take 0.5 days to implement something and a couple of weeks to write the tests. Unless you were specifically given an order to create the tests, nobody will understand that tremendous loss of time, which will result in yelling/bad reviews. And no, for our complex enterprise application I cannot think of good test coverage for a task in five minutes. It will take time and probably a very deep knowledge of most application modules.
So, the reasons as I see them are the time loss, which yields no useful features, and the nightmare of maintaining/updating old tests to reflect new business rules. Even if one wanted to, only experienced colleagues could write those tests - at least one year of deep involvement in the project, but two to three are really needed. So new colleagues do not manage proper tests. And there is no point in creating bad tests.
It's 'dull' to chase some random 'feature' of extreme importance for more than a day in a mysterious code jungle written by someone else x years ago, without any clue what's going wrong or why it's going wrong, and with absolutely no idea what could fix it, when it was supposed to be done in a few hours. And when it's done, no one is satisfied because of the huge delay.
Been there - seen that.
A lot of the concepts that are emphasized in modern writing on code quality overlook the primary metric for code quality: code has to be functional first and foremost. Everything else is just a means to that end.
Some people don't feel like they have time to learn the latest fad in software engineering, and that they can write high-quality code already. I'm not in a place to judge them, but in my opinion it's very difficult for your code to be used over long periods of time if people can't read, understand and change it.
Lack of 'code quality' doesn't cost the user, the salesman, the architect nor the developer of the code; it slows down the next iteration, but I can think of several successful products which seem to be made out of hair and mud.
I find unit testing to make me more productive, but I've seen lots of badly formatted, unreadable poorly designed code which passed all its tests ( generally long-in-the-tooth code which had been patched many times ). By passing tests you get a road-worthy Skoda, not the craftsmanship of a Bristol. But if you have 'low code quality' and pass your tests and consistently fulfill the user's requirements, then that's a valid business model.
My conclusion is that developers do not want to write tests.
I'm not sure. Partly, the whole education process in software isn't test driven, and probably should be - instead of asking for an exercise to be handed in, give the unit tests to the students. It's normal in maths questions to run a check, why not in software engineering?
The other thing is that unit testing requires units. Some developers find modularisation and encapsulation difficult to do well. A good technical lead will create a modular architecture which localizes the scope of a unit, so making it easy to test in isolation; many systems don't have good architects who facilitate testability, or aren't refactored regularly enough to reduce inter-unit coupling.
It's also hard to test distributed or GUI driven applications, due to inherent coupling. I've only been in one team that did that well, and that had as large a test department as a development department.
Static code analysis is often played around with in small projects, but not really used to enforce coding conventions or find possible errors in enterprise projects.
Every set of coding conventions I've seen which hasn't been automated has been logically inconsistent, sometimes to the point of being unusable - even ones claimed to have been used 'successfully' in several projects. Non-automatic coding standards seem to be political rather than technical documents.
Usually even compiler warnings like potential null pointer access are ignored.
I've never worked in a shop where compiler warnings were tolerated.
One attitude that I have met rather often (but never from programmers that were already quality-addicts) is that writing unit tests just forces you to write more code without getting any extra functionality for the effort. And they think that that time would be better spent adding functionality to the product instead of just creating "meta code".
That attitude usually wears off as unit tests catch more and more bugs that you realize would be serious and hard to locate in a production environment.
A lot of it arises when programmers forget, or are naive, and act like their code won't be viewed by somebody else at a later date (or themselves months/years down the line).
Also, commenting isn't nearly as "cool" as actually writing a slick piece of code.
Another thing that several people have touched on is that most development engineers are terrible testers. They don't have the expertise or mind-set to effectively test their own code. This means that unit testing doesn't seem very valuable to them - since all of their code always passes unit tests, why bother writing them?
Education and mentoring can help with that, as can test-driven development. If you write the tests first, you're at least thinking primarily about testing, rather than trying to get the tests done, so you can commit the code...
The likelihood of you being replaced by a cheaper fresh-out-of-college student or outsourced worker is directly proportional to the readability of your code.
People don't have a common sense of what "good" means for code. A lot of people will drop to the level of "I ran it" or even "I wrote it."
We need to have some kind of shared sense of what good code is, and whether it matters. For the first part of that, I have written up some thoughts:
http://agileinaflash.blogspot.com/2010/02/seven-code-virtues.html
As for whether it matters, that's been covered plenty of times. It matters quite a lot if your code is to live very long. If it really won't ever sell or won't be deployed, then it clearly doesn't. If it's not worth doing, it's not worth doing well.
But if you don't practice writing virtuous code, then you can't do it when it matters. I think people have practiced doing poor work, and don't know anything else.
I think code quality is over-rated. The more I do it, the less it means to me. Code quality frameworks prefer over-complicated code. You never see errors like "this code is too abstract, no one will understand it", but PMD, for example, says that I have too many methods in my class. So I should either split the class into an abstract class or classes (the best way, since PMD doesn't care what I do) or split the classes based on functionality (the worst way, since a class might still have too many methods; been there).
Static analysis is really cool, but its output is just warnings. For example, FindBugs has a problem with casting, and you are supposed to use instanceof to make the warning go away. I don't do that just to make FindBugs happy.
I think code is too complicated not when a method has 500 lines of code, but when a method uses 500 other methods and many abstractions just for fun. I think code quality masters should really work on detecting when code is too complicated and not care so much about the little things (you can refactor those really quickly with the right tools).
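For anyone who hasn't run into this kind of warning, here is a rough sketch of the pattern such tools push you toward. `Circle` and `Shapes` are hypothetical names made up for this example, and the exact rule FindBugs applies isn't given above, so treat this as an illustration of the general "check before you cast" idea rather than as what the tool demands.

```java
// Hypothetical value type used only for this illustration.
final class Circle {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }
}

final class Shapes {
    // Guarded cast: the instanceof check makes the downcast provably safe.
    static double radiusOrZero(Object shape) {
        if (shape instanceof Circle) {
            Circle circle = (Circle) shape;
            return circle.radius;
        }
        // Anything that is not a Circle is treated as having no radius.
        return 0.0;
    }
}
```

Without the check, passing a non-`Circle` would throw a `ClassCastException` at runtime. Whether the guard is worth adding when you already know the type is exactly the trade-off being argued about here.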
I don't like the idea of code coverage, since it's really useless and makes unit testing boring. I always test code with complicated functionality, but only that code. I worked in a place with 100% code coverage and it was a real nightmare to change anything, because whenever you changed something you had to worry about broken (poorly written) unit tests, and you never knew what to do with them. Many times we just commented them out and added a TODO to fix them later.
I think unit testing has its place. For example, I did a lot of unit testing in my webpage parser, because I kept finding different bugs or unsupported tags. Testing database programs is really hard if you also want to test the database logic; DbUnit is really painful to work with.
I don't know. Have you seen Sonar? Sure it is Maven specific, but point it at your build and boom, lots of metrics. That's the kind of project that will facilitate these code quality metrics going mainstream.
I think the real problem with code quality or testing is that you have to put a lot of work into it and YOU get nothing back. Fewer bugs == less work? No, there's always something to do. Fewer bugs == more money? No, you have to change jobs to get more money. Unit testing is heroic; you only do it to feel better about yourself.
I work at a place where management encourages unit testing, but I am the only person who writes tests (I want to get better at it; that's the only reason I do it). I understand that for others writing tests is just more work with nothing in return. Surfing the web sounds cooler than writing tests.
Someone might break your tests and say they don't know how to fix them, or just comment them out (if you use Maven).
Frameworks are not there yet for real web-app integration testing (a unit test might pass while the page still doesn't work), so even if you write tests you still have to test manually.
You could use a framework like HtmlUnit, but it's really painful to use. Selenium breaks with every change to a webpage. SQL testing is almost impossible (you can do it with DbUnit, but first you have to provide test data for it; test data for 5 joins is a lot of work and there is no easy way to generate it). I don't know about your web framework, but the one we are using really likes static methods, so you really have to work to test the code.
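One common workaround for the static-method problem is to hide the static call behind a tiny interface. The sketch below assumes a made-up `FrameworkSession` class standing in for whatever static API the web framework exposes; `UserLookup` and `WelcomeBanner` are likewise hypothetical names, not part of any real framework.

```java
// Stand-in for a framework class that only offers static access (hypothetical).
final class FrameworkSession {
    static String currentUser() {
        return "real-user";
    }
}

// Small seam: the only thing our own code needs from the framework.
interface UserLookup {
    String currentUser();
}

final class WelcomeBanner {
    private final UserLookup users;

    // Production path: delegate to the framework's static method.
    WelcomeBanner() {
        this(FrameworkSession::currentUser);
    }

    // Test path: any lambda or stub can stand in for the framework.
    WelcomeBanner(UserLookup users) {
        this.users = users;
    }

    String text() {
        return "Welcome, " + users.currentUser();
    }
}
```

In a test, `new WelcomeBanner(() -> "alice").text()` exercises the logic without touching the framework at all. It is still extra work, which is the point being made above, but it keeps that work in one place.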