Mutation testing has been out there for a while now, and it seems there are at least one or two commercial mutation testing frameworks for C/C++. Have you used them? What are your experiences? Are there any open source alternatives?
A brief search resulted in:
PlexTest:
http://www.itregister.com.au/products/plextest_detail.htm
Insure++:
http://www.parasoft.com/jsp/products/insure.jsp;jsessionid=baacpvbaDywLID?itemId=63
MILU (may be only for C):
http://www.dcs.kcl.ac.uk/pg/jiayue/milu/
With that said, you need to realize that mutation testing isn't particularly useful (at least according to what I've read). It's an interesting tool when faced with hard (metaphorically speaking) asserts and for making sure that data requirements are heeded (when dealing with if-and-only-if situations).
In my opinion, there are much more established ways of analyzing the robustness of code.
Notice that Parasoft's tool only generates equivalent mutations. That echoes the problem described in the Wikipedia article on mutation testing: it is hard to distinguish between equivalent and non-equivalent mutations, so they decided to stick with equivalent ones.
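To make the equivalent/non-equivalent distinction concrete, here is a minimal hand-written sketch. The function names are hypothetical and the mutants were written by hand for illustration; this is not the output of any of the tools mentioned.

```cpp
#include <cassert>

// Original function under test.
int max_of(int a, int b) { return (a > b) ? a : b; }

// Non-equivalent mutant: '>' flipped to '<'. The function now returns the
// smaller value, so a reasonable test suite should fail on it ("kill" it).
int max_of_mutant(int a, int b) { return (a < b) ? a : b; }

// Equivalent mutant: '>' replaced by '>='. When a == b both branches return
// the same value, so no test can tell this version from the original.
int max_of_equivalent(int a, int b) { return (a >= b) ? a : b; }

// The same test body, parameterised over the implementation being checked.
// Returns true when the implementation passes.
bool passes_test(int (*impl)(int, int)) {
    return impl(2, 5) == 5 && impl(7, 3) == 7 && impl(4, 4) == 4;
}

// What a mutation run would report for this tiny suite.
void check_mutants() {
    assert(passes_test(max_of));             // original passes
    assert(!passes_test(max_of_mutant));     // mutant is killed
    assert(passes_test(max_of_equivalent));  // equivalent mutant survives
}
```

A surviving mutant is only a sign of a weak test suite when it is non-equivalent, which is why telling the two kinds apart matters.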
I tried another interesting tool that can automatically discover invariants in instrumented C and C++ code; it is called "Daikon". Essentially it does the same thing as a tool that generates equivalent mutations, but instead of identifying problematic code it gives you a set of invariants such as "A == B + 1". I think invariants are more useful: if a discovered invariant makes sense, it gives you assurance that your code is correct, and you can then convert the invariants into asserts, which gives you more confidence when you change the code.
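As a rough sketch of the "convert invariants into asserts" step: suppose a detector reported an invariant of the form "A == B + 1" for the class below. The class and the invariant are made up for illustration and are not Daikon output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical buffer that always keeps one sentinel slot at the end.
// Assumed reported invariant: capacity == size + 1 at every method exit.
class SentinelBuffer {
public:
    void push(int value) {
        data_.insert(data_.end() - 1, value);  // insert before the sentinel
        ++size_;
        check_invariant();
    }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return data_.size(); }

private:
    // The reported invariant turned into an executable check.
    void check_invariant() const { assert(data_.size() == size_ + 1); }

    std::vector<int> data_{0};  // starts with the sentinel only
    std::size_t size_ = 0;
};
```

If a later change forgets to update size_, the assert fires at the point of the mistake rather than somewhere downstream.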
A straightforward Python script for mutating C programs is available at:
https://github.com/parunbabu/mutate.py
The author says it works better if the code under test is de-commented and indented.
It is also free and open source; I think this is what you are looking for.
Mull is LLVM-based and seems to be actively developed and easy to use.
dextool mutate is also LLVM-based and actively developed. It is more complicated to use but has more features, such as re-running alive mutants and mutating only the changes introduced in a git diff.
I have no experience with it but Mutate++ seems to be an option that is missing from the ones already mentioned.
Mutate++ - A C++ Mutation Test Environment
The existing frameworks were way too time-consuming to set up and use, so I did my own implementation: a quick and easy solution that should work on any machine. There are binaries available for MacOSX, Windows and RaspberryPi (Linux):
https://github.com/RagnarDa/dumbmutate
Hope it helps anyone!
I'm developing a cross-platform project that would support:
Four C++ compilers: GCC, MSVC, SunStudio, Intel.
Five operating systems: Linux, OpenSolaris, FreeBSD, Windows, Mac OS X.
I totally understand that without proper unit testing there is no chance of performing proper QA on all these platforms.
However, as you all know, writing unit tests is extremely boring and slows down the development process (because it is boring, and development of FOSS software shouldn't be).
How do you manage to write good unit-testing code without stopping writing code?
If you at least get a salary for this, you can say "at least I get something for this", but if you don't, it is much harder!
Clarification:
I understand that TDD should be the key, but TDD has the following very strict restrictions:
You have exact specifications.
You have a fully defined API.
This is true for a project that is developed in customer-provider style, but it can't be done for a project that evolves.
Sometimes, to decide what feature I need, I have to create something and see whether it works well, whether the API is suitable and helps me, or whether it is ugly and does not satisfy me.
I see the development process more as evolution and less as development according to specifications, because when I begin implementing some feature I sometimes do not know whether it will work well or what model it will use.
This is quite a different style of development, and it contradicts TDD.
On the other hand, supporting a wide range of systems requires unit tests to make sure that existing code works on the various platforms, so that if I want to support a new one I only need to compile the code and run the tests.
I suggest doing no unit testing at all. Work a bit on the project and see where it leads. If you cannot summon enough motivation to do the obviously right thing, then work a bit on your problem: do some refactoring, some bug fixing and multiple releases. Once you see what kinds of problems pop up, think of TDD as one of the possible tools to solve them.
The problems can be
low quality
high bug fixing costs
reluctance to refactor (i.e. fear to change existing code)
suboptimal APIs (APIs are used too late to change)
high testing costs (i.e. manual testing)
There is a big difference between theoretically knowing that unit testing and test first are the right approaches and experiencing the pain and learning from that experience. Motivation will come with this experience.
TDD is not a panacea. It can be implemented in a horrible fashion. It should not become a check box in your project check list.
Personally, I don't find testing boring. It's the first time I get to see my code actually run and find out whether it works or not!
Without some form of test program to run the new code directly, I wouldn't get to see it run until after I've built a user interface and wired it all together to make the new bits available through the UI and then, when it doesn't work the first time, I have to try to debug the new code, plus the UI, plus the glue that holds them together and dear god, I don't even know what layer the bug is in, never mind trying to identify the actual offending code. And even that much is assuming I still remember what I was working on before I went off on an excursion into UI-land.
A proper test harness bypasses all that and lets me just call the new code, localize any bugs to the tested section of code so they can be found quickly and fixed easily, see that it produces the right results, get my "it works!" rush, and move on to the next bit of code and my next rush of reward as quickly as possible.
Write them by jumping back and forth: from unit test to code to unit test to code, and so on.
Unit tests should follow all the best practices of production code, such as the DRY principle. If you get bored writing unit tests, you will also get bored writing production code.
Test-Driven Development (TDD) may help you, though, as you constantly switch back and forth between writing a unit test and then a bit of production code.
As others have told you: writing the tests first makes it fun. Your statement that it can't be done for a project that evolves needs to be reconsidered. Actually the opposite is true. If you are going the agile route, you are strongly discouraged from defining everything up front. TDD fits a paradigm which assumes that this is impossible and that change will happen. A book that makes this very clear, and gives examples of it, is Applying UML and Patterns.
Try using TDD (Test Driven Development): instead of writing your tests after the actual coding is done, write them before and let them drive your design.
Due to the nature of the project a fair amount of automation is required - find a way to write the test once for one OS/compiler and then run it for all of the other options available.
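One way to read "write the test once" is to keep the tests free of anything beyond the standard library, so the same file builds with every compiler on the list. A minimal sketch under that assumption; join_path, CHECK and run_all_tests are hypothetical names, not part of any framework.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical function under test; stands in for real project code.
std::string join_path(const std::string& dir, const std::string& file) {
    assert(!file.empty());  // precondition of this sketch
    if (dir.empty()) return file;
    if (dir.back() == '/') return dir + file;
    return dir + "/" + file;
}

// Minimal check helper: reports the failing expression and counts failures.
static int g_failures = 0;
#define CHECK(expr)                                                       \
    do {                                                                  \
        if (!(expr)) {                                                    \
            std::printf("FAILED %s:%d: %s\n", __FILE__, __LINE__, #expr); \
            ++g_failures;                                                 \
        }                                                                 \
    } while (0)

// Runs every check and returns the number of failures, so a build script
// can use the value as the exit code on each compiler/OS combination.
int run_all_tests() {
    CHECK(join_path("", "a.txt") == "a.txt");
    CHECK(join_path("usr", "a.txt") == "usr/a.txt");
    CHECK(join_path("usr/", "a.txt") == "usr/a.txt");
    return g_failures;
}
```

A main that simply returns run_all_tests() is then all each platform's build needs to compile and run.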
Personally, I find writing code that I know works is quite exhilarating. But if you don't want to be bored writing unit tests then you'll need to cultivate a fascination for debugging.
To be serious, if you think that writing unit tests is boring and slow, you really need to re-address how you write them. I suggest you investigate using Test Driven Development. Write the tests in the programming language and run them automatically. Use the feedback from the tests to shape your code.
There are Test First frameworks for pretty much any language you care to mention, inspired by Kent Beck and Erich Gamma's work with JUnit. The Wikipedia article on TDD has more info, including a helpful link to a list of frameworks organized by language. Find out more.
I got a task related to an ANCIENT C++ project which has no documentation or comments at all, and all the code/variables are written in a foreign language. Do I have a chance of analyzing this code in one working day and producing a design/UML to create new features? I have been sitting here for 3 hours already and I feel so frustrated... Has anybody else had the same problem? Any advice?
BR,
I suspect the biggest issue may be the fact that it's in a foreign language. You can use various static code analysis tools to try and understand what's going on, but if everything is presented in an unfamiliar language then that's still no use. Your first step (I believe) is to find someone who can speak this language and get them to translate as you go...
1) Use Doxygen. You can configure Doxygen to extract the code structure from undocumented source files.
2) Use Source Insight. Source Insight is an advanced code editor and browser with built-in analysis for C/C++, C#, and Java programs.
Short answer, no - you probably don't have a chance to understand the code in one day. Reading/maintaining code is one of the hardest things to do, especially when it's lacking documentation. The fact that the code is in a foreign language (!) makes it even harder.
Sounds like you are on a very restricted (unrealistic) time-budget, but Working With Legacy Software is a good book if you're working with legacy systems. If you are planning to keep adding new features to the legacy system it's your responsibility to make your management aware of the scope of the operation. Or at least try.
Under this time constraint (1 day) it may or may not be doable depending on the size of the project: if it's a few hundred lines of code, then for sure. If it's a serious project with several tens of thousands of lines of code, then likely not.
The first thing you need to know is what this program is supposed to do at all. If you have no idea what it does and how it does it, then analyzing the code will give you the answer, but it will be a long and frustrating task. So my first suggestion would be to familiarize yourself with the outer workings of the software: what it is supposed to do and, generally, how it is supposed to do it. If you are doing this as part of your work, then you should be able to get someone to walk you through using the program, even if its UI is in a foreign language (which I hope it isn't, even if the code was written by a foreign language speaker).
Once you know what the software is attempting to do, it should be fairly straightforward (even if lengthy and daunting) to rewrite all the comments in your own language so that you can understand them. I suggest doing so in a bottom-up approach: it's easier to understand the small and trivial things a program does than to understand the top-level logic, and a lot of trivial things together make up the logic of the software.
Only once you understand the inner workings of the program, to a large degree anyway, can you write its functional spec and work on features.
Non-free way on Windows:
You can use CppDepend. This application is able to parse your Visual Studio project or your source files. It gives you a lot of information, such as dependency trees. You can try the trial version (maybe it will be enough for what you have to do).
Free way multi-platform:
You can use doxygen with a special configuration (extract code structure from undocumented code) and analyze the result.
I was quite happy with a tool called Understand (15-day eval license available) for this kind of task. However, I agree with Guss that the time you'll need depends a lot on the size of the code, and one day is probably just enough for a small program.
cscope & ctags are a must when I work on my own code, and even more so when looking at others' code.
You may also try this:
http://www.sgvsarc.com/product_crystalflow.htm
I don't mean external tools. I think of architectural patterns, language constructs, habits. I am mostly interested in C++
Automated Unit Testing.
There's an oft-unappreciated technique that I like to call The QA Team that can do wonders for weeding out bugs before they reach production.
It's been my experience (and is often quoted in textbooks) that programmers don't make the best testers, despite what they may think, because they tend to test for behaviour they already know to be true from their coding. On top of that, they're often not very good at putting themselves in the shoes of the end user (if it's that kind of app), and so are likely to neglect UI formatting/alignment/usability issues.
Yes, unit testing is immensely important and I'm sure others can give you better tips than I on that, but don't neglect your system/integration testing. :)
..and hey, it's a language independent technique!
Code Review, Unit Testing, and Continuous Integration may all help.
I find the following rather handy.
1) ASSERTs.
2) A debug logger that can output to the debug spew, console or file.
3) Memory tracking tools.
4) Unit testing.
5) Smart pointers.
I'm sure there are tonnes of others but I can't think of them off the top of my head :)
RAII to avoid resource leakage errors.
Strive for simplicity and conciseness.
Never leave cases where your code behavior is undefined.
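The RAII point above can be sketched as follows. The "resource" here is just a counter so the example stays self-contained; Handle and process are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a count of currently open handles.
int g_open_handles = 0;

// RAII wrapper: acquire in the constructor, release in the destructor.
class Handle {
public:
    Handle() { ++g_open_handles; }
    ~Handle() {
        assert(g_open_handles > 0);  // a handle must exist to be released
        --g_open_handles;
    }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

// Every exit path (normal return, early return, exception) releases the
// handle automatically, because the destructor runs when h leaves scope.
int process(int input) {
    Handle h;
    if (input < 0) throw std::invalid_argument("negative input");
    if (input == 0) return 0;
    return input * 2;
}
```

There is no explicit cleanup call to forget, which is what removes the leak on the error paths.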
Look for opportunities to leverage the type system and have the compiler check as much as possible at compile time. Templates and code generation are your friends as long as you keep your common sense.
Minimize the number of singletons and global variables.
Use RAII!
Use assertions!
Automatic testing of some nominal and all corner cases.
Avoid last minute changes like the plague.
I use thinking.
Reduce variable scope to be as narrow as possible. Fewer variables in the outer scope means fewer chances to plant and hide an error.
I have found that the more is done and checked at compile time, the less can possibly go wrong at run time. So I try to leverage techniques that allow stricter checking at compile time. That's one of the reasons I went into template metaprogramming. If you do something wrong, it doesn't compile and thus never leaves your desk (and thus never arrives at the customer's).
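A small sketch of pushing checks to compile time, using distinct types for distinct units. The types and the speed function are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical strongly typed quantities: mixing them up is a compile error.
struct Meters { double value; };
struct Seconds { double value; };
struct MetersPerSecond { double value; };

MetersPerSecond speed(Meters distance, Seconds time) {
    assert(time.value != 0.0);  // run-time check for what the types cannot express
    return MetersPerSecond{distance.value / time.value};
}

// Compile-time guarantees: these are verified on every build.
static_assert(!std::is_convertible<Seconds, Meters>::value,
              "units must not convert implicitly");
static_assert(std::is_same<decltype(speed(Meters{1.0}, Seconds{1.0})),
                           MetersPerSecond>::value,
              "speed() must return MetersPerSecond");

// speed(Seconds{2.0}, Meters{10.0});  // would not compile: arguments swapped
```

With plain double parameters the swapped call would compile and silently produce a wrong result.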
I find many problems before I start testing at all by using asserts.
Testing it with actual, realistic data from the start. And testing is necessary not only while writing the code, but it should start early in the design phase. Find out what your worst use cases will be like, and make sure your design can handle it. If your design feels good and elegant even against these use cases, it might actually be good.
Automated tests are great for making sure the code you write is correct. However, before you get to writing code, you have to make sure you're building the right things.
Learning functional programming helps somewhat. A good place to start is Learn You a Haskell for Great Good.
Model-View-Controller, and in general anything with contracts and interfaces that can be unit-tested automatically.
I agree with many of the other answers here.
Specific to C++, the use of 'const' and avoiding raw pointers (in favor of references and smart pointers) when possible has helped me find errors at compile time.
Also, having a "no warnings" policy helps find errors.
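A short sketch of how const and smart pointers move mistakes to compile time. Account, describe and open_account are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class Account {
public:
    explicit Account(std::string owner) : owner_(std::move(owner)) {}
    const std::string& owner() const { return owner_; }  // cannot modify *this
    void deposit(int amount) {
        assert(amount > 0);
        balance_ += amount;
    }
    int balance() const { return balance_; }

private:
    std::string owner_;
    int balance_ = 0;
};

// A const reference says "read-only and never null".
// Calling account.deposit(...) in here would fail to compile.
std::string describe(const Account& account) {
    return account.owner() + ": " + std::to_string(account.balance());
}

// Ownership is explicit: the caller gets the sole owner, no manual delete.
std::unique_ptr<Account> open_account(const std::string& owner) {
    return std::unique_ptr<Account>(new Account(owner));
}
```

Building this with warnings enabled and treated as errors is one way to apply the "no warnings" policy; the exact flags depend on the compiler.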
Requirements.
From my experience, having full and complete requirements is the number one step in creating bug-free software. You can't write complete and correct software if you don't know what it's supposed to do. You can't write proper tests for software if you don't know what it's supposed to do; you'll miss a fair amount of stuff you should test. Also, the simple process of writing the requirements helps you to flesh them out. You find so many issues and problems before you ever write the first line of code.
I find pair programming tends to help avoid a lot of the silly mistakes, and a lot of the time it generates discussions which uncover flaws. Plus, with someone free to think about why you are doing something, it tends to make everything cleaner.
Code reviews; I've personally found lots of bugs in my colleagues' code and they have found bugs in mine.
Code reviews, early and often, will help you to both understand each others' code (which helps for maintenance), and spot bugs.
The sooner you spot a bug the easier it is to fix. So do them as soon as you can.
Of course pair programming takes this to an extreme.
Using an IDE like IntelliJ that inspects my code as I write it and flags dodgy code as I write it.
Unit Testing followed by Continuous Integration.
Book suggestions: "Code Complete" and "Release It!" are two must-read books on this topic.
In addition to the things already mentioned, I believe that some features introduced with C++0x will help avoid certain bugs. Features like strongly typed enums, range-based for loops and deleting standard functions of objects come to mind.
In general strong typing is the way to go imho
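For illustration, the three features named above in one small sketch. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Strongly typed enums: no implicit conversion to int, no name clashes.
enum class Color { Red, Green };
enum class Fruit { Apple, Orange };

bool is_red(Color c) { return c == Color::Red; }
// is_red(Fruit::Apple);  // would not compile: Fruit is not Color

// Deleted functions: copying this type is rejected at compile time.
class Connection {
public:
    Connection() = default;
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};
static_assert(!std::is_copy_constructible<Connection>::value,
              "Connection must not be copyable");

// Range-based for: no index or iterator to get wrong.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Usage sketch.
void demo() {
    assert(sum({1, 2, 3}) == 6);
    assert(is_red(Color::Red));
    assert(!is_red(Color::Green));
}
```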
Coding style consistency across a project.
Not just spaces vs. tab issues, but the way that code is used. There is always more than one way to do things. When the same thing gets done differently in different places, it makes catching common errors more difficult.
It's already been mentioned here, but I'll say it again because I believe this cannot be said enough:
Unnecessary complexity is the arch nemesis of good engineering.
Keep it simple. If things start looking complicated, stop and ask yourself why and what you can do to break the problem down into smaller, simpler chunks.
Hire someone to test/validate your software.
We have a guy who uses our software before any of our customers do. He finds bugs that our automated test processes do not find, because he thinks as a customer, not as a software developer. This guy also gives support to our customers, because he knows the software very well from the customer's point of view. INVALUABLE.
all kinds of 'trace'.
Something not mentioned yet: when there's even semi-complex logic going on, name your variables and functions as accurately as you can (but not too long). This will make incongruities in their interactions with each other, and with what they're supposed to be doing, stand out better. The 'meaning', or language-parsing, part of your brain will have more to grab on to. I find that with vaguely named things, your brain sort of glosses over what's really there and sees what is /supposed to/ be happening rather than what actually is.
Also, make code clean, it helps to keep your brain from getting fuzzy.
Test-driven development combined with pair programming seems to work quite well on keeping some bugs down. Getting the tests created early helps work out some of the design as well as giving some confidence should someone else have to work with the code.
Creating a string representation of class state, and printing those out to console.
Note that in some cases single line-string won't be enough, you will have to code small printing loop, that would create multi-line representation of class state.
Once you have "visualized" your program in such a way, you can start to search for errors in it. When you know which variable contained the wrong value in the end, it's easy to place asserts everywhere this variable is assigned or modified. This way you can pinpoint the exact place of the error and fix it without using step-by-step debugging (which is a rather slow way to find bugs, imo).
Just yesterday I found a really nasty bug without debugging a single line:
vector<string> vec;
vec.push_back("test1");
vec.push_back(vec[0]); // second element is not "test1" after this, it's empty string
I just kept placing assert statements and restarting the program until the multi-line representation of the program's state was correct.
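A minimal sketch of the approach, assuming a hypothetical Inventory class: a multi-line to_string() for "visualizing" the state, plus an assert placed where the suspicious value is written.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical class whose state we want to print and check.
class Inventory {
public:
    void add(const std::string& item) {
        items_.push_back(item);
        // Catch a bad value at the place where it is stored.
        assert(!items_.back().empty());
    }

    // Multi-line representation: a header line, then one line per element.
    std::string to_string() const {
        std::ostringstream out;
        out << "Inventory(" << items_.size() << ")\n";
        for (std::size_t i = 0; i < items_.size(); ++i)
            out << "  [" << i << "] \"" << items_[i] << "\"\n";
        return out.str();
    }

private:
    std::vector<std::string> items_;
};
```

Printing inv.to_string() after each step shows which element went wrong, and the assert then narrows it down to the statement that wrote it.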
I have been using TDD to drive the project that I am currently working on and the results have been fairly satisfying. I did run into a problem (described here; still without a solution or any suggestions!) where there are some aspects of a particular method which may not be able to be tested (as in my example; briefly, I want to be able to handle a ManagementException which has a specific ErrorCode - but it doesn't seem possible for me to set up a test which throws a ManagementException like that).
So, how does one deal with that? Do we simply accept the fact that some logical paths are untestable (because of the framework that we are working in or limitations in the testing framework(s) that are currently available)?
Some designs do not lend themselves to testability, especially ones that do not have testability as one of the design goals. Generally TDDed designs do not fall into this category.
To answer your original question, I've posted a response which involves using reflection to slot in the requested error code. However this may not work in all situations and is not a general solution.
The tradeoff here is the effort of writing the test vs. the benefit of having that particular piece of code under automated tests. If you feel that the cost-to-benefit ratio is huge and the probability of failure is minuscule, you may write it up as an exceptional manual test, leave a comment for future developers and verify it manually for now. I'd say be pragmatic: if you've spent 30-40 minutes of a couple of developers' brain time trying to get it under test, maybe you need to step back and rethink your strategy. Have a look at Michael Feathers' 'Working Effectively with Legacy Code' for some suggestions on overcoming barriers to testability.
I don't think you could say that anything is logically untestable, but you will certainly find areas of code where the effort required to test them would be better spent elsewhere.
This is a great question, and one which I also found myself contemplating recently.
So first, I wouldn't say some logical paths are "untestable" - at most they are probably very hard to test with automatic unit testing. You could probably still test most of these problematic paths with some serious heavy duty system tests.
Consider this - anything you test can be thought to run inside a virtual machine under your control and you can (theoretically) simulate every aspect of its operation in order to test your software. Whether or not this is practical for most applications is another question.
I've just tried answering your original question (and collided in midflight with somebody else saying the same thing more concisely, or most of it at least;-). Anyway, there surely exist frameworks that are way too rigid (thanks to private and friends), and if you can't use introspection to go around that (despite having done all proper incantations), then you're just using a language that's too rigid as well as a framework that is.
I'd be astonished if that was the case in an overall system that supports dynamic languages (as .NET now does) such as IronRuby and IronPython -- maybe if C# won't let you go around accessibility limitations via introspection, the dynamic languages could serve.
That said, it is surely possible for the overall environment to be designed so badly and so rigidly to make it impossible to unit-test certain things -- even though I'm not entirely convinced that this is the case in your current situation.
Some things cannot be tested in an automated unit test because the language/framework/situation is just not open to it. The way to handle that is to reduce that area as much as possible and keep it so simple that it is highly unlikely to be a source of bugs or behavior changes later on.
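One common way to shrink the untestable area is to hide the framework call behind a thin interface and test the logic against a fake. This is a C++ sketch of the idea, not the .NET code from the question; QueryError, DeviceQuery, kAccessDenied and device_label are all hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical error type standing in for a framework exception with a code.
struct QueryError : std::runtime_error {
    explicit QueryError(int error_code)
        : std::runtime_error("query failed"), code(error_code) {}
    int code;
};

// Thin seam: the only code that touches the real framework implements this.
struct DeviceQuery {
    virtual ~DeviceQuery() = default;
    virtual std::string read_name() = 0;  // may throw QueryError
};

const int kAccessDenied = 5;  // hypothetical error code

// Logic under test: depends on the seam, not on the framework itself.
std::string device_label(DeviceQuery& query) {
    try {
        return query.read_name();
    } catch (const QueryError& e) {
        if (e.code == kAccessDenied) return "<access denied>";
        throw;  // unknown codes are not swallowed
    }
}

// Test double that throws whatever code the test asks for.
struct FailingQuery : DeviceQuery {
    explicit FailingQuery(int error_code) : code(error_code) {}
    std::string read_name() override { throw QueryError(code); }
    int code;
};

void test_access_denied_is_handled() {
    FailingQuery denied(kAccessDenied);
    assert(device_label(denied) == "<access denied>");
}
```

The real implementation of DeviceQuery stays untested by unit tests, but it is only a few lines of forwarding code, which is the "keep it so simple" part.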
There is also more to testing than just unit testing, and those areas (such as Acceptance testing, QA, etc.) are not covered by unit testing as well.
I have to make enhancements to an existing C++ project with more than 100k lines of code.
My question is: how and where do I start with such projects?
The problem increases further if the code is not well documented.
Are there any automated tools for studying code flow with large projects?
Thanx,
Use Source Control before you touch anything!
There's a book for you: Working Effectively with Legacy Code
It's not about tools, but about various approaches, processes and techniques you can use to better understand and make changes to the code. It is even written from a mostly C++ perspective.
First study the existing interface well.
Write tests if they are absent, or expand already written ones.
Modify the source code.
Run tests to check if the modification somehow breaks the older behaviour.
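The "write tests, then modify, then re-run" steps above are often done with tests that simply record what the code does today. A minimal sketch, where legacy_trim is a made-up stand-in for an undocumented legacy routine:

```cpp
#include <cassert>
#include <string>

// Stand-in for an undocumented legacy routine (hypothetical).
// Quirk: it drops trailing spaces but keeps leading ones.
std::string legacy_trim(const std::string& text) {
    std::string::size_type end = text.find_last_not_of(' ');
    return end == std::string::npos ? std::string() : text.substr(0, end + 1);
}

// Tests that pin down the current behaviour, quirks included, so that a
// later modification which changes it is noticed immediately.
void check_behaviour_unchanged() {
    assert(legacy_trim("abc  ") == "abc");
    assert(legacy_trim("  abc") == "  abc");  // leading spaces are kept
    assert(legacy_trim("   ").empty());
    assert(legacy_trim("").empty());
}
```

Whether the leading-space behaviour is a bug or a feature can be decided later; the point is that it will not change by accident.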
There is another good book, currently freely available on the net, about object oriented reengineering : http://www.iam.unibe.ch/~scg/OORP/
The book "Code Reading" by Diomidis Spinellis contains lots of advice about how to gain an overview and in-depth knowledge about larger, unknown projects.
Chapter 6 focuses solely on that topic (Tackling Large Projects). The chapters about tooling (Ch. 9) and architecture (Ch. 8) might also contain nice hints for you.
However, the book is about understanding (by reading) the "code". It does not tackle directly the maintenance step.
First thing I would do is try to find the product's requirements.
It's almost unthinkable that a product of this size would be developed without requirements.
By perusing the requirements, you'll be able to:
get a sense of what the product (and hence the code) is at least supposed to be doing
see just how well (or poorly) the code actually fulfills those requirements
Otherwise you're just looking at code, trying to divine the intention of the developers...
If you are able to run the code on a PC, you can try to build a call graph, usually from profiling output.
Cross-referencing tools like cscope, ctags, lxr, etc. can also help a lot.
Spending some time reading, building class diagrams or even adding comments to the parts of the code you took long to understand are steps towards getting familiar with the codebase and getting ready to modify/extend it.
The first thing you need to do is understand how the code works. Read what documentation there is and then watch the program operate under a debugger. If you watch the main function/loop and then slowly work your way deeper into the program, you can gain a pretty good idea how things are operating. Make sure you write down your findings so others who follow after you have a better position to start from.
Run Doxygen with the EXTRACT_ALL tag set to document all the relationships in the code base. It's not going to help you with the code flow, but hopefully it will shed some light on the structure and design of the entire application.
A very good Austrian programmer once told me that in order to understand a program you first have to understand the data structures that the program uses.