In the JUnit FAQ you can read that you shouldn't test methods that are too simple to break. While all the examples seem logical (getters and setters, delegation, etc.), I'm not sure I'm able to grasp the "can't break on its own" concept in full. When would you say that a method "can't break on its own"? Would anyone care to elaborate?
I think "can't break on its own" means that the method only uses elements of its own class, and does not depend upon the behavior of any other objects/classes, or that it delegates all of its functionality to some other method or class (which presumably has its own tests).
The basic idea is that if you can see everything the method does, without needing to refer to other methods or classes, and you are pretty sure it is correct, then a test is probably not necessary.
There is not necessarily a clear line here. "Too simple to break" is in the eye of the beholder.
Try thinking of it this way. You're not really testing methods. You're describing some behaviour and giving some examples of how to use it, so that other people (including your later self) can come and change that behaviour safely later. The examples happen to be executable.
If you think that people can change your code safely, you don't need to worry.
No matter how simple a method is, it can still be wrong. For example, you might have two similarly named variables and access the wrong one. However, these errors will likely be found quickly, and once such methods are written correctly they are going to stay correct, so it is not worthwhile to keep a test around for them permanently. Rather than "too simple to break", I would recommend considering whether it is too simple to be worth keeping a permanent test.
Put it this way: you're building a wooden table.
You'll test the things that may fail. For instance, putting a jar on the table, sitting on the table, or pushing the table from one side of the room to the other, etc. You're testing the table in the ways you know it is somehow vulnerable, or at least in the ways you know you'll use it.
You don't test the nails, though, or one of its legs, because they are "too simple to break on their own".
The same goes for unit testing: you don't test getters/setters, because the only way they can fail is if the runtime environment fails. You don't test methods that just forward a message to another method, because they are too simple to break on their own; you're better off testing the referenced method.
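To make that concrete, here is a rough sketch of the kind of members usually considered too simple to break: a plain getter and a pure delegation. (The thread is about JUnit, but the idea is language-agnostic; the C++ classes below are made up purely for illustration.)

#include <string>

// Hypothetical classes, purely for illustration.
class Invoice {
public:
    double total() const { return total_; }   // the interesting behaviour lives here
private:
    double total_ = 0.0;
};

class Order {
public:
    // A plain getter: "too simple to break on its own", so a dedicated test adds little.
    const std::string& customerName() const { return customerName_; }

    // Pure delegation: the tests belong with Invoice::total(), not here.
    double total() const { return invoice_.total(); }

private:
    std::string customerName_;
    Invoice invoice_;
};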
I hope this helps.
If you are using TDD, that advice is wrong. In the TDD case your function exists because the test failed. It doesn't matter if the function is simple or not.
But if you are adding tests afterwards to some existing code, I can sort of understand the argument that you shouldn't need to test code that cannot break. But I still think that is just an excuse for not having to write tests. Also ask yourself: if that piece of code is not worthy of testing, then maybe that code is not needed at all?
I like the risk-based approach of GAMP 5, which basically means (in this context) first assessing the various possible risks of a piece of software and only defining tests for the higher-risk parts.
Although this applies to GxP environments, it can be adapted by asking: how likely is a certain class to have erroneous methods, and how big would the impact of an error be? E.g., if a method decides whether to give a user access to a resource, you must of course test that extensively enough.
That means that in determining where to draw the line for "too simple to break", it can help to take into consideration the possible consequences of a potential flaw.
I don't mean external tools; I'm thinking of architectural patterns, language constructs, and habits. I am mostly interested in C++.
Automated Unit Testing.
There's an oft-unappreciated technique that I like to call The QA Team, which can do wonders for weeding out bugs before they reach production.
It's been my experience (and is often quoted in textbooks) that programmers don't make the best testers, despite what they may think, because they tend to test the behaviour they already know to be true from their coding. On top of that, they're often not very good at putting themselves in the shoes of the end user (if it's that kind of app), and so are likely to neglect UI formatting/alignment/usability issues.
Yes, unit testing is immensely important and I'm sure others can give you better tips than I on that, but don't neglect your system/integration testing. :)
..and hey, it's a language independent technique!
Code Review, Unit Testing, and Continuous Integration may all help.
I find the following rather handy.
1) ASSERTs.
2) A debug logger that can output to the debug spew, console or file.
3) Memory tracking tools.
4) Unit testing.
5) Smart pointers.
I'm sure there are tonnes of others but I can't think of them off the top of my head :)
RAII to avoid resource leakage errors.
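For anyone new to the idea, here is a minimal RAII sketch (a hypothetical File wrapper, for illustration only): the destructor releases the resource on every exit path, so early returns and exceptions cannot leak the handle.

#include <cstdio>
#include <stdexcept>

// Minimal RAII wrapper around a FILE*; the destructor always runs,
// so the handle cannot leak on early return or exception.
class File {
public:
    File(const char* path, const char* mode)
        : handle_(std::fopen(path, mode)) {
        if (!handle_) throw std::runtime_error("could not open file");
    }
    ~File() { std::fclose(handle_); }

    File(const File&) = delete;             // one owner per handle
    File& operator=(const File&) = delete;

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};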
Strive for simplicity and conciseness.
Never leave cases where your code behavior is undefined.
Look for opportunities to leverage the type system and have the compiler check as much as possible at compile time. Templates and code generation are your friends as long as you keep your common sense.
Minimize the number of singletons and global variables.
Use RAII!
Use assertions!
Automatic testing of some nominal and all corner cases.
Avoid last minute changes like the plague.
I use thinking.
Reducing variable scope to be as narrow as possible. Fewer variables in the outer scope means fewer chances to plant and hide an error.
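A tiny sketch of what that looks like in practice (made-up function, just for illustration):

#include <vector>

// Declaring variables in the narrowest scope that works: 'sum' and 'v'
// cannot be reused or clobbered by unrelated code further down.
int sumOfSquares(const std::vector<int>& values) {
    int sum = 0;                // exists only inside this function
    for (int v : values) {      // exists only inside the loop body
        sum += v * v;
    }
    return sum;
}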
I have found that the more is done and checked at compile time, the less can possibly go wrong at run time. So I try to leverage techniques that allow stricter checking at compile time. That's one of the reasons I went into template metaprogramming: if you do something wrong, it doesn't compile, and thus it never leaves your desk (and thus never arrives at the customer's).
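As a small illustration of the idea (not the author's actual code, and using the later static_assert facility rather than classic metaprogramming tricks), assumptions can be pushed onto the compiler like this:

#include <cstdint>
#include <type_traits>

// Misuse is rejected at compile time: passing a float or a pointer
// to this function never even compiles.
template <typename T>
T checkedDouble(T value) {
    static_assert(std::is_integral<T>::value,
                  "checkedDouble requires an integral type");
    return static_cast<T>(value * 2);
}

// An assumption the code relies on, verified by the compiler
// instead of discovered at run time.
static_assert(sizeof(std::int32_t) == 4, "int32_t must be 4 bytes");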
I find many problems before I start testing at all by using
asserts
Testing it with actual, realistic data from the start. And testing is necessary not only while writing the code, but it should start early in the design phase. Find out what your worst use cases will be like, and make sure your design can handle it. If your design feels good and elegant even against these use cases, it might actually be good.
Automated tests are great for making sure the code you write is correct. However, before you get to writing code, you have to make sure you're building the right things.
Learning functional programming helps somehow.
Learn You a Haskell for Great Good.
Model-View-Controller, and in general anything with contracts and interfaces that can be unit-tested automatically.
I agree with many of the other answers here.
Specific to C++, the use of 'const' and avoiding raw pointers (in favor of references and smart pointers) when possible has helped me find errors at compile time.
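A short sketch of what that buys you (all names invented): const parameters and accessors that will not compile if someone tries to modify them, and unique_ptr instead of a raw owning pointer.

#include <memory>
#include <string>
#include <utility>
#include <vector>

// 'const' lets the compiler reject accidental writes.
void printNames(const std::vector<std::string>& names);   // cannot modify 'names'

class Engine {};

class Car {
public:
    explicit Car(std::unique_ptr<Engine> engine) : engine_(std::move(engine)) {}
    const Engine& engine() const { return *engine_; }   // read-only access
private:
    std::unique_ptr<Engine> engine_;   // ownership is explicit; no manual delete
};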
Also, having a "no warnings" policy helps find errors.
Requirements.
From my experience, having full and complete requirements is the number one step in creating bug-free software. You can't write complete and correct software if you don't know what it's supposed to do. You can't write proper tests for software if you don't know what it's supposed to do; you'll miss a fair amount of stuff you should test. Also, the simple process of writing the requirements helps you to flesh them out. You find so many issues and problems before you ever write the first line of code.
I find pair programming tends to help avoid a lot of the silly mistakes, and a lot of the time it generates discussions which uncover flaws. Plus, with someone free to think about why you are doing something, it tends to make everything cleaner.
Code reviews; I've personally found lots of bugs in my colleagues' code and they have found bugs in mine.
Code reviews, early and often, will help you to both understand each others' code (which helps for maintenance), and spot bugs.
The sooner you spot a bug the easier it is to fix. So do them as soon as you can.
Of course pair programming takes this to an extreme.
Using an IDE like IntelliJ that inspects my code and flags dodgy code as I write it.
Unit Testing followed by Continuous Integration.
Book suggestions: "Code Complete" and "Release It!" are two must-read books on this topic.
In addition to the things already mentioned, I believe that some features introduced with C++0x will help avoid certain bugs. Features like strongly-typed enums, range-based for loops, and deleting special member functions come to mind.
In general, strong typing is the way to go, IMHO.
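A quick sketch of those three features (illustrative names only):

#include <vector>

// Strongly-typed enums: no implicit conversion to int, no accidental
// comparison of unrelated enumerations.
enum class Colour { Red, Green, Blue };

// Deleting special member functions turns unwanted copies into compile errors.
class Connection {
public:
    Connection() = default;
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};

// Range-based for loops remove off-by-one and iterator-typo mistakes.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}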
Coding style consistency across a project.
Not just spaces vs. tab issues, but the way that code is used. There is always more than one way to do things. When the same thing gets done differently in different places, it makes catching common errors more difficult.
It's already been mentioned here, but I'll say it again because I believe this cannot be said enough:
Unnecessary complexity is the arch nemesis of good engineering.
Keep it simple. If things start looking complicated, stop and ask yourself why and what you can do to break the problem down into smaller, simpler chunks.
Hire someone to test/validate your software.
We have a guy who uses our software before any of our customers do. He finds bugs that our automated test processes do not find, because he thinks like a customer, not like a software developer. This guy also gives support to our customers, because he knows the software very well from the customer's point of view. INVALUABLE.
All kinds of 'trace'.
Something not mentioned yet: when there's even semi-complex logic going on, name your variables and functions as accurately as you can (but not too long). This will make incongruities in their interactions with each other, and with what they're supposed to be doing, stand out better. The 'meaning', or language-parsing, part of your brain will have more to grab on to. I find that with vaguely named things, your brain sort of glosses over what's really there and sees what is supposed to be happening rather than what actually is.
Also, keep the code clean; it helps to keep your brain from getting fuzzy.
Test-driven development combined with pair programming seems to work quite well on keeping some bugs down. Getting the tests created early helps work out some of the design as well as giving some confidence should someone else have to work with the code.
Creating a string representation of class state, and printing it out to the console.
Note that in some cases a single-line string won't be enough; you will have to code a small printing loop that creates a multi-line representation of the class state.
Once you have "visualized" your program in such a way, you can start to search for errors in it. When you know which variable contained the wrong value in the end, it's easy to place asserts everywhere that variable is assigned or modified. This way you can pinpoint the exact place of the error and fix it without step-by-step debugging (which is a rather slow way to find bugs, in my opinion).
Just yesterday I found a really nasty bug without debugging a single line:
#include <string>
#include <vector>

std::vector<std::string> vec;
vec.push_back("test1");
vec.push_back(vec[0]); // second element is not "test1" after this, it's an empty string
I just kept placing assert-statements and restarting the program, until multi-line representation of program's state was correct.
How do you test the result of a program that is basically a black box? For example, a year ago I had to write a B-tree as homework, and I really struggled with testing its correctness. What strategies do you use in such scenarios? Visualization? Robust input-to-result sets of test data? What do you do when it is hard to get such data because the only way to get it is your own properly working program?
EDIT: I think my question was misunderstood. There was no problem with understanding how a B-tree works; that is trivial. But writing robust tests to validate its proper functionality is not so trivial. I think this school problem is similar to many practical, real-world scenarios and test cases. And sometimes understanding the domain is quite different from delivering a working and correct program...
EDIT2: And yes, with a B-tree it is possible to validate proper behavior with pen and paper. But that is really tedious and not fun :) It also doesn't work well with problems that require huge amounts of data for their validation...
I'm not sure these answers really capture the problem at hand. A B-tree's input and output aren't any different from those of any other dictionary---but the algorithm performs better, if it's implemented correctly. It's only really got two functions to test (add, and find) so theoretically, "black-box" testing of this single component should be fine. Designing for testability isn't the issue, since no matter how you do it the whole algorithm will be one component.
So the question is: when you have to implement subtle algorithms, the kinds with complicated output that you can't always understand in your head so well, how do you test them? I think there are three different strategies you can use:
Black-box test basic functionality. For the B-tree case, this is things like cwash suggested, and also, things like making sure that when you add an item, you can then find it, etc.
Test certain invariants that your algorithm should maintain (the B-tree should be balanced, values within nodes should be sorted, etc.)
A few, small "pencil-and-paper" tests may be necessary -- work the algorithm out by hand and check that it matches what your code does. But the big-data tests can all be of type 2. These can also be brittle, so unless you need to be really sure about your algorithm, you may want to avoid them.
If you do not grasp the problem at hand, how can you develop a solution to it? My suggestion would be to understand the domain enough to be able to work out the problem on paper and ensure that your program matches.
Consult with an expert on the subject.
I know if I have a convoluted procedure I'm trying to fix, I have no idea what the output should be after my changes, so I need to consult a fellow developer with more knowledge of the business need, and they are able to verify what I've done is correct.
I would focus on constructing test cases that exercise the functionality of your B-tree algorithm. I haven't looked at it for years, but I'm fairly sure you'll be able to find a documented sequence of steps to insert a set of values in a specific order, then validate that the leaf nodes are as they should be. If you construct your testing along these lines, you should be able to prove your implementation is correct.
The key is to know there is a balance between testing something to death and writing tests that adequately cover what should be covered. Edge cases, e.g. null inputs, or checking that inputs are numeric by testing an alphabet character or a punctuation character, are likely most of the tests you'd need. To complement this, there may be one or two common cases to show the program can handle a non-edge case as well. Covering all valid input in most programs is overkill and would result in an overwhelmingly large number of tests.
I think the answer to the question you're asking boils down to designing for testability. Often you get a testable design for free when you test-drive the development of the solution. But let's face it, when you're implementing a highly mathematical algorithm, this just doesn't fall out.
To make sure you have a testable design, you need to understand what a seam is. Then you need to know a few rules of thumb, such as avoiding statics, using polymorphism, and properly decomposing problems and separating concerns.
Watch "The Clean Code Talks -- Unit Testing" by Misko Hevery, I think it will help you wrap your head around it.
Try looking at it from a requirements point of view, rather than an implementation point of view. Before you write code, you must understand exactly what you want it to do.
Testing and requirements should be a matching pair. If you're having trouble defining tests, maybe it's because the requirements are not well-defined. That in turn implies that you may have bugs that aren't so much implementation bugs, but "lack of clear requirements" bugs. The code writer in that case would be working to a mental list of requirements that he/she thinks is requirements, but can't be sure, and they're not written down for independent understanding and verification.
I've struggled with software where the requirements weren't clear, because the customer couldn't even tell us what they wanted. But when we delivered to them, they sure could tell us then what they didn't like about it! A big part of software engineering is getting the requirements right before the coding begins. This is true on the high-level (overall product, with requirement input from customer) and also the smaller level (modules, individual functions, where requirements are internally defined by software team or individuals). It is still true to some degree I think for iterative development, although the high-level requirements are more fluid.
@Bystrik Jurina,
I often get involved in projects which involve conversions between disparate data formats. Most answers have focused on testing a B-tree or similar algorithm, but it seems that you're looking for a more general answer.
Most of my work is based on the command line. It may sound like a contradiction, but one of the first tools I use is visualization. I'll write some methods to write out my data structures in a format that's easy to consume. This can (and usually does) include something that's visually clear. But often it also means something that I could easily parse with a smaller test program, or even import into Excel.
I'll start by focusing on the basic outline, and write a program that does the bare minimum of what I need to accomplish. If it's a multi-step process, this might mean implementing one step at a time and validating the results of each step before moving on, or writing something that works only in specific cases and then expanding the set of cases where it's expected to work. At first you can validate that the code works for a limited set of cases, such as known input data. As the project moves forward, you can start logging warnings for cases you might not have tested, or for unexpected types of input data. This has drawbacks, but it is a nice approach when you're dealing with a known set of input data.
Validation techniques can include formal test cases, or informal programs that work to challenge your assumptions. It could mean writing a basic driver program to exercise the "core" routines. A good example would be to add a record to a database, then read it back and compare the original object against the one loaded from the database.
If you have trouble wrapping your head around the way a program functions, think about what it needs to accomplish. It might be easier to write code that tests how different inputs produce different outputs. Producing visualizations is a good help, because the act of deciding how to display the data can make you think about different conditions and focus in on the most critical parts of your data structures.
Often I've found that building a visualization brings me to admit that the way the data is being stored just isn't very clear. For a B-tree, the representation isn't very flexible. But for other cases, you may be using parallel arrays when a nested tree of objects would be more natural.
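As a sketch of the dump-it-somewhere-consumable idea (the record type and field names are made up):

#include <fstream>
#include <string>
#include <vector>

// Hypothetical record type; the point is only the dump-to-CSV technique.
struct Record {
    std::string name;
    double value;
};

// Write the data in a form that is easy to eyeball, parse with a small
// test program, or import into Excel.
void dumpToCsv(const std::vector<Record>& records, const std::string& path) {
    std::ofstream out(path);
    out << "name,value\n";
    for (const Record& r : records)
        out << r.name << ',' << r.value << '\n';
}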
I'm creating a design document for a security subsystem, to be written in C++. I've created a class diagram and sequence diagrams for the major use cases. I've also specified the public attributes, associations and methods for each of the classes. But I haven't drilled the method definitions down to the C++ level yet. Since I'm new to C++, as is the other developer, I wonder whether it might not make sense to go ahead and specify to this level. Thoughts?
edit: Wow, completely against, unanimous. I was thinking about, for example, the whole business of specifying const vs. non-const, passing references, handling default constructors and assignment, and so forth. I do believe it's been quite helpful to spec this out to this level of detail so far; I definitely have a clearer idea of how the system will work. Maybe I should just do a few methods, as an example, before diving into the code?
I wouldn't recommend going to this level, but then again you've already gone past where I would go in a design specification. My personal feeling is that putting a lot of effort into detailed design up-front is going to be wasted as you find out in developing code that your guesses as to how the code will work are wrong. I would stick with a high-level design and think about using TDD (test driven development) to guide the low-level design and implementation.
I would say it makes no sense at all, and that you have gone too far already. If you are new to C++ you are in no position to write a detailed design document for a C++ project. I would recommend you try to implement what you already have in C++, learn by the inevitable mistakes (like public attributes) and then go back and revise it.
Since you're new, it probably makes sense not to drill down.
Reason: You're still figuring out the language and how things are best structured. That means you'll make mistakes initially and you'll want to correct them without constantly updating the documentation.
It really depends on who the design document is targeted at. If it's for a boss who is non-technical, then you are good with what you have.
If it's for yourself, then you are using the tool to help you, so you decide. I create method level design docs when I am creating a project, but it's at a high level so I can figure out what the features of the various classes should be. I've found that across languages, the primary functionalities of a class have little to do with the programming language we are working in. Some of the internal details and functions required certainly vary due to the chosen language, but those are implementation details that I don't bother with during the design phase.
It certainly helps me to know that for instance an authorization class might have an authenticate function that takes a User object as a parameter. I don't really care during design that I might need an internal string md5 function wrapper to accomplish some specific goal. I find out about that while coding.
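In C++ terms (the question's language), such a design-level declaration might look something like this; the names are invented and the implementation is deliberately left out:

class User;  // defined elsewhere; hypothetical

// Design-level declaration only: it records intent (const-correctness,
// pass-by-reference) without committing to internals such as an md5 helper.
class Authorization {
public:
    // Returns true if the given user's credentials check out.
    bool authenticate(const User& user) const;
};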
The goal of initial design is to get organized so you can make progress with clarity and forethought rather than tearing out and reimplementing the same function 4 times because you forgot some scenario due to not planning.
EDIT: I work in PHP a lot, and I actually use PhpDoc to do some of the design docs, by simply writing the method signature with no implementation, then putting a detailed description of what the method should do in the method header comments. This helps anyone that is using my class in the future, because the design IS the documentation. I can also change the documentation if I do need to make some alterations while coding.
I work in PHP 4 a lot, so I don't get to use interfaces. In PHP 5, I create the interface, then implement it elsewhere.
The best way to specify how the code should actually fit together is in code. The design document is for other things that are not easily expressed in code. You should use it for describing the actual need the program fills, How it interacts with users, what the constraints are in terms of hardware and operating systems. Certainly describe the overall architecture of your application in a design document, but, for instance, the API should actually be described in the code that exposes the API.
You have already gone far enough with the documentation. As you are still a beginner in C++, once you understand the language better you may well want to change the structure of your program, and then you would have to change the documentation as well. I would suggest there is no need to drill into it any further.
Like everyone else says, you've gone way past where you need to go with the design. Do you have a good set of requirements to the simple true/false statement level that you derived that design from? You can design all day long, but if you don't have requirements that simply say WHAT you're going to do, it doesn't matter how good your design is.
A colleague and I are starting a new project and attempting to take full advantage of TDD. We're still figuring out all the concepts around unit testing and have thus far based them primarily on other examples.
My colleague recently brought into question the point of the NUnit syntax helpers and I'm struggling to explain their benefit (since I don't really understand it myself other than my gut says they're good!). Here is an example assertion:
Assert.That(product.IsValid(), Is.False);
To me this makes complete sense, we're saying we expect the value of product.IsValid() to be false. My colleague on the other hand would prefer us to simply write:
Assert.That(!product.IsValid());
He says to him this makes more sense and he can read it easier.
So far the only thing we can agree on is that you are likely to get more helpful output when a test fails with the former, but I think there must be a better explanation. I've looked up some information on the syntax helpers (http://nunit.com/blogs/?p=44) and they make sense, but I don't fully understand the concept of constraints other than that they 'feel' right.
I wonder if someone could explain why we use the concept of constraints, and why they improve the unit test examples above?
Thanks.
I think it's mostly to do with the pure English reading of the statement.
The first reads
Assert that product is valid is false
The second reads
Assert that not product is valid
I personally find the first easier to process. I think it's all down to preference, really. Some of the extension methods out there are interesting, though, letting you do your assertions like this:
product.IsValid().IsFalse();
I can see your version being better than your colleague's. However, I'd still be at least as comfortable with:
Assert.IsFalse(product.IsValid());
If you can convince me that the Assert.That syntax has an objective benefit over the above, I'd be very interested :) It may well just be force of habit, but I can very easily read the "What kind of assertion are we making? Now what are we asserting it about?" style.
It's all sugar. Internally they're converted to constraints.
From Pragmatic Unit Testing, pg 37:
"NUnit 2.4 introduced a new style of assertions that are a little less procedural and allow for a more object oriented underlying implementation.
...
For instance:
Assert.That(actual, Is.EqualTo(expected));
Converts to:
Assert.That(actual, new EqualConstraint(expected));"
Using constraints also allows you to inherit from Constraint and create your own custom constraint(s) while keeping a consistent syntax.
I don't like Assert.That, specifically the fact that its most common scenario (comparing two objects' equality) is measurably worse than the 'classic' Assert.AreEqual() syntax.
On the other hand, I super love the MSpec NUnit extensions. I recommend you check them out (or look at the SpecUnit extensions, or NBehave extensions, or NBehaveSpec*Unit extensions, I think they're all the same).