Best practice approach for automated testing - unit-testing

This may be a strange request for advice, and I feel there may be no single right answer. In my project I have archiving routines on various objects that have been consumed for logical calculations; I archive these items for the sake of an audit trail and to check up on calculation errors or prove correctness at a later stage. I am working with Entity Framework, so things may be slightly different from your own project.
I consume the original object, modify it directly, create a clone of the modified item, revert the original item from the store and save changes accordingly. An object is not reverted to the original if it was never consumed by a calculation; in these instances I save directly over that object, along with the various relationships that exist with further objects.
This may sound long-winded, but I assure you it is the easiest approach I have found so far for my workings with EF in my situation.
My trouble with these archiving routines is that over time, as I introduce further functionality, I sometimes unknowingly break critical code to the point where I have to regression test the entire solution from beginning to end to ensure that the archiving requirements remain intact.
Is there any unit test approach or automated methodology for testing these sorts of requirements? It would speed up deployment of packages by cutting down on my own manual testing.
Any advice or links to similar situations appreciated.

I think there are two pieces to this problem you are describing:
First you need some unit tests that you can build which will represent the technical requirements of the system. Think of the unit tests as the rules which you have set up to technically accomplish the goal that the end user desires. In this way, I would craft unit tests that you can feel confident will break if a technical assumption you had made about the system fails because of a code change. Remember to keep the unit tests at the unit level so that you don't have a large number of dependencies interacting to fail a test. A unit test should test exactly one thing. If you do this, when you make code changes you can run all your unit tests and immediately know which assumptions you had made about the system are no longer being met.
I would also set up some sort of automated integration/functional tests. I think in your problem domain it would make sense to set up integration tests which are structured like unit tests (you can use the same tool). Here you will want to take bigger pieces of functionality, perhaps pipelines through which data flows in the system, and test that the correct series of transformations occurs on the data.
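To make the first point concrete for the archiving scenario, here is a rough sketch of a unit test that encodes the rule "archive a clone of the modified state, then revert the original." It is written in Java/JUnit purely for illustration and every name in it is invented; in the asker's project the store abstraction would wrap the Entity Framework context, but the shape of the test is the same.

import static org.junit.Assert.*;

import java.util.ArrayList;
import java.util.List;
import org.junit.Test;

// All names below are invented for illustration; in the real project the
// store abstraction would wrap the Entity Framework context instead.
public class ArchiverTest {

    // The seam: something the archiving routine can save archive copies to.
    interface ArchiveStore {
        void save(Widget copy);
    }

    static class Widget {
        int value;
        boolean archived;
        Widget(int value) { this.value = value; }
        Widget archivedCopy() {
            Widget copy = new Widget(value);
            copy.archived = true;
            return copy;
        }
    }

    // Hypothetical archiving routine: consume the object for a calculation,
    // archive a clone of the modified state, then revert the original.
    static class Archiver {
        private final ArchiveStore store;
        Archiver(ArchiveStore store) { this.store = store; }

        void consumeAndArchive(Widget w, int newValue) {
            int originalValue = w.value;
            w.value = newValue;             // modified by the calculation
            store.save(w.archivedCopy());   // archive the modified clone
            w.value = originalValue;        // revert the original
        }
    }

    @Test
    public void archivesTheModifiedCloneAndRevertsTheOriginal() {
        List<Widget> saved = new ArrayList<>();
        Widget original = new Widget(10);

        new Archiver(saved::add).consumeAndArchive(original, 42);

        assertEquals(10, original.value);       // original was reverted
        assertEquals(1, saved.size());          // exactly one archive copy saved
        assertEquals(42, saved.get(0).value);   // copy holds the modified state
        assertTrue(saved.get(0).archived);      // and is flagged as an archive
    }
}

If a later feature accidentally stops the revert or the archive save, this one small test goes red straight away, rather than the breakage surfacing during a full end-to-end regression pass.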

One best practice is to make sure the tests can be run in any order. You could separate the routines that produce the data from the archive routines, perhaps by using "gold" data for the archive routines.

The number one best practice for unit tests is just do it! Beyond that, I'd like to recommend xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.

Related

How to Unit Test Against Remote Resources

I've been trying to learn how to properly unit test and set up unit tests for all of my code on a new project. The project I'm currently doing this for requires me to run a lot of actions against Google BigQuery (i.e. create tables, insert, query, delete). I'm feeling like I can't truly test all of this functionality by mocking BigQuery because the actions I do against it are complicated and interdependent, and if there's a break in the middle somewhere, I want to catch it. Is it generally frowned upon to have something like an environment variable that specifies a test account built into my unit tests so they actually run against the remote service? This feels like the best way to truly test everything and hit cases that I couldn't hit with a mock. So, is this something people do? Are there some major downsides to doing things this way?
I tend to have a mix of unit and integration tests in my project. I believe both are equally valuable, but one thing to keep in mind when doing integration testing is to ensure that the tests are stable and repeatable.
There are several approaches, but I favor the approach of making the tests self-sufficient by ensuring that all data dependencies are built in the test itself. This is important since you avoid failing tests due to failed assumptions about existing data in your data source.
A variation on this is to have a scaffolding script populate your data source with fixed test data. I find this to be less manageable since it can introduce dependencies between tests, and changing the test data for one test may cause a failure in another.
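A minimal sketch of a self-sufficient integration test, in JUnit: it creates the data it depends on in setup and removes it in teardown, so it never assumes anything about existing rows. It uses an in-memory H2 database purely to keep the example self-contained (assuming H2 and JUnit 4 are on the classpath); the same pattern applies to a BigQuery test dataset, creating and deleting the tables the test needs.

import static org.junit.Assert.*;

import java.sql.*;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

// Self-sufficient integration test: it creates the data it depends on and
// removes it afterwards, so it makes no assumptions about existing rows.
// An in-memory H2 database is used only to keep the example self-contained.
public class CustomerQueryIntegrationTest {

    private Connection conn;

    @Before
    public void createTestData() throws SQLException {
        conn = DriverManager.getConnection("jdbc:h2:mem:testdb");
        try (Statement s = conn.createStatement()) {
            s.execute("CREATE TABLE customer (id INT PRIMARY KEY, name VARCHAR(100))");
            s.execute("INSERT INTO customer VALUES (1, 'Integration Test Customer')");
        }
    }

    @Test
    public void findsTheCustomerItJustInserted() throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("SELECT name FROM customer WHERE id = ?")) {
            ps.setInt(1, 1);
            try (ResultSet rs = ps.executeQuery()) {
                assertTrue(rs.next());
                assertEquals("Integration Test Customer", rs.getString("name"));
            }
        }
    }

    @After
    public void removeTestData() throws SQLException {
        try (Statement s = conn.createStatement()) {
            s.execute("DROP TABLE customer");   // leave the data source as we found it
        }
        conn.close();
    }
}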
What you're looking to do is technically called integration testing, but I do see your point. I'm currently doing both myself. The interaction in my integration tests is with a database. I find that these integration tests often catch far more errors than true unit tests and are generally more beneficial. I will say, however, that unit tests are important as well.
I have found that integration tests tend to take much longer to run, since they do all this interaction, and if they are part of your nightly build process, for example, this can greatly increase the time it takes for a build to complete. Some of our builds take close to an hour to complete at this point, which is sometimes a problem for us.
I will say when you introduce things like environment variables into the mix you have to start making sure that every developer on the team has this environment variable if they want to run the tests. As a general rule of thumb I try to make it as simple as possible for everyone to build and run tests directly out of source control. There is nothing more frustrating than not being able to build source code or execute unit tests directly out of source control.
It's helpful to think of things like BigQuery as just implementation details; means to an end.
Something in your application currently says "I need x - I'll use BigQuery to get it." Instead of having explicit knowledge of BigQuery, this thing could instead have knowledge of "some entity capable of getting x". This is the location of a seam, and is where mocking would take place.
You mentioned that you don't want to mock all of the objects involved in creating a BigQuery request. You are absolutely right in avoiding this. That doesn't mean that you can't mock out BigQuery, though; you just need to move up a rung.
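A sketch of what "moving up a rung" can look like; the names are invented and this is not the BigQuery client API. The application depends on a narrow interface that describes what it needs, a single production class (not shown) implements it with BigQuery, and the unit test substitutes a trivial fake at that seam.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

// The application-level seam: "some entity capable of getting x".
// Only the production implementation (not shown) would know about BigQuery.
interface DailySignupSource {
    long signupsOn(String isoDate);
}

// Class under test: it has no idea BigQuery exists.
class SignupReport {
    private final DailySignupSource source;
    SignupReport(DailySignupSource source) { this.source = source; }

    String summaryFor(String isoDate) {
        long n = source.signupsOn(isoDate);
        return n == 0 ? "No signups on " + isoDate
                      : n + " signups on " + isoDate;
    }
}

// Unit test with a trivial hand-rolled fake standing in at the seam.
public class SignupReportTest {
    @Test
    public void summarisesTheSignupCount() {
        DailySignupSource fake = date -> 42;
        assertEquals("42 signups on 2016-01-01",
                     new SignupReport(fake).summaryFor("2016-01-01"));
    }
}

The complicated, interdependent BigQuery calls can then be covered by a smaller number of integration tests against a real test dataset, while everything above the seam stays fast to test.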

What are the pros and cons of automated Unit Tests vs automated Integration tests?

Recently we have been adding automated tests to our existing java applications.
What we have
The majority of these tests are integration tests, which may cover a stack of calls like:
HTTP post into a servlet
The servlet validates the request and calls the business layer
The business layer does a bunch of stuff via hibernate etc and updates some database tables
The servlet generates some XML, runs this through XSLT to produce response HTML.
We then verify that the servlet responded with the correct XML and that the correct rows exist in the database (our development Oracle instance). These rows are then deleted.
We also have a few smaller unit tests which check single method calls.
These tests are all run as part of our nightly (or adhoc) builds.
The Question
This seems good because we are checking the boundaries of our system: servlet request/response on one end and database on the other. If these work, then we are free to refactor or mess with anything in between and have some confidence that the servlet under test continues to work.
What problems are we likely to run into with this approach?
I can't see how adding a bunch more unit tests on individual classes would help. Wouldn't that make it harder to refactor as it's much more likely we will need to throw away and re-write tests?
Unit tests localize failures more tightly. Integration-level tests correspond more closely to user requirements and so are a better predictor of delivery success. Neither of them is much good unless built and maintained, but both of them are very valuable if properly used.
The thing with unit tests is that no integration-level test can exercise all the code as thoroughly as a good set of unit tests can. Yes, that can mean that you have to refactor the tests somewhat, but in general your tests shouldn't depend on the internals so much. So, let's say for example that you have a single function to compute a power of two. You describe it (as a formal methods guy, I'd claim you specify it):
long pow2(int p); // returns 2^p for 0 <= p <= 30
Your test and your spec look essentially the same (this is sort of pseudo-xUnit for illustration):
assertEqual(1073741824, pow2(30));
assertEqual(1, pow2(0));
assertException(domainError, pow2(-1));
assertException(domainError, pow2(31));
Now your implementation can be a for loop with a multiply, and you can come along later and change that to a shift.
If you change the implementation so that, say, it's returning 16 bits (remember that sizeof(long) is only guaranteed to be no less than sizeof(short)) then these tests will fail quickly. An integration-level test should probably fail, but not certainly, and it's just as likely as not to fail somewhere far downstream of the computation of pow2(28).
The point is that they really test for different situations. If you could build sufficiently detailed and extensive integration tests, you might be able to get the same level of coverage and degree of fine-grained testing, but it's probably hard to do at best, and the exponential state-space explosion will defeat you. By partitioning the state space using unit tests, the number of tests you need grows much less than exponentially.
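To make the pow2 example concrete, here is a minimal JUnit-flavoured sketch (assuming JUnit 4; the original snippet is pseudo-xUnit): both the multiply-loop and the shift implementation satisfy the same specification, so the same unit tests pass before and after the refactoring.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class Pow2Test {

    // First implementation: a for loop with a multiply.
    static long pow2Loop(int p) {
        if (p < 0 || p > 30) throw new IllegalArgumentException("domain error: " + p);
        long result = 1;
        for (int i = 0; i < p; i++) result *= 2;
        return result;
    }

    // Later refactoring: same specification, implemented as a shift.
    static long pow2Shift(int p) {
        if (p < 0 || p > 30) throw new IllegalArgumentException("domain error: " + p);
        return 1L << p;
    }

    // The tests mirror the specification, not the implementation, so they
    // survive the loop-to-shift refactoring unchanged.
    @Test
    public void returnsPowersWithinTheDomain() {
        assertEquals(1L, pow2Loop(0));
        assertEquals(1073741824L, pow2Loop(30));
        assertEquals(1L, pow2Shift(0));
        assertEquals(1073741824L, pow2Shift(30));
    }

    @Test(expected = IllegalArgumentException.class)
    public void rejectsNegativeExponents() {
        pow2Loop(-1);
    }

    @Test(expected = IllegalArgumentException.class)
    public void rejectsExponentsAbove30() {
        pow2Shift(31);
    }
}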
You are asking about the pros and cons of two different things (what are the pros and cons of riding a horse vs. riding a motorcycle?).
Of course both are "automated tests" (~riding), but that doesn't mean they are alternatives (you don't ride a horse for hundreds of miles, and you don't ride a motorcycle in muddy places closed to vehicles).
Unit Tests test the smallest unit of the code, usually a method. Each unit test is closely tied to the method it is testing, and if it's well written it's tied (almost) only with that.
They are great to guide the design of new code and the refactoring of existing code. They are great to spot problems long before the system is ready for integration tests. Note that I wrote guide; the whole of Test Driven Development is about this word.
It does not make any sense to have manual Unit Tests.
What about refactoring, which seems to be your main concern? If you are refactoring just the implementation (content) of a method, but not its existence or "external behavior", the Unit Test is still valid and incredibly useful (you cannot imagine how useful until you try).
If you are refactoring more aggressively, changing a method's existence or behavior, then yes, you need to write a new Unit Test for each new method, and possibly throw away the old one. But writing the Unit Test, especially if you write it before the code itself, will help to clarify the design (i.e. what the method should do, and what it shouldn't) without being confused by the implementation details (i.e. how the method should do the thing that it needs to do).
Automated Integration Tests test the biggest unit of the code, usually the entire application.
They are great to test use cases which you don't want to test by hand. But you can also have manual Integration Tests, and they are as effective (only less convenient).
Starting a new project today, it does not make any sense not to have Unit Tests, but I'd say that for an existing project like yours it does not make too much sense to write them for everything you already have that is working.
In your case, I'd rather use a "middle ground" approach writing:
smaller Integration Tests which only test the sections you are going to refactor. If you are refactoring the whole thing, then you can use your current Integration Tests, but if you are refactoring only, say, the XML generation, it does not make any sense to require the presence of the database, so I'd write a simple and small XML Integration Test.
a bunch of Unit Tests for the new code you are going to write. As I already wrote above, Unit Tests will be ready as soon as you "mess with anything in between", making sure that your "mess" is going somewhere.
In fact your Integration Test will only make sure that your "mess" is not working (because at the beginning it will not work, right?) but it will not give you any clue on
why it is not working
if your debugging of the "mess" is really fixing something
if your debugging of the "mess" is breaking something else
Integration Tests will only give the confirmation at the end if the whole change was successful (and the answer will be "no" for a long time). The Integration Tests will not give you any help during the refactoring itself, which will make it harder and possibly frustrating. You need Unit Tests for that.
I agree with Charlie about integration-level tests corresponding more to user actions and the correctness of the system as a whole. I do think there is a lot more value to Unit Tests than just localizing failures more tightly, though. Unit tests provide two main values over integration tests:
1) Writing unit tests is as much an act of design as testing. If you practice Test Driven Development/Behavior Driven Development, the act of writing the unit tests helps you design exactly what your code should do. It helps you write higher quality code (since being loosely coupled helps with testing) and it helps you write just enough code to make your tests pass (since your tests are in effect your specification).
2) The second value of unit tests is that if they are properly written they are very, very fast. If I make a change to a class in your project, can I run all the corresponding tests to see if I broke anything? How do I know which tests to run? And how long will they take? I can guarantee it will be longer than well-written unit tests. You should be able to run all of your unit tests in a couple of minutes at the most.
Just a few examples from personal experience:
Unit Tests:
(+) Keeps testing close to the relevant code
(+) Relatively easy to test all code paths
(+) Easy to see if someone inadvertently changes the behavior of a method
(-) Much harder to write for UI components than for non-GUI
Integration Tests:
(+) It's nice to have nuts and bolts in a project, but integration testing makes sure they fit each other
(-) Harder to localize source of errors
(-) Harder to test all (or even all critical) code paths
Ideally both are necessary.
Examples:
Unit test: Make sure that input index >= 0 and < length of array. What happens when outside bounds? Should method throw exception or return null?
Integration test: What does the user see when a negative inventory value is input?
The second affects both the UI and the back end. Both sides could work perfectly, and you could still get the wrong answer, because the error condition between the two isn't well-defined.
The best part about Unit testing we've found is that it makes devs go from code->test->think to think->test->code. If a dev has to write the test first, [s]he tends to think more about what could go wrong up front.
To answer your last question, since unit tests live so close to the code and force the dev to think more up front, in practice we've found that we don't tend to refactor the code as much, so less code gets moved around - so tossing and writing new tests constantly doesn't appear to be an issue.
The question has a philosophical part for sure, but also points to pragmatic considerations.
Test-driven design used as the means to become a better developer has its merits, but it is not required for that. Many a good programmer exists who never wrote a unit test. The best reason for unit tests is the power they give you when refactoring, especially when many people are changing the source at the same time. Spotting bugs on checkin is also a huge time-saver for a project (consider moving to a CI model and building on checkin instead of nightly). So if you write a unit test, either before or after you have written the code it tests, you are sure at that moment about the new code you've written. It is what can happen to that code later that the unit test ensures against - and that can be significant. Unit tests can stop bugs before they get to QA, thereby speeding up your projects.
Integration tests stress the interfaces between elements in your stack, if done correctly. In my experience, integration is the most unpredictable part of a project. Getting individual pieces to work tends not to be that hard, but putting everything together can be very difficult because of the types of bugs that can emerge at this step. In many cases, projects are late because of what happens in integration. Some of the errors encountered in this step are found in interfaces that have been broken by some change made on one side that was not communicated to the other side. Another source of integration errors is configurations discovered in dev but forgotten by the time the app goes to QA. Integration tests can help reduce both types dramatically.
The importance of each test type can be debated, but what will be of most importance to you is the application of either type to your particular situation. Is the app in question being developed by a small group of people or by many different groups? Do you have one repository for everything, or many repos, each for a particular component of the app? If you have the latter, then you will have challenges with compatibility between different versions of different components.
Each test type is designed to expose the problems of different levels of integration in the development phase to save time. Unit tests drive the integration of the output of many developers operating on one repository. Integration tests (poorly named) drive the integration of components in the stack - components often written by separate teams. The class of problems exposed by integration tests is typically more time-consuming to fix.
So pragmatically, it really boils down to where you most need speed in your own org/process.
The thing that distinguishes Unit tests and Integration tests is the number of parts required for the test to run.
Unit tests (theoretically) require very few (or no) other parts to run.
Integration tests (theoretically) require many (or all) other parts to run.
Integration tests test behaviour AND the infrastructure. Unit tests generally only test behaviour.
So, unit tests are good for testing some stuff, integration tests for other stuff.
So, why unit test?
For instance, it is very hard to test boundary conditions when integration testing. Example: a back-end function expects a positive integer or 0, and the front end does not allow entry of a negative integer; how do you ensure that the back-end function behaves correctly when you pass a negative integer to it? Maybe the correct behaviour is to throw an exception. This is very hard to do with an integration test.
So, for this, you need a unit test (of the function).
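A sketch of such a unit test follows (the function name, exception type and logic are made up for illustration): the negative value can never reach the function through the front end, but the unit test exercises that path directly.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class QuantityValidationTest {

    // Hypothetical back-end function: expects a positive integer or 0.
    static int reserveStock(int quantity) {
        if (quantity < 0) {
            throw new IllegalArgumentException("quantity must be >= 0, was " + quantity);
        }
        return quantity;   // stand-in for the real reservation logic
    }

    @Test
    public void acceptsZeroAndPositiveQuantities() {
        assertEquals(0, reserveStock(0));
        assertEquals(5, reserveStock(5));
    }

    // The front end never sends a negative value, so an integration test
    // cannot easily reach this branch; the unit test calls it directly.
    @Test(expected = IllegalArgumentException.class)
    public void rejectsNegativeQuantities() {
        reserveStock(-1);
    }
}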
Also, unit tests help eliminate problems found during integration tests. In your example above, there are a lot of points of failure for a single HTTP call:
the call from the HTTP client
the servlet validation
the call from the servlet to the business layer
the business layer validation
the database read (hibernate)
the data transformation by the business layer
the database write (hibernate)
the data transformation -> XML
the XSLT transformation -> HTML
the transmission of the HTML -> client
For your integration tests to work, you need ALL of these processes to work correctly. For a unit test of the servlet validation, you need only one: the servlet validation itself (which can be independent of everything else). A problem in one layer becomes easier to track down.
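For example, if the request validation is pulled out of the servlet into its own small class, it can be unit tested with no HTTP container, business layer, Hibernate or database running. The class and its rules below are hypothetical, just to show the shape:

import static org.junit.Assert.*;
import org.junit.Test;

// Hypothetical validator extracted from the servlet so that it can be tested
// without an HTTP container, business layer, Hibernate or database.
class OrderRequestValidator {
    boolean isValid(String customerId, String quantity) {
        if (customerId == null || customerId.trim().isEmpty()) return false;
        try {
            return Integer.parseInt(quantity) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}

public class OrderRequestValidatorTest {
    private final OrderRequestValidator validator = new OrderRequestValidator();

    @Test
    public void acceptsWellFormedRequests() {
        assertTrue(validator.isValid("C-123", "3"));
    }

    @Test
    public void rejectsMissingCustomerAndBadQuantities() {
        assertFalse(validator.isValid("", "3"));
        assertFalse(validator.isValid("C-123", "0"));
        assertFalse(validator.isValid("C-123", "three"));
    }
}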
You need both Unit tests AND integration tests.
Unit tests execute methods in a class to verify proper input/output without testing the class in the larger context of your application. You might use mocks to simulate dependent classes -- you're doing black-box testing of the class as a stand-alone entity. Unit tests should be runnable from a developer workstation without any external service or software requirements.
Integration tests will include other components of your application and third party software (your Oracle dev database, for example, or Selenium tests for a webapp). These tests might still be very fast and run as part of a continuous build, but because they inject additional dependencies they also risk injecting new bugs that cause problems for your code but are not caused by your code. Preferably, integration tests are also where you inject real/recorded data and assert that the application stack as a whole is behaving as expected given those inputs.
The question comes down to what kind of bugs you're looking to find and how quickly you hope to find them. Unit tests help to reduce the number of "simple" mistakes while integration tests help you ferret out architectural and integration issues, hopefully simulating the effects of Murphy's Law on your application as a whole.
Joel Spolsky has written a very interesting article about unit testing (it was a dialogue between Joel and someone else).
The main idea was that unit tests are a very good thing, but only if you use them in "limited" quantity. Joel doesn't recommend trying to achieve a state where 100% of your code is covered by test cases.
The problem with unit tests is that when you want to change the architecture of your application, you'll have to change all the corresponding unit tests. And that will take a lot of time (maybe even more time than the refactoring itself). And after all that work, only a few tests will fail.
So, write tests only for code that really can cause trouble.
How I use unit tests: I don't like TDD, so I first write the code, then I test it (using the console or browser) just to be sure that it does the necessary work. And only after that do I add "tricky" tests - 50% of them fail after the first run.
It works and it doesn't take much time.
We have 4 different types of tests in our project:
Unit tests with mocking where necessary
DB tests that act similarly to unit tests but touch the DB and clean up afterwards
Our logic is exposed through REST, so we have tests that go over HTTP
Webapp tests using WatiN that actually use an IE instance and go over the major functionality
I like unit tests. They run really fast (100-1000x faster than #4 tests). They are type safe, so refactoring is quite easy (with a good IDE).
The main problem is how much work is required to do them properly. You have to mock everything: DB access, network access, other components. You have to decorate unmockable classes, getting a zillion mostly useless classes. You have to use DI so that your components are not tightly coupled and therefore untestable (note that using DI is not actually a downside :)
I like tests #2. They do use the database and will report database errors, constraint violations and invalid columns. I think we get valuable testing using this.
#3 and especially #4 are more problematic. They require some subset of the production environment on the build server. You have to build, deploy and have the app running. You have to have a clean DB every time. But in the end, it pays off. WatiN tests require constant work, but you also get constant testing. We run tests on every commit and it is very easy to see when we break something.
So, back to your question. Unit tests are fast (which is very important; build time should be less than, say, 10 minutes) and they are easy to refactor - much easier than rewriting the whole WatiN thing if your design changes. If you use a nice editor with a good find-usages command (e.g. IDEA or VS.NET + ReSharper), you can always find where your code is being tested.
With REST/HTTP tests, you get good validation that your system actually works. But the tests are slow to run, so it is hard to have complete validation at this level. I assume your methods accept multiple parameters or possibly XML input. To check each node in the XML or each parameter, it would take tens or hundreds of calls. You can do that with unit tests, but you cannot do that with REST calls, where each one can take a sizable fraction of a second.
Our unit tests check special boundary conditions far more often than the #3 tests do. They (#3) check that the main functionality is working, and that's it. This seems to work pretty well for us.
As many have mentioned, integration tests will tell you whether your system works, and unit tests will tell you where it doesn't. Strictly from a testing perspective, these two kinds of tests complement each other.
I can't see how adding a bunch more unit tests on individual classes would help. Wouldn't that make it harder to refactor as it's much more likely we will need to throw away and re-write tests?
No. It will make refactoring easier and better, and make it clearer to see what refactorings are appropriate and relevant. This is why we say that TDD is about design, not about testing. It's quite common for me to write a test for one method and in figuring out how to express what that method's result should be to come up with a very simple implementation in terms of some other method of the class under test. That implementation frequently finds its way into the class under test. Simpler, more solid implementations, cleaner boundaries, smaller methods: TDD - unit tests, specifically - lead you in this direction, and integration tests do not. They're both important, both useful, but they serve different purposes.
Yes, you may find yourself modifying and deleting unit tests on occasion to accommodate refactorings; that's fine, but it's not hard. And having those unit tests - and going through the experience of writing them - gives you better insight into your code, and better design.
Although the setup you described sounds good, unit testing also offers something important. Unit testing offers fine levels of granularity. With loose coupling and dependency injection, you can pretty much test every important case. You can be sure that the units are robust; you can scrutinise individual methods with scores of inputs or interesting things that don't necessarily occur during your integration tests.
E.g. if you want to deterministically see how a class will handle some sort of failure that would require a tricky setup (e.g. network exception when retrieving something from a server) you can easily write your own test double network connection class, inject it and tell it to throw an exception whenever you feel like it. You can then make sure that the class under test gracefully handles the exception and carries on in a valid state.
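A rough sketch of that idea (all names invented): the test double implements the same connection interface as the real thing, is told to fail, and the test asserts that the class under test swallows the exception and stays in a valid state.

import static org.junit.Assert.assertEquals;
import java.io.IOException;
import org.junit.Test;

// The seam the class under test depends on.
interface ServerConnection {
    String fetchLatest() throws IOException;
}

// Class under test: it must survive a network failure and stay usable.
class NewsFeed {
    private final ServerConnection connection;
    private String lastGoodValue = "no data yet";

    NewsFeed(ServerConnection connection) { this.connection = connection; }

    String refresh() {
        try {
            lastGoodValue = connection.fetchLatest();
        } catch (IOException e) {
            // swallow the failure and keep the last known good state
        }
        return lastGoodValue;
    }
}

public class NewsFeedTest {
    @Test
    public void keepsTheLastGoodValueWhenTheNetworkFails() {
        // Test double: throws whenever we feel like it.
        ServerConnection failing = () -> { throw new IOException("simulated outage"); };
        NewsFeed feed = new NewsFeed(failing);
        assertEquals("no data yet", feed.refresh());   // graceful, still in a valid state
    }
}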
You might be interested in this question and the related answers too. There you can find my addition to the answers that were already given here.

How does unit testing work when the program doesn't lend itself to a functional style?

I'm thinking of the case where the program doesn't really compute anything, it just DOES a lot. Unit testing makes sense to me when you're writing functions which calculate something and you need to check the result, but what if you aren't calculating anything? For example, a program I maintain at work relies on having the user fill out a form, then opening an external program, and automating the external program to do something based on the user input. The process is fairly involved. There's something like 3000 lines of code (spread out across multiple functions), but I can't think of a single thing which it makes sense to unit test.
That's just an example though. Should you even try to unit test "procedural" programs?
Based on your description these are the places I would look to unit test:
Does the form validation of user input work correctly
Given valid input from the form, is the external program called correctly
Feed in user input to the external program and see if you get the right output
From the sound of your description, the real problem is that the code you're working with is not modular. One of the benefits I find with unit testing is that code which is difficult to test is either not modular enough or has an awkward interface. Try to break the code down into smaller pieces and you'll find places where it makes sense to write unit tests.
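For example, once the call to the external program is hidden behind a small interface, the orchestration logic around it becomes a unit you can test on its own. Everything in this Java-flavoured sketch is a hypothetical stand-in for the code being described:

import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.junit.Test;

// Seam around "open an external program and automate it".
interface ExternalProgram {
    void run(String command);
}

// The orchestration logic, now separated from both the form UI and the real
// external program, so it can be unit tested in isolation.
class FormProcessor {
    private final ExternalProgram program;
    FormProcessor(ExternalProgram program) { this.program = program; }

    void process(String userName, boolean express) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("userName is required");
        }
        program.run("open-project " + userName);
        if (express) program.run("set-priority high");
        program.run("submit");
    }
}

public class FormProcessorTest {
    @Test
    public void sendsTheExpectedCommandsForAnExpressRequest() {
        List<String> recorded = new ArrayList<>();
        FormProcessor processor = new FormProcessor(recorded::add);

        processor.process("alice", true);

        assertEquals(Arrays.asList("open-project alice", "set-priority high", "submit"),
                     recorded);
    }

    @Test(expected = IllegalArgumentException.class)
    public void rejectsAnEmptyUserName() {
        new FormProcessor(command -> {}).process("", false);
    }
}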
I'm not an expert on this but have been confused for a while for the same reason. Somehow the applications I'm writing just don't fit the examples given for unit testing (they are very asynchronous and random, depending on heavy user interaction).
I realized recently (and please let me know if I'm wrong) that it doesn't make sense to write a sort of global test, but rather a myriad of small tests for each component. The easiest approach is to build the tests at the same time as, or even before, creating the actual procedures.
Do you have 3000 lines of code in a single procedure/method? If so, then you probably need to refactor your code into smaller, more understandable pieces to make it maintainable. When you do this, you'll have those parts that you can and should unit test. If not, then you already have those pieces -- the individual procedures/methods that are called by your main program.
Even without unit tests, though, you should still write tests for the code to make sure that you are providing the correct inputs to the external program and testing that you handle the outputs from the program correctly under both normal and exceptional conditions. Techniques used in unit testing -- like mocking -- can be used in these integration tests to ensure that your program is operating correctly without involving the external resource.
An interesting "cut point" for your application is where you say "the user fills out a form." If you want to test, you should refactor your code to construct an explicit representation of that form as a data structure. Then you can start collecting forms and testing that the system responds appropriately to each form.
It may be that the actions taken by your system are not observable until something hits the file system. Here are a couple of ideas:
Set up something like a git repository for the initial state of the file system, run a form, and look at the output of git diff. It's likely this is going to feel more like regression testing than unit testing.
Create a new module whose only purpose is to make your program's actions observable. This can be as simple as writing relevant text to a log file or as complex as you like. If necessary, you can use conditional compilation or linking to ensure this module does something only when the system is under test. This is closer to traditional unit testing, as you can now write tests that say: upon receiving form A, the system should take the sequence of actions B. Obviously you have to decide what actions should be observed to form a reasonable test.
I suspect you'll find yourself migrating toward something that looks more like regression testing than unit testing per se. That's not necessarily bad. Don't overlook code coverage!
(A final parenthetical remark: in the bad old days of interactive console applications, Don Libes created a tool called Expect, which was enormously helpful in allowing you to script a program that interacted like a user. In my opinion we desperately need something similar for interacting with web pages. I think I'll post a question about this :-)
You don't necessarily have to implement automated tests that test individual methods or components. You could implement an automated unit test that simulates a user interacting with your application, and test that your application responds in the correct way.
I assume you are manually testing your application currently, if so then think about how you could automate that and work from there. Over time you should be able to break your tests into progressively smaller chunks that test smaller sections of code. Any sort of automated testing is usually a lot better than nothing.
Most programs (regardless of the language paradigm) can be broken into atomic units which take input and provide output. As the other responders have mentioned, look into refactoring the program and breaking it down into smaller pieces. When testing, focus less on the end-to-end functionality and more on the individual steps in which data is processed.
Also, a unit doesn't necessarily need to be an individual function (though this is often the case). A unit is a segment of functionality which can be tested using inputs and measuring outputs. I've seen this when using JUnit to test Java APIs. Individual methods might not necessarily provide the granularity I need for testing, though a series of method calls will. Therefore, the functionality I regard as a "unit" is a little greater than a single method.
You should at least refactor out the stuff that looks like it might be a problem and unit test that. But as a rule, a function shouldn't be that long. You might find something that is unit-test-worthy once you start refactoring.
Good object mentor article on TDD
As a few have answered before, there are a few ways you can test what you have outlined.
First, the form input can be tested in a few ways.
What happens if invalid data is entered? Valid data? And so on.
Then each of the functions can be tested to see whether it reacts in the proper manner when supplied with various forms of correct and incorrect data.
Next, you can mock the applications that are being called so that you can make sure your application sends data to, and processes data from, the external programs correctly. Don't forget to make sure your program deals with unexpected data from the external program as well.
Usually, the way I figure out how I want to write tests for a program I have been assigned to maintain is to see what I do manually to test the program, then try to figure out how to automate as much of it as possible. Also, don't restrict your testing tools to just the programming language you are writing the code in.
I think a wave of testing paranoia is spreading :) It's good to examine things to see if tests would make sense; sometimes the answer is going to be no.
The only thing that I would test is making sure that bogus form input is handled correctly. I really don't see where else an automated test would help. I think you'd want the test to be non-invasive (i.e. no record is actually saved during testing), so that might rule out the other few possibilities.
If you can't test something how do you know that it works? A key to software design is that the code should be testable. That may make the actual writing of the software more difficult, but it pays off in easier maintenance later.

Testing a test?

I primarily spend my time working on automated tests of win32 and .NET applications, which take about 30% of our time to write and 70% to maintain. We have been looking into methods of reducing the maintenance times, and have already moved to a reusable test library that covers most of the key components of our software. In addition we have some work in progress to get our library to a state where we can use keyword based testing.
I have been considering unit testing our test library, but I'm wondering if it would be worth the time. I'm a strong proponent of unit testing of software, but I'm not sure how to handle test code.
Do you think automated GUI testing libraries should be unit tested? Or is it just a waste of time?
First of all, I've found it very useful to look at unit tests as "executable specifications" instead of tests. I write down what I want my code to do and then implement it. Most of the benefit I get from writing unit tests is that they drive the implementation process and focus my thinking. The fact that they're reusable to test my code is almost a happy coincidence.
Testing tests seems just a way to move the problem instead of solving it. Who is going to test the tests that test the tests? The 'trick' that TDD uses to make sure tests are actually useful is by making them fail first. This might be something you can use here too. Write the test, see it fail, then fix the code.
I don't think you should unit test your unit tests.
But if you have written your own testing library, with custom assertions, keyboard controllers, button testers or whatever, then yes: you should write unit tests to verify that they all work as intended.
The NUnit library itself is unit tested, for example.
In theory, it is software and thus should be unit-tested. If you are rolling your own Unit Testing library, especially, you'll want to unit test it as you go.
However, the actual unit tests for your primary software system should never grow large enough to need unit testing. If they are so complex that they need unit testing, you need some serious refactoring of your software and some attention to simplifying your unit tests.
You might want to take a look at Who tests the tests.
The short answer is that the code tests the tests, and the tests test the code.
Huh?
Testing Atomic Clocks
Let me start with an analogy. Suppose you are travelling with an atomic clock. How would you know that the clock is calibrated correctly?
One way is to ask your neighbor with an atomic clock (because everyone carries one around) and compare the two. If they both report the same time, then you have a high degree of confidence they are both correct. If they are different, then you know one or the other is wrong.
So in this situation, if the only question you are asking is, "Is my clock giving the correct time?", then do you really need a third clock to test the second clock and a fourth clock to test the third? Not at all. Stack Overflow avoided!
IMO it's a tradeoff between how much time you have and how much quality you'd like to have.
If I were using a home-made test harness, I'd test it if time permitted.
If it's a third party tool I'm using, I'd expect the supplier to have tested it.
There really isn't a reason why you couldn't/shouldn't unit test your library. Some parts might be too hard to unit test properly, but most of it probably can be unit tested with no particular problem.
It's actually probably particularly beneficial to unit test this kind of code, since you expect it to be both reliable and reusable.
The tests test the code, and the code tests the tests. When you say the same intention in two different ways (once in tests and once in code), the probability of both of them being wrong is very low (unless the requirements were already wrong). This can be compared to the double-entry bookkeeping used by accountants. See http://butunclebob.com/ArticleS.UncleBob.TheSensitivityProblem
Recently there has been discussion about this same issue in the comments of http://blog.objectmentor.com/articles/2009/01/31/quality-doesnt-matter-that-much-jeff-and-joel
About your question of whether GUI testing libraries should be tested: if I understood right, you are making your own testing library, and you want to know if you should test it. Yes. To be able to rely on the library to report tests correctly, you should have tests which make sure that the library does not report any false positives or false negatives. Regardless of whether the tests are unit tests, integration tests or acceptance tests, there should be at least some tests.
Usually writing unit tests after the code has been written is too late, because then the code tends to be more coupled. The unit tests force the code to be more decoupled, because otherwise small units (a class or a closely related group of classes) can not be tested in isolation.
When the code has already been written, then usually you can add only integration tests and acceptance tests. They will be run with the whole system running, so you can make sure that the features work right, but covering every corner case and execution path is harder than with unit tests.
We generally use these rules of thumb:
1) All product code has both unit tests (arranged to correspond closely with product code classes and functions), and separate functional tests (arranged by user-visible features)
2) Do not write tests for 3rd party code, such as .NET controls, or third party libraries. The exception to this is if you know they contain a bug which you are working around. A regression test for this (which fails when the 3rd party bug disappears) will alert you when upgrades to your 3rd party libraries fix the bug, meaning you can then remove your workarounds.
3) Unit tests and functional tests are not, themselves, ever directly tested - APART from using the TDD procedure of writing the test before the product code, then running the test to watch it fail. If you don't do this, you will be amazed by how easy it is to accidentally write tests which always pass. Ideally, you would then implement your product code one step at a time, and run the tests after each change, in order to see every single assertion in your test fail, then get implemented and start passing. Then you will see the next assertion fail. In this way, your tests DO get tested, but only while the product code is being written.
4) If we factor out code from our unit or functional tests - creating a testing library which is used in many tests, then we do unit test all of this.
This has served us very well. We seem to have always stuck to these rules 100%, and we are very happy with our arrangement.
Kent Beck's book "Test-Driven Development: By Example" has an example of test-driven development of a unit test framework, so it's certainly possible to test your tests.
I haven't worked with GUIs or .NET, but what concerns do you have about your unit tests?
Are you worried that it may describe the target code as incorrect when it is functioning properly? I suppose this is a possibility, but you'd probably be able to detect that if this was happening.
Or are you concerned that it may describe the target code as functioning properly even if it isn't? If you're worried about that, then mutation testing may be what you're after. Mutation testing changes parts of the code being tested, to see if those changes cause any tests to fail. If they don't, then either the code isn't being run, or the results of that code aren't being tested.
If mutation testing software isn't available on your system, then you could do the mutation manually, by sabotaging the target code yourself and seeing if it causes the unit tests to fail.
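A tiny illustration of that manual sabotage, sketched in Java/JUnit (the method and tests are hypothetical; the idea is language-agnostic): flip one comparison in the method under test and re-run the suite. If nothing fails, the boundary isn't really covered.

import static org.junit.Assert.*;
import org.junit.Test;

public class ManualMutationExampleTest {

    // Production method under test (hypothetical).
    static boolean isEligible(int age) {
        return age >= 18;
        // Manual mutation to try: change ">=" to ">" and re-run the suite.
        // If no test fails, the boundary at 18 is not actually covered.
    }

    @Test
    public void exactlyEighteenIsEligible() {
        assertTrue(isEligible(18));   // this is the test that catches the mutation
    }

    @Test
    public void seventeenIsNotEligible() {
        assertFalse(isEligible(17));
    }
}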
If you're building a suite of unit testing products that aren't tied to a particular application, then maybe you should build a trivial application that you can run your test software on and ensure it gets the failures and successes expected.
One problem with mutation testing is that it doesn't ensure that the tests cover all potential scenarios a program may encounter. Instead, it only ensures that the scenarios anticipated by the target code are all tested.
Answer
Yes, your GUI testing libraries should be tested.
For example, if your library provides a CheckGrid method to verify the contents of a grid against a 2-dimensional array, you want to be sure that it works as intended.
Otherwise, your more complex test cases that test business processes in which a grid must receive particular data may be unreliable. If an error in your CheckGrid method produces false negatives, you'll quickly find the problem. However, if it produces false positives, you're in for major headaches down the line.
To test your CheckGrid method:
Populate a grid with known values
Call the CheckGrid method with those same values and verify that it passes
If this case passes, at least one aspect of CheckGrid works.
For the second case, call the CheckGrid method with values that do not match the grid; here you're expecting CheckGrid to report a test failure.
The particulars of how you indicate that expectation will depend on your xUnit framework (see the example below). But basically, if the test failure is not reported by CheckGrid, then the test case itself must fail.
Finally, you may want a few more test cases for special conditions, such as: empty grids, grid size mismatching array size.
You should be able to modify the following dunit example for most frameworks in order to test that CheckGrid correctly detects errors:
var
  LFlagTestFailure: Boolean;
begin
  // Populate TheGrid with known values first
  try
    // Deliberately pass values that do not match the grid contents
    CheckGrid(<incorrect values>, TheGrid);
    // If execution reaches this line, CheckGrid missed the mismatch
    LFlagTestFailure := False;
  except
    on E: ETestFailure do
      LFlagTestFailure := True;  // CheckGrid correctly reported a failure
  end;
  Check(LFlagTestFailure, 'CheckGrid method did not detect errors in grid content');
end;
Let me reiterate: your GUI testing libraries should be tested; and the trick is - how do you do so effectively?
The TDD process recommends that you first figure out how you intend testing a new piece of functionality before you actually implement it. The reason is, that if you don't, you often find yourself scratching your head as to how you're going to verify it works. It is extremely difficult to retrofit test cases onto existing implementations.
Side Note
One thing you said bothers me a little... you said it takes "70% (of your time) to maintain (your tests)"
This sounds a little wrong to me, because ideally your tests should be simple, and should themselves only need to change if your interfaces or rules change.
I may have misunderstood you, but I got the impression that you don't write "production" code. Otherwise you should have more control over the cycle of switching between test code and production code so as to reduce your problem.
Some suggestions:
Watch out for non-deterministic values. For example, dates and artificial keys can play havoc with certain tests. You need a clear strategy of how you'll resolve this. (Another answer on its own.)
You'll need to work closely with the "production developers" to ensure that aspects of the interfaces you're testing can stabilise. I.e. They need to be cognisant of how your tests identify and interact with GUI components so they don't arbitrarily break your tests with changes that "don't affect them".
On the previous point, it would help if automated tests are run whenever they make changes.
You should also be wary of too many tests that simply boil down to arbitrary permutations. For example, if each customer has a category A, B, C, or D; then 4 "New Customer" tests (1 for each category) gives you 3 extra tests that don't really tell you much more than the first one, and are 'hard' to maintain.
Personally, I don't unit test my automation libraries; I run them against a modified version of the baseline to ensure all the checkpoints work. The principle here is that my automation is primarily for regression testing, i.e. checking that the results for the current run are the same as the expected results (typically this equates to the results of the last run). By running the tests against a suitably modified set of expected results, all the tests should fail. If they don't, you have a bug in your test suite. This is a concept borrowed from mutation testing that I find works well for checking GUI automation suites.
From your question, I understand that you are building a keyword-driven framework for performing automation testing. In that case, it is always recommended to do some white-box testing on the common and GUI utility functions. Since you are interested in unit testing each piece of GUI testing functionality in your libraries, please go for it. Testing is always good. It is not a waste of time; I would see it as a 'value-add' to your framework.
You have also mentioned handling test code. If you mean the test approach, group the different functions/modules performing similar work, e.g. GUI element validation (presence), GUI element input, GUI element read. Group them by element type and follow a unit test approach for each group. It will be easier for you to track the testing. Cheers!
I would suggest that testing the tests is a good idea and something that must be done. Just make sure that what you're building to test your app is not more complex than the app itself. As was said before, TDD is a good approach even when building automated functional tests (I personally wouldn't do it like that, but it is a good approach anyway). Unit testing your test code is a good approach as well. IMHO, if you're automating GUI testing, just go ahead with whatever manual tests are available (you should have steps, raw scenarios, expected results and so on) and make sure they pass. Then, for other tests that you might create and that are not already manually scripted, unit test them and follow a TDD approach. (Then, if you have time, you could unit test the other ones.)
Finally, keyword-driven is, IMO, the best approach you could follow because it gives you the most flexibility.
You may want to explore a mutation testing framework (if you work with Java, check out PIT Mutation Testing). Another way to assess the quality of your unit testing is to look at the reports provided by tools such as SonarQube; these reports include various coverage metrics.

Unit Test Conundrum

I'm looking to unit testing as a means of regression testing on a project.
However, my issue is that the project is basically a glorified DIR command -- it performs regular expression tests and MD5 filters on the results, and allows many criteria to be specified, but the entire thing is designed to process input from the system on which it runs.
I'm also a one-man development team, and I question the value of a test written by me for code written by me.
Is unit testing worthwhile in this situation? If so, how might such tests be accomplished?
EDIT: The MD5 and Regex functions aren't provided by me -- they are provided by the Crypto++ library and Boost, respectively. Therefore I don't gain much by testing them. Most of the code I have simply feeds data into the libraries and then prints out the results.
The value of test-after, the way you are asking, can indeed be limited in certain circumstances, but the way to unit test this, from the description, would be to isolate the regular expression tests and MD5 filters into one section of code, and abstract the feeding of the input so that in production it comes from the system, while during the unit test your test class passes that input in.
You then collect a sampling of the different scenarios you intend to support, and feed those in via different unit tests that exercise each scenario.
I think the value of the unit test will come through if you have to change the code to handle new scenarios. You will be confident that the old scenarios don't break as you make changes.
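A sketch of that isolation (the names are invented, and it is written in Java for brevity even though the real project is C++ with Boost and Crypto++; the shape is the same): the filter logic depends on an abstract source of file entries, production code wires in the real directory walker, and the tests feed in canned scenarios.

import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import org.junit.Test;

// Abstracted input: production code would implement this by scanning the
// real file system; tests supply canned entries instead.
interface FileEntrySource {
    List<String> fileNames();
}

// The logic worth unit testing: applying the user's criteria to the entries.
class NameFilter {
    private final FileEntrySource source;
    NameFilter(FileEntrySource source) { this.source = source; }

    List<String> matching(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return source.fileNames().stream()
                     .filter(name -> pattern.matcher(name).matches())
                     .collect(Collectors.toList());
    }
}

public class NameFilterTest {
    @Test
    public void keepsOnlyNamesMatchingTheExpression() {
        FileEntrySource canned =
            () -> Arrays.asList("report.txt", "image.png", "notes.txt");
        assertEquals(Arrays.asList("report.txt", "notes.txt"),
                     new NameFilter(canned).matching(".*\\.txt"));
    }
}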
Is unit testing worthwhile in this situation?
Not necessarily: especially for a one-man team I think it may be sufficient to have automated testing of something larger than a "unit" ... further details at "Should one test internal implementation, or only test public behaviour?"
Unit testing can still provide value in a one-man show. It gives you confidence in the functionality and correctness (at some level) of the module. But some design considerations may be needed to help make testing more applicable to your code. Modularization makes a big difference, especially if combined with some kind of dependency injection, instead of tight coupling. This allows test versions of collaborators to be used for testing a module in isolation. In your case, a mock file system object could return a predictable set of data, so your filtering and criteria code can be evaluated.
The value of regression testing is often not realized until it's automated. Once that's done, things become a lot easier.
That means you have to be able to start from a known position (if you're generating MD5s on files, you have to start with the same files each time). Then get one successful run where you can save the output - that's the baseline.
From that point on, regression testing is simply a push-button job. Start your test, collect the output and compare it to your known baseline (of course, if the output ever changes, you'll need to check it manually, or with another independent script before saving it as the new baseline).
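In practice the push-button step can be as small as this sketch (Java for illustration; the path, helper and baseline contents are all made up): run the tool against the known starting files, capture its output, and compare it with the saved baseline.

import static org.junit.Assert.assertEquals;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.junit.Test;

public class BaselineRegressionTest {

    // Stand-in for "run the program against the known starting files";
    // in reality this would invoke the tool and capture its report.
    private String runToolAgainstKnownInput() {
        return "example.txt d41d8cd98f00b204e9800998ecf8427e\n";
    }

    @Test
    public void outputStillMatchesTheSavedBaseline() throws IOException {
        String current = runToolAgainstKnownInput();
        String baseline = new String(
            Files.readAllBytes(Paths.get("testdata/baseline.txt")),
            StandardCharsets.UTF_8);
        // Any difference means new code changed behaviour somewhere; inspect it
        // manually before promoting the new output to be the baseline.
        assertEquals(baseline, current);
    }
}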
Keep in mind the idea of regression testing is to catch any bugs introduced by new code (i.e., regressing the software). It's not to test the functionality of that new code.
The more you can automate this, the better, even as a one-man development team.
When you were writing the code, did you test it as you went? Those tests could be written into an automated script, so that when you have to make a change in 2 months time, you can re-run all those tests to ensure you haven't regressed something.
In my experience the chance of regression increases sharply depending on how much time goes by after the time you finish version 1 and start coding version 2, because you'll typically forget the subtle nuances of how it works under different conditions - your unit tests are a way of encoding those nuances.
An integration test against the filesystem would be worthwhile. Just make sure it does what it needs to do.
Is unit testing valuable in a one-man shop scenario? Absolutely! There's nothing better than being able to refactor your code with absolute confidence you haven't broken anything.
How do I unit test this? Mock the system calls, and then test the rest of your logic.
I question the value of a test written by me for code written by me
Well, that's true now but in a year it will be you, the one-year-more-experienced developer developing against software written by you-now, the less experienced and knowledgeable developer (by comparison). Won't you want the code written by that less experienced guy (you a year ago) to be properly tested so you can make changes with confidence that nothing has broken?