How to write tests without so many mocks? - unit-testing

I am a heavy advocate of proper Test Driven Development or Behavior Driven Development, and I love writing tests. However, I keep coding myself into a corner where I need 3-5 mocks in a particular test case for a single class. No matter which way I start, top down or bottom up, I end up with a design that requires at least three collaborators from the highest level of abstraction.
Can somebody give good advice on how to avoid this pitfall?
Here's a typical scenario. I design a Widget that produces a Midget from a given text value. It always starts really simple until I get into the details. My Widget must interact with several hard-to-test things like file systems, databases, and the network.
So, instead of designing all that into my Widget I make a Bridget collaborator. The Bridget takes care of one half of the complexity, the database and network, allowing me to focus on the other half which is multimedia presentation. So, then I make a Gidget that performs the multimedia piece. The entire thing needs to happen in the background, so now I include a Thridget to make that happen. When all is said and done I end up with a Widget that hands work to a Thridget which talks over a Bridget to give its result to a Gidget.
Because I'm working in CocoaTouch and trying to avoid mock objects, I use the self-shunt pattern, where abstractions over collaborators become protocols that my test adopts. With 3+ collaborators my test balloons and becomes too complicated. Even using something like OCMock mock objects leaves me with an order of complexity that I'd rather avoid. I tried wrapping my brain around a daisy-chain of collaborators (A delegates to B who delegates to C and so on) but I can't envision it.
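For what it's worth, here is a minimal sketch of the self-shunt idea, written in Java to match the examples below (all names are hypothetical): the test case itself adopts the collaborator's interface, so no separate mock class is needed.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

// Hypothetical collaborator abstraction (a "protocol" in Cocoa terms).
interface MidgetSink {
    void midgetProduced(String midget);
}

class Widget {
    private final MidgetSink sink;
    Widget(MidgetSink sink) { this.sink = sink; }
    void produce(String text) { sink.midgetProduced("midget:" + text); }
}

// Self-shunt: the test itself plays the role of the collaborator.
public class WidgetTest implements MidgetSink {
    private String received;

    public void midgetProduced(String midget) { received = midget; }

    @Test
    public void producesMidgetFromText() {
        new Widget(this).produce("abc");
        assertEquals("midget:abc", received);
    }
}

With one collaborator this stays readable; the complaint above is that with three or more adopted protocols the test class balloons.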
Edit
Taking an example from below, let's assume we have an object that must read from and write to sockets and present the movie data returned.
// Assume myRequest is a String param...
InputStream aIn = aSocket.getInputStream();
OutputStream aOut = aSocket.getOutputStream();
DataProcessor aProcessor = ...;
// This gets broken into a "Network" collaborator.
for (char stuff : myRequest.toCharArray()) aOut.write(stuff);
Object data = aIn.read(); // Simplified read
// This is our second collaborator
aProcessor.process(data);
Now the above obviously has to deal with network latency, so it has to be threaded. This introduces a thread abstraction to get us out of the business of writing threaded unit tests. We now have:
AsynchronousWorker myWorker = getWorker(); // here's our third collaborator
myWorker.doThisWork(new WorkRequest() {
    public void doWork() { // hypothetical callback method on WorkRequest
        // Assume myRequest is a String param...
        DataProcessor aProcessor = ...;
        // Use our "Network" collaborator.
        NetworkHandler networkHandler = getNetworkHandler();
        Object data = networkHandler.retrieveData(); // Simplified read
        // This is our multimedia collaborator
        aProcessor.process(data);
    }
});
Forgive me for working backwards without tests, but I'm about to take my daughter outside and I'm rushing through the example. The idea here is that I'm orchestrating the collaboration of several collaborators from behind a simple interface that will get tied to a UI button click event. So the outermost test reflects a Sprint task that says: given a "Play Movie" button, when it is clicked, the movie will play.
Edit
Let's discuss.

Having many mock objects shows that:
1) You have too many dependencies.
Re-look at your code and try to break it down further. In particular, try to separate data transformation from processing.
I don't have experience in the environment you are developing in, so let me give an example from my own experience.
With a Java socket, you are given an InputStream and an OutputStream simply so that you can read data from and send data to your peer. So your program looks like this:
InputStream aIn = aSocket.getInputStream();
OutputStream aOut = aSocket.getOutputStream();
// Read data
Object data = aIn.read(); // Simplified read
// Process
if (data.equals('1')) {
    // Do something
    // Write data
    aOut.write('A');
} else {
    // Do something else
    // Write another data
    aOut.write('B');
}
If you want to test this method, you end up having to create mocks for aIn and aOut, which may require quite complicated supporting classes behind them.
But if you look carefully, reading from aIn and writing to aOut can be separated from the processing. So you can create another class that takes the input that was read and returns the output object.
public class ProcessSocket {
    public Object process(Object readObject) {
        if (readObject.equals(...)) {
            // Do something
            // Write data
            return 'A';
        } else {
            // Do something else
            // Write another data
            return 'B';
        }
    }
}
and your previous method will be:
InputStream aIn = aSocket.getInputStream();
OutputStream aOut = aSocket.getOutputStream();
ProcessSocket aProcessor = ...;
// Read data
Object data = aIn.read(); // Simplified read
aProcessor.process(data);
This way you can test the processing with little need for mocks. Your test can go:
ProcessSocket aProcessor = ...;
assert(aProcessor.process('1').equals('A'));
Because the processing is now independent of the input, the output, and even the socket.
2) You are over-unit-testing, by unit testing what should be integration tested.
Some tests are not suited to unit testing (in the sense that they require unnecessary extra effort and may not efficiently give a good indicator). Examples of this kind of test are those involving concurrency and user interfaces. They require different ways of testing than unit testing.
My advice would be to break them down further (similar to the technique above) until most parts are unit-test suitable, so that only small hard-to-test parts remain.
EDIT
If you believe you have already broken it into very fine pieces, then perhaps that is your problem.
Software components or sub-components are related to each other in the way that characters combine into words, words into sentences, sentences into paragraphs, paragraphs into subsections, sections, chapters and so on.
My example says you should break subsections down into paragraphs, but you may already have gone all the way down to words.
Look at it this way: most of the time, paragraphs are more loosely related to (or dependent on) other paragraphs than sentences are to other sentences. Subsections and sections are looser still, while words and characters are more tightly dependent (as the grammatical rules kick in).
So perhaps you are breaking it down so finely that the language syntax forces those dependencies on you, and in turn forces you to have so many mock objects.
If that is the case, your solution is to balance the tests. If a part is depended on by many others and requires a complex set of mock objects (or simply more effort to test), maybe you don't need to test it in isolation. For example, if A uses B, and C uses B, and B is very hard to test, why not just test A+B as one unit and C+B as another? In my example, if ProcessSocket is so hard to test - so hard that you would spend more time writing and maintaining the tests than developing the code - then it is not worth it, and I would just test the whole thing at once.
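For example (a hedged sketch with made-up names): if ProcessSocket were hard to isolate, you would test it together with its caller and assert only on the externally visible result, with no mock in between.

// Hypothetical: RequestHandler (A) uses a real ProcessSocket (B) - test A+B as one unit.
public class RequestHandlerTest {
    @Test
    public void handlerAndProcessorTogetherProduceExpectedReply() {
        RequestHandler handler = new RequestHandler(new ProcessSocket()); // real B, no mock
        assertEquals('A', handler.handle('1'));
    }
}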
Without seeing your code (and given that I have never developed for CocoaTouch) it is hard to tell, so I may not be able to provide a good comment here. Sorry :D.
EDIT 2
Seeing your example, it is pretty clear that you are dealing with an integration issue, assuming that you already test playing the movie and the UI separately. It is understandable why you need so many mock objects. If this is the first time you have used this kind of integration structure (this concurrency pattern), then those mock objects may actually be needed and there is nothing much you can do about it. That's all I can say :-p
Hope this helps.

My solution (not CocoaTouch) is to continue to mock the objects, but to refactor the mock setup into a common test method. This reduces the complexity of the tests themselves while retaining the mock infrastructure to test my class in isolation.
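As a rough illustration (JUnit/Mockito style; Widget's constructor and playMovie method are assumptions, the collaborator names come from the question above), the repetitive mock wiring moves into a single setUp method so each test stays small:

import static org.mockito.Mockito.*;
import org.junit.Before;
import org.junit.Test;

public class WidgetTest {
    private NetworkHandler network;
    private DataProcessor processor;
    private AsynchronousWorker worker;
    private Widget widget;

    @Before
    public void setUp() {
        // All the shared mock setup lives here, once, instead of in every test.
        network = mock(NetworkHandler.class);
        processor = mock(DataProcessor.class);
        worker = mock(AsynchronousWorker.class);
        widget = new Widget(network, processor, worker);
    }

    @Test
    public void playMovieHandsWorkToBackgroundWorker() {
        widget.playMovie();
        verify(worker).doThisWork(any(WorkRequest.class));
    }
}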

I do some fairly complete testing, but it's automated integration testing and not unit-testing, so I have no mocks (except the user: I mock the end-user, simulating user-input events and testing/asserting whatever's output to the user): Should one test internal implementation, or only test public behaviour?
What I'm looking for is best practices using TDD.
Wikipedia describes TDD as,
a software development technique that relies on the repetition of a very short development cycle: First the developer writes a failing automated test case that defines a desired improvement or new function, then produces code to pass that test and finally refactors the new code to acceptable standards.
It then goes on to prescribe:
Add a test
Run all tests and see if the new one fails
Write some code
Run the automated tests and see them succeed
Refactor code
I do the first of these, i.e. the "very short development cycle"; the difference in my case is that I test after the code is written.
The reason why I test after it's written is so that I don't need to "write" any tests at all, even the integration tests.
My cycle is something like:
Rerun all automated integration tests (start with a clean slate)
Implement a new feature (with refactoring of the existing code if necessary to support the new feature)
Rerun all automated integration tests (regression testing to ensure that new development hasn't broken existing functionality)
Test the new functionality:
a. End-user (me) does user input via the user interface, intended to exercise the new feature
b. End-user (me) inspects the corresponding program output, to verify whether the output is correct for the given input
When I do the testing in step 4, the test environment captures the user input and program output into a data file; the test environment can replay such a test in the future (recreate the user input, and assert whether the corresponding output is the same as the expected output captured previously). Thus, the test cases which were run/created in step 4 are added to the suite of all automated tests.
I think this gives me the benefits of TDD:
Testing is coupled with development: I test immediately after coding instead of before coding, but in any case the new code is tested before it's checked in; there's never untested code.
I have automated test suites, for regression testing
I avoid some costs/disadvantages:
Writing tests (instead I create new tests using the UI, which is quicker and easier, and closer to the original requirements)
Creating mocks (required for unit testing)
Editing tests when the internal implementation is refactored (because the tests depend only on the public API and not on the internal implementation details).

To get rid of excessive mocking you can follow the Test Pyramid, which suggests having a lot of unit and component tests and a smaller number of slow and fragile system tests. It boils down to a few simple rules:
Write tests at the lowest possible level that doesn't require mocking. If you can write a unit test (e.g. parsing a string), then write it (see the small example after this list). But if you want to check whether the parsing is invoked by the upper layer, that would require initializing more of the stack.
Mock external systems. Your system needs to be a self-contained, independent piece. Relying on external apps (which would have their own bugs) would complicate testing a lot. Writing mocks/stubs is much easier.
After that, have a couple of tests checking your app against the real integrations.
With this mindset you eliminate almost all mocking.
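As a tiny illustration of the first rule (the parser name is invented): pure logic gets a direct unit test with no mocks at all, while the question of whether the upper layer invokes it is left to a component or system test.

@Test
public void parsesDuration() {
    // No collaborators, no mocks: just input in, output out.
    assertEquals(90, DurationParser.parseMinutes("1h30m"));
}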

Related

Unit/integration testing an nHibernate query

Scenario: I need to write a complex nHibernate query that returns a projected DTO, but I want to use a TDD approach. The method would look like this:
public PrintDTO GetUsersForPrinting(int userId)
{
    Session.QueryOver<User>(). // some joins, conditions etc.
    // returns projected DTO
}
Questions:
Since the most common approach is to use an in-memory database for this kind of operation, should I write an integration test?
If I am using an in-memory db, can I write unit tests?
Is one test enough?
Since my integration test will probably check the projection, how should I name it? "GetUserForPrinting_return_correct_DTO" seems too abstract and silly.
I ask because:
There is a lot of abstract information about TDD and integration testing, but when it comes to concrete implementation it is very difficult to apply that information.
TDD suggests that an integration test should be made of unit tests:
This is not really a very good problem to learn TDD with. I assume you don't already know what the complex query looks like, and you want to use test-driven techniques to drive it out. Awesome :)
But let's see if I can answer your questions.
1. Yes.
2. Any test that includes a real db, whether it is in-memory or on-disk, is not a unit test; a unit test would use a mock db.
3. Maybe - if your query is complex enough, then no.
4. testGetUsersForPrinting or getUsersForPrintingTest or similar.
Most probably I would drive out the query in a SQL interpreter, not in code. The aim would be to produce a series of integration tests against an in-memory db based on what I learn during this process.
Start from the minimum possible DTO you can think of, and build up from there.
Finally convert the query into nhibernate calls, then make the integration tests pass.
Test-driven, but not really unit-test-driven.
If you are willing to accept maximum TDD discipline and deal with working more slowly and being more annoyed than usual, you can automate each integration test as you develop it and write code to make it pass. This will mean you are switching frequently among three levels of abstraction / editors / environments (direct SQL queries, integration tests, C# code) - I deal with this by setting up techniques to force myself to follow the right steps each time.
This last bit is why this is not a good problem to learn TDD with. You will need a lot of discipline you probably haven't forced yourself to acquire yet!
Good luck.
OK, some concrete examples. I would modify your code sample to look like this:
public PrintDTO GetUsersForPrinting(int userId, ISession session)
{
    var data = session.QueryOver<User>(). // some joins, conditions etc.
    return data; // or whatever
}
In your unit test you would write
public void testDTO()
{
    // Arrange
    StubSession session = ...; // set up a stub session, which returns hardcoded values
    // Act
    PrintDTO users = GetUsersForPrinting(111, session);
    // Assert
    Assert.That(users.Count, Is.EqualTo(1));
    Assert.That(users[0].UserId, Is.EqualTo(111));
}
In your integration test, you would use a real db, and your session object would actually connect to it, and the queries would be resolved against that db
Arrange-Act-Assert is a standard method for organizing unit tests.
Generally you want as few Asserts as possible in a unit test. And you will have multiple unit tests.
When you are writing a unit test, start by writing the Assert, then fill in the rest to make it compile/get the result you want. Make the test fail first, because then you know you have really delivered something when it passes.
In this example, to implement a stub ISession you would derive a local StubSession class (visible only to the test suite) from ISession and just fill in the absolute minimum to get it to compile, returning the minimum data to get the test to pass.
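Sketched in Java for consistency with the earlier examples (Session here is a stand-in for the real ISession, and the single method is whatever your test actually touches), the stub can be as dumb as:

// Minimal hand-rolled stub: only what the test needs, hardcoded.
class StubSession implements Session {
    public PrintDTO queryUsersForPrinting(int userId) {
        PrintDTO dto = new PrintDTO();
        dto.userId = userId; // hardcoded minimum to make the test pass
        return dto;
    }
}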
To build up to your whole DTO - assuming you know what you want in your DTO - proceed, as you say in the comments, incrementally. Build up each part of your DTO a piece at a time, adding a unit test for each piece.
Keeping track of this is another piece of TDD discipline.
Set yourself up with a TODO list - just a simple text file, or possibly a lengthy comment at the start of your test suite. List all the things you want to test, e.g. zero results, one result, two results, 20 results; user id; whatever other pieces of information you need to have.
If you are doing a complex query across tables, add a todo item for each join, each part of the where clause, etc.
Add items for ordering and paging etc if you are using those.
Pick the simplest things first. Only do one small thing (in a single red-green-refactor cycle) at a time. As you work through your list, you might want to break items up into smaller pieces, or you might think of additional things you need to do. Add them to the TODO list rather than working directly on them.
In this particular case I would swap - after each red-green-refactor cycle - into the SQL environment and/or the sqlite integration test to work out how to make the next piece work. I guess this is a sort of step between red and green - choose what you will test next, write the test (which fails obviously), fiddle around in SQL until you know how to make it pass, write the nHibernate calls to make your test green, then refactor.
Be aware that some of the things you list might turn out not to be necessary, or take too long, etc. It's good to write them down anyway, so you know what you are not doing as well as what you are doing. Keep focused on your goal.
I tend to also develop a list of "smells" and/or refactorings that I can see I will want to do but am not quite ready for this cycle. Remember to minimise duplication/refactor your tests as well as your SUT (System Under Test).
It's a doing thing rather than a seeing thing. The list of unit tests you end up with, and the code they exercise, is not a very good description of the journey. Kent Beck's original TDD book is slim and will give you some good overall pointers, though not really about constructing queries.
Does any of that help?
Since the most common approach is to use an in-memory database for this kind of operation, should I write an integration test?
Using an in-memory database is still an integration test (because it actually tests whether your query generates correct SQL and executes it against a database).
If I am using an in-memory db, can I write unit tests?
No, it would be an integration test.
Is one test enough?
Probably not; you should check each condition of your query, for example one test per where clause, one for paging and one for sorting, if applicable.
Since my integration test will probably check the projection, how should I name it? "GetUserForPrinting_return_correct_DTO" seems too abstract and silly.
GivenUserForPrinting_WhenGetUserForPrinting_ThenMapToDTO would be a better name.

unit test via output sanity checks

I have often seen tests where canned inputs are fed into a program and the generated outputs are checked against canned (expected) outputs, usually via diff. If the diff is accepted, the code is deemed to pass the test.
Questions:
1) Is this an acceptable unit test?
2) Usually the unit test inputs are read in from the file system and are big XML files (maybe they represent a very large system). Are unit tests supposed to touch the file system? Or would a unit test create a small input on the fly and feed that to the code to be tested?
3) How can one refactor existing code to be unit testable?
Output differences
If your requirement is to produce output with a certain degree of accuracy, then such tests are absolutely fine. It's you who makes the final decision: "Is this output good enough, or not?"
Talking to the file system
You don't want your tests to talk to the file system in the sense of relying on files existing somewhere in order for your tests to work (for example, reading values from configuration files). It's a bit different with test input resources - you can usually embed them in your tests (or at least your test project), treat them as part of the codebase, and on top of that they usually should be loaded before the test executes. For example, when testing rather large XMLs it's reasonable to have them stored as separate files, rather than as strings in code files (which sometimes can be done instead).
The point is, you want to keep your tests isolated and repeatable. If you can achieve that with a file being loaded at runtime, it's probably fine. However, it's still better to have the file as part of the codebase/resources than as a random system file lying somewhere.
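In Java, for instance, that usually means shipping the XML with the test sources and loading it from the classpath rather than from an absolute path (the resource name here is a placeholder):

// Fixture lives in src/test/resources and is versioned with the tests.
InputStream fixture = getClass().getResourceAsStream("/fixtures/large-input.xml");
assertNotNull("fixture must ship with the test project", fixture);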
Refactoring
This question is fairly broad, but to put you in the right direction - you want to introduce more solid design, decouple objects and separate responsibilities. Better design will make testing easier and, what's most important - possible. Like I said, it's broad and complex topic, with entire books dedicated to it.
1) is this an acceptable unit test?
This is not a unit test by definition. A unit test focuses on the smallest possible amount of code. Your test can still be a useful test, regression test, self-documenting test, TDD test, etc. It is just not a unit test, although it may be equally useful.
2) Are unit tests supposed to touch the file system?
Typically not, unless you need to unit test something explicitly related to the file system. One reason is that if you have several hundred unit tests it is nice to have them run in a couple of seconds rather than minutes.
3) How can one refactor existing code to be unit testable?
A better question is why do you want the code to be unit testable? If you are trying to learn TDD it is better to start with a new project. If you have bugs then try to write tests for the bugs. If the design is slowing you down then you can refactor towards testability over time.
Addressing only the 3rd question: it is extremely difficult. You really need to write tests at the same time you write the code, or before. It is a nightmare to try to slap tests onto an existing code base, and it is often more productive to throw away the code and start over.
This is an acceptable unit test.
The files being read should be part of the test project so anyone that checks out the project from repository will have the same files at the same relative location.
Having black box tests is a great start, you can refactor the existing code and use the current tests to verify that it is still working (depending on the quality of the tests). Here is a short blog about refactoring for testability: http://www.beletsky.net/2011/02/refactoring-to-testability.html
A diff test can be acceptable as a unit test, especially when you're using test data that is shared between unit tests.
If you don't know how many items there are in the SUT you could use the following:
int itemsBeforeTest = SUT.Items.Count;
SUT.AddItem();
Assert.AreEqual(itemsBeforeTest + 1, SUT.Items.Count);
If a unit test requires so much data that it needs to be read from a big XML file, it's not a real unit test. A unit test should test a class in complete isolation and mock out all dependencies.
Using a pattern like the Builder pattern can also help in creating test data for your unit tests. The biggest problem with having your test data in a separate file is that it's hard to understand what the test does exactly. If you create your test data in the arrange part of your unit test, it's immediately clear what is important for the test.
For example, let's say you have the following arrange code to test if the price of an invoice is correct:
Address billingAddress = new Address("Stationsweg 9F", "Groningen", "Nederland", "9726AE");
Address shippingAddress = new Address("Aweg 1", "Groningen", "Nederland", "9726AB");
Customer customer = new Customer(99, "Piet", "Klaassens",
                                 30,
                                 billingAddress,
                                 shippingAddress);
Product product = new Product(88, "Tafel", 19.99);
Invoice invoice = new Invoice(customer);
Can be changed to the following when using a Builder
Invoice invoice = Builder<Invoice>.CreateNew()
    .With(i => i.Product = Builder<Product>.CreateNew()
        .With(p => p.Price = 19.99)
        .Build())
    .Build();
When using a Builder it's much easier to see what is important, and your code is also more maintainable.
Refactoring code to become more testable is a broad topic. It comes down to thinking about 'how would I test this code?' while you are writing the code.
Take the following example:
public class AlarmClock
{
    public AlarmClock()
    {
        SatelliteSyncService = new SatelliteSyncService();
        HardwareClient = new HardwareClient();
    }
}
This is hard to test. You need to make sure that both the SatelliteSyncService and the HardwareClient are functional when testing the AlarmClock.
This change to the constructor makes it much easier to test:
public AlarmClock(IHardwareClient hardwareClient, ISatelliteSyncService satelliteSyncService)
{
    SatelliteSyncService = satelliteSyncService;
    HardwareClient = hardwareClient;
}
Techniques like Dependency Injection help with refactoring your code to be more testable. Also watch out for static values like DateTime.Now or the use of a Singleton because they are hard to test.
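For example, the DateTime.Now problem goes away once the clock is injected; a Java sketch using java.time.Clock (the AlarmClock method shown is illustrative):

import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

class AlarmClock {
    private final Clock clock; // injected, so tests control "now"

    AlarmClock(Clock clock) { this.clock = clock; }

    boolean isTimeToRing(Instant alarmAt) {
        return !Instant.now(clock).isBefore(alarmAt);
    }
}

// In a test, Clock.fixed(...) pins the current time to a known instant:
Clock fixed = Clock.fixed(Instant.parse("2020-01-01T07:00:00Z"), ZoneOffset.UTC);
assert new AlarmClock(fixed).isTimeToRing(Instant.parse("2020-01-01T07:00:00Z"));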
A really good introduction to writing testable code can be found here.
You should not require the code to be refactored to be able to perform unit tests. Unit tests, as the name implies, are testing a unit of code for the system. The best unit tests are small, quick to execute and exercise only a very small subset of the piece of code being tested (e.g. class).
The reason for having small, compact unit tests that only exercise one part of the code is that the objective of unit tests is to find bugs in that unit of code. If the unit test takes a long time to execute and tests lots of things it makes the identification of a bug in the code that much harder.
As to accessing the file system, I see no problem. Some unit tests may require a database to be constructed before the test is carried out, or output to be checked in ways that would be difficult or time-expensive to express as checks in code.
The files for unit testing should be treated like the rest of the code - put under version control. If you are paranoid you could implement a check within the unit test, such as performing an MD5 on the file and checking it against a hard-coded value, so future reruns of the test can verify that the test data has not inadvertently changed.
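Such a guard might look like this in Java (the path and the expected hash are placeholders):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

byte[] bytes = Files.readAllBytes(Paths.get("src/test/resources/fixture.xml")); // placeholder path
byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
StringBuilder hex = new StringBuilder();
for (byte b : digest) hex.append(String.format("%02x", b));
assertEquals("3858f62230ac3c915f300c664312c63f", hex.toString()); // placeholder known-good hash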
Just my humble thoughts.

What do you do to test methods that produce complicated object graphs?

I'm a controls developer and a relative newbie to unit testing. Almost daily, I fight the attitude that you cannot test controls because of the UI interaction. I'm producing a demonstration control to show that it's possible to dramatically reduce manual testing if the control is designed to be testable. Currently I've got 50% logic coverage, but I think I could bump that up to 75% or higher if I could find a way to test some of the more complicated parts.
For example, I have a class with properties that describe the control's state and a method that generates a WPF PathGeometry object made of several segments. The implementation looks something like this:
internal PathGeometry CreateOuterGeometry()
{
    double arcRadius = OuterCoordinates.Radius;
    double sweepAngle = OuterCoordinates.SweepAngle;
    ArcSegment outerArc = new ArcSegment(...);
    LineSegment arcEndToCenter = new LineSegment(...);
    PathFigure fig = new PathFigure();
    // configure figure and add segments...
    PathGeometry outerGeometry = new PathGeometry();
    outerGeometry.Figures.Add(fig);
    return outerGeometry;
}
I've got a few other methods like this that account for a few hundred blocks of uncovered code, an extra 25% coverage. I originally planned to test these methods, but rejected the notion. I'm still a unit testing newbie, and the only way I could think of to test the code would be with several methods like this:
void CreateOuterGeometry_AngleIsSmall_ArcSegmentIsCorrect()
{
    ClassUnderTest classUnderTest = new ClassUnderTest();
    // configure the class under test...
    ArcSegment expectedArc = ...; // generate expected arc
    PathGeometry geometry = classUnderTest.CreateOuterGeometry();
    ArcSegment arc = (ArcSegment)geometry.Figures[0].Segments[0];
    Assert.AreEqual(expectedArc, arc);
}
The test itself looks fine; I'd write one for each expected segment. But I had some problems:
Do I need tests to verify "Is the first segment an ArcSegment?" In theory the test tests this, but shouldn't each test only test one thing? This sounds like two things.
The control has at least six cases for calculation and four edge cases; this means for each method I need at least ten tests.
During development I changed how the various geometries were generated several times. This would cause me to have to rewrite all of the tests.
The first problem gave me pause because it seemed like it might inflate the number of tests. I thought I might have to test things like "Were there x segments?" and "Is segment n the right type?", but now that I've thought more I see that there's no branching logic in the method so I only need to do those tests once. The second problem made me more confident that there would be much effort associated with the test. It seems unavoidable. The third problem compounds the first two. Every time I changed the way the geometry was calculated, I'd have to edit an estimated 40 tests to make them respect the new logic. This would also include adding or removing tests if segments were added or removed.
Because of these three problems, I opted to write an application and manual test plan that puts the control in all of the interesting states and asks the user to verify it looks a particular way. Was this wrong? Am I overestimating the effort involved with writing the unit tests? Is there an alternative way to test this that might be easier? (I'm currently studying mocks and stubs; it seems like it'd require some refactoring of the design and end up being approximately as much effort.)
Use dependency injection and mocks.
Create interfaces for ArcSegmentFactory, LineSegmentFactory, etc., and pass a mock factory to your class. This way, you'll isolate the logic that is specific to this object (this should make testing easier), and won't be depending on the logic of your other objects.
About what to test: you should test what's important. You probably have a timeline in which you want to have things done, and you probably won't be able to test every single thing. Prioritize stuff you need to test, and test in order of priority (considering how much time it will take to test). Also, when you've already made some tests, it gets much easier to create new tests for other stuff, and I don't really see a problem in creating multiple tests for the same class...
About the changes: that's what tests are for - allowing you to change things without fearing that your change will bring chaos to the world.
You might try writing a control generation tool that generates random control graphs, and test those. This might yield some data points that you might not have thought of.
In our project, we use JUnit to perform tests which are not, strictly speaking, unit tests. We find, for example, that it's helpful to hook up a blank database and compare an automatic schema generated by Hibernate (an Object-Relational Mapping tool) to the actual schema for our test database; this helps us catch a lot of issues with wrong database mappings. But in general... you should only be testing one method, on one class, in a given test method. That doesn't mean you can't do multiple assertions against it to examine various properties of the object.
My approach is to convert the graph into a string (one segment per line) and compare this string to an expected result.
If you change something in your code, tests will start to fail but all you need to do is to check that the failures are in the right places. Your IDE should offer a side-by-side diff for this.
When you're confident that the new output is correct, just copy it over the old expected result. This will make sure that a mistake won't go unnoticed (at least not for long), the tests will still be simple and they are quick to fix.
Next, if you have common path parts, you can put them into individual strings and build the expected result of a test from those parts. This allows you to avoid repeating yourself (and if the common part changes, you only have to update a single place for all tests).
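A rough Java sketch of the idea (the formatting helper and segment layout are invented for illustration):

// Collapse the geometry to one segment per line...
String actual = describe(classUnderTest.createOuterGeometry()); // hypothetical helper
// ...and compare against a golden string built from reusable parts.
String commonArc = "Arc  r=10.0 sweep=45.0\n";
String expected = commonArc + "Line to=(0.0,0.0)\n";
assertEquals(expected, actual); // on failure, the IDE's diff shows exactly which line changed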
If I understand your example correctly, you were trying to find a way to test whether a whole bunch of draw operations produce a given result.
Instead of human eyes, you could have produced a set of expected images (a snapshot of verified "good" images), and created unit tests which use the draw operations to create the same set of images and compare the result with an image comparison. This would allow you to automate the testing of the graphic operations, which is what I understand your problem to be.
The textbook way to do this would be to move all the business logic to libraries or controllers which are called by a 1 line method in the GUI. That way you can unit test the controller or library without dealing with the GUI.

How far should I go with unit testing?

I'm trying to unit test in a personal PHP project like a good little programmer, and I'd like to do it correctly. From what I hear what you're supposed to test is just the public interface of a method, but I was wondering if that would still apply to below.
I have a method that generates a password reset token in the event that a user forgets his or her password. The method returns one of two things: nothing (null) if everything worked fine, or an error code signifying that the user with the specified username doesn't exist.
If I'm only testing the public interface, how can I be sure that the password reset token IS going in the database if the username is valid, and ISN'T going in the database if the username is NOT valid? Should I do queries in my tests to validate this? Or should I just kind of assume that my logic is sound?
Now this method is very simple and this isn't that big of a deal - the problem is that this same situation applies to many other methods. What do you do in database centric unit tests?
Code, for reference if needed:
public function generatePasswordReset($username)
{
    $this->sql = 'SELECT id
                  FROM users
                  WHERE username = :username';
    $this->addParam(':username', $username);
    $user = $this->query()->fetch();

    if (!$user)
        return self::$E_USER_DOESNT_EXIST;
    else
    {
        $code = md5(uniqid());
        $this->addParams(array(':uid'      => $user['id'],
                               ':code'     => $code,
                               ':duration' => 24 // in hours, how long reset is valid
                              ));
        // generate new code, delete old one if present
        $this->sql  = 'DELETE FROM password_resets WHERE user_id=:uid;';
        $this->sql .= "INSERT INTO password_resets (user_id, code, expires)
                       VALUES (:uid, :code, now() + interval ':duration hours')";
        $this->execute();
    }
}
The great thing about unit testing, for me at least, is that it shows you where you need to refactor. Using your sample code above, you've basically got five things happening in one method:
//1. get the user from the DB
//2. in a big else, check if user is null
//3. create a array containing the userID, a code, and expiry
//4. delete any existing password resets
//5. create a new password reset
Unit testing is also great because it helps highlight dependencies. This method, as shown above, is dependent on a DB, rather than an object that implements an interface. This method interacts with systems outside its scope, and really could only be tested with an integration test, rather than a unit test. Unit tests are for ensuring the working/correctness of a unit of work.
Consider the Single Responsibility Principle: "Do one thing". It applies to methods as well as classes.
I'd suggest that your generatePasswordReset method should be refactored (see the sketch after this list) to:
be given a pre-defined existing user object/id. Do all those sanity checks outside of this method. Do one thing.
put the password reset code into its own method. That would be a single unit of work that could be tested independently of the SELECT, DELETE and INSERT.
Make a new method that could be called OverwriteExistingPwdChangeRequests() which would take care of the DELETE + INSERT.
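A sketch of that split (written Java-style for consistency with the earlier examples; the PHP original would take the same shape, and md5Hex/db are stand-in helpers):

// Each piece now does one thing and can be tested on its own.
String generateResetCode() {
    return md5Hex(UUID.randomUUID().toString()); // pure function of no external state
}

void overwriteExistingPwdChangeRequests(int userId, String code, int durationHours) {
    db.update("DELETE FROM password_resets WHERE user_id = ?", userId);
    db.update("INSERT INTO password_resets (user_id, code, expires) VALUES (?, ?, now() + ? * interval '1 hour')",
              userId, code, durationHours);
}

void generatePasswordReset(User user) { // caller has already resolved and validated the user
    overwriteExistingPwdChangeRequests(user.getId(), generateResetCode(), 24);
}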
The reason this function is more difficult to unit test is that the database update is a side effect of the function (i.e. there is no explicit return for you to test).
One way of dealing with state updates on remote objects like this is to create a mock object that provides the same interface as the DB (i.e. it looks identical from the perspective of your code). Then in your test you can check the state changes within this mock object and confirm you have received what you should.
You can break it down some more; that function is doing a lot, which makes testing it a bit tricky - not impossible, but tricky. If on the other hand you pulled out some smaller extra functions (getUserByUsername, deletePasswordByUserID, addPasswordByUserId, etc.), then you could test those easily enough once and know they work, so you don't have to test them again. This way you test the lower-down calls, making sure they are sound, so you don't have to worry about them further up the chain. Then for this function all you need to do is throw it a user that does not exist and make sure it comes back with a USER_DOESNT_EXIST error, then one where a user does exist (this is where your test DB comes in). The inner workings have already been exercised elsewhere (hopefully).
Unit tests serve the purpose of verifying that a unit works. If you care to know whether a unit works or not, write a test. It's that simple. Choosing to write a unit test or not shouldn't be based on some chart or rule of thumb. As a professional it's your responsibility to deliver working code, and you can't know if it's working or not unless you test it.
Now, that doesn't mean you write a test for each and every line of code. Nor does it necessarily mean you write a unit test for every single function. Deciding to test or not test a particular unit of work boils down to risk. How willing are you to risk that your piece of untested code gets deployed?
If you're asking yourself "how do I know if this functionality works", the answer is "you don't, until you have repeatable tests that prove it works".
In general one might "mock" the object you are invoking, verifying that it receives the expected requests.
In this case I'm not sure how helpful that is; you almost end up writing the same logic twice... we thought we sent "DELETE from password" etc. Oh look, we did!
Hmmm, what did we actually check? If the string was badly formed, we wouldn't know!
It may be against the letter of Unit testing law, but I would instead test these side-effects by doing separate queries against the database.
Testing the public interface is necessary, but not sufficient. There are many philosophies on how much testing is required, and I can only give my own opinion. Test everything. Literally. You should have a test that verifies that each line of code has been exercised by the test suite. (I only say 'each line' because I'm thinking of C and gcov, and gcov provides line-level granularity. If you have a tool that has finer resolution, use it.) If you can add a chunk of code to your code base without adding a test, the test suite should fail.
Databases are global variables. Global variables are public interfaces for every unit that uses them. Your test cases must therefore vary inputs not only on the function parameter, but also the database inputs.
If your unit tests have side effects (like changing a database) then they have become integration tests. There is nothing wrong in itself with integration tests; any automated testing is good for the quality of your product. But integration tests have a higher maintenance cost because they are more complex and easier to break.
The trick is therefore to minimize the code that can only be tested with side effects. Isolate and hide the SQL queries in a separate MyDatabase class which does not contain any business logic. Pass an instance of this object to your business logic code.
Then when you unit test your business logic, you can substitute the MyDatabase object with a mock instance which is not connected to a real database, and which can be used to verify that your business logic code uses the database correctly.
See the documentation of SimpleTest (a php mocking framework) for an example.
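Framework aside, the shape of such a test is roughly as follows (a Java-flavored sketch; FakeDatabase, MyDatabase and PasswordService are illustrative names):

// Fake database: serves canned query results and records writes.
class FakeDatabase extends MyDatabase {
    Map<String, Integer> userIds = new HashMap<>();
    List<String> statements = new ArrayList<>();

    Integer findUserId(String username) { return userIds.get(username); }
    void execute(String sql) { statements.add(sql); }
}

@Test
public void resetStoresTokenForKnownUser() {
    FakeDatabase db = new FakeDatabase();
    db.userIds.put("alice", 7);
    new PasswordService(db).generatePasswordReset("alice");
    assertTrue(db.statements.stream().anyMatch(s -> s.startsWith("INSERT INTO password_resets")));
}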

How to best do unit testing for a web application

I am writing a web application that is very complex in terms of UI and relies heavily on AJAX, DOM and image manipulations.
Is there any standard practice (I don't want tools) which can be followed to reduce the bugs?
A very simple technique is the smoke test, where you automate a click-through of all of your application. If the script runs and there are no errors anywhere in the logs, you have at least one defined level of quality.
This technique actually catches a fair number of regression breaks and is much more effective than it sounds. We use Selenium for this.
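A bare-bones version of such a scripted click-through with Selenium WebDriver (Java; the URL and element id are placeholders):

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

WebDriver driver = new FirefoxDriver();
try {
    driver.get("http://localhost:8080/app"); // placeholder URL
    driver.findElement(By.id("play-movie")).click(); // placeholder element id
    // ...click through the rest of the application...
    // afterwards, check the application logs: no errors anywhere = the smoke test passed
} finally {
    driver.quit();
}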
Separate the logic and the UI portions - do not put all your business logic and complex code in the code-behind page. Instead, build it on the standard tier structure (data layer, business rules/logic layer, UI layer). This ensures that the logic code you want to test does not reference the form, but instead uses classes that are easily unit tested.
For a very basic example, don't have code that does this:
string str = TextBox1.Text.ToString();
//do whatever your code does
TextBox2.Text = str;
Instead extract the logic into a separate class with a method:
TextBox2.Text = new Work().DoWork(TextBox1.Text);
public class Work
{
    public string DoWork(string str)
    {
        string str2 = ...; // do work
        return str2;
    }
}
This way you can write unit tests to verify that DoWork is returning the correct values:
string result = new Work().DoWork("TestThisString");
Now all of your logic is unit testable, with only code that HAS to reference the page directly still in your UI layer.
WatiN is a great tool for this.
A simple checklist (even on a piece of paper!) is the best way to make sure you never skip the important things. It's a good "smoke test" that nothing "standard" has been broken.