I'm trying to unit test a personal PHP project like a good little programmer, and I'd like to do it correctly. From what I hear, you're supposed to test just the public interface of a method, but I was wondering whether that still applies to the case below.
I have a method that generates a password reset token in the event that a user forgets his or her password. The method returns one of two things: nothing (null) if everything worked fine, or an error code signifying that the user with the specified username doesn't exist.
If I'm only testing the public interface, how can I be sure that the password reset token IS going in the database if the username is valid, and ISN'T going in the database if the username is NOT valid? Should I do queries in my tests to validate this? Or should I just kind of assume that my logic is sound?
Now, this method is very simple, and this isn't that big of a deal; the problem is that the same situation applies to many other methods. What do you do in database-centric unit tests?
Code, for reference if needed:
public function generatePasswordReset($username)
{
    $this->sql = 'SELECT id
                  FROM users
                  WHERE username = :username';
    $this->addParam(':username', $username);
    $user = $this->query()->fetch();

    if (!$user)
        return self::$E_USER_DOESNT_EXIST;
    else
    {
        $code = md5(uniqid());
        $this->addParams(array(':uid'      => $user['id'],
                               ':code'     => $code,
                               ':duration' => 24 // in hours, how long the reset is valid
                         ));

        // generate new code, delete old one if present
        $this->sql  = 'DELETE FROM password_resets WHERE user_id = :uid;';
        $this->sql .= "INSERT INTO password_resets (user_id, code, expires)
                       VALUES (:uid, :code, now() + :duration * interval '1 hour')";
        $this->execute();
    }
}
The great thing about unit testing, for me at least, is that it shows you where you need to refactor. Using your sample code above, you've basically got five things happening in one method:
//1. get the user from the DB
//2. in a big else, check if user is null
//3. create a array containing the userID, a code, and expiry
//4. delete any existing password resets
//5. create a new password reset
Unit testing is also great because it helps highlight dependencies. This method, as shown above, is dependent on a DB, rather than an object that implements an interface. This method interacts with systems outside its scope, and really could only be tested with an integration test, rather than a unit test. Unit tests are for ensuring the working/correctness of a unit of work.
Consider the Single Responsibility Principle: "Do one thing". It applies to methods as well as classes.
I'd suggest that your generatePasswordReset method should be refactored to:
be given a pre-defined existing user object/id. Do all those sanity checks outside of this method. Do one thing.
put the password reset code into its own method. That would be a single unit of work that could be tested independently of the SELECT, DELETE and INSERT.
make a new method, perhaps called OverwriteExistingPwdChangeRequests(), which would take care of the DELETE + INSERT (a rough sketch follows).
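A rough sketch of that refactoring, reusing the question's own wrapper API (the method names are suggestions, not an established convention):

public function generatePasswordReset($userId)
{
    // Caller has already verified the user exists; do one thing here.
    $code = md5(uniqid());
    $this->overwriteExistingPwdChangeRequests($userId, $code, 24);
    return $code;
}

private function overwriteExistingPwdChangeRequests($userId, $code, $hoursValid)
{
    // Delete any previous reset, then insert the new one.
    $this->addParams(array(':uid' => $userId, ':code' => $code, ':duration' => $hoursValid));
    $this->sql  = 'DELETE FROM password_resets WHERE user_id = :uid;';
    $this->sql .= "INSERT INTO password_resets (user_id, code, expires)
                   VALUES (:uid, :code, now() + :duration * interval '1 hour')";
    $this->execute();
}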
The reason this function is more difficult to unit test is because the database update is a side-effect of the function (i.e. there is no explicit return for you to test).
One way of dealing with state updates on remote objects like this is to create a mock object that provides the same interface as the DB (i.e. it looks identical from the perspective of your code). Then in your test you can check the state changes within this mock object and confirm you have received what you should.
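For instance, a minimal hand-rolled fake along these lines (all class names here are hypothetical, standing in for whatever your DB wrapper actually looks like):

class FakeStatement
{
    private $row;
    public function __construct($row) { $this->row = $row; }
    public function fetch() { return $this->row; }
}

class FakeDb extends UserRepository // hypothetical class under test
{
    public $executedSql = array();
    public $fakeUser = array('id' => 42); // canned SELECT result

    protected function query()
    {
        return new FakeStatement($this->fakeUser);
    }

    protected function execute()
    {
        $this->executedSql[] = $this->sql; // record instead of touching a DB
    }
}

A test can then call generatePasswordReset('bob') and assert that $executedSql contains both the DELETE and the INSERT, or that it stays empty when $fakeUser is set to false.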
You can break it down some more; that function is doing a lot, which makes testing it tricky - not impossible, but tricky. If, on the other hand, you pulled out some smaller functions (getUserByUsername, deletePasswordByUserID, addPasswordByUserId, etc.), then you could test those easily enough once and know they work, so you don't have to test them again. This way you test the lower-level calls, making sure they are sound, so you don't have to worry about them further up the chain. Then for this function all you need to do is throw it a user that does not exist and make sure it comes back with a USER_DOESNT_EXIST error, then one where a user does exist (this is where your test DB comes in). The inner workings have already been exercised elsewhere (hopefully).
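With the lookup pulled out, the error path becomes trivial to unit test. A PHPUnit sketch, assuming a hypothetical getUserByUsername() on a class named UserRepository (both names are illustrative):

public function testGeneratePasswordResetRejectsUnknownUser()
{
    // Stub the lookup so no real database is involved.
    $repo = $this->getMockBuilder(UserRepository::class)
                 ->onlyMethods(array('getUserByUsername'))
                 ->getMock();
    $repo->method('getUserByUsername')->willReturn(false);

    $this->assertSame(
        UserRepository::$E_USER_DOESNT_EXIST,
        $repo->generatePasswordReset('no_such_user')
    );
}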
Unit tests serve the purpose of verifying that a unit works. If you care to know whether a unit works or not, write a test. It's that simple. Choosing to write a unit test or not shouldn't be based on some chart or rule of thumb. As a professional it's your responsibility to deliver working code, and you can't know if it's working or not unless you test it.
Now, that doesn't mean you write a test for each and every line of code. Nor does it necessarily mean you write a unit test for every single function. Deciding to test or not test a particular unit of work boils down to risk. How willing are you to risk that your piece of untested code gets deployed?
If you're asking yourself "how do I know if this functionality works", the answer is "you don't, until you have repeatable tests that prove it works".
In general one might "mock" the object you are invoking, verifying that it receives the expected requests.
In this case I'm not sure how helpful that is; you almost end up writing the same logic twice ... we thought we sent "DELETE from password" etc. Oh look, we did!
Hmmm, what did we actually check? If the string was badly formed, we wouldn't know!
It may be against the letter of Unit testing law, but I would instead test these side-effects by doing separate queries against the database.
Testing the public interface is necessary, but not sufficient. There are many philosophies on how much testing is required, and I can only give my own opinion. Test everything. Literally. You should have a test that verifies that each line of code has been exercised by the test suite. (I only say 'each line' because I'm thinking of C and gcov, and gcov provides line-level granularity. If you have a tool that has finer resolution, use it.) If you can add a chunk of code to your code base without adding a test, the test suite should fail.
Databases are global variables. Global variables are public interfaces for every unit that uses them. Your test cases must therefore vary inputs not only on the function parameter, but also the database inputs.
If your unit tests have side-effects (like changing a database) then they have become integration tests. There is nothing wrong in itself with integration tests; any automated testing is good for the quality of your product. But integration tests have a higher maintenance cost because they are more complex and are easier to break.
The trick is therefore to minimize the code that can only be tested with side effects. Isolate and hide the SQL queries in a separate MyDatabase class which does not contain any business logic. Pass an instance of this object to your business logic code.
Then when you unit test your business logic, you can substitute the MyDatabase object with a mock instance which is not connected to a real database, and which can be used to verify that your business logic code uses the database correctly.
See the documentation of SimpleTest (a php mocking framework) for an example.
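A rough sketch of that seam (the interface and class names are invented for illustration):

interface Database
{
    public function findUserIdByUsername($username);
    public function replacePasswordReset($userId, $code, $hoursValid);
}

class PasswordService
{
    public static $E_USER_DOESNT_EXIST = 1; // illustrative error code
    private $db;

    public function __construct(Database $db)
    {
        $this->db = $db; // real MyDatabase in production, a mock in tests
    }

    public function generatePasswordReset($username)
    {
        $userId = $this->db->findUserIdByUsername($username);
        if (!$userId) {
            return self::$E_USER_DOESNT_EXIST;
        }
        $this->db->replacePasswordReset($userId, md5(uniqid()), 24);
    }
}

The unit test hands PasswordService a mock Database and simply asserts that replacePasswordReset() is (or isn't) called; the SQL inside the real database class is covered separately by a handful of integration tests.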
I have a web application which currently exists as a single large class containing a front controller, and a bootstrap file which runs the class and passes in the settings.
Over time, the class has become over-large, with multiple concerns, that I have long wanted to refactor into what will be rather obvious subclasses. I'm very familiar with all kinds of refactorings.
However, for this application, at present there is no test coverage. Because all interactions are done by reading GET and POST parameters, there is effectively only one public entry point into the system: URL interactions through the front controller. So the class is hard to unit test (I will be using PHPUnit) as it stands.
Obviously it is far safer to refactor with tests already in place. So I would welcome views on which strategy is best:
1) Create a pile of tests that implement GET and POST interactions as per a user using the web application; or
2) Create a pile of tests against the private functions by using the ReflectionClass workaround, and convert these to standard PHPUnit tests concurrently with the refactoring; or
3) Add unit testing after doing the subclass refactoring, testing the public entry API points to the new subclasses.
Whichever way you approach it, I would do it one piece of functionality at a time if you can; it makes the task less daunting and therefore more achievable...
When confronted with such situations in the past (quite a few times in fact), I usually do this:
First of all, does the application use external dependencies like databases? The first thing is that we need to make these predictable. I usually try to find where the creation of these external dependencies occur and then delegate it to a good Dependency Injection framework (like Pimple).
Once that is done you can add rules to the DI factory methods that return test databases depending on some condition: I usually use a test domain for this, so if the site is foo.local then I also have a domain test.foo.local that sets (in the Apache config, for example) an environment variable APPLICATION_ENV to 'tests'. The DI container can then inspect this environment variable to serve the right dependency, in our case a connection pointing to a test database.
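A minimal sketch of that wiring with Pimple (the DSNs and credentials are placeholders):

use Pimple\Container;

$container = new Container();
$container['db'] = function ($c) {
    // Serve a test database when the environment says we are under test.
    if (getenv('APPLICATION_ENV') === 'tests') {
        return new PDO('mysql:host=localhost;dbname=foo_tests', 'test', 'test');
    }
    return new PDO('mysql:host=localhost;dbname=foo', 'foo_user', 'secret');
};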
(I don't like that production code is aware that it might be in a test scenario, but it's a lesser evil, and it can be alleviated by using config files that have the environment variable in the filename, for example. I digress.)
When this is done, in my tests I then have a reference to the same test database, which I use to set up some predictable test data with something like DbUnit or test-db-acle (disclaimer, I wrote the latter one, so I tend towards that).
Once the test data is in place, I issue a request to the url in question with something like Guzzle (in various important variations) and record the output(s), which I then use to make a test that sets the behaviour 'in stone'.
NOTE: this is a very imprecise method; the best way is to build tests in a test-driven way, but that is obviously not a solution in this case.
However, if you are reasonably sure that some manual testing has been done on this, then this is a good way to at least get a few smoke tests into place. In my experience the value of having even one rudimentary smoke test in place is infinitely greater than having none.
Now that you have a smoke test in place, you can refactor this part of the application and move to a new one.
I hope this helps.
Scenario: I need to write a complex nHibernate query that returns a projected DTO, but I want to use a TDD approach. The method would look like this:
public PrintDTO GetUsersForPrinting(int userId)
{
    Session.QueryOver<User>() // ...some joins, conditions etc.
    // returns a projected DTO
}
Questions:
Since the most common approach is to use an in-memory database for this kind of operation, should I write an integration test?
If I am using an in-memory DB, can I write unit tests?
Is one test enough?
Since my integration test will probably check the projection, how should I name it? "GetUserForPrinting_return_correct_DTO" seems too abstract and silly.
I ask because:
There is lots of abstract information about TDD and integration testing, but when it comes to concrete implementation it is very difficult to apply that information.
TDD suggests that integration tests should be built out of unit tests.
This is not really a very good problem to learn TDD with. I assume you don't already know what the complex query looks like, and you want to use test-driven techniques to drive it out. Awesome :)
But let's see if I can answer your questions.
Yes.
Any test that includes a real DB, whether it is in-memory or on-disk, is not a unit test. A unit test would use a mock DB.
Maybe - if your query is complex enough, then no.
testGetUsersForPrinting or getUsersForPrintingTest or similar
Most probably I would drive out the query in a SQL interpreter, not in code. The aim would be to produce a series of integration tests against an in-memory db based on what I learn during this process.
Start from the minimum possible DTO you can think of, and build up from there.
Finally convert the query into nhibernate calls, then make the integration tests pass.
Test-driven, but not really unit-test-driven.
If you are willing to accept maximum TDD discipline and deal with working slower and being more annoyed than usual, you can automate each integration test as you develop it and write code to make it pass. This will mean you are switching frequently among 3 levels of abstraction / editors / environments (direct SQL queries, integration tests, c# code) - I deal with this by setting up techniques to force myself to follow the right steps each time.
This last bit is why this is not a good problem to learn TDD with. You will need a lot of discipline you probably haven't forced yourself to acquire yet!
Good luck.
OK, some concrete examples. I would modify your code sample to look like this:
public PrintDTO GetUsersForPrinting(int userId, ISession session)
{
    var data = session.QueryOver<User>()
        // ...some joins, conditions etc.
        .SingleOrDefault<PrintDTO>();
    return data; // or whatever
}
In your unit test you would write
[Test]
public void TestGetUsersForPrinting()
{
    // Arrange
    StubSession session = new StubSession(); // a stub that returns hardcoded values
    // Act
    var users = GetUsersForPrinting(111, session);
    // Assert
    Assert.That(users.Count, Is.EqualTo(1));
    Assert.That(users[0].UserId, Is.EqualTo(111));
}
In your integration test, you would use a real db, and your session object would actually connect to it, and the queries would be resolved against that db
Arrange-Act-Assert is a standard method for organizing unit tests.
Generally you want as few Asserts as possible in a unit test. And you will have multiple unit tests.
When you are writing a unit test, start by writing the Assert, then fill in the rest to make it compile/get the result you want. Make the test fail first, because then you know you have really delivered something when it passes.
In this example to implement a stub ISession you would derive a local StubSession class (only visible to the test suite) from ISession and just fill in the absolute minimum to get it to compile, and return the minimum data to get the test to pass.
To build up to your whole DTO - assuming you know what you want in your DTO - proceed, as you say in the comments, incrementally. Build up each part of your DTO a piece at a time, add a unit test for each piece.
Keeping track of this is another piece of TDD discipline.
Set yourself up with a TODO list - just a simple text file, or possibly a lengthy comment at the start of your test suite. List all the things you want to test, e.g. zero results, one result, two results, 20 results. User ID, whatever other pieces of information you need to have.
If you are doing a complex query across tables or whatever, add a todo item for each join, each part of the where clause, etc.
Add items for ordering and paging etc if you are using those.
Pick the simplest things first. Only do one small thing (in a single red-green-refactor cycle) at a time. As you work through your list, you might want to break items up into smaller pieces, or you might think of additional things you need to do. Add them to the TODO list rather than working directly on them.
In this particular case I would swap - after each red-green-refactor cycle - into the SQL environment and/or the sqlite integration test to work out how to make the next piece work. I guess this is a sort of step between red and green - choose what you will test next, write the test (which fails obviously), fiddle around in SQL until you know how to make it pass, write the nHibernate calls to make your test green, then refactor.
Be aware that some of the things you list might turn out not to be necessary, or take too long, etc. It's still good to write them down, so you know what you are not doing as well as what you are doing. Keep focused on your goal.
I tend to also develop a list of "smells" and/or refactorings that I can see I will want to do but am not quite ready for this cycle. Remember to minimise duplication/refactor your tests as well as your SUT (System Under Test).
It's a doing rather than a seeing thing. The list of unit tests you end up with, and the code they exercise, is not a very good description of the journey. Kent Beck's original TDD book is slim and will give you some good overall pointers, but not really about constructing queries.
Does any of that help?
Since the most common approach is to use an in-memory database for this kind of operation, should I write an integration test?
Using an in-memory database is still an integration test, because it actually tests whether your query generates correct SQL and executes it against a database.
If I am using an in-memory DB, can I write unit tests?
No, it would be an integration test.
Is one test enough?
Probably not; you should check each condition of your query, for example one test per where clause, one for paging and one for sorting, if applicable.
Since my integration test will probably check the projection, how should I name it? "GetUserForPrinting_return_correct_DTO" seems too abstract and silly.
GivenUserForPrinting_WhenGetUserForPrinting_ThenMapToDTO would be a better name.
I am new to unit testing. Suppose I am building a web application. How do I know what to test? All the examples you see are some sort of basic sum function that has no real value, or at least I've never written a function to add two inputs and then return the result!
So, my question... on a web application, what sort of things need to be tested?
I know that this is a broad question but anything will be helpful. I would be interested in links or anything that gives real life examples as opposed to concept examples that don't have any real life usage.
Take a look at your code, especially the bits where you have complex logic with loops, conditionals, etc, and ask yourself: How do I know if this works?
If you need to change the complex logic to take into account other corner cases then how do you know that the changes you introduce don't break the existing cases? This is precisely what unit testing is intended to address.
So, to answer your question about how it applies to web applications: suppose you have some code that lays out the page differently depending on the browser. One of your customers refuses to upgrade from IE6 and insists that you support it. So you unit test your layout code by simulating the user-agent string from IE6 and checking that the layout is what you expect.
A customer tells you they've found a security hole where using a particular cookie will give you administrator access. How do you know that you've fixed the bug and it doesn't happen again? Create a unit test for it, and run the unit tests on a daily basis so that you get an early warning if it fails.
You discover a bug where users with accents in their names get corrupted in the database. Abstract out the webform input from the database layer and add unit tests to ensure that (eg) UTF8 encoded data is stored in the database correctly and can be retrieved.
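A round-trip test for that last case might look roughly like this in PHPUnit (the repository and method names are made up for illustration):

public function testAccentedNamesSurviveARoundTrip()
{
    $repo = new UserRepository($this->testDatabase());
    $id = $repo->save(array('name' => 'Zoë Müller'));

    // UTF-8 data must come back exactly as it went in.
    $user = $repo->find($id);
    $this->assertSame('Zoë Müller', $user['name']);
}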
You get the idea. Anywhere where part of the process has a well-defined input and output is ideal for unit testing. Anything that doesn't is ideal for refactoring until it is well defined. Take a look at projects such as WebUnit, HTMLUnit, XMLUnit, CSSUnit.
The first part of testing is to write testable applications. Separate out as much functionality as possible from the UI. Refactor into smaller methods. Learn about dependency injection, and try using that to create methods that can take simple, throw-away input that produces known (and therefore testable) results. Look at mocking tools.
Infrastructure and data layer code is easiest to test.
Look at behavior-driven testing as well as test-driven design. For my money, behavior testing is better than pure unit testing; you can follow use-cases, so that tests match against expected usage patterns.
Unit testing means testing any unit of work; the smallest units of work are methods and functions. The art of unit testing is to define tests for a function that cannot just be checked by inspection; what a unit test aims at is to test every possible functional requirement of a method.
Consider, for example, a login function. Then there are tests you could write for failures:
1. Does the function fail on an empty username and password?
2. Does the function fail on the correct username but the wrong password?
3. Does the function fail on the correct password but the wrong username?
Then you would also write tests that the function should pass:
1. Does the function pass on the correct username and password?
This is just a basic example but this is what unit testing attempts to achieve, testing out things that may have been overlooked during development.
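In PHPUnit those cases might look like this, assuming a hypothetical Auth class backed by a stub user store (both are invented for illustration):

public function testLoginFailsOnEmptyUsernameAndPassword()
{
    $auth = new Auth($this->stubUserStore());
    $this->assertFalse($auth->login('', ''));
}

public function testLoginFailsOnWrongPassword()
{
    $auth = new Auth($this->stubUserStore());
    $this->assertFalse($auth->login('alice', 'not-her-password'));
}

public function testLoginSucceedsOnCorrectCredentials()
{
    $auth = new Auth($this->stubUserStore());
    $this->assertTrue($auth->login('alice', 'correct-password'));
}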
Then there is a purist approach too, where a developer is first supposed to write tests and then the code to pass those tests (aka test-driven development).
Resources:
http://devzone.zend.com/article/2772
http://www.ibm.com/developerworks/library/j-mocktest.html
If you're new to TDD, may I suggest a quick trip into the world of BDD? My experience is that the language really helps people pick up TDD more quickly. Particularly, I point you at this article, in which Dan North suggests "what to test":
http://blog.dannorth.net/introducing-bdd/
Note for transparency: I may be heavily involved in the BDD movement.
Regarding the classes to unit test in a web-app, I'd consider starting with controllers, domain objects if they have complex behaviour, and anything called "service", "manager", "helper" or "util". Please also try renaming any classes like this so that they are less generic and actually say what they do. Classes called "calculator" or "converter" are also good candidates, and you'll probably find more in the same package / folder.
There are a couple of good books which could help you too:
Martin Fowler, "Refactoring"
Michael Feathers, "Working Effectively with Legacy Code"
Good luck!
If you start out saying, "How do I test my web app?" that is biting off a lot at once, and it's going to be hard to see unit testing as providing any kind of benefit. I got into unit testing by starting with small pieces that were isolated, then writing libraries test-first, and only then building whole applications that were testable.
Generally a web app has a domain model, it has data access objects that do queries on a database and return domain objects, it has services that call the data access objects, and it has controllers that accept http requests and call the services.
Tests for the controllers will check that they call the right service method with the right parameters. Service objects can be mocks injected during test setup.
Tests for the services will check that they call the right data access objects and perform whatever logic they need to be performing. Data access objects can be mocks injected during test setup.
Tests for the data access objects will check that they perform the right database operation (query or update or whatever) by checking the contents of the database before and after. For dao tests you'll need a database, and a tool like DBUnit to pre-populate it before the test. Also your domain objects' getters and setters will get exercised with this test so you won't need a separate test for them.
Tests for the domain model will check that whatever domain logic you have encoded in them works (Sometimes you may not have any). If you design your domain model so it is not coupled to the database then the more logic you put in the domain model the better because it's easy to test. You shouldn't need any mocks for these tests.
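As a sketch of the controller layer in PHPUnit (the OrderService and OrderController names are invented for illustration):

public function testControllerCallsServiceWithRightParameters()
{
    // The service is a mock injected during test setup.
    $service = $this->createMock(OrderService::class);
    $service->expects($this->once())
            ->method('placeOrder')
            ->with(42, 'widget');

    $controller = new OrderController($service);
    $controller->handle(array('userId' => 42, 'item' => 'widget'));
}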
For a web app the kind of tests you need to do are slightly different. Unit tests are tests which test a particular component of your program. For a web app, you would need to test that forms accept/reject the right inputs, that all links point to the right place, that it can cope with unexpected inputs etc. I'd have a look at Selenium if I were you, I've used it extensively in testing a number of sites: Selenium HQ
I don't have experience of testing web apps, but speaking generally: you unit test the smallest 'chunks' of your program possible. That means you test each function on an individual basis. Anything on a larger scale becomes an integration test.
Of course, there are going to be methods so simple that it's not worth your time to write a test for them, but on the whole, aim to test as great a proportion of your code as possible.
A rule of thumb is that if it is not worth testing it is not worth writing.
However, some things are very difficult to test, so you have to do some cost-benefit analysis on what you test. If you initially aim for 70% code coverage, you will be on the right track.
Looking for some strategies for how you guys are loading default data when doing unit tests.
I use a builder that contains the default values, just like this: http://elegantcode.com/2008/04/26/test-data-builders-refined/. Then the test only specifies the value it cares for:
Customer customer = new CustomerBuilder()
    .WithFirstName("a special first name value that only this test cares about");
After reading the other answers, I want to be clear that it's not for database data. It's for building the instances/data you pass to the classes you are unit testing.
It's a matter of convenience and of keeping the tests simple: plenty of times you are testing very specific behaviour that depends on one to three fields, and you don't care about the rest of them.
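A PHP version of such a builder might look like this (the defaults are illustrative):

class CustomerBuilder
{
    private $firstName = 'Default';
    private $lastName  = 'Customer';

    public function withFirstName($name)
    {
        $this->firstName = $name;
        return $this; // fluent, so calls can be chained
    }

    public function build()
    {
        return new Customer($this->firstName, $this->lastName);
    }
}

// The test overrides only the one field it cares about:
$customer = (new CustomerBuilder())->withFirstName('a very specific first name')->build();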
For unit testing I generally don't load data in advance - each test is designed to work against a data source that may or may not already contain existing records, and so each test writes any records that it needs to complete.
When choosing values to submit to the database I use GUIDs (or other random values) whenever possible, as this guarantees that values in the database are unique (e.g. if you create someone named "Mr X Y", it is helpful to know that searching for "X" should return only 1 result, and that there is no chance you have stumbled on someone else in the database whose last name happens to be Y).
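In PHP the same guarantee is cheap to get, for example ($repo here is a hypothetical repository):

// A unique marker makes assertions unambiguous.
$lastName = 'Test-' . bin2hex(random_bytes(8));
$repo->save(array('firstName' => 'X', 'lastName' => $lastName));

$results = $repo->searchByLastName($lastName);
$this->assertCount(1, $results); // cannot collide with pre-existing rows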
Often when unit testing I'm testing methods that modify data alongside methods that read data, and so my unit tests use the same API (the one being tested) to write to the database. (It's nice if each unit test covers a specific area of functionality, but it's not absolutely necessary)
If the API being tested doesn't have methods to write to the database however, I write my own set of helper functions - the exact structure is going to depend on the data source, but as an example this is where I often use LINQ to SQL.
TDD is about testing a piece of code in isolation. One creates an instance of a class with its dependencies (or mocks of them), calls the method under test, and asserts to verify the outcome of the test.
Usually with TDD one starts with a simple test, without data. When data are needed, they are created in the test fixture (the isolated environment where the test is executed) by the test setUp() method and then destroyed by the tearDown() method after the test has been run. Data are not loaded from the database.
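In PHPUnit that pattern looks roughly like this (the Customer class is just an illustration):

class CustomerTest extends \PHPUnit\Framework\TestCase
{
    private $customer;

    protected function setUp(): void
    {
        // Build the fixture fresh for every test.
        $this->customer = new Customer('Ada', 'Lovelace');
    }

    protected function tearDown(): void
    {
        // Release whatever the fixture holds (files, connections, ...).
        $this->customer = null;
    }

    public function testFullName()
    {
        $this->assertSame('Ada Lovelace', $this->customer->fullName());
    }
}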
The preferred strategy is in-transaction data. Spring offers extensive support for this (for both JUnit 3 and 4). With this strategy your test begins a brand-new transaction each time, and your data is rolled back at the end of the test.
Of course sometimes it's not enough: either data set is too extensive and shared across tests, or multiple transactions are part of the test scope. In that case, I recommend creating shared test data bed that is created before running test suite. There are frameworks for this (dbUnit) but you can also do without them if careful and consistent.
UPD: creating in-transaction data doesn't mean you don't need test data; in all cases you are likely to end up creating reusable, shared helper classes to maintain it.
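Outside Spring the same rollback idea can be hand-rolled; a PHP/PDO sketch (the DSN is a placeholder):

abstract class TransactionalTestCase extends \PHPUnit\Framework\TestCase
{
    protected $pdo;

    protected function setUp(): void
    {
        $this->pdo = new PDO('pgsql:host=localhost;dbname=app_tests');
        $this->pdo->beginTransaction(); // every test starts on a clean slate
    }

    protected function tearDown(): void
    {
        $this->pdo->rollBack(); // undo whatever the test wrote
    }
}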
I typically have methods like GetCustomer() that return a generic customer. If I need to make the returned customer suit my needs for a particular test, I will simply change the property after it gets returned.
Other times I may pass some configuration information into my GetCustomer() method. For example GetCustomer(string customerType).
I've read experts' opinions that say each test should contain its own unique data to work with and not try to make the data generic. Even though this may make each test "larger" in size, overall it will make the tests clearer, because the setup is specific to each test and its goals. I like this advice because I've run into many cases where trying to make the setup data generic made things very sloppy very quickly.
I am working on a project using the Active Directory, intensively. I set up a few unit tests for several things against the AD, some of which I achieve using mocked objects, some which I achieve through real calls against the AD.
As one of the functions of my project, I have to retrieve a so called "user profile". This user profile consists mostly of simple attributes, like "cn", "company", "employeeid", etc. However, one property that I am trying to fill is not a simple one "NextPasswordChangeDate".
To the best of my knowledge, the only way to get this is to fetch the domain policy's maxPwdAge and use it together with pwdLastSet.
Now my question: how can I unit test this in an intelligent way? I came up with three options, none of which is great:
Use my own account as the searched account, find out the date by other means, and hard-code it in the unit test. This way I can unit test my code well, but every month I have to change the unit test, because I changed my password.
Use some account that has "password never expires" set. This is kind of pointless, because it doesn't let me test the correctness of my code at all.
Use a mock object and make sure that the correct API calls happen. This option allows me to test the correctness of the function's behaviour, but then the tested logic is in fact in the unit test, and hence I cannot be sure that it is doing the right thing even if the test passes.
Which of the three do you suggest? Or maybe you have a better option?
Since options 1 and 2 rely on AD existing and having known values, they seem more like integration tests to me.
I generally take the side that any non-deterministic behavior should be interfaced out and mocked if possible (#3). As you noted, this will always leave some real implementation code that is not unit-testable, but that code would then be covered by your integration tests running against a known AD system.