I don't understand how I'm testing anything with unit testing.
Suppose I am testing that my repository class can retrieve values from the database correctly. The proper way to do this would be to actually call the real database and retrieve and check those values.
But the idea behind unit testing is that it should be done in isolation, and connecting to a running database is not isolation. So what is usually done is to mock or stub the database.
But why would testing on a fake database with hardcoded data and hardcoded return values even test anything? It seems tautological and a waste of time.
Or am I not understanding how to unit test properly?
Does one even unit test database calls?
I don't understand how I'm testing anything with unit testing.
Short answer: you are testing the logic, and leaving out the side effects.
You aren't testing everything; but you are testing something.
Furthermore, if you keep in mind that you aren't really testing the code with side effects, then you are motivated to arrange your code so that the pieces that actually depend on the side effect are small. The big pieces don't actually care where the data comes from, to those are easy to test.
So "something" can be "most things".
There is an impedance problem -- if your test doubles impersonate the production originals inadequately, then some of your test results will be inaccurate.
my philosophy is to test as little as possible to reach a given level of confidence
Kent Beck, 2008
One way of imagining "as little as possible" is to think in terms of cost -- we're aiming for a given confidence level, so we want to achieve as much of that confidence as we can using cheap unit tests, and then make up the difference with more expensive techniques.
Cory Benfield's talk Building Protocol Libraries the Right Way describes an example of the kind of separation we're talking about here. The logic of how to parse an HTTP message is separable from the problem of reading the bytes. If you make the complicated part easy to test, and the hard to test part too simple to fail, your chances of succeeding are quite good.
I think your concern is valid. For me, TDD is more of an evolutionary design practice than unit testing practice, but I'll save that for another discussion.
In your example, what we are really testing is that the logic contained within your individual classes is sound. By stubbing the data coming from the database you have a controlled scenario that you can ensure your code works for that particular scenario. This makes it much easier to ensure full test coverage for all data scenarios. You're correct that this really doesn't test the whole system end to end, but the point is to reduce the overall test maintenance costs and enable faster feedback.
My approach is to mock most collaborators at the unit test level, then write acceptance tests at the integration test level, which validates your system using real data. Because the unit tests with their mocked data allows you to test various data scenarios out, you only need to test a few of those scenarios using integration tests to feel confident that your code will perform as you expect.
You can test your code against actual database in isolation. Just create new database instance for every test, or execute tests synchronously one after another and clean database before next test.
But using actual database will make your tests slow, which will slow down your work, because you want quick feedback on what you are doing.
Do not test every class - test main feature logic, which can use many different classes and mock/stub only dependencies which makes tests slow.
Find your application boundaries and tests logic between them without mocking.
For example in trivial web api application boundaries can be:
- controller action -> request(input)
- controller action -> response(output)
- database -> side effect of received request.
Assume we live in perfect world where new database and web server setup will takes milliseconds. Then you will tests whole pipeline of your application:
1. Configure database for test
2. Send request to the web api server
3. Assert that response contains expected data
4. Assert that database state changed as expected
But in now days world your boundaries will be controller action and abstracted database access point. Which makes your test look like below:
1. Configure mocked database access point(repository)
2. Call controller action with given parameters
3. Assert that action returns expected result
4. Possibly assert that mocked repository received expected update arguments.
If your application have no logic, just read/update data from database - test with actual database or, if your database framework allows it, use database in-memory.
Related
I am just starting to get into unit testing and cant see an easy way to do a lot of test cases due to the interaction with a database.
Is there a standard method/process for unit testing where database access (read and write) is required in order to assert tests?
The best I can come up with so far is to have a config file used to bootstrap my app using a different db connection and then use the startup method to copy over the live db to a db used in isolation for tests?
Am I close? Or is there a better approach to this?
Your business logic shouldn't directly interact with the Database. Instead it should go through a data access layer that you can fake and mock in the context of unit testing. Look into mocking frameworks to do the mocking for you. Your tests should not depend on a database at all. Instead you should specify the data returned from your data access layer explicitly, and then ensure that your business logic behaves correctly with that information.
Testing that the program works with a DB attached is more of an integration test, and those have a lot of costs associated with them. They are slower (so it's harder to run them every time you compile), and more complicated (so they require more time and effort to maintain). If you can get simpler unit tests in place, I would recommend you do that first. Later you can add integration tests that might use the DB as well, but you'll get the most value from adding simpler unit tests first.
As far as unit-test go, I think whatever works for you in practice is the way to go. It's important that unit tests give you some value and improve the quality of your system and your ability to develop and maintain it.
I would suggest you probably don't want to be copying the live db over to your test db. There's probably no guarantees that your live database will contain suitable data that will cause your unit-tests to consistently run. The unit-tests should test that your code works, they shouldn't be testing that the live database happens to contain suitable data that causes them to pass, because as it's live, your users might change the content of it so that your tests fail.
You're unit test code itself should probably populate your test db with data required that simulates the scenarios you want to write unit tests for. I messed around with some ruby on rails code a few years ago; the test framework for that would have a test class which setup the db with some fake data, then multiple test methods from the class would be written to run against that data, and the tear-down method would wipe data from the database. So, different test-classes (or sometimes people call them fixtures) would run against a certain data setup, that meant you could run a number of tests against the same data setup instead of creating it for every test case you wanted to run. Setting up data for every test could end up causing your tests to run slowly, such that you get bored of waiting for them to run and stop bothering with them.
I am new to unit testing. Suppose I am building a web application. How do I know what to test? All the examples that you see are some sort of basic sum function that really has no real value, or at least I've never written a function to add to inputs and then return!
So, my question...on a web application, what are the sort of things that need tested?
I know that this is a broad question but anything will be helpful. I would be interested in links or anything that gives real life examples as opposed to concept examples that don't have any real life usage.
Take a look at your code, especially the bits where you have complex logic with loops, conditionals, etc, and ask yourself: How do I know if this works?
If you need to change the complex logic to take into account other corner cases then how do you know that the changes you introduce don't break the existing cases? This is precisely what unit testing is intended to address.
So, to answer your question about how it applies to web applications: suppose you have some code that lays out the page differently depending on the browser. One of your customers refuses to upgrade from IE6 and insists that you support that. So you unit test your layout code by simulating the connection string from IE6 and checking that the layout is what you expect.
A customer tells you they've found a security hole where using a particular cookie will give you administrator access. How do you know that you've fixed the bug and it doesn't happen again? Create a unit test for it, and run the unit tests on a daily basis so that you get an early warning if it fails.
You discover a bug where users with accents in their names get corrupted in the database. Abstract out the webform input from the database layer and add unit tests to ensure that (eg) UTF8 encoded data is stored in the database correctly and can be retrieved.
You get the idea. Anywhere where part of the process has a well-defined input and output is ideal for unit testing. Anything that doesn't is ideal for refactoring until it is well defined. Take a look at projects such as WebUnit, HTMLUnit, XMLUnit, CSSUnit.
The first part of testing is to write testable applications. Separate out as much functionality as possible from the UI. Refactor into smaller methods. Learn about dependency injection, and try using that to create methods that can take simple, throw-away input that produces known (and therefor testable) results. Look at mocking tools.
Infrastructure and data layer code is easiest to test.
Look at behavior-driven testing as well as test-driven design. For my money, behavior testing is better than pure unit testing; you can follow use-cases, so that tests match against expected usage patterns.
Unit testing means testing any unit of work, the smallest units of work are methods and functions., The art of unit testing is to define tests for a function that cannot just be checked by inspection, what unit test aims at is to test every possible functional requirement of a method.
Consider for example you have a login function, then there could be following tests that you could write for failures:
1. Does the function fail on empty username and password
2. Does the function fail on the correct username but the wrong password
3. Does the function fail on the correct password but the wrong username
The you would also write tests that the function would pass:
1. Does the function pass on correct username and password
This is just a basic example but this is what unit testing attempts to achieve, testing out things that may have been overlooked during development.
Then there is a purist approach too where a developer is first supposed to write tests and then the code to pass those tests (aka test driven development).
Resources:
http://devzone.zend.com/article/2772
http://www.ibm.com/developerworks/library/j-mocktest.html
If you're new to TDD, may I suggest a quick trip into the world of BDD? My experience is that the language really helps people pick up TDD more quickly. Particularly, I point you at this article, in which Dan North suggests "what to test":
http://blog.dannorth.net/introducing-bdd/
Note for transparency: I may be heavily involved in the BDD movement.
Regarding the classes to unit test in a web-app, I'd consider starting with controllers, domain objects if they have complex behaviour, and anything called "service", "manager", "helper" or "util". Please also try renaming any classes like this so that they are less generic and actually say what they do. Classes called "calculator" or "converter" are also good candidates, and you'll probably find more in the same package / folder.
There are a couple of good books which could help you too:
Martin Fowler, "Refactoring"
Michael Feathers, "Working Effectively with Legacy Code"
Good luck!
If you start out saying, "How do I test my web app?" that is biting off a lot at once, and it's going to be hard to see unit testing as providing any kind of benefit. I got into unit testing by starting with small pieces that were isolated, then writing libraries test-first, and only then building whole applications that were testable.
Generally a web app has a domain model, it has data access objects that do queries on a database and return domain objects, it has services that call the data access objects, and it has controllers that accept http requests and call the services.
Tests for the controllers will check that they call the right service method with the right parameters. Service objects can be mocks injected during test setup.
Tests for the services will check that they call the right data access objects and perform whatever logic they need to be performing. Data access objects can be mocks injected during test setup.
Tests for the data access objects will check that they perform the right database operation (query or update or whatever) by checking the contents of the database before and after. For dao tests you'll need a database, and a tool like DBUnit to pre-populate it before the test. Also your domain objects' getters and setters will get exercised with this test so you won't need a separate test for them.
Tests for the domain model will check that whatever domain logic you have encoded in them works (Sometimes you may not have any). If you design your domain model so it is not coupled to the database then the more logic you put in the domain model the better because it's easy to test. You shouldn't need any mocks for these tests.
For a web app the kind of tests you need to do are slightly different. Unit tests are tests which test a particular component of your program. For a web app, you would need to test that forms accept/reject the right inputs, that all links point to the right place, that it can cope with unexpected inputs etc. I'd have a look at Selenium if I were you, I've used it extensively in testing a number of sites: Selenium HQ
I don't have experience of testing web apps, but speaking generally: you unit test the smallest 'chunks' of your program possible. That means you test each function on an individual basis. Anything on a larger scale becomes an integration test.
Of course, there are going to be methods so simple that its not worth your time to write a test for them, but on the whole aim to test as great a proportion of your code as possible.
A rule of thumb is that if it is not worth testing it is not worth writing.
However, some things are very difficult to test, so you have the do some cost benefit analysis on what you test. If you initially aim for 70% code coverage, you will be on the right track.
If I'm having Data Access Layer (nHibernate) for example a class called UserProvider
and a Business Logic class UserBl, should I both test their methods SaveUser or GetUserById, or any other public method in DA layer which is called from BL layer. Is this a redundancy or a common practice to do?
Is it common to unit test DA layer, or that belongs to Integration test domain?
Is it better to have test database, or create database data during test?
Any help is appreciated.
There's no right answer to this, it really depends. Some people (e.g Roy Osherove) say you should only test code which has conditional logic (IF statements etc), which may or may not include your DAL. Some people (often those doing TDD) will say you should test everything, including the DAL, and aim for 100% code coverage.
Personally I only test it if it has logic in, so end up with some DAL methods tested and some not. Most of the time you just end up checking that your BL calls your DAL, which has some merit but I don't find necessary. I think it makes more sense to have integration tests which cover the app end-to-end, including the database, which covers things like GetUserById.
Either way, and you probably know this already, but make sure your unit tests don't touch an actual database. (No problem doing this, but that's an integration test not a unit test, as it takes a lot longer and involves complex setup, and should be run separately).
It is a good practice to write unit test for every layer, even the DAL.
I don't think running tests on the real db is a good idea, you might ruin important data. We used to set up a copy of the db for tests with just enough data in it to run tests on.
In our test project we had a special web.config file with test settings, like a ConnectionString to our test db.
In my experience it was helpful to test each layer on its own. Integrating it and test again. Integration test normally does not test all aspects. Sometimes if the Data Access Layer (I don't know nHibernate) is generated code or sort of generic code it looks like overkill. But I have seen it more than once that systematic testing pays off.
Is it redundancy? In my opinion it is not.
Is it common practice? Hard to tell. I would say no. I have seen it in some projects but not in all projects I worked in. Was often dependend on time/resources and mentality of the team / individiual developer.
Is it better to have test database, or create database data during test? This is quite a different question. Cannot be answered easily. Depends on your project. Create a new one is good but sometimes throws up unreal bugs (although bugs). It is depending on your project (product development or a proprietary development). Usually in an proprietary on site development a database gets migrated into from somewhere. So a second test is definitely needed with the migrated data. But this is rather at a system test level.
Unit testing the DAL is worth it as mentioned if there is logic in there, for example if using the same StoredProc for insert & update its worth knowing that an insert works, a subsequent call updates the previous and a select returns it and not a list. In your case SaveUser method probably inserts first time around and subsequently updates, its nice to know that this is whats being done at unit test stage.
If you're using a framework like iBatis or Hibernate where you can implement typehandlers its worth confirming that the handlers handle values in a way that's acceptable to your underlying DB.
As for testing against an actual DB if you use a framework like Spring you can avail of the supported database unit test classes with auto rollback of transactions so you run your tests and the DB is unaffected afterwards. See here for information. Others probably offer similiar support.
I used TDD as a development style on some projects in the past two years, but I always get stuck on the same point: how can I test the integration of the various parts of my program?
What I am currently doing is writing a testcase per class (this is my rule of thumb: a "unit" is a class, and each class has one or more testcases). I try to resolve dependencies by using mocks and stubs and this works really well as each class can be tested independently. After some coding, all important classes are tested. I then "wire" them together using an IoC container. And here I am stuck: How to test if the wiring was successfull and the objects interact the way I want?
An example: Think of a web application. There is a controller class which takes an array of ids, uses a repository to fetch the records based on these ids and then iterates over the records and writes them as a string to an outfile.
To make it simple, there would be three classes: Controller, Repository, OutfileWriter. Each of them is tested in isolation.
What I would do in order to test the "real" application: making the http request (either manually or automated) with some ids from the database and then look in the filesystem if the file was written. Of course this process could be automated, but still: doesn´t that duplicate the test-logic? Is this what is called an "integration test"? In a book i recently read about Unit Testing it seemed to me that integration testing was more of an anti-pattern?
IMO, and I have no literature to back me on this, but the key difference between our various forms of testing is scope,
Unit testing is testing isolated pieces of functionality [typically a method or stateful class]
Integration testing is testing the interaction of two or more dependent pieces [typically a service and consumer, or even a database connection, or connection to some other remote service]
System integration testing is testing of a system end to end [a special case of integration testing]
If you are familiar with unit testing, then it should come as no surprise that there is no such thing as a perfect or 'magic-bullet' test. Integration and system integration testing is very much like unit testing, in that each is a suite of tests set to verify a certain kind of behavior.
For each test, you set the scope which then dictates the input and expected output. You then execute the test, and evaluate the actual to the expected.
In practice, you may have a good idea how the system works, and so writing typical positive and negative path tests will come naturally. However, for any application of sufficient complexity, it is unreasonable to expect total coverage of every possible scenario.
Unfortunately, this means unexpected scenarios will crop up in Quality Assurance [QA], PreProduction [PP], and Production [Prod] cycles. At which point, your attempts to replicate these scenarios in dev should make their way into your integration and system integration suites as automated tests.
Hope this helps, :)
ps: pet-peeve #1: managers or devs calling integration and system integration tests "unit tests" simply because nUnit or MsTest was used to automate it ...
What you describe is indeed integration testing (more or less). And no, it is not an antipattern, but a necessary part of the sw development lifecycle.
Any reasonably complicated program is more than the sum of its parts. So however well you unit test it, you still have not much clue about whether the whole system is going to work as expected.
There are several aspects of why it is so:
unit tests are performed in an isolated environment, so they can't say anything about how the parts of the program are working together in real life
the "unit tester hat" easily limits one's view, so there are whole classes of factors which the developers simply don't recognize as something that needs to be tested*
even if they do, there are things which can't be reasonably tested in unit tests - e.g. how do you test whether your app server survives under high load, or if the DB connection goes down in the middle of a request?
* One example I just read from Luke Hohmann's book Beyond Software Architecture: in an app which applied strong antipiracy defense by creating and maintaining a "snapshot" of the IDs of HW components in the actual machine, the developers had the code very well covered with unit tests. Then QA managed to crash the app in 10 minutes by trying it out on a machine without a network card. As it turned out, since the developers were working on Macs, they took it for granted that the machine has a network card whose MAC address can be incorporated into the snapshot...
What I would do in order to test the
"real" application: making the http
request (either manually or automated)
with some ids from the database and
then look in the filesystem if the
file was written. Of course this
process could be automated, but still:
doesn´t that duplicate the test-logic?
Maybe you are duplicated code, but you are not duplicating efforts. Unit tests and integrations tests serve two different purposes, and usually both purposes are desired in the SDLC. If possible factor out code used for both unit/integration tests into a common library. I would also try to have separate projects for your unit/integration tests b/c
your unit tests should be ran separately (fast and no dependencies). Your integration tests will be more brittle and break often so you probably will have a different policy for running/maintaining those tests.
Is this what is called an "integration
test"?
Yes indeed it is.
In an integration test, just as in a unit test you need to validate what happened in the test. In your example you specified an OutfileWriter, You would need some mechanism to verify that the file and data is good. You really want to automate this so you might want to have a:
Class OutFilevalidator {
function isCorrect(fName, dataList) {
// open file read data and
// validation logic
}
You might review "Taming the Beast", a presentation by Markus Clermont and John Thomas about automated testing of AJAX applications.
YouTube Video
Very rough summary of a relevant piece: you want to use the smallest testing technique you can for any specific verification. Spelling the same idea another way, you are trying to minimize the time required to run all of the tests, without sacrificing any information.
The larger tests, therefore are mostly about making sure that the plumbing is right - is Tab A actually in slot A, rather than slot B; do both components agree that length is measured in meters, rather than feet, and so on.
There's going to be duplication in which code paths are executed, and possibly you will reuse some of the setup and verification code, but I wouldn't normally expect your integration tests to include the same level of combinatoric explosion that would happen at a unit level.
Driving your TDD with BDD would cover most of this for you. You can use Cucumber / SpecFlow, with WatiR / WatiN. For each feature it has one or more scenarios, and you work on one scenario (behaviour) at a time, and when it passes, you move onto the next scenario until the feature is complete.
To complete a scenario, you have to use TDD to drive the code necessary to make each step in the current scenario pass. The scenarios are agnostic to your back end implementation, however they verify that your implementation works; if there is something that isn't working in the web app for that feature, the behaviour needs to be in a scenario.
You can of course use integration testing, as others pointed out.
How are people unit testing their business applications? I've seen a lot of examples of unit testing with "simple to test" examples. Ex. a calculator. How are people unit testing data-heavy applications? How are you putting together your sample data? In many cases, data for one test may not work at all for another test which makes it hard to just have one test database?
Testing the data access portion of the code is fairly straightforward. It's testing out all the methods that work against the data that seem to be hard to test. For example, imagine a posting process where there is heavy data access to determine what is posted, numbers are adjusted, etc. There are a number of interim steps that occur (and need to be tested) along with tests afterwards that ensure the posting was successful. Some of those steps may actually be stored procedures.
In the past I've tried inserting the test data in a test database, then running the test, but honestly it's pretty painful to write this kind of code (and error prone). I've also tried just building a test database up front and rolling back the changes. That works OK but in a number of places you can't easily do this either (and many people would say that's integration testing; so be it, I still need to be able to test this somehow).
If the answer is that there isn't a nice way of handling this and it currently just sort of sucks, that would be useful to know as well.
Any thoughts, ideas, suggestions, or tips are appreciated.
My automated functional tests usually follow one of two patters:
Database Connected Tests
Mock Persistence Layer Tests
Database Connected Tests
When I have automated tests that are connected to the database, I usually make a single test database template that has enough data for all the tests. When the automated tests are run, a new test database is generated from the template for every test. The test database has to be constantly re-generated because test will often change the data. As tests are added, I usually append more data to the test database template.
There are some nice advantages to this testing method. The obvious advantage is that the tests also exercise your schema. Another advantage is that after setting up the initial tests, most new tests will be able to re-use the existing test data. This makes it easy to add more tests.
The downside is that the test database will become unwieldy. Because data will usually be added one test at time, it will be inconsistent and maybe even unrealistic. You will also end up cursing the person who setup the test database when there is a significant database schema change (which for me usually means I end up cursing myself).
This style of testing obviously doesn't work if you can't generate new test databases at will.
Mock Persistence Layer Tests
For this pattern, you create mock objects that live with the test cases. These mock objects intercept the calls to the database so that you can programmatically provide the appropriate results. Basically, when the code you're testing calls the findCustomerByName() method, your mock object is called instead of the persistence layer.
The nice thing about using mock object tests is that you can get very specific. Often times, there are execution paths that you simply can't reach in automated tests w/o mock objects. They also free you from maintaining a large, monolithic set of test data.
Another benefit is the lack of external dependencies. Because the mock objects simulate the persistence layer, your tests are no longer dependent on the database. This is often the deciding factor when choosing which pattern to choose. Mock objects seem to get more traction when dealing with legacy database systems or databases with stringent licensing terms.
The downside of mock objects is that they often result in a lot of extra test code. This isn't horrible because almost any amount of testing code is cheap when amortized over the number of times you run the test, but it can be annoying to have more test code then production code.
I have to second the comment by #Phil Bennett as I try to approach these integration tests with a rollback solution.
I have a very detailed post about integration testing your data access layer here
I show not only the sample data access class, base class, and sample DB transaction fixture class, but a full CRUD integration test w/ sample data shown. With this approach you don't need multiple test databases as you can control the data going in with each test and after the test is complete the transactions are all rolledback so your DB is clean.
About unit testing business logic inside your app, I would also second the comments by #Phil and #Mark because if you mock out all the dependencies your business object has, it becomes very simple to test your application logic one entity at a time ;)
Edit: So are you looking for one huge integration test that will verify everything from logic pre-data base / stored procedure run w/ logic and finally a verification on the way back? If so you could break this out into 2 steps:
1 - Unit test the logic that happens before the data is pushed
into your data access code. For
example, if you have some code that
calculates some numbers based on
some properties -- write a test that
only checks to see if the logic for
this 1 function does what you asked
it to do. Mock out any dependancy
on the data access class so you can
ignore it for this test of the
application logic alone.
2 - Integration test the logic that happens once you take your
manipulated data (from the previous
method we unit tested) and call the
appropriate stored procedure. Do
this inside a data specific testing
class so you can rollback after it's
completed. After your stored
procedure has run, do a query
against the database to get your
object now that we have done some
logic against the data and verify it
has the values you expected
(post-stored procedure logic /etc )
If you need an entry in your database for the stored procedure to run, simply insert that data before you run the sproc that has your logic inside it. For example, if you have a product that you need to test, it might require a supplier and category entry to insert so before you insert your product do a quick and dirty insert for a supplier and category so your product insert works as planned.
It depends on what you're testing. If you're testing a business logic component -- then its immaterial where the data is coming from and you'd probably use a mock or a hand rolled stub class that simulates the data access routine the component would have called in the wild. The only time I mess with the data access is when I'm actually testing the data access components themselves.
Even then I tend to open a DB transaction in the TestFixtureSetUp method (obviously this depends on what unit testing framework you might be using) and rollback the transaction at the end of the test suite TestFixtureTeardown.
Mocking Frameworks enable you to test your business objects.
Data Driven tests often end up becoming more of a intergration test than a unit test, they also carry with them the burden of managing the state of a data store pre and post execution of the test and the time taken in connecting and executing queries.
In general i would avoid doing unit tests that touch the database from your business objects. As for Testing your database you need a different stratergy.
That being said you can never totally get away from data driven testing only limiting the amout of tests that actually need to invoke your back end systems.
It sounds like you might be testing message based systems, or systems with highly parameterised interfaces, where there are large numbers of permutations of input data.
In general all the rules of standard unti testing still hold:
Try to make the units being tested as small and discrete as possible.
Try to make tests independant.
Factor code to decouple dependencies.
Use mocks and stubs to replace dependencies (like dataaccess)
Once this is done you will have removed a lot of the complexity from the tests, hopefully revealing good sets of unit tests, and simplifying the sample data.
A good methodology for then compiling sample data for test that still require complex input data is Orthogonal testing, or see here.
I've used that sort of method for generating test plans for WCF and BizTalk solutions where the permutations of input messages can create multiple possible execution paths.
For lots of different runs over the same logic but with different data you can use CSV, as many columns as you like for the input and the last for the output etc.