Let's say I'm doing unit testing methods for a UserDAO. I'm writing the test for the UserDao's deletion method. I would first insert a user into the db, then call the deletion method, and verify if the object still exists.
My question is: for the deletion unit test, when I'm inserting a user to test against, should I be calling the UserDao's insertion method... OR would it be better not to call any methods of the object I'm testing, and instead do the insertion the native way, say with plain JDBC, and then call my deletion method?
DAOs are often too simple to break, in which case I don't think it is worth spending resources on testing them. Your description sounds like this is the case.
Only if there is some logic involved (such as building more complex queries) would I think about testing some parts.
Maybe provide some code snippets to help out more.
Use DBUnit or something similar to set up the data for the test. DBUnit lets you specify what test data gets inserted for a test; you can even specify a clean insert that deletes everything (from the tables that have test data specified for them) and then inserts only what you want. It's best if tests are independent of each other: you don't want a problem with the insert code to cause problems for other tests that depend on it for setup.
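To make that concrete, here's a minimal sketch of a DBUnit-seeded deletion test (JUnit 4 + DBUnit; UserDao with delete/findById methods, the in-memory H2 URL and the users.xml data set are assumptions about your project, not a fixed API). The test row is inserted by DBUnit rather than by the DAO's own insert method, so a bug in insert can't break the delete test.

import java.sql.Connection;
import java.sql.DriverManager;

import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertNull;

public class UserDaoDeleteTest {

    private Connection jdbc;
    private UserDao userDao;   // class under test (assumed to exist)

    @Before
    public void seedDatabase() throws Exception {
        // Any test database works; an in-memory H2 instance is assumed here
        jdbc = DriverManager.getConnection("jdbc:h2:mem:test");
        userDao = new UserDao(jdbc);

        // CLEAN_INSERT wipes the tables listed in users.xml and inserts exactly
        // the rows defined there, so every test starts from the same known state
        IDatabaseConnection dbUnitConnection = new DatabaseConnection(jdbc);
        IDataSet seed = new FlatXmlDataSetBuilder()
                .build(getClass().getResourceAsStream("/users.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(dbUnitConnection, seed);
    }

    @Test
    public void deleteRemovesExistingUser() throws Exception {
        userDao.delete(42L);                  // id 42 is present in users.xml
        assertNull(userDao.findById(42L));    // the row is gone afterwards
    }
}

Verifying through findById is a pragmatic compromise; if you want the assertion fully independent of the DAO, check with a plain JDBC query instead.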
Related
My business web application (PHP with HTML/JavaScript) has lots of very different options (about 1000) which are stored in the database so the users can change them themselves. These options define, for example, whether a button, tab or input field is visible, how inputs are validated, and the workflow, such as when e-mails should be sent. Each user has a user role, which also defines what they're able to see and do.
My users can use any combination of these options, so I find it very difficult to write tests for all these situations. I have 100+ customers, so writing tests for each customer is definitely not an option.
The problem is that some options work together, so while testing one option it's necessary to know the values of some other options. Ideally the tests should also be able to read the options profiles for each customer. But that would almost be like rewriting the whole application just for testing, which seems error-prone in itself.
Is it common in unit testing to read the database to get the test-data and options, or is that not a good idea?
How would you handle the situation I described?
First of all: yes, that's perfectly possible, although writing unit tests after the application is already written is not recommended and is extremely difficult.
Here is some advice for your case:
Data Providers
Data providers make it possible to call the same test with different parameters, which prevents code duplication in your tests. They are perfect if you want to test the same method with different configurations.
https://phpunit.de/manual/3.7/en/writing-tests-for-phpunit.html#writing-tests-for-phpunit.data-providers
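The link above covers the PHP API. Purely as an illustration of the same idea in another language, here is a rough JUnit 5 equivalent, where a parameterized test plays the role of a data provider; OptionEvaluator is a tiny made-up stand-in for your real option logic:

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class ButtonVisibilityTest {

    @ParameterizedTest
    @CsvSource({
        // role,  showButtonOption, expectedVisible
        "ADMIN,   true,             true",
        "ADMIN,   false,            false",
        "VIEWER,  true,             false",   // the role overrides the option here
        "VIEWER,  false,            false"
    })
    void buttonVisibilityFollowsRoleAndOption(String role,
                                              boolean showButtonOption,
                                              boolean expectedVisible) {
        OptionEvaluator evaluator = new OptionEvaluator(role, showButtonOption);
        assertEquals(expectedVisible, evaluator.isButtonVisible());
    }

    // Minimal stand-in so the example compiles; your real logic lives elsewhere
    static class OptionEvaluator {
        private final String role;
        private final boolean showButtonOption;

        OptionEvaluator(String role, boolean showButtonOption) {
            this.role = role;
            this.showButtonOption = showButtonOption;
        }

        boolean isButtonVisible() {
            return "ADMIN".equals(role) && showButtonOption;
        }
    }
}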
Mock objects
If objects depend on other objects, you use mock objects. Mocking an object is basically nothing more than creating a dummy object with defined behavior that won't do anything other than what you told it to do.
Note that you can also mock the tested class itself! With a partial mock you keep the real methods of the mocked class and only override the ones you specify, so you can mock the class you want to test and define specific behavior for some methods while testing another.
If this is still not enough, you might want to think about splitting up your methods into smaller, more specific methods to get smaller units.
https://phpunit.de/manual/3.7/en/test-doubles.html
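Again, the link above documents the PHP API; as a hedged Java/Mockito sketch of the same two ideas (a mocked collaborator plus a partial mock of the class under test), with all names invented for the example:

import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;

class WorkflowServiceTest {

    // Collaborator that the class under test depends on
    interface MailGateway {
        void send(String orderId);
    }

    // Tiny stand-in for a real service; shouldSendMail would normally read an option
    static class WorkflowService {
        private final MailGateway mail;

        WorkflowService(MailGateway mail) {
            this.mail = mail;
        }

        protected boolean shouldSendMail() {
            return false;
        }

        public void processOrder(String orderId) {
            if (shouldSendMail()) {
                mail.send(orderId);
            }
        }
    }

    @Test
    void processOrderSendsMailWhenTheOptionIsOn() {
        MailGateway mail = mock(MailGateway.class);        // dummy collaborator

        WorkflowService service = spy(new WorkflowService(mail));
        doReturn(true).when(service).shouldSendMail();     // pin one method...

        service.processOrder("order-1");                   // ...while exercising another

        verify(mail).send("order-1");                      // the mock only did what we told it to
    }
}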
Keep it small
Unit tests are called unit tests because they test the smallest possible unit of your code without executing anything else. So instead of testing whether a button behaves the way it should, just test whether it is visible when it should be. Test one kind of behavior and nothing more.
Don't read the database
It is highly unusual to read the database when writing unit tests, and it is even more unusual to use actual user data. Instead, you define test data: rather than testing your users' configurations, you should test every possible configuration.
Code Coverage
A decent way to check whether your code is covered by the tests is code coverage. It shows you how much of your code, and which parts, were executed by the tests. 100% coverage does not mean full coverage in reality, though, especially in your case: just because every line of code was executed does not mean every option was considered. But it is a handy tool anyway, and you can see which code you are done with and which you forgot about.
https://phpunit.de/manual/current/en/code-coverage-analysis.html
Conclusion:
What you are trying to do is error-prone in itself, yes, because usually you would write all your tests before writing the actual methods. And you will probably write more test code than the application itself contains, but that's not uncommon.
I have a large program that I need to write tests for. I'm wondering if it would be wrong to write tests that run in a specific order, as some necessarily have to run in order and depend upon a previous test.
For example a scenario like the following:
CreateEmployer
CreateEmployee (requires employer)
Add Department
The drawback I see to this approach is that if one test fails, all of the following tests will also fail. But I am going to have to write the code to build the database anyway, so it might be more effective to use the code that builds the mock database as a sort of integration-like unit test.
Should I create the database without using the tests as a seed method, and then run each of the methods again to see the result? The problem I see with this approach is that if the seed method does not work, all of the tests will fail, and it won't be immediately clear that the error is in the seed method rather than in the services or the tests themselves.
Yes, this is discouraged. Tests shouldn't be "temporally coupled".
Each test should run in complete isolation from the other tests. If you find yourself in a situation where the artifacts created by Test A are needed by Test B, then you have two problems to correct:
Test A shouldn't be creating artifacts (or side-effects).
Test B should be using mock data as part of its "arrange" step.
Basically, the unit tests shouldn't be using a real database. They should be testing the logic of the code, not the interaction with the infrastructure dependencies. Mock those dependencies in order to test just the code.
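As a minimal sketch of that "arrange" step (all type and method names below are invented for the example), Test B creates everything it needs through a mocked repository instead of relying on rows left behind by Test A:

import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.*;

import java.util.HashSet;
import java.util.Set;

import org.junit.jupiter.api.Test;

class AddDepartmentTest {

    interface EmployerRepository {
        Employer findByName(String name);
    }

    static class Employer {
        private final Set<String> departments = new HashSet<>();
        void addDepartment(String name) { departments.add(name); }
        boolean hasDepartment(String name) { return departments.contains(name); }
    }

    static class EmployeeService {
        private final EmployerRepository repository;
        EmployeeService(EmployerRepository repository) { this.repository = repository; }
        void addDepartment(String employerName, String departmentName) {
            repository.findByName(employerName).addDepartment(departmentName);
        }
    }

    @Test
    void addDepartmentAttachesTheDepartmentToTheEmployer() {
        // Arrange: this test supplies its own data; no CreateEmployer test ran before it
        Employer employer = new Employer();
        EmployerRepository repository = mock(EmployerRepository.class);
        when(repository.findByName("ACME")).thenReturn(employer);

        EmployeeService service = new EmployeeService(repository);

        // Act
        service.addDepartment("ACME", "Research");

        // Assert: pure logic, no real database involved
        assertTrue(employer.hasDepartment("Research"));
    }
}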
It is a bad idea to have unit tests depend upon each other. In your example you could have a defect in both CreateEmployer and AddDepartment, but when all three tests fail because of the CreateEmployer test, you might mistakenly assume that only the CreateEmployer test is 'really' failing. This means you've lost the potentially valuable information that AddDepartment is failing as well.
Another problem is that you might create a separate workflow in the future that calls AddDepartment without calling CreateEmployer. Now your tests assume that CreateEmployer will always be called, but in reality it isn't. All three tests could pass, but the application could still break because you have a dependency you didn't know was there.
The best tests won't rely on a database at all, but will instead allow you to manually specify or "Mock" the data. Then you don't have to worry about an unrelated database problem breaking all of your tests.
If these are truly Unit tests then yes, requiring a specific order is a bad practice for several reasons.
Coupling - As you point out, if one test fails, then subsequent ones will fail as well. This will mask real problems.
TDD - A core principle of TDD is make tests easy to run. If you do so, developers are more likely to run them. If they are hard to run (e.g. I have to run the entire suite), then they are less likely to be run and their value is lost.
Ideally, unit tests should not depend upon the completion of another test in order to run. It is also hard to run them in a given sequence, although that may depend upon your unit testing tool.
In your example, I would create a test that tests the CreateEmployer method and makes sure it returns a new object the way you expect.
The second test I would create would be for CreateEmployee, and if that test requires an Employer object, your CreateEmployee method could receive its Employer object through dependency injection. Here is where you would use a mock object (one coded to return a fixed/known Employer) as the Employer object the CreateEmployee method consumes. This lets you test the CreateEmployee method and its actions upon that object with a given/known instance of the Employer object.
Your third test, AddDepartment, I assume also depends upon an Employer object. This unit test can follow the same pattern and receive a mock Employer object to consume during its test. (The same object you pass to unit test two above.)
Each test now runs/fails on its own, and can run in any sequence.
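Here's a short sketch of the second test described above, assuming a hypothetical EmployeeService.createEmployee(Employer, String); Employer, Employee and EmployeeService mirror the names in the question, but their members are invented. The mock Employer is the fixed/known instance mentioned above and could be reused by the AddDepartment test:

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;

class CreateEmployeeTest {

    @Test
    void createdEmployeeIsLinkedToTheGivenEmployer() {
        // A known Employer; this test does not care how employers are created
        Employer employer = mock(Employer.class);
        when(employer.getId()).thenReturn(7L);

        EmployeeService service = new EmployeeService();
        Employee employee = service.createEmployee(employer, "Ada");

        assertEquals(7L, employee.getEmployerId());   // linked to the injected employer
        assertEquals("Ada", employee.getName());
    }
}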
What's the advantage of using mock objects compared to a static test database that has known data, using transactions to make sure nothing changes, when testing against the database?
You can do both. Use mock objects to test your BLL logic, and then use a test database to test your DAL logic. That way, if something breaks, you can easily see where the problem lies by which test fails.
Firstly, using a mock will be much quicker than connecting to an external database. The main reason, however, is that mocks behave the same way each time the test is run, which you can't guarantee with an external service like a database, so unit tests don't fail randomly. You can also easily simulate any type of failure you want to handle with a mock object.
However, it's also a good idea to run integration tests against a real database to test things like configuration and performance.
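For example, simulating a failure that would be awkward to reproduce with a real database is a one-liner with a mock. This is only a sketch; OrderRepository and OrderService are made-up stand-ins for your own code:

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;

class OrderServiceFailureTest {

    interface OrderRepository {
        void save(String order);
    }

    // Minimal service whose error handling we want to exercise
    static class OrderService {
        private final OrderRepository repository;
        OrderService(OrderRepository repository) { this.repository = repository; }

        boolean placeOrder(String order) {
            try {
                repository.save(order);
                return true;
            } catch (RuntimeException e) {
                return false;          // swallow the failure and report it
            }
        }
    }

    @Test
    void reportsFailureWhenTheRepositoryIsDown() {
        OrderRepository repository = mock(OrderRepository.class);
        // The mock fails the same way on every run, unlike a flaky external database
        doThrow(new RuntimeException("connection refused")).when(repository).save("book");

        OrderService service = new OrderService(repository);

        assertFalse(service.placeOrder("book"));
    }
}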
If you can ensure that the static test DB doesn't change during testing, then I think a static test DB is better than mock objects. But it depends on what you want to test and on the complexity of the code being tested. I would go with mock objects, as they are simpler to maintain than a DB.
Imagine you're about to write a class which doesn't exist. Maybe it's a controller. It's not something which talks straight to the database.
You have a good idea of how it ought to behave, and you know what it should be responsible for, and what it should delegate to some other service (using Single Responsibility Principle). So you write interfaces to represent the roles of the helper classes it's going to use.
Then you write an example of how you might use your class that you're about to create. If you like, you can call the example a unit test. You mock out the interactions with the helper classes. When you run the example, it fails, because you haven't written the code yet. You can now write the code to make it pass, using the interfaces of the helper classes - which you also haven't written yet.
Then you do the same with the helper classes, mocking out their helpers.
Eventually you'll reach a class which talks to the database. Or maybe it talks to a web service. Or perhaps the data is static, or in memory. (It doesn't matter to the original class, because your class is decoupled from this).
At this point you'll want an example to describe the behaviour of this class, too, and if it's a database connector, you'll need a database for the example.
Writing code this way produces code that's easy to use and understand, with nothing extra that isn't needed by something further up the stack. This is usually more robust than code that's easier to write. That's why we sometimes use mocks - because writing tests first using mocks helps produce good, maintainable, decoupled design.
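Here's a small sketch of that outside-in flow, with every name invented for the example: the class under development is written against a role interface, the interface is mocked in its test, and the real database-backed implementation only appears later with tests of its own:

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;

class CustomerControllerTest {

    // The role the controller needs; whether it is backed by a database,
    // a web service or memory is decided (and tested) elsewhere
    interface CustomerStore {
        Customer findById(long id);
    }

    static class Customer {
        final long id;
        final String name;
        Customer(long id, String name) { this.id = id; this.name = name; }
    }

    // The class being test-driven; it never talks to the database directly
    static class CustomerController {
        private final CustomerStore store;
        CustomerController(CustomerStore store) { this.store = store; }
        String customerName(long id) { return store.findById(id).name; }
    }

    @Test
    void showsTheCustomerNameFromTheStore() {
        CustomerStore store = mock(CustomerStore.class);
        when(store.findById(5L)).thenReturn(new Customer(5L, "Grace"));

        CustomerController controller = new CustomerController(store);

        assertEquals("Grace", controller.customerName(5L));
    }
}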
Usually you would use both of these approaches.
For unit tests you would use mocked objects. This way you can test your system in a much more fine-grained way, because you can mock every object, not only the ones that wrap the database connection. In general it is good practice to unit test every class in isolation from all of its dependencies. The benefits are that it is possible to test all the error handling at every level and make sure you exercise all the code paths, that when you get a test failure you can immediately find the cause, etc.
For integration and end-to-end tests you test big parts of the system or the whole thing. Then you would connect to the database (because this connection is part of the test). Obviously you have to ensure that the database is in a known state, etc., etc.
You will end up having many more unit tests than integration tests, so it's actually quite important to make them very quick, which is yet another advantage of using mock objects.
The setup for testing with a DB can be arduous. If you find yourself spending more time setting up the DB just to test some functional aspect of your business logic, then you might want to use a mock/fake.
I have a business logic layer with a broad spectrum of classes and their corresponding methods. Creating a unit test for a method is generally a given, but in some cases the method really just returns a set of matching records from the database. It seems kind of silly to write a unit test by, say, storing five matching records and then calling the BLL method to see if it returns all five records.
What is best practice here? What do you actually do - as opposed to what you'd like to say you would ideally do?
I believe the real question here is, why do you have methods in your Business Logic Layer that don't seem to contain any real business logic?
In this case, it seems like the method in question is just a Repository style method to retrieve records matching some criteria from the database.
In either situation, I would still Unit Test the method. There are several reasons:
Since the method is in the Business Logic Layer (in your case), it's possible that the method could end up becoming more involved. Adding the Unit Test now will ensure that even in the future, no matter the logic, the method is still getting tested for unexpected behavior.
If there is any logic at all (like determining which records match the business criteria), you still have to test that logic.
If you end up moving the method to the Data Layer, you should be testing your query against some mock data repository to make sure your queries work. That way, if your queries become more complex in the future...you're covered.
I use DBUnit to fill the database with a number of records, more than should be returned by the query. Then I call the method and make sure that only the right records are returned. This ensures that your query logic works, and it helps catch regression issues in the future if you refactor the database.
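In the same spirit, a sketch of that approach (JUnit 4 + DBUnit again; the data set file, UserDao and findActiveUsers are placeholders for your own types, and the seeded data is assumed to contain two active users out of five):

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.List;

import org.dbunit.database.DatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class FindActiveUsersTest {

    private UserDao userDao;   // placeholder for the class that runs the query

    @Before
    public void seed() throws Exception {
        Connection jdbc = DriverManager.getConnection("jdbc:h2:mem:test");
        userDao = new UserDao(jdbc);

        // Data set with 5 users, of which only 2 are active
        IDataSet seed = new FlatXmlDataSetBuilder()
                .build(getClass().getResourceAsStream("/active_and_inactive_users.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(new DatabaseConnection(jdbc), seed);
    }

    @Test
    public void returnsOnlyTheActiveUsers() {
        List<User> result = userDao.findActiveUsers();
        assertEquals(2, result.size());   // the 3 inactive rows are filtered out
    }
}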
I have a .NET application with a web front end and a WCF Windows service back end. The application is fairly simple: it takes some user input and sends it to the service. The service takes the input (an Excel spreadsheet), extracts the data items, and checks the SQL DB to make sure the items don't already exist; if they do not exist, we make a real-time request to a third-party data vendor, retrieve the results, and insert them into the database. It does some logging along the way.
I have a Job class with a single public ctor and a public Run() method. The ctor takes all the params, and the Run() method does all of the above logic. Each logical piece of functionality is split into a separate class: IParser does file parsing, IConnection does the interaction with the data vendor, IDataAccess does the data access, etc. The Job class has private instances of these interfaces and uses DI to construct the actual implementations by default, but allows the class user to inject any implementation.
In the real code, I use the default ctor. In my unit tests for the Run() method, I use mock objects for everything, created via NMock 2.0. This Run() method is essentially the 'top level' function of this application.
Now here's my issue / question: the unit tests for this Run() method are crazy. I have three mock objects I'm sending into the ctor, and expectations are set on each mock object. At the end I verify. I have a few different flows that the Run method can take, each flow having its own test: it could find everything is already in the database and not make a request to the vendor... or an exception could be thrown and the job status set to 'failed'... or we could have the case where we didn't have the data and needed to make the vendor request (so all those function calls would need to be made).
Now, before you yell at me and say 'your Run() method is too complicated!', this Run method is only a mere 50 lines of code! (It does make calls to some private functions, but the entire class is only 160 lines.) All the 'real' logic is done in the interfaces that are declared on this class. However, the biggest unit test on this function is 80 lines of code, with 13 calls to Expect.BLAH().
This makes refactoring a huge pain. If I want to change this Run() method around, I have to go edit my three unit tests and add/remove/update Expect() calls. When I need to refactor, I end up spending more time creating my mock calls than I do actually writing the new code. And doing real TDD on this function makes it even more difficult, if not impossible. It's making me think that it's not even worth unit testing this top-level function at all, since this class isn't really doing much logic; it's just passing data around to its composite objects (which are all fully unit tested and don't require mocking).
So - should I even bother testing this high-level function? And what am I gaining by doing this? Or am I completely misusing mock/stub objects here? Perhaps I should scrap the unit tests on this class and instead just write an automated integration test, which uses the real implementations of the objects and asserts against SQL queries to make sure the right end-state data exists? What am I missing here?
EDIT: Here's the code - the first function is the actual Run() method, followed by my five tests, which cover all five possible code paths. I changed it some for NDA reasons, but the general concept is still there. Is there anything you see wrong with how I'm testing this function? Any suggestions on what to change to make it better? Thanks.
I guess my advice echoes most of what is posted here.
It sounds as if your Run method needs to be broken down more. If its design is forcing you into tests that are more complicated than it is, something is wrong. Remember this is TDD we're talking about, so your tests should dictate the design of your routine. If that means testing private functions, so be it. No technological philosophy or methodology should be so rigid that you can't do what feels right.
Additionally, I agree with some of the other posters that your tests should be broken down into smaller segments. Ask yourself this: if you were writing this app for the first time and your Run function didn't yet exist, what would your tests look like? That answer is probably not what you have currently (otherwise you wouldn't be asking the question). :)
The one benefit you do have is that there isn't a lot of code in the class, so refactoring it shouldn't be very painful.
EDIT
Just saw you posted the code and had some thoughts (no particular order).
Way too much code (IMO) inside your SyncLock block. The general rule is to keep the code inside a SyncLock to a minimum. Does it ALL have to be locked?
Start breaking code out into functions that can be tested independently. Example: the For loop that removes IDs from the List(String) if they exist in the DB. Some might argue that the m_dao.BeginJob call should be in some sort of GetID function that can be tested.
Can any of the m_dao procedures be turned into functions that can be tested on their own? I would assume that the m_dao class has its own tests somewhere, but looking at the code it appears that might not be the case. They should, along with the functionality in the m_Parser class. That will relieve some of the burden on the Run tests.
If this were my code, my goal would be to get to a place where all the individual procedure calls inside Run are tested on their own and the Run tests just test the final outcome. Given input A, B, C: expect outcome X. Given input E, F, G: expect Y. The details of how Run gets to X or Y are already covered by the other procedures' tests.
These were just my initial thoughts. I'm sure there are a bunch of different approaches one could take.
Two thoughts: first you should have an integration test anyway to make sure everything hangs together. Second, it sounds to me like you're missing intermediate objects. In my world, 50 lines is a long method. It's hard to say anything more precise without seeing the code.
The first thing I would try would be refactoring your unit tests to share the setup code between tests, by extracting a method that sets up the mocks and expectations. Parameterize that method so your expectations are configurable. You may need one, or perhaps more, of these setup methods depending on how similar the setup is from test to test.
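The question's code is NMock2/.NET, but as a hedged sketch of the same idea in Java/Mockito terms: one parameterized helper builds the three collaborator mocks, so each test only states what differs. IParser, IConnection, IDataAccess and Job mirror the names in the question, while the method names (exists, requestData, run) are invented because the real code isn't shown:

import static org.mockito.Mockito.*;

import org.junit.jupiter.api.Test;

class JobRunTest {

    private IParser parser;
    private IConnection connection;
    private IDataAccess dataAccess;

    // One place to change when the ctor or the common expectations change
    private Job newJob(boolean itemsAlreadyInDatabase) {
        parser = mock(IParser.class);
        connection = mock(IConnection.class);
        dataAccess = mock(IDataAccess.class);
        when(dataAccess.exists(anyString())).thenReturn(itemsAlreadyInDatabase);
        return new Job(parser, connection, dataAccess);
    }

    @Test
    void skipsTheVendorWhenEverythingIsAlreadyStored() {
        Job job = newJob(true);

        job.run();

        verify(connection, never()).requestData(anyString());
    }

    @Test
    void callsTheVendorWhenDataIsMissing() {
        Job job = newJob(false);

        job.run();

        verify(connection).requestData(anyString());
    }
}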
So - should I even bother testing this high level function?
Yes. If there are different code-paths, you should be.
And what am I gaining by doing this? Or am I completely misusing mock/stub objects here?
As J.B. pointed out (nice seeing you at AgileIndia2010!), Fowler's article is a recommended read. As a gross simplification: use stubs when you don't care about the values returned by the collaborators. If the return values from collaborator.call_method() change the behavior (or you need non-trivial checks on the arguments, or computation for the return values), you need mocks.
Suggested refactorings:
Try moving the creation and injection of mocks into a common Setup method. Most unit testing frameworks support this; it will be called before each test.
Your LogMessage calls are beacons, calling out once again for intention-revealing methods, e.g. SubmitBARRequest(). This will shorten your production code.
Try to move each Expect.Blah1(..) into an intention-revealing method.
This will shorten your test code and make it immensely more readable and easier to modify. For example, replace all instances of
Expect.Once.On(mockDao) _
.Method("BeginJob") _
.With(New Object() {submittedBy, clientID, runDate, "Sent For Baring"}) _
.Will([Return].Value(0))
with
ExpectBeginJobOnDAO_AndReturnZero() ' you can name it better
On whether to test such a function: you said in a comment
" the tests read just like the actual
function, and since im using mocks,
its only asserting the functions are
called and sent params (i can check
this by eyeballing the 50 line
function)"
IMHO eyeballing the function isn't enough; haven't you ever heard "I can't believe I missed that!"? You have a fair number of scenarios that could go wrong in that Run method, so covering that logic is a good idea.
On tests being brittle: try having some shared methods that you can use in the test class for the common scenarios. If you are concerned about a later change breaking all the tests, put the pieces that concern you in specific methods that can be changed if needed.
On tests being too long / hard to know what's in there: don't test single scenarios with every assertion related to them. Break it up: test that it logs x messages when y happens (one test), that it saves to the db when y happens (another, separate test), that it sends a request to a third party when z happens (yet another test), etc.
On doing integration/system tests instead of these unit tests: you can see from your current situation that there are plenty of scenarios and little variations involved in that part of your system, and that's with the shield of the mocks replacing yet more logic and the ease of simulating different conditions. Doing the same with the whole thing will add a whole new level of complexity to your scenario, something that is surely unmanageable if you want to cover a wide set of scenarios.
IMHO you should minimize the combinations that you leave for your system tests; exercising some main scenarios should already tell you that a lot of the system is working correctly, and those tests should mostly be about everything being hooked up correctly.
That said, I do recommend adding focused integration tests for all the integration code you have that might not currently be covered by your tests, since by definition unit tests don't get there. These exercise the integration code specifically, with all the variations you expect from it; the corresponding tests are much simpler than trying to reach those behaviors in the context of the whole system, and they tell you very quickly if any assumption in those pieces is causing trouble.
If you think unit tests are too hard, do this instead: add post-conditions to the Run method. Post-conditions are assertions you make about the state of the code. For example, at the end of that method, you may want some variable to hold a particular value, or one value out of a set of possible choices.
Afterwards, you can derive the pre-conditions for the method. These are basically the data type of each parameter and the limits and constraints on each of those parameters (and on any other variable initialized at the beginning of the method).
In this way, you can be sure both the input and output are what is desired.
That probably still won't be enough, so you will have to look at the code of the method line by line and look for large sections that you want to make assertions about. If you have an If statement, you should check some conditions before and after it.
You won't need any mock objects if you know how to check if the arguments to the object are valid and you know what range of outputs are desired.
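A small sketch of what such pre- and post-conditions could look like using plain Java assertions (enabled with the JVM's -ea flag); the names and the specific conditions are only illustrative, not taken from the question's code:

public class Job {

    enum JobStatus { PENDING, COMPLETED, FAILED }

    private JobStatus status = JobStatus.PENDING;

    public void run(String spreadsheetPath) {
        // Pre-condition: constraints on the input parameters
        assert spreadsheetPath != null && !spreadsheetPath.isEmpty()
                : "run() needs a spreadsheet path";

        try {
            // ... parse the file, check the DB, call the vendor, insert rows ...
            status = JobStatus.COMPLETED;
        } catch (RuntimeException e) {
            status = JobStatus.FAILED;
        }

        // Post-condition: whatever happened above, the job ends in a terminal state
        assert status == JobStatus.COMPLETED || status == JobStatus.FAILED
                : "run() must leave the job in a terminal state";
    }
}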
Your tests are too complicated.
You should test aspects of your class rather than writing a unit test for each member of your class. A unit test should not cover the entire functionality of a member.
I'm going to guess that your tests for Run() set expectations on every method they call on the mocks, even when a given test doesn't focus on checking every such method invocation. I strongly recommend you Google "mocks aren't stubs" and read Fowler's article.
Also, 50 lines of code is pretty complex. How many code paths run through the method? 20+? You might benefit from a higher level of abstraction. I'd need to see the code to judge more certainly.