I have a Data Access Layer in my application which wraps an ADO.NET data provider. The DAL converts the data returned by the data provider into .NET objects. I've seen a lot of posts advising against unit testing the DAL, but it worries me that so much could go wrong in there - there's lots of looping and casting and null checks.
I had some thoughts about creating a mock DbProvider with something like RhinoMocks, but the number of interfaces I'd have to mock out in each test would be overwhelming, and the number of expectations I'd have to set would make the tests very hard to read. It seems that each test would be more complex than the code it was testing - which would be a disaster from the perspective of the 3 goals of unit testing:
Readability
Maintainability
Trustworthiness
I had an idea to implement a friendly DbProviderFactory to load the sample data from xml. I could plug it in via Dependency Injection in the tests. It should make maintaining the tests much simpler. A trivial example might be:
[TestCase]
public void CanGetCustomer()
{
    var expectedCommand = new XmlCommand("sp_get_customer");
    expectedCommand.ExpectExecuteDataReader(
        resultSet: @"<customer firstName=""Joe"" lastName=""Blogs"" ... />");

    var factory = new XmlProviderFactory(expectedCommand);
    var dal = new CustomerDal(factory);

    Customer customer = dal.GetCustomer();

    Assert.IsNotNull(customer, "The customer should never be null");
    Assert.AreEqual(
        "Joe", customer.FirstName,
        "The customer had an unexpected FirstName.");
}
I think this approach - using a friendly DbProvider - might make it easier to test the DAL code. It would have the following advantages:
The test data would be in xml, and could be placed in source control along with the unit tests. It could be in external files, or in-line in the tests.
I'm not using a real database, which eliminates the statefulness problem: I don't have to put a database into a known state prior to each test.
I don't have to mock out all of the ADO.NET interfaces in each test. I'll code one set of fake implementations which I can re-use all over the codebase.
Could people provide some critique on this idea? Is there already a similar implementation that I could use for this?
Thanks
I get into a lot of philosophical arguments about proper unit testing (no, not integration testing) of data access classes (DALs, DACs, DAOs, Repositories, etc.). Some argue it is pointless since you are really doing integration testing; I find tremendous value in unit testing these often neglected units of code.
First, in order to properly unit test a data access class, it must be structured correctly, and it must draw clear lines in the sand across which consumers interact: think interfaces. The data access implementation should implement an interface, and the consuming application code should depend only on that interface. Likewise, your choice of infrastructure code (ADO.NET, NHibernate, NDatabase, etc.) should expose interfaces that your data access code depends on.
With these infrastructure interfaces (IDbConnection, ISession, IDatabase, etc.) available and used correctly, you can then mock them in your unit tests using your mocking tool of choice. This leaves you with higher-quality data access code which has been unit tested (mocking the infrastructure interfaces), integration tested (against a REAL database), and which has lower net coupling all around.
One note: In my opinion, a bad code smell to be wary of is when data access related code bleeds past the data access (or persistence) layer. For example, if you see connections, commands, sessions, etc. used higher that the data access class implementation, this reeks of a violation of Separation of Concerns.
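To make the "mock the infrastructure interfaces" point concrete, here is a minimal sketch (Moq shown, but any mocking tool works) that fakes IDbConnection, IDbCommand and IDataReader for a hypothetical CustomerDal that accepts an IDbConnection; the class and method names are assumptions for illustration only:

using System.Data;
using Moq;
using NUnit.Framework;

[TestFixture]
public class CustomerDalTests
{
    [Test]
    public void GetCustomer_MapsFirstNameFromReader()
    {
        // Fake a reader that yields exactly one row.
        var reader = new Mock<IDataReader>();
        reader.SetupSequence(r => r.Read()).Returns(true).Returns(false);
        reader.Setup(r => r["FirstName"]).Returns("Joe");

        var command = new Mock<IDbCommand>();
        command.SetupAllProperties();                          // lets CommandText etc. be assigned freely
        command.Setup(c => c.ExecuteReader()).Returns(reader.Object);

        var connection = new Mock<IDbConnection>();
        connection.Setup(c => c.CreateCommand()).Returns(command.Object);

        var dal = new CustomerDal(connection.Object);          // hypothetical DAL under test
        Customer customer = dal.GetCustomer();

        Assert.AreEqual("Joe", customer.FirstName);
    }
}

Because the fakes are built against the narrow ADO.NET interfaces, no real provider or database is involved, yet the looping, casting and null-check code in the DAL is fully exercised.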
Related
I've recently had an interesting experience but didn't find a satisfying answer so far: I'm a big fan of DDD and try to define rich domain objects with behavior and good information hiding, even if the team officially doesn't practice DDD. At the end of the day, it doesn't matter, as you have a well-defined object, which represents something in the problem domain.
That said, I would also like to practice TDD more. Unfortunately, if I test a service, which uses such rich domain models, the models are usually not abstracted. Therefore, to test the behavior of the service, I need to set up the model as well. This model comes with its own invariants etc., therefore with every service test, I also test the model the service is using.
This seems like a big no-go, as I'm not only "not really unit-testing", but it's also troublesome to set up the tests, as the arrange-code gets large.
In my opinion, there seems to be no way around this but to start creating interfaces for models. But it seems like I am the only person thinking so. For example, here is a lengthy article arguing that this is an anti-pattern:
https://lostechies.com/jamesgregory/2009/05/09/entity-interface-anti-pattern/
I’m also not too keen on creating interfaces for all models, as they should really represent something, and adding another layer of abstraction just for testing seems like overkill. That said, what would be the best solution here? How do people in the field who combine DDD and TDD handle this?
This seems like a big no-go, as I'm not only "not really unit-testing", but it's also troublesome to set up the tests, as the arrange-code gets large.
I think you can dismiss "not really unit-testing"; the important thing is to use tools that are fit for purpose, not the branding.
That said, troublesome to set up the tests is a legitimate concern, and all by itself sufficient excuse to look for a way to improve the design.
If your service were tightly coupled to some third party implementation, that offered no affordances for substitution, what would you do to decouple that from your tests? The usual answer would be to introduce a seam - a new design element between your code and the 3rd party code.
The two important characteristics of the seam:
it does afford substitution; which is to say, you have an interface.
the implementation of the interface that integrates with the third party code is "so simple there are obviously no deficiencies".
Then, in your tests, you introduce a substitute implementation.
The game with your "domain model" is exactly the same. Assuming that you are applying the usual lifecycle patterns, the seam includes a substitute for the repository and a substitute for the aggregate root entity.
Some good news: you don't necessarily need to shadow the entire aggregate, only the parts of the interface that your service cares about. In effect, what you are doing is defining, for each service, the contract that describes the interactions between your service and the domain model. "Role interfaces" will be a useful search term here.
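As a rough illustration, here is a minimal C# sketch of such a seam; the OrderFulfilmentService, Order aggregate and IReservableOrder role interface are hypothetical names chosen for the example:

// Role interface: only the behaviour this particular service needs.
public interface IReservableOrder
{
    bool CanReserveStock();
    void MarkReserved();
}

// The real aggregate implements the role alongside its full behaviour and invariants.
public class Order : IReservableOrder
{
    public bool CanReserveStock() { /* real business rules */ return true; }
    public void MarkReserved() { /* real state change */ }
}

// The service depends only on the narrow contract, never on the concrete aggregate.
public class OrderFulfilmentService
{
    public void Reserve(IReservableOrder order)
    {
        if (order.CanReserveStock())
            order.MarkReserved();
    }
}

// In a test, a trivial substitute stands in for the aggregate.
public class ReservableOrderStub : IReservableOrder
{
    public bool Reserved { get; private set; }
    public bool CanReserveStock() { return true; }
    public void MarkReserved() { Reserved = true; }
}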
First I will make sure these two conditions are met:
Domain models are POJOs
Domain layer isolation (other layers can access domain layer but not the way around)
Then a Factory, Builder or TestHelpers can be used to bring the models to the desired state for tests.
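For example, a minimal test data builder might look like the sketch below; the Customer model and its constructor are assumptions for illustration, the point being that the builder hides the arrange-code behind sensible defaults:

public class CustomerBuilder
{
    // Sensible defaults so most tests only override what they care about.
    private string _firstName = "Joe";
    private string _lastName = "Blogs";
    private bool _active = true;

    public CustomerBuilder WithFirstName(string firstName) { _firstName = firstName; return this; }
    public CustomerBuilder Inactive() { _active = false; return this; }

    public Customer Build() { return new Customer(_firstName, _lastName, _active); }
}

// Usage in a test:
// var customer = new CustomerBuilder().WithFirstName("Jane").Inactive().Build();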
Basics
Testing Scopes
Unit Testing
Integration Testing
Domain Models
These should be unit tests, which test the Domain Model / Aggregate's methods.
Services
These should be integration tests, which test the integration of Service methods and the associated models.
My Broad Approach
When you're testing your domain models, there may be many variants that you'll need to account for in your unit tests.
When these then translate over to a requirement to use within an integration test, I tend to go for some sort of CreationFactory (or ArrangementFactory) for your domain models.
You can then use these in both sets of tests.
So for example...
public class ArrangeUser {
    public static User StandardUser() {
        return new User(...standard...);
    }
    public static User AdminUser() {
        return new User(...admin...);
    }
}
Then in your Unit Test...
// Arrange
User standardUser = ArrangeUser.StandardUser();
// Act
bool canDoSomething = standardUser.CanDoSomething();
// Assert
Assert.True(canDoSomething);
Then in your Integration Test...
// Arrange
User standardUser = ArrangeUser.StandardUser();
ServiceToTest service = new ServiceToTest(standardUser); // replace with some sort of Repository Mock or whatever suits.
// Act
bool canDo = service.CanDoService();
// Assert
Assert.True(canDo);
This way you can test both the unit aspect and the service aspect, by creating a common way to build the arrangements, without having to abstract out the entities; it also solves the problem of recreating the same thing over and over again.
NB. This is just a basic code demo that can be made more complex, based on the scenario or your preferred test style.
I had a similar challenge and, together with my team, we created a tool that simplifies the test data arranging process by employing a random data generator: https://github.com/ocadotechnology/test-arranger. Especially take a look at:
How to organize tests with Test Arranger as it explicitly refers to the common DDD building blocks and explains how to arrange test data around them. In my case, following those recommendations resulted in a significant reduction in the amount and complexity of code for preparing the test data.
Custom Arrangers as it shows how to deal with the model invariants.
Besides the recommendations given on the test-arranger page, it is also handy to use Lombok's @Builder(toBuilder = true) (or an equivalent like Kotlin's copy method from data classes) on your domain classes. With the toBuilder method you can easily adjust randomly generated value objects and entities to the needs of a certain test case.
Dynamics AX 2012 comes with unit testing support.
To have meaningful tests some test data needs to be provided (stored in tables in the database).
To get a reproducible outcome from the unit tests we need to have the same data stored in the tables every time the tests are run. Now the question is, how can we accomplish this?
I learned that there is the possibility of setting the isolation level for the TestSuite to SysTestSuiteCompanyIsolateClass. This will create an empty company and delete the company after the tests have been run. In the setup() method I can fill my test data into the tables with insert statements. This works fine for small scenarios but becomes cumbersome very fast if you have a real-life project.
I was wondering if there is anyone out there with a practical solution of how to use the X++ Unit Test Framework in a real world scenario. Any input is very much appreciated.
I agree that creating test data in a new and empty company only works for fairly trivial scenarios or scenarios where you implemented the whole data structure yourself. But as soon as existing data structures are needed, this approach can become very time consuming.
One approach that worked well for me in the past is to run unit tests in an existing company that already has most of the configuration data (e.g. financial setup, inventory setup, ...) needed to run the test. The test itself runs in a ttsBegin - ttsAbort block so that the unit test does not actually persist any data.
Another approach is to implement data provider methods that are test agnostic, but create data that is often used in unit tests (e.g. a method that creates a product). It takes some time to create a useful set of data provider methods, but once they exist, writing unit tests becomes a lot faster. See SysTest part V.: Test execution (results, runners and listeners) on how Microsoft uses a similar approach (or at least they used to back in 2007 for AX 4.0).
Both approaches can also be combined, you would call the data provider methods inside the ttsBegin - ttsAbort block to create the needed data only for the unit test.
Another useful method is to use doInsert or doUpdate to create your test data, especially if you are only interested in a few fields and do not need to create a completely valid record.
I think that the unit test framework was an afterthought. In order to really use it, Microsoft would have needed to provide unit test classes, then when you customize their code, you also customize their unit tests.
So without that, you're essentially left coding unit tests that try and encompass base code along with your modifications, which is a huge task.
Where I think you can actually use it is around isolated customizations that perform some function, and aren't heavily built on base code. And also with customizations that are integrations with external systems.
Well, from my point of view, you will not be able to leverage more than what you pointed from the standard framework.
What you can do is more around release management. You can set up an integration environment with the targeted data, push your nightly-build model into this environment at the end of the build process, and then run your tests.
Yes, it will take more effort to set up and maintain, but it's the only solution I've seen until now to have a large and consistent set of data to run unit or integration tests on.
To have meaningful tests some test data needs to be provided (stored in tables in the database).
As someone else already indicated - I found it best to leverage an existing company for data. In my case, several existing companies.
To get a reproducible outcome from the unit tests we need to have the same data stored in the tables every time the tests are run. Now the question is, how can we accomplish this?
We have built test helpers that help us "run the test", automating what a person would do, given you have architected your application to be testable. In essence our test class uses the helpers to run the test, then provides most of the value by validating the data it created.
I learned that there is the possibility of setting the isolation level for the TestSuite to SysTestSuiteCompanyIsolateClass. This will create an empty company and delete the company after the tests have been run. In the setup() method I can fill my test data into the tables with insert statements. This works fine for small scenarios but becomes cumbersome very fast if you have a real-life project.
I did not find this practical in our situation, so we haven't leveraged it.
I was wondering if there is anyone out there with a practical solution of how to use the X++ Unit Test Framework in a real world scenario. Any input is very much appreciated.
We've been using the testing framework as stated above and it has been working for us. The key is to find the correct scenarios to test; it also provides a good foundation for writing testable classes.
Just putting this one out for debate really.
I get unit testing. Sometimes feels time consuming but I'm all for the benefits.
I've an application set up that contains a repository layer and a service layer, using IoC, and I've been unit testing the methods.
Now I know the benefits of isolating my methods for unit testing so there is little or no dependency on other methods.
The question I've got is this: if I only ever access my repository methods through my service layer methods, would testing only the service layer not be good enough? I'm testing against a test database.
Could it not be considered an extension of the idea that you only need to test your public methods? Maybe I'm just trying to skip some testing ;)
Yes, you should test your repository layer. Although the majority of these tests fall into a different classification of tests. I usually refer to them as integration tests to distinguish them from my unit tests. The difference being that there is an external dependency on a resource (your database) and that these tests will likely take much longer to run.
The primary reason for testing your repositories separately is that you'll be testing different things. The repository is responsible for handling translation and interaction with whatever persistence store you're using. The service layer, on the other hand, is responsible for coordinating your various repositories and other dependencies into functionality that represents business logic, which likely involves more than just a relay to a repository method and in some instances may involve multiple calls to multiple repositories.
First, to clarify the service layer testing - when testing the service layer, the repositories should be mocked so that they are isolated from what you're testing in the service layer. As you pointed out in your comment, this gives you a more granular level of testing and isolates the code under test. Your unit tests will also run much faster now because there are no database connections slowing them down.
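A minimal sketch of that isolation, assuming a hypothetical ICustomerRepository and CustomerService (Moq shown, but any mocking tool works):

using Moq;
using NUnit.Framework;

[TestFixture]
public class CustomerServiceTests
{
    [Test]
    public void GetDisplayName_CombinesFirstAndLastName()
    {
        // The repository is mocked, so no database connection is involved.
        var repository = new Mock<ICustomerRepository>();
        repository
            .Setup(r => r.GetById(42))
            .Returns(new Customer { FirstName = "Joe", LastName = "Blogs" });

        var service = new CustomerService(repository.Object);

        Assert.AreEqual("Joe Blogs", service.GetDisplayName(42));
    }
}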
Now, here are a few advantages of adding integration tests to your repositories...
It allows you to test out those pieces of code as you're writing them, a la TDD.
It ensures that whatever persistence language you're using (SQL, HQL, serialized objects, etc.) is formulated correctly for the operation you're attempting to perform.
If you're using an object-relational mapper, it ensures that your mappings are defined correctly.
In the future, you may find that you need to support another type of persistence. Depending on how your repository tests are structured, you may be able to reuse a large number of the tests to verify that the new database schema works correctly. For repository methods that implement database specific logic, obviously you'll have to create separate tests.
When coupled with Continuous Integration it's nice to have the repository tests separated. Integration tests, by nature take longer to run than unit tests. As such, they're usually run at less frequent intervals so that the immediate feedback available from running unit tests is not delayed.
Those are all advantages that I've seen in various projects that I've worked on. There may be more.
All that having been said, I will admit that I'm not as thorough with the repository integration tests as I am with unit tests. When it comes to testing an update on a particular object, for example, I'm usually content testing that one database column was successfully updated rather than creating a separate test for each individual column or a larger test that verifies every column in one test. For me, it depends on the complexity of the operation that the repository method is performing and whether there's any special condition that needs to be isolated.
You should test your repository layer. However if you have integration, story or system tests that cover it, then you can make a good case of not having unit tests as well.
Unit testing is great for complex stand-alone objects, but there is no point spending a long time writing unit tests for simple methods that are covered by “higher level” tests.
Wouldn't this depend on how smart the repository access layer is? If your repository takes parameters to filter the result set (LINQ to SQL, for example), surely that logic will need to be tested.
Unit tests: test an individual logic (a method) without worrying the dependency of that logic. Mostly falls in white box category.
Integration test: can test end to end flow or more than one layer together to ensure its correctness. Mostly falls in black box category.
In a DAO there is usually no business logic; it just builds a query for a particular database implementation. So there is no need for a unit test if we have already covered it in our integration tests. Still, we can write unit tests for a DAO if there is some logic in it.
Because DAO layers are so tightly coupled to the database implementation, a JUnit test for a DAO has in practice become synonymous with testing the underlying database.
The query we build can only be validated by the underlying database engine.
I used to write unit tests (you could call them integration tests) for DAOs by using an actual database, or by substituting a compatible database that follows the same SQL standard (for example, a MySQL engine can be replaced by SQLite or an in-memory H2 database), and injecting that database into the DAO to test both the DAO layer and the queries built in it.
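The same substitution idea in .NET might look like the following sketch, assuming a hypothetical CustomerDao that works against any IDbConnection; an in-memory SQLite database stands in for the production engine:

using Microsoft.Data.Sqlite;
using NUnit.Framework;

[TestFixture]
public class CustomerDaoTests
{
    [Test]
    public void Insert_ThenGetByFirstName_RoundTrips()
    {
        using var connection = new SqliteConnection("Data Source=:memory:");
        connection.Open();

        // Create just enough schema for the DAO under test.
        using (var create = connection.CreateCommand())
        {
            create.CommandText = "CREATE TABLE Customer (Id INTEGER PRIMARY KEY, FirstName TEXT)";
            create.ExecuteNonQuery();
        }

        var dao = new CustomerDao(connection);          // hypothetical DAO under test
        dao.Insert(new Customer { FirstName = "Joe" });

        Assert.IsNotNull(dao.GetByFirstName("Joe"));
    }
}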
I get unit testing
Next step is Test Driven Development (TDD). It will answer your question.
I've set up unit tests that test a fake repository and tests that make use of a fake repository.
But what about testing the real repository that hits the database? If this is left to integration tests then it would seem that it isn't tested directly and problems could be missed.
Am I missing something here?
Well, the integration tests would only test the literal persistence or retrieval of data to and from the layer of persistence. If your repository is doing any kind of logic concerning that data (validation, throwing exceptions if an object isn't found, etc.), that can be unit tested by faking what the persistence layer returns (whether it returns the queried object, a return code, or something else). Your integration test will assure you that the code can physically persist/retrieve data from persistence, and that's it. Any sort of logic to test ought to belong in a unit test.
Sometimes, however, logic could exist in the persistence layer itself (e.g. stored procedures). This could be for the sake of efficiency, or it could merely be legacy code. This is harder to properly unit test, as you can only access the logic by getting to the database. In this scenario, it'd probably be best to try and move the logic to your code base as much as possible, so that it can be tested more easily. There probably exist unit testing frameworks for scenarios such as these, but I'm not aware of them (merely out of inexperience).
Can you set up a real repository that tests against a fake database?
Regardless of what you do, this is integration testing, not unit testing.
I'd definitely suggest integration tests against the DAL, within reason.
We don't use the Repository pattern per se (to our chagrin), but our policy for similar classes (Searchers) is as follows:
If the method does a simple retrieve from the database using an O/RM call, don't test it.
If the method uses query-building features of the O/RM, test it.
If the method contains a string (such as a column name), test it.
If the method calls a stored procedure, test it.
If the method contains logic, test it. But try to avoid logic.
If the method bypasses the O/RM and uses raw SQL, test it. But really try to avoid this.
The gist is you should know your O/RM works (and hopefully has tests), so there's no reason to test basic CRUD behavior.
You'll definitely want a "test deck" - an in-memory database, a local file-backed database that can be checked into source control, or (if you have to) a shared database. Some testing frameworks offer rollback facilities to restore the database state; just be careful if you're hitting multiple databases in the same test or (in some cases) if you have embedded transactions.
EDIT: Note that these integration tests will still test your repository in "isolation" (save for the database). All your other unit tests will use a fake repository.
I recently covered a very similar question over here.
In summary: test your concrete Repository implementations if there's value in doing so. If you are doing something complex in your implementation, it is probably a good idea to test it. If you are using an ORM with no custom logic, there may not be much value in writing tests at that level.
How are people unit testing their business applications? I've seen a lot of examples of unit testing with "simple to test" examples, e.g. a calculator. How are people unit testing data-heavy applications? How are you putting together your sample data? In many cases, data for one test may not work at all for another test, which makes it hard to have just one test database.
Testing the data access portion of the code is fairly straightforward. It's testing out all the methods that work against the data that seem to be hard to test. For example, imagine a posting process where there is heavy data access to determine what is posted, numbers are adjusted, etc. There are a number of interim steps that occur (and need to be tested) along with tests afterwards that ensure the posting was successful. Some of those steps may actually be stored procedures.
In the past I've tried inserting the test data in a test database, then running the test, but honestly it's pretty painful to write this kind of code (and error prone). I've also tried just building a test database up front and rolling back the changes. That works OK but in a number of places you can't easily do this either (and many people would say that's integration testing; so be it, I still need to be able to test this somehow).
If the answer is that there isn't a nice way of handling this and it currently just sort of sucks, that would be useful to know as well.
Any thoughts, ideas, suggestions, or tips are appreciated.
My automated functional tests usually follow one of two patterns:
Database Connected Tests
Mock Persistence Layer Tests
Database Connected Tests
When I have automated tests that are connected to the database, I usually make a single test database template that has enough data for all the tests. When the automated tests are run, a new test database is generated from the template for every test. The test database has to be constantly re-generated because tests will often change the data. As tests are added, I usually append more data to the test database template.
There are some nice advantages to this testing method. The obvious advantage is that the tests also exercise your schema. Another advantage is that after setting up the initial tests, most new tests will be able to re-use the existing test data. This makes it easy to add more tests.
The downside is that the test database will become unwieldy. Because data will usually be added one test at a time, it will be inconsistent and maybe even unrealistic. You will also end up cursing the person who set up the test database when there is a significant database schema change (which for me usually means I end up cursing myself).
This style of testing obviously doesn't work if you can't generate new test databases at will.
Mock Persistence Layer Tests
For this pattern, you create mock objects that live with the test cases. These mock objects intercept the calls to the database so that you can programmatically provide the appropriate results. Basically, when the code you're testing calls the findCustomerByName() method, your mock object is called instead of the persistence layer.
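A hand-rolled version of such a mock object might look like this sketch, where the ICustomerPersistence interface and the canned Customer are hypothetical names used only for illustration:

public interface ICustomerPersistence
{
    Customer FindCustomerByName(string name);
}

public class StubCustomerPersistence : ICustomerPersistence
{
    private readonly Customer _canned;

    public StubCustomerPersistence(Customer canned) { _canned = canned; }

    // Intercepts the call the production code would normally send to the database
    // and returns a programmatically chosen result instead.
    public Customer FindCustomerByName(string name) { return _canned; }
}

The code under test is handed a StubCustomerPersistence in place of the real persistence layer, so the result of FindCustomerByName is fully controlled by the test.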
The nice thing about using mock object tests is that you can get very specific. Often times, there are execution paths that you simply can't reach in automated tests w/o mock objects. They also free you from maintaining a large, monolithic set of test data.
Another benefit is the lack of external dependencies. Because the mock objects simulate the persistence layer, your tests are no longer dependent on the database. This is often the deciding factor when choosing which pattern to choose. Mock objects seem to get more traction when dealing with legacy database systems or databases with stringent licensing terms.
The downside of mock objects is that they often result in a lot of extra test code. This isn't horrible, because almost any amount of testing code is cheap when amortized over the number of times you run the test, but it can be annoying to have more test code than production code.
I have to second the comment by @Phil Bennett, as I try to approach these integration tests with a rollback solution.
I have a very detailed post about integration testing your data access layer here
I show not only the sample data access class, base class, and sample DB transaction fixture class, but a full CRUD integration test with sample data shown. With this approach you don't need multiple test databases, as you can control the data going in with each test, and after the test is complete the transactions are all rolled back, so your DB is clean.
About unit testing the business logic inside your app, I would also second the comments by @Phil and @Mark, because if you mock out all the dependencies your business object has, it becomes very simple to test your application logic one entity at a time ;)
Edit: So are you looking for one huge integration test that verifies everything, from the logic that runs before the database / stored procedure call through to a verification on the way back? If so, you could break this out into 2 steps:
1 - Unit test the logic that happens before the data is pushed into your data access code. For example, if you have some code that calculates some numbers based on some properties, write a test that only checks whether the logic for this one function does what you asked it to do. Mock out any dependency on the data access class so you can ignore it for this test of the application logic alone.
2 - Integration test the logic that happens once you take your manipulated data (from the previous method we unit tested) and call the appropriate stored procedure. Do this inside a data-specific testing class so you can roll back after it's completed. After your stored procedure has run, run a query against the database to get your object now that we have done some logic against the data, and verify it has the values you expected (post-stored-procedure logic, etc.).
If you need an entry in your database for the stored procedure to run, simply insert that data before you run the sproc that has your logic inside it. For example, if you have a product that you need to test, it might require a supplier and category entry to insert so before you insert your product do a quick and dirty insert for a supplier and category so your product insert works as planned.
It depends on what you're testing. If you're testing a business logic component, then it's immaterial where the data is coming from and you'd probably use a mock or a hand-rolled stub class that simulates the data access routine the component would have called in the wild. The only time I mess with the data access is when I'm actually testing the data access components themselves.
Even then I tend to open a DB transaction in the TestFixtureSetUp method (obviously this depends on what unit testing framework you might be using) and roll back the transaction in the TestFixtureTearDown at the end of the test suite.
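A minimal sketch of that rollback pattern, assuming NUnit and a hypothetical CustomerRepository, using a TransactionScope per test rather than per fixture; disposing the scope without calling Complete() rolls everything back:

using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class CustomerRepositoryIntegrationTests
{
    private TransactionScope _scope;

    [SetUp]
    public void OpenTransaction()
    {
        // Everything the test writes happens inside this ambient transaction.
        _scope = new TransactionScope();
    }

    [TearDown]
    public void RollbackTransaction()
    {
        // Complete() is never called, so Dispose() rolls the work back,
        // leaving the database in its original state for the next test.
        _scope.Dispose();
    }

    [Test]
    public void Insert_ThenGetById_ReturnsCustomer()
    {
        var repository = new CustomerRepository("connection string here");   // hypothetical repository
        var id = repository.Insert(new Customer { FirstName = "Joe" });

        Assert.IsNotNull(repository.GetById(id));
    }
}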
Mocking Frameworks enable you to test your business objects.
Data-driven tests often end up becoming more of an integration test than a unit test; they also carry with them the burden of managing the state of a data store pre and post execution of the test, and the time taken in connecting and executing queries.
In general I would avoid writing unit tests that touch the database from your business objects. For testing your database you need a different strategy.
That being said, you can never totally get away from data-driven testing; you can only limit the number of tests that actually need to invoke your back-end systems.
It sounds like you might be testing message based systems, or systems with highly parameterised interfaces, where there are large numbers of permutations of input data.
In general all the rules of standard unit testing still hold:
Try to make the units being tested as small and discrete as possible.
Try to make tests independent.
Factor code to decouple dependencies.
Use mocks and stubs to replace dependencies (like data access).
Once this is done you will have removed a lot of the complexity from the tests, hopefully revealing good sets of unit tests, and simplifying the sample data.
A good methodology for compiling sample data for tests that still require complex input data is orthogonal testing, or see here.
I've used that sort of method for generating test plans for WCF and BizTalk solutions where the permutations of input messages can create multiple possible execution paths.
For lots of different runs over the same logic but with different data, you can use CSV: as many columns as you like for the inputs, with the last column for the expected output.
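A minimal sketch of that CSV-driven style with NUnit, assuming a hypothetical "posting-cases.csv" file where every column except the last is an input and the last column is the expected result (the posting logic itself is stubbed out with a simple sum):

using System.Collections.Generic;
using System.IO;
using System.Linq;
using NUnit.Framework;

public class PostingTests
{
    public static IEnumerable<TestCaseData> CsvCases()
    {
        foreach (var line in File.ReadLines("posting-cases.csv").Skip(1))   // skip the header row
        {
            var cells = line.Split(',');
            var inputs = cells.Take(cells.Length - 1).Select(decimal.Parse).ToArray();
            var expected = decimal.Parse(cells.Last());
            yield return new TestCaseData(inputs, expected);
        }
    }

    [TestCaseSource(nameof(CsvCases))]
    public void Posting_ProducesExpectedTotal(decimal[] inputs, decimal expected)
    {
        Assert.AreEqual(expected, inputs.Sum());   // stand-in for the real posting calculation
    }
}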