I have a Setup method with many SQL inserts. This method is called before every test method.
The more tables I add to my database, the bigger the Setup method gets, and it becomes hard to overview and maintain.
Sometimes I think I should write a private setup method for each test method, so each test would need less insert data, but overall that would mean more SQL inserts than the shared Setup method.
Often a specific test does not need some of the SQL inserts in the setup method, so I cannot easily tell which setup data belongs to which test method.
What have you found to be a good approach?
Consider using file resources (SQL scripts, to be precise). Having SQL stored as strings within a class often ends up being a maintainability nightmare.
In my C# projects what I usually do is:
create scripts inserting the data required by a given test (setup)
create a script reverting the database to its state before the inserts (teardown)
add those scripts to project resources (so that they can be stored in .sql/text files, easily run against the database when needed, and don't clutter the class code)
Then run them before/after the test (within the test method body, though). For example (pseudocode):
public void DeleteClient_DeletesClientAndOrderHistory()
{
    ExecuteSql(Resources.Scripts.DeleteClientTest_SetupScript);
    // perform test
    ExecuteSql(Resources.Scripts.DeleteClientTest_TeardownScript);
}
Of course, you wrap such constructs in try/finally and add other revert-to-starting-point safety mechanisms. This will, however, have a rather large impact on test execution time. You can consider two extra options:
having class-wide setup and teardown, where you insert all the data required by the tested DAO class (runs once per class)
having assembly/namespace-wide setup and teardown, where you insert all data required by all DAO classes (runs once per all classes in assembly/namespace)
Naturally, you'll have to check whether your framework supports such methods (for example, NUnit does, with the [TestFixtureSetUp] (class-wide) and [SetUpFixture] (assembly/namespace-wide) attributes).
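A minimal sketch of both options with NUnit 2-style attributes (SqlHelper.ExecuteSql and the script resource names are placeholders; in NUnit 3 the equivalents are [OneTimeSetUp]/[OneTimeTearDown]):

using NUnit.Framework;

[TestFixture]
public class ClientDaoTests
{
    [TestFixtureSetUp]    // runs once before all tests in this fixture (class-wide)
    public void InsertClientDaoData()
    {
        SqlHelper.ExecuteSql(Resources.Scripts.ClientDao_SetupScript);
    }

    [TestFixtureTearDown] // runs once after all tests in this fixture
    public void RemoveClientDaoData()
    {
        SqlHelper.ExecuteSql(Resources.Scripts.ClientDao_TeardownScript);
    }

    // ... [Test] methods ...
}

[SetUpFixture]            // applies once to all fixtures in this namespace/assembly
public class SharedDataSetup
{
    [SetUp]
    public void InsertSharedData()
    {
        SqlHelper.ExecuteSql(Resources.Scripts.Shared_SetupScript);
    }

    [TearDown]
    public void RemoveSharedData()
    {
        SqlHelper.ExecuteSql(Resources.Scripts.Shared_TeardownScript);
    }
}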
With code targeting the full .NET Framework, I could mock an IDbConnection and point it at a mocked DataSet in order to test that my queries were executing correctly. Similarly, if I were using Entity Framework 6, I could have a mocked DbSet return IQueryables and test my data layer logic against that.
However, .NET Core doesn't support DataSets (though that may change in the future?).
In the meantime, is there a way to create a collection of objects that Dapper can query through an IDbConnection, in order to test the query logic?
No. All Dapper is, is a set of extension methods on top of the IDbConnection interface.
There is no in-memory implementation of IDbConnection that understands SQL strings.
Your best bet, however, if you want to run the tests completely autonomously, is to spin up a new SQL Server instance each time you run them. This can easily be done with the Docker image that Microsoft provides for SQL Server: https://hub.docker.com/r/microsoft/mssql-server-linux/
Or migrate to Entity Framework, which allows you to unit test against an in-memory backing store.
Why?
Dapper just contains some useful features for working with SQL. It by no means abstracts the SQL away: SQL is just plain text as far as the C# code is concerned, and Dapper itself neither parses nor executes it; the database does. Thus you can't unit test your SQL/Dapper code without a database behind it.
Entity Framework does it differently: it tries to turn everything you would want to do in a database into C# code/abstractions (e.g. DbSet<T>). Then there is one implementation that generates SQL and one implementation that uses an in-memory backing store, and that is what lets you unit test your code.
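As a rough illustration of that in-memory option (not from the answer: ShopContext, Client, and ClientService are invented types, and the provider comes from the Microsoft.EntityFrameworkCore.InMemory package):

using System.Linq;
using Microsoft.EntityFrameworkCore;
using NUnit.Framework;

[TestFixture]
public class ClientServiceTests
{
    [Test]
    public void DeleteClient_RemovesClient()
    {
        // Each test gets its own named in-memory store instead of a real SQL Server.
        var options = new DbContextOptionsBuilder<ShopContext>()
            .UseInMemoryDatabase(databaseName: "DeleteClient_RemovesClient")
            .Options;

        using (var context = new ShopContext(options))
        {
            context.Clients.Add(new Client { Id = 42, Name = "Test client" });
            context.SaveChanges();
        }

        using (var context = new ShopContext(options))
        {
            new ClientService(context).DeleteClient(42);
            Assert.AreEqual(0, context.Clients.Count());
        }
    }
}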
Microsoft's solution
Microsoft often advertises the Repository pattern. This is basically an expensive word for abstracting all your database calls/commands into separate classes, putting interfaces in front of them, and using those interfaces everywhere in your code (via dependency injection). You can then write unit tests that cover all of your code except the SQL queries themselves; for the interface, you create a mock and test whether the method is actually called.
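A minimal sketch of that idea (the interface, service, and Moq usage are illustrative, not taken from anyone's code base):

using Moq;
using NUnit.Framework;

public interface IOrderRepository
{
    void Delete(int orderId);
}

public class OrderService
{
    private readonly IOrderRepository _orders;
    public OrderService(IOrderRepository orders) { _orders = orders; }

    public void CancelOrder(int orderId)
    {
        // ... business rules under test go here ...
        _orders.Delete(orderId);
    }
}

[TestFixture]
public class OrderServiceTests
{
    [Test]
    public void CancelOrder_DeletesTheOrder()
    {
        var orders = new Mock<IOrderRepository>();

        new OrderService(orders.Object).CancelOrder(7);

        // The SQL behind the real repository is covered by separate integration tests.
        orders.Verify(o => o.Delete(7), Times.Once());
    }
}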
Another option for testing your database access code (queries etc.) is to use a local SQL database instance, but instead of recreating it every time, start a database transaction as part of your unit-test setup and roll it back in teardown. Depending on the isolation level you choose, this also addresses concurrency issues when tests/fixtures are executed in parallel.
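A minimal sketch of that transaction-per-test approach with NUnit and System.Transactions (OrderDao, Order, and the connection string are placeholders):

using System.Transactions;
using NUnit.Framework;

[TestFixture]
public class OrderDaoTests
{
    private TransactionScope _scope;

    [SetUp]
    public void BeginTransaction()
    {
        // Ambient transaction: connections opened inside the test enlist automatically.
        _scope = new TransactionScope();
    }

    [TearDown]
    public void RollBack()
    {
        // Disposing without calling Complete() rolls everything back, leaving the database untouched.
        _scope.Dispose();
    }

    [Test]
    public void Insert_ThenGetById_ReturnsTheOrder()
    {
        var dao = new OrderDao("Server=.;Database=ShopTest;Trusted_Connection=True;");
        dao.Insert(new Order { Id = 1, Amount = 10m });

        Assert.IsNotNull(dao.GetById(1));
    }
}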
Looking for some strategies for how you guys are loading default data when doing unit tests.
I use a builder that contains the default values, just like this: http://elegantcode.com/2008/04/26/test-data-builders-refined/. Then the test only specifies the values it cares about:
Customer customer = new CustomerBuilder()
    .WithFirstName("this test only cares about a special ' ... first name value");
After reading the other answers, I want to clarify that this is not for database data. It's for building the instances/data you pass to the classes you are unit testing.
It's a matter of convenience and of keeping the tests simple: plenty of times you are testing very specific behavior that depends on one to three fields, and you don't care about the rest of the fields.
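A minimal sketch of such a builder (the defaults and the Customer type are invented); the implicit conversion at the end is what lets the snippet above assign the builder directly to a Customer without calling Build():

public class CustomerBuilder
{
    // Sensible defaults, so tests only override what they actually care about.
    private string _firstName = "John";
    private string _lastName = "Doe";
    private int _age = 30;

    public CustomerBuilder WithFirstName(string firstName) { _firstName = firstName; return this; }
    public CustomerBuilder WithLastName(string lastName) { _lastName = lastName; return this; }
    public CustomerBuilder WithAge(int age) { _age = age; return this; }

    public Customer Build()
    {
        return new Customer { FirstName = _firstName, LastName = _lastName, Age = _age };
    }

    public static implicit operator Customer(CustomerBuilder builder)
    {
        return builder.Build();
    }
}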
For unit testing I generally don't load data in advance: each test is designed to work against a data source that may or may not already contain existing records, so each test writes any records it needs in order to complete.
When choosing values to submit to the database, I use GUIDs (or other random values) whenever possible, as this guarantees that the values in the database are unique (e.g. if you create someone named "Mr X Y", it is helpful to know that searching for "X" should return only one result, and that there is no chance you have stumbled on someone else in the database whose last name happens to be Y).
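For illustration (PersonRepository, Person, and the connection string are placeholders), the GUID idea looks roughly like this:

using System;
using NUnit.Framework;

[TestFixture]
public class PersonSearchTests
{
    [Test]
    public void SearchByLastName_FindsOnlyTheInsertedPerson()
    {
        var repository = new PersonRepository("Server=.;Database=PeopleTest;Trusted_Connection=True;");

        // A GUID-based last name cannot collide with anything already in the database.
        var lastName = "Y-" + Guid.NewGuid().ToString("N");
        repository.CreatePerson(new Person { FirstName = "X", LastName = lastName });

        Assert.AreEqual(1, repository.SearchByLastName(lastName).Count);
    }
}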
Often when unit testing I'm testing methods that modify data alongside methods that read data, so my unit tests use the same API (the one being tested) to write to the database. (It's nice if each unit test covers a specific area of functionality, but it's not absolutely necessary.)
If the API being tested doesn't have methods to write to the database, however, I write my own set of helper functions. The exact structure depends on the data source, but as an example, this is where I often use LINQ to SQL.
TDD is about testing a piece of code in isolation. One creates an instance of a class with its dependencies (or mocks of them), calls the method under test, and asserts to verify the outcome of the test.
Usually with TDD one starts with a simple test, without data. When data are needed, they are created in the test fixture (the isolated environment where the test is executed) by the test setUp() method and then destroyed by the tearDown() method after the test has been run. Data are not loaded from the database.
The preferred strategy is in-transaction data. Spring offers extensive support for this (for both JUnit 3 and 4). With this strategy, your test begins a brand-new transaction each time, and your data is rolled back at the end of the test.
Of course, sometimes that's not enough: either the data set is too extensive and shared across tests, or multiple transactions are part of the test scope. In that case, I recommend creating a shared test data bed that is built before running the test suite. There are frameworks for this (DbUnit), but you can also do without them if you are careful and consistent.
UPD: creating in-transaction data doesn't mean you don't need test data; you are likely to end up creating reusable, shared helper classes to maintain test data in all cases.
I typically have methods like GetCustomer() that return a generic customer. If I need the returned customer to suit my needs for a particular test, I simply change the property after it gets returned.
Other times I may pass some configuration information into my GetCustomer() method, for example GetCustomer(string customerType).
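Roughly like this (the Customer properties and the helper class are invented for illustration):

public static class TestCustomers
{
    public static Customer GetCustomer()
    {
        // One generic, valid customer that most tests can use as-is.
        return new Customer { FirstName = "Jane", LastName = "Doe", IsActive = true };
    }

    public static Customer GetCustomer(string customerType)
    {
        var customer = GetCustomer();
        if (customerType == "inactive")
            customer.IsActive = false;
        return customer;
    }
}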
I've read experts' opinions saying that each test should contain its own unique data to work with, rather than trying to make the data generic. Even though this may make each test "larger" in size, overall it makes the tests clearer, because the setup is specific to each test and its goals. I like this advice because I've run into many cases where trying to make the setup data generic made things very sloppy very quickly.
How do you unit test your code that utilizes stored procedure calls?
In my applications, I use a lot of unit testing (NUnit). For my DAL, I use the DevExpress XPO ORM. One benefit of XPO is that it lets you use in-memory data storage. This allows me to set up test data in my test fixtures without having an external dependency on a database.
Then along came optimization! For some parts of our applications, we had to replace code that manipulated data through our ORM with calls to T-SQL stored procedures. That, of course, broke our nicely testable code by adding a new external dependency. We can't just "mock out" the stored procedure call, because we were testing the side effects of the data manipulation.
I already have plans to eventually replace my usage of XPO with LINQ to SQL; LINQ to SQL seems to offer better querying capabilities than XPO, removing the need for some of the stored procedures. I'm hoping that if I switch to LINQ to SQL, my unit tests will be able to use LINQ to Objects to avoid a database dependency. However, I doubt all sprocs can be replaced by LINQ to SQL.
Should I:
bite the bullet and change some of my test fixtures so they create SQL Server databases,
create database unit tests instead of testing the code,
or skip testing these isolated incidents because they're not worth it?
I'd also love to hear about your alternative setups where stored procedures peacefully co-exist with your testable code.
The approach I use for this is to separate the logic layers from the stored procedure calls by encapsulating the calls behind another method or class. Then you can test the database-layer logic separately from the application logic. This way you can create separate unit tests for the client-side application logic and integration tests for the server-side (database) logic. Take a piece of code that makes a stored procedure call, as below:
class foo:
    prop1 = 5

    def method1(self, listOfData):
        for item in listOfData:
            dbobj.callprocedure('someprocedure', item + self.prop1)
It can be refactored to encapsulate the call to the remote system in its own method:
class foo:
    prop1 = 5

    def method1(self, listOfData):
        for item in listOfData:
            self.someprocedure(item + self.prop1)

    def someprocedure(self, value):
        dbobj.callprocedure('someprocedure', value)
Now when you write your unit tests, mock out the someprocedure() method so that it does not actually make the database call. Then create a separate set of integration tests, requiring a configured database, which call the real version of someprocedure() and verify that the database ends up in the correct state.
I am curious what strategies folks have found for unit testing a data access class without loading (and presumably unloading) a real database for each test method. Are you using mock objects to represent the database connection? If so, are you required to pass the mock object into every method under test, thus forcing the API to require a real DB connection as a parameter to every method? Or are you passing a mock object into the constructor in setup()?
I have a class that implements what I believe is the Data Mapper (or maybe Gateway) pattern. It is the class responsible for encapsulating SQL and returning (or saving) "business objects". The rest of the code can interact with this mapper layer and the business objects with total disregard for the persistence model. This code needs to have, maintain, or at least know about a live DB connection in the real system. Emulating this under test is tricky.
The problem is how to unit test one of these mapper classes. The practice for creating a unit test under xUnit that I have seen most often is to use the test's setup() method to instantiate the SUT (system under test), usually the object you're testing, and store it in a local variable in the test class. Each of your test methods then interacts with a unique instance of that SUT.
The assumption, though, is that whatever you're doing in the setup() method will presumably be replicated somewhere in your real code. So you have to think about the setup process as: "is this something I will want to reproduce every time I need to use this object in the real world?" If I am passing a DB connection into the mapper's constructor in the setup, that's fine, but doesn't that mean I'll have to pass a live DB connection into the mapper object's constructor every time I want to really use one? Imagine all the places where you need to retrieve or store a business object, and that to use a data mapper object you need to pass in the DB connection every time.
In my case, I am trying to establish tests for these data mapper objects that achieve the following:
Do not require the database connection object to be instantiated and passed into every method of the mapper class.
Do not require that the test case either connect to a real db or create a real, but "test", db on the fly for each test method.
I have basically seen two suggestions: pass the connection object as a parameter (which I have already addressed), or extend the SUT class just for the test and override whatever DB connection setup process you have in the real world to use a mock system instead, as sketched below.
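The second suggestion would look roughly like this in C# (CustomerMapper, the connection-string lookup, and the test-only subclass are all hypothetical):

using System.Configuration;
using System.Data;
using System.Data.SqlClient;

public class CustomerMapper
{
    // The only place that knows how to obtain a real connection.
    protected virtual IDbConnection CreateConnection()
    {
        return new SqlConnection(ConfigurationManager.ConnectionStrings["Main"].ConnectionString);
    }

    public Customer FindById(int id)
    {
        using (var connection = CreateConnection())
        {
            connection.Open();
            // ... run SQL and map the row to a Customer ...
            return null; // placeholder for the mapping code
        }
    }
}

// Test-only subclass: swaps in whatever fake or in-memory connection the test provides.
public class TestableCustomerMapper : CustomerMapper
{
    private readonly IDbConnection _connection;

    public TestableCustomerMapper(IDbConnection connection) { _connection = connection; }

    protected override IDbConnection CreateConnection() { return _connection; }
}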
I am curious if anyone else is facing these issues, with any language, and what you have done to solve them? Maybe there is something obvious that I am missing?
In my experience, the responsibility for connecting to a database is a sore point in data access. I solved this by letting the DAO take care of it based on the configuration file (app.config, etc.). This way I don't need to worry about it when I write my tests. The DAL keeps one or more database connection profiles and connects/disconnects on every data access, because in the end the connection pool takes care of physically connecting/disconnecting.
Another thing that helped me was using DbUnit to load baseline data before running the tests. I found it easier to go straight to the database instead of using mock objects. Also, by connecting to a real database I can (to a certain point) test concurrency by issuing commands in different threads; mock objects wouldn't give me the real behavior.
You can use DbUnit to test your SQL.
It depends on what you're really trying to test. If you want to test that your SQL does what you expect, that's really heading into integration-test territory. Assuming you're using Java, there are several pure-Java RDBMS options (Apache Derby, HSQLDB, H2) you can use for that.
If, on the other hand, you're really just testing your Java <-> JDBC code (i.e. reading from ResultSets), then you can mock out pretty much all the relevant parts of JDBC, since they're mostly interfaces. JMock is great for this. Simply add a setConnection() method to your class under test and pass in a mocked java.sql.Connection that will do your bidding. This works really well for keeping tests short and sweet.
Depending on how complex your database setup is, using an in-memory store might be a great option.
Normally I do my unit testing with an in-memory SQLite session. It is a full-blown database, 100% in memory: no files, no configuration needed, just one line of setup.
This is not always an option, though: SQLite does not support all the SQL features of full-blown server databases. Normally I use a layer that tries to keep my code database-independent; in those cases I just switch to an in-memory database instance that I quickly create and destroy during every setUp/tearDown.
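A minimal sketch of that in C# with Microsoft.Data.Sqlite (the schema and fixture names are placeholders); the database exists only for the lifetime of the connection:

using Microsoft.Data.Sqlite;
using NUnit.Framework;

[TestFixture]
public class CustomerDaoTests
{
    private SqliteConnection _connection;

    [SetUp]
    public void CreateInMemoryDatabase()
    {
        _connection = new SqliteConnection("Data Source=:memory:");
        _connection.Open();

        using (var command = _connection.CreateCommand())
        {
            command.CommandText = "CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT)";
            command.ExecuteNonQuery();
        }
    }

    [TearDown]
    public void DestroyInMemoryDatabase()
    {
        _connection.Dispose(); // closing the connection discards the in-memory database
    }

    // ... [Test] methods run the DAO against _connection ...
}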
Are you using any middleware layer to access your database? In most cases the greatest benefit of using that type of middleware is not database portability, but a simplified test harness.
How are people unit testing their business applications? I've seen a lot of examples of unit testing with "simple to test" examples, e.g. a calculator. How are people unit testing data-heavy applications? How are you putting together your sample data? In many cases, data for one test may not work at all for another test, which makes it hard to have just one test database.
Testing the data access portion of the code is fairly straightforward. It's testing all the methods that work against the data that seems hard. For example, imagine a posting process where there is heavy data access to determine what is posted, numbers are adjusted, etc. There are a number of interim steps that occur (and need to be tested), along with tests afterwards that ensure the posting was successful. Some of those steps may actually be stored procedures.
In the past I've tried inserting the test data into a test database and then running the test, but honestly it's pretty painful to write that kind of code (and error-prone). I've also tried building a test database up front and rolling back the changes. That works OK, but in a number of places you can't easily do this either (and many people would say that's integration testing; so be it, I still need to be able to test this somehow).
If the answer is that there isn't a nice way of handling this and it currently just sort of sucks, that would be useful to know as well.
Any thoughts, ideas, suggestions, or tips are appreciated.
My automated functional tests usually follow one of two patterns:
Database Connected Tests
Mock Persistence Layer Tests
Database Connected Tests
When I have automated tests that are connected to the database, I usually create a single test database template that has enough data for all the tests. When the automated tests are run, a new test database is generated from the template for every test. The test database has to be constantly regenerated because tests will often change the data. As tests are added, I usually append more data to the test database template.
There are some nice advantages to this testing method. The obvious advantage is that the tests also exercise your schema. Another advantage is that after setting up the initial tests, most new tests will be able to re-use the existing test data. This makes it easy to add more tests.
The downside is that the test database will become unwieldy. Because data will usually be added one test at a time, it will be inconsistent and maybe even unrealistic. You will also end up cursing the person who set up the test database when there is a significant database schema change (which, for me, usually means I end up cursing myself).
This style of testing obviously doesn't work if you can't generate new test databases at will.
Mock Persistence Layer Tests
For this pattern, you create mock objects that live with the test cases. These mock objects intercept the calls to the database so that you can programmatically provide the appropriate results. Basically, when the code you're testing calls the findCustomerByName() method, your mock object is called instead of the persistence layer.
The nice thing about mock object tests is that you can get very specific. Oftentimes there are execution paths that you simply can't reach in automated tests without mock objects. They also free you from maintaining a large, monolithic set of test data.
Another benefit is the lack of external dependencies. Because the mock objects simulate the persistence layer, your tests are no longer dependent on the database. This is often the deciding factor when choosing which pattern to choose. Mock objects seem to get more traction when dealing with legacy database systems or databases with stringent licensing terms.
The downside of mock objects is that they often result in a lot of extra test code. This isn't horrible, because almost any amount of testing code is cheap when amortized over the number of times you run the tests, but it can be annoying to have more test code than production code.
I have to second the comment by #Phil Bennett as I try to approach these integration tests with a rollback solution.
I have a very detailed post about integration testing your data access layer here
I show not only the sample data access class, base class, and sample DB transaction fixture class, but a full CRUD integration test with sample data. With this approach you don't need multiple test databases: you control the data going in with each test, and after the test completes the transactions are all rolled back, so your DB stays clean.
Regarding unit testing the business logic inside your app, I would also second the comments by #Phil and #Mark: if you mock out all the dependencies your business object has, it becomes very simple to test your application logic one entity at a time ;)
Edit: So are you looking for one huge integration test that verifies everything, from the logic that runs before the data hits the database / stored procedure, through the stored procedure itself, and finally a verification on the way back? If so, you could break this out into two steps:
1 - Unit test the logic that happens before the data is pushed into your data access code. For example, if you have some code that calculates some numbers based on some properties, write a test that only checks whether the logic of that one function does what you asked it to do. Mock out any dependency on the data access class so you can ignore it for this test of the application logic alone.
2 - Integration test the logic that happens once you take your manipulated data (from the previous method we unit tested) and call the appropriate stored procedure. Do this inside a data-specific testing class so you can roll back after it has completed. After your stored procedure has run, query the database to get your object, now that some logic has been applied to the data, and verify that it has the values you expected (post-stored-procedure logic, etc.).
If you need an entry in your database for the stored procedure to run, simply insert that data before you run the sproc that has your logic inside it. For example, if you have a product that you need to test, it might require a supplier and a category entry, so before you insert your product, do a quick and dirty insert for a supplier and a category so your product insert works as planned.
It depends on what you're testing. If you're testing a business logic component, then it's immaterial where the data comes from, and you'd probably use a mock or a hand-rolled stub class that simulates the data access routine the component would have called in the wild. The only time I mess with the data access is when I'm actually testing the data access components themselves.
Even then, I tend to open a DB transaction in the TestFixtureSetUp method (obviously this depends on which unit testing framework you're using) and roll the transaction back in TestFixtureTearDown at the end of the test suite.
Mocking frameworks enable you to test your business objects.
Data-driven tests often end up becoming more of an integration test than a unit test; they also carry the burden of managing the state of a data store before and after execution of the test, plus the time taken in connecting and executing queries.
In general I would avoid unit tests that touch the database from your business objects. Testing your database needs a different strategy.
That being said, you can never totally get away from data-driven testing; you can only limit the number of tests that actually need to invoke your back-end systems.
It sounds like you might be testing message-based systems, or systems with highly parameterised interfaces, where there are large numbers of permutations of input data.
In general, all the rules of standard unit testing still hold:
Try to make the units being tested as small and discrete as possible.
Try to make tests independent.
Factor code to decouple dependencies.
Use mocks and stubs to replace dependencies (like data access).
Once this is done you will have removed a lot of the complexity from the tests, hopefully revealing good sets of unit tests, and simplifying the sample data.
A good methodology for compiling sample data for tests that still require complex input data is orthogonal testing.
I've used that sort of method for generating test plans for WCF and BizTalk solutions where the permutations of input messages can create multiple possible execution paths.
For lots of different runs over the same logic but with different data, you can use a CSV file: as many columns as you like for the inputs, and the last one for the expected output.
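For example, with NUnit you could feed such a file through TestCaseSource (the file name, column layout, and PostingCalculator are invented for this sketch):

using System.Collections.Generic;
using System.IO;
using System.Linq;
using NUnit.Framework;

public class PostingCalculatorTests
{
    public static IEnumerable<TestCaseData> CsvCases()
    {
        foreach (var line in File.ReadLines("posting-cases.csv").Skip(1)) // skip the header row
        {
            var columns = line.Split(',');
            // All columns but the last are inputs; the last column is the expected output.
            var inputs = columns.Take(columns.Length - 1).Select(decimal.Parse).ToArray();
            var expected = decimal.Parse(columns.Last());
            yield return new TestCaseData(inputs).Returns(expected);
        }
    }

    [TestCaseSource(nameof(CsvCases))]
    public decimal CalculateTotal_ReturnsExpectedValue(decimal[] inputs)
    {
        return new PostingCalculator().CalculateTotal(inputs);
    }
}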