unit testing failing when run in batch

I am new to unit testing. I have created various tests, and when I run each test one by one, all of them pass. However, when I run them all as a batch, some tests fail. Why is that? How can I correct it?

To resolve this issue it is important to follow certain rules when writing unit tests. Some of the rules are easy to follow and apply, while others may need further consideration depending on your circumstances.
Each test should set up a unique set of data. This is particularly important when you work with persistent data, e.g. in a database. When the test creates a user with a particular user id, write the test so that it uses a different user id each time. For example (C#):
var user = new User(Guid.NewGuid());
At the end of each test, clean up the data that the test has created. For example, in the tear down method remove the data that you created (C#, NUnit):
[TearDown]
public void TheTearDownMethod() {
    // Remove the user that this test created.
    _repository.Delete(_user);
}
Variations are possible, e.g. when you test against a database you may choose to load a backup just before you run the test suite. If you craft your tests carefully you don't need to clean up the database after each test (or subset of tests).
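The two rules above (unique data per test, cleanup in tear down) can be sketched in Python's unittest as well. This is only an illustration under assumptions: InMemoryUserRepository and REPOSITORY are hypothetical stand-ins for whatever persistent store your tests share, not part of the original answer.

```python
import unittest
import uuid


class InMemoryUserRepository:
    """Hypothetical stand-in for a persistent store shared by all tests."""

    def __init__(self):
        self._users = {}

    def add(self, user_id, name):
        if user_id in self._users:
            raise ValueError(f"duplicate user id: {user_id}")
        self._users[user_id] = name

    def delete(self, user_id):
        self._users.pop(user_id, None)

    def count(self):
        return len(self._users)


# Shared on purpose, to mimic a database that outlives a single test.
REPOSITORY = InMemoryUserRepository()


class UserTests(unittest.TestCase):
    def setUp(self):
        # A fresh id per test, so no test can collide with another test's data.
        self.user_id = uuid.uuid4()
        REPOSITORY.add(self.user_id, "test user")

    def tearDown(self):
        # Remove only what setUp created for this test.
        REPOSITORY.delete(self.user_id)

    def test_user_is_stored(self):
        self.assertEqual(REPOSITORY.count(), 1)

    def test_second_user_can_be_added(self):
        other_id = uuid.uuid4()
        REPOSITORY.add(other_id, "other user")
        # Registered cleanups run after tearDown, even if the assertion fails.
        self.addCleanup(REPOSITORY.delete, other_id)
        self.assertEqual(REPOSITORY.count(), 2)


def run_user_tests():
    """Run the whole class as one batch and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because every test removes exactly what it added, the count assertions hold regardless of the order in which the batch runs them.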
To get from where you are now (each test passes when run in isolation) to where you would like to be, start by running the first two tests in sequence and make them pass. Then run three in sequence, make them pass, and so on. In each iteration, identify which previous test causes the added test to fail, and resolve that dependency. This way you learn a lot about your tests, and also how to avoid writing tests that depend on each other.
Once the suite passes in one batch, run it frequently so that you detect dependencies as early as possible.
This doesn't cover all scenarios and variations but hopefully gives you a guideline for building on what you already have.

Probably some of your tests are dependent on the prior state of the machine. Unit tests should not depend on the previous state of the process/machine, so you should look at the failing tests and work out what they are depending on.

Sometimes final conditions of one test have an impact on initial conditions of the next.
Manual run and batch run may have different behaviour regarding how initial conditions of each test are set.

You obviously have side effects from some tests that create unintended dependencies. Debug.
Good unit tests are atomic and have zero dependencies on other tests (important). A good practice is for each test to create (or remove) everything it depends on before it runs. Cleaning up afterwards is also good practice: it really helps and is recommended, but it is not 100% necessary.

Related

Technique for TDD testing cycles differentiating types of test?

A newbie in this art... but so far, from my reading, I understand that there are broadly 3 categories: unit tests, acceptance/integration tests (not the same) and end-to-end tests.
The thing is, of these 3, it appears that only unit tests are meant to run lightning-fast. It seems perfectly reasonable to be running ALL the unit tests for the entire project, all the time during development. But the same, it seems, can't be said of the other types.
It seems to me, therefore, that you'd want to be running a single acceptance test (or maybe a group of related ones) at each test run, while running all the unit tests for the whole project.
As for the latest end-to-end test that is in the "red" state, given that these can be even slower than acceptance tests, mightn't you want to run that only intermittently? And the entire end-to-end collection maybe only when you're doing something else, or at night or sthg?
I'm using Gradle, and I'm aware you can create a special test task to only run, for example, all the unit tests under a tests\unittests directory... but, if my thinking is valid, is there a habitual way of skipping, or selecting, particular acceptance tests, other than by constantly editing the code - which can get pretty tiresome?
For example, by somehow tagging particular acceptance or end-to-end tests as a certain "category", or maybe by arranging these tests in a hierarchical folder structure?

I have not used Gradle, but in Python I regularly use both of the ways you described:
tagging of specific classes of functional tests (a subset are usually tagged as "smoke" tests, to be run on each deploy)
representing tests in hierarchies:
    small/unit
    integration
    functional (smoke tests are usually tagged functional tests)
    ui
    e2e
"it appears that only unit tests are meant to run lightning-fast. It seems perfectly reasonable to be running ALL the unit tests for the entire project,"
This is the goal: all unit tests are encouraged to be IO free, so that they run lightning-fast on every single commit. This process is usually codified with CI build jobs that trigger on every commit to a repo.
"But the same, it seems, can't be said of the other types."
It really depends on what an acceptable build time is, and on the size of your projects. I have found that most projects don't actually have that many integrations, and if they do have an excessive number of integrations, it is usually a good indication that the service should be rethought. For every integration, how many tests are necessary to protect against difficult-to-reproduce error cases, and to make sure there are checks that will break on interface changes? In my experience, not many. I have recently started to use docker-compose for integration tests, which allows many tests (20-30) to be executed very quickly for every commit.
docker-compose also allows for a clean e2e environment to be brought up to have acceptance/functional tests executed against it.
It is also my experience that the higher level tests are executed less frequently, but should be executed as frequently as they can be. For example I work with an API, with 300 functional tests covering every method on every endpoint. Because they don't interact with a UI and only use HTTP, they take about a minute to execute. They are executed on every deploy to an environment and at regular intervals.
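To make the tagging idea concrete, here is a minimal sketch using only Python's standard unittest module. The tag decorator and select_by_tag helper are hypothetical names written for this illustration, not an existing unittest API; test frameworks with built-in markers or categories offer the same idea in their own way.

```python
import unittest


def tag(*names):
    """Attach category names to a test method (hypothetical helper)."""
    def decorator(func):
        func.tags = set(names)
        return func
    return decorator


class ApiTests(unittest.TestCase):
    @tag("unit")
    def test_parse_id(self):
        self.assertEqual(int("42"), 42)

    @tag("functional", "smoke")
    def test_health_endpoint(self):
        # Placeholder for a real HTTP call against a deployed environment.
        self.assertTrue(True)

    @tag("functional")
    def test_full_listing(self):
        self.assertTrue(True)


def iter_tests(suite):
    """Flatten nested TestSuite objects into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def select_by_tag(test_case_class, wanted):
    """Build a suite containing only the tests tagged with `wanted`."""
    loaded = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    selected = unittest.TestSuite()
    for test in iter_tests(loaded):
        method_name = test.id().rsplit(".", 1)[-1]
        method = getattr(test, method_name)
        if wanted in getattr(method, "tags", set()):
            selected.addTest(test)
    return selected
```

A deploy job could then run select_by_tag(ApiTests, "smoke") while a nightly job runs everything tagged "functional", without editing the tests themselves.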

Is it possible to reuse test conditions or test steps in an SSDT Database Unit Test?

I have recently started using SSDT database unit tests. I find I have many test conditions which are identical, but I can't find a way to reuse them. How many times do I need to say, "and only one row may be returned"?
Similarly, I have begun to find multiple unit tests which need the same data. This seems to mean that I should use the same pre-test step. But I also can't find a way to reuse the pre-test steps.
Is there something I'm missing? Either that there is a way to reuse these components, or that there is some reason why I don't really need to reuse them?

First and foremost, there is no way to reuse pre-tests, post-tests or test conditions across multiple tests. Yes, this is annoying, but I can kind of understand it: ensuring each test is self-contained strikes me as a good idea.
One way I've gotten around the problem of inability to reuse test data in the past is to write stored procedures that create that test data and include those stored procedures in the SSDT project that I am testing, then I just call the stored procedures from Pre-Test.
The downside of that approach is that those stored procedures then end up as part of your deployed database - not a good thing as you don't want stored procs that exist solely for creating test data in your production databases. The way to get around that is to put these "test data" stored procs into a dedicated SSDT project and create a "Same database" database reference from that project to the SSDT project that contains all your code to be tested. Deploying the second project will also deploy the first one. This technique is better known as Composite Projects.
So that's one way of solving the "sharing test data" problem. I don't know of a way of sharing test conditions other than going and hacking the underlying C# code-behind but that's not something I'd be comfortable in recommending.

In a unit test class there is a section for (Common scripts) found in the same drop down as the individual tests you add.
This section has "Test initialize" and "Test cleanup" where you can add code that runs before and after every test in that class.
Your general test structure would look like this:
1. "Common scripts > Test initialize": set up generic data (and check any test initialize conditions).
2. "My Test > Pre-test": make any test-specific data adjustments (and check any pre-test conditions).
3. "My Test > Test": execute the testable code and check any test-specific conditions.
4. "My Test > Post-test": clean up any test-specific data (and check any post-test conditions).
5. "Common scripts > Test cleanup": clean up any generic data (and check any cleanup conditions).
That is half the battle. Unfortunately there doesn't appear to be a way to run the same test conditions for a number of tests.
However! Test initialize and Test cleanup can have their own test conditions, so if each of your unit tests for a particular stored procedure should result in the same output, e.g. always one row in a table (like an audit entry), then you could add this test condition to the Test cleanup instead; while keeping specific test conditions in the test where they belong.
It could be a little misleading that a core test condition is found in the Test cleanup section, so it is a bit of a workaround.
You would have to ensure that the Test cleanup script doesn't remove any data you want to check in your general test conditions, as the script would be executed before the conditions are checked. I work around this by doing my cleanup in the initialize section just before setting up the data; in effect, the cleanup always puts the database into a known state regardless of which test ran before.
For the audit example, step 5. might confirm that [dbo].[Audit] has one record while step 1. will delete from [dbo].[Audit] returning the row count to 0.
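The same "clean in initialize, check the shared condition in cleanup" pattern is not specific to SSDT. As a rough analogy only, here it is in Python with unittest and an in-memory SQLite database; procedure_under_test is a hypothetical stand-in for the stored procedure, and the Audit table is reduced to a single column.

```python
import sqlite3
import unittest

# One connection shared by every test, to mimic a database that persists between tests.
CONNECTION = sqlite3.connect(":memory:")
CONNECTION.execute("CREATE TABLE Audit (message TEXT)")


def procedure_under_test(connection):
    """Hypothetical stand-in for the stored procedure; it writes one audit row."""
    connection.execute("INSERT INTO Audit (message) VALUES ('procedure ran')")


class AuditTests(unittest.TestCase):
    def setUp(self):
        # "Test initialize": clean first, so the starting state is known
        # no matter which test ran before or how it ended.
        CONNECTION.execute("DELETE FROM Audit")

    def tearDown(self):
        # "Test cleanup": the shared condition, checked after every test.
        (count,) = CONNECTION.execute("SELECT COUNT(*) FROM Audit").fetchone()
        self.assertEqual(count, 1)

    def test_first_call(self):
        procedure_under_test(CONNECTION)

    def test_second_call(self):
        procedure_under_test(CONNECTION)


def run_audit_tests():
    """Run both tests as a batch and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AuditTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Without the delete in setUp, the second test in the batch would find two audit rows and the shared condition would fail.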

A similar problem is duplication of setup/teardown code. I like having a test class per stored procedure. Within a test class I often have many tests that look at the side effects and data changes caused by running the associated stored procedure.
I don't like that each class has >80% the same setup and teardown code. That means I have dozens of classes to change if my setup or teardown changes.
On the other hand, this post suggests that is just the way things should be: What does “DAMP not DRY” mean when talking about unit tests?

Do atomic tests make sense in dynamically created environments?

We're building a product that allows users to create custom databases and store data within those DBs (WebApp).
Our issue for testing of the frontend (CoffeeScript) is that every test should be atomic, but that would require setting up a DB for seeing if an item within that DB can be created and persists, or to see how changes in a DB affect items.
Essentially, the issue is that the setup code needed to get to the item tests basically sets up a new DB and therefore equals the code that tests setting up a new DB.
There are two approaches and we're torn on which to use:
1) Create and tear down a new DB with each group of tests
(+) Sorta atomic (still fails if setting up a DB fails)
(-) Takes a lot of time to execute
(-) Tons of surrounding code
(-) No way to explore the created environment
(-) Messy on errors, everything fails
2) Do the setup step by step as separate tests depending on each other, cleanup routine at beginning of a test
(+) The created environment can be accessed via the UI (not automatically torn down)
(+) Step by step testing, less overall/repetitive code
(-) Tests depend on each other (messy)
(-) Somewhat overall messy
We're wondering therefore if the golden rule that tests should be atomic makes sense in such a dynamic environment?

Basically, what you are talking about is integration tests. These are different from unit tests. Examples of integration tests would be automated UI tests or Coded UI tests. In most of the projects I've worked on we've had both types of tests, and I strongly encourage you to have both types in your project too.
The philosophy behind both these tests is slightly different.
Unit Tests are meant to test isolated bits of functionality.
They are meant to be very fast.
A developer should be able to run them all on their machine in a reasonable amount of time.
There are various consequences of this philosophy.
Because a unit test is testing an isolated bit of functionality, you should use mocks and stubs to isolate the rest of the environment and only focus on tiny bits of functionality.
The isolation helps your "design thinking" while writing these tests. In fact this is the reason why the unit tests are required to be fast, because a developer is actively and constantly changing the code and unit tests as part of the design and redesign process. There should be very low overhead to set up, change and run the unit tests. I should be able to ignore everything other than the problem I am trying to solve and quickly iterate and reiterate my designs and tests. This is the idea behind TDD and its claim to help write good testable code. If you are spending a long time trying to set up an overly complex unit test then you have to start reconsidering your design.
The fast nature means that you could run these as part of your Continuous Integration build.
The disadvantage is that because you are testing each piece of functionality in isolation, you don't know if they will all work together as a whole. Each time you write a mock, you are implicitly baking in an assumption about how the rest of the system works, and that the rest of the system is currently working as it is meant to (i.e. nothing else is broken as part of your deployment or running or patching of the OS etc.)
Integration Tests are meant to test the functionality from end to end. You try NOT to mock out or isolate any part of the system.
There are again various consequences of this philosophy. Note that there is no requirement for integration tests to be fast.
Integration tests, by their very nature need to run after your full deployment (as opposed to unit tests which can be run as soon as your code compiles).
Because they take longer, you don't run them as part of your CI environment, but you still need to run them regularly. We usually run them as part of our nightly builds. Or you can run it twice daily etc.
Because the integration tests take a black box approach to the whole system, they don't really help you with your "design thinking" about how to actually build the system. But they do help your thinking about the specifications of the system as a whole, i.e. what the system should do, not how it should do something.
Note that in both cases the rule of tests being atomic still applies. Each test is independent of the other tests. This way, when a test fails you can be sure about all the conditions that are causing it to fail and concentrate on only fixing that. It's just that an integration test touches as many parts of your system as possible.
To give you an example from our current project:
Let's say we need to write a bit of functionality that requires us to add a new table to the DB and bring it through all the layers to show it in the UI.
We start by creating our business logic classes, domain classes, write the appropriate web service, build view models, modify the database etc. While doing each of these we write unit tests to test the code we are currently writing. So when building the business logic classes, we mock out everything else to ensure that the logic in the class is valid (for example, clients over 60 years old get a 50% discount on their car insurance etc.)
Once we do that, we now need to update our deployment scripts / packages etc. to be able to deploy it, i.e. update the database creation SQL scripts and the database alteration SQL scripts etc. (In your case this will be a complex process.)
Now we write integration tests. In this case we might test from SQL Server to the web service. There is a SQL integration test base class which contains the set up and tear down methods for each test. In the set up we create a brand new database using our SQL deployment scripts. Each test also specifies a test data SQL script. So, for example, this test data script might insert a new record into the client table whose age is 70 years. We run this script as part of the "Arrange" of our test. Then we make a web service call to search for clients older than 60. This is the "Act" part of the test, and from the result we check that we only get back the user we've inserted into the DB. At the end of the test, the database is deleted. We've caught bugs here when the columns in the SQL database aren't nullable, or when datetime columns overflow because the default minimum datetime in .NET is different from SQL Server's minimum datetime.
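The shape of such a base class can be sketched as follows. This is a simplified Python illustration, not the project's actual code: it assumes an in-memory SQLite database in place of SQL Server, a plain function (search_clients_older_than) in place of the web service call, and hypothetical script contents; here the data script is named per test class rather than per test.

```python
import sqlite3
import unittest

# Hypothetical stand-ins for the real deployment script and test data script.
SCHEMA_SCRIPT = "CREATE TABLE client (name TEXT NOT NULL, age INTEGER NOT NULL);"
CLIENT_DATA_SCRIPT = """
INSERT INTO client (name, age) VALUES ('Older Client', 70);
INSERT INTO client (name, age) VALUES ('Younger Client', 35);
"""


def search_clients_older_than(connection, age):
    """Stand-in for the web service call exercised by the test."""
    rows = connection.execute(
        "SELECT name FROM client WHERE age > ? ORDER BY name", (age,)
    )
    return [name for (name,) in rows]


class SqlIntegrationTestBase(unittest.TestCase):
    test_data_script = ""  # each test class names its own data script

    def setUp(self):
        # Brand new database per test, built from the deployment script.
        self.connection = sqlite3.connect(":memory:")
        self.connection.executescript(SCHEMA_SCRIPT)
        self.connection.executescript(self.test_data_script)  # Arrange

    def tearDown(self):
        # Closing an in-memory database discards it entirely.
        self.connection.close()


class ClientSearchTests(SqlIntegrationTestBase):
    test_data_script = CLIENT_DATA_SCRIPT

    def test_only_clients_over_60_are_returned(self):
        found = search_clients_older_than(self.connection, 60)  # Act
        self.assertEqual(found, ["Older Client"])  # Assert


def run_client_search_tests():
    """Run the integration tests and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClientSearchTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the database is created and thrown away per test, no test can see rows left behind by another, which is what keeps these slower tests atomic.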
Some functionality requires us to interact with an Oracle database. For example, if a new record is added to Oracle, then a trigger/db procedure kicks off and transfers that record to SQL, and then we need to bring it up the layers. In this case we have an OracleSQL integration test base class. As you might have guessed, this follows a similar pattern, but creates both Oracle and SQL dbs, inserts test data into Oracle, and blows them both away at the end of the test.
The developers usually pick the web service layer for writing their integration tests. The testers, on the other hand, use UI automation tools to make sure that the data is actually showing up on screen. For example, they will record a test that goes to a web page, puts "60" into the age box, clicks the search button etc. That test might leverage the same test data SQL script that the developer wrote (or the testing team might come to the developer and ask for help crafting SQL scripts to insert whatever highly convoluted data they can think of). But the point is, once the test data insertion script is created, it leverages the same underlying system to blow away the whole db, create a new one, insert test data, and run the specified test.
So, to answer your question, you will need two types of tests, unit tests and integration tests. You might have to put in some initial work into creating some base classes or helper methods to create/delete databases, automating your deployment to install/uninstall other components of your system etc. You will have to do this for your final deployment anyway. Integration tests will also be closely related to and dependent on your deployment strategy. This is an advantage and not a disadvantage in my opinion. While it might be painful at first to set it all up, one of the things your integration tests are implicitly testing is your deployment mechanism. If there are any issues with deploying/installing any of the components required by your system, you want to know about it as quickly as possible. Not the day before you are supposed to be deploying to production.
A good suite of tests is invaluable. It also needs to be isolated, rigorous and comprehensive. The tests shouldn't fail when they don't need to but more importantly, they should fail when they need to. And when they do fail, you want them to provide as much information as possible and point you at the exact location of failure. This makes fixing the issue a much easier task. Any time you put into building this test suite will more than pay for itself in no time.

You're not doing atomic tests if you're talking to a database.
You need to mock the database interface and talk to the mock instead. That will be fast, and you'll be able to use the mock to introduce errors that would be difficult using the real database.
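As a small illustration of that point, here is a Python sketch using unittest.mock from the standard library. UserService and its fetch_user collaborator method are hypothetical names invented for the example; the part worth noting is side_effect, which lets the mock raise an error that would be awkward to provoke with a real database.

```python
from unittest import mock


class UserService:
    """Hypothetical service that depends on a database-like collaborator."""

    def __init__(self, database):
        self._database = database

    def display_name(self, user_id):
        try:
            row = self._database.fetch_user(user_id)
        except ConnectionError:
            # Degrade gracefully when the database cannot be reached.
            return "(unavailable)"
        return row["name"].title()


def make_service_with_user(name):
    """Return a service whose mocked database always finds a user called `name`."""
    database = mock.Mock()
    database.fetch_user.return_value = {"name": name}
    return UserService(database), database


def make_service_with_broken_database():
    """Return a service whose mocked database fails on every lookup."""
    database = mock.Mock()
    # side_effect makes the mock raise, simulating an outage on demand.
    database.fetch_user.side_effect = ConnectionError("connection lost")
    return UserService(database)
```

Both the happy path and the outage path run in memory, so the tests stay fast and never touch shared state.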

why testing an individual junit test works, while testing them together won't?

The test that fails when tested together with mvn test (or through the ide) is called EmpiricalTest.
If I test the file alone it goes through, but not otherwise. Why could that be?
You can checkout the Maven source code (to test) from here.
This is how I make sure the database is 'blank' before each test:
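import java.util.List;
import org.junit.Assert;
import org.junit.Before;
abstract public class PersistenceTest {
    // `db` is the project's persistence helper; its declaration is not shown here.
    @Before
    public void setUp() {
        // Empty the database, then verify that no entities are left over.
        db.destroy();
        assertIsEmpty(MUser.class);
        assertIsEmpty(Meaning.class);
        assertIsEmpty(Expression.class);
    }
    private <Entity> void assertIsEmpty(final Class<Entity> entityClass) {
        final List<Entity> all = db.getAll(entityClass);
        Assert.assertTrue(all.isEmpty());
    }
}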
and the test that fails:
public class EmpiricalTest extends PersistenceTest {

It has to do with the automatically assigned id. The PU creates a SEQUENCE table, and although I empty the database of my entities, I don't actually drop that table. So when I test EmpiricalTest alone the sequence starts from 1 as expected, while when testing everything together the test is executed later and starts with a higher, unexpected number.
This leads to this question.
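The same effect is easy to reproduce outside JPA. The sketch below is an analogy in Python with SQLite, not the original setup: an AUTOINCREMENT column keeps its counter in a separate internal table, so deleting the rows does not reset it. The table and column names are made up for the illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE expression (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
)


def insert_expression(body):
    """Insert a row and return the id the database generated for it."""
    cursor = connection.execute("INSERT INTO expression (body) VALUES (?)", (body,))
    return cursor.lastrowid


def empty_table():
    """Delete the rows only; the id counter kept by SQLite is left untouched."""
    connection.execute("DELETE FROM expression")


def find_id(body):
    """Look a row up by its content instead of assuming a particular id."""
    row = connection.execute(
        "SELECT id FROM expression WHERE body = ?", (body,)
    ).fetchone()
    return row[0]


first_id = insert_expression("from an earlier test")
empty_table()
second_id = insert_expression("from the test under suspicion")

row_count = connection.execute("SELECT COUNT(*) FROM expression").fetchone()[0]
```

The table holds a single row again, yet second_id is 2 rather than 1. A test that hard-codes id 1 therefore passes alone and fails in a batch, while a test that uses the returned id (or a lookup like find_id) passes in both cases.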

This very much sounds as if there are dependencies between the tests. As far as I understand from looking at your test, you're accessing your data storage in the test. Is there a chance that one of the tests doesn't properly clean up its traces, therefore causing others to fail?
Testing against a DB is usually not considered a unit test, though it is very useful. Those kinds of tests (you may call them integration tests) are, however, more difficult and time consuming to code, because you have to pay a lot of attention that your test leaves the environment in the exact state it found it in.

Your problem is very common. In an ideal TDD world each test should be executed in perfect isolation from the other tests. You violated the isolation and that's the problem.
However, there is no simple solution for the test isolation problem. The main reason is that standard SQL DDL doesn't cover database creation/deletion, while automatically dropping tables is complicated due to possibly complex foreign key constraints.
In my experience the best idea is to execute tests within a transaction and roll back the data at the end of the test (just like Pascal suggested). The Spring test module provides great support for that.
In case you cannot execute the test within transaction boundaries (as in your case), you must be sure that none of your tests leaves anything in the database (including foreign keys, constraints, sequences, etc.) and also that the tests are designed to be independent of each other (for example, don't depend on an autogenerated id value, because sequence generation could have been executed in previous tests).
You must debug your Maven test session order to check out what is wrong with the assertion (I guess that you cannot tell that from the Surefire logs). Then fix the tests (both the failing one and the other one which leaves the rubbish in the DB) so that they are isolated from each other.
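The transaction-per-test idea can be shown in miniature. This is a Python sketch with unittest and SQLite, not the Spring mechanism mentioned above; the shared CONNECTION, the meaning table and the RollbackTestCase base class are assumptions made for the example.

```python
import sqlite3
import unittest

# Shared connection to mimic a database that outlives each test.
# isolation_level=None turns off the sqlite3 module's implicit transactions,
# so BEGIN and ROLLBACK are issued explicitly.
CONNECTION = sqlite3.connect(":memory:", isolation_level=None)
CONNECTION.execute("CREATE TABLE meaning (id INTEGER PRIMARY KEY, word TEXT)")


def count_meanings():
    return CONNECTION.execute("SELECT COUNT(*) FROM meaning").fetchone()[0]


class RollbackTestCase(unittest.TestCase):
    def setUp(self):
        # Everything the test writes happens inside this transaction.
        CONNECTION.execute("BEGIN")

    def tearDown(self):
        # Undo all writes, whether the test passed or failed.
        CONNECTION.execute("ROLLBACK")


class MeaningTests(RollbackTestCase):
    def test_insert_one(self):
        CONNECTION.execute("INSERT INTO meaning (word) VALUES ('alpha')")
        self.assertEqual(count_meanings(), 1)

    def test_insert_two(self):
        CONNECTION.execute("INSERT INTO meaning (word) VALUES ('beta')")
        CONNECTION.execute("INSERT INTO meaning (word) VALUES ('gamma')")
        self.assertEqual(count_meanings(), 2)


def run_meaning_tests():
    """Run both tests as a batch and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeaningTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Each test sees an empty table even though neither deletes anything itself, and the table is still empty after the batch. Whether generated ids are also rolled back depends on the database, so tests should still avoid asserting on specific id values.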

Integration testing - can it be done right?

I used TDD as a development style on some projects in the past two years, but I always get stuck on the same point: how can I test the integration of the various parts of my program?
What I am currently doing is writing a testcase per class (this is my rule of thumb: a "unit" is a class, and each class has one or more testcases). I try to resolve dependencies by using mocks and stubs and this works really well as each class can be tested independently. After some coding, all important classes are tested. I then "wire" them together using an IoC container. And here I am stuck: How to test if the wiring was successful and the objects interact the way I want?
An example: Think of a web application. There is a controller class which takes an array of ids, uses a repository to fetch the records based on these ids and then iterates over the records and writes them as a string to an outfile.
To make it simple, there would be three classes: Controller, Repository, OutfileWriter. Each of them is tested in isolation.
What I would do in order to test the "real" application: making the http request (either manually or automated) with some ids from the database and then look in the filesystem if the file was written. Of course this process could be automated, but still: doesn't that duplicate the test-logic? Is this what is called an "integration test"? In a book I recently read about unit testing, it seemed to me that integration testing was more of an anti-pattern?

IMO, and I have no literature to back me on this, the key difference between our various forms of testing is scope:
Unit testing is testing isolated pieces of functionality [typically a method or stateful class]
Integration testing is testing the interaction of two or more dependent pieces [typically a service and consumer, or even a database connection, or connection to some other remote service]
System integration testing is testing of a system end to end [a special case of integration testing]
If you are familiar with unit testing, then it should come as no surprise that there is no such thing as a perfect or 'magic-bullet' test. Integration and system integration testing is very much like unit testing, in that each is a suite of tests set to verify a certain kind of behavior.
For each test, you set the scope which then dictates the input and expected output. You then execute the test, and evaluate the actual to the expected.
In practice, you may have a good idea how the system works, and so writing typical positive and negative path tests will come naturally. However, for any application of sufficient complexity, it is unreasonable to expect total coverage of every possible scenario.
Unfortunately, this means unexpected scenarios will crop up in Quality Assurance [QA], PreProduction [PP], and Production [Prod] cycles. At which point, your attempts to replicate these scenarios in dev should make their way into your integration and system integration suites as automated tests.
Hope this helps, :)
ps: pet-peeve #1: managers or devs calling integration and system integration tests "unit tests" simply because NUnit or MSTest was used to automate them ...

What you describe is indeed integration testing (more or less). And no, it is not an antipattern, but a necessary part of the sw development lifecycle.
Any reasonably complicated program is more than the sum of its parts. So however well you unit test it, you still have not much clue about whether the whole system is going to work as expected.
There are several aspects of why it is so:
unit tests are performed in an isolated environment, so they can't say anything about how the parts of the program are working together in real life
the "unit tester hat" easily limits one's view, so there are whole classes of factors which the developers simply don't recognize as something that needs to be tested*
even if they do, there are things which can't be reasonably tested in unit tests - e.g. how do you test whether your app server survives under high load, or if the DB connection goes down in the middle of a request?
* One example I just read from Luke Hohmann's book Beyond Software Architecture: in an app which applied strong antipiracy defense by creating and maintaining a "snapshot" of the IDs of HW components in the actual machine, the developers had the code very well covered with unit tests. Then QA managed to crash the app in 10 minutes by trying it out on a machine without a network card. As it turned out, since the developers were working on Macs, they took it for granted that the machine has a network card whose MAC address can be incorporated into the snapshot...

"What I would do in order to test the 'real' application: making the http request (either manually or automated) with some ids from the database and then look in the filesystem if the file was written. Of course this process could be automated, but still: doesn't that duplicate the test-logic?"
Maybe you are duplicating code, but you are not duplicating effort. Unit tests and integration tests serve two different purposes, and usually both purposes are desired in the SDLC. If possible, factor out code used by both unit and integration tests into a common library. I would also try to have separate projects for your unit and integration tests, because your unit tests should be run separately (fast and with no dependencies). Your integration tests will be more brittle and break more often, so you will probably have a different policy for running and maintaining those tests.
"Is this what is called an 'integration test'?"
Yes indeed it is.

In an integration test, just as in a unit test, you need to validate what happened in the test. In your example you specified an OutfileWriter, so you would need some mechanism to verify that the file and its data are good. You really want to automate this, so you might want to have something like this (pseudocode):
class OutfileValidator {
    function isCorrect(fName, dataList) {
        // open the file, read its data,
        // and run the validation logic against dataList
    }
}

You might review "Taming the Beast", a presentation by Markus Clermont and John Thomas about automated testing of AJAX applications.
YouTube Video
Very rough summary of a relevant piece: you want to use the smallest testing technique you can for any specific verification. Spelling the same idea another way, you are trying to minimize the time required to run all of the tests, without sacrificing any information.
The larger tests, therefore are mostly about making sure that the plumbing is right - is Tab A actually in slot A, rather than slot B; do both components agree that length is measured in meters, rather than feet, and so on.
There's going to be duplication in which code paths are executed, and possibly you will reuse some of the setup and verification code, but I wouldn't normally expect your integration tests to include the same level of combinatoric explosion that would happen at a unit level.
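To show what such a "plumbing" test might look like for the Controller / Repository / OutfileWriter example from the question, here is a hedged Python sketch. All three classes are hypothetical minimal versions written for this illustration (the repository is backed by a dict rather than a database, and the wiring is done by hand rather than by an IoC container).

```python
import os
import tempfile


class Repository:
    """Hypothetical repository backed by a dict instead of a real database."""

    def __init__(self, records):
        self._records = records

    def fetch(self, ids):
        return [self._records[record_id] for record_id in ids]


class OutfileWriter:
    def __init__(self, path):
        self._path = path

    def write(self, records):
        with open(self._path, "w", encoding="utf-8") as handle:
            for record in records:
                handle.write(f"{record}\n")


class Controller:
    def __init__(self, repository, writer):
        self._repository = repository
        self._writer = writer

    def export(self, ids):
        self._writer.write(self._repository.fetch(ids))


def file_is_correct(path, expected_records):
    """Validation step: read the outfile back and compare it with expectations."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines() == list(expected_records)


def run_wiring_check():
    """Wire the real classes together, run one request, and validate the file."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "out.txt")
        controller = Controller(
            Repository({1: "first", 2: "second"}), OutfileWriter(path)
        )
        controller.export([2, 1])
        return file_is_correct(path, ["second", "first"])
```

One such check per wiring path is usually enough; the edge cases of each class stay in the unit tests, so the integration suite does not repeat the combinatorial cases.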

Driving your TDD with BDD would cover most of this for you. You can use Cucumber / SpecFlow, with WatiR / WatiN. For each feature it has one or more scenarios, and you work on one scenario (behaviour) at a time, and when it passes, you move onto the next scenario until the feature is complete.
To complete a scenario, you have to use TDD to drive the code necessary to make each step in the current scenario pass. The scenarios are agnostic to your back end implementation, however they verify that your implementation works; if there is something that isn't working in the web app for that feature, the behaviour needs to be in a scenario.
You can of course use integration testing, as others pointed out.
