Unit testing methods whose impact can only be confirmed using another method

Assume there are two methods in a module, insert and print. The module's inner properties are private and can only be accessed by calling print().
In testing, print has its own set of tests, and now I'm trying to test insert. The issue is that the only way to confirm whether insert did in fact insert any data into the module is to call print() and check its output. This also means that if something were to break print in the future, insert's tests would fail as well.
The most straightforward solution is to expose the inner variables that insert affects and assert the changes directly, but that would clutter the module's API and potentially confuse end users.
Is there another practice/solution to situations like this or is exposing the inner variable the way to go?
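For concreteness, a minimal sketch of the situation (in Swift, with made-up names; print is modeled here as a method that returns its output as a String):

struct RecordStore {
    private var records: [String] = []

    mutating func insert(_ record: String) {
        records.append(record)
    }

    func printContents() -> String {
        return records.joined(separator: "\n")
    }
}

func testInsert() {
    var store = RecordStore()
    store.insert("a")
    // The only way to observe insert's effect is through printContents(),
    // so this test also fails whenever printContents() breaks.
    assert(store.printContents().contains("a"))
}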

Related

Unit test and private vars

I'm writing a BDD unit test for a public method. The method changes a private property (private var) so I'd like to write an expect() and ensure it's being set correctly. Since it's private, I can't work out how to access it from the unit test target.
For Objective-C, I'd just add an extension header. Are there any similar tricks in Swift? As a note, the property has a didSet() with some code as well.
(Note that Swift 2 adds the @testable attribute, which can make internal methods and properties available for testing. See @JeremyP's comments below for some more information.)
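For reference, a minimal sketch of how that is used, with "MyModule" as a placeholder module name:

// In the test target, Swift 2 and later:
@testable import MyModule
// A testable import makes the module's internal declarations visible to the tests;
// truly private members still are not.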
No. In Swift, private is private. The compiler can use this fact to optimize, so depending on how you use that property, it is legal for the compiler to have removed it, inlined it, or done any other thing that would give the correct behavior based on the code actually in that file. (Whether the optimizer is actually that smart today or not, it's allowed to be.)
Now of course if you declare your class to be @objc, then you can break those optimizations, and you can go poking around with ObjC to read it. And there are bizarre workarounds that can let you use Swift to call arbitrary @objc-exposed methods (like a zero-timeout NSTimer). But don't do that.
This is a classic testing problem, and the classic testing answer is don't test this way. Don't test internal state. If it is literally impossible to tell from the outside that something has happened, then there is nothing to test. Redesign the object so that it is testable across its public interface. And usually that means composition and mocks.
Probably the most common version of this problem is caching. It's very hard to test that something is actually cached, since the only difference may be that it is retrieved faster. But it's still testable. Move the caching functionality into another object, and let your object-under-test accept a custom caching object. Then you can pass a mock that records whether the right cache calls were made (or networking calls, or database calls, or whatever the internal state holds).
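For example, a sketch of that caching redesign (in Swift, with invented names): the cache sits behind a protocol, and the test injects a recording fake.

protocol Cache {
    func store(_ value: String, forKey key: String)
    func value(forKey key: String) -> String?
}

// Test double that records which cache calls were made.
final class RecordingCache: Cache {
    private(set) var storedKeys: [String] = []
    func store(_ value: String, forKey key: String) { storedKeys.append(key) }
    func value(forKey key: String) -> String? { return nil }
}

final class Fetcher {
    private let cache: Cache
    init(cache: Cache) { self.cache = cache }

    func fetch(_ key: String) -> String {
        if let cached = cache.value(forKey: key) { return cached }
        let result = "expensive result for \(key)"   // stands in for the real work
        cache.store(result, forKey: key)
        return result
    }
}

// In the test: create Fetcher(cache: RecordingCache()), call fetch, and assert
// on storedKeys -- no internal state of Fetcher is ever inspected.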
Basically the answer is: redesign so that it's easier to test.
OK, but you really, really, really need it... how to do it? OK, it is possible without breaking the world.
Create a function inside the file to be tested that exposes the thing you want. Not a method. Just a free function. Then you can put that helper function inside an #if TEST block, and define TEST in your testing configuration. Ideally I'd make the function actually test the thing you care about rather than exposing the variable (and in that case, maybe you can let the function be internal or even public). But either way.
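A minimal sketch of that idea, assuming a TEST flag defined only in the testing build configuration (in Swift 3 and later the member needs to be fileprivate for a same-file free function to see it; in Swift 2, private was already file-scoped):

struct Counter {
    fileprivate var count = 0      // file-scoped so the helper below can read it
    mutating func increment() { count += 1 }
}

#if TEST
// Free function in the same file, compiled only when TEST is set.
// It checks the thing we care about instead of exposing the raw variable.
func counterHasCount(_ counter: Counter, _ expected: Int) -> Bool {
    return counter.count == expected
}
#endif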

What are recommended ways to refactor the internal representation of a class?

Say I have a class which stores some data in a private variable 'data', e.g. an array. Different methods of this class use the data variable, and different unit tests test these methods. Now, for some reason, you want to change the container type of data (e.g. a map instead of an array), which needs to be handled in a slightly different way. If one simply changes the type of data from array to map, all the code in the methods breaks, and one would have to change it all before any unit tests can be run again. This usually is not what one wants. So what is the recommended way to do this iteratively without breaking all the code at once?
Unit tests typically test the outer functionality of a class.
Therefore a refactor of a class's "internals" should not break any tests.
(This is one of the benefits of testing.)
Many IDEs will let you do an automated refactor of internals in one hit, but you did not specify the language or IDE you are using.
If your class is so large that a change to a data-type creates "hours" of work then you should probably consider breaking your class into smaller bits of functionality prior to a refactor of the data type.
Alternatively, you could hand-refactor in chunks and attempt to keep the class stable, temporarily 'commenting out' the array definition to help tell you where you need to focus.
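One way to do that chunked refactor (a sketch in Swift, with invented names) is to first funnel every access to the container through a few small private helpers; the public methods and their tests stay green while you do it, and switching the container type later only touches the helpers.

final class DataStore {
    // Step 1: keep the old container, but stop touching it directly.
    private var data: [String] = []

    private func add(_ item: String) { data.append(item) }
    private func item(at index: Int) -> String { return data[index] }
    private var count: Int { return data.count }

    // Step 2: rewrite the public methods to use only the helpers.
    func insert(_ item: String) { add(item) }
    func first() -> String? { return count > 0 ? item(at: 0) : nil }

    // Step 3: change `data` to a map and update just the three helpers.
}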

How would you create unit tests for a data intensive application which could run an endless amount of db queries?

I am working on a reporting application (in PHP). This app has a huge number of different filters, granulations, etc. in the UI, and based on those filters the backend constructs a massive query to pull hundreds of rows of data from the db.
How is it possible to write unit tests for something like this?
Let's say I create a test db with some known data. Would I create a bunch of tests where I compare the returned data set (for whatever filter settings) against hardcoded SQL queries in the tests?
Would this mean that for any schema change, I have to go back and change every single SQL query in the tests?
Unit testing isn't testing in a way that uses real code or data; you mock everything you work with. You wouldn't test it in the way you are describing, nor do you need to. You aren't testing what data you get back; you're testing that the data you feed it, after the method processes it, comes out as you expect.
For example, if you have a method that returns data retrieved from a database, the database has nothing to do with your test. You are testing just that method and the logic therein: which methods it calls, what you expect those inner methods to do (like return a generic representation of a value you can run an assertion on), and so on; everything outside of that method is mocked (i.e. replaced with a generic representation).
In a simple example, if you created one method that is a setter of something, and one method used as a getter of that something, then you would write a test that says: when I use the setter, the getter returns the same value... boom, both methods are tested.
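A round-trip test of that kind might look like this (a sketch in Swift/XCTest; the question is about PHP, but the idea is language-agnostic and all names here are made up):

import XCTest

// Hypothetical type with a setter/getter pair around one piece of state.
struct Settings {
    private var timeoutValue = 0
    mutating func setTimeout(_ seconds: Int) { timeoutValue = seconds }
    func timeout() -> Int { return timeoutValue }
}

final class SettingsTests: XCTestCase {
    func testSetterIsVisibleThroughGetter() {
        var settings = Settings()
        settings.setTimeout(30)
        // One round trip exercises both methods at once.
        XCTAssertEqual(settings.timeout(), 30)
    }
}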
This is the reason why you hear about TDD (test-driven development), which may feel counterintuitive at first, but it forces a developer to put together the pieces required to write testable code, which ultimately leads to better code. Yes, you can write code that functions perfectly but is not testable (or is nearly impossible to test), and that's an indicator that it's too tightly coupled, meaning it's not very reusable. For example, instead of creating a method that returns the number of apples, you could create a method that takes the object type as a parameter, so no matter what type of fruit you are using in that part of the project, it can return you a count (oranges, apples, pears, or not even fruit at all). That makes the method reusable, and also means you won't be writing methods for each type of fruit (so you write less code).
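As a sketch of that last point (again in Swift, with hypothetical names), the count method takes the kind of item as a parameter instead of being written once per fruit:

struct Inventory {
    private let items: [String]
    init(items: [String]) { self.items = items }

    // One parameterized method instead of countApples(), countOranges(), ...
    func count(of kind: String) -> Int {
        return items.filter { $0 == kind }.count
    }
}

let inventory = Inventory(items: ["apple", "pear", "apple"])
print(inventory.count(of: "apple"))   // 2 -- works for any kind of item, fruit or not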
Anyway, provide an example of your code, and your test, to see what the issue is.

Organizing unit test within a test class

Suppose I have several unit tests in a test class ([TestClass] in VSUnit in my case). I'm trying to test just one thing in each test (which doesn't mean just one Assert, though). Imagine there's one test (e.g. Test_MethodA()) that tests a method used in other tests as well. I do not want to put an assert on this method in the other tests that use it, to avoid duplication/maintainability issues, so I have the assert in only this one test. Now when this test fails, all tests that depend on correct execution of that tested method fail as well. I want to be able to locate the problem faster, so I want to be somehow pointed to Test_MethodA. It would help, for example, if I could make some of the tests in the test class execute in a particular order; when they fail, I'd start looking for the cause of the failure in the first failing test. Do you have any idea how to do this?
Edit: By suggesting that a solution would be to execute the tests in a particular order, I have probably gone too far and in the wrong direction. I don't care about the order of the tests. It's just that some of the tests will always fail if a prerequisite isn't valid. E.g. I have a test class that tests a DAO class (ok, probably not a UNIT test, but there's logic in the database stored procedures that needs to be tested, though that's not the point here I think). I need to insert some records into a table in order to test that a method responsible for retrieving the records (let's call it GetAll()) gets them all in the correct order. I do the insert by using a method on the DAO class. Let's call it Insert(). I have tests in place that verify that the Insert() method works as expected. Now I want to test the GetAll() method. In order to get the database in a desired state, I use the Insert() method. If Insert() doesn't work, most tests for GetAll() will fail. I'd prefer to mark the tests that can't pass because Insert() doesn't work as inconclusive rather than failed. It would ease finding the cause of the problem if I knew which method/test to look into first.
You can't (and shouldn't) execute unit tests in a specific order. The underlying reason for this is to prevent Interacting Tests - I realize that your motivation for requesting such a feature is different, but that's the reason why unit test frameworks don't allow you to order tests. In fact, last time I checked, xUnit.net even randomizes the order.
One could argue that the fact that some of your tests depend on a different method call on the same class is a symptom of tight coupling, but that's not always the case (state machines come to mind).
However, if possible, consider using a Back Door instead of the other method in question.
If you can't do either that or decouple the interdependency (e.g. by making the first method virtual and using the Extract and Override technique), you will have to live with it.
Here's an example:
public class MyClass
{
    public virtual void FirstMethod() { /* do something... */ }
    public void SecondMethod() { }
}
Since FirstMethod is virtual, you can derive from MyClass and override its behavior. You can also use a dynamic mock to do that for you. With Moq, it would look like this:
var sutStub = new Mock<MyClass>();
// by default, Moq overrides all virtual methods without calling base
// Now invoke both methods in sequence:
sutStub.Object.FirstMethod(); // overridden by Moq, so it does nothing
sutStub.Object.SecondMethod();
I think I would indeed have the assertion on the method_A() result in every test relying on its result, even if this introduces some duplication. Then I would use the assertion message to point to the method_A() failure.
assert("method_A() returned true", true, rc);
Perhaps I would end up extracting the method_A() call and the assertion into a helper function to remove the duplication.
Now let's imagine method_A() queries an object and returns it, or NULL when no object is found. Then this assertion is a guard; it is necessary with languages such as C or C++ that do not have a NullPointerException.
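In XCTest terms, such a guard might look like this (a sketch; the repository dictionary simply stands in for whatever method_A() queries):

import XCTest

final class LookupTests: XCTestCase {
    func testLookupReturnsTheStoredObject() {
        let repository = ["id-1": "Alice"]       // stand-in for the object method_A() queries
        // Guard assertion: if the lookup returns nil, fail here, so the failure
        // message points straight at that call.
        guard let value = repository["id-1"] else {
            XCTFail("method_A() returned nil")
            return
        }
        XCTAssertEqual(value, "Alice")           // the real assertion runs only past the guard
    }
}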
I'm afraid you can't do this. The only solution is to redesign your code and break it up into smaller methods so that unit tests can call these one by one. Of course this isn't always desirable.
With Visual Studio you can order your tests: see here. But I'd like to advise you to stay away from this technique as much as possible: unit tests are meant to be run anywhere, anytime and in every order.
EDIT: why is this a problem for you? All failing tests point to the same method anyway...

unit testing data storage

Suppose I have an interface with methods 'storeData(key, data)' and 'getData(key)'. How should I test a concrete implementation? Should I check if the data was correctly set in the storage medium (eg an sql database) or should I just check whether or not it gives the correct data back by using getData?
If I look up the data in the database it feels like I'm also testing the internals of the method but only checking whether it gives the same data back feels incomplete.
You seem to be caught up in the hype of unit testing; what you will be doing is actually an integration test. Setting and getting back the same value from the same key is a unit test you'd do with a mock implementation of the storage engine, but actually testing the real storage (say, your database), as you should, is no longer a unit test. It is still a fundamental part of testing, though, and it sounds like integration testing to me. Don't use unit testing as your hammer; choose the right tools for the right job. Divide your testing into more layers.
What you want to do in a unit test is make sure that the method does the job that it is supposed to do. If the method uses dependencies to accomplish its work, you would mock those dependencies out and make sure that your method calls the methods on the objects it depends on with the appropriate arguments. This way you test your code in isolation.
One of the benefits to this is that it will drive the design of your code in a better direction. In order to use mocking, for example, you naturally gravitate towards more decoupled code using dependency injection. This gives you the ability to easily substitute your mock objects for the actual objects that your class depends on. You also end up implementing interfaces, which are more naturally mocked. Both of these things are good design patterns and will improve your code.
In order to test your particular example, for instance, you might have your class depend on a factory to create connections to the database and a builder to construct parameterized SQL commands that are executed via the connection. You'd pass these mocked versions of these objects to your class and ensure that the correct methods to set up the connection and command, build the correct command, execute it, and tear down the connection were invoked. Or perhaps, you inject an already open connection and simply build the command and invoke it. The point is your class is built against an interface or set of interfaces and you use mocking to supply objects that implement those interfaces and can record invocations and supply correct return values to the methods that you expect to use from the interface(s).
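A sketch of that shape (in Swift, with invented protocol and class names): the class under test depends only on an interface, and the test passes a recording fake.

protocol CommandRunner {
    func execute(_ sql: String, parameters: [String]) -> [String]
}

final class KeyValueStore {
    private let runner: CommandRunner
    init(runner: CommandRunner) { self.runner = runner }

    func storeData(key: String, data: String) {
        _ = runner.execute("INSERT INTO kv (k, v) VALUES (?, ?)", parameters: [key, data])
    }
}

// Test double that records what the class under test asked it to do.
final class RecordingRunner: CommandRunner {
    private(set) var executed: [(sql: String, parameters: [String])] = []
    func execute(_ sql: String, parameters: [String]) -> [String] {
        executed.append((sql: sql, parameters: parameters))
        return []
    }
}

// In the unit test: inject RecordingRunner, call storeData, and assert that the
// expected command and parameters were issued -- without touching a real database.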
In cases like this I will usually create SetUp and TearDown methods that fire before/after my unit tests. These methods will set up any test data I need in the db and delete any test data when I'm done. Pseudo code example:
Const KEY1 = "somekey"
Const VALUE1= "somevalue"
Const KEY2 = "somekey2"
Const VALUE2= "somevalue2"
Sub SetUpUnitTests()
{
Insert Into SQLTable(KEY1,VALUE1)
}
//this test is not dependent on the storeData method
Sub GetDataTest()
{
Assert.IsEqual(getData(KEY1),VALUE1)
}
//this test is not dependent on getData Method
Sub SetDataTest()
{
storeData(KEY2, VALUE2)
Assert.IsNotNull(Direct Call to SQL [Select data from table where key=KEY2])
}
Sub TearDownUnitTests()
{
Delete From table Where key in (KEY1, KEY2)
}
Testing both in concert is a common technique (at least, in my experience), and I wouldn't shy away from it. I've used this same pattern for serializing/deserializing and parsing and printing.
If you don't want to hit the database, you could use a database mock. Some people have the same feelings as you when using mocks - it is partly implementation specific. As in all things, it's a trade-off: consider the benefits of mocking (faster, not db dependent) vs its downsides (won't detect actual db problems, slower).
I think it depends on what happens to the data later - if you're only ever going to access the data using storeData and getData, why not test the methods in concert? I suppose there's a chance that a bug will arise and it'll be slightly harder to figure out whether it's in storeData or getData, but I'd consider that an acceptable risk if it (a) makes your test easier to implement, and (b) conceals the internals, as you say.
If the data will be read from, or inserted into, the database using some other mechanism, then I'd check the database using SQL as you suggest.
@brendan makes a good point, though - whichever method you decide on, you'll be inserting data in the database. It's a good idea to clear out the data before and after the tests to ensure that you can achieve consistent results.