Best practices for file system dependencies in unit/integration tests - unit-testing

I just started writing tests for a lot of code. There's a bunch of classes with dependencies on the file system, that is, they read CSV files, read/write configuration files and so on.
Currently the test files are stored in the test directory of the project (it's a Maven2 project) but for several reasons this directory doesn't always exist, so the tests fail.
Do you know best practices for coping with file system dependencies in unit/integration tests?
Edit: I'm not looking for an answer to the specific problem I described above. That was just an example. I'd prefer general recommendations on how to handle dependencies on the file system, databases, etc.

First, one should try to keep unit tests away from the file system - see this Set of Unit Testing Rules. If possible, have your code work against streams, which can be in-memory buffers in the unit tests and file streams in the production code (see the sketch below).
If this is not feasible, you can have your unit tests generate the files they need. This makes the tests easy to read as everything is in one file. This may also prevent permission problems.
You can mock the filesystem/database/network access in your unit tests.
You can consider the unit tests that rely on DB or file systems as integration tests.
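A minimal sketch of the first suggestion, with hypothetical names: the parsing code works against a Reader, so unit tests can feed it an in-memory buffer while production code passes a FileReader.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class CsvParser {

    // Production callers pass new FileReader(csvFile); unit tests pass a StringReader.
    public List<String[]> parse(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(line.split(","));
        }
        return rows;
    }
}

// In a unit test, no file is touched at all:
//   List<String[]> rows = new CsvParser().parse(new StringReader("a,b\n1,2\n"));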

Dependencies on the filesystem come in two flavours here:
files that your tests depend upon; if you need files to run the test, then you can generate them in your tests and put them in a /tmp directory.
files that your code is dependent upon: config files, or input files.
In this second case, it is often possible to re-structure your code to remove the dependency on a file (e.g. java.io.File can be replaced with java.io.InputStream and java.io.OutputStream, etc.). This may not be possible, of course.
You may also need to handle 'non-determinism' in the filesystem (I had a devil of a job debugging something on an NFS once). In this case you should probably wrap the file system in a thin interface.
At its simplest, this is just a helper method that takes a File and forwards the call on to it:
InputStream getInputStream(File file) throws IOException {
    return new FileInputStream(file);
}
You can then replace this one with a mock which you can direct to throw an exception, return a ByteArrayInputStream, or whatever.
The same can be said for URLs and URIs.
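A sketch of that idea with hypothetical names: a thin interface over the file system, the real implementation used in production, and a test double that serves canned bytes (a failure-path double would simply throw an IOException instead).

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

interface FileSystemAccess {
    InputStream getInputStream(File file) throws IOException;
}

// Production implementation: forwards to the real file system.
class RealFileSystem implements FileSystemAccess {
    public InputStream getInputStream(File file) throws IOException {
        return new FileInputStream(file);
    }
}

// Test double: returns canned in-memory data regardless of the File argument.
// A variant for failure tests would throw new IOException("disk gone") here instead.
class InMemoryFileSystem implements FileSystemAccess {
    public InputStream getInputStream(File file) {
        return new ByteArrayInputStream("key=value\n".getBytes(StandardCharsets.UTF_8));
    }
}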

There are two options for testing code that needs to read from files:
Keep the files related to the unit tests in source control (e.g. in a test data folder), so anyone who gets the latest and runs the tests always has the relevant files in a known folder relative to the test binaries. This is probably the "best practice".
If the files in question are huge, you might not want to keep them in source control. In this case, a network share that is accessible from all developer and build machines is probably a reasonable compromise.
Obviously most well-written classes will not have hard dependencies on the file system in the first place.
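In a Maven project, the first option usually means putting the files under src/test/resources, which Maven copies onto the test classpath. A minimal JUnit 4 sketch, assuming a hypothetical sample.csv fixture:

import static org.junit.Assert.assertNotNull;

import java.io.IOException;
import java.io.InputStream;

import org.junit.Test;

public class CsvReaderTest {

    @Test
    public void readsSampleCsvFromTestResources() throws IOException {
        // src/test/resources/sample.csv ends up on the test classpath,
        // so it is always in a known location relative to the test binaries.
        try (InputStream in = getClass().getResourceAsStream("/sample.csv")) {
            assertNotNull("sample.csv should be on the test classpath", in);
            // ... hand the stream to the class under test ...
        }
    }
}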

Usually, file system tests aren't very critical: the file system is well understood, easy to set up and to keep stable. Also, accesses are usually pretty fast, so there is no reason per se to shun it or to mock it in the tests.
I suggest that you find out why the directory doesn't exist and make sure that it does. For example, check the existence of a file or directory in setUp() and copy the files if the check fails. This only happens once, so the performance impact is minimal.
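A JUnit 3-style sketch of that setUp() check; the directory and file names are assumptions:

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

import junit.framework.TestCase;

public class ConfigReaderTest extends TestCase {

    private static final File TEST_DIR = new File("target/test-data");

    @Override
    protected void setUp() throws IOException {
        // Make sure the test directory and the fixture file exist before every test.
        if (!TEST_DIR.isDirectory() && !TEST_DIR.mkdirs()) {
            throw new IOException("Could not create " + TEST_DIR);
        }
        File fixture = new File(TEST_DIR, "config.properties");
        if (!fixture.exists()) {
            // Copy the reference fixture from the test classpath.
            try (InputStream in = getClass().getResourceAsStream("/config.properties")) {
                Files.copy(in, fixture.toPath(), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }

    public void testReadsConfig() {
        // ... exercise the class under test against files in TEST_DIR ...
    }
}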

Give the test files, both input and output, names that are structurally similar to the unit test name.
In JUnit, for instance, I'd use:
File reportFile = new File("tests/output/" + getClass().getSimpleName() + "/" + getName() + ".report.html");

Related

How to mock file copy in a functional test

I have a controller whose duty is to copy a file passed along with the request (through a POST body) to a specific path under web/images. The path is specified by a property living in the specific Controller.
What I would like to do is test it with a functional test, but I don't want it to overwrite files in my project, so I would like to use vfs or change the path before my test case sends the request.
Is there a good, straightforward way to accomplish this?
A common approach is to load configuration that may change between environments from an environment variable. (I have never used Symfony before, so there may be tools to help with env vars.)
The upload path could then be
$upload_path = getenv('WEB_IMAGE_UPLOAD_PATH')
    ? getenv('WEB_IMAGE_UPLOAD_PATH')
    : 'web/images';
This will allow you to specify a temp (/tmp ?) directory when starting up your server in integration mode.
Ah cool (disclaimer: I'm not a PHP person), it looks like PHP has I/O streams that may be able to help in functional testing and allow easy cleanup.
http://php.net/manual/en/wrappers.php.php#refsect2-wrappers.php-unknown-unknown-unknown-unknown-unknown-descriptios
I believe you may be able to set your 'WEB_IMAGE_UPLOAD_PATH' to be one of those streams.
I'll try to answer myself: I refactored my code to have a property that specifies the path where I would like to copy/overwrite my file.
Then, inside a PHPUnit class, I replace the object property's value with a vfsStream path. That way I get the behavior I need without touching my real files/paths. Everything lives inside the virtual file system and my object uses it.
Parameters are important for clean and reusable code, but even more so when you want to unit-test: I think unit testing forces me to parameterize everything instead of falling back to hardcoding when I don't have much time. To help me write unit tests I created a class that accesses methods and properties irrespective of their accessibility.
PHPUnitUtils
I'm quite sure there's already something more sophisticated, but this class fulfills my needs at the moment. Hope it helps :-)

GoldenFiles testing and TFS server workspaces

Our product (a C++ Windows application, Google Test as the testing framework, VS2015 as the IDE) has a number of file-based interfaces to external products, i.e., we generate a file which is then imported into an external product. For testing these interfaces, we have chosen a golden file approach:
Invoke the code that produces an interface file and save the resulting file for later reference (this is our golden file; we assume here that the current state of the interface code is correct).
Commit the golden file to the TFS repository.
Make changes to the interface code.
Invoke the code and compare the resulting file with the corresponding golden file.
If the files are equal, the test passes (the change was a refactoring). Otherwise,
Enable the refresh mode, which makes sure that the golden file is overwritten by the file resulting from invoking the interface code.
Invoke the interface code (thus refreshing the golden file).
Investigate the outgoing changes in VS's team explorer. If the changes are as desired by our code changes from step 3, commit code changes and golden file. Otherwise, go back to step 3.
This approach works great for us, but it has one drawback: VS only recognizes that the golden files have changed (and thus allows us to investigate the changes) if we use a local workspace. If we use a server workspace, programmatically remove the read-only flag from the golden files and refresh them as described above, VS still does not recognize that the files have changed.
So my question is: Is there any way to make our golden file testing approach work with server workspaces, e.g. by telling VS that some files have changed?
I can think of two ways.
The first approach is to run tf checkout instead of removing the read-only attribute.
This has an intrinsic risk, as one may inadvertently check in the generated file; this should be prevented by restricting check-in permissions on those files. You may also need to run tf undo to clean up the local state.
Another approach would be to map the golden files to a different directory and use a local diff tool instead of relying on Visual Studio's built-in tool. This is less risky than the other solution, but may be cumbersome. Do not forget that you can "clone" a workspace (e.g. Import Visual Studio TFS workspaces).

Maven: utility to generate unit tests

I need to write unit tests for an existing Java REST server.
The GET methods are very similar and I am thinking that I can write a small unit test generator that will use reflection to introspect the GET methods and the POJOs they consume to generate (boilerplate) unit tests.
Each test will be generated with a small syntax error so that it cannot be run as is, but must be examined by a developer and the syntax error corrected. I am hoping that this will at least ensure that the tests are sane and look reasonable.
The generator will be run from the command line, passing in the class-under-test, the output directory for the unit tests, etc.
I don't want the class files for the generator to be added to the WAR file, but the generator needs to have access to the class files for the REST server.
My project directory is a "standard" Maven hierarchy: project/src/main/java, project/target, etc.
Where is the best place to put the generator source code? Under project/src/main/java? Under project/src/generator/java? Somewhere else?
I know how to exclude the generated class files from the WAR file if they all are included under a specific package (e.g. com.example.unit_test_generator).
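For illustration only, a rough sketch of what such a generator could look like; the class names, output layout, and the "methods starting with get" filter are all assumptions, since the question doesn't name the REST framework:

import java.io.PrintWriter;
import java.lang.reflect.Method;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Usage (hypothetical): java TestGenerator com.example.MyResource target/generated-tests
public class TestGenerator {

    public static void main(String[] args) throws Exception {
        Class<?> classUnderTest = Class.forName(args[0]);
        Path outDir = Paths.get(args[1]);
        Files.createDirectories(outDir);

        Path outFile = outDir.resolve(classUnderTest.getSimpleName() + "GeneratedTest.java");
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(outFile))) {
            out.println("import org.junit.Test;");
            out.println("import static org.junit.Assert.*;");
            out.println();
            out.println("public class " + classUnderTest.getSimpleName() + "GeneratedTest {");
            for (Method method : classUnderTest.getDeclaredMethods()) {
                if (!method.getName().startsWith("get")) {
                    continue; // crude stand-in for "is a GET handler"
                }
                out.println("    @Test");
                out.println("    public void test" + method.getName() + "() {");
                out.println("        // TODO call " + method.getName() + " and assert on the result");
                out.println("        fail(\"review me\")  // deliberate syntax error: missing ';'");
                out.println("    }");
            }
            out.println("}");
        }
    }
}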
This scenario sounds like a Maven plugin to me. Furthermore, the usual place for generated code is under target/generated..., which means the target folder. Take a look at maven-antlr3-plugin or maven-jaxb-plugin to see where they usually put generated code. Never put generated code into the src/ structure. But maybe you have to change the location and put it into project/src/main/; if these classes are some kind of tests, they have to be located under project/src/test instead.

Unit testing processes that use records for state

I'd like to unit test a gen_fsm that uses a fairly large record for its state. The record is defined within the erl file that also defines the gen_fsm and thus is not (to my knowledge) visible to other modules.
Possible approaches:
Put the record into an hrl file and include that in both modules. This is ok, but spreads code that is logically owned by the gen_fsm across multiple files.
Fake a record with a raw tuple in the unit test module. This would get pretty ugly, as the record already has over 20 fields.
Export a function from my gen_fsm that will convert a proplist to the correct record type with some record_info magic. While possible, I don't like the idea of polluting my module interface.
Actually spawn the gen_fsm and send it a series of messages to put it in the right state for the unit test. There is substantial complexity to this approach (although Meck helps) and I feel like I'm wasting these great, pure Module:StateName functions that I should be able to call without a whole bunch of setup.
Any other suggestions?
You might consider just putting your tests directly into your gen_fsm module, which of course would give them access to the record. If you'd rather not include the tests in production code, and assuming you're using eunit, you can conditionally compile them in or out as indicated in the eunit user's guide:
-ifdef(EUNIT).
% test code here
...
-endif.

How to organize C++ test apps and related files?

I'm working on a C++ library that (among other things) has functions to read config files, and I want to add tests for this. So far, this has led me to create lots of valid and invalid config files, each with only a few lines that test one specific piece of functionality. But it has now become very unwieldy, as there are so many files, and also lots of small C++ test apps. Somehow this seems wrong to me :-) so do you have hints on how to organise all these tests, the test apps, and the test data?
Note: the library's public API itself is not easily testable (it requires a config file as parameter). The juicy, bug-prone methods for actually reading and interpreting config values are private, so I don't see a way to test them directly?
So: would you stick with testing against real files; and if so, how would you organise all these files and apps so that they are still maintainable?
Perhaps the library could accept some kind of stream input, so you could pass in a string-like object and avoid all the input files? Or, depending on the type of configuration, you could provide "get/setAttribute()" functions to directly, publicly, fiddle with the parameters. If that is not really a design goal, then never mind. Data-driven unit tests are frowned upon in some places, but it is definitely better than nothing! I would probably lay out the code like this:
project/
    src/
    tests/
        test1/
            input/
        test2/
            input/
In each testN directory you would have a cpp file associated with the config files in the input directory.
Then, assuming you are using an xUnit-style test library (cppunit, googletest, unittest++, or whatever) you can add various testXXX() functions to a single class to test out associated groups of functionality. That way you could cut out part of the lots-of-little-programs problem by grouping at least some tests together.
The only problem with this is if the library expects the config file to be called something specific, or to be in a specific place. That shouldn't be the case, but if it is, it would have to be worked around by copying your test file to the expected location.
And don't worry about lots of tests cluttering your project up, if they are tucked away in a tests directory then they won't bother anyone.
Part 1.
As Richard suggested, I'd take a look at the CPPUnit test framework. That will drive the location of your test framework to a certain extent.
Your tests could be in a parallel directory located at a high-level, as per Richard's example, or in test subdirectories or test directories parallel with the area you want to test.
Either way, please be consistent in the directory structure across the project! Especially in the case of tests being contained in a single high-level directory.
There's nothing worse than having to maintain a mental mapping of source code in a location such as:
/project/src/component_a/piece_2/this_bit
and having the test(s) located somewhere such as:
/project/test/the_first_components/connection_tests/test_a
And I've worked on projects where someone did that!
What a waste of wetware cycles! 8-O Talk about violating Alexander's concept of Quality Without a Name.
Much better is having your tests consistently located w.r.t. location of the source code under test:
/project/test/component_a/piece_2/this_bit/test_a
Part 2
As for the API config files, make local copies of a reference config in each local test area as part of the test environment setup that is run before executing a test. Don't sprinkle copies of configs (or data) all through your test tree.
HTH.
cheers,
Rob
BTW Really glad to see you asking this now when setting things up!
In some tests I have done, I have actually used the test code to write the configuration files and then delete them after the test had made use of the file. It pads out the code somewhat and I have no idea if it is good practice, but it worked. If you happen to be using boost, then its filesystem module is useful for creating directories, navigating directories, and removing the files.
I agree with what @Richard Quirk said, but also you might want to make your test suite class a friend of the class you're testing and test its private functions.
For things like this I always have a small utility class that will load a config into a memory buffer, and from there it gets fed into the actual config class. This means the real source doesn't matter - it could be a file or a DB. For the unit test it is hard-coded in a std::string that is then passed to the class for testing. You can easily simulate corrupted data for testing failure paths.
I use UnitTest++. I have the tests as part of the src tree. So:
solution/project1/src <-- source code
solution/project1/src/tests <-- unit test code
solution/project2/src <-- source code
solution/project2/src/tests <-- unit test code
Assuming that you have control over the design of the library, I would expect that you'd be able to refactor such that you separate the concerns of actual file reading from interpreting it as a configuration file:
class FileReader reads the file and produces an input stream,
class ConfigFileInterpreter validates/interprets etc. the contents of the input stream
Now to test FileReader you'd need a very small number of actual files (empty, binary, plain text etc.), and for ConfigFileInterpreter you would use a stub of the FileReader class that returns an input stream to read from. Now you can prepare all your various config situations as strings and you would not have to read so many files.
You will not find a unit testing framework worse than CppUnit. Seriously, anybody who recommends CppUnit has not really taken a look at any of the competing frameworks.
So yes, go for a unit testing framework, but do not use CppUnit.