Unit Tests for comparing text files in NUnit

I have a class that processes two XML files and produces a text file.
I would like to write a bunch of unit / integration tests that can individually pass or fail for this class that do the following:
For input A and B, generate the output.
Compare the contents of the generated file to the expected output.
When the actual contents differ from the expected contents, fail and display some useful information about the differences.
Below is the prototype for the class along with my first stab at unit tests.
Is there a pattern I should be using for this sort of testing, or do people tend to write zillions of TestX() functions?
Is there a better way to coax text-file differences from NUnit? Should I embed a textfile diff algorithm?
class ReportGenerator
{
    string Generate(string inputPathA, string inputPathB)
    {
        // do stuff; returns the path of the generated report file
    }
}

[TestFixture]
public class ReportGeneratorTests
{
    static void Diff(string pathToExpectedResult, string pathToActualResult)
    {
        using (StreamReader rs1 = File.OpenText(pathToExpectedResult))
        {
            using (StreamReader rs2 = File.OpenText(pathToActualResult))
            {
                string actualContents = rs2.ReadToEnd();
                string expectedContents = rs1.ReadToEnd();
                // this works, but the output could be a LOT more useful.
                Assert.AreEqual(expectedContents, actualContents);
            }
        }
    }

    static void TestGenerate(string pathToInputA, string pathToInputB, string pathToExpectedResult)
    {
        ReportGenerator obj = new ReportGenerator();
        string pathToResult = obj.Generate(pathToInputA, pathToInputB);
        Diff(pathToExpectedResult, pathToResult);
    }

    [Test]
    public void TestX()
    {
        TestGenerate("x1.xml", "x2.xml", "x-expected.txt");
    }

    [Test]
    public void TestY()
    {
        TestGenerate("y1.xml", "y2.xml", "y-expected.txt");
    }

    //etc...
}
Update
I'm not interested in testing the diff functionality. I just want to use it to produce more readable failures.

As for running multiple tests with different data, use the NUnit RowTest extension:
using NUnit.Framework.Extensions;

[RowTest]
[Row("x1.xml", "x2.xml", "x-expected.txt")]
[Row("y1.xml", "y2.xml", "y-expected.txt")]
public void TestGenerate(string pathToInputA, string pathToInputB, string pathToExpectedResult)
{
    ReportGenerator obj = new ReportGenerator();
    string pathToResult = obj.Generate(pathToInputA, pathToInputB);
    Diff(pathToExpectedResult, pathToResult);
}
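With NUnit 2.5 or later, the built-in [TestCase] attribute does the same job without the separate extensions assembly; the method body stays identical:

[TestCase("x1.xml", "x2.xml", "x-expected.txt")]
[TestCase("y1.xml", "y2.xml", "y-expected.txt")]
public void TestGenerate(string pathToInputA, string pathToInputB, string pathToExpectedResult)
{
    ReportGenerator obj = new ReportGenerator();
    string pathToResult = obj.Generate(pathToInputA, pathToInputB);
    Diff(pathToExpectedResult, pathToResult);
}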

You are probably asking about testing against "gold" data. I don't know if there is a widely accepted term for this kind of testing, but this is how we do it.
Create a base fixture class. It basically has void DoTest(string fileName), which reads the specified file into memory, runs an abstract transformation method string Transform(string text), then reads fileName.gold from the same place and compares the transformed text with what was expected. If the content differs, it throws an exception. The exception contains the line number of the first difference as well as the text of the expected and actual lines. As the text is stable, this is usually enough information to spot the problem right away. Be sure to mark the lines with "Expected:" and "Actual:", or you will be guessing forever which is which when looking at test results. (A sketch of such a fixture is shown at the end of this answer.)
Then you have concrete test fixtures, in which you implement the Transform method that does the real work, and tests which look like this:
[Test] public void TestX() { DoTest("X"); }
[Test] public void TestY() { DoTest("Y"); }
The name of the failed test instantly tells you what is broken. Of course, you can use row testing to group similar tests. Having separate tests also helps in a number of situations, like ignoring tests, communicating tests to colleagues, and so on. It is no big deal to create a snippet that generates such a test in a second; you will spend much more time preparing the data.
You will also need some test data and a convention for how the base fixture finds it; be sure to set up rules about this for the project. If a test fails, dump the actual output to a file next to the gold file, and erase it when the test passes. That way you can use a diff tool when needed. When no gold data is found, the test fails with an appropriate message, but the actual output is written anyway, so you can check that it is correct and copy it to become the new gold file.
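A minimal sketch of such a base fixture in NUnit. The class name, the .gold/.actual file conventions and the exact failure messages are assumptions; adapt them to your project's rules:

using System;
using System.IO;
using NUnit.Framework;

public abstract class GoldDataFixture
{
    // Each concrete fixture implements the transformation under test.
    protected abstract string Transform(string text);

    protected void DoTest(string fileName)
    {
        string actual = Transform(File.ReadAllText(fileName));
        string goldPath = fileName + ".gold";
        string actualPath = fileName + ".actual";

        if (!File.Exists(goldPath))
        {
            File.WriteAllText(actualPath, actual);
            Assert.Fail("No gold file '" + goldPath + "'; actual output written to '" + actualPath + "' for review.");
        }

        // Compare line by line so the failure message can point at the first difference.
        string[] expectedLines = File.ReadAllText(goldPath).Replace("\r\n", "\n").Split('\n');
        string[] actualLines = actual.Replace("\r\n", "\n").Split('\n');

        int common = Math.Min(expectedLines.Length, actualLines.Length);
        for (int i = 0; i < common; i++)
        {
            if (expectedLines[i] != actualLines[i])
            {
                File.WriteAllText(actualPath, actual);
                Assert.Fail(string.Format(
                    "First difference at line {0}:\nExpected: {1}\nActual:   {2}",
                    i + 1, expectedLines[i], actualLines[i]));
            }
        }

        if (expectedLines.Length != actualLines.Length)
        {
            File.WriteAllText(actualPath, actual);
            Assert.Fail(string.Format("Expected {0} lines but got {1}.",
                expectedLines.Length, actualLines.Length));
        }

        // Test passed: remove any stale .actual file from an earlier failing run.
        File.Delete(actualPath);
    }
}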

I would probably write a single unit test that contains a loop. Inside the loop, I'd read two XML files and a diff file, then diff the XML files (without writing the result to disk) and compare it to the diff file read from disk. The files would be numbered, e.g. a1.xml, b1.xml, diff1.txt; a2.xml, b2.xml, diff2.txt; a3.xml, b3.xml, diff3.txt, etc., and the loop stops when it doesn't find the next number.
Then, you can write new tests just by adding new text files.
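A sketch of that loop. GenerateToString is a hypothetical in-memory variant of Generate; with the existing API you would compare file contents instead:

[Test]
public void AllNumberedDataSetsProduceTheExpectedOutput()
{
    // Stops as soon as the next numbered data set is missing.
    for (int i = 1; File.Exists("a" + i + ".xml"); i++)
    {
        string expected = File.ReadAllText("diff" + i + ".txt");
        string actual = new ReportGenerator().GenerateToString("a" + i + ".xml", "b" + i + ".xml"); // hypothetical overload
        Assert.AreEqual(expected, actual, "Mismatch for data set " + i);
    }
}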

Rather than calling .AreEqual you could parse the two input streams yourself, keep a count of line and column, and compare the contents. As soon as you find a difference, you can generate a message like...
Line 32 Column 12 - Found 'x' when 'y' was expected
You could optionally enhance that by displaying multiple lines of output
Difference at Line 32 Column 12, first difference shown
A = this is a txst
B = this is a tests
Note that, as a rule, I'd only generate one of the two streams through my code. The other I'd take from a test/text file, having verified by eye or some other method that the data it contains is correct!
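A sketch of that kind of comparison, written as a drop-in replacement for the question's Diff helper (the message format is just the example above):

static void Diff(string pathToExpectedResult, string pathToActualResult)
{
    string expected = File.ReadAllText(pathToExpectedResult);
    string actual = File.ReadAllText(pathToActualResult);

    int line = 1, column = 1;
    int common = Math.Min(expected.Length, actual.Length);

    for (int i = 0; i < common; i++)
    {
        if (expected[i] != actual[i])
        {
            Assert.Fail(string.Format(
                "Line {0} Column {1} - Found '{2}' when '{3}' was expected",
                line, column, actual[i], expected[i]));
        }

        // Track position for the failure message.
        if (expected[i] == '\n') { line++; column = 1; }
        else { column++; }
    }

    if (expected.Length != actual.Length)
        Assert.Fail(string.Format("One file ends early, around line {0}.", line));
}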

I would probably use XmlReader to iterate through the files and compare them. When I hit a difference I would display an XPath to the location where the files are different.
PS: In reality it was always enough for me to simply read the whole file into a string and compare the two strings. For the report it is enough to see that the test failed; when debugging I usually diff the files using Araxis Merge to see exactly where the issues are.
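A rough sketch of the XmlReader approach; it reports a simple element path rather than a full XPath, and ignores attributes (needs System.Xml, System.Linq and System.Collections.Generic):

static void AssertXmlFilesEqual(string expectedPath, string actualPath)
{
    var path = new Stack<string>();

    using (XmlReader expected = XmlReader.Create(expectedPath))
    using (XmlReader actual = XmlReader.Create(actualPath))
    {
        while (true)
        {
            bool moreExpected = expected.Read();
            bool moreActual = actual.Read();
            string location = "/" + string.Join("/", path.Reverse().ToArray());

            if (moreExpected != moreActual)
                Assert.Fail("Documents end at different points near " + location);
            if (!moreExpected)
                return; // both readers exhausted: documents match

            if (expected.NodeType != actual.NodeType ||
                expected.Name != actual.Name ||
                expected.Value != actual.Value)
            {
                Assert.Fail(string.Format(
                    "Difference near {0}: expected {1} '{2}{3}' but found {4} '{5}{6}'",
                    location,
                    expected.NodeType, expected.Name, expected.Value,
                    actual.NodeType, actual.Name, actual.Value));
            }

            // Maintain a crude element path for the failure message.
            if (expected.NodeType == XmlNodeType.Element && !expected.IsEmptyElement)
                path.Push(expected.Name);
            else if (expected.NodeType == XmlNodeType.EndElement)
                path.Pop();
        }
    }
}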

Related

How to do unit testing elegantly on a graph operator

I have an application that deals with graph computation. I want to cover it with unit tests, but I find it hard to test.
The main classes are as follows:
Grid stores the graph structure.
GridInput parses the input file and saves it into a Grid.
GridOperatorA does some operation on the Grid.
GridOperatorB does some operation on the Grid.
The production code is something like:
string configure_file = "data.txt";
GridInput input(configure_file);
Grid grid = input.parseGrid();
GridOperatorA a;
a.operator(grid);
GridOperatorB b;
b.operator(grid);
I find this code hard to test.
My unit test code is shown below:
// unit test for grid input
string configure_file = "data.txt";
GridInput input(configure_file);
Grid grid = input.parseGrid();
// check grid status from input file
assert(grid.someAttribute(1) == {1,2,3,4,...,100}); // long int array, hard to understand
...
assert(grid.someAttribute(5) == {100,101,102,...,200}); // long int array, hard to understand

// unit test for operator A
string configure_file = "data.txt";
GridInput input(configure_file);
Grid grid = input.parseGrid();
GridOperatorA a;
a.operator(grid);
// check grid status after operator A
assert(grid.someAttribute(1) == {1,3,7,4,...,46}); // long int array, hard to understand
...
assert(grid.someAttribute(5) == {59,78,...,32}); // long int array, hard to understand

// unit test for operator B
string configure_file = "data.txt";
GridInput input(configure_file);
Grid grid = input.parseGrid();
GridOperatorA a;
a.operator(grid);
GridOperatorB b;
b.operator(grid);
// check grid status after operator B
assert(grid.someAttribute(1) == {3,2,7,9,...,23}); // long int array, hard to understand
...
assert(grid.someAttribute(5) == {38,76,...,13}); // long int array, hard to understand
In my opinion, my unit tests are not good; they have many weaknesses:
The unit tests are slow: in order to test OperatorA and OperatorB they need to do file I/O.
The unit tests are not clear: they need to check the grid status after each operator, but checking lots of arrays makes it hard for a programmer to understand what the arrays stand for. A few days later, the programmer can no longer tell what has happened.
The unit tests only cover one configuration file; if I need to test grids built from many configuration files, there will be even more hard-to-understand arrays.
I have read about some techniques to break dependencies, such as mock objects. I can mock the grid read from the configuration file, but the mock data is just like the data stored in the configuration file. I can mock the Grid after OperatorA, but the mock data is just like the grid status after OperatorA. Both still lead to a lot of hard-to-understand arrays.
I do not know how to write elegant unit tests in my situation. Any advice is appreciated. Thanks for your time.
To get rid of the I/O:
You can pass something like a data provider to GridInput. In your production code it will read the file; in test code you can replace it with a test double (stub) that provides hardcoded data. You already mention that above.
You could also let "someone else" (i.e. other code) take care of loading the file and just pass the loaded data to the grid. Looking only at the Grid, testing gets simpler because no file handling is required at all.
To make the tests more readable you can do some of the following:
Use nice test method names that are not just testMethod. Name them after what you are testing; you could use your comments as method names. Test only one aspect in a single test.
Replace the inline arrays with properly named constants. The names of the constants help explain what is checked at a given assertion.
The same holds for the parameters to the someAttribute() method.
Another option is to create your own assert methods to hide some of the details, something like assertThatMySpecialConditionIsMet(grid).
You could also write a test data generator to avoid hardcoding the arrays. That is not something I would suggest for the first test; after a couple of tests a pattern might become visible that can be moved into a generator.
Just a couple of hints to get you started.... :-)

cppUnit: setUp function executed once for multiple test methods

I've got an object Obj doing some (elaborate) computation and want to check whether the results (let's call them aComputed and bComputed) are correct or not. Therefore I want to split this task up into multiple test methods:
testA() { load aToBe; check if number aComputed = aToBe }
testB() { load bToBe; check if number bComputed = bToBe }
The problem is that Obj is "executed" twice (which takes a lot of time), once per test. The question is: how can I make sure it is "executed" just once and the result is used by both tests?
At the moment Obj is placed inside the setUp function and saves its results to a private member of the test class.
Thanks for helping!
There is no easy solution that allows you to split the code into two test methods. Each test method is run on a new test object with its own set of member variables.
Obviously you could work around this problem with a static variable, but in the long run that normally just causes issues and breaks the ideas behind the framework.
The better idea is to just write the two CPPUNIT_ASSERT calls in the same test method. If the results are part of the same calculation, there is most likely not much value in splitting the checks into two independent test methods.

CppUnit: how to use command-line arguments

I have a CppUnit test which tests a class designed to read configuration; we can call this class Config.
The Config class can do the following:
Config c;
c.read("/tmp/random-tmp-directory/test.conf");
The random temp directory is created by a bash script, and the path should be passed into the test binary.
#!/bin/bash
TEMPDIR=$(mktemp -d)
cp files/config/test.conf $TEMPDIR/.
./testConfig $TEMPDIR/test.conf
The above creates a temp directory, copies our config file into it, and passes the path to the test so it can load the correct file.
Is there a way to tell CppUnit to pass the command-line arguments, or any arguments, through to the test registry?
Here is my testConfig.cpp
#include <all the required.h>

CPPUNIT_TEST_SUITE_REGISTRATION(testConfig);

int main(int argc, char **argv)
{
    CPPUNIT_NS::TestResult testresult;
    CPPUNIT_NS::TestRunner runner;
    CPPUNIT_NS::TestFactoryRegistry &registry = CPPUNIT_NS::TestFactoryRegistry::getRegistry();

    // register listener for collecting the test-results
    CPPUNIT_NS::TestResultCollector collectedresults;
    testresult.addListener(&collectedresults);

    runner.addTest(registry.makeTest());
    runner.run(testresult);

    // Print test in a compiler compatible format.
    CppUnit::CompilerOutputter outputter(&collectedresults, std::cerr);
    outputter.write();

    return collectedresults.wasSuccessful() ? 0 : 1;
}
Consider dividing your code into at least three distinct methods: the part that constructs the config file name, the part that reads the config file, and the part that parses what was read from the config file. You can easily and thoroughly unit test both the file name builder and the parser methods. And as long as you can test simply reading data from the file even one time, you should be golden.
[edit]
For example, you might have a method like string & assembleConfigFileName(string basepath, string randompath, string filename) that takes in the different components of your path and filename, and puts them together. One unit test should look like this:
void TestConfig::assembleConfigFileName_good()
{
    string goodBase("/tmp");
    string goodPath("1234");
    string goodName("test.conf");

    string actual(assembleConfigFileName(goodBase, goodPath, goodName));
    string expected("/tmp/1234/test.conf");

    CPPUNIT_ASSERT_EQUAL(expected, actual);
}
Now you can test that you're building the fully qualified config file name exactly correctly. The test is not trying to read a file. The test is not trying to generate a random number. The test is providing an example of exactly what kinds of input the routine needs to take, and stating exactly what the output should look like given that exact input. And it's proving the code does exactly that.
It's not important for this routine to actually read a config file out of a temp directory. It's only important that it generate the right file name.
Similarly, you build a unit test for each possible flow through your code, including error scenarios. Let's say your code throws an exception if the random path is missing; your unit test will test that exception mechanism:
void TestConfig::assembleConfigFileName_null_path()
{
    string goodBase("/tmp");
    string nullPath;
    string goodName("temp.config");

    CPPUNIT_ASSERT_THROW(assembleConfigFileName(goodBase, nullPath, goodName), MissingPathException);
}
The tests are now a document that says exactly how it works, and exactly how it fails. And they prove it every single time you run the tests.
Something you appear to be trying to do is to create a system test, not a unit test. In a unit test, you do NOT want to be passing in randomly pathed config files. You aren't trying to test the external dependencies, that the file system works, that a shell script works, that $TMPDIR works, none of that. You're only trying to test that the logic you've written works.
Testing random files in the operating system is very appropriate for automated system tests, but not for automated unit tests.

How to test that a file is left unchanged?

I'm testing a function that may modify a file. How do I test that the file is unchanged in the cases where it should be?
I don't want to check the content, because the file may have been overwritten with the same content, changing the modification time.
I can't really check the modification time, either. Since I like tests to be self-contained, the original file would be written just before the (non-)modification test, rendering the modification time unreliable.
You can use DI to mock your file writer. That way you do not need the file at all; you only check whether the write function was called, and that tells you whether the file would have been modified.
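A sketch of that idea in C#; IFileWriter, SpyFileWriter and FileUpdater are all made-up names for illustration:

public interface IFileWriter
{
    void Write(string path, string contents);
}

// Test double that records calls instead of touching the disk.
public class SpyFileWriter : IFileWriter
{
    public int WriteCount { get; private set; }

    public void Write(string path, string contents)
    {
        WriteCount++;
    }
}

[Test]
public void DoesNotWriteWhenNothingChanged()
{
    var writer = new SpyFileWriter();
    var sut = new FileUpdater(writer);   // hypothetical class under test, takes the writer via DI

    sut.UpdateIfNeeded("unchanged-input");

    Assert.AreEqual(0, writer.WriteCount, "The file should not have been written.");
}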
I would split the function into two separate functions; the first decides whether the modification should be made, the second makes the modification. The second is only called if necessary. In pretend language:
function bool IsModificationRequired()
{
    // return true or false based on your actual code
}

function void WriteFile()
{
    new File().Write("file");
}

function void WriteIfModified()
{
    if (IsModificationRequired())
        WriteFile();
}
And test
Assert.IsTrue(IsModificationRequired());
Well, assuming you are working with a text file of reasonable size: just hash the file content. If the hash code before and after the call is the same, the file content has not changed.
Here is the link to the Algorithm Design Manual - Steve Skiena (Google Book result), Section 3.8.
How can I convince you that a file isn't changed?
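A sketch of the hash comparison in C#/NUnit; FunctionUnderTest is a placeholder for the code being tested:

static byte[] HashFile(string path)
{
    using (var sha = System.Security.Cryptography.SHA256.Create())
    using (var stream = File.OpenRead(path))
        return sha.ComputeHash(stream);
}

[Test]
public void FileIsLeftUnchanged()
{
    byte[] before = HashFile("input.txt");

    // Exercise the code that may (but should not) modify the file.
    FunctionUnderTest("input.txt");   // placeholder

    byte[] after = HashFile("input.txt");
    CollectionAssert.AreEqual(before, after, "The file content changed.");
}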

How to test asynchronous code

I've written my own access layer to a game engine. There is a GameLoop which gets called every frame and lets me run my own code. I'm able to do specific things and to check whether these things happened. In a very basic way it could look like this:
void cycle()
{
    //set a specific value
    Engine::setText("Hello World");

    //read the value
    std::string text = Engine::getText();
}
I want to test whether my Engine layer is working by writing automated tests. I have some experience using the Boost unit test framework for simple comparison tests like this.
The problem is that some things I want the engine to do are only processed after the call to cycle(). So calling Engine::getText() directly after Engine::setText(...) would return an empty string. If I waited until the next call of cycle(), the right value would be returned.
I am now wondering how I should write my tests if it is not possible to process them in the same cycle. Are there any best practices? Is it possible to use the "traditional testing" approach given by the Boost unit test framework in such an environment? Are there perhaps other frameworks aimed at such a specialised case?
I'm using C++ for everything here, but I could imagine that there are answers unrelated to the programming language.
UPDATE:
It is not possible to access the Engine outside of cycle()
In your example above, std::string text = Engine::getText(); is the code you want to remember from one cycle but execute in the next. You can save it for later execution; for example, using C++11 you could use a lambda to wrap the test into a simple function specified inline.
You have two options:
If the library can be used synchronously, or offers a C++11-futures-like facility (which can indicate the readiness of the result), then in your test case you can do something like this:
void testcycle()
{
    //set a specific value
    Engine::setText("Hello World");
    while (!Engine::isResultReady());
    //read the value
    assert(Engine::getText() == "WHATEVERVALUEYOUEXPECT");
}
If you don't have the above, the best you can do is poll with a timeout (this is not a good option though, because you may get spurious failures):
// needs <cassert>, <chrono> and <thread>
void testcycle()
{
    //set a specific value
    Engine::setText("Hello World");

    auto start = std::chrono::steady_clock::now();
    while (Engine::getText() != "WHATEVERVALUEYOUEXPECT") {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        if (std::chrono::steady_clock::now() - start > std::chrono::seconds(1)) // you can put whatever max time
            assert(0);
    }
}