I've inherited an old program that works, but has pretty ugly source code. I need to make the code nicer without changing the functionality. This program takes an input file, performs all sorts of calculations and generates an output file.
The program is currently written in a C/C++ combination. At first I'm going to keep it a C++ program, but in the not-so-far future I'm going to convert it, or parts of it, to Python.
Naturally, the original developers haven't taken the time to create unit tests, or any other kind of test. Since I want to make sure my modifications haven't changed the program's behavior, I want to start by creating some tests. These will not be unit tests, but rather tests of the entire program.
I want each test to take one input file and a set of command line arguments, run the program and compare the output (which is the output file, stdout output and stderr output) to the expected output.
Since I need to support both C++ and Python, the test framework needs to be language agnostic - it should be able to run an executable, collect stdout and stderr and compare them, as well as another file, to the prerecorded outputs.
I couldn't find a test framework that can do that. Is there anything like that? I'd rather not develop one myself.
Well, off the top of my head, you could certainly run the executable with your desired inputs in Python using subprocess or a similar module, parse the output, and then use the unittest module to set expectations on what sort of output you're looking for.
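For what it's worth, a minimal sketch of that approach might look like the following. The executable name, arguments, output file name and golden-file paths are placeholders for whatever your project actually uses:

import subprocess
import unittest

class EndToEndTest(unittest.TestCase):
    def run_case(self, input_file, args, expected_dir):
        # Run the program under test with the given input file and arguments.
        result = subprocess.run(
            ["./myprogram", input_file] + args,   # placeholder executable name
            capture_output=True, text=True)
        # Compare stdout, stderr and the generated output file to prerecorded versions.
        with open(expected_dir + "/stdout.txt") as f:
            self.assertEqual(result.stdout, f.read())
        with open(expected_dir + "/stderr.txt") as f:
            self.assertEqual(result.stderr, f.read())
        with open("output.dat") as got, open(expected_dir + "/output.dat") as want:
            self.assertEqual(got.read(), want.read())

    def test_basic_case(self):
        self.run_case("tests/basic/input.txt", ["--fast"], "tests/basic/expected")

if __name__ == "__main__":
    unittest.main()

Because the harness only shells out to an executable and compares text, it keeps working unchanged when parts of the program are later rewritten in Python.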
Situation:
I'm attempting to get coverage reports for a project that uses both C++ and Python. I'm using LCOV/GCOV for the C++, and attempting to use Coverage.py for the Python side. The only issue is that most of the Python code being exercised is simply utility functions called one at a time: no initialization, no real life cycle or exit, so there is no real way to use the API to start/stop/save, or to use the coverage command line to measure.
With this in mind, I thought the easiest way to accomplish this would be the sitecustomize.py method as outlined here. I have gotten that to work, and it measures all configured Python code as expected. Now I'm looking at how to accomplish the same thing with compiled Python code (.pyc).
I can get it to work if I keep the source (.py) and compiled (.pyc) files in the same directory when running and then reporting. However, I'm looking for a way to RUN the files and generate the measurement data first, and then, at a later time, point coverage at the actual source files and run the actual reports. Ideally I wouldn't need the source (.py) files at all, but I haven't found a way to accomplish this.
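(For reference, the sitecustomize hook boils down to something like this, with the COVERAGE_PROCESS_START environment variable pointing at the coverage configuration file before anything runs:

# sitecustomize.py -- dropped somewhere on the target's sys.path
import coverage
coverage.process_startup()   # starts measurement in every Python process
)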
Objective:
In the end I want to be able to compile the Python files to .pyc, install them on the target, and run coverage as stated above. That will generate coverage data files, which I then pull back to my host machine, which houses the source (.py), and do the actual coverage reporting there.
Is this possible currently?
[Edit] Thanks to Ned's advice, I looked into the [paths] usage, and it worked exactly how I needed it to.
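For anyone landing here with the same problem: the [paths] section of the coverage configuration is what maps the run-time location of the installed files back to the host-side source tree. The directories below are only illustrative:

# .coveragerc on the host machine -- the two locations are illustrative.
# The first entry is the canonical source tree on the host; the second is
# the directory the installed files actually ran from on the target.
[paths]
source =
    src/mypackage/
    /opt/target/lib/mypackage/

coverage combine applies this mapping when it merges the data files pulled back from the target, and coverage report / coverage html then resolve everything against the host-side sources.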
How do syntax highlighting tools such as Pygments and TextMate bundles do automated testing?
Tools like this often simply resort to a large collection of text snippets representing a chosen input and the expected output. For instance, if you look at the Pygments repository on GitHub, you can see they have giant lists of text files divided into an input section and a tokens section, like so:
---input---
f'{"quoted string"}'
---tokens---
'f' Literal.String.Affix
"'" Literal.String.Single
'{' Literal.String.Interpol
'"' Literal.String.Double
'quoted string' Literal.String.Double
'"' Literal.String.Double
'}' Literal.String.Interpol
"'" Literal.String.Single
'\n' Text
Since a highlighting tool reads a piece of code and then has to identify which bits of text are parts of which bits of code (is this the start of a function? is this a comment? is it a variable name?), it usually performs various processing steps that result in a list of tokens like the one above, which it can then feed into the next step (insert highlights from the first Literal.String.Interpol to the next, bold any Literal.String.Single, etc., by generating the appropriate HTML, CSS or other markup relevant to the system). Checking that these tokens are generated properly from the input text is key.
Then, depending on the language the tool is built in, you might use an existing testing suite or build your own (Pygments seems to use the Python-based tool pytest), which essentially consists of running each of the inputs through your tool in a loop, reading the output, and comparing it to the expected values. If an output doesn't match, you can display a message showing which test failed and what the input, output, expected and error values were; if it passes, you can simply signal it with a happy green checkmark. Then when the run finishes, the developer can hopefully reason out what they broke by looking over the results.
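Concretely, that loop doesn't need to be much more than the following sketch. It uses Pygments' public get_tokens() API; the snippet-file parser here is a simplified stand-in for whatever the real harness does, and the lexer is hard-coded for brevity:

import ast
from pygments.lexers import PythonLexer

def parse_snippet_file(path):
    # Split a file like the one shown above into its input text and the
    # expected (value, token-name) pairs.  Simplified for illustration.
    raw = open(path).read()
    input_part, token_part = raw.split("---tokens---")
    input_text = input_part.replace("---input---", "").lstrip("\n")
    expected = []
    for line in token_part.strip().splitlines():
        value, name = line.rsplit(None, 1)
        expected.append((ast.literal_eval(value), "Token." + name))
    return input_text, expected

def run_snippet_test(path):
    input_text, expected = parse_snippet_file(path)
    # get_tokens() yields (token_type, value) pairs -- the token list
    # described above -- which we normalise to match the file's notation.
    actual = [(value, str(ttype))
              for ttype, value in PythonLexer().get_tokens(input_text)]
    assert actual == expected, f"{path}: token mismatch\n got {actual}\n want {expected}"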
It is often a good idea to randomize the order in which these inputs are run, so that you can be sure each test doesn't have side effects that get passed along to the next test and cause it to pass or fail incorrectly. It can also be a good idea to time the complete run: if the whole thing took 12 seconds yesterday but now takes two minutes, we may have broken something even if all the tests technically "pass".
In a tool like a code highlighter, you often have a good idea of what many of the inputs and outputs will look like before you can code everything up, for instance if some spec document already exists. In that case it may be a good idea to include tests that you know won't pass right away, mark them with some tag (perhaps a text marker within the file that says "NOT PASSING", or a particular file-naming convention), and tell your testing suite to expect those tests to fail. Then, as you fix bugs and add features, say you fix Bug X in your attempt to make test #144 pass; when you run the tests, the suite also alerts you that 10 other tests that were expected to fail are now passing. Congrats! You just saved yourself a lot of work trying to fix several seemingly separate problems that were actually caused by the same root issue.
As the codebase is updated, a developer runs and reruns the tests to ensure that their changes don't break tests that were working before, and then adds new tests to the collection to verify that the new feature, fixed edge case, etc. now has a known expected output that nobody can accidentally break in the future.
In CppUnit we run unit tests as part of the build, in a post-build step, and we run multiple tests as part of this. If any test case fails, the post-build step should not stop: it should go ahead, run all the test cases, and report a summary of how many test cases passed and failed. How can we achieve this?
Thanks!
The question is specific enough. You need a test runner. Encapsulate each test in its own behaviour and class, keep the test project separate from the tested code, and afterwards just configure your XMLOutputter. You can find an excellent example of how to do this on the YoLinux website: http://www.yolinux.com/TUTORIALS/CppUnit.html
We compile our test projects alongside our main projects this way and check that everything is OK. After that it all becomes a matter of maintaining your test code.
Your question is too vague for a precise answer. Usually, a unit test engine returns a code to tell whether it has failed (like a non-zero return code in the shell on Linux) or generates some output file with the results, and the calling system handles this. If you have written that system yourself (some home-made scripts), you have to add the option to continue running tests even if an error occurs. If you are using a tool such as a continuous integration server, then you have to go through its documentation and find the option that lets it continue when tests fail.
A workaround is to write a script that returns an "OK" result even if the unit tests fail, but then you lose some of the automatic verification ...
Be more specific if you want more clues.
my2c
I would just write your tests this way: instead of using the CPPUNIT_ASSERT macros or the like, write them in regular C++ with some way of logging errors.
You could use a macro for this too of course. Something like:
LOGASSERT( some_expression )
could be defined to execute some_expression and log the expression together with __FILE__ and __LINE__ if it fails. You can also log exceptions, of course, as well as expected exceptions that are not thrown, simply by writing the checks in your tests (with macros if you want to log the offending expression with __FILE__ and __LINE__).
If you are writing macros I would advise you to limit the content of your macro to calling an inline function with extra parameters.
Hello, another question concerning debugging: automatically generating test cases when I know the parameter set, and doing it all at once instead of during development (I could kick myself).
I have a set of parameters for my software that I wish to test (only about 12 parameters). However, these parameters are often integers, so for every parameter there are about four values that make sense (0, insanely huge, normally big, normally small).
Is there a way I can generate my test cases automatically? It would save me a lot of time. I already have to inspect every test case by hand, do I not? A lot of my program produces output on the console, so normal assertions probably won't work; also, I work on home-made data structures most of the time, so I could not use a simple assertion.
My dream option would be a kind of reverse regular expression, where I set the rules and get a file generated that I can use as an input (my software has a crude scripting language). That way I can assemble all the input files and test them one by one.
Looking forward to hearing your kind suggestions.
cheers
There are lots of ways to generate test cases in your scenario, though you're a bit vague about what form the inputs for your programs and units need to take. For one of my Fortran programs I use a template input parameter file, a bash script and a makefile. The makefile, when called on the test phony target:
a) compiles the program;
b) runs the bash script, which uses sed to replace placeholders in the template parameter file, to create 128 (or whatever) test input files;
c) submits all the test jobs to the job management system on our cluster.
Once the jobs have finished, I have some other scripts to compare outputs with benchmarks, collect statistics, that sort of thing.
If you need more specific advice, post more specific questions.
EDIT: Using sed inside a bash script:
Suppose that the parameter input template file contains 3 codes to be replaced: $FREQ$, $NUM$ and $TOL$. Then I write a bash script with a 3-deep loop nest something like this:
for frq in 0.01 0.0 1 10
do
    for np in 1 2 4 8 16
    do
        for tol in 0.001 0.0001 0.00001
        do
            sed ....
        done
    done
done
It's not pretty but it works, and it saves me wrestling with much more sophisticated solutions such as xUnit testing or Python programming.
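If you did end up reaching for Python instead, the same idea, expanding a template over every combination of parameter values, is only a few lines. The template file name, placeholder names and value lists below are illustrative, chosen to mirror the sed example:

import itertools
from pathlib import Path

# Illustrative value sets, one per placeholder in the template.
params = {
    "$FREQ$": ["0.01", "0.1", "1", "10"],
    "$NUM$":  ["1", "2", "4", "8", "16"],
    "$TOL$":  ["0.001", "0.0001", "0.00001"],
}

template = Path("input.template").read_text()   # hypothetical template file

for i, values in enumerate(itertools.product(*params.values())):
    text = template
    for placeholder, value in zip(params.keys(), values):
        text = text.replace(placeholder, value)
    Path(f"test_{i:03d}.in").write_text(text)    # one input file per combination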
I suggest you read something about data-driven unit testing.
There are lots of frameworks that can help you with that.
You may start here: http://www.slideshare.net/dnastacio/datadriven-unit-testing-for-java-1933154.
I see that you work with Fortran, so you would probably use one of Fortran's versions of xUnit. As a JUnit user I'd suggest parameterized tests; see if the concept applies in your case.
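For readers working in Python rather than Fortran or Java, the same data-driven idea is available as pytest's parametrize decorator. A small sketch, where process() is just a stand-in for whatever unit actually consumes the parameter:

import pytest

def process(value):
    # Stand-in for the unit under test.
    return value * 2

@pytest.mark.parametrize("value", [0, 1, 1_000_000, 2**31 - 1])  # zero, small, big, insanely huge
def test_process(value):
    assert process(value) == value * 2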
I have a test program which prompts for input from the user (stdin) and, depending on the inputs, asks for further inputs, which also need to be entered.
Is there a way I can have a script do all this work?
There's a program called expect that does pretty much exactly what you want: you can script inputs, expected outputs and responses based on those outputs, as simple or complex as you need. See also the Wikipedia entry for expect.
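expect itself is scripted in Tcl; if the rest of your tooling is Python, the pexpect module offers the same model. A minimal sketch, where the program name and prompt strings are placeholders:

import pexpect   # Python implementation of the expect idea

child = pexpect.spawn("./myprogram")        # hypothetical interactive program
child.expect("Enter a value:")              # wait for the first prompt (a regex)
child.sendline("42")
child.expect(r"Proceed\? \(y/n\)")
child.sendline("y")
child.expect(pexpect.EOF)                   # wait for the program to exit
print(child.before.decode())                # output printed after the last match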
I may have misunderstood, but do you have a program which reads the input and does something with it, and you just want to know how to automate providing it some test input?
For a given test case, does the input you provide have to depend on the output from the program, or is it the same every time?
If the input for a given test is the same every time, then just put it in a text file and redirect stdin for your program to read from that file:
myprogram.exe < input.txt
If the input is different each time, for the same test, then this doesn't help. But for a typical simple test, you just want to answer "y" to the first question, "n" to the second, and "hello world" to the third, or whatever.
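Those same canned answers can also be fed from a test script, for instance with Python's subprocess module; the program name here is a placeholder:

import subprocess

# Answer "y", then "n", then "hello world" to the three prompts.
result = subprocess.run(
    ["./myprogram"],                       # placeholder for the program under test
    input="y\nn\nhello world\n",
    capture_output=True, text=True)
print(result.stdout)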
In the general case, yes, there is.
For more specific tasks, there are other, more specialized tools that will be better suited to that particular job.