Unit testing for stochastic processes? - unit-testing

Is there a sane way to unit test a stochastic process? For example, say you have coded a simulator for a specific system model. The simulator's behavior is random and depends on the seeds of its RNGs, so the state of the system cannot be predicted. And if it can be predicted, every test would have to bring the system to a specific state before testing any method of a class. Is there a better way to do this?

The two obvious choices are to remove the randomness (that is, use a fixed, known seed for your unit tests and proceed from there), or to test statistically (that is, run the same test case a million times and verify that the mean and variance (etc.) match expectations). The latter is probably a better test of your system, but you'll have to live with some false alarms.
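A minimal sketch of the fixed-seed option, assuming the simulator accepts its random source as a parameter. `simulate_queue` is a hypothetical toy model, not anything from the question; the point is only that two runs with the same seed must produce the same trajectory, and that invariants can be asserted on it.

```python
import random


def simulate_queue(rng, steps=1000, arrival_p=0.3, service_p=0.5):
    """Hypothetical toy simulator: returns the queue length after each step."""
    length = 0
    history = []
    for _ in range(steps):
        if rng.random() < arrival_p:
            length += 1  # a customer arrives
        if length > 0 and rng.random() < service_p:
            length -= 1  # a customer is served
        history.append(length)
    return history


def test_same_seed_gives_same_trajectory():
    # Two independent generators with the same seed yield the same stream.
    run_a = simulate_queue(random.Random(12345))
    run_b = simulate_queue(random.Random(12345))
    assert run_a == run_b


def test_queue_length_is_never_negative():
    # An invariant that must hold for any seed; one seed is checked here.
    assert all(length >= 0 for length in simulate_queue(random.Random(12345)))
```

With the seed pinned, the tests are fully repeatable, at the cost of exercising only one random path through the model.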

Here's a nice blog post that covers this topic. Basically you will need to inject a controlled randomness into the object under test.
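One way to read "inject a controlled randomness" is to pass the random source into the object and replace it with a scripted double in tests. The sketch below assumes that design; `ScriptedRandom` and `Coin` are hypothetical names used only for illustration.

```python
class ScriptedRandom:
    """Test double that replays a fixed list of values from random()."""

    def __init__(self, values):
        self._values = iter(values)

    def random(self):
        return next(self._values)


class Coin:
    """Object under test; it only needs something with a random() method."""

    def __init__(self, rng, p_heads=0.5):
        self._rng = rng
        self._p_heads = p_heads

    def flip(self):
        return "heads" if self._rng.random() < self._p_heads else "tails"


coin = Coin(ScriptedRandom([0.1, 0.9, 0.49]))
flips = [coin.flip() for _ in range(3)]
```

In production the same class would receive a real generator such as `random.Random()`, so the test exercises the actual decision logic without depending on chance.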

Maybe you could use JUnit Theories to solve that.
http://blogs.oracle.com/jacobc/entry/junit_theories

You need to find Q0 and p00 for the system. p00 is the predicted state, while Q0 is
the calculated state. The predicted state can help you find the recurrent system, which is
the smallest value, say k, in the system.

If your model is stochastic, then you could treat its output as a randomly generated sample. Then, in your unit testing function, you could perform a hypothesis test with a confidence interval. If the test output is within the confidence bound, the test passes. However, there will always be some probability of a false positive or a false negative.
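A rough sketch of such a test, assuming the quantity of interest is a mean with a known theoretical value. The helper name and the choice of z = 3.89 (roughly a 1 in 10,000 two-sided false-alarm rate under a normal approximation) are assumptions made for this example, and the seed is pinned only so the example itself is repeatable.

```python
import math
import random
import statistics


def mean_within_confidence_bound(samples, expected_mean, z=3.89):
    """True if expected_mean lies inside the z-sigma interval around the sample mean."""
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return abs(statistics.fmean(samples) - expected_mean) <= z * std_error


rng = random.Random(42)
# Stand-in for model output: exponential samples with rate 2.0, so mean 0.5.
samples = [rng.expovariate(2.0) for _ in range(10_000)]

accepts_true_mean = mean_within_confidence_bound(samples, expected_mean=0.5)
rejects_wrong_mean = not mean_within_confidence_bound(samples, expected_mean=0.6)
```

A wider interval lowers the false-alarm rate but also lets larger real errors slip through, which is the false positive/false negative trade-off mentioned above.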

Related

How to determine the contribution rate of a specific unit test case?

Background
I am aware of the principles of TDD (Test Driven Development) and unit testing, as well as of different coverage metrics. Currently, I am working on a Linux C/C++ project where 100% branch coverage should be reached.
Question
Does anybody know a technique/method to automatically identify the unit test cases that contribute most to reaching a specific coverage goal? Each unit test could then be associated with a contribution rate (in percent). With these numbers, unit test cases could be ordered by their contribution rate.
A greedy algorithm can help here. In simple words:
1. From all tests, select the one with the highest coverage.
2. Calculate the coverage delta between each remaining candidate and the tests already selected.
3. Pick the candidate that gives the biggest delta.
4. Repeat from step 2 until all tests are in the ranking.
As a result you'll get a sorting that looks like the one generated by Squish Coco for GNU coreutils:
Typically the benefit of each extra test goes down the more tests you add. Some of them may even contribute nothing to the total coverage.
A good use case for this sorting is an optimal execution order for smoke tests that have only limited time to run. For complete testing you should always run the whole suite, of course.
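The greedy ranking described above can be sketched in a few lines, assuming per-test coverage is already available as a set of covered items (branch ids, for example). The function name and input format are assumptions of this sketch, not the output format of any particular coverage tool.

```python
def rank_tests_by_contribution(coverage):
    """Rank tests greedily by how many not-yet-covered items each one adds.

    coverage maps a test name to the set of items (e.g. branch ids) it covers.
    Returns a list of (test name, newly covered item count) in ranking order.
    """
    remaining = dict(coverage)
    covered = set()
    ranking = []
    while remaining:
        # Candidate with the biggest delta; ties go to the alphabetically first name.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gained = remaining.pop(best) - covered
        covered |= gained
        ranking.append((best, len(gained)))
    return ranking


example = {
    "test_a": {1, 2, 3},
    "test_b": {3, 4},
    "test_c": {1, 2},
    "test_d": {4, 5, 6, 7},
}
ranking = rank_tests_by_contribution(example)
```

Dividing each count by the total number of coverable items gives the contribution rate in percent; here `test_b` and `test_c` end up with zero because the first two tests already cover everything they touch.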

How should I unit-test digital filters?

First the question(s):
How should I write unit tests for a digital filter (band-pass/band-stop) in software? What should I be testing? Is there any sort of canonical test suite for filtering?
How to select test inputs, generate expected outputs, and define "conformance" in a way that I can say the actual output conforms to expected output?
Now the context:
The application I am developing (electromyographic signal acquisition and analysis) needs to use digital filtering, mostly band-pass and band-stop filtering (C#/.Net in Visual Studio).
The previous version of our application has these filters implemented with some legacy code we could use, but we are not sure how mathematically correct it is, since we don't have unit-tests for them.
Besides that we are also evaluating Mathnet.Filtering, but their unit test suite doesn't include subclasses of OnlineFilter yet.
We are not sure how to evaluate one filtering library over the other, and the closest we got is to filter some sine waves to eyeball the differences between them. That is not a good approach regarding unit tests either, which is something we would like to automate (instead of running scripts and evaluating the results elsewhere, even visually).
I imagine a good test suite should test something like:
Linearity and Time-Invariance: how should I write an automated test (with a boolean, "pass or fail" assertion) for that?
Impulse response: feeding an impulse to the filter, taking its output, and checking if it "conforms to expected", and in that case:
How would I define expected response?
How would I define conformance?
Amplitude response of sinusoidal input;
Amplitude response of step / constant-offset input;
Frequency Response (including Half-Power, Cut-off, Slope, etc.)
I could not be considered an expert in programming or DSP (far from it!) and that's exactly why I am cautious about filters that "seem" to work well. It has been common for us to have clients questioning our filtering algorithms (because they need to publish research where data was captured with our systems), and I would like to have formal proof that the filters are working as expected.
DISCLAIMER: this question was also posted on DSP.StackExchange.com
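As an illustration of how the linearity, time-invariance and impulse-response checks listed in the question could become pass/fail assertions, here is a sketch in Python. The `fir_filter` function is a hypothetical stand-in for the filter under test (a plain moving average with zero initial state), and "conformance" is taken to mean element-wise agreement within a small tolerance; both are assumptions of this example.

```python
import math


def fir_filter(taps, signal):
    """Causal FIR filter with zero initial state (stand-in for the real filter)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def all_close(a, b, tol=1e-9):
    """Conformance check: same length and every sample within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


TAPS = [0.25, 0.25, 0.25, 0.25]
x1 = [math.sin(0.2 * n) for n in range(64)]
x2 = [math.cos(0.05 * n) for n in range(64)]

# Linearity: filter(a*x1 + b*x2) must equal a*filter(x1) + b*filter(x2).
a, b = 2.0, -3.0
mixed = fir_filter(TAPS, [a * u + b * v for u, v in zip(x1, x2)])
separate = [a * u + b * v for u, v in zip(fir_filter(TAPS, x1), fir_filter(TAPS, x2))]
linear_ok = all_close(mixed, separate)

# Time invariance: delaying the input by d samples delays the output by d samples.
d = 5
delayed_out = fir_filter(TAPS, [0.0] * d + x1[:-d])
shift_ok = all_close(delayed_out, [0.0] * d + fir_filter(TAPS, x1)[:-d])

# Impulse response: for an FIR filter it is the taps followed by zeros.
impulse_ok = all_close(fir_filter(TAPS, [1.0] + [0.0] * 7), TAPS + [0.0] * 4)
```

For an IIR filter the expected impulse response would have to come from an independent reference (an analytic formula or a trusted tool), and the tolerance would need to be chosen with the filter's numerical behavior in mind.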

TDD with diagrams

I have an app which draws a diagram. The diagram follows a certain schema,
e.g. shape X goes within shape Y, shapes {X, Y} belong to a group P ...
The diagram can get large and complicated (think of a circuit diagram).
What is a good approach for writing unit tests for this app?
1. Find out where the complexity in your code is.
2. Separate it out from the untestable visual presentation.
3. Test it.
If you don't have any non-visual complexity, you are not writing a program, you are producing a work of art.
Unless you are using a horribly buggy compiler or something, I'd avoid any tests that boil down to 'test source code does what it says it does'. Any test that's functionally equivalent to:
assertEquals(hash(stripComments(loadSourceCode())), 0x87364f3234);
can be deleted without loss.
It's hard to write defined unit tests for something visual like this unless you really understand the exact sequence of API calls that are going to be built.
To test something "visual" like this, you have three parts.
A "spike" to get the proper look, scaling, colors and all that. In some cases, this is almost the entire application.
A "manual" test that creates some final images to be sure they look correct to someone's eye. There's no easy way to test this except by actually looking at the actual output. This is hard to automate.
Mock the graphics components to be sure your application calls the graphics components properly.
When you make changes, you have to run both tests: Are the API calls all correct? and Does that sequence of API calls produce the image that looks right?
You can -- if you want to really burst a brain cell -- try to create a PNG file from your graphics and test to see if the PNG file "looks" right. It's hardly worth the effort.
As you go forward, your requirements may change. In this case, you may have to rewrite the spike first and get things to look right. Then, you can pull out the sequence of API calls to create automated unit tests from the spike.
One can argue that creating the spike violates TDD. However, the spike is designed to create a testable graphics module. You can't easily write the test cases first because the test procedure is "show it to a person". It can't be automated.
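The "mock the graphics components" part might look like the following sketch, which uses Python's standard `unittest.mock`. The `draw_nested` routine and the `draw_rect` method are hypothetical; a real application would assert against whatever calls its own graphics API exposes.

```python
from unittest.mock import Mock, call


def draw_nested(canvas, outer, inner):
    """Hypothetical drawing routine: draws shape Y, then shape X inside it."""
    canvas.draw_rect(*outer)
    canvas.draw_rect(*inner)


canvas = Mock()  # records every call made on it instead of drawing anything
draw_nested(canvas, outer=(0, 0, 100, 100), inner=(10, 10, 20, 20))

expected_calls = [
    call.draw_rect(0, 0, 100, 100),
    call.draw_rect(10, 10, 20, 20),
]
calls_match = canvas.mock_calls == expected_calls
```

This only verifies that the sequence of API calls is the one captured from the spike; whether that sequence looks right on screen still needs the manual check.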
You might consider first converting the initial input data into some intermediate format, that you can test. Then you forward that intermediate format to the actual drawing function, which you have to test manually.
For example when you have a program that inputs percentages and outputs a pie chart, then you might have an intermediate format that exactly describes the dimensions and position of each sector.
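A small sketch of that intermediate format for the pie chart case, assuming each sector is described by a start angle and a sweep angle in degrees. The function name and the tuple layout are choices made for this example.

```python
def pie_sectors(percentages):
    """Return a (start_angle, sweep_angle) pair in degrees for each percentage."""
    sectors = []
    start = 0.0
    for pct in percentages:
        sweep = 360.0 * pct / 100.0
        sectors.append((start, sweep))
        start += sweep
    return sectors


sectors = pie_sectors([50, 25, 25])
```

The drawing function then only has to turn these pairs into graphics calls, so the arithmetic can be unit tested without rendering anything.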
You've described a data model. The application presumably does something, rather than just sitting there with some data in memory. Write tests which exercise the behaviour of the application and verify the outcome is what is expected.

What do you do to test methods that produce complicated object graphs?

I'm a controls developer and a relative newbie to unit testing. Almost daily, I fight the attitude that you cannot test controls because of the UI interaction. I'm producing a demonstration control to show that it's possible to dramatically reduce manual testing if the control is designed to be testable. Currently I've got 50% logic coverage, but I think I could bump that up to 75% or higher if I could find a way to test some of the more complicated parts.
For example, I have a class with properties that describe the control's state and a method that generates a WPF PathGeometry object made of several segments. The implementation looks something like this:
internal PathGeometry CreateOuterGeometry()
{
    double arcRadius = OuterCoordinates.Radius;
    double sweepAngle = OuterCoordinates.SweepAngle;
    ArcSegment outerArc = new ArcSegment(...);
    LineSegment arcEndToCenter = new LineSegment(...);
    PathFigure fig = new PathFigure();
    // configure figure and add segments...
    PathGeometry outerGeometry = new PathGeometry();
    outerGeometry.Figures.Add(fig);
    return outerGeometry;
}
I've got a few other methods like this that account for a few hundred blocks of uncovered code, an extra 25% coverage. I originally planned to test these methods, but rejected the notion. I'm still a unit testing newbie, and the only way I could think of to test the code would be several methods like this:
void CreateOuterGeometry_AngleIsSmall_ArcSegmentIsCorrect()
{
    ClassUnderTest classUnderTest = new ClassUnderTest();
    // configure the class under test...
    ArcSegment expectedArc = new ArcSegment(...); // generate expected arc...
    PathGeometry geometry = classUnderTest.CreateOuterGeometry();
    // Segments live on a PathFigure, so index into the first figure.
    ArcSegment arc = (ArcSegment)geometry.Figures[0].Segments[0];
    Assert.AreEqual(expectedArc, arc);
}
The test itself looks fine; I'd write one for each expected segment. But I had some problems:
Do I need tests to verify "Is the first segment an ArcSegment?" In theory the test tests this, but shouldn't each test only test one thing? This sounds like two things.
The control has at least six cases for calculation and four edge cases; this means for each method I need at least ten tests.
During development I changed how the various geometries were generated several times. This would cause me to have to rewrite all of the tests.
The first problem gave me pause because it seemed like it might inflate the number of tests. I thought I might have to test things like "Were there x segments?" and "Is segment n the right type?", but now that I've thought more I see that there's no branching logic in the method so I only need to do those tests once. The second problem made me more confident that there would be much effort associated with the test. It seems unavoidable. The third problem compounds the first two. Every time I changed the way the geometry was calculated, I'd have to edit an estimated 40 tests to make them respect the new logic. This would also include adding or removing tests if segments were added or removed.
Because of these three problems, I opted to write an application and manual test plan that puts the control in all of the interesting states and asks the user to verify it looks a particular way. Was this wrong? Am I overestimating the effort involved with writing the unit tests? Is there an alternative way to test this that might be easier? (I'm currently studying mocks and stubs; it seems like it'd require some refactoring of the design and end up being approximately as much effort.)
Use dependency injection and mocks.
Create interfaces for ArcSegmentFactory, LineSegmentFactory, etc., and pass a mock factory to your class. This way, you'll isolate the logic that is specific to this object (this should make testing easier), and won't be depending on the logic of your other objects.
About what to test: you should test what's important. You probably have a timeline in which you want to have things done, and you probably won't be able to test every single thing. Prioritize stuff you need to test, and test in order of priority (considering how much time it will take to test). Also, when you've already made some tests, it gets much easier to create new tests for other stuff, and I don't really see a problem in creating multiple tests for the same class...
About the changes: that's what tests are for. They let you change things without fearing that your change will bring chaos to the world.
You might try writing a control generation tool that generates random control graphs, and test those. This might yield some data points that you might not have thought of.
In our project, we use JUnit to perform tests which are not, strictly speaking, unit tests. We find, for example, that it's helpful to hook up a blank database and compare an automatic schema generated by Hibernate (an Object-Relational Mapping tool) to the actual schema for our test database; this helps us catch a lot of issues with wrong database mappings. But in general... you should only be testing one method, on one class, in a given test method. That doesn't mean you can't do multiple assertions against it to examine various properties of the object.
My approach is to convert the graph into a string (one segment per line) and compare this string to an expected result.
If you change something in your code, tests will start to fail but all you need to do is to check that the failures are in the right places. Your IDE should offer a side-by-side diff for this.
When you're confident that the new output is correct, just copy it over the old expected result. This will make sure that a mistake won't go unnoticed (at least not for long), the tests will still be simple and they are quick to fix.
Next, if you have common path parts, then you can put them into individual strings and build the expected result of a test from those parts. This allows you to avoid repeating yourself (and if the common part changes, you just have to update a single place for all tests).
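A sketch of the string-comparison approach, with one segment per line and a shared part reused in the expected text. `build_outer_geometry` is a hypothetical stand-in for the method under test, and the tuple representation of segments is an assumption of this example.

```python
import math


def build_outer_geometry(radius, sweep_degrees):
    """Stand-in for the method under test; returns (kind, numbers...) tuples."""
    angle = math.radians(sweep_degrees)
    end_x = radius * math.cos(angle)
    end_y = radius * math.sin(angle)
    return [("arc", end_x, end_y, radius), ("line", 0.0, 0.0)]


def describe(segments):
    """One line per segment, with fixed precision so the text is stable."""
    lines = []
    for kind, *numbers in segments:
        lines.append(kind + " " + " ".join(f"{value:.2f}" for value in numbers))
    return "\n".join(lines)


# Common path part shared by several expected results.
CLOSING_LINE = "line 0.00 0.00"

actual = describe(build_outer_geometry(radius=10.0, sweep_degrees=90.0))
expected = "\n".join(["arc 0.00 10.00 10.00", CLOSING_LINE])
```

Rounding in `describe` keeps tiny floating-point differences out of the diff; when the geometry logic changes on purpose, the new text is reviewed once and pasted in as the new expected value.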
If I understand your example correctly, you were trying to find a way to test whether a whole bunch of draw operations produce a given result.
Instead of human eyes, you could have produced a set of expected images (a snapshot of verified "good" images), and created unit tests which use the draw operations to create the same set of images and compare the result with an image comparison. This would allow you to automate the testing of the graphic operations, which is what I understand your problem to be.
The textbook way to do this would be to move all the business logic to libraries or controllers which are called by a 1 line method in the GUI. That way you can unit test the controller or library without dealing with the GUI.

Automated testing a game

Question
How would you go about adding automated testing to a game?
I believe you can unit test a lot of the game engine's functionality (networking, object creation, memory management, etc.), but is it possible to automatically test the actual game itself?
I'm not talking about gameplay elements (like Protoss would beat Zerg in map X), but I'm talking about the interaction between the game and the engine.
Introduction
In game development, the engine is just a platform for the game. You could think of the game engine as an OS and the game as a software the OS would run. The game could be a collection of scripts or an actual subroutine inside the game engine.
Possible Answers
My idea is this:
You would need an engine that is deterministic. This means that given one set of inputs, the output would be exactly the same every time. This would include seeding the random generator with the same value.
Then, create a bare-bone level which contains a couple of objects the avatar/user can interact with. Start small and then add objects into the level as more interactions are developed.
Create a script which follows a path (tests pathfinding) and interacts with the different objects (store the result or expected behavior). This script would be your automated test. After a certain amount of time (say, one week), run the script along with your engine's unit tests.
This post at Games From Within might be relevant/interesting.
Riot Games has an article on using automated testing for League of Legends (LoL), a multiplayer online RTS game.
According to the developers, there are many changes to both the game code and game balance every day. They built a Python test framework that is basically a simpler game client that sends commands to the Continuous Integration server that is running an instance of LoL's game server. The server then sends the test framework the effect of the command, allowing the response to be tested.
The framework provides an event queue that records the events, data, and effect from a particular point in time. The article calls this a "snapshot".
The article described an example of a unittest for a spell:
Setup
1. Give a character the ability.
2. Spawn an enemy character in the midlane (a location on the map).
3. Spawn a creep in the midlane. (In the context of LoL, creeps are weak non-controllable characters that are part of each team's army. They are basically cannon fodder and are a source of experience and gold for the enemy team. But if left unchecked, they can overwhelm the opposing team.)
4. Teleport the character to the midlane.
Execute
1. Take a snapshot of all the variables (e.g. the current life from the player, enemy and normal characters).
2. Cast the spell.
3. Activate the spell's effects (for example, there are some spells that will proc on hit) on an enemy character.
4. Reset the spell's cooldown so it can be cast again immediately.
5. Cast the spell.
6. Activate the spell's effects on a creep (in the context of LoL, most spells have different calculations when used on creeps).
7. Take another snapshot.
Verify
Starting from the first snapshot, replay the events and assert that the expected results (from a game designer's point of view) are correct. Examples of events that can be verified: the damage is within the range of the spell's damage (LoL uses random numbers to give variance to attacks), damage is properly resisted when comparing a player character with a creep, and spells are cast within their effective range.
The article shows that a video of the test can be extracted when the test server is viewed from a normal game client.
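To make the setup/execute/verify shape concrete, here is a toy sketch of a snapshot-based spell test. It is not Riot's framework or API; `Unit`, `cast_spell`, `snapshot` and all the numbers are hypothetical, chosen only to show how a random damage value can be checked against a range rather than an exact number.

```python
import random


class Unit:
    """Hypothetical game unit with health and a damage resistance fraction."""

    def __init__(self, name, health, resistance):
        self.name = name
        self.health = health
        self.resistance = resistance


def cast_spell(rng, target, min_damage=80.0, max_damage=120.0):
    """Deal random damage in [min_damage, max_damage], reduced by resistance."""
    dealt = rng.uniform(min_damage, max_damage) * (1.0 - target.resistance)
    target.health -= dealt
    return dealt


def snapshot(units):
    """Record the current health of every unit by name."""
    return {unit.name: unit.health for unit in units}


# Setup
rng = random.Random(7)
enemy = Unit("enemy", health=1000.0, resistance=0.25)
creep = Unit("creep", health=400.0, resistance=0.0)

# Execute
before = snapshot([enemy, creep])
cast_spell(rng, enemy)
cast_spell(rng, creep)
after = snapshot([enemy, creep])

# Verify: compare the snapshots against the designer's expected ranges
enemy_loss = before["enemy"] - after["enemy"]
creep_loss = before["creep"] - after["creep"]
```

The assertions then check that `enemy_loss` falls between 60 and 90 (the 80 to 120 range after 25% resistance) and that `creep_loss` falls between 80 and 120.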
Values are so random within the gameplay aspects of development that it would be a far-fetched idea to test for absolute values.
But we can test deterministic values. For example, a unit test might have Guybrush Threepwood move toward a door (pathfinding), try to open the door (use command), fail because he doesn't have a key in his inventory (feedback), pick up the door key (pathfinding + inventory management) and then finally open the door.
All of these paths are deterministic. With this unit test, I can refactor the memory manager and if it somehow broke the inventory management routine, the unit test would fail.
This is just one idea for unit testing in games. I would love to know other ideas, hence, the motivation for this post.
I did something similar to your idea once and it was very successful, though I suspect it is really more of a system test than a unit test. As you suggest your random number generator must be seeded with the same value, and must produce an identical sequence each time.
The game ran on 50hz cycles, so timing was not an issue. I had a system that would record mouse clicks and locations, and used this to manually generate a 'script' which could be replayed to produce the same results. By removing the timing delays and turning off the graphic generation an hour of gameplay could be replicated in a few seconds.
The biggest problem was that changes to the game design would invalidate the script.
If your barebones room contained logic that was independent of the general gameplay, then it could work very well. The engine could start up without any UI and start the script as soon as initialisation is complete. Testing for crashes along the way would be simple, but tests such as checking that the characters end up in the correct positions would be more complex. If recording the scripts is simple enough, which it was in my system, then they can be updated very easily, and special scripts to test specialised behavior can be set up very quickly. My system had the added advantage that it could be used during game testing, with the exact sequence of events recorded to make bug fixing easier.
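The record-and-replay idea can be sketched as follows, assuming the game advances in fixed steps and takes all randomness from one seeded generator. `Game`, its commands and `replay` are hypothetical and far simpler than a real engine; the sketch only shows why the same seed plus the same script gives the same end state.

```python
import random


class Game:
    """Hypothetical fixed-step game whose randomness comes from one seeded RNG."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.x = 0
        self.gold = 0

    def step(self, command):
        if command == "left":
            self.x -= 1
        elif command == "right":
            self.x += 1
        elif command == "loot":
            self.gold += self.rng.randint(1, 6)  # random reward

    def state(self):
        return (self.x, self.gold)


def replay(seed, script):
    """Run a recorded script with no UI or timing delays; return the end state."""
    game = Game(seed)
    for command in script:
        game.step(command)
    return game.state()


SCRIPT = ["right", "right", "loot", "left", "loot"]
first_run = replay(99, SCRIPT)
second_run = replay(99, SCRIPT)
```

A regression test would store the end state (or intermediate states) from a known-good build and compare later replays against it, which is also why a design change that alters the script's meaning invalidates the recording.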
An article from Power of Two Games (Games From Within) was mentioned in another answer already, but I suggest reading everything (or nearly everything) there, as the articles are all really well-written and apply directly to games development. The article on Assert is particularly good. You can also visit their previous website at Games From Within, which has a lot written about Test Driven Development, which is unit testing taken to the extreme.
The Power of Two guys are the ones who implemented UnitCpp, a pretty well-regarded unit testing framework. Personally, I prefer WinUnit.
If you are testing the rendering engine, I guess you could render specific test scenes, take screen captures and compare them to reference test renderings. That way you can detect whether changes in the engine break anything visually. You can write similar tests for the sound engine, or even animation (by comparing a series of frames).
If you want to test game logic or scene progress you can do this by testing various conditions on the scripting variables (assuming you are using scripting to implement most of the scene and story aspects).
If you're using XNA (the idea could be extrapolated to other frameworks of course), you could use an in-game unit test framework that lets you access the game's state in the unit test. One such framework is Scurvy.Test :-)
http://flea.sourceforge.net/gameTestServer.pdf
This is an interesting discussion on implementing a full-blown functional tester in a game.
The term "unit testing" implies that a "unit" is being tested. This is one thing. If you're doing higher-level testing (e.g. several systems at once), usually this is called functional testing. It is possible to unit test much of a game, however you can't really test for fun.
Determinism isn't necessary, as long as your tests can be fuzzy. E.g. "did the character get hurt" as opposed to "did the character lose 14.7 hitpoints".
I have written a paper on that topic -
http://download.springer.com/static/pdf/722/art%253A10.7603%252Fs40601-013-0010-4.pdf?auth66=1407852969_87bc2e71ad5228b36738f0237084ebe5&ext=.pdf
This doesn't really answer your question, but I was listening to a podcast on Pex from Microsoft, which does something similar to the solution you're proposing, and I remember thinking that it would be really interesting to see whether it could test games. I don't know if it would help you specifically, but perhaps you could take a look at some of the ideas they use and apply them to your unit testing.