There are several places where one has to convert one data object into another - for example, incoming data from a web service or a REST service into an object that is persistable.
Is there a way to unit test that all incoming data gets filled into the right places of the "outgoing" objects, without copying the converter logic inside the test?
If the fields are all named the same, and one is feeling adventurous, reflection could do some of the work, but I don't feel like going down that path.
Acceptance tests won't catch a bug if, say, a Person that has a name and a firstName gets converted into a Person where name == firstName due to some copy-and-paste mistake.
So right now I just skip testing object/model conversion and instead take a really good look at my converter.
Does anyone have an idea of how to do this differently?
If you need to test that multiplication works, you should not replicate the multiplication logic. Define test data that you know is correct, and test that the multiplication is OK.
assert( 4*5, 20 )
and not
assert( 4*5, 4*5 )
Here the test data are 4, 5, and 20, and the logic that ties them together is the multiplication. The same principle holds in your case: define test data, and test that the conversion produces the right results.
(As you point out, making the tests themselves generic with reflection, etc., defeats the purpose of testing.)
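Applied to the converter case, that could look like the following minimal sketch (the PersonDto class and convert function here are hypothetical stand-ins for your own types):

```python
# Hypothetical stand-in for the DTO coming from the web service.
class PersonDto:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

def convert(dto):
    # The converter under test: copies each field to the right place.
    return {"firstName": dto.first_name, "lastName": dto.last_name}

# Test data chosen so a swapped field is detectable: first and last name
# must differ, just as the multiplication test asserts 20, not 4*5.
entity = convert(PersonDto("Ada", "Lovelace"))
assert entity["firstName"] == "Ada"
assert entity["lastName"] == "Lovelace"
```

Note that the expected values are written out literally; the test never calls the converter's own logic to compute them.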
I am new to VVC and I am going through the reference software's code trying to understand it. I have encoded and decoded videos using the reference software. Now I want to extract information from the bitstream: I want to know the number of bits in each macroblock. I am not sure which class I should be working with; for now I am looking at mv.cpp, QuantRDOQ.cpp, and TrQuant.cpp.
I am afraid of messing the code up completely, as I don't know where to add what lines of code.
[Pictures: result after calculating and displaying the difference]
P.S. The linked pictures are from after my problem was solved; I attached them because of my query in the comments.
As the error says, getNumBins() is not supported by the CABAC estimator. So you should make sure you call it only during the actual encoding, and not during the RDO (rate-distortion optimization) search.
This should do the job:
unsigned before = 0, after = 0, diff = 0;
if (isEncoding())
{
    before = m_BinEncoder.getNumBins();
}
coding_unit( cu, partitioner, cuCtx );
if (isEncoding())
{
    after = m_BinEncoder.getNumBins();
    diff  = after - before;
}
The simplest solution that I'm aware of is at the encoder side.
The trick is to compute the difference in the number of written bits "before" and "after" encoding a Coding Unit (CU), aka macroblock. This happens in the CABACWriter.cpp file.
You should go to the coding_tree() function, where the coding_unit() function is called; it is responsible for context-coding all syntax elements in the current CU.
There, you may call the function getNumBins() twice: once before and once after coding_unit(). The difference of the two values should do the job for you.
I am pretty new to coding and unit tests.
I am writing a unit test for a method that converts a string to a date.
What do I assert for a positive test?
It goes like below:
String s1 = "11/11/2018";
Date returnedValue = Class.ConvertToDate(s1);
assert(????);
I am not sure what methods the Date class in your code snippet provides, so I will assume that it has the following methods: getDayNrInMonth, which returns the day as an int value starting from 1; getMonthNrInYear, which returns the month as an int starting from 1; and getYearNr, which returns the year as an int value (let's just ignore that there are different calendar systems, and assume all of this is for the Gregorian calendar).
The important question you would have to answer first is what the goal of the test would be. There are several possible goals that you could test for, which means that you will probably end up writing several tests. Let's assume you want to test that the year part was converted correctly. Then the assertion could look as follows:
assert(returnedValue.getYearNr() == 2018);
Most likely, however, your test framework will provide something better for you here, possibly a function assertEquals, which you could then use as follows:
assertEquals(returnedValue.getYearNr(), 2018);
If in contrast you want to check that the month is converted correctly, you will immediately realize that the test example you have chosen is not ideal:
assertEquals(returnedValue.getMonthNrInYear(), 11);
As both day and month are 11 in your example, this test would not be very reliable. So to test the correct conversion of the day and the month, the input strings should be chosen to have distinguishable values for the two.
While you move on with learning to program and to test, you will find that there are many more aspects that could be considered. The links in the comments can help you. However, hopefully the above points support you in taking the next steps.
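To make those points concrete, here is a minimal sketch in Python (convert_to_date is a hypothetical wrapper around standard date parsing; your actual class and date format may differ):

```python
from datetime import date, datetime

def convert_to_date(s):
    # Hypothetical converter under test: parses "MM/DD/YYYY" strings.
    return datetime.strptime(s, "%m/%d/%Y").date()

# Positive test: assert each component against hand-computed values.
# Day and month deliberately differ, so a swapped field is caught.
returned = convert_to_date("11/12/2018")
assert returned == date(2018, 11, 12)
assert returned.year == 2018
assert returned.month == 11
assert returned.day == 12
```

The expected values are literals worked out by hand, never computed by re-running the conversion inside the test.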
What I am trying to do:
I am trying to take a list of terms and distinguish which domain they are coming from. For example "intestine" would be from the anatomical domain while the term "cancer" would be from the disease domain. I am getting these terms from different ontologies such as DOID and FMA (they can be found at bioportal.bioontology.org)
The problem:
I am having a hard time working out the best way to implement this. Currently I am naively taking the terms from the ontologies DOID and FMA and removing from the DOID list any term that also appears in the FMA list, which we know is anatomical. (The DOID list contains terms that may be partly anatomical, such as "colon carcinoma": colon being anatomical and carcinoma being disease.)
Thoughts:
I was thinking that I can get root words, prefixes, and postfixes, for the different term domains and try and match it to the terms in the list. Another idea is to take more information from their ontology such as meta data or something and use this to distinguish between the terms.
Any ideas are welcome.
As a first run, you'll probably have the best luck with bigrams. As an initial hypothesis, diseases are usually noun phrases, and usually have a very English-specific structure where NP -> N N, like "liver cancer", which means roughly the same thing as "cancer of the liver." Doctors tend not to use the latter, while the former should be caught with bigrams quite well.
Use the two ontologies you have there as starting points to train some kind of bigram model. Like Rcynic suggested, you can count them up and derive probabilities. A Naive Bayes classifier would work nicely here. The features are the bigrams; classes are anatomy or disease. sklearn has Naive Bayes built in. The "naive" part means, in this case, that all your bigrams are independent of each other. This assumption is fundamentally false, but it works well in a lot of circumstances, so we pretend it's true.
This won't work perfectly. As it's your first pass, you should be prepared to probe the output to understand how it derived the answer it came upon and find cases that failed on. When you find trends of errors, tweak your model, and try again.
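As a rough illustration of the idea, here is a toy standard-library sketch of a bigram Naive Bayes classifier (in practice you would use sklearn as suggested above; the training phrases below are made up and merely stand in for terms from FMA and DOID):

```python
import math
from collections import Counter

def bigrams(phrase):
    # Split a term into word bigrams; single words become a lone pair.
    words = phrase.lower().split()
    return list(zip(words, words[1:])) or [(words[0], "")]

# Made-up training terms standing in for the FMA (anatomy) and
# DOID (disease) ontology lists.
training = {
    "anatomy": ["small intestine", "large intestine", "liver lobe"],
    "disease": ["liver cancer", "colon cancer", "lung cancer"],
}

counts = {c: Counter(bg for p in ps for bg in bigrams(p))
          for c, ps in training.items()}
totals = {c: sum(cnt.values()) for c, cnt in counts.items()}
vocab = len({bg for cnt in counts.values() for bg in cnt})

def classify(phrase):
    # Naive Bayes with add-one smoothing; each bigram treated as
    # independent (the "naive" assumption). Class priors are equal
    # here, so they are omitted.
    def log_prob(c):
        return sum(math.log((counts[c][bg] + 1) / (totals[c] + vocab))
                   for bg in bigrams(phrase))
    return max(counts, key=log_prob)

print(classify("colon cancer"))  # prints "disease"
```

With only a handful of phrases this is fragile, but the same counting-and-smoothing structure is what MultinomialNB computes at scale.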
I wouldn't recommend WordNet here. It wasn't written by doctors, and since what you're doing relies on precise medical terminology, it's probably going to add bizarre meanings. Consider this session with nltk.corpus.wordnet:
>>> from nltk.corpus import wordnet as wn
>>> from pprint import pprint
>>> livers = wn.synsets("liver")
>>> pprint([l.definition() for l in livers])
[u'large and complicated reddish-brown glandular organ located in the upper right portion of the abdominal cavity; secretes bile and functions in metabolism of protein and carbohydrate and fat; synthesizes substances involved in the clotting of the blood; synthesizes vitamin A; detoxifies poisonous substances and breaks down worn-out erythrocytes',
u'liver of an animal used as meat',
u'a person who has a special life style',
u'someone who lives in a place',
u'having a reddish-brown color']
Only one of these is really of interest to you. As a null hypothesis, there's an 80% chance WordNet will add noise, not knowledge.
The naive approach - what precision and recall is it getting you? If you set up a test case now, then you can track your progress as you apply more sophisticated methods.
I don't know what initial data set you are dealing with, but one thing to try is to get your hands on annotated documents (maybe use Mechanical Turk). The documents need to be tagged with the domains you're looking for: anatomical or disease.
Then "count and divide" will tell you how likely a word you encounter is to belong to a domain. From there, the next step can be to tweak some weights.
Another approach (going in a whole other direction) is using WordNet. I don't know if it will be useful for exactly your purposes, but it's a massive ontology, so it might help.
Python has bindings to use WordNet via nltk.
from nltk.corpus import wordnet as wn
wn.synsets('cancer')
which gives: [Synset('cancer.n.01'), Synset('cancer.n.02'), Synset('cancer.n.03'), Synset('cancer.n.04'), Synset('cancer.n.05')]
http://wordnetweb.princeton.edu/perl/webwn
Let us know how it works out.
Is there a way to unit test data flow in an SSIS package?
For example, testing a sort: verifying that the sort is properly done.
There is a unit testing framework for SSIS - see SSISUnit.
This is worth looking at, but it may not solve your problem. It is possible to unit test individual components at the control flow level using this framework, but it is not possible to isolate individual Data Flow Transformations - you can only test the whole Data Flow component.
One approach you could take is to redesign your package and break your Data Flow component down into multiple Data Flow components that can be individually tested. However, that will impact the performance of your package, because you will have to persist the data somewhere in between each data flow task.
You can also adopt this approach by using NUnit or a similar framework, using the SSIS api to load a package and execute an individual task.
SSISTester can tap the data flow between two components and save the data into a file. The output can be accessed in a unit test. For more information, look at ssistester.bytesoftwo.com. An example of how to use SSISTester to achieve this is given below:
[UnitTest("DEMO", "CopyCustomers.dtsx", DisableLogging=true)]
[DataTap(@"\[CopyCustomers]\[DFT Convert customer names]\[RCNT Count customers]", @"\[CopyCustomers]\[DFT Convert customer names]\[DER Convert names to upper string]")]
[DataTap(@"\[CopyCustomers]\[DFT Convert customer names]\[DER Convert names to upper string]", @"\[CopyCustomers]\[DFT Convert customer names]\[FFD Customers converted]")]
public class CopyCustomersFileAll : BaseUnitTest
{
    ...
    protected override void Verify(VerificationContext context)
    {
        ReadOnlyCollection<DataTap> dataTaps = context.DataTaps;

        DataTap dataTap = dataTaps[0];
        foreach (DataTapSnapshot snapshot in dataTap.Snapshots)
        {
            string data = snapshot.LoadData();
        }

        DataTap dataTap1 = dataTaps[1];
        foreach (DataTapSnapshot snapshot in dataTap1.Snapshots)
        {
            string data = snapshot.LoadData();
        }
    }
}
Short answer - not easily. Longer answer: yes, but you'll need lots of external tools to do it. One potential test would be to take a small sample of the data set, run it through your sort, and dump it to an Excel file. Take the same data set, copy it to an Excel spreadsheet, and manually sort it. Run a binary diff tool on the result of the dump from SSIS and your hand-sorted example. If everything checks out, it's right.
OTOH, unit testing the Sort in SSIS shouldn't be necessary, unless what you're really testing is the sort criteria selection. The sort should have been tested by MS before it was shipped.
I would automate the testing by having a known good file for appropriate inputs which is compared binarily with an external program.
I like to use data viewers when I need to see the data moving from component to component.
I have a method that, given an angle for North and an angle for a bearing, returns a compass point value from 8 possible values (North, NorthEast, East, etc.). I want to create a unit test that gives decent coverage of this method, providing different values for North and Bearing to ensure I have adequate coverage to give me confidence that my method is working.
My original attempt generated all possible whole number values for North from -360 to 360 and tested each Bearing value from -360 to 360. However, my test code ended up being another implementation of the code I was testing. This left me wondering what the best test would be for this such that my test code isn't just going to contain the same errors that my production code might.
My current solution is to spend time writing an XML file with data points and expected results, which I can read in during the test and use to validate the method but this seems exceedingly time consuming. I don't want to write a file that contains the same range of values that my original test contained (that would be a lot of XML) but I do want to include enough to adequately test the method.
How do I test a method without just reimplementing the method?
How do I achieve adequate coverage to have confidence in the method I am testing without having to have test points for all possible inputs and results?
Obviously, don't dwell too much on my specific example as this applies to many situations where there are complex calculations and ranges of data to be tested.
NOTE: I am using Visual Studio and C#, but I believe this question is language-agnostic.
First off, you're right, you do not want your test code to reproduce the same calculation as the code under test. Secondly, your second approach is a step in the right direction. Your tests should contain a specific set of inputs with the pre-computed expected output values for those inputs.
Your XML file should contain just a subset of the input data that you've described. Your tests should ensure that you can handle the extreme ranges of your input domain (-360, 360), a few data points just inside the ends of the range, and a few data points in the middle. Your tests should also check that your code fails gracefully when given values outside the input range (e.g. -361 and +361).
Finally, in your specific case, you may want to have a few more edge cases to make sure that your function correctly handles "switchover" points within your valid input range. These would be the points in your input data where the output is expected to switch from "North" to "Northwest" and from "Northwest" to "West", etc. (don't run your code to find these points, compute them by hand).
Just concentrating on these edge cases and a few cases in between the edges should greatly reduce the amount of points you have to test.
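A sketch of what such edge-case-focused tests might look like (the compass_point function and its angle conventions here are assumptions standing in for your actual method):

```python
POINTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def compass_point(north, bearing):
    # Hypothetical implementation under test: bearing measured relative
    # to the given north angle, each point covering a 45-degree slice
    # centred on it.
    angle = (bearing - north) % 360
    return POINTS[int(((angle + 22.5) % 360) // 45)]

# Hand-computed cases: range ends, midpoints, and switchover points
# just either side of the N/NE boundary (at 22.5 degrees).
cases = [
    (0, 0, "N"),
    (0, 22.4, "N"),    # just before the switchover
    (0, 22.5, "NE"),   # exactly at the switchover
    (0, 90, "E"),
    (90, 90, "N"),     # north itself rotated
    (-360, 360, "N"),  # extreme ends of the input range
]
for north, bearing, expected in cases:
    assert compass_point(north, bearing) == expected
```

Each expected value is computed by hand from the 45-degree-slice rule, never by running the code under test.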
You could possibly re-factor the method into parts that are easier to unit test and write the unit tests for the parts. Then the unit tests for the whole method only need to concentrate on integration issues.
I prefer to do the following.
Create a spreadsheet with right answers. However complex it needs to be is irrelevant. You just need some columns with the case and some columns with the expected results.
For your example, this can be big. But big is okay. You'll have an angle, a bearing and the resulting compass point value. You may have a bunch of intermediate results.
Create a small program that reads the spreadsheet and writes the simplified, bottom-line unittest cases. You want your cases stripped down to
def testCase215n(self):
    self.fixture.setCourse(215)
    self.fixture.setBearing(45)
    self.fixture.calculate()
    self.assertEquals("N", self.fixture.compass())
[That's Python, the same idea would hold for C#.]
The spreadsheet contains the one-and-only authoritative list of right answers. You generate code from this once or twice. Unless, of course, you find an error in your spreadsheet version and have to fix that.
I use a small Python program with xlrd and the Mako template generator to do this. You could do something similar with C# products.
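A minimal stand-in for such a generator, reading a CSV export instead of an actual spreadsheet (the column names and the generated fixture API are assumptions based on the example above):

```python
import csv
import io

# Stand-in for the spreadsheet export: course, bearing, expected point.
SPREADSHEET = """course,bearing,expected
215,45,N
0,90,E
"""

TEMPLATE = '''    def testCase{course}_{bearing}(self):
        self.fixture.setCourse({course})
        self.fixture.setBearing({bearing})
        self.fixture.calculate()
        self.assertEquals("{expected}", self.fixture.compass())
'''

def generate_tests(csv_text):
    # Emit one stripped-down test method per spreadsheet row,
    # as a string of source code to be written to a test module.
    rows = csv.DictReader(io.StringIO(csv_text))
    body = "".join(TEMPLATE.format(**row) for row in rows)
    return "class GeneratedCompassTests(unittest.TestCase):\n" + body

print(generate_tests(SPREADSHEET))
```

The spreadsheet remains the single authoritative list of right answers; the generated module is a disposable artifact you regenerate whenever the data changes.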
If you can think of a completely different implementation of your method, with completely different places for bugs to hide, you could test against that. I often do things like this when I've got an efficient, but complex implementation of something that could be implemented much more simply but inefficiently. For example, if writing a hash table implementation, I might implement a linear search-based associative array to test it against, and then test using lots of randomly generated input. The linear search AA is very hard to screw up and even harder to screw up such that it's wrong in the same way as the hash table. Therefore, if the hash table has the same observable behavior as the linear search AA, I'd be pretty confident it's correct.
Other examples would include writing a bubble sort to test a heap sort against, or using a known working sort function to find medians and comparing that to the results of an O(N) median finding algorithm implementation.
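The same idea in a few lines of Python: a sort-based median (standing in for the clever implementation under test) checked against a deliberately naive counting-based reference on random inputs:

```python
import random

def median_fast(values):
    # Stand-in for the efficient implementation under test.
    # (Here it just sorts; imagine a quickselect instead.)
    return sorted(values)[len(values) // 2]

def median_slow(values):
    # Deliberately simple reference: for each candidate, count how many
    # values fall below it. Hard to get wrong, and even harder to get
    # wrong in the same way as the fast version.
    k = len(values) // 2
    for v in values:
        smaller = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        if smaller <= k < smaller + equal:
            return v

random.seed(42)
for _ in range(200):
    data = [random.randint(-1000, 1000)
            for _ in range(random.randint(1, 50))]
    assert median_fast(data) == median_slow(data)
```

Because the two implementations share no structure, a bug would have to occur twice, independently and identically, to slip through.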
I believe that your solution is fine, despite using an XML file (I would have used a plain text file). But a more common tactic is to just test limit situations, using, in your case, entry values of -360, 360, -361, 361, and 0.
You could try orthogonal array testing to achieve all-pairs coverage instead of all possible combinations. This is a statistical technique based on the theory that most bugs occur due to interactions between pairs of parameters. It can drastically reduce the number of test cases you write.
Not sure how complicated your code is; if it is taking an integer in and dividing it up into 8 or 16 directions on the compass, it is probably only a few lines of code, yes?
You are going to have a hard time not re-writing your code to test it, depending on how you test it. Ideally you want an independent party to write the test code based on the same requirements, but without looking at or borrowing your code. This is unlikely to happen in most situations, and in this case it may be overkill.
In this specific case I would feed it each number in order from -360 to +360, and print the number and the result (to a text file, in a format that can be compiled into another program as a header file). Visually inspect that the direction changes at the desired inputs; this should be easy to check and validate. Now you have a table of inputs and valid outputs. Next, have a program randomly select from the valid inputs, feed them into your code under test, and see that the right answer comes out. Do a few hundred of these random tests. At some point you need to validate that numbers less than -360 or greater than +360 are handled per your requirements, either clipping or wrapping around, I assume.
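That workflow might be sketched as follows (dir_function is a hypothetical stand-in for the code under test; in practice the generated table would first be inspected by eye before being trusted):

```python
import random

def dir_function(angle):
    # Hypothetical code under test: map an angle in degrees to one of
    # 8 direction indices, wrapping out-of-range input.
    return int(((angle % 360) + 22.5) % 360 // 45)

# Step 1: generate the full table once and inspect it visually,
# checking that the direction changes at the desired inputs.
table = {angle: dir_function(angle) for angle in range(-360, 361)}

# Step 2: later runs pick random entries from the validated table
# and check that the code still agrees with it (a regression test).
random.seed(0)
for angle in random.sample(sorted(table), 100):
    assert dir_function(angle) == table[angle]
```

The one-time visual inspection is what breaks the circularity: after that, the table is frozen as the authoritative answer set.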
So I took a software testing class, and basically what you want is to identify the classes of inputs: all real numbers? All integers? Only positive, only negative, etc.? Then group the output actions: is 360 uniquely different from 359, or do they pretty much end up doing the same thing in the app? Once there, test a combination of inputs to outputs.
This all seems abstract and vague but until you provide the method code it's difficult to come up with a perfect strategy.
Another way is to do branch-level testing or predicate coverage testing. Code coverage isn't foolproof, but not covering all your code seems irresponsible.
One approach, probably one to apply in combination with other method of testing, is to see if you can make a function that reverses the method you are testing. In this case, it would take a compass direction(northeast, say), and output a bearing (given the bearing for north). Then, you could test the method by applying it to a series of inputs, then applying the function to reverse the method, and seeing if you get back the original input.
There are complications, particularly if one output corresponds to multiple inputs, but it may be possible in those cases to generate the set of inputs corresponding to a given output, and to test each member of that set (or a sample of its elements).
The advantage of this approach is that it doesn't rely on you being able to simulate the method manually, or create an alternative implementation of the method. If the reversal involves a different approach to the problem to that used in the original method, it should reduce the risk of making equivalent mistakes in both.
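A sketch of such a round-trip check (both functions here are hypothetical: compass_point stands in for the method under test, and bearing_for is the inverse, mapping a compass point back to the centre bearing of its slice, built without reusing the forward method's logic):

```python
POINTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def compass_point(north, bearing):
    # Hypothetical method under test.
    angle = (bearing - north) % 360
    return POINTS[int(((angle + 22.5) % 360) // 45)]

def bearing_for(north, point):
    # Inverse built from a different approach: each point's slice is
    # centred at index * 45 degrees, rotated by north.
    return (POINTS.index(point) * 45 + north) % 360

# Round trip: point -> centre bearing -> same point again.
for north in (0, 90, -45):
    for point in POINTS:
        assert compass_point(north, bearing_for(north, point)) == point
```

Since one point maps to a whole 45-degree slice of bearings, the inverse picks the slice centre; a fuller test would sample several bearings per slice.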
Pseudocode:
array colors = { red, orange, yellow, green, blue, brown, black, white }
for north = -360 to 361
for bearing = -361 to 361
theColor = colors[dirFunction(north, bearing)] // dirFunction is the one being tested
setColor (theColor)
drawLine (centerX, centerY,
centerX + (cos(north + bearing) * radius),
centerY + (sin(north + bearing) * radius))
Verify Resulting Circle against rotated reference diagram.
When North = 0, you'll get an 8-colored pie chart. As north varies + or -, the pie chart will look the same, but rotated around that many degrees. Test verification is a simple matter of making sure that (a) the image is a properly rotated pie chart and (b) there aren't any green spots in the orange area, etc.
This technique, by the way, is a variation on the world's greatest debugging tool: Have the computer draw you a picture of what =IT= thinks it's doing. (Too often, developers waste hours chasing what they think the computer is doing, only to find that it's doing something completely different.)