JUnit: avoid duplicate assertions

I am writing simple test cases for converting an Entity to a DTO and vice versa. The question is more about design: is it acceptable to leave duplication like in the code below, or is it better to create an external method for this assertion? Since I'm a Java newbie, can someone give me a hint - maybe some generic method? I don't want to use inheritance or any other abstraction for such a simple Entity and its DTO, because that would be much more code than just a few duplicated lines.
Here is how it looks now:
@Test
void addressToAddressDTO() {
    Address address = getAddress();
    AddressDTO addressDTO = addressMapper.addressToAddressDTO(address);
    assertAll("Check if values were properly bound",
        () -> {
            assertEquals(address.getCity(), addressDTO.getCity());
            assertEquals(address.getUserDetails().getFirstName(), addressDTO.getUserDetails().getFirstName());
            assertEquals(address.getUserDetails().getUser().getUsername(), addressDTO.getUserDetails().getUser().getUsername());
            assertEquals(address.getUserDetails().getContact().getEmail(), addressDTO.getUserDetails().getContact().getEmail());
            assertEquals(address.getUserDetails().getProfileImage().getImageUrl(), addressDTO.getUserDetails().getProfileImage().getImageUrl());
        });
}

@Test
void addressDTOtoAddress() {
    AddressDTO addressDTO = getAddressDTO();
    Address address = addressMapper.addressDTOtoAddress(addressDTO);
    assertAll("Check if values were properly bound",
        () -> {
            assertEquals(addressDTO.getCity(), address.getCity());
            assertEquals(addressDTO.getUserDetails().getFirstName(), address.getUserDetails().getFirstName());
            assertEquals(addressDTO.getUserDetails().getUser().getUsername(), address.getUserDetails().getUser().getUsername());
            assertEquals(addressDTO.getUserDetails().getContact().getEmail(), address.getUserDetails().getContact().getEmail());
            assertEquals(addressDTO.getUserDetails().getProfileImage().getImageUrl(), address.getUserDetails().getProfileImage().getImageUrl());
        });
}
My idea was to create something more generic like:
private <T, S> void assertObject(T expected, S actual) {
    assertAll("Check if values were properly bound",
        () -> {
            assertEquals(expected.getCity(), actual.getCity());
            assertEquals(expected.getUserDetails().getFirstName(), actual.getUserDetails().getFirstName());
            assertEquals(expected.getUserDetails().getUser().getUsername(), actual.getUserDetails().getUser().getUsername());
            assertEquals(expected.getUserDetails().getContact().getEmail(), actual.getUserDetails().getContact().getEmail());
            assertEquals(expected.getUserDetails().getProfileImage().getImageUrl(), actual.getUserDetails().getProfileImage().getImageUrl());
        });
}
but even though they are effectively the same objects, they have nothing in common. How can I achieve something interchangeable, so that Address and AddressDTO can each be used as either actual or expected?
EDIT
Following Aaron Digulla's answer I've made some changes; I hope it will help someone with the same doubts. If someone knows any other option, please post it in the comment section.
@Test
void addressToAddressDTO() {
    Address expected = getAddress();
    AddressDTO actual = addressMapper.addressToAddressDTO(expected);
    assertEquals(
        mergeAddressDataToString(expected),
        actual.getCity() + "," +
        actual.getUserDetails().getFirstName() + "," +
        actual.getUserDetails().getUser().getUsername() + "," +
        actual.getUserDetails().getContact().getEmail() + "," +
        actual.getUserDetails().getProfileImage().getImageUrl()
    );
}

@Test
void addressDTOtoAddress() {
    AddressDTO expected = getAddressDTO();
    Address actual = addressMapper.addressDTOtoAddress(expected);
    assertEquals(
        expected.getCity() + "," +
        expected.getUserDetails().getFirstName() + "," +
        expected.getUserDetails().getUser().getUsername() + "," +
        expected.getUserDetails().getContact().getEmail() + "," +
        expected.getUserDetails().getProfileImage().getImageUrl(),
        mergeAddressDataToString(actual)
    );
}

private String mergeAddressDataToString(Address address) {
    StringJoiner stringJoiner = new StringJoiner(",");
    stringJoiner.add(address.getCity());
    stringJoiner.add(address.getUserDetails().getFirstName());
    stringJoiner.add(address.getUserDetails().getUser().getUsername());
    stringJoiner.add(address.getUserDetails().getContact().getEmail());
    stringJoiner.add(address.getUserDetails().getProfileImage().getImageUrl());
    return stringJoiner.toString();
}

I usually write a custom toString() method for the object in the test or a test utility class when several tests need to share it. That way, I can check all values with a single assertEquals().
I can use newlines to make the assert more readable in an IDE:
assertEquals(
    "city=Foo\n" +
    "firstName=John\n" +
    "user=doe\n" +
    ....
    , toString(actual));
which also works nicely when you have to check a list of values:
...
    , list.stream().map(this::toString).collect(Collectors.joining("\n---\n")));
The big advantage here is that you get all mismatches at once. You can also tweak the toString() method to handle corner cases (like comparing only the number of elements for huge lists or rounding decimals).
It also makes the code easy to understand. In many cases, it also saves you the time of writing the code to fill all the expected objects you would otherwise need. Tests can even share expected strings when several tests should yield the same result.
When the test breaks because the output has changed, I can just select the whole string and replace it with the new output. The IDE will do the formatting for me.
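Applied to the Address example from the question, such a helper might look like this. This is just a sketch: the getter chain comes from the question, while the field labels and the AddressDTO overload are illustrative assumptions.

// Sketch of a test-utility toString() for the question's Address;
// an overload taking AddressDTO would be written the same way.
private String toString(Address address) {
    return "city=" + address.getCity() + "\n" +
           "firstName=" + address.getUserDetails().getFirstName() + "\n" +
           "user=" + address.getUserDetails().getUser().getUsername() + "\n" +
           "email=" + address.getUserDetails().getContact().getEmail() + "\n" +
           "imageUrl=" + address.getUserDetails().getProfileImage().getImageUrl();
}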

Your initial approach, aside from the duplication, uses an antipattern: you call assertAll, but still put all assertions into one block. Thus, after the first failing assertion, the block's execution is terminated. If you instead put each individual check into its own Executable, all checks will be performed even in case of a failure, and you will get more details about what failed and what did not. Granted, this is no longer a problem with the string-comparison approach.
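For the question's first test, the fix could look like this - a sketch reusing the getters from the question, with one Executable per field so that every check runs even when an earlier one fails:

assertAll("Check if values were properly bound",
    () -> assertEquals(address.getCity(), addressDTO.getCity()),
    () -> assertEquals(address.getUserDetails().getFirstName(),
            addressDTO.getUserDetails().getFirstName()),
    () -> assertEquals(address.getUserDetails().getUser().getUsername(),
            addressDTO.getUserDetails().getUser().getUsername()),
    () -> assertEquals(address.getUserDetails().getContact().getEmail(),
            addressDTO.getUserDetails().getContact().getEmail()),
    () -> assertEquals(address.getUserDetails().getProfileImage().getImageUrl(),
            addressDTO.getUserDetails().getProfileImage().getImageUrl()));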
Regarding the duplication, there is another way to avoid it in this particular case, that is, for the tests of the two conversion functions: you could make use of the fact that a conversion back and forth between the two types is the identity function:
Address address = getAddress();
AddressDTO addressDTO = addressMapper.addressToAddressDTO(address);
Address actual = addressMapper.addressDTOtoAddress(addressDTO);
assertEquals(address, actual);
This eliminates the comparison of individual elements. It may even be advantageous if the representations of the entity and the DTO change in a way that the attributes are no longer strictly equal but only have to be convertible back and forth. (Note that assertEquals(address, actual) relies on Address implementing a value-based equals().)
But each test now also relies on other methods of the class under test. That is not a problem in general: many tests, for example, rely on the constructor to work, which is fine as long as there are also tests for the constructor. Here, however, if the test fails, there are two possible locations that could be responsible, so finding the culprit requires more analysis.
Regarding the second option you have tried, namely creating strings and comparing them: In some scenarios such an approach may be helpful, but generally I am hesitant about going from structured data to strings. Assume you create a lot of tests using that pattern. Later, however, you realize that in the DTO some of the attributes have to be encoded differently than in the entity (I mentioned that scenario above). Suddenly, the conversion to strings does not work any more or becomes awkward.
With or without the strings approach: if you have more assertions where complete entity objects and DTOs have to be compared, I'd recommend writing your own assertion helper methods, such as assertEntityMatchesDto and assertDtoMatchesEntity, sketched below.
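A sketch of that helper pair, with the names taken from the paragraph above; the body is just the one-Executable-per-field assertAll shown earlier, so each mapping test becomes a one-liner:

private void assertEntityMatchesDto(Address entity, AddressDTO dto) {
    assertAll("entity -> DTO mapping",
        () -> assertEquals(entity.getCity(), dto.getCity())
        // ...one Executable per remaining field, as sketched above
    );
}

private void assertDtoMatchesEntity(AddressDTO dto, Address entity) {
    assertEntityMatchesDto(entity, dto); // plain equality checks are symmetric
}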
One final remark: There may be reasons not to compare all attributes of the objects in one test: This way, the tests may not be focused enough. Taking the example about encoding changes again: Imagine that you will have to change the representation of the email address in the DTO at some point in time. Then, if you always look at all attributes in your tests, all your tests will fail. If, instead, you have focused tests for individual attributes, such a change will - if done correctly - only affect tests focusing on the email attribute, but leave other tests intact.

Related

unit testing accuracy of function composition

I'm writing tests for an object that takes in an input, composes some functions together, runs the input through the composed function, and returns the result.
Here's a greatly-simplified set of objects and functions that mirrors my design:
type Result =
    | Success of string

let internal add5 x = x + 5

let internal mapResult number =
    Success (number.ToString())

type public InteropGuy internal (add, map) =
    member this.Add5AndMap number =
        number |> (add >> map)

type InteropGuyFactory() =
    member this.CreateInteropGuy () =
        new InteropGuy(add5, mapResult)
The class is designed to be used for C# interop which explains the structure, but this problem still can apply to any function under test that composes function parameters.
I'm having trouble finding an elegant way to keep the implementation details of the internal functions from creeping into the test conditions when testing the composing function - in other words, isolating one link in the chain instead of inspecting the output once the input is piped entirely through. If I simply inspect the output, then the tests for each function depend on the downstream functions working properly, and if the one at the end of the chain stops working, all of the tests will fail. The best I've been able to do is stub out a function to return a certain value, then stub out its downstream function, storing the input of the downstream function, and finally assert that the stored value equals the output of the stubbed function:
[<TestClass>]
type InteropGuyTests() =

    [<TestMethod>]
    member this.``Add5AndMap passes add5 result into map function``() =
        let add5 _ = 13
        let tempResult = ref 0
        let mapResult result =
            tempResult := result
            Success "unused result"
        let guy = new InteropGuy(add5, mapResult)
        guy.Add5AndMap 8 |> ignore
        Assert.AreEqual(13, !tempResult)
Is there a better way to do this or is this generally how to test composition in isolation? Design comments also appreciated.
The first question we should ask when encountering something like this is: why do we want to test this piece of code?
When the potential System Under Test (SUT) is literally a single statement, then which value does the test add?
AFAICT, there are only two ways to test a one-liner:
Triangulation
Duplication of implementation
Both are possible, but come with drawbacks, so I think it's worth asking whether such a method/function should be tested at all.
Still, assuming that you want to test the function, e.g. to prevent regressions, you can use either of these options.
Triangulation
With triangulation, you simply throw enough example values at the SUT to demonstrate that it works as the black box it's supposed to be:
open Xunit
open Swensen.Unquote

[<Theory>]
[<InlineData(0, "5")>]
[<InlineData(1, "6")>]
[<InlineData(42, "47")>]
[<InlineData(1337, "1342")>]
let ``Add5AndMap returns expected result`` (number : int, expected : string) =
    let actual = InteropGuyFactory().CreateInteropGuy().Add5AndMap number
    Success expected =! actual
The advantage of this example is that it treats the SUT as a black box, but the disadvantage is that it doesn't demonstrate that the SUT is a result of any particular composition.
Duplication of implementation
You can use Property-Based Testing to demonstrate (or, at least make very likely) that the SUT is composed of the desired functions, but it requires duplicating the implementation.
Since the functions are assumed to be referentially transparent, you can simply throw enough example values at both the composition and the SUT, and verify that they return the same value:
open FsCheck.Xunit
open Swensen.Unquote

[<Property>]
let ``Add5AndMap returns composed result`` (number : int) =
    let actual = InteropGuyFactory().CreateInteropGuy().Add5AndMap number
    let expected = number |> add5 |> mapResult
    expected =! actual
Is it ever interesting to duplicate the implementation in the test?
Often, it's not, but if the purpose of the test is to prevent regressions, it may be worthwhile as a sort of double-entry bookkeeping.

Does property based testing make you duplicate code?

I'm trying to replace some old unit tests with property-based testing (PBT), concretely with Scala and ScalaTest / ScalaCheck, but I think the problem is more general. The simplified situation is: I have a method I want to test:
def upcaseReverse(s:String) = s.toUpperCase.reverse
Normally, I would have written unit tests like:
assertEquals("GNIRTS", upcaseReverse("string"))
assertEquals("", upcaseReverse(""))
// ... corner cases I could think of
So, for each test, I write the output I expect, no problem. Now, with PBT, it'd be like:
property("strings are reversed and upper-cased") {
  forAll { (s: String) =>
    assert(upcaseReverse(s) == ???) // this is the problem right here!
  }
}
As I try to write a test that will be true for all String inputs, I find myself having to write the logic of the method again in the test. In this case the test would look like:
assert ( upcaseReverse(s) == s.toUpperCase.reverse)
That is, I had to write the implementation in the test to make sure the output is correct.
Is there a way out of this? Am I misunderstanding PBT, and should I be testing other properties instead, like:
"strings should have the same length as the original"
"strings should contain all the characters of the original"
"strings should not contain lower case characters"
...
That is also plausible but sounds rather contrived and less clear. Can anybody with more experience in PBT shed some light here?
EDIT: following @Eric's sources I got to this post, and there's exactly an example of what I mean (at "Applying the categories one more time"): to test the method Times (in F#):
type Dollar(amount:int) =
    member val Amount = amount
    member this.Add add =
        Dollar (amount + add)
    member this.Times multiplier =
        Dollar (amount * multiplier)
    static member Create amount =
        Dollar amount
the author ends up writing a test that goes like:
let ``create then times should be same as times then create`` start multiplier =
    let d0 = Dollar.Create start
    let d1 = d0.Times(multiplier)
    let d2 = Dollar.Create (start * multiplier) // this one duplicates the code of Times!
    d1 = d2
So, in order to test a method, the code of the method is duplicated in the test. In this case it is something as trivial as multiplying, but I think it extrapolates to more complex cases.
This presentation gives some clues about the kind of properties you can write for your code without duplicating it.
In general it is useful to think about what happens when you compose the method you want to test with other methods on that class:
size
++
reverse
toUpperCase
contains
For example:
upcaseReverse(y) ++ upcaseReverse(x) == upcaseReverse(x ++ y)
Then think about what would break if the implementation was broken. Would the property fail if:
1. size was not preserved?
2. not all characters were uppercased?
3. the string was not properly reversed?
1. is actually implied by 3. and I think that the property above would break for 3. However it would not break for 2 (if there was no uppercasing at all for example). Can we enhance it? What about:
upcaseReverse(y) ++ x.reverse.toUpperCase == upcaseReverse(x ++ y)
I think this one is ok but don't believe me and run the tests!
Anyway I hope you get the idea:
1. compose with other methods
2. see if there are equalities which seem to hold (things like "round-tripping" or "idempotency" or "model-checking" in the presentation)
3. check if your property will break when the code is wrong
Note that 1. and 2. are implemented by a library named QuickSpec and 3. is "mutation testing".
Addendum
About your Edit: the Times operation is just a wrapper around * so there's not much to test. However in a more complex case you might want to check that the operation:
has a unit element
is associative
is commutative
is distributive with the addition
If any of these properties fails, this would be a big surprise. If you encode those properties as generic properties for any binary relation T x T -> T you should be able to reuse them very easily in all sorts of contexts (see the Scalaz Monoid "laws").
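In Java terms (to match the thread's JUnit setting), such a reusable law might be sketched like this; the names are illustrative and not from any particular library:

import java.util.function.BinaryOperator;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Encode a law once for any operation T x T -> T, then reuse it
// for Dollar.Times-style operations, string concatenation, etc.
final class Laws {
    static <T> void assertAssociative(BinaryOperator<T> op, T a, T b, T c) {
        assertEquals(op.apply(op.apply(a, b), c),
                     op.apply(a, op.apply(b, c)));
    }

    static <T> void assertUnit(BinaryOperator<T> op, T unit, T a) {
        assertEquals(a, op.apply(unit, a)); // left identity
        assertEquals(a, op.apply(a, unit)); // right identity
    }
}

For example, Laws.assertAssociative((x, y) -> x * y, 2, 3, 4) checks the multiplication behind Times without restating it in the test.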
Coming back to your upperCaseReverse example I would actually write 2 separate properties:
"upperCaseReverse must uppercase the string" >> forAll { s: String =>
upperCaseReverse(s).forall(_.isUpper)
}
"upperCaseReverse reverses the string regardless of case" >> forAll { s: String =>
upperCaseReverse(s).toLowerCase === s.reverse.toLowerCase
}
This doesn't duplicate the code and states 2 different things which can break if your code is wrong.
In conclusion, I had the same question as you before and felt pretty frustrated about it, but after a while I found more and more cases where I was not duplicating my code in properties, especially when I started thinking about:
combining the tested function with other functions (.isUpper in the first property)
comparing the tested function with a simpler "model" of computation ("reverse regardless of case" in the second property)
I have called this problem "convergent testing", but I can't figure out why or where the term comes from, so take it with a grain of salt.
For any test you run the risk of the complexity of the test code approaching the complexity of the code under test.
In your case, the code winds up being basically the same, which is just writing the same code twice. Sometimes there is value in that. For example, if you are writing code to keep someone in intensive care alive, you could write it twice to be safe. I wouldn't fault you for the abundance of caution.
For other cases there comes a point where the likelihood of the test breaking outweighs the benefit of the test catching real issues. For that reason, even if it is against best practice in other ways (enumerating things that should be calculated, not writing DRY code), I try to write test code that is in some way simpler than the production code, so it is less likely to fail.
If I cannot find a way to write test code that is simpler than the production code and still maintainable (read: "that I also like"), I move that test to a "higher" level (for example, unit test -> functional test).
I just started playing with property based testing but from what I can tell it is hard to make it work with many unit tests. For complex units, it can work, but I find it more helpful at functional testing so far.
For functional testing you can often state the rule a function has to satisfy much more simply than you can write a function that satisfies the rule. This feels to me a lot like the P vs NP problem, where you can write a program to VALIDATE a solution in linear time, but all known programs to FIND a solution take much longer. That seems like a wonderful case for property testing.
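To make that VALIDATE-vs-FIND asymmetry concrete, here is a sketch (in Java, with illustrative names): a linear-time check that a sort produced a correct result, which is much simpler than any sorting implementation it validates:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The rule a sort must satisfy: output is ordered and is a permutation
// of the input. Validating this is trivial; finding it is the hard part.
static boolean isCorrectlySorted(List<Integer> input, List<Integer> output) {
    for (int i = 1; i < output.size(); i++) {
        if (output.get(i - 1) > output.get(i)) return false; // ordered?
    }
    Map<Integer, Integer> counts = new HashMap<>();
    input.forEach(x -> counts.merge(x, 1, Integer::sum));    // permutation?
    output.forEach(x -> counts.merge(x, -1, Integer::sum));
    return counts.values().stream().allMatch(c -> c == 0);
}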

Should NUnit "theory" assumptions include algorithm details

Let's say that I would like to change my NUnit parametrized test method to a theory. As far as theories go they should define all assumptions/preconditions under which assertions will pass. As per NUnit documentation:
[when comparing theory to parametrized test] A theory, on the other hand, makes a general statement that all of its assertions will pass for all arguments satisfying certain assumptions.
But as I understand it, this means that the called PUT's code should basically be translated into assumptions. Completely.
What's the point of having theories then? Our algorithm would be written twice: first as testable code and second as theory assumptions. So if we introduced a bug in the algorithm, both our code and our test would likely contain the same bug. What's the point then?
Example for better understanding
Let's say we have a checksum method that only supports digits and we'd like to test it using a theory. Let's write a theory:
static Regex rx = new Regex(@"^\d+$", RegexOptions.Compiled);

[Theory]
public void ChecksumTheory(string value)
{
    Assume.That(!string.IsNullOrWhiteSpace(value));
    Assume.That(value.Length > 1); // one single number + checksum = same number twice
    Assume.That(rx.IsMatch(value));

    var cc = new ChecksumValidator();
    bool result = cc.ValidateValue(value);

    Assert.IsTrue(result); // not really, as algorithm assumptions are missing
}
This is a pretty nice theory, except that its assertions still won't pass: without actually re-implementing the tested algorithm and expressing it as a set of assumptions, we can't know what the outcome of the validation will be.
Additional info
Theories seem rather trivial and concise when we only need to provide assumptions on input state namely checking that particular values are being set correctly or that their combination is relevant:
[Theory]
public void Person_ValidateState(Person input)
{
    Assume.That(input.Age < 110);
    Assume.That(input.Birth < input.Death || input.Death == null);
    ...
}
Questions
Why write unit test theories if one needs to provide enough assumptions for all asserts to pass?
If we don't want to reinvent the wheel by providing all algorithm assumptions, how do we provide correct assumptions?
If that's not the case, how should I rewrite my theory to make it a good example of NUnit theories?
What is the intended use (by their creators) of test theories anyway?
Theories vs. parameterized tests
I am also aiming at introducing assumptions in my tests instead of using parameterized tests, but I haven't started yet due to similar thoughts.
The goal of assumptions is to describe the given input as a subset of an uncountable - or say vast but complete - set of values by applying a filter. Based on this, your code above is absolutely correct; nevertheless, in this case you would have to write several similar tests for negative-result testing, e.g. when the outcome of cc.ValidateValue(...) is false. Once again, for comprehensibility, I would still rely on a good choice of hand-picked parameters for a parameterized test of this trivial function.
On the other hand assumptions may be useful for tests of more complex business logic. Imagine you have a garage full of fancy cars and you feel like smashing the gas on some remote terrain - also let's imagine this is a business requirement so you need to write tests for it (how cool would this be!). Then you could write a test like this:
[Theory]
public void CarCanDriveOnMuddyGround(Car car)
{
    Assume.That(car.IsFourWheelDrive);
    Assume.That(car.HasMotor);
    Assume.That(car.MaxSpeed > 50);
    Assume.That(car.Color != "white");

    bool result = car.DriveWithGivenSpeedOn<MuddyGround>(50);

    Assert.IsTrue(result);
}
See how this is strongly related to the BDD approach? Like you I am also not that much convinced about using assumptions for plain unit tests. But I am certain that it's a good idea to use different approaches for test functions (parameterized, assertions) according to the different test levels (unit, integration, system, user acceptance).
About algorithm details in assumptions
Thought about your specific problem again. Now I've got your point. In my words: You would need to assume that a given value will give a positive result before you can assert that it gives a positive result. Right? I think you found a pretty good example why theories do not always work.
I tried to solve it anyway in a slightly simpler example (for readability). But I admit it's not very convincing:
public class TheoryTests
{
    [Datapoints]
    public string[] InvalidValues = new[] { null, string.Empty };

    [Datapoints]
    public string[] PositiveValues = new[] { "good" };

    [Datapoints]
    public string[] NegativeValues = new[] { "Bad" };

    private bool FunctionUnderTest(string value)
    {
        return value.ToLower().Equals(value);
    }

    [Theory]
    public void PositiveTest(string value)
    {
        Assume.That(!string.IsNullOrEmpty(value));
        var result = FunctionUnderTest(value);
        Assert.True(result);
    }

    [Theory]
    public void PassingPositiveTest(string value)
    {
        Assume.That(!string.IsNullOrEmpty(value));
        Assume.That(!NegativeValues.Contains(value));
        var result = FunctionUnderTest(value);
        Assert.True(result);
    }
}
PositiveTest will obviously fail because the algorithm assumption is missing.
See the second Assume line in the body of PassingPositiveTest, which prevents the test from failing. The downside is of course that this actually is an example-based test and not a pure theory-based test. Better ideas welcome.

Programming without if-statements? [closed]

I remember some time (years, probably) ago I read on Stackoverflow about the charms of programming with as few if-tests as possible. This question is somewhat relevant but I think the stress was on using many small functions that returned values determined by tests depending on the parameter they receive. A very simple example would be using this:
int i = 5;
bool iIsSmall = isSmall(i);
with isSmall() looking like this:
private bool isSmall(int number)
{
    return (number < 10);
}
instead of just doing this:
int i = 5;
bool isSmall;

if (i < 10) {
    isSmall = true;
} else {
    isSmall = false;
}
(Logically this code is just sample code. It is not part of a program I am making.)
The reason for doing this, I believe, is that it looks nicer and makes a programmer less prone to logical errors. If this coding convention is applied correctly, you would see virtually no if-tests anywhere, except in functions whose only purpose is to do that test.
Now, my question is: is there any documentation about this convention? Is there anyplace where you can see wild arguments between supporters and opposers of this style? I tried searching for the Stackoverflow post that introduced me to this, but I can't find it anymore.
Lastly, I hope this question doesn't get shot down because I am not asking for a solution to a problem. I am simply hoping to hear more about this coding style and maybe increase the quality of all coding I will do in the future.
This whole "if" vs "no if" thing makes me think of the Expression Problem [1]. Basically, it's an observation that programming with if statements or without if statements is a matter of encapsulation and extensibility, and that sometimes it's better to use if statements [2] and sometimes it's better to use dynamic dispatching with methods / function pointers.
When we want to model something, there are two axes to worry about:
The different cases (or types) of the inputs we need to deal with.
The different operations we want to perform over these inputs.
One way to implement this sort of thing is with if statements / pattern matching / the visitor pattern:
data List = Nil | Cons Int List

length xs = case xs of
    Nil       -> 0
    Cons a as -> 1 + length as

concat xs ys = case xs of
    Nil       -> ys
    Cons a as -> Cons a (concat as ys)
The other way is to use object orientation:
data List = {
    length :: Int,
    concat :: (List -> List)
}

nil = List {
    length = 0,
    concat = (\ys -> ys)
}

cons x xs = List {
    length = 1 + length xs,
    concat = (\ys -> cons x (concat xs ys))
}
It's not hard to see that the first version using if statements makes it easy to add new operations on our data type: just create a new function and do a case analysis inside it. On the other hand, this makes it hard to add new cases to our data type since that would mean going back through the program and modifying all the branching statements.
The second version is kind of the opposite. It's very easy to add new cases to the datatype: just create a new "class" and tell what to do for each of the methods we need to implement. However, it's now hard to add new operations to the interface since this means adding a new method for all the old classes that implemented the interface.
There are many different approaches that languages use to try to solve the Expression Problem and make it easy to add both new cases and new operations to a model. However, there are pros and cons to these solutions [3], so in general I think it's a good rule of thumb to choose between OO and if statements depending on which axis you want to make easier to extend.
Anyway, going back to your question, there are a couple of things I would like to point out:
The first one is that I think the OO "mantra" of getting rid of all if statements and replacing them with method dispatching has more to do with how most OO languages don't have typesafe Algebraic Data Types than with "if statements" being bad for encapsulation. Since the only way to be type safe is to use method calls, you are encouraged to convert programs using if statements into programs using the Visitor Pattern [4] or worse: convert programs that should be using the visitor pattern into programs using simple method dispatch, therefore making extensibility easy in the wrong direction.
The second thing is that I'm not a big fan of breaking things into functions just because you can. In particular, I find that style where all the functions have just 5 lines and call tons of other functions is pretty hard to read.
Finally, I think your example doesn't really get rid of if statements. Essentially, what you are doing is having a function from Integers to a new datatype (with two cases, one for Big and one for Small) and then you still need to use if statements when working with the datatype:
data Size = Big | Small

toSize :: Int -> Size
toSize n = if n < 10 then Small else Big

someOp :: Size -> String
someOp Small = "Wow, its small"
someOp Big   = "Wow, its big"
Going back to the expression-problem point of view, the advantage of defining our toSize / isSmall function is that we put the logic of choosing which case our number fits into in a single place, and that our functions can only operate on the cases after that. However, this does not mean that we have removed if statements from our code! If toSize is a factory function and Big and Small are classes sharing an interface, then yes, we will have removed if statements from our code. However, if our isSmall just returns a boolean or enum, there will be just as many if statements as there were before. (And you should choose which implementation to use depending on whether you want to make it easier to add new methods or new cases - say, Medium - in the future.)
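As a sketch of that factory-function version (in Java, since the thread started there; all names are illustrative): the single remaining conditional hides inside the factory, and everything downstream uses dispatch instead of branching.

interface Size {
    String someOp();
}

class Small implements Size {
    public String someOp() { return "Wow, its small"; }
}

class Big implements Size {
    public String someOp() { return "Wow, its big"; }
}

class Sizes {
    // the one remaining "if", encapsulated in the factory
    static Size toSize(int n) {
        return n < 10 ? new Small() : new Big();
    }
}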
[1] The name comes from the case where you have an "expression" datatype (numbers, variables, addition/multiplication of subexpressions, etc.) and want to implement things like evaluation functions and other operations over it.
[2] Or pattern matching over Algebraic Data Types, if you want to be more type safe...
[3] For example, you might have to define all multimethods on the "top level" where the "dispatcher" can see them. This is a limitation compared to the general case, since you can use if statements (and lambdas) nested deeply inside other code.
[4] Essentially a "church encoding" of an algebraic data type
I've never heard of such a convention. I don't see how it works, anyway. Surely the only point of having an iIsSmall is to later branch on it (possibly in combination with other values)?
What I have heard of is an argument to avoid having variables like iIsSmall at all. iIsSmall is just storing the result of a test you made, so that you can later use that result to make some decision. So why not just test the value of i at the point where you need to make the decision? i.e., instead of:
int i = 5;
bool iIsSmall = isSmall(i);
...
<code>
...
if (iIsSmall) {
    <do something because i is small>
} else {
    <do something different because i is not small>
}
just write:
int i = 5;
...
<code>
...
if (isSmall(i)) {
    <do something because i is small>
} else {
    <do something different because i is not small>
}
That way you can tell at the branch point what you're actually branching on because it's right there. That's not hard in this example anyway, but if the test was complicated you're probably not going to be able to encode the whole thing in the variable name.
It's also safer. There's no danger that the name iIsSmall is misleading because you changed the code so that it was testing something else, or because i was actually altered after you called isSmall so that it is not necessarily small anymore, or because someone just picked a dumb variable name, etc, etc.
Obviously this doesn't always work. If the isSmall test is expensive and you need to branch on its result many times, you don't want to execute it many times. You also might not want to duplicate the code of that call many times, unless it's trivial. Or you might want to return the flag to be used by a caller who doesn't know about i (though then you could just return isSmall(i), rather than store it in a variable and then return the variable).
Btw, the separate function saves nothing in your example. You can include (i < 10) in an assignment to a bool variable just as easily as in a return statement in a bool function. i.e. you could just as easily write bool isSmall = i < 10; - it's this that avoids the if statement, not the separate function. Code of the form if (test) { x = true; } else { x = false; } or if (test) { return true; } else { return false; } is always silly; just use x = test or return test.
Is it really a convention? Should one just kill minimal if-constructs just because there could be frustration over it?
OK, if statements tend to grow out of control, especially if many special cases are added over time. Branch after branch is added and at the end no one is able to comprehend what everything does without spending hours of time and some cups of coffee into this grown instance of spaghetti-code.
But is it really a good idea to put everything in separate functions? Code should be reusable. Code should be readable. But a function call just creates the need to look it up further up in the source file. If all ifs are put away in this way, you just skip around in the source file all the time. Does this support readability?
Or consider an if-statement which is not reused anywhere. Should it really go into a separate function, just for the sake of convention? There is some overhead involved here, too, and performance issues could be relevant in this context.
What I am trying to say: following coding conventions is good. Style is important. But there are exceptions. Just try to write good code that fits into your project and keep the future in mind. In the end, coding conventions are just guidelines which try to help us to produce good code without enforcing anything on us.

Explain unit testing please

I'm a little confused about unit testing. I see the value in things like automated testing. I think perhaps a good example would be the best way to help me understand. Lets say I have a binary search function I want unit tested.
Now in testing, I would want to know things like: does the search find the first element, the last element, and other elements? Does the search correctly compare Unicode characters? Does the search handle symbols and other "painful" characters? Would unit testing cover this, or am I missing it? How would you write unit tests for my binary search?
function search(collection, value){
    var start = 0, end = collection.length - 1, mid;
    while (start <= end) {
        mid = start + Math.floor((end - start) / 2); // keep the index an integer
        if (value == collection[mid])
            return mid;
        if (collection[mid] < value)
            start = mid + 1; // target is in the upper half
        else
            end = mid - 1;   // target is in the lower half
    }
    return -1; // not found
}
Pseudocode for unit tests would be lovely.
So, we might have:
function testFirst(){
    var collection = ['a','b','c','x','y','z'], first = 'a', findex = 0;
    assert(search(collection, first), findex);
}

function testLast(){
    var collection = ['a','b','c','x','y','z'], last = 'z', lindex = 5;
    assert(search(collection, last), lindex);
}
No, you're not missing it - this is what unit testing is designed to tell you. You have the right idea by testing good and bad input, edge cases, etc. You need one test for each condition. A test will set up any preconditions and then assert that your calculation (or whatever it may be) matches your expectations.
You're correct in your expectations of unit testing; it's very much about validating and verifying the expected behaviour.
One value I think many folks miss about unit testing is that its value increases with time. When I write a piece of code, and write a unit test, I've basically just tested that the code does what I think it should, that it's not failing in any ways in which I have chosen to check, etc. These are good things, but they're of limited value, because they express the knowledge that you have of the system at the time; they can't help you with things you don't know about (is there a sneaky bug in my algorithm that I don't know about and didn't think to test for?).
The real value of Unit Tests, in my opinion, is the value they gain over time. This value takes two forms; documentation value and validation value.
The documentation value is the value of the unit test saying "this is what the author of the code expected this bit of code to do". It's hard to overstate the value of this sort of thing; when you've been on a project that has a large chunk of underdocumented legacy code, let me tell you, this sort of documentation value is like a miracle.
The other value is that of validation; as code lives on in projects, things get refactored, and changed, and shifted. Unit tests provide validation that the component that you thought worked in one way continues to work in that way. This can be invaluable in helping find errors that creep into projects. For example, changing a database solution can sometimes be see-through, but sometimes, those changes can cause unexpected shifts in the way some things work; unit testing the components which depend on your ORM can catch critical subtle shifts in the underlying behaviour. This really gets useful when you've got a chunk of code that's been working perfectly for years at a time, and nobody thinks to consider its potential role in a failure; those types of bugs can take a VERY long time to find, because the last place you're going to look is in the component that's been rock-solid for a very long time. Unit Testing provides validation of that "Rock Solidity".
Yes, that's about it. Each of those questions you ask could be used as a test. Think of the unit test as three steps. Set up some preconditions, run some code that is "under test", and write an assert that documents your expectations.
In your case, setting up 'collection' with some particular values (or no values) is setting the preconditions.
Calling your search method with a particular parameter is running the code under test.
Checking that the value returned by your method matches what you expect is the assert step.
Give those three things a name that describes what you are trying to do (DoesTheSearchMethodFailIfCollectionIsEmpty) and voilà, you have a unit test.
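In JUnit-style Java, those three steps might look like this; a sketch that assumes a Java port of the search function above which returns -1 when the value is absent:

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class SearchTests {
    @Test
    void doesTheSearchMethodFailIfCollectionIsEmpty() {
        char[] collection = {};               // 1. set up preconditions
        int result = search(collection, 'a'); // 2. run the code under test
        assertEquals(-1, result);             // 3. assert your expectation
    }
}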