Unit testing higher order functions in F#

Take the following F# example:
let parse mapDate mapLevel mapMessage (groups : string list) =
    { DateTime =
          mapDate(
              groups.[2] |> Int32.Parse,
              groups.[0] |> Int32.Parse,
              groups.[1] |> Int32.Parse)
      Level = mapLevel groups.[3]
      Message = mapMessage groups.[4] }
I can unit test the map functions independently, that's fine, but how do I unit test that this function calls the functions passed in as arguments correctly?
In C# I would use mocks and verify the calls to them. I recently watched a Pluralsight video which said that functional languages tend to use stubs instead of mocks. Here I could pass in a function that throws if it doesn't get the expected arguments, but I'm not really sold on this approach.
I was just wondering if there are any patterns in functional programming in general for unit testing higher-order functions like this?

Well, let me disagree with the given answer. There is actually a nice way to test higher-order functions without even bothering about the concrete types they might take (I consider a typical HOF to be totally generic; it makes no difference, though: the approach I suggest works just as well with more strictly typed HOFs).
Let's take something really simple, something everyone is familiar with. How about a ['t] -> ['t] function? It takes a single argument - a list of whatever type - and returns a list of the same type. The traditional OOP approach wouldn't work here: one needs to put a restriction on 't and test somewhat specific parameters of that type; the only way for the author to feel more confident in the implementation is to increase the number of unit tests.
There is a really great area of math named "category theory". It's a comparatively new field of mathematics that studies things from the outside rather than from the inside. In order to describe things "from the outside" you take the thing you're interested in and force it to interact with something you already know deeply enough. Thus, category theory teaches us to describe things in terms of their interrelations with other things. Can't we do the same here?..
Indeed, we can. That's actually quite easy: we already have f : ['t] -> ['t], but is there anything else we could make it interact with, so as to define something common - something that holds for each and every interaction regardless of any other factors? Let's take any g : 't -> 'y. Now we are able to state: g (List.head (f ...)) = List.head (List.map g (f ...)). I assume a certain argument of type ['t] substituted for .... Please note: the given property is universal: it holds for any composition of pure functions of the specified signatures, regardless of their implementation. Also note how generic yet obvious it is: there are only two distinct "objects" interacting with each other via "composition", which could also be rewritten in terms of F#'s standard (|>) and (<|) operators.
Now the fact is that for any higher-order (pure) function there exists this kind of universal property; mostly, there are dozens of them. Thus one is able to specify a function's properties in terms of composition (which is idiomatic for FP) while staying at the generic level. Having such properties in explicit form gives one the chance to autogenerate hundreds of tests, based on inputs that differ not only in their values (which is what unit tests normally do, except that they are rarely autogenerated) but also in their types.
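For instance, here is a minimal sketch of such a property-based test using FsCheck; List.rev and doubling are purely illustrative stand-ins for f and g:
open FsCheck

// Stand-ins: any pure f : 't list -> 't list and g : 't -> 'y would do.
let f (xs : int list) = List.rev xs
let g x = x * 2

// The universal property: g (head (f xs)) = head (map g (f xs)),
// guarded against the empty list, where head is undefined.
let headMapProperty (xs : int list) =
    xs = [] || g (List.head (f xs)) = List.head (List.map g (f xs))

// FsCheck autogenerates the inputs (100 random lists by default).
Check.Quick headMapProperty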

Pure functions are easier because you just have to test the outputs of your parse function. You shouldn't ever need to test via side effects the way you do in imperative programming.
When writing most of your unit tests, you generally pass the simplest possible functions as arguments, like identity or similar. Then you'd write one test named something like "mapLevel is applied to fourth group" where instead you make mapLevel something whose effect is easy to recognize, like toUpper. This lets you make sure you didn't accidentally copy/paste mapLevel into more than one output. Then a similar test for mapMessage.
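A minimal sketch of such a test, assuming NUnit and a record type along the lines of the one below (the question doesn't show its definition):
open System
open NUnit.Framework

// Assumed shape of the record built by 'parse'; the real definition
// isn't shown in the question.
type LogEntry = { DateTime : DateTime; Level : string; Message : string }

[<Test>]
let ``mapLevel is applied to fourth group`` () =
    // per parse's argument order, groups.[0..2] are month, day, year
    let groups = [ "02"; "28"; "2024"; "warn"; "started" ]
    let entry =
        parse (fun (y, m, d) -> DateTime(y, m, d))
              (fun (level : string) -> level.ToUpper())
              id
              groups
    Assert.AreEqual("WARN", entry.Level)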


Unit testing Elixir functions properly

I'm fairly new to elixir and functional programming in general and I'm struggling to properly unit test functions that are composed of other functions. The general question is: when I have a function f that uses other functions g, h... internally, which approach should I take to test the whole?
Coming from the OOP world, the first approach that comes to mind involves injecting the functions f depends on. I could unit test g, h... and inject all of those as arguments to f. Then, unit tests for f would just make sure it calls the injected functions as expected. This feels like overfitting, though, and overall like a cumbersome approach that goes against the functional mindset, in which function composition should be a cheap thing to do and you shouldn't have to concern yourself with passing all those arguments around the whole codebase.
I can also unit test g, h... as well as f by treating each of those as black boxes, which feels like the appropriate thing to do, but then the complexity of f's tests increases dramatically. Having simple tests that scale is one of the main purposes of unit testing.
To make the argument more concrete I'll put an example of a function that composes other functions inside and that I don't know how to unit test properly. This in particular is code for a plug that handles the creation of a resource in a RESTful fashion. Note that some of the "dependencies" are pure functions (such as validate_account_admin) but others are not (Providers.create):
def call(conn, _opts) do
  account_uuid = conn.assigns.current_user.account["uuid"]

  with {:ok, conn} <- Http.Authorization.validate_account_admin(conn),
       {:ok, form_data} <- Http.coerce_form_data(conn, FormData),
       {:ok, provider} <- Providers.create(FormData.to_provider(form_data), account_uuid: account_uuid) do
    Http.respond_create(conn, Http.provider_path(provider))
  else
    {:error, reason, messages} -> Http.handle_error(conn, reason, messages)
  end
end
Thanks!
Maybe this will be quite a subjective answer, because there might be no perfect and ultimate one for such a question.
In my view your assumption - calling public functions inside other public functions - is wrong. You shouldn't do that at all in business-logic areas, because they should stay separated; the only place where you can do it - and in fact have to - is in controllers, but you test controllers with integration tests, not with unit tests, so all you care about in such tests are proper and valid responses.
I like Erlang's explicit approach of declaring which functions should be public via the export clause. In Elixir you should follow the same approach: whatever should be hidden in the module should be declared with defp and defmacrop, for private functions and private macros respectively.
Your unit tests should follow the black-box rule - you care about the output based on the input. That's all. A test is dumb and doesn't know anything about how the function under test looks or what it contains.
In your example you're using several functions in the plug's call function, and I'm pretty sure this plug does more than it should - remember the single responsibility principle. That makes this one function almost impossible to test without mocking... I would rewrite this plug as 3 or 4 separate plugs, because the with clause is redundant here - plugs already check the outcome of the previous plug before proceeding, and with is just a case inside a case.
With the new plugs in place, you can keep the real work in private functions besides call and init; this will probably help you organize your code and avoid modules chained together in terms of usage and responsibility.
Then, unit tests will be much easier, because you will be testing isolated plugs.
Assuming that you have this plug called like this:
plug MyPlug
you would rewrite into:
plug :validate_is_admin
plug :coerce_form_data
plug :create_from_form_data
Maybe it's oversimplified, but I hope you get what I mean here; a sketch of the first of these plugs follows.
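For illustration, a minimal sketch of such an isolated plug (the module name is made up; the Http helpers are the ones from the question). A unit test can now build a test conn, call call/2 directly and assert on the result:
defmodule MyApp.Plugs.ValidateIsAdmin do
  import Plug.Conn

  def init(opts), do: opts

  # Does exactly one thing, so a black-box unit test only has to
  # assert on the conn this function returns.
  def call(conn, _opts) do
    case Http.Authorization.validate_account_admin(conn) do
      {:ok, conn} ->
        conn

      {:error, reason, messages} ->
        conn
        |> Http.handle_error(reason, messages)
        |> halt()
    end
  end
end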
TL;DR: Split functions into smaller ones and test them in isolation. Hide internal computations in private functions and test only the public API.

How to make API names into variables for easier coding

I am looking for a way to turn some long and confusing API function names into shorter ones, to reduce the amount of typing and overall the errors due to misspelling.
For example: I would like to take gtk_functionName(); and make it a variable, like so: doThis = gtk_functionName;
Sometimes the code will have a lot of repeated prefixes. I want to know if I can take this g_signal_connect_ and turn it into this connect, so I could just type connectswapped instead of g_signal_connect_swapped.
I am looking to do this in C/C++ but would be happy to know how it's done in any language. I thought I had seen code that did this before, but I cannot figure out what this would be called, so searching for it has been fruitless.
I am sure this is possible and I am just not able to remember how it's done.
I believe what you are wanting to do is apply the Facade Pattern, which is to present a simplified interface to a larger, more complex body of code.
What this basically means is you define your own simplified interfaces for the functionality you want. The implementation of these interfaces use the longer more complex packages you want to simplify. After that, the rest of your code use the simplified interfaces instead of the complex package directly.
void doThis (That *withThat) {
    gtk_functionName(withThat->arg1, withThat->arg2 /* etc. */);
}
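For the literal doThis = gtk_functionName aliasing the question asks about, C also offers function pointers, and the repeated prefix can be shortened with token pasting. A sketch - the signature of gtk_functionName is hypothetical, mirroring the question's placeholder:
/* gtk_functionName stands in for any long API name */
void gtk_functionName(int arg1, int arg2);

/* Function-pointer alias: doThis(1, 2) now calls gtk_functionName(1, 2).
   The pointer's type must match the aliased function's real declaration. */
void (*doThis)(int, int) = gtk_functionName;

/* Prefix shortening via token pasting: connect(swapped)(obj, sig, cb, data)
   expands to g_signal_connect_swapped(obj, sig, cb, data). Beware of name
   clashes, e.g. with the POSIX connect() from <sys/socket.h>. */
#define connect(suffix) g_signal_connect_##suffix
Note that, unlike the facade, these aliases do nothing to simplify the interface; they only shorten the names.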

Clojure: Perlis vs Protocols/Records [soft, philosophical]

Context:
(A) "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." —Alan Perlis
(B) Clojure has defprotocol, defrecord, deftype
Question:
is there some style of programming Clojure that gets the benefits of both?
(B) has the advantage of avoiding type errors.
(A) has the advantage of avoiding duplicate code.
Thanks
PS: I would love to hear constructive criticism on why I'm being downvoted + how to restructure the question to make it productive.
I am not sure how you can correlate (A) and (B).
(A) is about having consistency: if you use the same data structure to represent your data (for example, user info stored in a map) across the various layers of your application, it keeps things consistent. If you use many data structures to represent the same info, you will have to write code to transform one structure into another, and the various functions that work on the different structures will not be composable, since they expect different data structures.
(B) is about the various constructs in Clojure.
defprotocol: This is not about data structures but about contracts/interfaces, i.e. a particular type implements a contract, and that type can then be used in any context where a consumer function requires the passed type to implement the contract. Ex: any type that can be printed to the console (or another writable sink) will implement the print contract/protocol.
defrecord: Creates maps, but with some additional interfaces implemented in a default way.
deftype: A low-level construct to create types; you will have to write a lot of code to use it. 99% of the time you won't need it.
The way to reconcile this is to think "abstractions" rather than "data types". Or to paraphrase Alan Perlis:
"It is better to have 100 functions operate on one abstraction than
10 functions on 10 abstractions."
So the Clojure way is to:
Define your abstractions in a simple, minimal way (using defprotocol)
Write functions against this abstraction
Define concrete types that implement the abstraction using defrecord, deftype etc. (or use extend-protocol to extend the protocol to existing Java classes if you like); a sketch follows.
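A minimal sketch of that style (the Shape protocol and the record types are illustrative):
;; one small abstraction, defined with defprotocol...
(defprotocol Shape
  (area [this]))

;; ...and concrete types implementing it
(defrecord Circle [r]
  Shape
  (area [_] (* Math/PI r r)))

(defrecord Rect [w h]
  Shape
  (area [_] (* w h)))

;; the "100 functions" are written once, against the abstraction
(defn total-area [shapes]
  (reduce + (map area shapes)))

(total-area [(->Circle 1.0) (->Rect 2.0 3.0)])
;; => 9.141592653589793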

Checking function equality in an F# unit test

I have a bunch of F# functions that implement different algorithms for the same input, kind of like the Strategy pattern. To pick the right strategy, I want to pattern match on the input argument and return the function as a value:
let equalStrategy (points : seq<double>) =
    ...

let multiplyStrategy factor (points : seq<double>) =
    ...

let getStrategy relationship =
    match relationship with
    | "=" -> equalStrategy
    | "*5" -> multiplyStrategy 5.0
    | _ -> raise (new System.NotImplementedException("relationship not handled"))
Now I want to write some unit tests to make sure that I return the right strategy, so I tried something like this in NUnit:
[<TestCase("=")>]
[<Test>]
member self.getEqualstrategy (relationship : string) =
    let strategy = getStrategy relationship
    Assert.AreEqual(strategy, equalStrategy)
Now I think the code is correct and will do what I want, but the assertion fails because functions don't seem to have an equality operation defined on them. So my questions are:
(a) is there a way to compare 2 functions to see if they are the same, i.e. let isFoo bar = foo == bar, that I can use in an NUnit assertion?
or
(b) is there another unit testing framework that will do this assertion for me in F#?
Testing whether an F# function returned by your getStrategy is the same function as one of the functions you defined is essentially impossible.
To give some details - the F# compiler generates a class that inherits from FSharpFunc when you return a function as a value. More importantly, it generates a new class each time you create a function value, so you cannot compare the types of the classes.
The structure of the generated classes is something like this:
class getStrategy#7 : FSharpFunc<IEnumerable<double>, IEnumerable<double>> {
    public override IEnumerable<double> Invoke(IEnumerable<double> points) {
        // Calls the function that you're returning from 'getStrategy'
        return Test.equalStrategy(points);
    }
}

// Later - in the body of 'getStrategy':
return new getStrategy#7(); // Returns a new instance of the single-purpose class
In principle, you could use Reflection to look inside the Invoke method and find which function is called from there, but that's not going to be a reliable solution.
In practice, I think you should probably use some other, simpler test to check whether the getStrategy function returned the right algorithm. If you run the returned strategy on a couple of sample inputs, that should be enough to verify that the returned algorithm is the right one, and you won't be relying on implementation details (such as whether getStrategy returns a named function or a new lambda function with the same behaviour).
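A minimal sketch of such a behavioural test, reusing the NUnit setup from the question:
[<TestCase("=")>]
member self.getEqualStrategyBehaviour (relationship : string) =
    let strategy = getStrategy relationship
    let input = seq [ 1.0; 2.0; 3.0 ]   // any representative sample input
    // compare outputs rather than function references
    Assert.AreEqual(equalStrategy input |> List.ofSeq,
                    strategy input |> List.ofSeq)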
Alternatively, you could wrap the functions in Func<_, _> delegates and use the same approach that would work in C#. However, I think that checking whether getStrategy returns a particular reference is too detailed a test, one that just restricts your implementation.
Functions don't have an equality comparer; you will get the error: The type '('a -> 'a)' does not support the 'equality' constraint because it is a function type
There is a good post about this here.
It would be very difficult for the F# compiler to prove formally that two functions always produce the same output given the same input. If that were possible, you could use F# to prove mathematical theorems quite trivially.
As the next best thing, for pure functions you can verify that two functions produce the same output for a large enough sample of different inputs. Tools like FsCheck can help you automate this type of test. I have not used it, but I've used ScalaCheck, which is based on the same idea (both are ports of Haskell's QuickCheck).
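For example, a sketch of such a sampled comparison with FsCheck, assuming the definitions from the question:
open FsCheck

// The two functions agree extensionally if this property never fails.
// (Floats are generated from ints here to avoid spurious NaN <> NaN failures.)
let behavesLikeEqualStrategy (xs : int list) =
    let input = xs |> List.map double |> Seq.ofList
    List.ofSeq (getStrategy "=" input) = List.ofSeq (equalStrategy input)

Check.Quick behavesLikeEqualStrategy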

Single-use class

In a project I am working on, we have several "disposable" classes. What I mean by disposable is that they are classes where you call some methods to set up the info, then call what amounts to a doit function. You doit once and throw the instance away. If you want to doit again, you have to create another instance of the class. The reason they're not reduced to single functions is that they must store state after the doit so the user can get information about what happened, and it doesn't seem very clean to return a bunch of things through reference parameters. It's not a singleton, but not a normal class either.
Is this a bad way to do things? Is there a better design pattern for this sort of thing? Or should I just give in and make the user pass in a boatload of reference parameters to return a bunch of things through?
What you describe is not a class (state + methods to alter it), but an algorithm (map input data to output data):
result_t do_it(parameters_t);
Why do you think you need a class for that?
Sounds like your class is basically a parameter block in a thin disguise.
There's nothing wrong with that IMO, and it's certainly better than a function with so many parameters it's hard to keep track of which is which.
It can also be a good idea when there are a lot of input parameters - several setup methods can set up a few of them at a time, so that the names of the setup functions give more of a clue as to which parameter is which. Also, you can cover different ways of setting up the same parameters using alternative setter functions - either overloads or different names. You might even use a simple state machine or flag system to ensure the correct setups are done.
However, it should really be possible to recycle your instances without having to delete and recreate. A "reset" method, perhaps.
As Konrad suggests, this is perhaps misleading. The reset method shouldn't be seen as a replacement for the constructor - it's the constructor's job to put the object into a self-consistent initialised state, not the reset method's. Objects should be self-consistent at all times.
Unless there's a reason for making cumulative running-total-style do-it calls, the caller should never have to call reset explicitly - it should be built into the do-it call as its first step.
I still decided, on reflection, to strike that out - not so much because of jalf's comment, but because of the hairs I had to split to argue the point ;-) Basically, I figure I almost always have a reset method for this style of class, partly because my "tools" usually have multiple related kinds of "do it" (e.g. "insert", "search" and "delete" for a tree tool) and a shared mode. The mode is just some input fields, in parameter-block terms, but that doesn't mean I want to keep re-initializing them. But just because this pattern happens a lot for me doesn't mean it should be a point of principle.
I even have a name for these things (not limited to the single-operation case) - "tool" classes. A "tree_searching_tool" would be a class that searches (but doesn't contain) a tree, for example, though in practice I'd have a "tree_tool" that implements several tree-related operations.
Basically, even a parameter block in C should ideally provide a kind of abstraction that gives it some order beyond being just a bunch of parameters. "Tool" is a (vague) abstraction. Classes are a major means of handling abstraction in C++.
I have used a similar design and wondered about this too. A fictitious, simplified example could look like this:
FileDownloader downloader(url);
downloader.download();
downloader.result(); // get the path to the downloaded file
To make it reusable I store it in a boost::scoped_ptr:
boost::scoped_ptr<FileDownloader> downloader;
// Download first file
downloader.reset(new FileDownloader(url1));
downloader->download();
// Download second file
downloader.reset(new FileDownloader(url2));
downloader->download();
To answer your question: I think it's ok. I have not found any problems with this design.
As far as I can tell you are describing a class that represents an algorithm. You configure the algorithm, then you run the algorithm and then you get the result of the algorithm. I see nothing wrong with putting those steps together in a class if the alternative is a function that takes 7 configuration parameters and 5 output references.
This structuring of code also has the advantage that you can split your algorithm into several steps and put them in separate private member functions. You can do that without a class too, but that can lead to the sub-functions having many parameters if the algorithm has a lot of state. In a class you can conveniently represent that state through member variables.
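A minimal sketch of that shape (all names are made up for illustration):
#include <string>
#include <utility>

// Configuration and result state live in members; the private sub-steps
// share them without long parameter lists.
class Downloader {
public:
    explicit Downloader(std::string url) : url_(std::move(url)) {}

    void run() {                        // the "doit" call
        connect();
        fetch();
    }

    const std::string& result() const { return path_; }

private:
    void connect() { /* ... uses url_ ... */ }
    void fetch() { path_ = "..."; }     // sub-step fills in the result state

    std::string url_;
    std::string path_;
};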
One thing you might want to look out for is that structuring your code like this can easily tempt you to use inheritance to share code among similar algorithms. If algorithm A defines a private helper function that algorithm B needs, it's easy to make that member function protected and then access it by having class B derive from class A. It could also feel natural to define a third class C that contains the common code and have A and B derive from C. As a rule of thumb, inheritance used only to share code in non-virtual methods is not the best way - it's inflexible, you end up having to take on the data members of the superclass, and you break its encapsulation. As a rule of thumb for that situation, prefer factoring the common code out of both classes without using inheritance. You can factor it into a non-member function, or into a utility class that you then use without deriving from it.
YMMV - what is best depends on the specific situation. Factoring code into a common super class is the basis for the template method pattern, so when using virtual methods inheritance might be what you want.
Nothing especially wrong with the concept. You should try to set it up so that the objects in question can generally be auto-allocated vs having to be newed -- significant performance savings in most cases. And you probably shouldn't use the technique for highly performance-sensitive code unless you know your compiler generates it efficiently.
I disagree that the class you're describing "is not a normal class". It has state and it has behavior. You've pointed out that it has a relatively short lifespan, but that doesn't make it any less of a class.
Short-lived classes vs. functions with out-params:
I agree that your short-lived classes are probably a little more intuitive and easier to maintain than a function which takes many out-params (or 1 complex out-param). However, I suspect a function will perform slightly better, because you won't be taking the time to instantiate a new short-lived object. If it's a simple class, that performance difference is probably negligible. However, if you're talking about an extremely performance-intensive environment, it might be a consideration for you.
Short-lived classes: creating new vs. re-using instances:
There are plenty of examples where instances of classes are re-used: thread pools, DB-connection pools (probably darn near any software construct ending in 'pool' :). In my experience, they seem to be used when instantiating the object is an expensive operation. Your small, short-lived classes don't sound like they're expensive to instantiate, so I wouldn't bother trying to re-use them. You may find that whatever pooling mechanism you implement actually costs MORE (performance-wise) than simply instantiating new objects whenever needed.