Composing Task.async_stream vs. Continuation passing - concurrency

I have a pipeline of two functions that are both IO-heavy, running on a collection of items concurrently.
The first, func1, is very common, and often I just want the response of func1 alone. Other times, I'd like to process the result of func1 with some other function, func2.
What are the trade-offs (performance/overhead, idiomatic-ness) between composing Task.async_stream, i.e.
enum
|> Task.async_stream(Mod1, :func1, [])
|> Task.async_stream(Mod2, :func2, [])
...
vs. passing a continuation and using one Task.async_stream for both func1 and func2 i.e.
enum
|> Task.async_stream(Mod1, :func1_then, [&Mod2.func2/arity])
...
where func1_then calls the function parameter (Mod2.func2) at the end of the normal func1 computation?

If both functions are IO bound, then there shouldn't be any problem with your first example:
enum
|> Task.async_stream(Mod1, :func1, [])
|> Task.async_stream(Mod2, :func2, [])
If you did want to collapse the two calls, I wouldn't use a continuation style, just pipeline them in a lambda passed to Task.async_stream/3:
enum
|> Task.async_stream(fn x -> x |> Mod1.func1() |> Mod2.func2() end)
Alternatively, you might consider using Flow:
enum
|> Flow.from_enumerable()
|> Flow.map(&Mod1.func1/1)
|> Flow.map(&Mod2.func2/1)
|> Flow.run()

F# return list of list lengths

I am to use combinators and no for/while loops, recursion or defined library functions from F#'s List module, except constructors :: and []
Ideally I want to implement map
I am trying to write a function called llength that returns the list of the lengths of the sublists. For example, llength [[1;2;3];[1;2];[1;2;3]] should return [3;2;3]. I also have a function length that returns the length of a list.
let Tuple f = fun a b -> f (a, b)
let length l : int =
    List.fold (Tuple (fst >> (+) 1)) 0 l
I currently have
let llength l : int list =
    List.map (length inner list) list
I'm not sure how to access the sublists under these constraints. Should I apply my length function to each sublist? Any help is greatly appreciated, thanks!
Since this is homework, I don't want to just give you a fully coded solution, but here are some hints:
First, since fold is allowed, you could implement map via fold. The folding function would take the list accumulated "so far" and prepend the next element transformed with the mapping function. The result will come out reversed though (fold traverses forward, but you prepend at every step), so perhaps that wouldn't work for you if you're not allowed List.rev.
Second - the most obvious, fundamental way: naked recursion. Here's the way to think about it: (1) when the argument is an empty list, result should be an empty list; (2) when the argument is a non-empty list, the result should be length of the argument's head prepended to the list of lengths of the argument's tail, which can be calculated recursively. Try to write that down in F#, and there will be your solution.
Since you can use some functions that basically have a loop (fold, filter ...), there might be some "cheated & dirty" ways to implement map. For example, via filter:
let mymap f xs =
    // a ref cell, because a closure cannot capture a `let mutable` local
    let result = ref []
    xs
    |> List.filter (fun x ->
        result.Value <- f x :: result.Value
        true)
    |> ignore
    List.rev result.Value
Note that List.rev is required as explained in the other answer.
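For comparison, here is a minimal OCaml sketch of the fold-based hint from the first answer, assuming fold is allowed. The names map_rev, map and llength are illustrative. A left fold that prepends yields the result reversed; a right fold (List.fold_right in OCaml, List.foldBack in F#) keeps the order, so no separate reversal is needed.

```ocaml
(* length via fold: ignore each element and add one to the accumulator *)
let length xs = List.fold_left (fun acc _ -> acc + 1) 0 xs

(* map via a left fold: prepending at every step reverses the order *)
let map_rev f xs = List.fold_left (fun acc x -> f x :: acc) [] xs

(* map via a right fold: elements are consed from the back, order is kept *)
let map f xs = List.fold_right (fun x acc -> f x :: acc) xs []

(* lengths of the sublists *)
let llength xss = map length xss
```

With these definitions, llength [[1;2;3];[1;2];[1;2;3]] evaluates to [3;2;3].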

OCaml currying example

I am writing an OCaml function that accepts a function type, such as (fun _ -> true) and a list. This is what I currently have:
let drop_until_boolean (x: 'a -> bool) lst =
match lst with
| x -> true
Currently that written statement does not work properly, as it always evaluates to true.
When I call drop_until_boolean (fun _ -> true) [] I want it to return true, and when I call drop_until_boolean (fun _ -> true) ["a"] I want it to return false.
Question Summary: How do I make a function such that drop_until_boolean (fun _ -> true) [] evaluates to true.
Another example: drop_until_boolean (fun s -> s.[0]='z') ["z"] evaluates to true and drop_until_boolean (fun s -> s.[0]='z') ["y"] evaluates to false.
I managed to figure out what I wanted to do, probably did a terrible job explaining it. This is what I wanted.
let drop_until_boolean (x: 'a -> bool) lst = if (x lst) then true else false
Your current function says the following in English:
Take a function, call it x, and a second value of any type. Examine the second value. In all cases, no matter what the value, return true.
The variable x that appears in your match is a new variable that is matched against the second argument. Since it's just a simple variable, it always matches successfully. It has no relationship to the first parameter (which happens to be named x also).
It shouldn't be surprising that this function always returns true.
I'm not at all sure what you want the function to do. The name suggests it will return some trailing portion of the list that you give it. But you seem to be saying that it should return a boolean.
Let's assume that you want to do something reasonably simple with the second argument. You say the second argument is a list. The most common structure for a simple list-processing function is like this:
let rec my_function list =
  match list with
  | [] ->
      (* Handle case of empty list *)
  | head :: rest ->
      (* Handle case of non-empty list,
         probably with recursive call *)
Maybe you could think about this general structure as a possible solution to your problem. I hope it is helpful.
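As an illustration of that structure, here is a sketch of one possible reading of the name: drop leading elements until the predicate holds and return the rest of the list. This is only an assumption about the intended behaviour (the question itself asks for a boolean), and drop_until is a hypothetical name.

```ocaml
(* Drop leading elements until [p] holds; return the remaining list. *)
let rec drop_until p lst =
  match lst with
  | [] -> []                       (* empty list: nothing left to drop *)
  | head :: rest ->
      if p head then lst           (* predicate holds: keep from here on *)
      else drop_until p rest       (* otherwise drop [head] and recurse *)
```

Here the parameter p is actually applied to head, instead of being shadowed by a pattern variable as in the original attempt.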

How do I write a function to create a circular version of a list in OCaml?

It's possible to create infinite, circular lists using let rec, without needing to resort to mutable references:
let rec xs = 1 :: 0 :: xs ;;
But can I use this same technique to write a function that receives a finite list and returns an infinite, circular version of it? I tried writing
let rec cycle xs =
  let rec result = go xs and
          go = function
            | [] -> result
            | (y::ys) -> y :: go ys in
  result
;;
But got the following error
Error: This kind of expression is not allowed as right-hand side of `let rec'
Your code has two problems:
result = go xs is in an illegal form for let rec
The function tries to create a loop by some computation, which falls into an infinite loop causing stack overflow.
The above code is rejected by the compiler because you cannot write an expression which may cause recursive computation in the right-hand side of let rec (see Limitations of let rec in OCaml).
Even if you fix the issue you still have a problem: cycle does not finish the job:
let rec cycle xs =
  let rec go = function
    | [] -> go xs
    | y::ys -> y :: go ys
  in
  go xs;;
cycle [1;2];;
cycle [1;2] fails due to stack overflow.
In OCaml, let rec can define a looped structure only when its definition is "static" and does not perform any computation. let rec xs = 1 :: 0 :: xs is such an example: (::) is not a function but a constructor, which purely constructs the data structure. On the other hand, cycle performs some code execution to dynamically create a structure and it is infinite. I am afraid that you cannot write a function like cycle in OCaml.
If you want to build cyclic data the way cycle tries to, you can either use a lazy structure (like Haskell's lazy lists), so that the infinite loop is not evaluated immediately, or use mutation to close the loop by assignment. OCaml's built-in list is neither lazy nor mutable, so you cannot write a function that dynamically constructs a cyclic list.
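A minimal sketch of the lazy route, assuming a delayed sequence is an acceptable substitute for a real list. It uses the standard Seq module (OCaml 4.07 or later); take is a small helper written here for the demonstration.

```ocaml
(* An endless, lazily produced repetition of [xs]. Each element is
   computed only when the consumer asks for it, so nothing loops eagerly. *)
let cycle (xs : 'a list) : 'a Seq.t =
  match xs with
  | [] -> invalid_arg "cycle"
  | _ ->
      let rec go rest () =
        match rest with
        | [] -> go xs ()                     (* end reached: start over *)
        | y :: ys -> Seq.Cons (y, go ys)     (* tail stays unevaluated *)
      in
      go xs

(* Force only the first [n] elements of a sequence. *)
let rec take n (s : 'a Seq.t) =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest
```

For example, take 5 (cycle [1; 2]) gives [1; 2; 1; 2; 1]. Unlike the let rec value, this is not a cyclic structure in memory; the repetition is recomputed on demand.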
If you do not mind using black magic, you could try this code:
let cycle l =
  if l = [] then invalid_arg "cycle" else
  let l' = List.map (fun x -> x) l in   (* copy the list *)
  let rec aux = function
    | [] -> assert false
    | [_] as lst ->   (* find the last cons cell *)
        (* and set the last pointer to the beginning of the list *)
        Obj.set_field (Obj.repr lst) 1 (Obj.repr l')
    | _::t -> aux t
  in aux l'; l'
Please be aware that using the Obj module is highly discouraged. On the other hand, there are industrial-strength programs and libraries (Coq, Jane Street's Core, Batteries included) that are known to use this sort of forbidden art.
camlspotter's answer is good enough already. I just want to add several more points here.
First of all, a function that receives a finite list and returns an infinite, circular version of it can be written and will compile; but as soon as you actually call it, it overflows the stack and never returns.
A simple version of what you were trying to do is like this:
let rec circle1 xs = List.rev_append (List.rev xs) (circle1 xs)
val circle1 : 'a list -> 'a list = <fun>
It compiles, and in theory it is correct. On [1;2;3], it is supposed to generate [1;2;3;1;2;3;1;2;3;1;2;3;...].
However, it fails at run time, because the recursion never ends and eventually overflows the stack.
So why does let rec circle2 = 1::2::3::circle2 work?
Let's see what happens when you write it.
First, circle2 is a value, and it is a list. Once OCaml knows this, it can allocate a static address for circle2 with the memory representation of a list.
The actual value in memory is 1::2::3::circle2, which is really Node (1, Node (2, Node (3, circle2))), i.e., a Node holding the int 1 and the address of a Node holding the int 2 and the address of a Node holding the int 3 and the address of circle2. But we already know circle2's address, right? So OCaml just puts circle2's address there.
Everything works.
This example also shows that an infinite circular list defined this way uses only a bounded amount of memory. It does not generate a truly infinite list that consumes all memory; instead, when one pass around the circle finishes, it just jumps "back" to the head of the list.
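The point about memory can be checked directly. In this small sketch (take is a helper written for the demonstration), following the tail pointer three times leads back to the very same cons cell, which physical equality (==) confirms:

```ocaml
let rec circle2 = 1 :: 2 :: 3 :: circle2

(* Read the first [n] elements; never try to traverse the whole cycle. *)
let rec take n xs =
  match xs with
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

let () =
  assert (take 7 circle2 = [1; 2; 3; 1; 2; 3; 1]);
  (* Three tail steps bring us physically back to the first cell. *)
  assert (List.tl (List.tl (List.tl circle2)) == circle2)
```

Note that structural operations such as List.length circle2 would never terminate on such a value.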
Now let's go back to the circle1 example. circle1 is a function; yes, it has an address, but we do not need or want that. What we want is the address of the function application circle1 xs. Unlike circle2, that is a function application, which means we need to compute something to get the address. So,
OCaml will do List.rev xs, then try to get the address of circle1 xs, then repeat, and repeat.
OK, then why do we sometimes get Error: This kind of expression is not allowed as right-hand side of 'let rec'?
From http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3aletrecvalues
the let rec binding construct, in addition to the definition of
recursive functions, also supports a certain class of recursive
definitions of non-functional values, such as
let rec name1 = 1 :: name2 and name2 = 2 :: name1 in expr which
binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic
list 2::1::2::1::… Informally, the class of accepted definitions
consists of those definitions where the defined names occur only
inside function bodies or as argument to a data constructor.
If you use let rec to define a binding, say let rec name, then name may appear on the right-hand side only inside a function body or as an argument to a data constructor.
In the previous two examples, circle1 is in a function body (let rec circle1 = fun xs -> ...) and circle2 is an argument to a data constructor.
If you write let rec circle = circle, you get the error, because circle is in neither of the two allowed positions. let rec x = let y = x in y is rejected as well because, again, x is not inside a constructor or a function.
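A short sketch of the two accepted positions, with a rejected definition left in a comment. count_down and bad are illustrative names; name1 and name2 follow the manual's example.

```ocaml
(* Accepted: the defined names occur only as arguments to the (::) constructor. *)
let rec name1 = 1 :: name2
and name2 = 2 :: name1

(* Accepted: the defined name occurs only inside a function body. *)
let rec count_down = fun n -> if n <= 0 then [] else n :: count_down (n - 1)

(* Rejected if uncommented, because [bad] is passed to a function call
   that would have to run while [bad] is still being defined:

   let rec bad = List.rev bad
*)

let () =
  assert (List.hd name1 = 1 && List.hd name2 = 2);
  assert (List.tl name1 == name2);   (* the tail of name1 is name2 itself *)
  assert (count_down 3 = [3; 2; 1])
```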
Here is also a clear explanation:
https://realworldocaml.org/v1/en/html/imperative-programming-1.html
Section Limitations of let rec

A curry function that executes another function n times

I'm solving an old exam to practice SML. One task I found interesting was: Write a function repeat that executes another function with the signature 'a -> 'a.
I assumed the requested function is a curried function and used the o operator:
fun repeat (1, f: 'a->'a) = f
| repeat (n, f: 'a->'a) = f o repeat (n-1, f);
However, the o operator was not formally introduced in our course, and I wonder how I could write this without it.
Here is first the most explicit version (though not the least verbose), then a less verbose one, with explanations.
A curried function is a function that takes a single argument. If an expression needs more arguments, there are as many nested functions. The outermost function takes one argument and returns an inner function, which may itself return an inner function, and so on. Any of these inner functions may be returned, not just the innermost, as explained later (this is a kind of “partial evaluation”). An inner function is “specialized” with the arguments of the outer functions (formally, the arguments are bound in a closure).
We know there are at least a function argument f and an integer argument counter. There also needs to be a seed argument, to invoke the function f the first time.
The order of nesting may be arbitrary or specified. If not specified, I personally prefer to put the least varying arguments in the outer scope and the most varying in the inner scope. Here, I would say it is, from least to most varying: f, counter, seed.
This is already enough to suggest the beginning of a template:
val repeat: ('a -> 'a) -> int -> 'a -> 'a =
  fn f: 'a -> 'a =>
    fn count: int =>
      fn seed: 'a =>
        …
We have already implemented the ('a -> 'a) -> int -> 'a part of the signature. What remains is the last -> 'a, which means an 'a is to be returned; it will be computed by an inner loop.
A loop may be something of this form (in pseudo‑code):
val rec loop = fn i =>
  if condition-to-stop
  then return-something-or-`()`
  else loop (i + 1) or (i - 1)
If the loop is to compute something, it will need an extra argument acting as an accumulator, and will return that accumulator as its final result.
Implementing the loop and putting it inside the curried function template above, we get:
val repeat: ('a -> 'a) -> int -> 'a -> 'a =
  fn f: 'a -> 'a =>
    fn count: int =>
      fn seed: 'a =>
        let
          val rec loop = fn (counter, x) =>
            if counter <= 0 then x
            else loop (counter - 1, f x)
        in
          loop (count, seed)
        end
Do you understand the let … in … end construct here?
Note that the guard on counter could use a pattern as you did, but since SML's integers may be negative (there is no strict natural-number type in SML), it is safer to catch that case too, hence the if … then … else instead of pattern matching. Mileage may vary on that point, but that's not the question's focus.
The same as above, using fun instead of val rec:
fun repeat (f: 'a -> 'a) (count: int) (seed: 'a): 'a =
  let
    fun loop (counter, x) =
      if counter <= 0 then x
      else loop (counter - 1, f x)
  in
    loop (count, seed)
  end
Note that for repeat the arguments are not separated by a , (nor by a *). This is the way to write a curried function using fun (loop, by contrast, is not curried). Compare it with the prior val version of the same function. If only names are given and no types, the parentheses can be omitted.
A test function to be used as the f argument:
val appendAnX = fn s: string => s ^ "x"
The test:
val () = print (repeat appendAnX 5 "Blah:")
Curried functions are more abstract than functions taking a tuple (which is formally a single argument, and thus makes a curried function too, but that's another story and a bit of a cheat), as the outer function(s) may be partially applied:
This is a partial application, leaving the last argument, seed, unbound:
val repeatAppendAnXThreeTimes = repeat appendAnX 3
Then this function may be applied by specifying only the seed:
val () = print (repeatAppendAnXThreeTimes "Blah:")
Similarly, both counter and seed may be left free:
val repeatAppendAnX = repeat appendAnX
val () = print (repeatAppendAnX 4 "Blah:")
Another way of defining repeatAppendAnXThreeTimes. Compare it to its other definition above:
val repeatAppendAnXThreeTimes = repeatAppendAnX 3
val () = print (repeatAppendAnXThreeTimes "Blah:")
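For readers more familiar with OCaml, here is a rough translation of the same curried repeat and its partial applications. This is a sketch, not part of the SML answer; the snake_case names are adapted for OCaml, and the inner loop is curried here where the SML version takes a tuple.

```ocaml
(* Apply [f] to [seed] [count] times; counts <= 0 return [seed] unchanged. *)
let repeat f count seed =
  let rec loop counter x =
    if counter <= 0 then x
    else loop (counter - 1) (f x)
  in
  loop count seed

let append_an_x s = s ^ "x"

(* Partial applications: first fix [f], then also fix [count]. *)
let repeat_append_an_x = repeat append_an_x
let repeat_append_an_x_three_times = repeat_append_an_x 3

let () = print_endline (repeat_append_an_x_three_times "Blah:")
```

The last line prints Blah:xxx.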

Difficulty thinking of properties for FsCheck

I've managed to get xUnit working on my little sample assembly. Now I want to see if I can grok FsCheck too. My problem is that I'm stumped when it comes to defining test properties for my functions.
Maybe I've just not got a good sample set of functions, but what would be good test properties for these functions, for example?
//transforms [1;2;3;4] into [(1,2);(3,4)]
pairs : 'a list -> ('a * 'a) list
//splits list into list of lists when predicate returns
// true for adjacent elements
splitOn : ('a -> 'a -> bool) -> 'a list -> 'a list list
//returns true if snd is bigger
sndBigger : ('a * 'a) -> bool (requires comparison)
There are already plenty of specific answers, so I'll try to give some general answers which might give you some ideas.
Inductive properties for recursive functions. For simple functions, this amounts probably to re-implementing the recursion. However, keep it simple: while the actual implementation more often than not evolves (e.g. it becomes tail-recursive, you add memoization,...) keep the property straightforward. The ==> property combinator usually comes in handy here. Your pairs function might make a good example.
Properties that hold over several functions in a module or type. This is usually the case when checking abstract data types. For example: adding an element to an array means that the array contains that element. This checks the consistency of Array.add and Array.contains.
Round trips: this is good for conversions (e.g. parsing, serialization) - generate an arbitrary representation, serialize it, deserialize it, check that it equals the original.
You may be able to do this with splitOn and concat.
General properties as sanity checks. Look for generally known properties that may hold - things like commutativity, associativity, idempotence (applying something twice does not change the result), reflexivity, etc. The idea here is more to exercise the function a bit - see if it does anything really weird.
As a general piece of advice, try not to make too big a deal out of it. For sndBigger, a good property would be:
let ``should return true if and only if snd is bigger`` (a:int) (b:int) =
    sndBigger (a,b) = b > a
And that is probably exactly the implementation. Don't worry about it - sometimes a simple, old fashioned unit test is just what you need. No guilt necessary! :)
Maybe this link (by the Pex team) also gives some ideas.
I'll start with sndBigger - it is a very simple function, but you can write some properties that should hold about it. For example, what happens when you reverse the values in the tuple:
// Reversing values of the tuple negates the result
let swap (a, b) = (b, a)
let prop_sndBiggerSwap x =
    sndBigger x = not (sndBigger (swap x))
// If two elements of the tuple are same, it should give 'false'
let prop_sndBiggerEq a =
    sndBigger (a, a) = false
EDIT: This rule prop_sndBiggerSwap doesn't always hold (see comment by kvb). However the following should be correct:
// Reversing values of the tuple negates the result
let prop_sndBiggerSwap a b =
    if a <> b then
        let x = (a, b)
        sndBigger x = not (sndBigger (swap x))
    else true
Regarding the pairs function, kvb already posted some good ideas. In addition, you could check that turning the transformed list back into a list of elements returns the original list (you'll need to handle the case when the input list is odd - depending on what the pairs function should do in this case):
let prop_pairsEq (x:_ list) =
    if (x.Length%2 = 0) then
        x |> pairs |> List.collect (fun (a, b) -> [a; b]) = x
    else true
For splitOn, we can test similar thing - if you concatenate all the returned lists, it should give the original list (this doesn't verify the splitting behavior, but it is a good thing to start with - it at least guarantees that no elements will be lost).
let prop_splitOnEq f x =
    x |> splitOn f |> List.concat = x
I'm not sure if FsCheck can handle this though (!) because the property takes a function as an argument (so it would need to generate "random functions"). If this doesn't work, you'll need to provide a couple of more specific properties with some handwritten function f. Next, implementing the check that f returns true for all adjacent pairs in the split lists (as kvb suggests) isn't actually that difficult:
let prop_splitOnAdjacentTrue f x =
    x |> splitOn f
      |> List.forall (fun l ->
           l |> Seq.pairwise
             |> Seq.forall (fun (a, b) -> f a b))
Probably the only last thing that you could check is that f returns false when you give it the last element from one list and the first element from the next list. The following isn't fully complete, but it shows the way to go:
let prop_splitOnOtherFalse f x =
    x |> splitOn f
      |> Seq.pairwise
      |> Seq.forall (fun (a, b) -> not (f (lastElement a) (firstElement b)))
The last sample also shows that you should check whether the splitOn function can return an empty list as part of the returned list of results (because in that case, you couldn't find first/last element).
For some code (e.g. sndBigger), the implementation is so simple that any property will be at least as complex as the original code, so testing via FsCheck may not make sense. However, for the other two functions here are some things that you could check:
pairs
What's expected when the original length is not divisible by two? You could check for throwing an exception if that's the correct behavior.
List.map fst (pairs x) = evenEntries x and List.map snd (pairs x) = oddEntries x for simple functions evenEntries and oddEntries which you can write.
splitOn
If I understand your description of how the function is supposed to work, then you could check conditions like "For every list in the result of splitOn f l, no two consecutive entries satisfy f" and "Taking lists (l1,l2) from splitOn f l pairwise, f (last l1) (first l2) holds". Unfortunately, the logic here will probably be comparable in complexity to the implementation itself.
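To make the round-trip and size ideas for pairs concrete without depending on any testing library, here is a self-contained OCaml sketch. Everything in it is an assumption made for illustration: the pairs implementation (which silently drops a trailing unpaired element), the helper unpairs, and the tiny check driver that stands in for what FsCheck would do with generated inputs.

```ocaml
(* Hypothetical implementation under test: pair up neighbours and
   silently drop a trailing unpaired element (an assumption). *)
let rec pairs = function
  | a :: b :: rest -> (a, b) :: pairs rest
  | _ -> []

(* Inverse direction used by the round-trip property. *)
let unpairs ps = List.concat (List.map (fun (a, b) -> [a; b]) ps)

(* Round trip: for even-length input, unpairing the pairs gives the input back. *)
let prop_pairs_round_trip xs =
  List.length xs mod 2 <> 0 || unpairs (pairs xs) = xs

(* Size: the result has half as many elements as the input (rounded down). *)
let prop_pairs_length xs =
  List.length (pairs xs) = List.length xs / 2

(* Stand-in for a property-testing library: 100 random int lists
   from a fixed seed, so that runs are repeatable. *)
let check prop =
  let st = Random.State.make [| 42 |] in
  let random_list () =
    List.init (Random.State.int st 20) (fun _ -> Random.State.int st 100)
  in
  let rec loop n = n = 0 || (prop (random_list ()) && loop (n - 1)) in
  loop 100
```

Calling check prop_pairs_round_trip or check prop_pairs_length returns true when the property held on every generated list. A real property-testing library additionally shrinks failing inputs, which this driver does not attempt.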