Controlling async actions that might conflict in F# - concurrency

I have many actions (Async<T> list) to perform in F#. I can execute most of these actions in parallel, but some might conflict due to file locks etc.
For each action, I can generate a "key" (int) that determines if the actions might conflict:
If action a has key i and action b has key j and i = j, then a and b might conflict. They must be executed serially.
If action a has key i and action b has key j and i <> j, then a and b will never conflict. They may be executed in parallel.
I would like to execute my actions (int * Async<T>) list in an efficient way and without conflicts.
I imagine that the process would be something like:
Group all actions by key
Chain each group serially into one Async
Run each chain in parallel
How can I implement this in F#?
How are these problems usually handled?
My attempt at a fully sequential implementation:
let wrapTasks<'T> (tasks : (int * Async<'T>) list) : Async<'T list> = async {
return
tasks
|> Seq.map (fun (k, t) -> t |> Async.RunSynchronously)
|> Seq.toList
}

With a helper function taking a 'promise' for a value x and one for a set of values acc:
module Async =
let sequence x acc = async {
let! x = x
let! y = acc
return x :: y
}
we can asynchronously group the tasks by their 'lock id', clean up the resulting list a bit and then sequence each group into a single async that 'contains' the list of the results of its group. This list is then processed in parallel. Once ts : 'b list [] is available, we flatten it:
let wrapTasks tasks = async {
let! ts =
tasks
|> List.groupBy fst
|> List.map (snd >> List.map snd)
|> List.map (fun asyncs -> List.foldBack Async.sequence asyncs (async { return [] }))
|> Async.Parallel
return ts |> List.ofArray |> List.collect id
}
This can be tested with e.g.
List.init 50 (fun i -> i % 5, async {
let now = System.DateTime.UtcNow.Ticks
do! Async.Sleep 10
return i, now })
|> wrapTasks
|> Async.RunSynchronously
|> List.groupBy snd
|> List.map (fun (t, rs) -> t, rs |> List.map fst)
|> List.sort
By varying the divisor we can adjust the level of parallelism and convince ourselves that the function works as expected :-)
[(636766393199727614L, [0; 1; 2; 3; 4]);
(636766393199962986L, [5; 6; 7; 8; 9]);
(636766393200068008L, [10; 11; 12; 13; 14]);
(636766393200278385L, [15; 16; 17; 18; 19]);
(636766393200382690L, [20; 21; 22; 23; 24]);
(636766393200597692L, [25; 26; 27; 28; 29]);
(636766393200703235L, [30; 31; 32; 33; 34]);
(636766393200918241L, [35; 36; 37; 38; 39]);
(636766393201027938L, [40; 41; 42; 43; 44]);
(636766393201133307L, [45; 46; 47; 48; 49])]
Full disclosure: I had to execute the test a few times for getting this nice result. Usually numbers will be a bit off.

This is a possible solution:
let wrapTasks (tasks : (int * Async<'T>) list) =
tasks
|> List.groupBy fst
|> Seq.map (fun (k, ts) -> async {
for (i, t) in ts do
let! r = t
()
})
|> Async.Parallel
|> Async.RunSynchronously

Related

Adding no value to return list

I'm having a problem with understanding how F# works. I come from C# and I think that I'm trying to make F# work like C#. My biggest problem is returning values in the correct format.
Example:
Let's say I have function that takes a list of integers and an integer.
Function should print a list of indexes where values from list match passed integer.
My code:
let indeks myList n = myList |> List.mapi (fun i x -> if x=n then i else 0);;
indeks [0..4] 3;;
However it returns:
val it : int list = [0; 0; 0; 3; 0]
instead of just [3] as I cannot ommit else in that statement.
Also I have targeted signature of -> int list -> int -> int list and I get something else.
Same goes for problem no. 2 where I want to provide an integer and print every number from 0 to this integer n times (where n is the iterated value):
example:
MultiplyValues 3;;
output: [1;2;2;3;3;3]
Best I could do was to create list of lists.
What am I missing when returning elements?
How do I add nothing to the return
example: if x=n then n else AddNothingToTheReturn
Use List.choose:
let indeks lst n =
lst
|> List.mapi (fun i s -> if s = n then Some i else None)
|> List.choose id
Sorry, I didn't notice that you had a second problem too. For that you can use List.collect:
let f (n : int) : list<int> =
[1 .. n]
|> List.collect (fun s -> List.init s (fun t -> s))
printfn "%A" (f 3) // [1; 2; 2; 3; 3; 3]
Please read the documentation for List.collect for more information.
EDIT
Following s952163's lead, here is another version of the first solution without the Option type:
let indeks (lst : list<int>) (n : int) : list<int> =
lst
|> List.fold (fun (s, t) u -> s + 1, (if u = n then (s :: t) else t)) (0, [])
|> (snd >> List.rev)
This one traverses the original list once, and the (potentially much shorter) newly formed list once.
The previous answer is quite idiomatic. Here's one solution that avoids the use of Option types and id:
let indeks2 lst n =
lst
|> List.mapi (fun i x -> (i,x))
|> List.filter (fun x -> (fst x) % n = 0 )
|> List.map snd
You can modify the filter function to match your needs.
If you plan to generate lots of sequences it might be a good idea to explore Sequence (list) comprehensions:
[for i in 1..10 do
yield! List.replicate i i]
If statements are an expression in F# and they return a value. In this case both the IF and ELSE branch must return the same type of value. Using Some/None (Option type) gets around this. There are some cases where you can get away with just using If.

How can I cast a back to a type a value was before?

Very often when writing generic code in F# I come by a situation similar to this (I know this is quite inefficient, just for demonstration purposes):
let isPrime n =
let sq = n |> float |> sqrt |> int
{2..sq} |> Seq.forall (fun d -> n % d <> 0)
For many problems I can use statically resolved types and get even a performance boost due to inlining.
let inline isPrime (n:^a) =
let two = LanguagePrimitives.GenericOne + LanguagePrimitives.GenericOne
let sq = n |> float |> sqrt |> int
{two..sq} |> Seq.forall (fun d -> n % d <> LanguagePrimitives.GenericZero)
The code above won't compile because of the upper sequence limit being a float. Nongenerically, I could just cast back to int for example.
But the compiler won't let me use any of these:
let sq = n |> float |> sqrt :> ^a
let sq = n |> float |> sqrt :?> ^a
and these two lead to a InvalidCastException:
let sq = n |> float |> sqrt |> box |> :?> ^a
let sq = n |> float |> sqrt |> box |> unbox
Also, upcast and downcast are forbidden.
let sq = System.Convert.ChangeType(n |> float |> sqrt, n.GetType()) :?> ^a works, but seems very cumbersome to me.
Is there a way that I overlooked or do I really have to use the last version? Because the last one will also break for bigint, which I need quite often.
With the trick from FsControl, we can define generic function fromFloat:
open FsControl.Core
type FromFloat = FromFloat with
static member instance (FromFloat, _:int32 ) = fun (x:float) -> int x
static member instance (FromFloat, _:int64 ) = fun (x:float) -> int64 x
static member instance (FromFloat, _:bigint ) = fun (x:float) -> bigint x
let inline fromFloat (x:float):^a = Inline.instance FromFloat x
let inline isPrime (n:^a) =
let two = LanguagePrimitives.GenericOne + LanguagePrimitives.GenericOne
let sq = n |> float |> sqrt |> fromFloat
{two..sq} |> Seq.forall (fun d -> n % d <> LanguagePrimitives.GenericZero)
printfn "%A" <| isPrime 71
printfn "%A" <| isPrime 6L
printfn "%A" <| isPrime 23I
Inline.instance was defined here.

Parallel processing in F#

I'm playing around with async in F#. Does this look right, or am I mangling things?
let time f =
let before = System.DateTime.Now
f () |> ignore
let after = System.DateTime.Now
after - before;;
let rec fib = function 0 | 1 -> 1
| n -> fib (n - 1) + fib (n - 2);;
let source = [45; 40; 45; 40]
let synchronous = time <| fun () -> List.map fib source
let para = time <| fun () -> source
|> List.map (fun n -> async {ignore <| fib n})
|> Async.Parallel
|> Async.RunSynchronously
In particular, how do I return results from an async block? Do I have to use mutable state?
Update: here's another approach:
#r "FSharp.PowerPack.Parallel.Seq.dll"
open Microsoft.FSharp.Collections
let pseq = time <| fun () -> source
|> PSeq.map fib
|> PSeq.toList
Firstly, it's a bit of an anti-pattern to use async for parallel CPU processing. See these questions and answers for more information:
Why shouldn't I use F# asynchronous workflows for parallelism?
Task Parallel Library vs Async Workflows
Secondly, your fib function should be re-written to be tail recursive, here's an example from here (including changing to BigInt):
let fib n =
let rec loop acc1 acc2 = function
| n when n = 0I -> acc1
| n -> loop acc2 (acc1 + acc2) (n - 1I)
loop 0I 1I n
Finally, the full code:
let source = [| 45I; 40I; 45I; 40I |]
let sync = time <| fun () -> Array.map fib source
let para = time <| fun () -> Array.Parallel.map fib source
Note that in both cases an Array of the results is returned, you're just throwing it away in your time function. How about a time function that returns both the time and the result?
let time f =
let watch = new System.Diagnostics.Stopwatch()
watch.Start()
let res = f ()
watch.Stop()
(res, watch.ElapsedMilliseconds)
Usage remains the same, but now showing results:
printfn "Sync: %A in %ims" (fst sync) (snd sync)
printfn "Para: %A in %ims" (fst para) (snd para)

F# find in list of records the records with same id and add up their values

Iam an F# newbie, I have following starting point:
type aB = { ID: int; Slide: list<string * int> }
// examples of aB's
let aB1 = { ID = 1; Slide = [("-10%",-20); ("0%",0); ("10%",20)] }
let aB2 = { ID = 2; Slide = [("-10%",6); ("0%",0); ("10%",3)] }
let correctoraB2 = {ID = 2; Slide = [("-10%", -2); ("0%", 0); ("10%", -1)] }
// Now we bunch the aB`s in a list together
let bunchedABRaw = [aB1; aB2; correctoraB2]
This list can now become quite long, in this list, I need now to first identify all the aB's with identical ID's, then I want to net out their slides, so that a new list results
let bunchedABReduced = [aB1; aB2New], where
aB2New = { ID = 2; Slide = [("-10%",4); ("0%",0); ("10%",2)] }
I am reading through the F# library on msdn but so far I don't know yet how to solve the problem, would be very happy for code proposition.
Thanks a lot
Martin
OK working my way through this when I have a minute.
Here's the first part where you can merge the slides of two aB's:
// this function can merge two slides
let mergeSlides l1 l2 =
List.zip l1 l2
|> List.map (fun ((a1, b1), (a2,b2)) -> (a1, b1+b2))
// see what it does
mergeSlides aB2.Slide correctoraB2.Slide
This bit groups all the aB's with the same Id:
let grp = bunchedABRaw
|> Seq.groupBy (fun a -> a.ID)
And now we can use mergeSlides as a folding function, that we use fold over each sequence of Ab's with the same Id to make the netted aB.
So here's the whole thing:
let mergeSlides l1 l2 = 
    List.zip l1 l2
    |> List.map (fun ((a1, b1), (a2,b2)) -> (a1, b1+b2))
let net =
bunchedABRaw
|> Seq.groupBy (fun a -> a.ID)
|> Seq.map (fun (i, s) -> (i, s |> Seq.map (fun a -> a.Slide))) // strip away the slides
|> Seq.map (fun (i, s) -> (i, List.ofSeq s)) // turn seq<slide> into list<slide>
|> Seq.map (fun (i, l) -> (i, List.fold mergeSlides l.Head l.Tail)) // so we can use HEad and Tail
|> Seq.map (fun (i, l) -> {ID=i;Slide=l}) // and Project into aB
|> List.ofSeq // and then List
Enjoy!
Try this:
Set up a dictionary where the keys will be the IDs you encounter and the values will be a "netted" aB type for that ID.
Then run a fold* on the list using the dictionary as your state and have the function you fold across the list accumulate the items in the dictionary by ID ("netting" them as you go).
After that you can put all the dictionary's values into a return list.
If you can't "net" them as you go then you could store a list of items as values instead of a single "netted" value and then do the netting after the fold finishes.
*fold http://msdn.microsoft.com/en-us/library/ee353894.aspx
EDIT: Made some things clearer

How to partition a list with a given group size?

I'm looking for the best way to partition a list (or seq) so that groups have a given size.
for ex. let's say I want to group with size 2 (this could be any other number though):
let xs = [(a,b,c); (a,b,d); (y,z,y); (w,y,z); (n,y,z)]
let grouped = partitionBySize 2 input
// => [[(a,b,c);(a,b,d)]; [(y,z,y);(w,y,z)]; [(n,y,z)]]
The obvious way to implement partitionBySize would be by adding the position to every tuple in the input list so that it becomes
[(0,a,b,c), (1,a,b,d), (2,y,z,y), (3,w,y,z), (4,n,y,z)]
and then use GroupBy with
xs |> Seq.ofList |> Seq.GroupBy (function | (i,_,_,_) -> i - (i % n))
However this solution doesn't look very elegant to me.
Is there a better way to implement this function (maybe with a built-in function)?
This seems to be a repeating pattern that's not captured by any function in the F# core library. When solving similar problems earlier, I defined a function Seq.groupWhen (see F# snippets) that turns a sequence into groups. A new group is started when the predicate holds.
You could solve the problem using Seq.groupWhen similarly to Seq.group (by starting a new group at even index). Unlike with Seq.group, this is efficient, because Seq.groupWhen iterates over the input sequence just once:
[3;3;2;4;1;2;8]
|> Seq.mapi (fun i v -> i, v) // Add indices to the values (as first tuple element)
|> Seq.groupWhen (fun (i, v) -> i%2 = 0) // Start new group after every 2nd element
|> Seq.map (Seq.map snd) // Remove indices from the values
Implementing the function directly using recursion is probably easier - the solution from John does exactly what you need - but if you wanted to see a more general approach then Seq.groupWhen may be interesting.
List.chunkBySize (hat tip: Scott Wlaschin) is now available and does exactly what you're talking about. It appears to be new with F# 4.0.
let grouped = [1..10] |> List.chunkBySize 3
// val grouped : int list list =
// [[1; 2; 3]; [4; 5; 6]; [7; 8; 9]; [10]]
Seq.chunkBySize and Array.chunkBySize are also now available.
Here's a tail-recursive function that traverses the list once.
let chunksOf n items =
let rec loop i acc items =
seq {
match i, items, acc with
//exit if chunk size is zero or input list is empty
| _, [], [] | 0, _, [] -> ()
//counter=0 so yield group and continue looping
| 0, _, _::_ -> yield List.rev acc; yield! loop n [] items
//decrement counter, add head to group, and loop through tail
| _, h::t, _ -> yield! loop (i-1) (h::acc) t
//reached the end of input list, yield accumulated elements
//handles items.Length % n <> 0
| _, [], _ -> yield List.rev acc
}
loop n [] items
Usage
[1; 2; 3; 4; 5]
|> chunksOf 2
|> Seq.toList //[[1; 2]; [3; 4]; [5]]
I like the elegance of Tomas' approach, but I benchmarked both our functions using an input list of 10 million elements. This one clocked in at 9 secs vs 22 for his. Of course, as he admitted, the most efficient method would probably involve arrays/loops.
What about a recursive approach? - only requires a single pass
let rec partitionBySize length inp dummy =
match inp with
|h::t ->
if dummy |> List.length < length then
partitionBySize length t (h::dummy)
else dummy::(partitionBySize length t (h::[]))
|[] -> dummy::[]
Then invoke it with partitionBySize 2 xs []
let partitionBySize size xs =
let sq = ref (seq xs)
seq {
while (Seq.length !sq >= size) do
yield Seq.take size !sq
sq := Seq.skip size !sq
if not (Seq.isEmpty !sq) then yield !sq
}
// result to list, if you want
|> Seq.map (Seq.toList)
|> Seq.toList
UPDATE
let partitionBySize size (sq:seq<_>) =
seq {
let e = sq.GetEnumerator()
let empty = ref true;
while !empty do
yield seq { for i = 1 to size do
empty := e.MoveNext()
if !empty then yield e.Current
}
}
array slice version:
let partitionBySize size xs =
let xa = Array.ofList xs
let len = xa.Length
[
for i in 0..size..(len-1) do
yield ( if i + size >= len then xa.[i..] else xa.[i..(i+size-1)] ) |> Array.toList
]
Well, I was late for the party. The code below is a tail-recursive version using high-order functions on List:
let partitionBySize size xs =
let i = size - (List.length xs - 1) % size
let xss, _, _ =
List.foldBack( fun x (acc, ls, j) ->
if j = size then ((x::ls)::acc, [], 1)
else (acc, x::ls, j+1)
) xs ([], [], i)
xss
I did the same benchmark as Daniel did. This function is efficient while it is 2x faster than his approach on my machine. I also compared it with an array/loop version, they are comparable in terms of performance.
Moreover, unlike John's answer, this version preserves order of elements in inner lists.