Getting result of a spawned function in Erlang - list

My objective at the moment is to write Erlang code calculating a list of N elements, where each element is a factorial of it's "index" (so, for N = 10 I would like to get [1!, 2!, 3!, ..., 10!]). What's more, I would like every element to be calculated in a seperate process (I know it is simply inefficient, but I am expected to implement it and compare its efficiency with other methods later).
In my code, I wanted to use one function as a "loop" over given N, that for N, N-1, N-2... spawns a process which calculates factorial(N) and sends the result to some "collecting" function, which packs received results into a list. I know my concept is probably overcomplicated, so hopefully the code will explain a bit more:
messageFactorial(N, listPID) ->
listPID ! factorial(N). %% send calculated factorial to "collector".
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nProcessesFactorialList(-1) ->
ok;
nProcessesFactorialList(N) ->
spawn(pFactorial, messageFactorial, [N, listPID]), %%for each N spawn...
nProcessesFactorialList(N-1).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
listPrepare(List) -> %% "collector", for the last factorial returns
receive %% a list of factorials (1! = 1).
1 -> List;
X ->
listPrepare([X | List])
end.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
startProcessesFactorialList(N) ->
register(listPID, spawn(pFactorial, listPrepare, [[]])),
nProcessesFactorialList(N).
I guess it shall work, by which I mean that listPrepare finally returns a list of factorials. But the problem is, I do not know how to get that list, how to get what it returned? As for now my code returns ok, as this is what nProcessesFactorialList returns at its finish. I thought about sending the List of results from listPrepare to nProcessesFactorialList in the end, but then it would also need to be a registered process, from which I wouldn't know how to recover that list.
So basically, how to get the result from a registered process running listPrepare (which is my list of factorials)? If my code is not right at all, I would ask for a suggestion of how to get it better. Thanks in advance.

My way how to do this sort of tasks is
-module(par_fact).
-export([calc/1]).
fact(X) -> fact(X, 1).
fact(0, R) -> R;
fact(X, R) when X > 0 -> fact(X-1, R*X).
calc(N) ->
Self = self(),
Pids = [ spawn_link(fun() -> Self ! {self(), {X, fact(X)}} end)
|| X <- lists:seq(1, N) ],
[ receive {Pid, R} -> R end || Pid <- Pids ].
and result:
> par_fact:calc(25).
[{1,1},
{2,2},
{3,6},
{4,24},
{5,120},
{6,720},
{7,5040},
{8,40320},
{9,362880},
{10,3628800},
{11,39916800},
{12,479001600},
{13,6227020800},
{14,87178291200},
{15,1307674368000},
{16,20922789888000},
{17,355687428096000},
{18,6402373705728000},
{19,121645100408832000},
{20,2432902008176640000},
{21,51090942171709440000},
{22,1124000727777607680000},
{23,25852016738884976640000},
{24,620448401733239439360000},
{25,15511210043330985984000000}]

The first problem is that your listPrepare process doesn't do anything with the result. Try to print it in the end.
The second problem is that you don't wait for all the processes to finish, but for process that sends 1 and this is the quickest factorial to calculate. So this message will surely be received before the more complex will be calculated, and you'll end up with only a few responses.
I had answered a bit similar question on the parallel work with many processes here: Create list across many processes in Erlang Maybe that one will help you.

I propose you this solution:
-export([launch/1,fact/2]).
launch(N) ->
launch(N,N).
% launch(Current,Total)
% when all processes are launched go to the result collect phase
launch(-1,N) -> collect(N+1);
launch(I,N) ->
% fact will be executed in a new process, so the normal way to get the answer is by message passing
% need to give the current process pid to get the answer back from the spawned process
spawn(?MODULE,fact,[I,self()]),
% loop until all processes are launched
launch(I-1,N).
% simply send the result to Pid.
fact(N,Pid) -> Pid ! {N,fact_1(N,1)}.
fact_1(I,R) when I < 2 -> R;
fact_1(I,R) -> fact_1(I-1,R*I).
% init the collect phase with an empty result list
collect(N) -> collect(N,[]).
% collect(Remaining_result_to_collect,Result_list)
collect(0,L) -> L;
% accumulate the results in L and loop until all messages are received
collect(N,L) ->
receive
R -> collect(N-1,[R|L])
end.
but a much more straight (single process) solution could be:
1> F = fun(N) -> lists:foldl(fun(I,[{X,R}|Q]) -> [{I,R*I},{X,R}|Q] end, [{0,1}], lists:seq(1,N)) end.
#Fun<erl_eval.6.80484245>
2> F(6).
[{6,720},{5,120},{4,24},{3,6},{2,2},{1,1},{0,1}]
[edit]
On a system with multicore, cache and an multitask underlying system, there is absolutly no guarantee on the order of execution, same thing on message sending. The only guarantee is in the message queue where you know that you will analyse the messages according to the order of message reception. So I agree with Dmitry, your stop condition is not 100% effective.
In addition, using startProcessesFactorialList, you spawn listPrepare which collect effectively all the factorial values (except 1!) and then simply forget the result at the end of the process, I guess this code snippet is not exactly the one you use for testing.

Related

Erlang register error

I'm writing a program which will take two strings and concatenate them as a shared dropbox stimulation. I'm using code from a different application, which did a similar thing with a joint bank account, so the errors may be because I haven't changed some line of code properly but I just can't work out what’s wrong.
The code is written in two separate files and they link together, the basic dropbox is first and then the code which links that and displays the answer is below.
-module(dropbox).
-export([account/1, start/0, stop/0, deposit/1, get_bal/0, set_bal/1]).
account(Balance) ->
receive
{set, NewBalance} ->
account(NewBalance);
{get, From} ->
From ! {balance, Balance},
account(Balance);
stop -> ok
end.
start() ->
Account_PID = spawn(dropbox, account, [0]),
register(account_process, Account_PID).
stop() ->
account_process ! stop,
unregister(account_process).
set_bal(B) ->
account_process ! {set, B}.
get_bal() ->
account_process ! {get, self()},
receive
{balance, B} -> B
end.
deposit(Amount) ->
OldBalance = get_bal(),
NewBalance = OldBalance ++ Amount,
set_bal(NewBalance).
-module(dropboxtest).
-export([start/0, client/1]).
start() ->
dropbox:start(),
mutex:start(),
register(tester_process, self()),
loop("hello ", "world", 100),
unregister(tester_process),
mutex:stop(),
dropbox:stop().
loop(_, _, 0) ->
true;
loop(Amount1, Amount2, N) ->
dropbox:set_bal(" "),
spawn(dropboxtest, client, [Amount1]),
spawn(dropboxtest, client, [Amount2]),
receive
done -> true
end,
receive
done -> true
end,
io:format("Expected balance = ~p, actual balance = ~p~n~n",
[Amount1 ++ Amount2, dropbox:get_bal()]),
loop(Amount1, Amount2, N-1).
client(Amount) ->
dropbox:deposit(Amount),
tester_process ! done.
This is the error which I'm getting, all of the other ones I've managed to work out but I don't quite get this one so I'm not sure how to solve it.
** exception error: bad argument
in function register/2
called as register(account_process,<0.56.0>)
in call from dropbox:start/0 (dropbox.erl, line 16)
in call from dropboxtest:start/0 (dropboxtest.erl, line 5)
Also I know that this is going to come up with errors due to concurrency issues, I need to show these errors to prove what’s wrong before I can fix it. Some of the functions haven't been changed from the bank program hence balance etc.
As per the documentation, register can fail with badarg for a number of reasons:
If PidOrPort is not an existing local process or port.
If RegName is already in use.
If the process or port is already registered (already has a name).
If RegName is the atom undefined.
In this case I suspect it's the second reason, that there's already a process with the name account_process, from a previous run. You could try restarting the Erlang shell, or you could change the spawn call in dropbox:start to spawn_link, which would cause the old process to crash in case of any error in the shell.

Erlang concurrency understanding

Lately I've been trying to understand concurrent servers in Erlang. Consider the following code that makes requests to the server. Depending on the
particular order of execution, different values may get printed by the 3 processes. What are the orders and what is the highest and lowest value of each process?
test() ->
Server = start(),
spawn(fun() ->
incr(Server),
io:format("Child 1 read ~p~n", [read(Server)]) end),
incr(Server),
spawn(fun() ->
incr(Server),
io:format("Child 2 read ~p~n", [read(Server)]) end),
io:format("Parent read ~p~n", [read(Server)]).
The code runs against the server below:
-module(p4).
-export([start/0, init/0, read/1, incr/1, reset/1]).
start() ->
spawn(fun() -> init() end).
init() -> loop(0).
loop(N) ->
receive
{read, Pid} ->
Pid ! {value, self(), N},
loop(N);
{incr, Pid} ->
Pid ! {incr_reply, self()},
loop(N+1);
{reset, Pid} ->
Pid ! {reset_reply, self()},
loop(0)
end.
read(Serv) ->
Serv ! {read, self()},
receive {value, Serv, N} -> N end.
incr(Serv) ->
Serv ! {incr, self()},
receive {incr_reply, Serv} -> ok end.
reset(Serv) ->
Serv ! {reset, self()},
receive {reset_reply, Serv} -> ok end.
Parent: Lowest = 1 Highest = 3
Child1: Lowest = 1 Highest = 3
Child2: Lowest = 1 Highest = 2
I'm not completely sure about the orders, but I guess it could be that:
Child1 can read 1, 2 and 3
Parent can read 1, 2 and 3
Child2 can read 1 and 2
Is this correct for both the lowest, highest values and the orders?
The initial value in the loop is 0. The server's increment operation replies to the caller before performing the increment, but that doesn't matter because no messages are processed between the sending of that reply and the actual increment. Each read message results in a reply containing the effects of all increment messages that arrived before it. Because of guaranteed message ordering from one process to another, any process that increments then reads is guaranteed to read at least its own increment. The server's read operation simply replies with the current loop value. The reset operation is unused.
Child1 increments, then reads. It runs concurrently with Parent initially and then later with Child2 as well, both of which also increment. It can therefore read 1 from just its own increment, 2 from its own increment and that of its parent, or 3 if its read also picks up the increment from Child2.
Child2 also increments, then reads, but it doesn't start until after the Parent has already incremented. The minimum it can read is therefore 2, and since it runs concurrently with Child1, it could alternatively read a 3.
Parent increments, then reads, so the minimum it can read is 1. Its read runs concurrently with Child1 and Child2, so if its read happens before either of their increments, it sees a 1. It could alternatively read a 2 if its read picks up either of the child increments, or a 3 if its read picks up both child increments.

How to fully utilise `lwt` in this case

Here is what I am going to do:
I have a list of task and I need to run them all every 1 hour (scheduling).
All those tasks are similar. for example, for one task, I need to download some data from a server (using http protocol and would take 5 - 8 seconds) and then do a computation on the data (would take 1 - 5 seconds).
I think I can use lwt to achieve these, but can't figure out the best way for efficiency.
For the task scheduling part, I can do like this (How to schedule a task in OCaml?):
let rec start () =
(Lwt_unix.sleep 1.) >>= (fun () -> print_endline "Hello, world !"; start ())
let _ = Lwt_main.run (start())
The questions come from the actual do_task part.
So a task involves http download and computation.
The http download part would have to wait for 5 to 8 seconds. If I really execute each task one by one, then it wastes the bandwidth and of course, I wish the download process of all tasks to be in parallel. So should I put this download part to lwt? and will lwt handle all the downloads in parallel?
By code, should I do like this?:
let content = function
| Some (_, body) -> Cohttp_lwt_unix.Body.string_of_body body
| _ -> return ""
let download task =
Cohttp_lwt_unix.Client.get ("http://dataserver/task?name="^task.name)
let get_data task =
(download task) >>= (fun response -> Lwt.return (Content response))
let do_task task =
(get_data task) >>= (fun data -> Lwt.return_unit (calculate data))
So, through the code above, will all tasks be executed in parallel, at least for the http download part?
For calculation part, will all calculations be executed in sequence?
Furthermore, can any one briefly describe the mechanism of lwt? Internally, what is the logic of light weight thread? Why can it handle IO in parallel?
To do parallel computation using lwt, you can check the lwt_list module, and especially iter_p.
val iter_p : ('a -> unit Lwt.t) -> 'a list -> unit Lwt.t
iter_p f l call the function f on each element of l, then waits for all the threads to terminate. For your purpose, it would look like :
let do_tasks tasks = List.iter_p do_task tasks
Assuming that "tasks" is a list of task.

catching a deadlock in a simple odd-even sending

I'm trying to solve a simple problem with MPI, my implementation is MPICH2 and my code is in fortran. I have used the blocking send and receive, the idea is so simple but when I run it it crashes!!! I have absolutely no idea what is wrong? can anyone make quote on this issue please? there is a piece of the code:
integer, parameter :: IM=100, JM=100
REAL, ALLOCATABLE :: T(:,:), TF(:,:)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,RNK,IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,SIZ,IERR)
prv = rnk-1
nxt = rnk+1
LIM = INT(IM/SIZ)
IF (rnk==0) THEN
ALLOCATE(TF(IM,JM))
prv = MPI_PROC_NULL
ELSEIF(rnk==siz-1) THEN
NXT = MPI_PROC_NULL
LIM = LIM+MOD(IM,SIZ)
END IF
IF (MOD(RNK,2)==0) THEN
CALL MPI_SEND(T(2,:),JM+2,MPI_REAL,PRV,10,MPI_COMM_WORLD,IERR)
CALL MPI_RECV(T(1,:),JM+2,MPI_REAL,PRV,20,MPI_COMM_WORLD,STAT,IERR)
ELSE
CALL MPI_RECV(T(LIM+2,:),JM+2,MPI_REAL,NXT,10,MPI_COMM_WORLD,STAT,IERR)
CALL MPI_SEND(T(LIM+1,:),JM+2,MPI_REAL,NXT,20,MPI_COMM_WORLD,IERR)
END IF
as I understood even processes are not receiving anything while the odd ones finish sending successfully, in some cases when I added some print to observe what is going on I saw that the variable NXT is changing during the sending procedure!!! for example all the odd process was sending message to process 0 not their next one!
The array T is not allocated so reading or writing from it is an error.
I can't see the whole program, but some observation on what I can see:
1) make sure that rnk, size, and prv are integers. Likely, prv is real (by default
typing rules) and you are sending a real to an integer, so tags don't match, hence deadlock.
2) I'd use sendrcv rather than send / recv ; recv / send code section. Two sendrecv's are cleaner ( 2 lines of code vs. 7), guaranteed not to deadlock, and is faster when you have bi-directional links (almost always true.)

Is F# really faster than Erlang at spawning and killing processes?

Updated: This question contains an error which makes the benchmark meaningless. I will attempt a better benchmark comparing F# and Erlang's basic concurrency functionality and inquire about the results in another question.
I am trying do understand the performance characteristics of Erlang and F#. I find Erlang's concurrency model very appealing but am inclined to use F# for interoperability reasons. While out of the box F# doesn't offer anything like Erlang's concurrency primitives -- from what I can tell async and MailboxProcessor only cover a small portion of what Erlang does well -- I've been trying to understand what is possible in F# performance wise.
In Joe Armstrong's Programming Erlang book, he makes the point that processes are very cheap in Erlang. He uses the (roughly) the following code to demonstrate this fact:
-module(processes).
-export([max/1]).
%% max(N)
%% Create N processes then destroy them
%% See how much time this takes
max(N) ->
statistics(runtime),
statistics(wall_clock),
L = for(1, N, fun() -> spawn(fun() -> wait() end) end),
{_, Time1} = statistics(runtime),
{_, Time2} = statistics(wall_clock),
lists:foreach(fun(Pid) -> Pid ! die end, L),
U1 = Time1 * 1000 / N,
U2 = Time2 * 1000 / N,
io:format("Process spawn time=~p (~p) microseconds~n",
[U1, U2]).
wait() ->
receive
die -> void
end.
for(N, N, F) -> [F()];
for(I, N, F) -> [F()|for(I+1, N, F)].
On my Macbook Pro, spawning and killing 100 thousand processes (processes:max(100000)) takes about 8 microseconds per processes. I can raise the number of processes a bit further, but a million seems to break things pretty consistently.
Knowing very little F#, I tried to implement this example using async and MailBoxProcessor. My attempt, which may be wrong, is as follows:
#r "System.dll"
open System.Diagnostics
type waitMsg =
| Die
let wait =
MailboxProcessor.Start(fun inbox ->
let rec loop =
async { let! msg = inbox.Receive()
match msg with
| Die -> return() }
loop)
let max N =
printfn "Started!"
let stopwatch = new Stopwatch()
stopwatch.Start()
let actors = [for i in 1 .. N do yield wait]
for actor in actors do
actor.Post(Die)
stopwatch.Stop()
printfn "Process spawn time=%f microseconds." (stopwatch.Elapsed.TotalMilliseconds * 1000.0 / float(N))
printfn "Done."
Using F# on Mono, starting and killing 100,000 actors/processors takes under 2 microseconds per process, roughly 4 times faster than Erlang. More importantly, perhaps, is that I can scale up to millions of processes without any apparent problems. Starting 1 or 2 million processes still takes about 2 microseconds per process. Starting 20 million processors is still feasible, but slows to about 6 microseconds per process.
I have not yet taken the time to fully understand how F# implements async and MailBoxProcessor, but these results are encouraging. Is there something I'm doing horribly wrong?
If not, is there some place Erlang will likely outperform F#? Is there any reason Erlang's concurrency primitives can't be brought to F# through a library?
EDIT: The above numbers are wrong, due to the error Brian pointed out. I will update the entire question when I fix it.
In your original code, you only started one MailboxProcessor. Make wait() a function, and call it with each yield. Also you are not waiting for them to spin up or receive the messages, which I think invalidates the timing info; see my code below.
That said, I have some success; on my box I can do 100,000 at about 25us each. After too much more, I think possibly you start fighting the allocator/GC as much as anything, but I was able to do a million too (at about 27us each, but at this point was using like 1.5G of memory).
Basically each 'suspended async' (which is the state when a mailbox is waiting on a line like
let! msg = inbox.Receive()
) only takes some number of bytes while it's blocked. That's why you can have way, way, way more asyncs than threads; a thread typically takes like a megabyte of memory or more.
Ok, here's the code I'm using. You can use a small number like 10, and --define DEBUG to ensure the program semantics are what is desired (printf outputs may be interleaved, but you'll get the idea).
open System.Diagnostics
let MAX = 100000
type waitMsg =
| Die
let mutable countDown = MAX
let mre = new System.Threading.ManualResetEvent(false)
let wait(i) =
MailboxProcessor.Start(fun inbox ->
let rec loop =
async {
#if DEBUG
printfn "I am mbox #%d" i
#endif
if System.Threading.Interlocked.Decrement(&countDown) = 0 then
mre.Set() |> ignore
let! msg = inbox.Receive()
match msg with
| Die ->
#if DEBUG
printfn "mbox #%d died" i
#endif
if System.Threading.Interlocked.Decrement(&countDown) = 0 then
mre.Set() |> ignore
return() }
loop)
let max N =
printfn "Started!"
let stopwatch = new Stopwatch()
stopwatch.Start()
let actors = [for i in 1 .. N do yield wait(i)]
mre.WaitOne() |> ignore // ensure they have all spun up
mre.Reset() |> ignore
countDown <- MAX
for actor in actors do
actor.Post(Die)
mre.WaitOne() |> ignore // ensure they have all got the message
stopwatch.Stop()
printfn "Process spawn time=%f microseconds." (stopwatch.Elapsed.TotalMilliseconds * 1000.0 / float(N))
printfn "Done."
max MAX
All this said, I don't know Erlang, and I have not thought deeply about whether there's a way to trim down the F# any more (though it's pretty idiomatic as-is).
Erlang's VM doesn't uses OS threads or process to switch to new Erlang process. It's VM simply counts function calls into your code/process and jumps to other VM's process after some (into same OS process and same OS thread).
CLR uses mechanics based on OS process and threads, so F# has much higher overhead cost for each context switch.
So answer to your question is "No, Erlang is much faster than spawning and killing processes".
P.S. You can find results of that practical contest interesting.