Erlang concurrency understanding

Lately I've been trying to understand concurrent servers in Erlang. Consider the following code, which makes requests to the server. Depending on the particular order of execution, different values may get printed by the 3 processes. What are the possible orders, and what are the highest and lowest values each process can read?
test() ->
    Server = start(),
    spawn(fun() ->
              incr(Server),
              io:format("Child 1 read ~p~n", [read(Server)])
          end),
    incr(Server),
    spawn(fun() ->
              incr(Server),
              io:format("Child 2 read ~p~n", [read(Server)])
          end),
    io:format("Parent read ~p~n", [read(Server)]).
The code runs against the server below:
-module(p4).
-export([start/0, init/0, read/1, incr/1, reset/1]).

start() ->
    spawn(fun() -> init() end).

init() -> loop(0).

loop(N) ->
    receive
        {read, Pid} ->
            Pid ! {value, self(), N},
            loop(N);
        {incr, Pid} ->
            Pid ! {incr_reply, self()},
            loop(N+1);
        {reset, Pid} ->
            Pid ! {reset_reply, self()},
            loop(0)
    end.

read(Serv) ->
    Serv ! {read, self()},
    receive {value, Serv, N} -> N end.

incr(Serv) ->
    Serv ! {incr, self()},
    receive {incr_reply, Serv} -> ok end.

reset(Serv) ->
    Serv ! {reset, self()},
    receive {reset_reply, Serv} -> ok end.
Parent: Lowest = 1 Highest = 3
Child1: Lowest = 1 Highest = 3
Child2: Lowest = 1 Highest = 2
I'm not completely sure about the orders, but I guess it could be that:
Child1 can read 1, 2 and 3
Parent can read 1, 2 and 3
Child2 can read 1 and 2
Is this correct, for both the lowest and highest values and for the orders?

The initial value in the loop is 0. The server's increment operation replies to the caller before performing the increment, but that doesn't matter because no messages are processed between the sending of that reply and the actual increment. Each read message results in a reply containing the effects of all increment messages that arrived before it. Because of guaranteed message ordering from one process to another, any process that increments then reads is guaranteed to read at least its own increment. The server's read operation simply replies with the current loop value. The reset operation is unused.
Child1 increments, then reads. It runs concurrently with Parent initially and then later with Child2 as well, both of which also increment. It can therefore read 1 from just its own increment, 2 from its own increment and that of its parent, or 3 if its read also picks up the increment from Child2.
Child2 also increments, then reads, but it doesn't start until after the Parent has already incremented. The minimum it can read is therefore 2, and since it runs concurrently with Child1, it could alternatively read a 3.
Parent increments, then reads, so the minimum it can read is 1. Its read runs concurrently with Child1 and Child2, so if its read happens before either of their increments, it sees a 1. It could alternatively read a 2 if its read picks up either of the child increments, or a 3 if its read picks up both child increments.
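Since all of these interleavings are legal, the easiest way to convince yourself is to run test/0 many times and watch the printed values vary between runs. A minimal sketch, assuming it is added to the same module as test/0 (observe/1 is just a made-up helper, not part of the original code):
%% Hypothetical helper: run test/0 repeatedly so the different interleavings
%% (and therefore different printed values) get a chance to show up.
observe(0) ->
    ok;
observe(K) when K > 0 ->
    test(),
    timer:sleep(20),   %% give the two children time to print before the next run
    observe(K - 1).
Note that the rarer orderings (such as the Parent reading 1) may take many runs to appear, because the scheduler behaves very predictably for such a short program.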

Concurrent process execution order

Trying to figure out how Erlang concurrency works. For testing, I have the following modules:
server.erl:
-module(server).
-export([loop/0]).

loop() ->
    receive
        {foo, Msg_foo} ->
            io:format("~w~n", [Msg_foo]),
            loop();
        {bar, Msg_bar} ->
            io:format("~w~n", [Msg_bar]),
            loop();
        stop ->
            io:format("~s~n", ["End server process"]),
            true
    end.
process_a.erl:
-module(process_a).
-export([go_a/0]).

go_a() ->
    receive
        {foo, Pid1} ->
            Pid1 ! {foo, 'Message foo from process A'},
            go_a();
        {bar, Pid2} ->
            Pid2 ! {bar, 'Message bar from process A'},
            go_a()
    end.
process_b.erl:
-module(process_b).
-export([go_b/0]).

go_b() ->
    receive
        {foo, Pid1} ->
            Pid1 ! {foo, 'Message foo from process B'},
            go_b();
        {bar, Pid2} ->
            Pid2 ! {bar, 'Message bar from process B'},
            go_b()
    end.
client.erl:
-module(client).
-export([start/0]).
-import(server, [loop/0]).
-import(process_a, [go_a/0]).
-import(process_b, [go_b/0]).

go() ->
    Server_Pid = spawn(server, loop, []),
    Pid_A = spawn(process_a, go_a, []),
    Pid_B = spawn(process_b, go_b, []),
    Pid_A ! {foo, Server_Pid},
    Pid_B ! {bar, Server_Pid},
    Pid_A ! {bar, Server_Pid},
    Pid_B ! {foo, Server_Pid},
    Pid_A ! {foo, Server_Pid},
    Pid_B ! {foo, Server_Pid},
    Pid_A ! {bar, Server_Pid},
    Pid_B ! {bar, Server_Pid}.

start() ->
    go().
The client sends messages to process A and process B which in turn send messages to the server. The order of the messages is:
A foo
B bar
A bar
B foo
A foo
B foo
A bar
B bar
but the program output is:
'Message foo from process A'
'Message bar from process A'
'Message foo from process A'
'Message bar from process A'
'Message bar from process B'
'Message foo from process B'
'Message foo from process B'
'Message bar from process B'
The server first processes all messages from process A, then all the messages from process B. My question is: what determines the message processing order? I thought it was the order in which the messages were received.
It all depends on process scheduling. After your client code starts the server and procs A and B, those processes are newly created but might not even have been given any time to execute yet (and if they have, they will immediately be suspended in their receives). The client code keeps executing and quickly sends off a bunch of messages to A and B. These are asynchronous operations and the client process will not have to suspend at all before returning from the call to go().
As soon as a suspended process gets a message, it becomes ready to be scheduled for execution, but it can take a little while before this happens. Meanwhile, more messages may keep arriving in their mailboxes, so when A or B actually start running, they are likely to have all four messages from the client already in their mailboxes. In general you can also not be sure which of A and B will start to execute first, even though the scheduling probably is very predictable in a simple case like this.
So in your case, A gets scheduled before B, it starts executing, and in very short time it consumes all its messages. This does not take much work, so A won't even spend a whole time slice. Then it suspends due to its mailbox being empty. Then B gets scheduled and does the same thing.
If there had been many processes, and/or a lot of work, the Erlang VM could have split the processes up across schedulers on different OS threads (running in truly parallel fashion if you have a multicore CPU). But since the example is so simple, these processes are probably handled within a single scheduler, and thus the ordering becomes even more predictable. If both A and B had thousands of messages in their queue, or each message took a lot of computational effort to process, you would see the messages getting interleaved.
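Conversely, if you want the server to process the messages in exactly the order the client sent them, the client has to wait for some kind of acknowledgement before sending the next one. A rough sketch of that idea (relay/1, send_sync/3 and go_sync/0 are invented names, and the relays forward a {Name, Tag} tuple instead of the per-process atoms above):
%% Hypothetical variation: each relay confirms back to the client before the
%% client sends the next request, so the server's mailbox order matches the
%% client's send order.
relay(Name) ->
    receive
        {Tag, Server_Pid, From} ->
            Server_Pid ! {Tag, {Name, Tag}},   %% e.g. {foo, {a, foo}}
            From ! ok,                         %% acknowledge to the client
            relay(Name)
    end.

send_sync(Relay, Tag, Server_Pid) ->
    Relay ! {Tag, Server_Pid, self()},
    receive ok -> ok end.

go_sync() ->
    Server_Pid = spawn(server, loop, []),
    Pid_A = spawn(fun() -> relay(a) end),
    Pid_B = spawn(fun() -> relay(b) end),
    [ send_sync(P, Tag, Server_Pid)
      || {P, Tag} <- [{Pid_A, foo}, {Pid_B, bar}, {Pid_A, bar}, {Pid_B, foo}] ].
With the round trip in place, each forward to the server happens before the client even sends its next request, so the server prints the messages in the client's order.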
(By the way, your import declarations in the client do nothing, since you are using spawn(Module, Fname, Args). If you had written e.g. spawn(fun() -> loop() end) they would be needed.)

Erlang register error

I'm writing a program which will take two strings and concatenate them, as a shared dropbox simulation. I'm using code from a different application, which did a similar thing with a joint bank account, so the errors may be because I haven't changed some line of code properly, but I just can't work out what's wrong.
The code is written in two separate files and they link together, the basic dropbox is first and then the code which links that and displays the answer is below.
-module(dropbox).
-export([account/1, start/0, stop/0, deposit/1, get_bal/0, set_bal/1]).

account(Balance) ->
    receive
        {set, NewBalance} ->
            account(NewBalance);
        {get, From} ->
            From ! {balance, Balance},
            account(Balance);
        stop -> ok
    end.

start() ->
    Account_PID = spawn(dropbox, account, [0]),
    register(account_process, Account_PID).

stop() ->
    account_process ! stop,
    unregister(account_process).

set_bal(B) ->
    account_process ! {set, B}.

get_bal() ->
    account_process ! {get, self()},
    receive
        {balance, B} -> B
    end.

deposit(Amount) ->
    OldBalance = get_bal(),
    NewBalance = OldBalance ++ Amount,
    set_bal(NewBalance).
-module(dropboxtest).
-export([start/0, client/1]).

start() ->
    dropbox:start(),
    mutex:start(),
    register(tester_process, self()),
    loop("hello ", "world", 100),
    unregister(tester_process),
    mutex:stop(),
    dropbox:stop().

loop(_, _, 0) ->
    true;
loop(Amount1, Amount2, N) ->
    dropbox:set_bal(" "),
    spawn(dropboxtest, client, [Amount1]),
    spawn(dropboxtest, client, [Amount2]),
    receive
        done -> true
    end,
    receive
        done -> true
    end,
    io:format("Expected balance = ~p, actual balance = ~p~n~n",
              [Amount1 ++ Amount2, dropbox:get_bal()]),
    loop(Amount1, Amount2, N-1).

client(Amount) ->
    dropbox:deposit(Amount),
    tester_process ! done.
This is the error I'm getting. All of the other ones I've managed to work out, but I don't quite get this one, so I'm not sure how to solve it.
** exception error: bad argument
in function register/2
called as register(account_process,<0.56.0>)
in call from dropbox:start/0 (dropbox.erl, line 16)
in call from dropboxtest:start/0 (dropboxtest.erl, line 5)
Also, I know that this is going to come up with errors due to concurrency issues; I need to show these errors to prove what's wrong before I can fix it. Some of the functions haven't been changed from the bank program, hence balance etc.
As per the documentation, register can fail with badarg for a number of reasons:
If PidOrPort is not an existing local process or port.
If RegName is already in use.
If the process or port is already registered (already has a name).
If RegName is the atom undefined.
In this case I suspect it's the second reason, that there's already a process with the name account_process, from a previous run. You could try restarting the Erlang shell, or you could change the spawn call in dropbox:start to spawn_link, which would cause the old process to crash in case of any error in the shell.
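If it really is a stale registration from an earlier run, one workaround is to have start/0 clean up any leftover name before registering again. A rough sketch (just an illustration using whereis/1 and unregister/1, not necessarily the right fix for the underlying race):
%% Hypothetical defensive start/0: get rid of a leftover account_process
%% before registering the new one, so register/2 cannot fail with badarg.
start() ->
    case whereis(account_process) of
        undefined ->
            ok;
        OldPid ->
            unregister(account_process),
            exit(OldPid, kill)   %% drop the stale process from the previous run
    end,
    Account_PID = spawn(dropbox, account, [0]),
    register(account_process, Account_PID).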

Getting result of a spawned function in Erlang

My objective at the moment is to write Erlang code calculating a list of N elements, where each element is a factorial of its "index" (so, for N = 10 I would like to get [1!, 2!, 3!, ..., 10!]). What's more, I would like every element to be calculated in a separate process (I know it is simply inefficient, but I am expected to implement it and compare its efficiency with other methods later).
In my code, I wanted to use one function as a "loop" over given N, that for N, N-1, N-2... spawns a process which calculates factorial(N) and sends the result to some "collecting" function, which packs received results into a list. I know my concept is probably overcomplicated, so hopefully the code will explain a bit more:
messageFactorial(N, listPID) ->
    listPID ! factorial(N).    %% send calculated factorial to "collector".

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nProcessesFactorialList(-1) ->
    ok;
nProcessesFactorialList(N) ->
    spawn(pFactorial, messageFactorial, [N, listPID]), %% for each N spawn...
    nProcessesFactorialList(N-1).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
listPrepare(List) ->    %% "collector", for the last factorial returns
    receive             %% a list of factorials (1! = 1).
        1 -> List;
        X ->
            listPrepare([X | List])
    end.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
startProcessesFactorialList(N) ->
    register(listPID, spawn(pFactorial, listPrepare, [[]])),
    nProcessesFactorialList(N).
I guess it should work, by which I mean that listPrepare finally returns a list of factorials. But the problem is, I do not know how to get that list, i.e. how to get hold of what it returned. As for now my code returns ok, since that is what nProcessesFactorialList returns when it finishes. I thought about sending the List of results from listPrepare to nProcessesFactorialList at the end, but then it would also need to be a registered process, from which I wouldn't know how to recover that list.
So basically, how do I get the result (my list of factorials) from the registered process running listPrepare? If my code is not right at all, I would ask for a suggestion of how to do it better. Thanks in advance.
My way of doing this sort of task is:
-module(par_fact).
-export([calc/1]).

fact(X) -> fact(X, 1).

fact(0, R) -> R;
fact(X, R) when X > 0 -> fact(X-1, R*X).

calc(N) ->
    Self = self(),
    Pids = [ spawn_link(fun() -> Self ! {self(), {X, fact(X)}} end)
             || X <- lists:seq(1, N) ],
    [ receive {Pid, R} -> R end || Pid <- Pids ].
and the result (note that the selective receive on {Pid, R} waits for each spawned process in turn, so the results come back in index order regardless of which process finishes first):
> par_fact:calc(25).
[{1,1},
{2,2},
{3,6},
{4,24},
{5,120},
{6,720},
{7,5040},
{8,40320},
{9,362880},
{10,3628800},
{11,39916800},
{12,479001600},
{13,6227020800},
{14,87178291200},
{15,1307674368000},
{16,20922789888000},
{17,355687428096000},
{18,6402373705728000},
{19,121645100408832000},
{20,2432902008176640000},
{21,51090942171709440000},
{22,1124000727777607680000},
{23,25852016738884976640000},
{24,620448401733239439360000},
{25,15511210043330985984000000}]
The first problem is that your listPrepare process doesn't do anything with the result. Try printing it at the end.
The second problem is that you don't wait for all the processes to finish, but only for the process that sends 1, and that is the quickest factorial to calculate. That message will almost certainly be received before the more complex ones have been calculated, so you'll end up with only a few of the results.
I answered a somewhat similar question about parallel work with many processes here: Create list across many processes in Erlang. Maybe that one will help you.
I propose you this solution:
-export([launch/1,fact/2]).

launch(N) ->
    launch(N,N).

% launch(Current,Total)
% when all processes are launched go to the result collect phase
launch(-1,N) -> collect(N+1);
launch(I,N) ->
    % fact will be executed in a new process, so the normal way to get the answer is by message passing
    % need to give the current process pid to get the answer back from the spawned process
    spawn(?MODULE,fact,[I,self()]),
    % loop until all processes are launched
    launch(I-1,N).

% simply send the result to Pid.
fact(N,Pid) -> Pid ! {N,fact_1(N,1)}.

fact_1(I,R) when I < 2 -> R;
fact_1(I,R) -> fact_1(I-1,R*I).

% init the collect phase with an empty result list
collect(N) -> collect(N,[]).

% collect(Remaining_result_to_collect,Result_list)
collect(0,L) -> L;
% accumulate the results in L and loop until all messages are received
collect(N,L) ->
    receive
        R -> collect(N-1,[R|L])
    end.
but a much more straightforward (single-process) solution could be:
1> F = fun(N) -> lists:foldl(fun(I,[{X,R}|Q]) -> [{I,R*I},{X,R}|Q] end, [{0,1}], lists:seq(1,N)) end.
#Fun<erl_eval.6.80484245>
2> F(6).
[{6,720},{5,120},{4,24},{3,6},{2,2},{1,1},{0,1}]
[edit]
On a system with multiple cores, caches and a multitasking underlying OS, there is absolutely no guarantee on the order of execution, and the same goes for message sending. The only guarantee is within a process's message queue, where you know that you will handle the messages according to the order of message reception. So I agree with Dmitry: your stop condition is not 100% effective.
In addition, using startProcessesFactorialList, you spawn listPrepare, which effectively collects all the factorial values (except 1!) and then simply forgets the result when the process ends. I guess this code snippet is not exactly the one you use for testing.
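To answer the original question directly, the usual pattern is to have the collector send its finished list back to the caller, and to count down the number of outstanding results rather than stopping on the value 1. A hypothetical reworking of the idea (start_factorials/1 and collect_loop/3 are invented names, and it assumes a fact/1 like the one in par_fact above):
%% Hypothetical sketch: the collector counts how many results are still missing
%% and, once done, sends the complete list back to the process that asked for it.
start_factorials(N) ->
    Caller = self(),
    Collector = spawn(fun() -> collect_loop(N, [], Caller) end),
    [ spawn(fun() -> Collector ! {X, fact(X)} end) || X <- lists:seq(1, N) ],
    receive
        {result, L} -> lists:keysort(1, L)   %% sort by index; arrival order varies
    end.

collect_loop(0, Acc, Caller) ->
    Caller ! {result, Acc};
collect_loop(Remaining, Acc, Caller) ->
    receive
        {X, Fact} -> collect_loop(Remaining - 1, [{X, Fact} | Acc], Caller)
    end.
Because the caller blocks in its own receive, no process needs to be registered at all to get the list back.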

If you send a message to a process before giving out its pid, is it guaranteed to receive that message first?

Say there is a process B, which receives a pid and sends m2 to it. If you spawn A and send it m1, and then send A to B, is A guaranteed to get m1 before m2?
In other words, can this crash?
-module(test).
-compile(export_all).

test() ->
    B = spawn_link(fun() -> receive P -> P ! m2 end end),
    A = spawn_link(fun() -> receive X -> X=m1 end end),
    A ! m1,
    B ! A.
Your code cannot crash because all processes are local.
B = spawn_link(fun() -> receive P -> P ! m2 end end), % 1
A = spawn_link(fun() -> receive X -> X=m1 end end), % 2
A ! m1, % 3
B ! A. % 4
When evaluating line 3, both BEAM emulator and HiPE invoke the erl_send built-in function (BIF). Since A is a local process, erl_send (actually do_send) eventually calls erts_send_message which enqueues the message in the mailbox. In SMP mode, the thread actually acquires a lock on the mailbox.
So when evaluating line 4 and sending A to process B, A already has m1 in its mailbox. So m2 can only be enqueued after m1.
Whether this result is specific to the current implementation of Erlang is debatable, even though it is not guaranteed by the documentation. Indeed, each process needs a mailbox and this mailbox needs to be filled somehow. This is done synchronously on line 3. Doing it asynchronously would require either another thread in-between or several mailboxes per process (e.g. one per scheduler to avoid the lock on the mailbox). Yet I do not think this would make sense performance-wise.
If processes A and B were both on the same remote node, the behavior would be slightly different, but the result would be the same with the current implementation of Erlang. On line 3, message m1 would be enqueued for the remote node, and on line 4, message A would be enqueued afterwards. When the remote node dequeues these messages, it will write m1 to A's mailbox before writing A to B's mailbox.
If process A was remote and B was local, the result would still be the same. On line 3, message m1 would be enqueued for the remote node, and on line 4, message A would be written to B's mailbox, but then on line 1, message m2 would be enqueued to the remote node after m1. So A would get the messages in m1, m2 order.
Likewise, if process A was local and B was remote, A will get the message copied to its mailbox on line 3 before anything is sent over the network to B's node.
With the current version of Erlang, the only way for this to crash is to have A and B on distinct remote nodes. In this case, m1 is enqueued to A's node before A is enqueued to B's node. However, delivery of these messages is not synchronous. Delivery to B's node could happen first, for example if many messages are already enqueued for A's node.
The following code (sometimes) triggers the crash by filling the queue to A's node with junk messages that slow down the delivery of m1.
$ erl -sname node_c@localhost
C = spawn_link(fun() ->
        A = receive {process_a, APid} -> APid end,
        B = receive {process_b, BPid} -> BPid end,
        ANode = node(A),
        lists:foreach(fun(_) ->
                          rpc:cast(ANode, erlang, whereis, [user])
                      end, lists:seq(1, 10000)),
        A ! m1,
        B ! A
    end),
register(process_c, C).

$ erl -sname node_b@localhost
B = spawn_link(fun() -> receive P -> P ! m2 end end),
C = rpc:call(node_c@localhost, erlang, whereis, [process_c]),
C ! {process_b, B}.

$ erl -sname node_a@localhost
A = spawn_link(fun() -> receive X -> X = m1 end, io:format("end of A\n") end),
C = rpc:call(node_c@localhost, erlang, whereis, [process_c]),
C ! {process_a, A}.
If the two processes are on the same node, it is true that A is guaranteed to get m1 before m2.
But when the two processes are on different nodes, it is not guaranteed.
There is a paper, Programming Distributed Erlang Applications: Pitfalls and Recipes, about this problem.
Here is the link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.9929&rep=rep1&type=pdf
Your problem is covered in section 2.2 of the paper, and I think it is a really interesting paper!
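If you need the m1-before-m2 guarantee to hold even when A and B live on different nodes, one common remedy is an explicit acknowledgement: do not hand A's pid to B until A has confirmed that it consumed m1. A hypothetical variant of the test (test_safe/0 is an invented name; in a distributed setup A and B would be spawned on their own nodes, but the handshake is the same):
%% Hypothetical variation of test/0: B only learns about A after A has
%% acknowledged that m1 arrived, so m2 can never overtake m1.
test_safe() ->
    Self = self(),
    B = spawn_link(fun() -> receive P -> P ! m2 end end),
    A = spawn_link(fun() ->
            receive X -> X = m1 end,
            Self ! got_m1,            %% tell the parent that m1 has been consumed
            receive Y -> Y = m2 end
        end),
    A ! m1,
    receive got_m1 -> ok end,         %% only now is it safe to give out A's pid
    B ! A.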

Multiprocess program running in Erlang concurrency programming

I am trying to learn Erlang concurrency programming.
This is an example program I got from Erlang.org, but there are no instructions about how to run it.
I run it in this way:
1> counter:start().
<0.33.0>
But I do not know how to call the other functions so that the process (counter:start()) can do the work according to the received messages.
How can I confirm that two or more processes have really been generated?
Another question: how can I print out a received message in a function?
-module(counter).
-export([start/0,loop/1,increment/1,value/1,stop/1]).

%% First the interface functions.
start() ->
    spawn(counter, loop, [0]).

increment(Counter) ->
    Counter ! increment.

value(Counter) ->
    Counter ! {self(),value},
    receive
        {Counter,Value} ->
            Value
    end.

stop(Counter) ->
    Counter ! stop.

%% The counter loop.
loop(Val) ->
    receive
        increment ->
            loop(Val + 1);
        {From,value} ->
            From ! {self(),Val},
            loop(Val);
        stop ->          % No recursive call here
            true;
        Other ->         % All other messages
            loop(Val)
    end.
Any help will be appreciated.
thanks
Other code will simply use the module you just created, like this:
C = counter:start(),
counter:increment(C),
counter:increment(C),
io:format("Value: ~p~n", [counter:value(C)]).
You can run pman:start() to bring up the (GUI) process manager to see which processes you have.
In addition to what Emil said, you can use the i() command to verify which processes are running. Let's start three counters:
1> counter:start().
<0.33.0>
2> counter:start().
<0.35.0>
3> counter:start().
<0.37.0>
And run i():
...
<0.33.0> counter:loop/1 233 1 0
counter:loop/1 2
<0.35.0> counter:loop/1 233 1 0
counter:loop/1 2
<0.37.0> counter:loop/1 233 1 0
counter:loop/1 2
...
As you can see, the above processes (33, 35 and 37) are happily running and they're executing the counter:loop/1 function. Let's stop process 37:
4> P37 = pid(0,37,0).
<0.37.0>
5> counter:stop(P37).
stop
Checking the new list of processes:
6> i().
You should verify it's gone.
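As for the other part of the original question (how to print out a received message inside a function): receive the message into a variable first, log it with io:format/2, and then dispatch on it. A hypothetical logging variant of counter:loop/1:
%% Hypothetical variant of loop/1: print every message as it arrives, then
%% handle it exactly as before.
loop(Val) ->
    receive
        Msg ->
            io:format("counter ~p received ~p~n", [self(), Msg]),
            case Msg of
                increment ->
                    loop(Val + 1);
                {From, value} ->
                    From ! {self(), Val},
                    loop(Val);
                stop ->
                    true;
                _Other ->
                    loop(Val)
            end
    end.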