Multiprocess program running in Erlang - concurrency

I am trying to learn Erlang concurrency programming.
This is an example program I got from Erlang.org, but there are no instructions on how to run it.
I run it this way:
1> counter:start().
<0.33.0>
But I do not know how to call the other functions so that the process spawned by counter:start() does its work according to the messages it receives.
How can I confirm that two or more processes have really been created?
Another question: how can I print out a received message inside a function?
-module(counter).
-export([start/0, loop/1, increment/1, value/1, stop/1]).

%% First the interface functions.
start() ->
    spawn(counter, loop, [0]).

increment(Counter) ->
    Counter ! increment.

value(Counter) ->
    Counter ! {self(), value},
    receive
        {Counter, Value} ->
            Value
    end.

stop(Counter) ->
    Counter ! stop.

%% The counter loop.
loop(Val) ->
    receive
        increment ->
            loop(Val + 1);
        {From, value} ->
            From ! {self(), Val},
            loop(Val);
        stop ->        % No recursive call here
            true;
        _Other ->      % All other messages
            loop(Val)
    end.
Any help will be appreciated.
Thanks.

The other functions simply use the module you just created, like this:
C = counter:start(),
counter:increment(C),
counter:increment(C),
io:format("Value: ~p~n", [counter:value(C)]).
You can run pman:start() to bring up the (GUI) process manager to see which processes you have.
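As for printing received messages: just add an io:format/2 call inside the receive clauses of loop/1. A minimal sketch of such a variation (the format strings are only illustrative, not part of the original example):
loop(Val) ->
    receive
        increment ->
            io:format("~p got increment~n", [self()]),
            loop(Val + 1);
        {From, value} ->
            io:format("~p got value request from ~p~n", [self(), From]),
            From ! {self(), Val},
            loop(Val);
        stop ->
            true;
        Other ->
            io:format("~p got unexpected message: ~p~n", [self(), Other]),
            loop(Val)
    end.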

In addition to what Emil said, you can use the i() command to verify which processes are running. Let's start three counters:
1> counter:start().
<0.33.0>
2> counter:start().
<0.35.0>
3> counter:start().
<0.37.0>
And run i():
...
<0.33.0>              counter:loop/1                 233      1    0
                      counter:loop/1                   2
<0.35.0>              counter:loop/1                 233      1    0
                      counter:loop/1                   2
<0.37.0>              counter:loop/1                 233      1    0
                      counter:loop/1                   2
...
As you can see, the above processes (33, 35 and 37) are happily running and they're executing the counter:loop/1 function. Let's stop process 37:
4> P37 = pid(0,37,0).
<0.37.0>
5> counter:stop(P37).
stop
Checking the new list of processes:
6> i().
You should see that process 37 is gone.
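If you'd rather check from code than eyeball the i() output, the erlang:is_process_alive/1 BIF does the same job; once the counter has processed the stop message, it should report:
7> erlang:is_process_alive(P37).
false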

Related

Neither 'MPI_Barrier' nor 'BLACS_Barrier' stops a processor from executing its commands

I'm working on ScaLAPACK and trying to get used to the BLACS routines, which are essential for using ScaLAPACK.
I've had an elementary course on MPI, so I have a rough idea of MPI_COMM_WORLD and the like, but no deep understanding of how it works internally.
Anyway, I'm trying the following code to say hello using BLACS routines.
program hello_from_BLACS
  use MPI
  implicit none

  integer :: info, nproc, nprow, npcol, &
             myid, myrow, mycol,        &
             ctxt, ctxt_sys, ctxt_all

  call BLACS_PINFO(myid, nproc)

  ! get the internal default context
  call BLACS_GET(0, 0, ctxt_sys)

  ! set up a process grid for the process set
  ctxt_all = ctxt_sys
  call BLACS_GRIDINIT(ctxt_all, 'c', nproc, 1)
  call BLACS_BARRIER(ctxt_all, 'A')

  ! set up a process grid of size 3*2
  ctxt = ctxt_sys
  call BLACS_GRIDINIT(ctxt, 'c', 3, 2)

  if (myid .eq. 0) then
    write(6,*) ' myid myrow mycol nprow npcol'
  endif

  call BLACS_BARRIER(ctxt_sys, 'A')   ! (**)

  ! all processes not belonging to 'ctxt' jump to the end of the program
  if (ctxt .lt. 0) goto 1000

  ! get the process coordinates in the grid
  call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
  write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol

  1000 continue

  ! return all BLACS contexts
  call BLACS_EXIT(0)
  stop
end program
and the output with 'mpirun -np 10 ./exe' looks like this:
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
myid myrow mycol nprow npcol
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
Everything seems to work fine except for the BLACS_BARRIER line, which I marked with (**) in the code above.
I put that line there so that the title line would always be printed at the top of the output, like this:
myid myrow mycol nprow npcol
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
So here are my questions:
1. I've tried applying BLACS_BARRIER to 'ctxt_sys', 'ctxt_all', and 'ctxt', but none of them produces output with the title line printed first. I've also tried MPI_Barrier(MPI_COMM_WORLD, info), but it didn't work either. Am I using the barriers in the wrong way?
2. In addition, I get SIGSEGV when I apply BLACS_BARRIER to 'ctxt' and use more than 6 processes with mpirun. Why does SIGSEGV occur in this case?
Thank you for reading this question.
To answer your 2 questions (in future it is best to give them separate posts):
1) MPI_Barrier, BLACS_Barrier, and any barrier in any parallel programming methodology I have come across only synchronise the actual set of processes that call them. However, I/O is not dealt with solely by the calling process: at least one, and quite possibly more, processes within the OS actually service the I/O request. These are NOT synchronised by your barrier, so the ordering of I/O is not ensured by a simple barrier. The only standard-conforming ways that I can think of to ensure the ordering of I/O are:
Have one process do all the I/O, or
Better, use MPI I/O either directly, or indirectly via e.g. NetCDF or HDF5
2) Your second call to BLACS_GRIDINIT
call BLACS_GRIDINIT(ctxt, 'c', 3, 2)
creates a context for a 3 by 2 process grid, which holds 6 processes. If you call it with more than 6 processes, only 6 will get back a valid context; for the others, ctxt should be treated as an uninitialised value. So, for instance, if you call it with 8 processes, 6 will return with a valid ctxt and 2 will return with a ctxt that holds no valid value. If those 2 now try to use ctxt, anything is possible, and in your case you get a segfault. You do seem to be aware that this is an issue, as later you have
! all processes not belonging to 'ctxt' jump to the end of the program
if (ctxt .lt. 0) goto 1000
but I see nothing in the description of BLACS_GRIDINIT that guarantees ctxt will be less than zero for non-participating processes. At https://www.netlib.org/blacs/BLACS/QRef.html#BLACS_GRIDINIT it says:
This routine creates a simple NPROW x NPCOL process grid. This process
grid will use the first NPROW x NPCOL processes, and assign them to
the grid in a row- or column-major natural ordering. If these
process-to-grid mappings are unacceptable, BLACS_GRIDINIT's more
complex sister routine BLACS_GRIDMAP must be called instead.
There is no mention of what ctxt will be if the process is not part of the resulting grid; this is the kind of problem I find regularly with the BLACS documentation. Also, please don't use goto, for your own sake; you WILL regret it later. Use If ... End If instead. I can't remember when I last used goto in Fortran; it may well be over 10 years ago.
Finally good luck in using BLACS! In my experience the documentation is often incomplete, and I would suggest only using those calls that are absolutely necessary to use ScaLAPACK and using MPI, which is much, much better defined, for the rest. It would be so much nicer if ScaLAPACK just worked with MPI nowadays.

Erlang concurrency understanding

Lately I've been trying to understand concurrent servers in Erlang. Consider the following code, which makes requests to the server shown below it. Depending on the particular order of execution, different values may get printed by the 3 processes. What are the possible orders, and what are the highest and lowest values each process can print?
test() ->
    Server = start(),
    spawn(fun() ->
                  incr(Server),
                  io:format("Child 1 read ~p~n", [read(Server)])
          end),
    incr(Server),
    spawn(fun() ->
                  incr(Server),
                  io:format("Child 2 read ~p~n", [read(Server)])
          end),
    io:format("Parent read ~p~n", [read(Server)]).
The code runs against the server below:
-module(p4).
-export([start/0, init/0, read/1, incr/1, reset/1]).

start() ->
    spawn(fun() -> init() end).

init() -> loop(0).

loop(N) ->
    receive
        {read, Pid} ->
            Pid ! {value, self(), N},
            loop(N);
        {incr, Pid} ->
            Pid ! {incr_reply, self()},
            loop(N+1);
        {reset, Pid} ->
            Pid ! {reset_reply, self()},
            loop(0)
    end.

read(Serv) ->
    Serv ! {read, self()},
    receive {value, Serv, N} -> N end.

incr(Serv) ->
    Serv ! {incr, self()},
    receive {incr_reply, Serv} -> ok end.

reset(Serv) ->
    Serv ! {reset, self()},
    receive {reset_reply, Serv} -> ok end.
Parent: Lowest = 1 Highest = 3
Child1: Lowest = 1 Highest = 3
Child2: Lowest = 1 Highest = 2
I'm not completely sure about the orders, but I guess it could be that:
Child1 can read 1, 2 and 3
Parent can read 1, 2 and 3
Child2 can read 1 and 2
Is this correct, both for the lowest and highest values and for the orders?
The initial value in the loop is 0. The server's increment operation replies to the caller before performing the increment, but that doesn't matter because no messages are processed between the sending of that reply and the actual increment. Each read message results in a reply containing the effects of all increment messages that arrived before it. Because of guaranteed message ordering from one process to another, any process that increments then reads is guaranteed to read at least its own increment. The server's read operation simply replies with the current loop value. The reset operation is unused.
Child1 increments, then reads. It runs concurrently with Parent initially and then later with Child2 as well, both of which also increment. It can therefore read 1 from just its own increment, 2 from its own increment and that of its parent, or 3 if its read also picks up the increment from Child2.
Child2 also increments, then reads, but it doesn't start until after the Parent has already incremented. The minimum it can read is therefore 2, and since it runs concurrently with Child1, it could alternatively read a 3.
Parent increments, then reads, so the minimum it can read is 1. Its read runs concurrently with Child1 and Child2, so if its read happens before either of their increments, it sees a 1. It could alternatively read a 2 if its read picks up either of the child increments, or a 3 if its read picks up both child increments.
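If you want to observe the nondeterminism yourself, a small driver that runs the test repeatedly will, over enough runs, print several of the combinations described above. A quick sketch, assuming test/0 above lives in (and is exported from) the p4 module:
%% Hypothetical helper, not part of the original post: run the test
%% N times and watch the printed values vary between runs.
run_many(0) -> ok;
run_many(N) ->
    test(),
    timer:sleep(100),    % give the spawned children time to print
    run_many(N - 1).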

Getting result of a spawned function in Erlang

My objective at the moment is to write Erlang code calculating a list of N elements, where each element is the factorial of its "index" (so, for N = 10 I would like to get [1!, 2!, 3!, ..., 10!]). What's more, I would like every element to be calculated in a separate process (I know it is simply inefficient, but I am expected to implement it and compare its efficiency with other methods later).
In my code, I wanted to use one function as a "loop" over a given N that, for N, N-1, N-2, ..., spawns a process which calculates factorial(N) and sends the result to a "collecting" function, which packs the received results into a list. I know my concept is probably overcomplicated, so hopefully the code will explain a bit more:
messageFactorial(N, listPID) ->
    listPID ! factorial(N).    %% send calculated factorial to "collector".

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nProcessesFactorialList(-1) ->
    ok;
nProcessesFactorialList(N) ->
    spawn(pFactorial, messageFactorial, [N, listPID]),    %% for each N spawn...
    nProcessesFactorialList(N-1).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
listPrepare(List) ->    %% "collector", for the last factorial returns
    receive             %% a list of factorials (1! = 1).
        1 -> List;
        X ->
            listPrepare([X | List])
    end.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
startProcessesFactorialList(N) ->
    register(listPID, spawn(pFactorial, listPrepare, [[]])),
    nProcessesFactorialList(N).
I guess it should work, by which I mean that listPrepare finally returns a list of factorials. But the problem is that I do not know how to get that list, i.e. how to get what it returned. As it is now, my code returns ok, since that is what nProcessesFactorialList returns when it finishes. I thought about sending the list of results from listPrepare to nProcessesFactorialList at the end, but then that would also need to be a registered process, and I wouldn't know how to recover the list from it.
So basically: how do I get the result (my list of factorials) from the registered process running listPrepare? If my code is not right at all, I would appreciate suggestions on how to do it better. Thanks in advance.
My way to do this sort of task is:
-module(par_fact).
-export([calc/1]).

fact(X) -> fact(X, 1).

fact(0, R) -> R;
fact(X, R) when X > 0 -> fact(X-1, R*X).

calc(N) ->
    Self = self(),
    Pids = [ spawn_link(fun() -> Self ! {self(), {X, fact(X)}} end)
             || X <- lists:seq(1, N) ],
    [ receive {Pid, R} -> R end || Pid <- Pids ].
and result:
> par_fact:calc(25).
[{1,1},
{2,2},
{3,6},
{4,24},
{5,120},
{6,720},
{7,5040},
{8,40320},
{9,362880},
{10,3628800},
{11,39916800},
{12,479001600},
{13,6227020800},
{14,87178291200},
{15,1307674368000},
{16,20922789888000},
{17,355687428096000},
{18,6402373705728000},
{19,121645100408832000},
{20,2432902008176640000},
{21,51090942171709440000},
{22,1124000727777607680000},
{23,25852016738884976640000},
{24,620448401733239439360000},
{25,15511210043330985984000000}]
The first problem is that your listPrepare process doesn't do anything with the result; try printing it at the end.
The second problem is that you don't wait for all the processes to finish, but for the process that sends 1, and 1! is the quickest factorial to calculate. That message will almost surely be received before the more complex ones have been calculated, so you'll end up with only a few responses.
I answered a somewhat similar question about parallel work with many processes here: Create list across many processes in Erlang. Maybe that one will help you.
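To make those fixes concrete, here is a minimal sketch of how the original design could hand the list back: count down the expected number of results instead of stopping at 1, and send the finished list to the caller. The listPrepare/3 collector and the extra receive are my additions, not part of the original code (listPrepare/3 would also need to be exported); note that the factorials arrive in arbitrary order, so tag them with N, as in the answer above, if you need them sorted:
%% Collector that knows how many results to expect and whom to report to.
listPrepare(0, List, Caller) ->
    Caller ! {factorials, List};
listPrepare(N, List, Caller) ->
    receive
        X -> listPrepare(N - 1, [X | List], Caller)
    end.

startProcessesFactorialList(N) ->
    %% N+1 results are expected: nProcessesFactorialList/1 spawns for N, N-1, ..., 0.
    register(listPID, spawn(pFactorial, listPrepare, [N + 1, [], self()])),
    nProcessesFactorialList(N),
    receive
        {factorials, List} -> List
    end.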
I propose this solution:
-export([launch/1, fact/2]).

launch(N) ->
    launch(N, N).

% launch(Current, Total)
% when all processes are launched, go to the result collection phase
launch(-1, N) -> collect(N+1);
launch(I, N) ->
    % fact will be executed in a new process, so the normal way to get the
    % answer back is by message passing; pass the current process pid so the
    % spawned process knows where to send its result
    spawn(?MODULE, fact, [I, self()]),
    % loop until all processes are launched
    launch(I-1, N).

% simply send the result to Pid
fact(N, Pid) -> Pid ! {N, fact_1(N, 1)}.

fact_1(I, R) when I < 2 -> R;
fact_1(I, R) -> fact_1(I-1, R*I).

% init the collect phase with an empty result list
collect(N) -> collect(N, []).

% collect(Remaining_results_to_collect, Result_list)
collect(0, L) -> L;
% accumulate the results in L and loop until all messages are received
collect(N, L) ->
    receive
        R -> collect(N-1, [R|L])
    end.
but a much more straightforward (single-process) solution could be:
1> F = fun(N) -> lists:foldl(fun(I,[{X,R}|Q]) -> [{I,R*I},{X,R}|Q] end, [{0,1}], lists:seq(1,N)) end.
#Fun<erl_eval.6.80484245>
2> F(6).
[{6,720},{5,120},{4,24},{3,6},{2,2},{1,1},{0,1}]
[edit]
On a system with multiple cores, caches, and a multitasking underlying OS, there is absolutely no guarantee on the order of execution, and the same goes for message sending. The only guarantee is in the message queue, where you know that you will process the messages in the order of their reception. So I agree with Dmitry: your stop condition is not 100% effective.
In addition, using startProcessesFactorialList, you spawn listPrepare, which effectively collects all the factorial values (except the final 1!) and then simply forgets the result when the process ends; I guess this code snippet is not exactly the one you use for testing.

mpirun is not working with two nodes

I am working on a cluster where each node has 16 processors. My version of Open MPI is 1.5.3. I have written the following simple code in Fortran:
      program MAIN
      implicit none
      include 'mpif.h'

      integer status(MPI_STATUS_SIZE)
      integer ierr,my_rank,size
      integer irep, nrep, iex
      character*1 task

!     Initialize MPI
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD,my_rank,ierr)
      call mpi_comm_size(MPI_COMM_WORLD,size,ierr)

      do iex=1,2
         if(my_rank.eq.0) then
!           Task for the master
            nrep = size
            do irep=1,nrep-1
               task='q'
               print *, 'master',iex,task
               call mpi_send(task,1,MPI_BYTE,irep,irep+1,
     &                       MPI_COMM_WORLD,ierr)
            enddo
         else
!           Here are the tasks for the slaves
!           Receive the task sent by the master node
            call mpi_recv(task,1,MPI_BYTE,0,my_rank+1,
     &                    MPI_COMM_WORLD,status,ierr)
            print *, 'slaves', my_rank,task
         endif
      enddo

      call mpi_finalize(ierr)
      end
then I compile the code with:
/usr/lib64/openmpi/bin/mpif77 -o test2 test2.f
and run it with
/usr/lib64/openmpi/bin/mpirun -np 32 -hostfile nodefile test2
my nodefile looks like this:
node1
node1
...
node2
node2
...
with node1 and node2 repeated 16 times each.
I can compile successfully. When I run it with -np 16 (so just one node) it works fine: each slave finishes its task and I get the prompt back in the terminal. But when I try -np 32, not all of the slaves finish their work; only 16 of them do.
Actually, with 32 processes the program doesn't give me the prompt back, so I think it is stuck somewhere, waiting for some task to be performed.
I would appreciate any comments, as I have already spent some time on this seemingly trivial problem.
Thanks.
I'm not sure that your nodefile is correct. I'd expect to see lines like this:
node1 slots=16
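With the slots syntax, the whole hostfile for your two 16-processor nodes would be just:
node1 slots=16
node2 slots=16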
Open MPI is pretty well documented; have you checked out their FAQ?
Did you try mpiexec instead of mpirun?

Eunit timeout doesn't work

I am trying to run all the unit tests inside a folder using EUnit, but it seems like the timeout always resets to 5 seconds.
e.g.
Module:
-module(example).
-include_lib("eunit/include/eunit.hrl").

main_test() ->
    % sleep for 10 seconds
    ?assertEqual(true, begin timer:sleep(10000), true end).
Command line:
Eshell V5.7.3 (abort with ^G)
1> c(example).
{ok,example}
2> eunit:test({timeout, 15, example}).
Test passed.
ok
3> eunit:test({timeout, 15, {dir, "."}}).
example: main_test (module 'example')...*timed out*
undefined
=======================================================
Failed: 0. Skipped: 0. Passed: 0.
One or more tests were cancelled.
error
As you can see, running {timeout, 15, example} works but not {timeout, 15, {dir, "."}}. Does anyone have a clue?
To me that makes sense: the timeout for an entire directory is probably not related to the timeouts for the individual tests.
I would write the test like this:
main_test_() ->
    % sleep for 10 seconds
    {timeout, 15, ?_assertEqual(true, begin timer:sleep(10000), true end)}.
(Underscores added to create a test expression instead of the test itself; it's all in the EUnit manual. I don't think it's possible to specify a timeout for the test itself in any other way.)
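For reference, the complete module would then look like this; with the timeout attached to the test expression itself, eunit:test({dir, "."}) should now respect it as well:
-module(example).
-include_lib("eunit/include/eunit.hrl").

%% A test generator (note the trailing underscore): it returns a test
%% representation, so the timeout travels with the test even when it is
%% collected via {dir, "."}.
main_test_() ->
    % sleep for 10 seconds
    {timeout, 15, ?_assertEqual(true, begin timer:sleep(10000), true end)}.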