Erlang course concurrency exercise: Can my answer be improved?

I am doing this exercise from the erlang.org course:
2) Write a function which starts N
processes in a ring, and sends a
message M times around all the
processes in the ring. After the
messages have been sent the processes
should terminate gracefully.
Here's what I've come up with:
-module(ring).
-export([start/2, node/2]).

node(NodeNumber, NumberOfNodes) ->
    NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
    NextNodeName = node_name(NextNodeNumber),
    receive
        CircuitNumber ->
            io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),
            LastNode = NodeNumber =:= NumberOfNodes - 1,
            NextCircuitNumber = case LastNode of
                true ->
                    CircuitNumber - 1;
                false ->
                    CircuitNumber
            end,
            if
                NextCircuitNumber > 0 ->
                    NextNodeName ! NextCircuitNumber;
                true ->
                    ok
            end,
            if
                CircuitNumber > 1 ->
                    node(NodeNumber, NumberOfNodes);
                true ->
                    ok
            end
    end.

start(NumberOfNodes, NumberOfCircuits) ->
    lists:foreach(fun(NodeNumber) ->
                      register(node_name(NodeNumber),
                               spawn(ring, node, [NodeNumber, NumberOfNodes]))
                  end,
                  lists:seq(0, NumberOfNodes - 1)),
    node_name(0) ! NumberOfCircuits,
    ok.

node_name(NodeNumber) ->
    list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).
Here is its output:
17> ring:start(3, 2).
Node 0 Circuit 2
ok
Node 1 Circuit 2
Node 2 Circuit 2
Node 0 Circuit 1
Node 1 Circuit 1
Node 2 Circuit 1
If I actually knew Erlang, what could I do differently to improve this code? And specifically:
Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?
Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?

Welcome to Erlang! I hope you enjoy it as much as I do.
Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?
You can just leave these off. I ran your code with this:
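if NextCircuitNumber > 0 ->
    NextNodeName ! NextCircuitNumber
end,
if CircuitNumber > 1 ->
    node(NodeNumber, NumberOfNodes)
end
and it worked for me. Be aware, though, that an if with no matching branch raises an if_clause error, so a process that reaches that point ends by crashing rather than by returning normally.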
Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?
Yes you are. You can verify this by running the i(). command. This will show you the list of processes, and if your registered processes weren't terminating, you would see a lot of your registered processes left over, like node0, node1, etc. You also would not be able to run your program a second time, because it would fail when trying to register an already registered name.
As far as other things you could do to improve the code, there is not much because your code is basically fine. One thing I might do is leave off the NextNodeName variable. You can just send a message directly to node_name(NextNodeNumber) and that works.
Also, you could probably do a bit more pattern matching to improve things. For example, one change I made while playing with your code was to spawn the processes by passing in the number of the last node (NumberOfNodes - 1), rather than passing the NumberOfNodes. Then, I could pattern match in my node/2 function header like this
node(LastNode, LastNode) ->
    % Do things specific to the last node, like passing message back to node0
    % and decrementing the CircuitNumber
    ok;
node(NodeNumber, LastNode) ->
    % Do things for every other node.
    ok.
That allowed me to clean up some of the case and if logic in your node function and make it all a little tidier.
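The answerer's refactored code isn't shown, so here is a rough sketch of how the clause-per-role idea might look. It keeps the question's registered names; the module name ring_pm, the function names ring_proc/2 and name/1, and the {run, C} message tag are assumptions made for this example, and the printing is left out for brevity.

```erlang
-module(ring_pm).
-export([start/2, ring_proc/2]).

%% Spawn and register node0..nodeN-1, then inject the first message.
start(NumberOfNodes, NumberOfCircuits)
  when NumberOfNodes > 0, NumberOfCircuits > 0 ->
    LastNode = NumberOfNodes - 1,
    [register(name(N), spawn(?MODULE, ring_proc, [N, LastNode]))
     || N <- lists:seq(0, LastNode)],
    name(0) ! {run, NumberOfCircuits},
    ok.

%% Last node: it closes a lap, so it decrements the counter and wraps
%% around to node0. On the final lap there is nothing left to forward.
ring_proc(LastNode, LastNode) ->
    receive
        {run, 1} ->
            ok;
        {run, C} ->
            name(0) ! {run, C - 1},
            ring_proc(LastNode, LastNode)
    end;
%% Every other node: forward the counter unchanged to its successor and
%% keep looping only while more laps are still to come.
ring_proc(N, LastNode) ->
    receive
        {run, C} ->
            name(N + 1) ! {run, C},
            if
                C > 1 -> ring_proc(N, LastNode);
                true -> ok
            end
    end.

%% Hypothetical helper standing in for the question's node_name/1.
name(N) ->
    list_to_atom("node" ++ integer_to_list(N)).
```

Calling ring_pm:start(3, 2) should send the counter twice around three processes, after which every process returns normally and its registered name disappears.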
Hope that helps, and good luck.

Let's walk through the code:
-module(ring).
-export([start/2, node/2]).
The name node is one I avoid because a node() in Erlang has the connotation of an Erlang VM running on some machine - usually several nodes run on several machines. I'd rather call it ring_proc or something such.
node(NodeNumber, NumberOfNodes) ->
    NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
    NextNodeName = node_name(NextNodeNumber),
This is what we are trying to spawn, and we get the number of the next node and the name of the next node. Let's look at node_name/1 as an interlude:
node_name(NodeNumber) ->
    list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).
This function is a bad idea. You need a local name, which has to be an atom, so you created a function that can create arbitrary atoms. The warning here is that the atom table is never garbage collected and has a limited size, so we should avoid creating atoms dynamically if possible. The trick to solve this problem is to pass the pids instead and build the ring in reverse. The final process will then tie the knot of the ring:
mk_ring(N) when N > 0 ->
    Pid = spawn(fun() -> ring(none) end),
    %% The initiator is already one of the N processes, so N - 1 remain.
    mk_ring(N - 1, Pid, Pid).

mk_ring(0, NextPid, Initiator) ->
    Initiator ! {set_next, NextPid},
    Initiator;
mk_ring(N, NextPid, Initiator) ->
    Pid = spawn(fun() -> ring(NextPid) end),
    mk_ring(N - 1, Pid, Initiator).
And then we can rewrite your start function:
start(NumberOfNodes, NumberOfCircuits) ->
    RingStart = mk_ring(NumberOfNodes),
    RingStart ! {operate, NumberOfCircuits, self()},
    receive
        done ->
            RingStart ! stop
    end,
    ok.
The Ring code is then something along the lines of:
ring(NextPid) ->
    receive
        {set_next, Pid} ->
            ring(Pid);
        {operate, N, Who} ->
            ring_ping(N, NextPid),
            Who ! done,
            ring(NextPid);
        ping ->
            NextPid ! ping,
            ring(NextPid);
        stop ->
            NextPid ! stop,
            ok
    end.
And to fire something around the ring N times:
ring_ping(0, _Next) -> ok;
ring_ping(N, Next) ->
    Next ! ping,
    receive
        ping ->
            ring_ping(N - 1, Next)
    end.
(None of this code has been tested by the way, so it may very well be quite wrong).
As for the rest of your code:
receive
    CircuitNumber ->
        io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),
I'd tag the CircuitNumber with some atom: {run, CN}.
LastNode = NodeNumber =:= NumberOfNodes - 1,
NextCircuitNumber = case LastNode of
    true ->
        CircuitNumber - 1;
    false ->
        CircuitNumber
end,
This can be done with an if:
NextCN = if NodeNumber =:= NumberOfNodes - 1 -> CN - 1;
            NodeNumber =/= NumberOfNodes - 1 -> CN
         end,
The next part here:
if
    NextCircuitNumber > 0 ->
        NextNodeName ! NextCircuitNumber;
    true ->
        ok
end,
if
    CircuitNumber > 1 ->
        node(NodeNumber, NumberOfNodes);
    true ->
        ok
end
does need the true case, unless you never hit it. The process will crash if nothing matches in the if. It is often possible to rewire the code so that it does not rely so much on counting constructions, as my code above hints.
Several problems can be avoided with this code. One problem with the current code is that if something crashes in the ring, the ring is broken. We can use spawn_link rather than spawn to link the ring together, so such errors will destroy the whole ring. Furthermore, our ring_ping function will misbehave if the initiator is sent a message while the ring is operating (a stray ping, for instance, is counted as a completed lap). This can be alleviated; the simplest way is probably to alter the state of the ring process such that it knows it is currently operating, and to fold ring_ping into ring. Finally, we should probably also link the initial spawn so we don't end up with a large ring that is alive but that no one has a reference to. Perhaps we could register the initial process so it is easy to grab hold of the ring later.
The start function is also bad in two ways. First, we should use make_ref() to attach a unique tag to the request and match on that tag in the reply, so another process can't be sinister and just send done to the start process while the ring works. We should probably also add a monitor on the ring while it is working. Otherwise we will never be informed should the ring crash while we are waiting for the done message (with tag). OTP does both in its synchronous calls, by the way.
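The tag-plus-monitor idea from the last paragraph can be sketched roughly as follows. This is only an illustration, not the answer's code: the module name ring_call, the function call/2, and the request shape {Request, From, Ref} with reply {Ref, Reply} are all assumptions. The monitor reference doubles as the unique tag, so no separate make_ref() call is needed.

```erlang
-module(ring_call).
-export([call/2]).

%% Send Request to Pid, then wait for either a reply carrying our unique
%% reference or a 'DOWN' message saying that Pid has died.
%% Assumed protocol: the callee receives {Request, From, Ref} and
%% answers with From ! {Ref, Reply}.
call(Pid, Request) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {Request, self(), Ref},
    receive
        {Ref, Reply} ->
            %% Drop the monitor and flush any 'DOWN' already queued.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    end.
```

Because only the caller knows Ref, a third process cannot fake the reply, and a crash of the callee turns into an {error, Reason} result instead of an endless wait.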
Finally, finally: No, you don't have to clean up a registration.

My colleagues have made some excellent points. I'd also like to mention that the initial intent of the problem is being avoided by registering the processes instead of actually creating a ring. Here is one possible solution:
-module(ring).
-export([start/3]).
-record(message, {data, rounds, total_nodes, first_node}).

start(TotalNodes, Rounds, Data) ->
    FirstNode = spawn_link(fun() -> loop(1, 0) end),
    Message = #message{data=Data, rounds=Rounds, total_nodes=TotalNodes,
                       first_node=FirstNode},
    FirstNode ! Message, ok.

loop(Id, NextNode) when not is_pid(NextNode) ->
    receive
        M=#message{total_nodes=Total, first_node=First} when Id =:= Total ->
            First ! M,
            loop(Id, First);
        M=#message{} ->
            Next = spawn_link(fun() -> loop(Id+1, 0) end),
            Next ! M,
            loop(Id, Next)
    end;
loop(Id, NextNode) ->
    receive
        M=#message{rounds=0} ->
            io:format("node: ~w, stopping~n", [Id]),
            NextNode ! M;
        M=#message{data=D, rounds=R, total_nodes=Total} ->
            io:format("node: ~w, message: ~p~n", [Id, D]),
            if Id =:= Total -> NextNode ! M#message{rounds=R-1};
               Id =/= Total -> NextNode ! M
            end,
            loop(Id, NextNode)
    end.
This solution uses records. If you are unfamiliar with them, read all about them here.
Each node is defined by a loop/2 function. The first clause of loop/2 deals with creating the ring (the build phase), and the second clause deals with printing the messages (the data phase). Notice that all clauses end in a call to loop/2 except the rounds=0 clause, which indicates that the node is done with its task and should die. This is what is meant by graceful termination. Also note the hack used to tell the node that it's in the build phase - NextNode isn't a pid but rather an integer.

Related

Wait for all processes created through spawn/3 to finish and collect their results in Elixir

I want to spawn multiple processes which would do some computation and collect the results of each in a list. Consider this, although incorrect, toy example:
defmodule Counter do
  def loop(items \\ [])

  def loop(items) do
    receive do
      {:append, item} ->
        IO.inspect([item | items])
        loop([item | items])

      :exit ->
        items
    end
  end

  def push(from_pid, item) do
    send(from_pid, {:append, :math.pow(item, 2)})
  end

  def run() do
    for item <- 1..10 do
      spawn(Counter, :push, [self(), item])
    end

    loop()
  end
end

Counter.run()
Function run/0 spawns 10 processes with 2 arguments each: a process id and a number.
Each spawned process computes the result (in this case, squares the given number) and sends the result back.
Method loop/1 listens for messages and accumulates the results into a list.
The problem is I do not understand how to properly stop listening to messages after all created processes are done. I cannot just send another message type (in this case, :exit) to stop calling loop/1 recursively as some processes might not be done yet. Of course, I could keep track of the number of received messages and do not call loop/1 again if the target count is reached. However, I doubt that it is a correct approach.
How do I implement this properly?
Task.Supervisor.async_stream_nolink is a good tool for doing tasks like this. Although this may not address the specifics of how the low-level send and receive functions work, it is a good recipe for dealing with problems like this, especially when you need to control how many things happen concurrently.
Consider the following block: it will take approx. 5 seconds to run because each task sleeps for 1 second, but we run 2 of them concurrently (via max_concurrency).
iex> Task.Supervisor.start_link(name: TmpTaskSupervisor)
iex> Task.Supervisor.async_stream_nolink(
       TmpTaskSupervisor,
       1..10,
       fn item ->
         IO.puts("processing item #{item}")
         Process.sleep(1_000)
       end,
       timeout: 120_000,
       max_concurrency: 2
     )
     |> Stream.run()
If you want to return the values, convert the resulting stream to a list, but keep in mind that each return value will be wrapped in an :ok tuple:
iex> Task.Supervisor.start_link(name: TmpTaskSupervisor)
iex> Task.Supervisor.async_stream_nolink(
       TmpTaskSupervisor,
       1..10,
       fn n ->
         n * n
       end,
       timeout: 120_000,
       max_concurrency: 2
     )
     |> Enum.to_list()
[ok: 1, ok: 4, ok: 9, ok: 16, ok: 25, ok: 36, ok: 49, ok: 64, ok: 81, ok: 100]
The problem is I do not understand how to properly stop listening to
messages after all created processes are done
spawn/3 returns a pid. Keep a list of all the pids, then pass the list as an argument to your loop() function.
The receive inside your loop() function will wait for a message that begins with a pid. The first pid in the list will be the first message you wait for, e.g. {FirstPid, Result}.
A spawned process should send a message back in the form of {self(), Result}.
Once you receive a message from the first pid in the list, then you recursively call loop() again with the tail of the list.
Once the list is empty, you end your loop (think multiple function clauses).
Now, suppose the first pid takes the longest to calculate a result. You sit there waiting in the receive for that result; thereafter all the other receives will execute with no waiting, taking a few microseconds each, so the total amount of time to get all the messages will be approximately equal to the time it takes for a pid to do the longest calculation.
Next, suppose the first pid takes the shortest time to calculate the result, say T1. The second pid will finish its calculation after a further T2-T1 seconds, because while you were waiting for the first pid to finish, the second pid was also calculating its result for T1 seconds, so it only needs T2-T1 more seconds to complete its calculation, and so on for T3, T4, etc. Basically, all the shorter calculations will finish before the longest calculation, and you will receive their messages before the longest calculation finishes, so the total time you wait will be the time of the longest calculation.
In other words, it doesn't matter what order the pids are in the list.
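The pid-list recipe described above can be sketched as follows. The sketch is written in Erlang, which the answer's {FirstPid, Result} notation resembles, rather than in the question's Elixir; the module name gather and the functions squares/1 and collect/1 are names made up for this example.

```erlang
-module(gather).
-export([squares/1]).

%% Spawn one worker per item. Each worker sends {self(), Result} back,
%% so every reply is tagged with the pid of the worker that produced it.
squares(Items) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), Item * Item} end)
            || Item <- Items],
    collect(Pids).

%% Wait for the reply of each pid in list order. Replies that arrive
%% early simply stay in the mailbox until their pid comes up.
%% An empty pid list ends the loop.
collect([]) ->
    [];
collect([Pid | Rest]) ->
    receive
        {Pid, Result} -> [Result | collect(Rest)]
    end.
```

Since the receive matches on a specific pid, the results come back in the same order as the input items, whatever order the workers finish in.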
Use Flow:
1..10
|> Flow.from_enumerable(max_demand: 1)
|> Flow.map(&(&1 * &1))
|> Enum.to_list()
Result (unsorted):
[1, 9, 16, 25, 36, 49, 4, 64, 81, 100]

How to solve "unresolved flex record" in else if statement in SML?

I want to find a list of nodes that currently given nodes directly or indirectly connect to.
For example, I have a list of nodes:
[1,2]
and a list of tuples, and each of the tuples represents a direct edge:
[(1,5),(2,4),(4,6)]
So, the nodes I am looking for are
[1,2,5,4,6]
Because, 1 connects to 5, 2 connects to 4. Then, 4 is connected to 6.
To achieve this, I need a queue and a list. Each time a new node is discovered, we append the new node to the queue and the list. Then, we remove the first node of the queue, and go to the next node. If a new node is connected to the current node of the queue, we add the new node to both the queue and the list.
We keep doing this until the queue is empty and we return the list.
So now, I have an append function which appends a list to another list:
fun append(xs, ys) =
  case ys of
    [] => xs
  | (y::ys') => append(xs @ [y], ys')
Then, I have a function called getIndirectNode, which is intended to return the list of nodes that the given nodes are indirectly connected to, but it throws "unresolved flex record". list1 and list2 supposedly have the same items, but list1 serves as the queue, and list2 serves as the list to be returned.
fun getIndirectNode(listRoleTuples, list1, list2) =
  if list1 = []
  then list2
  else if hd(list1) = #1(hd(listRoleTuples))
  then (
    append(list1,#2(hd(listRoleTuples)) :: []);
    append(list2,#2(hd(listRoleTuples)) :: []);
    getIndirectNode(listRoleTuples,tl(list1),list2)
  )
  else
    getIndirectNode(listRoleTuples,tl(list1),list2)
If I remove the else if statement, it works perfectly fine. But, it's not what I intended to do. The problem is in the else if statement. What can I do to fix it?
SML needs to know exactly what shape a tuple has in order to deconstruct it.
You could specify the type of the parameter - listRoleTuples : (''a * ''a) list - but using pattern matching is a better idea.
(There are many other problems with that code, but that's the answer to your question.)
It seems that one of your classmates had this exact tuple problem in a very related task.
Make sure you browse the StackOverflow Q&A's before you ask the same question again.
As for getting the indirect nodes, this can be solved by fixed-point iteration.
First you get all the direct nodes, and then you get the direct nodes of the direct nodes.
And you do this recursively until no more new nodes occur this way.
fun getDirectNodes (startNode, edges) =
  List.map #2 (List.filter (fn (node, _) => node = startNode) edges)

fun toSet xs =
  ... sort and remove duplicates ...

fun getReachableNodes (startNodes, edges) =
  let
    fun f startNode = getDirectNodes (startNode, edges)
    val startNodes = toSet startNodes
    (* Union of the nodes known so far and their direct successors. *)
    val nextNodes = toSet (startNodes @ List.concat (List.map f startNodes))
  in
    (* Fixed point reached: adding the successors changed nothing. *)
    if startNodes = nextNodes
    then nextNodes
    else getReachableNodes (nextNodes, edges)
  end
This doesn't exactly find indirect end-nodes; it finds all nodes directly or indirectly reachable by startNodes, and it includes startNodes themselves even if they're not directly or indirectly reachable by themselves.
I've tried to make this exercise easier by using sets as a datatype; it would be even neater with an actual, efficient implementation of a set type, e.g. using a balanced binary search tree. It is easier to see if there are no new nodes by adding elements to a set, since if a set already contains an element, it will be equivalent to itself before and after the addition of the element.
And I've tried to use higher-order functions when this makes sense. For example, given a list of things where I want to do the same thing on each element, List.map produces a list of results. But since that thing I want to do, getDirectNodes (startNode, edges) produces a list, then List.map f produces a list of lists. So List.concat collapses that into a single list.
List.concat (List.map f xs)
is a pretty common thing to do.

Finding the longest run of positive integers in OCaml

Trying an OCaml question of iterating through the list and finding the longest run of positive or negative integers. My thinking so far is you have to use List.fold_left and somehow +1 to the accumulator each time the next sign is the same as the current sign. However, I'm a bit stuck on how to save that value. Any help would be appreciated.
I suspect you're being downvoted because this is the kind of basic question that's probably best answered by looking at an introduction to OCaml or functional programming in general.
The essential idea of folds in general and List.fold_left in particular is to maintain some state while traversing a collection (or a list in particular). When you say you want to "save" a value, the natural answer is that the value would be part of the state that you maintain while traversing the list.
The template for calling List.fold_left looks like this:
let final_state =
  List.fold_left update_function initial_state list
The update function takes the current state and the next element of the list, and returns the value of the next state. So it looks like this:
let update_function old_state list_element =
  let new_state =
    (* compute based on old state and element *)
  in
  new_state
So the explicit answer to your question is that your update function (the function that you "fold" over the list) would save a value by returning it as part of the new state.
Here's some code that returns the largest non-negative integer that it sees in a list:
let largest_int list =
  let update largest_so_far element =
    max largest_so_far element
  in
  List.fold_left update 0 list
This code "saves" the largest int seen so far by returning it as the value of the update function. (Note that it returns a value of 0 for an empty list.)
Ah dang im also up all night doing OCaml LOL!!!
here's my shot at it ... one way is to sort the list and then find the first positive or first negative depending on how you sort ... checkout sort function from https://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html ... then it's easy to get the size.
Let's say you don't want to use this built in library ... This code should be close to working (If my editor/debugger worked with OCaml I'd further test but I think this is close to good)
let mostPositives (l : int list) : int =
  let counter = ref 0 in
  let maxSize = ref 0 in
  for i = 0 to List.length l - 1 do
    if List.nth l i >= 0 then
      counter := !counter + 1
    else
      (* a negative number ends the current run *)
      counter := 0;
    maxSize := max !counter !maxSize
  done;
  !maxSize

generate all binary trees with n nodes OCaml

I am trying to write a code that generates all binary trees with n nodes (so the program has to return a list in which we can find all the different binary trees with n nodes).
Here is the way I represent binary trees :
type 'a tree = Empty | Node of 'a * 'a tree * 'a tree
So I am trying to implement a function all_tree : int -> tree list such that :
all_tree 0 = [Empty]
all_tree 1 = [Node('x',Empty,Empty)]
all_tree 2 = [Node('x',Node('x',Empty,Empty),Empty); Node('x',Empty,Node('x',Empty,Empty))]
...
I tried several ideas but it didn't work out. For example we could try the following :
let rec all_tree result = function
  | 0 -> result
  | s -> all_tree ((List.map (fun i -> Node('x',i,Empty)) result) @ (List.map (fun i -> Node('x',Empty,i)) result)) (s-1)
in all_tree [Empty] (*some number*)
This code doesn't work because it doesn't generate every possibility.
Here is one possible answer.
let rec all_trees = function
  | 0 -> [Empty]
  | n ->
    let result = ref [] in
    for i = 0 to n-1 do
      let left_side = all_trees i
      and right_side = all_trees (n-1-i) in
      List.iter
        (fun left_tree ->
          List.iter
            (fun right_tree ->
              result := (Node('x', left_tree, right_tree)) :: (!result)
            )
            right_side
        )
        left_side
    done;
    !result
;;
It's pretty simple: a tree with n>0 nodes is a tree with 1 node at the top, and then n-1 nodes below split between a certain number on the left and a certain number on the right. So we loop for i from 0 to n-1 through all possible numbers of values on the left side, and n-i-1 is going to be the number of nodes on the right side. We recursively call all_trees to get the trees with i and n-i-1 nodes, and simply aggregate them.
Notice that it's a very poor implementation. It has everything a recursive function should avoid. See something like this page on recursive implementations of the Fibonacci sequence to see how to improve it (one of the first things to do would be to cache the results rather than recompute the same things many many times).
I do agree with the question's comments though that writing a printer would be step 1 in that kind of project, because it's really annoying having to read through messy things like [Node ('x', Node ('x', Empty, Node ('x', Node ('x', Empty, Empty), Empty)), Empty);. Naming variables better would also make it easier for people to read your code and will increase the chance someone will help you. And generally, listening to the comments when people give you advice on how to properly ask your questions will make it easier for you to get answers both right now and in your future questions. For instance, in my own code, I used i as the loop index. It makes sense to me while I'm coding it, but when you read the code, maybe you would have preferred to read something like left_side_nodes or something like that, which would have made it obvious what this variable was supposed to do. It's the same in your own scenario: you could call i something like subtree or maybe something even more explicit. Actually, properly naming it could make you realize what's wrong with your code. Often, if you can't properly name a variable, it's that you don't really understand what it's doing (even local variables).

Erlang Iterating through list removing one element

I have the following erlang code:
lists:all(fun(Element) -> somefunction(TestCase -- [Element]) end, TestCase).
Where TestCase is an array. I'm trying to iterate over the list/array with one element missing.
The problem is that this code takes O(N^2) time in the worst case, because a copy of the TestCase array is made every time -- is called. There is a clear O(N) solution in a non-functional language:
saved = TestCase[0]
NewTestCase = TestCase[1:]
for a in range(len(NewTestCase)):
    somefunction(NewTestCase)
    temp = NewTestCase[a]
    NewTestCase[a] = saved
    saved = temp
somefunction(NewTestCase)  # the list now lacks only the last element
... or something like that.
Is there an O(N) solution in erlang?
Of course there is, but it's a little bit more complicated. I am assuming that somefunction/1 is indeed a boolean function and you want to test whether it returns true for every sub-list.
test_on_all_but_one([], _Acc) -> true;
test_on_all_but_one([E|Rest], Acc) ->
    case somefunction(lists:reverse(Acc,Rest)) of
        true -> test_on_all_but_one(Rest, [E|Acc]);
        false -> false
    end.
This implementation is still O(length(List)^2), as each lists:reverse/2 call still needs O(length(Acc)) time. If you can modify somefunction/1 to do its calculation on a list split into two parts, then you can replace the call somefunction(lists:reverse(Acc,Rest)) with somefunction(Acc, Rest) or something similar and avoid the reconstruction.
The modification depends on the inner workings of somefunction/1. If you want more help with that, give some code!
You can split the list into 2 sublists, if it's acceptable of course.
witerate(Fun, [_Last], Acc) ->
    Fun([], Acc);
witerate(Fun, [Head | Tail], Acc) ->
    Fun(Tail, Acc),
    witerate(Fun, Tail, [Head | Acc]).
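Combining the two answers, a boolean version of the split-list traversal might look like the sketch below. It assumes the predicate can work on two parts, the elements before the removed one in reverse order and the elements after it; the module name leave_one_out and the function all/2 are hypothetical.

```erlang
-module(leave_one_out).
-export([all/2]).

%% Returns true if Pred(RevPrefix, Suffix) holds for every way of
%% leaving exactly one element out of List. RevPrefix holds the elements
%% before the removed one in reverse order; Suffix holds those after it.
all(Pred, List) ->
    all(Pred, [], List).

all(_Pred, _RevPrefix, []) ->
    true;
all(Pred, RevPrefix, [E | Suffix]) ->
    %% Moving E onto RevPrefix is O(1), so no list is ever rebuilt.
    Pred(RevPrefix, Suffix) andalso all(Pred, [E | RevPrefix], Suffix).
```

For example, leave_one_out:all(fun(P, S) -> lists:sum(P) + lists:sum(S) > 0 end, [1, 2, 3]) checks that the sum stays positive whichever single element is dropped. The traversal itself adds only constant work per element; the overall cost then depends on what the predicate does with the two parts.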