Wait for all processes created through spawn/3 to finish and collect their results in Elixir - concurrency

I want to spawn multiple processes that each do some computation, and collect the result of each in a list. Consider this (admittedly incorrect) toy example:
defmodule Counter do
  def loop(items \\ [])

  def loop(items) do
    receive do
      {:append, item} ->
        IO.inspect([item | items])
        loop([item | items])

      :exit ->
        items
    end
  end

  def push(from_pid, item) do
    send(from_pid, {:append, :math.pow(item, 2)})
  end

  def run() do
    for item <- 1..10 do
      spawn(Counter, :push, [self(), item])
    end

    loop()
  end
end
Counter.run()
The run/0 function spawns 10 processes, passing each one two arguments: the caller's process id and a number.
Each spawned process computes its result (in this case, the square of the given number) and sends it back.
The loop/1 function listens for messages and accumulates the results into a list.
The problem is that I do not understand how to properly stop listening for messages once all the created processes are done. I cannot just send another message type (in this case, :exit) to stop calling loop/1 recursively, because some processes might not be done yet. Of course, I could keep track of the number of received messages and stop calling loop/1 once the target count is reached, but I doubt that this is the correct approach.
How do I implement this properly?

Task.Supervisor.async_stream_nolink is a good tool for jobs like this. Although it may not address the specifics of how the low-level send and receive work, it is a good recipe for this kind of problem, especially when you need to control how many things happen concurrently.
Consider the following block: it will take approx. 5 seconds to run because each task sleeps for 1 second, but we run 2 of them concurrently (via max_concurrency).
iex> Task.Supervisor.start_link(name: TmpTaskSupervisor)
iex> Task.Supervisor.async_stream_nolink(
       TmpTaskSupervisor,
       1..10,
       fn item ->
         IO.puts("processing item #{item}")
         Process.sleep(1_000)
       end,
       timeout: 120_000,
       max_concurrency: 2
     )
     |> Stream.run()
If you want to return the values, convert the resulting stream to a list, but keep in mind that each return value will be wrapped in an :ok tuple:
iex> Task.Supervisor.start_link(name: TmpTaskSupervisor)
iex> Task.Supervisor.async_stream_nolink(
       TmpTaskSupervisor,
       1..10,
       fn n ->
         n * n
       end,
       timeout: 120_000,
       max_concurrency: 2
     )
     |> Enum.to_list()
[ok: 1, ok: 4, ok: 9, ok: 16, ok: 25, ok: 36, ok: 49, ok: 64, ok: 81, ok: 100]

The problem is I do not understand how to properly stop listening to messages after all created processes are done
spawn/3 returns a pid. Keep a list of all the pids, then pass the list as an argument to your loop() function.
The receive inside your loop() function will wait for a message tagged with a pid. The first pid in the list is the first one you wait for, e.g. {FirstPid, Result}.
A spawned process should send a message back in the form of {self(), Result}.
Once you receive a message from the first pid in the list, then you recursively call loop() again with the tail of the list.
Once the list is empty, you end your loop (think multiple function clauses).
Now, suppose the first pid takes the longest to calculate its result: you sit waiting in the receive for that result, and afterwards all the other receives execute with essentially no waiting, each taking only a few microseconds. The total time to get all the messages is therefore approximately the time of the longest calculation.
Next, suppose the first pid takes the shortest time to calculate its result, say T1. After you receive its message, the second pid finishes in another T2-T1 seconds, because while you were waiting for the first pid it was already calculating for T1 seconds, and so on for T3, T4, etc. All the shorter calculations finish before the longest one, and you will have received their messages by the time the longest calculation completes, so the total time you wait is still the time of the longest calculation.
In other words, it doesn't matter what order the pids are in the list.
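Applied to the original example, a minimal Elixir sketch of this approach might look like the following (the module name Squarer and the helper collect/2 are made-up names for illustration):
defmodule Squarer do
  # Worker: computes the square and sends it back tagged with its own pid.
  def push(from_pid, item) do
    send(from_pid, {self(), item * item})
  end

  def run do
    parent = self()

    # spawn/3 returns a pid; keep all of them in a list.
    pids =
      for item <- 1..10 do
        spawn(__MODULE__, :push, [parent, item])
      end

    collect(pids, [])
  end

  # No pids left to wait for: return the accumulated results.
  defp collect([], acc), do: Enum.reverse(acc)

  # Wait specifically for a message tagged with the first pid in the list,
  # then recurse on the tail. As explained above, the order of the pids
  # does not change the total waiting time.
  defp collect([pid | rest], acc) do
    receive do
      {^pid, result} -> collect(rest, [result | acc])
    end
  end
end

Squarer.run()
#=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]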

Use Flow:
1..10
|> Flow.from_enumerable(max_demand: 1)
|> Flow.map(&(&1 * &1))
|> Enum.to_list()
Result (unsorted):
[1, 9, 16, 25, 36, 49, 4, 64, 81, 100]

Related

Run every N minutes or if item differs from average

I have an actor which receives WeatherConditions and pushes them (by using OfferAsync) to a source. Currently it is set up to run for each item it receives (it stores each item to the db).
public class StoreConditionsActor : ReceiveActor
{
    public StoreConditionsActor(ITemperatureDataProvider temperatureDataProvider)
    {
        var materializer = Context.Materializer();
        var source = Source.Queue<WeatherConditions>(10, OverflowStrategy.DropTail);
        var graph = source
            .To(Sink.ForEach<WeatherConditions>(conditions => temperatureDataProvider.Store(conditions)))
            .Run(materializer);

        Receive<WeatherConditions>(i =>
        {
            graph.OfferAsync(i);
        });
    }
}
What I would like to achieve is:
Run it only once every N minutes and store the average value of the WeatherConditions received in that N-minute time window
If a received item matches a certain condition (e.g. its temperature is 30% higher than the previous item's temperature), process it immediately, even though it would otherwise be "hidden" inside the time window.
I've been trying ConflateWithSeed, Buffer, and Throttle, but none of them seems to work (I'm a newbie in Akka / Akka Streams, so I may be missing something basic).
This answer uses Akka Streams and Scala, but perhaps it will inspire your Akka.NET solution.
The groupedWithin method could meet your first requirement:
val queue =
  Source.queue[Int](10, OverflowStrategy.dropTail)
    .groupedWithin(10, 1 second)
    .map(group => group.sum / group.size)
    .toMat(Sink.foreach(println))(Keep.left)
    .run()

Source(1 to 10000)
  .throttle(10, 1 second)
  .mapAsync(1)(queue.offer(_))
  .runWith(Sink.ignore)
In the above example, up to 10 integers per second are offered to the SourceQueue, which groups the incoming elements in one-second bundles and calculates the respective averages of each bundle.
As for your second requirement, you could use sliding to compare an element with the previous element. The below example passes an element downstream only if it is at least 30% greater than the previous element:
val source: Source[Int, _] = ???

source
  .sliding(2, 1)
  .collect {
    case Seq(a, b) if b >= 1.3 * a => b
  }
  .runForeach(println)

Weird behaviour with numpy and multiprocessing.process

Sorry for the long code; I have tried to make it as simple as possible and yet reproducible.
In short, this Python script starts four processes that randomly distribute numbers into lists. Then, each result is put into a multiprocessing.Queue().
import random
import multiprocessing
import numpy
import sys

def work(subarray, queue):
    result = [numpy.array([], dtype=numpy.uint64) for i in range(0, 4)]
    for element in numpy.nditer(subarray):
        index = random.randint(0, 3)
        result[index] = numpy.append(result[index], element)
    queue.put(result)
    print "after the queue.put"

jobs = []
queue = multiprocessing.Queue()
subarray = numpy.array_split(numpy.arange(1, 10001, dtype=numpy.uint64), 4)

for i in range(0, 4):
    process = multiprocessing.Process(target=work, args=(subarray[i], queue))
    jobs.append(process)
    process.start()

for j in jobs:
    j.join()

print "the end"
All processes reach the print "after the queue.put" line. However, the script never reaches the print "the end" line. Oddly enough, if I change the arange from 10001 to 1001, it does get to the end. What is happening?
Most of the child processes are blocking on the put call. The multiprocessing Queue.put documentation says it will "block if necessary until a free slot is available."
This can be avoided by adding a call to queue.get() before join.
Also, in multiprocessing code, please isolate the parent process by having:
if __name__ == '__main__':
    # main code here
See: Compulsory usage of if __name__ == "__main__" in Windows while using multiprocessing
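A minimal sketch of that fix applied to the example above (reusing the work function and keeping everything else unchanged) might look like this:
if __name__ == '__main__':
    jobs = []
    queue = multiprocessing.Queue()
    subarray = numpy.array_split(numpy.arange(1, 10001, dtype=numpy.uint64), 4)

    for i in range(0, 4):
        process = multiprocessing.Process(target=work, args=(subarray[i], queue))
        jobs.append(process)
        process.start()

    # Drain the queue first, so the children's feeder threads can flush
    # their buffered items; only then is it safe to join them.
    results = [queue.get() for _ in jobs]

    for j in jobs:
        j.join()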
I will expand my comment into a short answer. As I also do not understand the weird behavior, this is merely a workaround.
A first observation is that the code runs to the end if the queue.put line is commented out, so it must be a problem related to the queue. The results are actually added to the queue, so the problem must lie in the interplay between the queue and join.
The following code works as expected:
import random
import multiprocessing
import numpy
import sys
import time

def work(subarray, queue):
    result = [numpy.array([], dtype=numpy.uint64) for i in range(4)]
    for element in numpy.nditer(subarray):
        index = random.randint(0, 3)
        result[index] = numpy.append(result[index], element)
    queue.put(result)
    print("after the queue.put")

jobs = []
queue = multiprocessing.Queue()
subarray = numpy.array_split(numpy.arange(1, 15001, dtype=numpy.uint64), 4)

for i in range(4):
    process = multiprocessing.Process(target=work, args=(subarray[i], queue))
    jobs.append(process)
    process.start()

res = []
while len(res) < 4:
    res.append(queue.get())

print("the end")
This is the reason:
Joining processes that use queues
Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe. (The child process can call the cancel_join_thread() method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.

How to call Executor.map in custom dask graph?

I've got a computation, consisting of 3 "map" steps, and the last step depends on results of the first two. I am performing this task using dask.distributed running on several PCs.
The dependency graph looks like the following:
map(func1, list1) -> res_list1-\
                                | -> create_list_3(res_list1, res_list2) -> list3 -> map(func3, list3)
map(func2, list2) -> res_list2-/
If these computations were independent, it would be straightforward to call the map function 3 times.
from distributed import Executor, progress

def process(jobid):
    e = Executor('{address}:{port}'.format(address=config('SERVER_ADDR'),
                                           port=config('SERVER_PORT')))
    futures = []
    futures.append(e.map(func1, list1))
    futures.append(e.map(func2, list2))
    futures.append(e.map(func3, list3))
    return futures

if __name__ == '__main__':
    jobid = 'blah-blah-blah'
    r = process(jobid)
    progress(r)
However, list3 is constructed from the results of func1 and func2, and its creation is not easily mappable (list1, list2, res_list1 and res_list2 are stored in the PostgreSQL database, and the creation of list3 is a JOIN query that takes some time).
I've tried adding a call to submit to the list of futures; however, that did not work as expected:
def process(jobid):
    e = Executor('{address}:{port}'.format(address=config('SERVER_ADDR'),
                                           port=config('SERVER_PORT')))
    futures = []
    futures.append(e.map(func1, list1))
    futures.append(e.map(func2, list2))
    futures.append(e.submit(create_list_3))
    futures.append(e.map(func3, list3))
    return futures
In this case one dask-worker received the task to execute create_list_3, but the others simultaneously received tasks to call func3, and those erred because list3 did not exist yet.
The obvious problem: I'm missing synchronization. The workers must stop and wait until the creation of list3 is finished.
The dask documentation describes custom task graphs, which can provide synchronization.
However, the examples in the documentation do not include map functions, only simple calculations, such as calls to add and inc.
Is it possible to use map together with a custom dask graph in my case, or should I implement synchronization by some other means not included in dask?
If you want to link dependencies between tasks then you should pass the outputs from previous tasks into the inputs of another.
futures1 = e.map(func1, list1)
futures2 = e.map(func2, list2)
futures3 = e.map(func3, futures1, futures2)
For any invocation of func3 Dask will handle waiting until the inputs are ready and will send the appropriate results to that function from wherever they get computed.
However it looks like you want to handle data transfer and synchronization through some other custom means. If this is so then perhaps it would be useful to pass along some token to the call to func3.
futures1 = e.map(func1, list1)
futures2 = e.map(func2, list2)

def do_nothing(*args):
    return None

token1 = e.submit(do_nothing, futures1)
token2 = e.submit(do_nothing, futures2)

list3 = e.submit(create_list_3)

def func3(arg, tokens=None):
    ...

futures3 = e.map(func3, list3, tokens=[token1, token2])
This is a bit of a hack, but would force all func3 functions to wait until they were able to get the token results from the previous map calls.
However, I recommend trying to do something like the first option. It allows dask to be much smarter about when it runs tasks and when it can release resources. Barriers like token1/2 result in sub-optimal scheduling.

Python multiprocess get result from queue

I'm running a multiprocessing script that is supposed to launch 2,000,000 jobs of about 0.01 seconds each. Each job puts its result in a queue imported from the Queue module, because the queue from the multiprocessing module couldn't handle more than 517 results in it.
My program freezes before getting the results from the queue. Here is the core of my multiprocessing function:
while argslist != []:
    p = mp.Process(target=function, args=(result_queue, argslist.pop(),))
    jobs.append(p)
    p.start()

for p in jobs:
    p.join()
print 'over'

res = [result_queue.get() for p in jobs]
print 'got it'
output: "over" but never "got it"
When I replace result_queue.get() with result_queue.get_nowait(), I get the Empty exception saying my queue is empty...
But if I do the queue.get() just after the queue.put() in my inner function, then it works, which shows me that the queue is properly filled by my function.
queue.Queue is not shared between processes, so this won't work; you must use multiprocessing.Queue.
To avoid a deadlock you should not join your processes before getting the results from the queue. A multiprocessing.Queue is effectively limited by its underlying pipe's buffer: once that fills up, no more items can be flushed to the pipe, and queue.put() will block until a consumer calls queue.get(). But if the consumer is joining a blocked process, then you have a deadlock.
You can avoid all of this by using a multiprocessing.Pool and its map() instead.
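For illustration, here is a minimal sketch of the Pool approach; the worker simply returns its result instead of writing to a shared queue, and the function and argument names are placeholders:
import multiprocessing as mp

def function(arg):
    # the ~0.01 s of work goes here; return the result instead of
    # putting it on a queue
    return arg * arg

if __name__ == '__main__':
    argslist = range(2000000)
    pool = mp.Pool(processes=mp.cpu_count())
    res = pool.map(function, argslist)  # blocks until all jobs are done
    pool.close()
    pool.join()
    print(len(res))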
Thank you mata, I switched back to multiprocessing.Queue(), but I don't want to use a pool because I want to keep track of how many jobs have run. I finally added an if statement to regularly empty my queue:
def multiprocess(function, argslist, ncpu):
    total = len(argslist)
    done = 0
    result_queue = mp.Queue(0)
    jobs = []
    res = []
    while argslist != []:
        if len(mp.active_children()) < ncpu:
            p = mp.Process(target=function, args=(result_queue, argslist.pop(),))
            jobs.append(p)
            p.start()
            done += 1
            print "\r", float(done)/total*100, "%",  # here is to keep track
            # here comes my emptying step
            if len(jobs) == 500:
                tmp = [result_queue.get() for p in jobs]
                for r in tmp:
                    res.append(r)
                result_queue = mp.Queue(0)
                jobs = []
    tmp = [result_queue.get() for p in jobs]
    for r in tmp:
        res.append(r)
    return res
Then this question comes to my mind:
Is the limit of 500 jobs due to Python, or to my machine or my system?
Will this threshold be buggy if my multiprocessing function is used under other conditions?

Erlang course concurrency exercise: Can my answer be improved?

I am doing this exercise from the erlang.org course:
2) Write a function which starts N processes in a ring, and sends a message M times around all the processes in the ring. After the messages have been sent the processes should terminate gracefully.
Here's what I've come up with:
-module(ring).
-export([start/2, node/2]).

node(NodeNumber, NumberOfNodes) ->
    NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
    NextNodeName = node_name(NextNodeNumber),
    receive
        CircuitNumber ->
            io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),
            LastNode = NodeNumber =:= NumberOfNodes - 1,
            NextCircuitNumber = case LastNode of
                                    true ->
                                        CircuitNumber - 1;
                                    false ->
                                        CircuitNumber
                                end,
            if
                NextCircuitNumber > 0 ->
                    NextNodeName ! NextCircuitNumber;
                true ->
                    ok
            end,
            if
                CircuitNumber > 1 ->
                    node(NodeNumber, NumberOfNodes);
                true ->
                    ok
            end
    end.

start(NumberOfNodes, NumberOfCircuits) ->
    lists:foreach(fun(NodeNumber) ->
                          register(node_name(NodeNumber),
                                   spawn(ring, node, [NodeNumber, NumberOfNodes]))
                  end,
                  lists:seq(0, NumberOfNodes - 1)),
    node_name(0) ! NumberOfCircuits,
    ok.

node_name(NodeNumber) ->
    list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).
Here is its output:
17> ring:start(3, 2).
Node 0 Circuit 2
ok
Node 1 Circuit 2
Node 2 Circuit 2
Node 0 Circuit 1
Node 1 Circuit 1
Node 2 Circuit 1
If I actually knew Erlang, what could I do differently to improve this code? And specifically:
Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?
Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?
Welcome to Erlang! I hope you enjoy it as much as I do.
Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?
You can just leave these off. I ran your code with this:
if NextCircuitNumber > 0 ->
       NextNodeName ! NextCircuitNumber
end,
if CircuitNumber > 1 ->
       node(NodeNumber, NumberOfNodes)
end
and it worked for me.
Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?
Yes, you are. You can verify this by running the i(). command. This will show you the list of processes, and if your registered processes weren't terminating, you would see a lot of them left over, like node0, node1, etc. You also would not be able to run your program a second time, because it would error when trying to register an already registered name.
As far as other things you could do to improve the code, there is not much because your code is basically fine. One thing I might do is leave off the NextNodeName variable. You can just send a message directly to node_name(NextNodeNumber) and that works.
Also, you could probably do a bit more pattern matching to improve things. For example, one change I made while playing with your code was to spawn the processes by passing in the number of the last node (NumberOfNodes - 1), rather than passing NumberOfNodes. Then I could pattern match in my node/2 function header like this:
node(LastNode, LastNode) ->
    % Do things specific to the last node, like passing the message back to node0
    % and decrementing the CircuitNumber
node(NodeNumber, LastNode) ->
    % Do things for every other node.
That allowed me to clean up some of the case and if logic in your node function and make it all a little tidier.
Hope that helps, and good luck.
Let's walk through the code:
-module(ring).
-export([start/2, node/2]).
The name node is one I avoid, because a node() in Erlang has the connotation of an Erlang VM running on some machine - usually several nodes run on several machines. I'd rather call it ring_proc or some such.
node(NodeNumber, NumberOfNodes) ->
    NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
    NextNodeName = node_name(NextNodeNumber),
This is what we are trying to spawn; we get the number of the next node and the name of the next node. Let's look at node_name/1 as an interlude:
node_name(NodeNumber) ->
    list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).
This function is a bad idea. You need a locally registered name, which must be an atom, so you created a function that can produce arbitrarily many such names. The warning here is that the atom table is not garbage collected and is limited in size, so we should avoid creating atoms dynamically if possible. The trick to solve this problem is to pass pids instead and build the ring in reverse. The final process then ties the knot of the ring:
mk_ring(N) ->
    Pid = spawn(fun() -> ring(none) end),
    mk_ring(N, Pid, Pid).

mk_ring(0, NextPid, Initiator) ->
    Initiator ! {set_next, NextPid},
    Initiator;
mk_ring(N, NextPid, Initiator) ->
    Pid = spawn(fun() -> ring(NextPid) end),
    mk_ring(N-1, Pid, Initiator).
And then we can rewrite your start function:
start(NumberOfNodes, NumberOfCircuits) ->
    RingStart = mk_ring(NumberOfNodes),
    RingStart ! {operate, NumberOfCircuits, self()},
    receive
        done ->
            RingStart ! stop
    end,
    ok.
The Ring code is then something along the lines of:
ring(NextPid) ->
    receive
        {set_next, Pid} ->
            ring(Pid);
        {operate, N, Who} ->
            ring_ping(N, NextPid),
            Who ! done,
            ring(NextPid);
        ping ->
            NextPid ! ping,
            ring(NextPid);
        stop ->
            NextPid ! stop,
            ok
    end.
And to fire something around the ring N times:
ring_ping(0, _Next) -> ok;
ring_ping(N, Next) ->
    Next ! ping,
    receive
        ping ->
            ring_ping(N-1, Next)
    end.
(None of this code has been tested by the way, so it may very well be quite wrong).
As for the rest of your code:
receive
    CircuitNumber ->
        io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),
I'd tag the CircuitNumber with some atom: {run, CN}.
LastNode = NodeNumber =:= NumberOfNodes - 1,
NextCircuitNumber = case LastNode of
                        true ->
                            CircuitNumber - 1;
                        false ->
                            CircuitNumber
                    end,
This can be done with an if:
NextCN = if NodeNumber =:= NumberOfNodes - 1 -> CN - 1;
            NodeNumber =/= NumberOfNodes - 1 -> CN
         end,
The next part here:
if
    NextCircuitNumber > 0 ->
        NextNodeName ! NextCircuitNumber;
    true ->
        ok
end,
if
    CircuitNumber > 1 ->
        node(NodeNumber, NumberOfNodes);
    true ->
        ok
end
does need the true case, unless you never hit it. The process will crash if nothing matches in the if. It is often possible to rewire the code so that it does not rely so much on counting constructions, as my code above hints.
Several problems can be avoided with this code. One problem with the current code is that if something crashes in the ring, the ring is broken. We can use spawn_link rather than spawn to link the ring together, so such errors will take down the whole ring rather than leave it half-broken. Furthermore, our ring_ping function will crash if it is sent a message while the ring is operating. This can be alleviated; the simplest way is probably to alter the state of the ring process so that it knows it is currently operating, and to fold ring_ping into ring. Finally, we should probably also link to the initial spawn so we don't end up with a large ring that is alive but that no one has a reference to. Perhaps we could also register the initial process so it is easy to grab hold of the ring later.
The start function is also bad in two ways. First, we should use make_ref() to tag the message with a unique reference and then receive that tag back, so another process can't be sinister and just send done to the start process while the ring works. We should probably also add a monitor on the ring while it is working; otherwise we will never be informed should the ring crash while we are waiting for the done message (with tag). OTP does both in its synchronous calls, by the way.
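A minimal sketch of those two fixes (untested, like the rest of the code above, and assuming the ring's {operate, ...} handler is changed to echo the reference back in its done message):
start(NumberOfNodes, NumberOfCircuits) ->
    RingStart = mk_ring(NumberOfNodes),
    Ref = make_ref(),
    MRef = erlang:monitor(process, RingStart),
    RingStart ! {operate, NumberOfCircuits, self(), Ref},
    receive
        {done, Ref} ->
            %% Only our own ring knows this reference.
            erlang:demonitor(MRef, [flush]),
            RingStart ! stop,
            ok;
        {'DOWN', MRef, process, RingStart, Reason} ->
            %% The ring crashed while we were waiting.
            {error, Reason}
    end.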
Finally, finally: No, you don't have to clean up a registration.
My colleagues have made some excellent points. I'd also like to mention that the original intent of the exercise is being sidestepped by registering the processes instead of actually creating a ring. Here is one possible solution:
-module(ring).
-export([start/3]).

-record(message, {data, rounds, total_nodes, first_node}).

start(TotalNodes, Rounds, Data) ->
    FirstNode = spawn_link(fun() -> loop(1, 0) end),
    Message = #message{data=Data, rounds=Rounds, total_nodes=TotalNodes,
                       first_node=FirstNode},
    FirstNode ! Message, ok.

loop(Id, NextNode) when not is_pid(NextNode) ->
    receive
        M=#message{total_nodes=Total, first_node=First} when Id =:= Total ->
            First ! M,
            loop(Id, First);
        M=#message{} ->
            Next = spawn_link(fun() -> loop(Id+1, 0) end),
            Next ! M,
            loop(Id, Next)
    end;
loop(Id, NextNode) ->
    receive
        M=#message{rounds=0} ->
            io:format("node: ~w, stopping~n", [Id]),
            NextNode ! M;
        M=#message{data=D, rounds=R, total_nodes=Total} ->
            io:format("node: ~w, message: ~p~n", [Id, D]),
            if Id =:= Total -> NextNode ! M#message{rounds=R-1};
               Id =/= Total -> NextNode ! M
            end,
            loop(Id, NextNode)
    end.
This solution uses records. If you are unfamiliar with them, read all about them here.
Each node is defined by a loop/2 function. The first clause of loop/2 deals with creating the ring (the build phase), and the second clause deals with printing the messages (the data phase). Notice that all clauses end in a call to loop/2 except the rounds=0 clause, which indicates that the node is done with its task and should die. This is what is meant by graceful termination. Also note the hack used to tell the node that it's in the build phase - NextNode isn't a pid but rather an integer.