I wrote a simple program ("controller") to run some computation on a separate node ("worker"), the idea being that if the worker node runs out of memory, the controller keeps working:
-module(controller).
-compile(export_all).

p(Msg, Args) -> io:format("~p " ++ Msg, [time() | Args]).

progress_monitor(P, N) ->
    timer:sleep(5*60*1000),
    p("killing the worker which was using strategy #~p~n", [N]),
    exit(P, took_to_long).

start() ->
    start(1).

start(Strat) ->
    P = spawn('worker@localhost', worker, start, [Strat, self(), 60000000000]),
    p("starting worker using strategy #~p~n", [Strat]),
    spawn(controller, progress_monitor, [P, Strat]),
    monitor(process, P),
    receive
        {'DOWN', _, _, P, Info} ->
            p("worker using strategy #~p died. reason: ~p~n", [Strat, Info]);
        X ->
            p("got result: ~p~n", [X])
    end,
    case Strat of
        4 -> p("out of strategies. giving up~n", []);
        _ -> timer:sleep(5000), % wait for node to come back
             start(Strat + 1)
    end.
To test it, I deliberately wrote 3 factorial implementations that will use up lots of memory and crash, and a fourth implementation which uses tail recursion to avoid taking too much space:
-module(worker).
-compile(export_all).

start(1, P, N) -> P ! factorial1(N);
start(2, P, N) -> P ! factorial2(N);
start(3, P, N) -> P ! factorial3(N);
start(4, P, N) -> P ! factorial4(N, 1).

factorial1(0) -> 1;
factorial1(N) -> N * factorial1(N-1).

factorial2(N) ->
    case N of
        0 -> 1;
        _ -> N * factorial2(N-1)
    end.

factorial3(N) -> lists:foldl(fun(X, Y) -> X*Y end, 1, lists:seq(1, N)).

factorial4(0, A) -> A;
factorial4(N, A) -> factorial4(N-1, A*N).
Note that even with the tail-recursive version, I'm calling it with 60000000000, so it will probably take days on my machine even with factorial4. Here is the output of running the controller:
$ erl -sname 'controller@localhost'
Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.1 (abort with ^G)
(controller@localhost)1> c(worker).
{ok,worker}
(controller@localhost)2> c(controller).
{ok,controller}
(controller@localhost)3> controller:start().
{23,24,28} starting worker using strategy #1
{23,25,13} worker using strategy #1 died. reason: noconnection
{23,25,18} starting worker using strategy #2
{23,26,2} worker using strategy #2 died. reason: noconnection
{23,26,7} starting worker using strategy #3
{23,26,40} worker using strategy #3 died. reason: noconnection
{23,26,45} starting worker using strategy #4
{23,29,28} killing the worker which was using strategy #1
{23,29,29} worker using strategy #4 died. reason: took_to_long
{23,29,29} out of strategies. giving up
ok
It almost works, but worker #4 was killed too early (it should have been close to 23:31:45, not 23:29:29). Looking closer, a kill was only ever attempted against worker #1, and no other. So worker #4 should not have died, yet it did. Why? We can even see that the reason was took_to_long, and that progress_monitor #1 started at 23:24:28, five minutes before 23:29:29. So it looks like progress_monitor #1 killed worker #4 instead of worker #1. Why did it kill the wrong process?
Here is the output of the worker when I ran the controller:
$ while true; do erl -sname 'worker@localhost'; done
Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.1 (abort with ^G)
(worker@localhost)1>
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 2733560184 bytes of memory (of type "heap").
Aborted
Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.1 (abort with ^G)
(worker@localhost)1>
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 2733560184 bytes of memory (of type "heap").
Aborted
Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.1 (abort with ^G)
(worker@localhost)1>
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 2733560184 bytes of memory (of type "old_heap").
Aborted
Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V5.10.1 (abort with ^G)
(worker@localhost)1>
There are several issues at play, and what you eventually ran into is creation-number wraparound.
Since you do not cancel the progress_monitor process, it will always send an exit signal after 5 minutes, whether the worker has finished or not.
The computation is long and/or the VM is slow, so worker 4 is still running 5 minutes after the progress monitor for worker 1 was started.
The 4 worker nodes were started sequentially with the same name worker@localhost, and the creation numbers of the first and the fourth node are the same.
Creation numbers (the creation field in references and pids) are a mechanism to prevent pids and references created by a crashed node from being interpreted by a new node with the same name. This is exactly what you want in your code: when you try to kill worker 1 after its node is long gone, you don't intend to kill a process in a restarted node.
When a node sends a pid or a reference, it encodes its creation number. When it receives a pid or a reference from another node, it checks that the creation number in the pid matches its own. Creation numbers are assigned by epmd, cycling through the sequence 1, 2, 3.
Here, unfortunately, when the 4th node gets the exit signal, the creation number matches because the sequence has wrapped around. And since every node did exactly the same thing before spawning its worker (initializing Erlang), the pid of node 4's worker is identical to the pid of node 1's worker.
As a result, the controller eventually kills worker 4 believing it is worker 1.
To avoid this, you need something more robust than the creation number if there can be 4 workers within the lifespan of a pid or a reference in the controller.
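A minimal sketch of a fix for the first issue (based on the controller code above, untested): make progress_monitor cancellable, and cancel it as soon as the controller hears back about the worker, so a stale monitor can never fire into a later node:
progress_monitor(P, N) ->
    receive
        cancel -> ok   % the worker finished or died; nothing to do
    after 5*60*1000 ->
        p("killing the worker which was using strategy #~p~n", [N]),
        exit(P, took_to_long)
    end.

start(Strat) ->
    P = spawn('worker@localhost', worker, start, [Strat, self(), 60000000000]),
    p("starting worker using strategy #~p~n", [Strat]),
    Monitor = spawn(controller, progress_monitor, [P, Strat]),
    monitor(process, P),
    receive
        {'DOWN', _, _, P, Info} ->
            p("worker using strategy #~p died. reason: ~p~n", [Strat, Info]);
        X ->
            p("got result: ~p~n", [X])
    end,
    Monitor ! cancel,   % don't let the monitor outlive this attempt
    case Strat of
        4 -> p("out of strategies. giving up~n", []);
        _ -> timer:sleep(5000), % wait for node to come back
             start(Strat + 1)
    end.
For a more robust identity check than the pid alone, the controller could also tag each attempt with a unique reference (make_ref/0) and have the worker echo it back with its result.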
Related
Hello everyone, I hope someone can help me.
python:3.8
Django==4.0.4
celery==5.2.1
I am using Python/Django/Celery to do some work. When I fetch data from Hive with SQL, my Celery worker gets the error "process 'forkPoolworker-5' pid:111 exited with 'signal 9 (SIGKILL)'"; after that my task never finishes and the TCP connection is closed. What can I do to solve this?
I tried the following:
CELERYD_MAX_TASKS_PER_CHILD = 1  # max number of tasks per worker child
CELERYD_CONCURRENCY = 3  # max concurrency per worker
CELERYD_MAX_MEMORY_PER_CHILD = 1024*1024*2  # a task may use up to 2G of memory
CELERY_TASK_RESULT_EXPIRES = 60 * 60 * 24 * 3
-Ofair
but none of these solved the problem.
SIGKILL is raised by the system, most likely because of memory (or storage) pressure. Monitor how much memory a Celery task takes by running the worker with the -P solo option or -c 1, and allocate sufficient memory accordingly.
To check memory usage, use either pmap <pid> or ps -a -o rss,vsz. Look up rss and vsz for more details (in short, rss is resident RAM and vsz is virtual memory).
CELERYD_MAX_TASKS_PER_CHILD = 1 kills the child process after every task, so CELERYD_MAX_MEMORY_PER_CHILD has no effect; i.e. the worker waits for the task to complete before enforcing the memory limit on a running child process.
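For illustration, a sketch of how those two settings could be combined so that the memory cap actually does the recycling (the setting names are the old-style ones from the question; the task count and memory values are placeholders to adjust for your workload):
CELERYD_CONCURRENCY = 3                          # number of worker child processes
CELERYD_MAX_TASKS_PER_CHILD = 100                # recycle children occasionally, not after every task
CELERYD_MAX_MEMORY_PER_CHILD = 1024 * 1024 * 2   # per-child memory cap (intended as ~2G per the question's comment)
CELERY_TASK_RESULT_EXPIRES = 60 * 60 * 24 * 3    # keep task results for 3 days
With MAX_TASKS_PER_CHILD above 1, a child that grows past the memory cap is replaced after finishing its current task instead of being restarted unconditionally.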
Idle (0 peers), best: #0 (0xed0a…2e72), finalized #0 (0xed0a…2e72), ⬇ 0 ⬆ 0
Idle (0 peers), best: #0 (0xed0a…2e72), finalized #0 (0xed0a…2e72), ⬇ 0 ⬆ 0
I am getting the above output when I run this command (./target/release/substrate --chain=staging) on a Substrate full node.
I also tried to run a private network for staging; the result was the same.
In either case the network is not producing blocks.
Is there any guide on how to use staging?
I need to run a production network, and I have seen that for production purposes we should use --staging rather than --dev or --local. Is this right?
You need to also add a block producer key to your command.
Not exactly sure what the staging chain specification looks like, but something like:
./target/release/substrate --chain=staging --alice
Where we assume Alice is a configured block producer for the chain.
I'm trying to use Torque's (5.1.1) qsub command to launch multiple OpenMPI
processes, one process per node, and having each process launch a single
process on its own local node using MPI_Comm_spawn(). MPI_Comm_spawn() is reporting:
All nodes which are allocated for this job are already filled.
My OpenMPI version is 4.0.1.
I am following the instructions here to control the mapping of nodes (Controlling node mapping of MPI_COMM_SPAWN), using the --map-by ppr:1:node option to mpiexec and a hostfile (programmatically derived from the ${PBS_NODEFILE} file that Torque produces). My derived file MyHostFile looks like this:
n001.cluster.com slots=2 max_slots=2
n002.cluster.com slots=2 max_slots=2
while the original ${PBS_NODEFILE} only has the node names, and no slot specifications.
My qsub command is
qsub -V -j oe -e ./tempdir -o ./tempdir -N MyJob MyJob.bash
The mpiexec command from MyJob.bash is
mpiexec --display-map --np 2 --hostfile MyNodefile --map-by ppr:1:node <executable>.
MPI_Comm_spawn() causes this error to be printed:
Data for JOB [22220,1] offset 0 Total slots allocated 1 <=====
======================== JOB MAP ========================
Data for node: n001 Num slots: 1 Max slots: 0 Num procs: 1
Process OMPI jobid: [22220,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0]]:[B/././././././././.][./././././././././.]
=============================================================
All nodes which are allocated for this job are already filled.
There are two things that occur to me:
(1) "Total slots allocated" is 1 above, but I need at least two slots available.
(2) It may not be right to try to specify a hostfile to mpiexec when
using Torque (though it is derived from the Torque hostfile ${PBS_NODEFILE}). Maybe my derived hostfile is being ignored.
Is there a way to make this work? I've tried recompiling OpenMPI without Torque support, hoping to prevent OpenMPI from interacting with it, but it didn't change the error message.
Answering my own question: adding the argument -l nodes=1:ppn=2 to the qsub command reserves 2 processors on the node, even though mpiexec is launching only one process. MPI_Comm_spawn() can then spawn the new process on the second reserved slot.
I also had to compile OpenMPI without Torque support, since including it causes my hostfile argument to be ignored and the Torque-generated hostfile to be used.
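For reference, combining this with the original command above, the submission would look something like:
qsub -V -j oe -e ./tempdir -o ./tempdir -N MyJob -l nodes=1:ppn=2 MyJob.bash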
I'm writing a CLI tool that turns on the OS X web proxy on startup, and on shutdown I'd like to turn it off again. What's the correct way to catch SIGINT and perform app cleanup? I tried the following; it prints the message but does not run the system command or exit:
Signal::INT.trap do
  puts "trap"
  fork do
    system "networksetup -setwebproxystate Wi-Fi off"
  end
  exit
end
This code does exit but gives an 'Invalid memory access' error
at_exit do
  fork do
    system "networksetup -setwebproxystate Wi-Fi off"
  end
end
LibC.signal Signal::INT.value, ->(s : Int32) { exit }
Invalid memory access (signal 10) at address 0x10d3a8e00
[0x10d029b4b] *CallStack::print_backtrace:Int32 +107
[0x10d0100d5] __crystal_sigfault_handler +181
[0x7fff6c5b3b3d] _sigtramp +29
UPDATE
Here's the complete 'app' using Signal::INT.trap. For me, running it correctly turns the OS X proxy setting on and off, but the loop continues to run after the interrupt signal.
fork do
  system "networksetup -setwebproxy Wi-Fi 127.0.0.1 4242"
end

Signal::INT.trap do
  puts "trap"
  fork do
    system "networksetup -setwebproxystate Wi-Fi off"
  end
  exit
end

loop do
  sleep 1
  puts "foo"
end
You can use Fibers:
spawn do
  system "networksetup -setwebproxy Wi-Fi 127.0.0.1 4242"
end
sleep 0.1

Signal::INT.trap do
  puts "trap"
  spawn do
    system "networksetup -setwebproxystate Wi-Fi off"
  end
  sleep 0.1
  exit
end

loop do
  sleep 1
  puts "foo"
end
IMHO, the trouble comes from crystal-lang's fork, which has some surprising semantics.
When you start a worker process to run the system call, Crystal duplicates the loop too...
And when exit is executed, the first loop exits, not the forked one.
To verify this, you can add some sleeps to the fork and INT.trap blocks like this:
fork do
  system "echo \"start\""
end

Signal::INT.trap do
  puts "trap"
  fork do
    system "echo \"off\""
    sleep 15
  end
  sleep 20
  exit
end

loop do
  sleep 1
  puts "foo"
end
Then watch the output of the ps command continuously.
An alternative approach, using a Fiber, has been answered by @Sergey Fedorov.
Further reading: Process.fork has dangerous semantics
I have written a Qt5/C++ program which forks and runs in the background, and stops in response to a signal and shuts down normally. All sounds great, but when I "ps ax | grep myprog" I see a bunch of my programs still running; eg:
29244 ? Ss 149:47 /usr/local/myprog/myprog -q
30913 ? Ss 8:37 /usr/local/myprog/myprog -q
32484 ? Ss 0:11 /usr/local/myprog/myprog -q
If I run the program in the foreground then the process does NOT hang around on the process list - it dies off as expected. This only happens when in the background. Why?
Update: I found that my program is in the futex_wait_queue_me state (queue_me and wait for wakeup, timeout, or signal). I do have 3 separate threads - and that may be related. So I attached a debugger to one of the waiting processes and found this:
(gdb) bt
#0 0x000000372460b575 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8990fb454b in QWaitCondition::wait(QMutex*, unsigned long) ()
from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#2 0x00007f8990fb3b3e in QThread::wait(unsigned long) () from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#3 0x00007f8990fb0402 in QThreadPoolPrivate::reset() () from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#4 0x00007f8990fb0561 in QThreadPool::waitForDone(int) () from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#5 0x00007f89911a4261 in QMetaObject::activate(QObject*, int, int, void**) ()
from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#6 0x00007f89911a4d5f in QObject::destroyed(QObject*) () from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#7 0x00007f89911aa3ee in QObject::~QObject() () from /opt/Qt/5.1.1/gcc_64/lib/libQt5Core.so.5
#8 0x0000000000409d8b in main (argc=1, argv=0x7fffba44c8f8) at ../../src/main.cpp:27
(gdb)
Update:
I commented out my 2 threads, so only the main thread runs now, and the problem is the same.
Is there a special way to cause a background process to exit? Why won't the main thread shut down?
Update:
Solved - Qt does not like fork. (See another StackExchange question.) I had to move my fork to the highest level (before Qt does anything), and then Qt doesn't hang on exit.
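A minimal sketch of that arrangement (assumed structure, not the poster's actual main.cpp): fork before constructing any Qt object, so the process that keeps running has never touched Qt's thread machinery.
#include <unistd.h>
#include <cstdlib>
#include <QCoreApplication>

int main(int argc, char *argv[])
{
    // Daemonize first: the parent returns immediately, the child continues.
    pid_t pid = fork();
    if (pid < 0) return EXIT_FAILURE;
    if (pid > 0) return EXIT_SUCCESS;   // parent exits

    // Only now create the Qt application object, in the surviving child.
    QCoreApplication app(argc, argv);
    // ... set up timers, sockets, signal handling, etc. ...
    return app.exec();
}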
http://man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE%20CODES
PROCESS STATE CODES
Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process:
D uninterruptible sleep (usually IO)
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped, either by a job control signal or because it is being traced.
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct ("zombie") process, terminated but not reaped by its parent.
For BSD formats and when the stat keyword is used, additional characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+ is in the foreground process group.
So your processes are all in "S: interruptible sleep." That is, they are all waiting for blocking syscalls.
You might get better hints about what your programs are waiting for with this command:
$ ps -o pid,stat,wchan `pidof zsh`
PID STAT WCHAN
4490 Ss rt_sigsuspend
4814 Ss rt_sigsuspend
4861 Ss rt_sigsuspend
4894 Ss+ n_tty_read
5744 Ss+ n_tty_read
...
"wchan (waiting channel)" shows a kernel function (=~ syscall) which is blocking.
See also
https://askubuntu.com/questions/19442/what-is-the-waiting-channel-of-a-process