/etc/init.d/celeryd start fails on AWS - Django

Hi, I've been reading a lot about this on these forums, but I still have no idea what's going wrong. Everything looks fine, yet it just doesn't work.
I set up my local configuration like this (/etc/default/celeryd):
# or we could have three nodes:
#CELERYD_NODES="w1 w2 w3"
# Absolute or relative path to the 'celery' command:
#CELERY_BIN="/usr/local/bin/celery"
CELERY_BIN="/home/ubuntu/.virtualenvs/wlenv/bin/celery"
# Where to chdir at start.
CELERYD_CHDIR="/var/www/DIR_TO_MANAGE.PY_FOLDER"
# Python interpreter from environment.
ENV_PYTHON="/home/ubuntu/.virtualenvs/wlenv/bin/python"
#ENV_PYTHON="/usr/bin/python2.7"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="sec.settings"
# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
# Extra arguments to celeryd
CELERYD_OPTS="--time-limit 300 --concurrency=8"
# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"
# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/logs/celery/log/%n.log"
CELERYD_PID_FILE="/logs/celery/run/%n.pid"
# Workers should run as an unprivileged user.
CELERYD_USER="ubuntu"
CELERYD_GROUP="ubuntu"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
When I run /etc/init.d/celeryd start I get this:
celeryd-multi v3.0.9 (Chiastic Slide)
> Starting nodes...
> celery.ip-10-51-179-42: OK
> 300.ip-10-51-179-42: OK
But the workers are not running (/etc/init.d/celeryd status):
Error: No nodes replied within time constraint.
I read that you can run it with sh -x /etc/init.d/celeryd start to find the error; most of the time it's a file-permissions problem, but I don't see anything wrong:
+ DEFAULT_PID_FILE=/logs/celery/run/celeryd#%n.pid
+ DEFAULT_LOG_FILE=/logs/celery/log/celeryd#%n.log
+ DEFAULT_LOG_LEVEL=INFO
+ DEFAULT_NODES=celery
+ DEFAULT_CELERYD=-m celery.bin.celeryd_detach
+ CELERY_DEFAULTS=/etc/default/celeryd
+ test -f /etc/default/celeryd
+ . /etc/default/celeryd
+ CELERY_BIN=/home/ubuntu/.virtualenvs/wlenv/bin/celery
+ CELERYD_CHDIR=/var/www/DIR_TO_MANAGE.PY_FOLDER
+ ENV_PYTHON=/home/ubuntu/.virtualenvs/wlenv/bin/python
+ export DJANGO_SETTINGS_MODULE=sec.settings
+ CELERYD_MULTI=/var/www/DIR_TO_MANAGE.PY_FOLDER/manage.py celeryd_multi
+ CELERYD_OPTS=--time-limit 300 --concurrency=8
+ CELERY_CONFIG_MODULE=celeryconfig
+ CELERYD_LOG_FILE=/logs/celery/log/%n.log
+ CELERYD_PID_FILE=/logs/celery/run/%n.pid
+ CELERYD_USER=ubuntu
+ CELERYD_GROUP=ubuntu
+ CELERY_CREATE_DIRS=1
+ [ -f /etc/default/celeryd ]
+ . /etc/default/celeryd
+ CELERY_BIN=/home/ubuntu/.virtualenvs/wlenv/bin/celery
+ CELERYD_CHDIR=/var/www/DIR_TO_MANAGE.PY_FOLDER
+ ENV_PYTHON=/home/ubuntu/.virtualenvs/wlenv/bin/python
+ export DJANGO_SETTINGS_MODULE=sec.settings
+ CELERYD_MULTI=/var/www/DIR_TO_MANAGE.PY_FOLDER/manage.py celeryd_multi
+ CELERYD_OPTS=--time-limit 300 --concurrency=8
+ CELERY_CONFIG_MODULE=celeryconfig
+ CELERYD_LOG_FILE=/logs/celery/log/%n.log
+ CELERYD_PID_FILE=/logs/celery/run/%n.pid
+ CELERYD_USER=ubuntu
+ CELERYD_GROUP=ubuntu
+ CELERY_CREATE_DIRS=1
+ CELERYD_PID_FILE=/logs/celery/run/%n.pid
+ CELERYD_LOG_FILE=/logs/celery/log/%n.log
+ CELERYD_LOG_LEVEL=INFO
+ CELERYD_MULTI=/var/www/DIR_TO_MANAGE.PY_FOLDER/manage.py celeryd_multi
+ CELERYD=-m celery.bin.celeryd_detach
+ CELERYCTL=celeryctl
+ CELERYD_NODES=celery
+ export CELERY_LOADER
+ [ -n ]
+ dirname /logs/celery/log/%n.log
+ CELERYD_LOG_DIR=/logs/celery/log
+ dirname /logs/celery/run/%n.pid
+ CELERYD_PID_DIR=/logs/celery/run
+ [ ! -d /logs/celery/log ]
+ [ ! -d /logs/celery/run ]
+ [ -n ubuntu ]
+ DAEMON_OPTS= --uid=ubuntu
+ chown ubuntu /logs/celery/log /logs/celery/run
+ [ -n ubuntu ]
+ DAEMON_OPTS= --uid=ubuntu --gid=ubuntu
+ chgrp ubuntu /logs/celery/log /logs/celery/run
+ [ -n /var/www/DIR_TO_MANAGE.PY_FOLDER/contracts ]
+ DAEMON_OPTS= --uid=ubuntu --gid=ubuntu --workdir="/var/www/DIR_TO_MANAGE.PY_FOLDER/contracts"
+ export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/sbin:/sbin
+ check_dev_null
+ [ ! -c /dev/null ]
+ check_paths
+ dirname /logs/celery/run/%n.pid
+ ensure_dir /logs/celery/run
+ [ -d /logs/celery/run ]
+ mkdir -p /logs/celery/run
+ chown ubuntu:ubuntu /logs/celery/run
+ chmod 02755 /logs/celery/run
+ dirname /logs/celery/log/%n.log
+ ensure_dir /logs/celery/log
+ [ -d /logs/celery/log ]
+ mkdir -p /logs/celery/log
+ chown ubuntu:ubuntu /logs/celery/log
+ chmod 02755 /logs/celery/log
+ start_workers
+ /var/www/DIR_TO_MANAGE.PY_FOLDER/manage.py celeryd_multi start celery --uid=ubuntu --gid=ubuntu --workdir="/var/www/DIR_TO_MANAGE.PY_FOLDER" --pidfile=/logs/celery/run/%n.pid --logfile=/logs/celery/log/%n.log --loglevel=INFO --cmd=-m celery.bin.celeryd_detach --time-limit 300 --concurrency=8
celeryd-multi v3.0.9 (Chiastic Slide)
> Starting nodes...
> celery.ip-10-51-179-42: OK
> 300.ip-10-51-179-42: OK
+ exit 0
Any ideas?

Which version of celery are you using?
When you debugged, you used "C_FAKEFORK=1 sh -x /etc/init.d/celeryd start" (with C_FAKEFORK=1), right?
If you are using version 3.x+ you don't need to use "manage.py celery" (django-celery); instead you have to use the "celery" command, which comes with celery itself.
Take a look at this part of the documentation.
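For example, a minimal /etc/default/celeryd sketch that drives the workers through the celery binary from your virtualenv rather than manage.py (the paths below are the ones from your config; the rest is an assumption to illustrate the idea). Note also that your start output lists a node called "300", which suggests --time-limit 300 without the = is being read as an extra node name:
# Use the celery/celeryd-multi programs from the virtualenv, not manage.py
CELERY_BIN="/home/ubuntu/.virtualenvs/wlenv/bin/celery"
CELERYD_MULTI="/home/ubuntu/.virtualenvs/wlenv/bin/celeryd-multi"
# Keep the Django settings module for the loader
export DJANGO_SETTINGS_MODULE="sec.settings"
# Use '=' so that 300 is not parsed as a node name
CELERYD_OPTS="--time-limit=300 --concurrency=8"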
Thanks!

Related

Install Crystal on Linux Mint. Error: Could not locate compatible llvm-config

I'm following the steps to install Crystal from source:
https://crystal-lang.org/install/from_sources/
I need to run the make command, which raises an error:
Makefile:65: *** Could not locate compatible llvm-config, make sure it is installed and in your PATH, or set LLVM_CONFIG. Compatible versions: 12.0 11.1 11.0 10.0 9.0 8.0 7.1 6.0 5.0 4.0 3.9 3.8. Stop.
It's a known issue which can be solved this way, using the script from apt.llvm.org:
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 11
Source: https://github.com/crystal-lang/crystal/issues/10557#issuecomment-810170295
But for Linux Mint 20:
lsb_release -a
No LSB modules are available.
Distributor ID: Linuxmint
Description: Linux Mint 20
Release: 20
Codename: ulyana
it raises this error:
sudo ./llvm.sh 11
+ needed_binaries=(lsb_release wget add-apt-repository)
+ missing_binaries=()
+ for binary in "${needed_binaries[@]}"
+ which lsb_release
+ for binary in "${needed_binaries[@]}"
+ which wget
+ for binary in "${needed_binaries[@]}"
+ which add-apt-repository
+ [[ 0 -gt 0 ]]
+ LLVM_VERSION=13
+ '[' 1 -eq 1 ']'
+ LLVM_VERSION=11
++ lsb_release -is
+ DISTRO=Linuxmint
++ lsb_release -sr
+ VERSION=20
+ DIST_VERSION=Linuxmint_20
+ [[ 0 -ne 0 ]]
+ declare -A LLVM_VERSION_PATTERNS
+ LLVM_VERSION_PATTERNS[9]=-9
+ LLVM_VERSION_PATTERNS[10]=-10
+ LLVM_VERSION_PATTERNS[11]=-11
+ LLVM_VERSION_PATTERNS[12]=-12
+ LLVM_VERSION_PATTERNS[13]=-13
+ LLVM_VERSION_PATTERNS[14]=
+ '[' '!' _ ']'
+ LLVM_VERSION_STRING=-11
+ case "$DIST_VERSION" in
+ echo 'Distribution '\''Linuxmint'\'' in version '\''20'\'' is not supported by this script (Linuxmint_20).'
Distribution 'Linuxmint' in version '20' is not supported by this script (Linuxmint_20).
+ exit 2
I'd appreciate any advice on how to solve this issue.
According to the documentation, it's better to use version 8.0 as the latest supported one.
But even when using the proper version (8.0), the script from https://apt.llvm.org/llvm.sh still raises an error:
sudo ./llvm.sh 8
+ needed_binaries=(lsb_release wget add-apt-repository)
+ missing_binaries=()
+ for binary in "${needed_binaries[@]}"
+ which lsb_release
+ for binary in "${needed_binaries[@]}"
+ which wget
+ for binary in "${needed_binaries[@]}"
+ which add-apt-repository
+ [[ 0 -gt 0 ]]
+ LLVM_VERSION=13
+ '[' 1 -eq 1 ']'
+ LLVM_VERSION=8
++ lsb_release -is
+ DISTRO=Linuxmint
++ lsb_release -sr
+ VERSION=20
+ DIST_VERSION=Linuxmint_20
+ [[ 0 -ne 0 ]]
+ declare -A LLVM_VERSION_PATTERNS
+ LLVM_VERSION_PATTERNS[9]=-9
+ LLVM_VERSION_PATTERNS[10]=-10
+ LLVM_VERSION_PATTERNS[11]=-11
+ LLVM_VERSION_PATTERNS[12]=-12
+ LLVM_VERSION_PATTERNS[13]=-13
+ LLVM_VERSION_PATTERNS[14]=
+ '[' '!' ']'
+ echo 'This script does not support LLVM version 8'
This script does not support LLVM version 8
+ exit 3
The solution is to use the native sudo apt-get install lldb-8 command to install the required lib.
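Alternatively, as the Makefile message itself suggests, you can point the build at a specific llvm-config via the LLVM_CONFIG variable (a sketch; the path assumes the LLVM 8 packages put llvm-config-8 in /usr/bin, as in the make output below):
sudo apt-get install lldb-8
make LLVM_CONFIG=/usr/bin/llvm-config-8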
Now the make command passes:
make
Using /usr/bin/llvm-config-8 [version=8.0.1]
g++ -c -o src/llvm/ext/llvm_ext.o src/llvm/ext/llvm_ext.cc -I/usr/lib/llvm-8/include -std=c++11 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
In file included from /usr/lib/llvm-8/include/llvm/IR/DIBuilder.h:18,
from src/llvm/ext/llvm_ext.cc:1:
/usr/lib/llvm-8/include/llvm/ADT/ArrayRef.h: In instantiation of ‘llvm::ArrayRef<T>::ArrayRef(const std::initializer_list<_Tp>&) [with T = long unsigned int]’:
/usr/lib/llvm-8/include/llvm/IR/DIBuilder.h:645:74: required from here
/usr/lib/llvm-8/include/llvm/ADT/ArrayRef.h:102:37: warning: initializing ‘llvm::ArrayRef<long unsigned int>::Data’ from ‘std::initializer_list<long unsigned int>::begin’ does not extend the lifetime of the underlying array [-Winit-list-lifetime]
102 | : Data(Vec.begin() == Vec.end() ? (T*)nullptr : Vec.begin()),
CRYSTAL_CONFIG_BUILD_COMMIT="cc0b8d1f0" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1632497766" CRYSTAL_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib

Running supervisord in AWS Environment

I'm working on adding Django Channels to my Elastic Beanstalk environment, but I'm running into trouble configuring supervisord. Specifically, in /.ebextensions I have a file channels.config with this code:
container_commands:
  01_copy_supervisord_conf:
    command: "cp .ebextensions/supervisord/supervisord.conf /opt/python/etc/supervisord.conf"
  02_reload_supervisord:
    command: "supervisorctl -c /opt/python/etc/supervisord.conf reload"
This fails on the 2nd command with the following error message from the Elastic Beanstalk CLI:
Command failed on instance. Return code: 1 Output: error: <class
'FileNotFoundError'>, [Errno 2] No such file or directory:
file: /opt/python/run/venv/local/lib/python3.4/site-
packages/supervisor/xmlrpc.py line: 562.
container_command 02_reload_supervisord in
.ebextensions/channels.config failed.
My guess would be that supervisor didn't install correctly, but since command 1 copies the files without an error, that leads me to think supervisor is indeed installed and the issue is with the container command. Has anyone implemented supervisor in an AWS environment who can see where I'm going wrong?
You should be careful about Python versions and exact installation paths.
Here is how I did it; maybe it can help:
packages:
  yum:
    python27-setuptools: []
container_commands:
  01-supervise:
    command: ".ebextensions/supervise.sh"
Here is supervise.sh:
#!/bin/bash
if [ "${SUPERVISE}" == "enable" ]; then
export HOME="/root"
export PATH="/sbin:/bin:/usr/sbin:/usr/bin:/opt/aws/bin"
easy_install supervisor
cat <<'EOB' > /etc/init.d/supervisord
# Source function library
. /etc/rc.d/init.d/functions
# Source system settings
if [ -f /etc/sysconfig/supervisord ]; then
. /etc/sysconfig/supervisord
fi
# Path to the supervisorctl script, server binary,
# and short-form for messages.
supervisorctl=${SUPERVISORCTL-/usr/bin/supervisorctl}
supervisord=${SUPERVISORD-/usr/bin/supervisord}
prog=supervisord
pidfile=${PIDFILE-/var/run/supervisord.pid}
lockfile=${LOCKFILE-/var/lock/subsys/supervisord}
STOP_TIMEOUT=${STOP_TIMEOUT-60}
OPTIONS="${OPTIONS--c /etc/supervisord.conf}"
RETVAL=0
start() {
echo -n $"Starting $prog: "
daemon --pidfile=${pidfile} $supervisord $OPTIONS
RETVAL=$?
echo
if [ $RETVAL -eq 0 ]; then
touch ${lockfile}
$supervisorctl $OPTIONS status
fi
return $RETVAL
}
stop() {
echo -n $"Stopping $prog: "
killproc -p ${pidfile} -d ${STOP_TIMEOUT} $supervisord
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && rm -rf ${lockfile} ${pidfile}
}
reload() {
echo -n $"Reloading $prog: "
LSB=1 killproc -p $pidfile $supervisord -HUP
RETVAL=$?
echo
if [ $RETVAL -eq 7 ]; then
failure $"$prog reload"
else
$supervisorctl $OPTIONS status
fi
}
restart() {
stop
start
}
case "$1" in
start)
start
;;
stop)
stop
;;
status)
status -p ${pidfile} $supervisord
RETVAL=$?
[ $RETVAL -eq 0 ] && $supervisorctl $OPTIONS status
;;
restart)
restart
;;
condrestart|try-restart)
if status -p ${pidfile} $supervisord >&/dev/null; then
stop
start
fi
;;
force-reload|reload)
reload
;;
*)
echo $"Usage: $prog {start|stop|restart|condrestart|try-restart|force-reload|reload}"
RETVAL=2
esac
exit $RETVAL
EOB
chmod +x /etc/init.d/supervisord
cat <<'EOB' > /etc/sysconfig/supervisord
# Configuration file for the supervisord service
#
# Author: Jason Koppe <jkoppe@indeed.com>
# original work
# Erwan Queffelec <erwan.queffelec@gmail.com>
# adjusted to new LSB-compliant init script
# make sure elasticbeanstalk PARAMS are being passed through to supervisord
. /opt/elasticbeanstalk/support/envvars
# WARNING: change these wisely! for instance, adding -d, --nodaemon
# here will lead to a very undesirable (blocking) behavior
#OPTIONS="-c /etc/supervisord.conf"
PIDFILE=/var/run/supervisord/supervisord.pid
#LOCKFILE=/var/lock/subsys/supervisord.pid
# Path to the supervisord binary
SUPERVISORD=/usr/local/bin/supervisord
# Path to the supervisorctl binary
SUPERVISORCTL=/usr/local/bin/supervisorctl
# How long should we wait before forcefully killing the supervisord process ?
#STOP_TIMEOUT=60
# Remove this if you manage number of open files in some other fashion
#ulimit -n 96000
EOB
mkdir -p /var/run/supervisord/
chown webapp: /var/run/supervisord/
cat <<'EOB' > /etc/supervisord.conf
[unix_http_server]
file=/tmp/supervisor.sock
chmod=0777
[supervisord]
logfile=/var/app/support/logs/supervisord.log
logfile_maxbytes=0
logfile_backups=0
loglevel=warn
pidfile=/var/run/supervisord/supervisord.pid
nodaemon=false
nocleanup=true
user=webapp
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock
[program:process-ipn-api-gpsfsoft]
command = --- command that you want to run ---
directory = /var/app/current/
user = webapp
autorestart = true
startsecs = 0
numprocs = 10
process_name = --- process name that you want ---
EOB
# this is now a little tricky, not officially documented, so might break but it is the cleanest solution
# first, before the "flip" is done (i.e. the switch between ondeck and current), let's stop supervisord
echo -e '#!/usr/bin/env bash\nservice supervisord stop' > /opt/elasticbeanstalk/hooks/appdeploy/enact/00_stop_supervisord.sh
chmod +x /opt/elasticbeanstalk/hooks/appdeploy/enact/00_stop_supervisord.sh
# then right after the webserver is reloaded, we can start supervisord again
echo -e '#!/usr/bin/env bash\nservice supervisord start' > /opt/elasticbeanstalk/hooks/appdeploy/enact/99_z_start_supervisord.sh
chmod +x /opt/elasticbeanstalk/hooks/appdeploy/enact/99_z_start_supervisord.sh
fi
PS: You have to define SUPERVISE as "enable" in the Elastic Beanstalk environment properties to get this to run.
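For instance, a minimal sketch of setting that variable from .ebextensions (aws:elasticbeanstalk:application:environment is the standard namespace for environment properties; the key and value are exactly what the script above checks):
option_settings:
  aws:elasticbeanstalk:application:environment:
    SUPERVISE: enable
The same thing can be done from the CLI with eb setenv SUPERVISE=enable.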

STDERR output from Hadoop: does this mean there is some issue?

I'm using mrjob with Hadoop on Python 2.7 and Ubuntu 14.04, and I got the following screen output:
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /tmp/word-document.hduser.20160122.065849.953886
writing wrapper script to /tmp/word-document.hduser.20160122.065849.953886/setup-wrapper.sh
PLEASE NOTE: Starting in mrjob v0.5.0, protocols will be strict by default. It's recommended you run your job with --strict-protocols or set up mrjob.conf as described at https://pythonhosted.org/mrjob/whats-new.html#ready-for-strict-protocols
writing to /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00000
> sh -ex setup-wrapper.sh /usr/bin/python word-document.py --step-num=0 --mapper /tmp/word-document.hduser.20160122.065849.953886/input_part-00000 > /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00000
writing to /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00001
> sh -ex setup-wrapper.sh /usr/bin/python word-document.py --step-num=0 --mapper /tmp/word-document.hduser.20160122.065849.953886/input_part-00001 > /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00001
STDERR: + __mrjob_PWD=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/0
STDERR: + exec
STDERR: + /usr/bin/python -c import fcntl; fcntl.flock(9, fcntl.LOCK_EX)
STDERR: + export PYTHONPATH=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/0/mrjob.tar.gz:/home/ignacio/shogun-install/lib/python2.7/dist-packages:/home/ignacio/shogun/examples/undocumented/python_modular:
STDERR: + exec
STDERR: + cd /tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/0
STDERR: + /usr/bin/python word-document.py --step-num=0 --mapper /tmp/word-document.hduser.20160122.065849.953886/input_part-00000
STDERR: + __mrjob_PWD=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/1
STDERR: + exec
STDERR: + /usr/bin/python -c import fcntl; fcntl.flock(9, fcntl.LOCK_EX)
STDERR: + export PYTHONPATH=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/1/mrjob.tar.gz:/home/ignacio/shogun-install/lib/python2.7/dist-packages:/home/ignacio/shogun/examples/undocumented/python_modular:
STDERR: + exec
STDERR: + cd /tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/mapper/1
STDERR: + /usr/bin/python word-document.py --step-num=0 --mapper /tmp/word-document.hduser.20160122.065849.953886/input_part-00001
Counters from step 1:
(no counters found)
writing to /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper-sorted
> sort /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00000 /tmp/word-document.hduser.20160122.065849.953886/step-0-mapper_part-00001
writing to /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00000
> sh -ex setup-wrapper.sh /usr/bin/python word-document.py --step-num=0 --reducer /tmp/word-document.hduser.20160122.065849.953886/input_part-00000 > /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00000
writing to /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00001
> sh -ex setup-wrapper.sh /usr/bin/python word-document.py --step-num=0 --reducer /tmp/word-document.hduser.20160122.065849.953886/input_part-00001 > /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00001
STDERR: + __mrjob_PWD=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/0
STDERR: + exec
STDERR: + /usr/bin/python -c import fcntl; fcntl.flock(9, fcntl.LOCK_EX)
STDERR: + export PYTHONPATH=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/0/mrjob.tar.gz:/home/ignacio/shogun-install/lib/python2.7/dist-packages:/home/ignacio/shogun/examples/undocumented/python_modular:
STDERR: + exec
STDERR: + cd /tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/0
STDERR: + /usr/bin/python word-document.py --step-num=0 --reducer /tmp/word-document.hduser.20160122.065849.953886/input_part-00000
STDERR: + __mrjob_PWD=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/1
STDERR: + exec
STDERR: + /usr/bin/python -c import fcntl; fcntl.flock(9, fcntl.LOCK_EX)
STDERR: + export PYTHONPATH=/tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/1/mrjob.tar.gz:/home/ignacio/shogun-install/lib/python2.7/dist-packages:/home/ignacio/shogun/examples/undocumented/python_modular:
STDERR: + exec
STDERR: + cd /tmp/word-document.hduser.20160122.065849.953886/job_local_dir/0/reducer/1
STDERR: + /usr/bin/python word-document.py --step-num=0 --reducer /tmp/word-document.hduser.20160122.065849.953886/input_part-00001
Counters from step 1:
(no counters found)
Moving /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00000 -> /tmp/word-document.hduser.20160122.065849.953886/output/part-00000
Moving /tmp/word-document.hduser.20160122.065849.953886/step-0-reducer_part-00001 -> /tmp/word-document.hduser.20160122.065849.953886/output/part-00001
Streaming final output from /tmp/word-document.hduser.20160122.065849.953886/output
removing tmp directory /tmp/word-document.hduser.20160122.065849.953886
Can you tell whether there is some problem? I mean, the jobs finished, but the STDERR: lines look noisy to me.
Thank you in advance.
Looks like your job isn't generating any output. Could you please post word-document.py and your input data?
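For reference, a minimal word-count style mrjob job looks like the sketch below (an assumption about what word-document.py is meant to do, not your actual code); if the mapper or reducer never yields anything, you get exactly the empty counters and empty output shown above:
from mrjob.job import MRJob

class MRWordCount(MRJob):
    def mapper(self, _, line):
        # emit one (word, 1) pair per token
        for word in line.split():
            yield word, 1

    def reducer(self, word, counts):
        # sum the per-word counts from all mappers
        yield word, sum(counts)

if __name__ == '__main__':
    MRWordCount.run()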

celeryd and celerybeat pid files are not being created, workers are not starting, but output says OK

I set up celeryd and celerybeat as daemons, and they worked until recently. But for some time now, they won't start the workers and won't create the pid files.
Here is my /etc/default/celeryd:
# Name of nodes to start
CELERYD_NODES="w1 w2 w3 w4 w5 w6 w7 w8"
# Extra arguments to celeryd
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# Where to chdir at start.
CELERYD_CHDIR="/srv/www/web-system/myproject"
# %n will be replaced with the nodename.
#CELERYD_LOG_FILE="/var/log/celery/%n.log"
#CELERYD_PID_FILE="/var/run/celery/%n.pid"
CELERYD_LOG_FILE="/srv/www/web-system/logs/celery/%n.log"
CELERYD_PID_FILE="/srv/www/web-system/pids/celery/%n.pid"
# Log level to use for celeryd. Default is INFO.
CELERYD_LOG_LEVEL="INFO"
# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
# How to call "manage.py celeryctl"
CELERYCTL="$CELERYD_CHDIR/manage.py celeryctl"
# Workers should run as an unprivileged user.
#CELERYD_USER="celery"
#CELERYD_GROUP="celery"
CELERYD_USER="myuser"
CELERYD_GROUP="myuser"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="myproject.settings"
and here's the init.d script:
#!/bin/sh -e
# ============================================
# celeryd - Starts the Celery worker daemon.
# ============================================
#
# :Usage: /etc/init.d/celeryd {start|stop|force-reload|restart|try-restart|status}
# :Configuration file: /etc/default/celeryd
#
# See http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html#generic-init-scripts
### BEGIN INIT INFO
# Provides: celeryd
# Required-Start: $network $local_fs $remote_fs
# Required-Stop: $network $local_fs $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: celery task worker daemon
### END INIT INFO
# some commands work asynchronously, so we'll wait this many seconds
SLEEP_SECONDS=5
DEFAULT_PID_FILE="/var/run/celery/%n.pid"
DEFAULT_LOG_FILE="/var/log/celery/%n.log"
DEFAULT_LOG_LEVEL="INFO"
DEFAULT_NODES="celery"
DEFAULT_CELERYD="-m celery.bin.celeryd_detach"
CELERY_DEFAULTS=${CELERY_DEFAULTS:-"/etc/default/celeryd"}
test -f "$CELERY_DEFAULTS" && . "$CELERY_DEFAULTS"
# Set CELERY_CREATE_DIRS to always create log/pid dirs.
CELERY_CREATE_DIRS=${CELERY_CREATE_DIRS:-0}
CELERY_CREATE_RUNDIR=$CELERY_CREATE_DIRS
CELERY_CREATE_LOGDIR=$CELERY_CREATE_DIRS
if [ -z "$CELERYD_PID_FILE" ]; then
CELERYD_PID_FILE="$DEFAULT_PID_FILE"
CELERY_CREATE_RUNDIR=1
fi
if [ -z "$CELERYD_LOG_FILE" ]; then
CELERYD_LOG_FILE="$DEFAULT_LOG_FILE"
CELERY_CREATE_LOGDIR=1
fi
CELERYD_LOG_LEVEL=${CELERYD_LOG_LEVEL:-${CELERYD_LOGLEVEL:-$DEFAULT_LOG_LEVEL}}
CELERYD_MULTI=${CELERYD_MULTI:-"celeryd-multi"}
CELERYD=${CELERYD:-$DEFAULT_CELERYD}
CELERYD_NODES=${CELERYD_NODES:-$DEFAULT_NODES}
export CELERY_LOADER
if [ -n "$2" ]; then
CELERYD_OPTS="$CELERYD_OPTS $2"
fi
CELERYD_LOG_DIR=`dirname $CELERYD_LOG_FILE`
CELERYD_PID_DIR=`dirname $CELERYD_PID_FILE`
# Extra start-stop-daemon options, like user/group.
if [ -n "$CELERYD_USER" ]; then
DAEMON_OPTS="$DAEMON_OPTS --uid=$CELERYD_USER"
fi
if [ -n "$CELERYD_GROUP" ]; then
DAEMON_OPTS="$DAEMON_OPTS --gid=$CELERYD_GROUP"
fi
if [ -n "$CELERYD_CHDIR" ]; then
DAEMON_OPTS="$DAEMON_OPTS --workdir=$CELERYD_CHDIR"
fi
check_dev_null() {
if [ ! -c /dev/null ]; then
echo "/dev/null is not a character device!"
exit 75 # EX_TEMPFAIL
fi
}
maybe_die() {
if [ $? -ne 0 ]; then
echo "Exiting: $* (errno $?)"
exit 77 # EX_NOPERM
fi
}
create_default_dir() {
if [ ! -d "$1" ]; then
echo "- Creating default directory: '$1'"
mkdir -p "$1"
maybe_die "Couldn't create directory $1"
echo "- Changing permissions of '$1' to 02755"
chmod 02755 "$1"
maybe_die "Couldn't change permissions for $1"
if [ -n "$CELERYD_USER" ]; then
echo "- Changing owner of '$1' to '$CELERYD_USER'"
chown "$CELERYD_USER" "$1"
maybe_die "Couldn't change owner of $1"
fi
if [ -n "$CELERYD_GROUP" ]; then
echo "- Changing group of '$1' to '$CELERYD_GROUP'"
chgrp "$CELERYD_GROUP" "$1"
maybe_die "Couldn't change group of $1"
fi
fi
}
check_paths() {
if [ $CELERY_CREATE_LOGDIR -eq 1 ]; then
create_default_dir "$CELERYD_LOG_DIR"
fi
if [ $CELERY_CREATE_RUNDIR -eq 1 ]; then
create_default_dir "$CELERYD_PID_DIR"
fi
}
create_paths() {
create_default_dir "$CELERYD_LOG_DIR"
create_default_dir "$CELERYD_PID_DIR"
}
export PATH="${PATH:+$PATH:}/usr/sbin:/sbin"
_get_pid_files() {
[ ! -d "$CELERYD_PID_DIR" ] && return
echo `ls -1 "$CELERYD_PID_DIR"/*.pid 2> /dev/null`
}
stop_workers () {
$CELERYD_MULTI stopwait $CELERYD_NODES --pidfile="$CELERYD_PID_FILE"
sleep $SLEEP_SECONDS
}
start_workers () {
$CELERYD_MULTI start $CELERYD_NODES $DAEMON_OPTS \
--pidfile="$CELERYD_PID_FILE" \
--logfile="$CELERYD_LOG_FILE" \
--loglevel="$CELERYD_LOG_LEVEL" \
--cmd="$CELERYD" \
$CELERYD_OPTS
sleep $SLEEP_SECONDS
}
restart_workers () {
$CELERYD_MULTI restart $CELERYD_NODES $DAEMON_OPTS \
--pidfile="$CELERYD_PID_FILE" \
--logfile="$CELERYD_LOG_FILE" \
--loglevel="$CELERYD_LOG_LEVEL" \
--cmd="$CELERYD" \
$CELERYD_OPTS
sleep $SLEEP_SECONDS
}
check_status () {
local pid_files=
pid_files=`_get_pid_files`
[ -z "$pid_files" ] && echo "celeryd not running (no pidfile)" && exit 1
local one_failed=
for pid_file in $pid_files; do
local node=`basename "$pid_file" .pid`
local pid=`cat "$pid_file"`
local cleaned_pid=`echo "$pid" | sed -e 's/[^0-9]//g'`
if [ -z "$pid" ] || [ "$cleaned_pid" != "$pid" ]; then
echo "bad pid file ($pid_file)"
else
local failed=
kill -0 $pid 2> /dev/null || failed=true
if [ "$failed" ]; then
echo "celeryd (node $node) (pid $pid) is stopped, but pid file exists!"
one_failed=true
else
echo "celeryd (node $node) (pid $pid) is running..."
fi
fi
done
[ "$one_failed" ] && exit 1 || exit 0
}
case "$1" in
start)
check_dev_null
check_paths
start_workers
;;
stop)
check_dev_null
check_paths
stop_workers
;;
reload|force-reload)
echo "Use restart"
;;
status)
check_status
;;
restart)
check_dev_null
check_paths
restart_workers
;;
try-restart)
check_dev_null
check_paths
restart_workers
;;
create-paths)
check_dev_null
create_paths
;;
check-paths)
check_dev_null
check_paths
;;
*)
echo "Usage: /etc/init.d/celeryd {start|stop|restart|kill|create-paths}"
exit 64 # EX_USAGE
;;
esac
exit 0
Also, I executed the init script with the following command: sh -x /etc/init.d/celeryd start, as suggested in the documentation, and this is the output:
# sh -x /etc/init.d/celeryd start
+ SLEEP_SECONDS=5
+ DEFAULT_PID_FILE=/var/run/celery/%n.pid
+ DEFAULT_LOG_FILE=/var/log/celery/%n.log
+ DEFAULT_LOG_LEVEL=INFO
+ DEFAULT_NODES=celery
+ DEFAULT_CELERYD=-m celery.bin.celeryd_detach
+ CELERY_DEFAULTS=/etc/default/celeryd
+ test -f /etc/default/celeryd
+ . /etc/default/celeryd
+ CELERYD_NODES=w1 w2 w3 w4 w5 w6 w7 w8
+ CELERYD_OPTS=--time-limit=300 --concurrency=8
+ CELERYD_CHDIR=/srv/www/web-system/myproject
+ CELERYD_LOG_FILE=/srv/www/web-system/logs/celery/%n.log
+ CELERYD_PID_FILE=/srv/www/web-system/pids/celery/%n.pid
+ CELERYD_LOG_LEVEL=INFO
+ CELERYD_MULTI=/srv/www/web-system/myproject/manage.py celeryd_multi
+ CELERYCTL=/srv/www/web-system/myproject/manage.py celeryctl
+ CELERYD_USER=myproject
+ CELERYD_GROUP=myproject
+ export DJANGO_SETTINGS_MODULE=myproject.settings
+ CELERY_CREATE_DIRS=0
+ CELERY_CREATE_RUNDIR=0
+ CELERY_CREATE_LOGDIR=0
+ [ -z /srv/www/sistema-web/pids/celery/%n.pid ]
+ [ -z /srv/www/sistema-web/logs/celery/%n.log ]
+ CELERYD_LOG_LEVEL=INFO
+ CELERYD_MULTI=/srv/www/web-system/myproject/manage.py celeryd_multi
+ CELERYD=-m celery.bin.celeryd_detach
+ CELERYD_NODES=w1 w2 w3 w4 w5 w6 w7 w8
+ export CELERY_LOADER
+ [ -n ]
+ dirname /srv/www/web-system/logs/celery/%n.log
+ CELERYD_LOG_DIR=/srv/www/web-system/logs/celery
+ dirname /srv/www/web-system/pids/celery/%n.pid
+ CELERYD_PID_DIR=/srv/www/web-system/pids/celery
+ [ -n yougrups ]
+ DAEMON_OPTS= --uid=myprojects
+ [ -n yougrups ]
+ DAEMON_OPTS= --uid=myprojects --gid=myprojects
+ [ -n /srv/www/web-system/myprojects ]
+ DAEMON_OPTS= --uid=myproject --gid=myproject --workdir=/srv/www/web-system/myproject
+ export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/sbin:/sbin
+ check_dev_null
+ [ ! -c /dev/null ]
+ check_paths
+ [ 0 -eq 1 ]
+ [ 0 -eq 1 ]
+ start_workers
+ /srv/www/web-system/myproject/manage.py celeryd_multi start w1 w2 w3 w4 w5 w6 w7 w8 --uid=myproject --gid=myproject --workdir=/srv/www/web-system/myproject --pidfile=/srv/www/web-system/pids/celery/%n.pid --logfile=/srv/www/web-system/logs/celery/%n.log --loglevel=INFO --cmd=-m celery.bin.celeryd_detach --time-limit=300 --concurrency=8
celeryd-multi v3.0.21 (Chiastic Slide)
> Starting nodes...
> w1.myproject: OK
> w2.myproject: OK
> w3.myproject: OK
> w4.myproject: OK
> w5.myproject: OK
> w6.myproject: OK
> w7.myproject: OK
> w8.myproject: OK
+ sleep 5
+ exit 0
Then, when I check the pids dir, it is empty, and ps aux shows no active processes for it. There is nothing in the logs either. I'm not using virtualenv. It just stopped working. The version of django-celery is 3.0.21. Here's my wsgi script:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import sys
path = '/srv/www/web-system/'
if path not in sys.path:
    sys.path.append(path)
    sys.path.append(path + 'myproject/')
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
import djcelery
djcelery.setup_loader()
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
And these are my djcelery-related settings:
# For Celery & RabbitMQ:
# Create user and vhost for RabbitMQ with the following commands:
#
# $ sudo rabbitmqctl add_user myproject <mypassword>
# $ sudo rabbitmqctl add_vhost myproject
# $ sudo rabbitmqctl set_permissions -p myproject myproject ".*" ".*" ".*"
# Format: amqp://user:password@host:port/vhost
BROKER_URL = 'amqp://myproject:mypassword@localhost:5672/myproject'
import djcelery
djcelery.setup_loader()
Please, any suggestion would be really appreciated! Thanks in advance.
There's probably an error in your code. Try running a worker manually using
celery worker -A appname
If it throws an error, then you know that's what's wrong with it.
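You can also surface the traceback from the daemonized start itself, as mentioned in the first answer above: C_FAKEFORK keeps the worker in the foreground so errors are printed instead of being swallowed by the detach step.
C_FAKEFORK=1 sh -x /etc/init.d/celeryd start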
It most likely has to do with memory on your system.
Info logs:
[2017-08-02 10:00:32,004: CRITICAL/MainProcess] Unrecoverable error: OSError(12, 'Cannot allocate memory')
Traceback (most recent call last)
I was just debugging mine, thanks to @Adriaan.

Issue running system commands continuously from a Python script?

I have a Python script that continuously checks that snmpd and a socket script are running. If either of them gets killed, it should kill both and start a new session. The problem is that once the socket script is running it waits a long time for a connection, and in the meantime, if anyone kills snmpd, it doesn't get restarted (I think the script never loops back). What may be the reason, and what is a possible solution? Is any optimisation possible for the code?
import os

def terminator():
    # list the PIDs of snmpd, iperf and the socket script
    os.system("ps -eaf|grep snmpd|cut -d \" \" -f7 >snmpd_pid.txt")
    os.system("ps -eaf|grep iperf|cut -d \" \" -f7 >iperf_pid.txt")
    os.system("ps -eaf|grep sock_bg.py|cut -d \" \" -f7 >script_pid.txt")
    snmpd_pids = tuple(line.strip() for line in open('snmpd_pid.txt'))
    iperf_pids = tuple(line.strip() for line in open('iperf_pid.txt'))
    script_pids = tuple(line.strip() for line in open('script_pid.txt'))
    # the grep and cut helper processes account for two extra matches each
    k1 = len(snmpd_pids) - 2
    k2 = len(iperf_pids) - 2
    k3 = len(script_pids) - 2
    if k1 == 0 or k3 == 0:
        for i in range(k1):
            cmd = 'kill -9 %s' % (snmpd_pids[i])
            os.system(cmd)
        for i in range(k2):
            cmd = 'kill -9 %s' % (iperf_pids[i])
            os.system(cmd)
        for i in range(k3):
            cmd = 'kill -9 %s' % (script_pids[i])
            os.system(cmd)
        os.system("/usr/local/sbin/snmpd -f -L -d -p 9999")
        os.system("python /home/maxuser/utils/python-bg/sock_bg.py")

try:
    terminator()
except:
    print 'an exception occurred'
I found the answer: it's a problem of getting the prompt back.
I used the screen -d -m option and am now able to get the intended result.
os.system("screen -d -m /usr/local/sbin/snmpd -f -L -d -p 9999 &")
os.system("screen -d -m python /home/maxuser/utils/python-bg/sock_bg.py &")
Also, those system commands need to be inside the if condition.
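An alternative sketch without screen, using subprocess.Popen so the script regains control immediately instead of blocking in os.system (this is an assumption about the desired behaviour, not part of the original fix):
import subprocess
# Popen returns as soon as the child is spawned, so the caller is not blocked
subprocess.Popen(["/usr/local/sbin/snmpd", "-f", "-L", "-d", "-p", "9999"])
subprocess.Popen(["python", "/home/maxuser/utils/python-bg/sock_bg.py"])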