Why are my environment variables not detected when starting up celery? - django

I am running Django on CentOS, served by Apache and mod_wsgi. I followed the instructions to set up Celery to run as a daemon.
I put this init script https://github.com/celery/celery/blob/3.1/extra/generic-init.d/celeryd in /etc/init.d/celeryd
and set up the configuration in
/etc/default/celeryd
I am using environment variables in my django settings.py file so I can use different configurations in my development and production environments. I know these environment variables are set correctly because the app has been working this whole time. I think that celery is just not getting the variable passed to it or something.
I checked by running the env command; the variables show up fine.
To start up I just do:
service celeryd start
It tries to start up but throws an error saying that I do not have my environment variables set.
I wrote a function to grab environment variables; that is what throws the error.
import os
from django.core.exceptions import ImproperlyConfigured

def get_env_variable(var_name):
    try:
        return os.environ[var_name]
    except KeyError:
        error_msg = "Set the %s environment variable" % var_name
        raise ImproperlyConfigured(error_msg)
The only way that error is thrown is if the environment variable is not set correctly.
Does anyone know why Celery is not detecting the environment variables that I have set?

I just discovered that I not only had to set my environment variables in the system, but I also had to pass those variables into the /etc/default/celeryd script.
I just put my variables at the bottom of /etc/default/celeryd:
export MY_SPECIAL_VARIABLE="my production variable"
export MY_OTHERSPECIAL_VARIABLE="my other production variable"
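If you want to double-check that the daemonized worker really received the variable, one quick way is to inspect the process environment directly (a rough sketch; the pid file path depends on your CELERYD_PID_FILE setting and is assumed here to be /var/run/celery/worker1.pid):
# print the worker's environment, one variable per line, and look for yours
sudo cat /proc/$(cat /var/run/celery/worker1.pid)/environ | tr '\0' '\n' | grep MY_SPECIAL_VARIABLE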

If your environment variables are set in ~/.bashrc, you can add source ~/.bashrc near the top of /etc/init.d/celeryd.
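For example, something like this near the top of the init script (a sketch only; adjust the path, since ~ for a script run by root would resolve to /root rather than your user's home):
# near the top of /etc/init.d/celeryd -- the path to the user's .bashrc is hypothetical
if [ -f /home/myuser/.bashrc ]; then
    . /home/myuser/.bashrc
fi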

Does your /etc/default/celeryd define what user celery should run as?
In mine I have:
CELERYD_USER="celery"
CELERYD_GROUP="celery"
Can you post your /etc/default/celeryd config file?

I had the same problem using Celery and Supervisor: I had supervisord use a shell script that sources the env variables and then starts the Celery worker.
#!/bin/bash
source ~/.profile
CELERY_LOGFILE=/usr/local/src/imbue/application/imbue/log/celeryd.log
CELERYD_OPTS=" --loglevel=INFO --autoscale=10,5 --concurrency=8"
cd /usr/local/src/imbue/application/imbue/conf
exec celery worker -n celeryd@%h -f $CELERY_LOGFILE $CELERYD_OPTS
in ~/.profile:
export C_FORCE_ROOT="true"
export KEY="DEADBEEF"
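Before handing the wrapper to supervisord, it can help to run it once by hand, as the same user supervisord will use, to confirm the variables from ~/.profile really get picked up (the script path below is hypothetical):
chmod +x /usr/local/src/imbue/application/imbue/conf/start_celery.sh
/usr/local/src/imbue/application/imbue/conf/start_celery.sh   # should start a worker in the foreground with the expected environment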

Related

Celery and Celerybeat are running, but don't run tasks

I've already checked my code on a local server and I'm sure everything is OK in my code, so it seems something is wrong with the server configuration. I have a Linux server (Ubuntu 16.04) with nginx, redis, ... installed. I also created configuration files for celery and celerybeat as below:
/etc/init.d/celeryd
/etc/default/celeryd
/etc/init.d/celerybeat
/etc/default/celerybeat
I checked their status; both of them are running, but when I check beat.log it doesn't do anything and only shows 'starting ...'
celeryd file:
# Names of nodes to start
CELERYD_NODES="worker"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/home/amirali/AwesomeApp/awesome_env/bin/celery"
# App instance to use
CELERY_APP="AwesomeApp"
# Where to chdir at start. Where your manage.py is...
CELERYD_CHDIR="/home/amirali/AwesomeApp"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 -Ofair --concurrency=8"
# Set logging level to DEBUG
CELERYD_LOG_LEVEL="INFO"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists (e.g., nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
celerybeat file:
File: /etc/default/celerybeat
CELERYBEAT_LOG_LEVEL="info"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/home/amirali/AwesomeApp/awesome_env/bin/celery"
CELERYBEAT_USER="celery"
CELERYBEAT_GROUP="celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="AwesomeApp"
# or fully qualified:
#CELERY_APP="proj.tasks:app"
# Where to chdir at start.
CELERYBEAT_CHDIR="/home/amirali/AwesomeApp"
# Extra arguments to celerybeat
CELERYBEAT_OPTS="--schedule=/var/run/celery/celerybeat-schedule"
export DJANGO_SETTINGS_MODULE="AwesomeApp.settings"
When we had to implement Celery periodic tasks, it turned out that celery beat did not work properly; at some point it simply stopped launching tasks.
After some tests we decided not to waste any more time on it and to rely on the Linux crontab utility instead.
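For anyone taking the same route, a minimal sketch of the crontab alternative (the management command name is hypothetical; the interpreter and project paths mirror the question):
# crontab -e for the project user: run a Django management command every 5 minutes instead of a beat schedule
*/5 * * * * /home/amirali/AwesomeApp/awesome_env/bin/python /home/amirali/AwesomeApp/manage.py process_scheduled_tasks >> /home/amirali/AwesomeApp/cron.log 2>&1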

Running celery as daemon does not create PID file

I have been scratching my brain on this one for the past few days. I have seen other issues on Stack Overflow (as this is a duplicate question) and I have tried everything to make this work; the workers are running fine, but celery is not starting up as a daemon process.
I run the command:
sudo service celeryd start
and I get:
celery init v10.1.
Using config script: /etc/default/celeryd
celery multi v3.1.23 (Cipater)
> Starting nodes...
> worker1@ip-172-31-21-215: OK
I run:
sudo service celeryd status
and I get:
celery init v10.1.
Using config script: /etc/default/celeryd
celeryd down: no pidfiles found
The celeryd down: no pidfiles found error is what I need to resolve.
I know this question is a duplicate, but please bear with me on this one because I have tried all of those answers and am still unable to get it resolved.
I am deploying this script on Amazon Web Services. I am using a virtual environment.
The init.d script is taken directly from here, and then I gave it the required permissions.
Here is my configuration file:
# Names of nodes to start
# most people will only start one node:
CELERYD_NODES="worker1"
# but you can also start multiple and configure settings
# for each in CELERYD_OPTS (see `celery multi --help` for examples):
#CELERYD_NODES="worker1 worker2 worker3"
# alternatively, you can specify the number of nodes to start:
#CELERYD_NODES=10
# Absolute or relative path to the 'celery' command:
# CELERY_BIN="/usr/local/bin/celery"
CELERY_BIN="/home/<user>/.virtualenvs/<virtualenv_name>/bin/celery"
# App instance to use
# comment out this line if you don't use an app
# CELERY_APP="proj"
# or fully qualified:
CELERY_APP="<project_name>.settings:app"
# Where to chdir at start.
CELERYD_CHDIR="/home/<user>/projects/<project_name>/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists, e.g. nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
I used the process described in this article to create the celery user.
My project is a Django project, and I have specified the DJANGO_SETTINGS_MODULE environment variable in the celery settings file, as specified in the documentation and also in the Stack Overflow answer.
Do I need to change anything in the init.d script, or does anything else need to be added to the celery configuration file? Is it about the celery user that I have created? Because I also tried specifying
CELERYD_USER = ""
CELERYD_GROUP = ""
while also changing the DEFAULT_USER value to "" in the init.d script.
Still the issue persisted.
One of the answers also suggested that there might be some errors in the project... but I did not find any such errors, thanks to my test cases.
PS: I have written <user>, <project_name> and <virtualenv_name> above for privacy reasons; in the actual files they have their original names.
I was having a similar issue on my Ubuntu server: [ERROR 2] FILE NOT FOUND. It turns out the /var/run/celery/ directory doesn't get created automatically, even if you set that in the celery.service configuration from the celery example docs. You can make that directory and grant the right permissions manually, but as soon as you reboot the server the directory will vanish, because it lives on a temporary filesystem.
After some reading about how the Linux system operates, I found out you just need to create a configuration file at /etc/tmpfiles.d/celery.conf with these lines:
d /var/run/celery 0755 admin admin -
d /var/log/celery 0755 admin admin -
Note: you will need to use a different user:group than 'admin', or create a user:group called admin specifically to handle your celery process.
You can read more about this configuration and the way it operates by typing
man tmpfiles.d
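To apply the new tmpfiles configuration immediately, without waiting for a reboot, you can run:
sudo systemd-tmpfiles --create /etc/tmpfiles.d/celery.conf   # creates /var/run/celery and /var/log/celery with the configured owner and mode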
I had the issue and solved it just now, thank god! For me it was a permission issue. I had expected it to be in /var/run/celery or /var/log/celery, but it turned out to be the log file I had set up Django logging for. For some reason celery wanted to write to that file (I have to look into that) but had no permission. I found the error by running the command verbosely and skipping the daemonization step:
# C_FAKEFORK=1 sh -x /etc/init.d/celeryd start
This is an old thread but if anyone of you run into this error, I hope this may help!
Good luck!
I saw the same issue and it turned out to be a permissions issue.
Make sure to set the user/group that celery is running under to own the /var/log/celery/ and /var/run/celery/ folders.
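A minimal sketch of that, assuming the celery user and group from the configs above:
sudo mkdir -p /var/log/celery /var/run/celery
sudo chown -R celery:celery /var/log/celery /var/run/celery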
See here for a step by step example:
Daemonizing celery

Django: environment variable for SECRET_KEY not working

I have SECRET_KEY = os.environ['SECRET_KEY'] in my prod.py, and SECRET_KEY=secret_string in my .bashrc
This causes a 502 error, but if I set SECRET_KEY = "secret_string" directly, it works. How can I use an environment variable to do this?
I'm starting gunicorn via sudo service gunicorn restart, and I have an upstart script.
Here is the output of cat /proc/<PID>/environ:
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin^@TERM=linux^@UPSTART_JOB=gunicorn^@UPSTART_INSTANCE=^@
You need to do:
export SECRET_KEY=secret_string
in your .bashrc. If you just do:
SECRET_KEY=secret_string
It's only available in the current shell process; when you run the Django server/shell, the subprocess has no idea about this variable. export makes the variable available to subprocesses as well.
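You can see the difference from a shell (a quick illustration, not specific to Django):
SECRET_KEY=secret_string
python -c 'import os; print(os.environ.get("SECRET_KEY"))'   # prints None: a plain assignment is not inherited
export SECRET_KEY=secret_string
python -c 'import os; print(os.environ.get("SECRET_KEY"))'   # prints secret_string: exported variables reach child processes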
.bashrc only affects interactive bash shells. Init scripts are not affected by it in any way.
You should copy the export SECRET_KEY=... line to the top of your init script.
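A sketch of what that looks like for a plain /etc/init.d-style shell script (the value shown is a placeholder); for an Upstart job the equivalent would be an env SECRET_KEY=... stanza in the job file:
# at the top of the init script that launches gunicorn
export SECRET_KEY="secret_string"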

How to use environment variables with supervisor, gunicorn and django (1.6)

I want to configure supervisor to control gunicorn in my django 1.6 project using an environment variable for SECRET_KEY.
I set my secret key in .bashrc as
export SECRET_KEY=[my_secret_key]
And I have a shell script to start gunicorn:
NAME="myproject"
LOGFILE=/home/django/myproject/log/gunicorn.log
LOGDIR=$(dirname $LOGFILE)
NUM_WORKERS=3
DJANGO_WSGI_MODULE=myproject.wsgi
USER=django
GROUP=django
IP=0.0.0.0
PORT=8001
echo "Starting $NAME"
cd /home/django/myproject/myproject
source /home/django/.virtualenvs/myproject/bin/activate
test -d $LOGDIR || mkdir -p $LOGDIR
exec gunicorn ${DJANGO_WSGI_MODULE} \
--name $NAME \
--workers $NUM_WORKERS \
--user=$USER --group=$GROUP \
--log-level=debug \
--bind=$IP:$PORT \
--log-file=$LOGFILE 2>>$LOGFILE
Then to configure my project's gunicorn server in supervisor:
[program:my_django_project]
directory=/home/django/my_django_project/my_django_project
command=/home/django/my_django_project/my_django_project/gunicorn.sh
user=django
autostart=true
autorestart=true
stdout_logfile=/home/django/my_django_project/log/supervisord.log
stderr_logfile=/home/django/my_django_project/log/supervisor_error.log
If I start gunicorn using my shell script it doesn't throw any error, but when I start it with supervisor it fails, and I see in the logs that it doesn't "find" my SECRET_KEY.
What's the correct way to configure supervisor to read my shell variables? (I want to keep them in my .bashrc unless there's a more appropriate way.)
OK, I guess I got it.
I had tried including
environment=SECRET_KEY="secret_key_with_non_alphanumeric_chars"
in the conf file for supervisor, but it didn't like the non-alphanumeric chars, and I didn't want to have my key in the conf file since I have it in git.
After looking at supervisor's docs I had also tried:
HOME="/home/django", USER="django"
but that didn't work either.
Finally I tried this, and it is working now:
environment=HOME="/home/django", USER="django", SECRET_KEY=$SECRET_KEY
Although it's working, maybe it's not the best solution. I'd be happy to learn more.
EDIT:
Finally, Ewan made me see that using bash for setting the env vars wouldn't be the best option. So one solution, as pointed out by @Ewan, would be to use:
[program:my_project]
...
environment=SECRET_KEY="secret_key_avoiding_%_chars"
Another solution I found, for those using virtualenv, is to export the env vars in the virtualenv's "activate" script; that is, edit your virtualenv/bin/activate file and add your SECRET_KEY at the end. This way you can use % chars as generated by Django key generators, and it also works if you don't use supervisor.
I restarted my server without logging in to check that it worked. With this option I don't have to edit my keys, I can keep my conf files versioned, and it works whether I use supervisor, upstart or whatever (or nothing, just gunicorn).
Anyway, I know I haven't discovered anything new (well, @Ewan raised an issue with supervisor), but I'm learning things and hope this can be useful to someone else.
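For reference, the virtualenv variant described above is just a line appended to the activate script (the path and key are placeholders; single quotes keep % and $ characters literal):
# at the end of /home/django/.virtualenvs/myproject/bin/activate
export SECRET_KEY='replace-with-your-generated-key'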
Also, if you use a gunicorn config file:
gunicorn -c gunicorn.py myproject.wsgi
It's possible to pass environment variables in the gunicorn.py file like this:
bind = "0.0.0.0:8001"
workers = 3
proc_name = "myproject"
user = "django"
group = "django"
loglevel = "debug"
errorlog = "/home/django/myproject/log/gunicorn.log"
raw_env = [
    'DATABASE_URL=postgres://user:password@host/dbname',
    'SECRET_KEY=mysecretkey',
]
Your .bashrc will only work for interactive shells, so it works when you run the shell script as your user; however supervisor, running in the background, won't get passed these values.
Instead, set the environment variable in your supervisor .ini file (more information in the documentation).
e.g.
[program:my_django_project]
environment=SECRET_KEY="my_secret_key"
After a little bit of trial and error, I noticed that the supervisor .ini file doesn't like to have % in the environment variables section (even if you do quote it...). Based on your example in the comments I have tried this with supervisor==3.0 installed via pip and it works:
environment=SECRET_KEY="*wi4h$kqxp84f3w6uh8w#l$0(+#x$3cr&)z^lmg+pqw^6wkyi"
The only difference is that I have removed the % sign. (I tried escaping it with \% but this still didn't work.)
Edit 2
Raised issue #291 with supervisor for this bug.
Edit 3
As noted in the above issue, if a % is present in your secret key it must be escaped Python-style: %%
You can escape a % character by adding another % character.
Otherwise, quoting the values is optional but recommended. To escape percent characters, simply use two. (e.g. URI="/first%%20name")
Taken from here: http://supervisord.org/configuration.html

sudo /etc/init.d/celeryd start generates an "Unknown command: 'celeryd_multi'"

I'm setting up celery to run daemonized, using the variables from my virtual environment. But when I run $ sudo /etc/init.d/celeryd start, I get Unknown command: 'celeryd_multi' Type 'manage.py help' for usage.
I have set the following:
CELERYD_CHDIR="/home/myuser/projects/myproject"
ENV_PYTHON="/home/myuser/.virtualenvs/myproject/bin/python"
CELERYD_MULTI="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryd_multi"
When I run $ /home/myuser/.virtualenvs/myproject/bin/python /home/myuser/projects/myproject/manage.py celeryd_multi from the command line, it works fine.
Any ideas? I will gladly post any other code you need :)
Thank you!
Maybe you just set a wrong DJANGO_SETTINGS_MODULE:
try DJANGO_SETTINGS_MODULE="settings" instead of DJANGO_SETTINGS_MODULE="project.settings" (or vice versa).
The problem here is that when you run it as your user, the virtualenv already has the proper environment activated for your user "myuser", and it pulls packages from /home/myuser/.virtualenvs/myproject/...
When you do sudo /etc/init.d/celeryd start you are starting celery as root, which probably doesn't have a virtualenv activated in /root/.virtualenvs/ (if such a thing even exists), and thus it looks for Python packages in /usr/lib/..., where your default Python is and consequently where your celery is not installed.
Your options are to either:
1. Replicate the same virtualenv under the root user and start it like you tried, with sudo.
2. Keep the virtualenv where it is and start celery as your user "myuser" (no sudo), without using init scripts.
3. Write a script that runs su - myuser -c /bin/sh /home/myuser/.virtualenvs/myproject/bin/celeryd, and invoke it from init.d as myuser (a sketch follows after this list).
4. Install supervisor outside of the virtualenv and let it do the dirty work for you.
Thoughts on each:
1. Avoid using root for anything you don't have to.
2. If you don't need celery to start on boot, then this is fine, possibly wrapped in a script.
3. Plain hackish to me, but it works if you don't want to invest an additional 30 minutes in something else.
4. Probably the best way to handle ALL of your Python startup needs; highly recommended.
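For option 3, a rough sketch of such a wrapper (the paths mirror the question, the filename is hypothetical, and it assumes the django-celery management commands used above):
#!/bin/sh
# hypothetical /usr/local/bin/celeryd-as-myuser, to be invoked from an init script
exec su - myuser -c "cd /home/myuser/projects/myproject && /home/myuser/.virtualenvs/myproject/bin/python manage.py celeryd"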