I'm running multiple Django/Apache/WSGI websites on the same server using apache2 virtual hosts, and I would like to use Celery. But if I start celeryd for multiple websites, all the websites use the configuration (logs, DB, etc.) of the last celeryd instance I started.
Is there a way to run multiple celeryd instances (one for each website), or one celeryd for all of them? It seems like it should be doable, but I can't find out how.
This problem was a big headache; I didn't notice @Crazyshezy's comment when I first came here. I just accomplished this by changing the broker URL (BROKER_URL) in settings.py for each web app.
app1.settings.py
BROKER_URL = 'redis://localhost:6379/0'
app2.settings.py
BROKER_URL = 'redis://localhost:6379/1'
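If you also store task results, it may be worth separating those per site as well so they don't collide. A minimal sketch, assuming the same single Redis instance and the pre-4.0 uppercase setting names used above:
# app1.settings.py
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
# app2.settings.py
BROKER_URL = 'redis://localhost:6379/1'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'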
Yes, there is a way.
We use supervisor to start a celery daemon for every project that needs one.
The supervisor config file looks something like this:
[program:PROJECTNAME]
command=python manage.py celeryd --loglevel=INFO --beat
environment=PATH=/home/www-data/projects/PROJECTNAME/env/bin:/usr/bin:/bin
directory=/home/www-data/projects/PROJECTNAME/
user=www-data
numprocs=1
umask=022
stdout_logfile=/home/www-data/logs/%(program_name)s.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
stderr_logfile=/home/www-data/logs/%(program_name)s.error.log
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=10
autorestart=true
autostart=True
startsecs=10
stopwaitsecs = 60
priority=998
There is also another advantage to this setup: the celery daemons run entirely under the project's own unprivileged user account.
Remember to use different broker backends for your projects. It won't work if every project uses the same RabbitMQ virtual host or the same Redis database.
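For example, with RabbitMQ you could give each project its own vhost, so every project's settings point at a different broker. A sketch with hypothetical vhost, user and password names:
# settings.py of the first project
BROKER_URL = 'amqp://projectone_user:secret@localhost:5672/projectone_vhost'
# settings.py of the second project
BROKER_URL = 'amqp://projecttwo_user:secret@localhost:5672/projecttwo_vhost'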
Related
I have a test Django site using a mod_wsgi daemon process. I have set up a simple Celery task to email the contents of a contact form, and I have installed supervisor.
I know that the Django code is correct.
The problem I'm having is that when I submit the form, I am only getting one message - the first one. Subsequent completions of the contact form do not send any message at all.
On my server, I have another test site with a supervisor task configured and running, which uses the Django development server (i.e. it's not using mod_wsgi). Both of my tasks show as running if I do
sudo supervisorctl status
Here is my conf file for the task described above, which is saved in
/etc/supervisor/conf.d
The user in this instance is called myuser.
[program:test_project]
command=/home/myuser/virtualenvs/test_project_env/bin/celery -A test_project worker --loglevel=INFO --concurrency=10 -n worker2@%%h
directory=/home/myuser/djangoprojects/test_project
user=myuser
numprocs=1
stdout_logfile=/var/log/celery/test_project.out.log
stderr_logfile=/var/log/celery/test_project.err.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
stopasgroup=true
; Set Celery priority higher than default (999)
; so, if rabbitmq is supervised, it will start first.
priority=1000
My other test site has this set as the command - note worker1@%%h
command=/home/myuser/virtualenvs/another_test_project_env/bin/celery -A another_test_project worker --loglevel=INFO --concurrency=10 -n worker1@%%h
I'm obviously doing something wrong in that only my first form submission creates a task. If I look at the out.log file referred to above, I only see the first task; nothing is visible for the other form submissions.
Many thanks in advance.
UPDATE
I submitted the first form at 8:32 am (GMT) and its message was received; then, as described above, I submitted another shortly thereafter for which no task was created. Just after finishing this question, I submitted the form again at 9:15, and for this a task was created and the message received! I then submitted the form again, and once more no task was created. Hope this helps!
Use ps auxf | grep celery to see how many workers you have started. If there is another worker that you started earlier and never killed, that old worker will consume the tasks, which is why only one out of every two or three (or more) submissions results in a received task.
You also need to stop celery with:
sudo supervisorctl -c /etc/supervisord/supervisord.conf stop all
every time, and set this in supervisord.conf:
stopasgroup=true ; send stop signal to the UNIX process group (default false)
Otherwise it will cause memory leaks and regular task loss.
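As a quick check, you can list every celery process on the box and ping the workers reachable through your broker to see which ones actually answer; a sketch using the project name from the question above:
# list all celery processes (the [c] trick keeps grep itself out of the output)
ps auxf | grep '[c]elery'
# ask the workers connected to this project's broker to respond
# (run from the project directory, inside its virtualenv)
celery -A test_project inspect ping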
If you have multiple Django sites, here is a demo using RabbitMQ.
You need to add a RabbitMQ vhost and give your user permissions on it:
sudo rabbitmqctl add_vhost {vhost_name}
sudo rabbitmqctl set_permissions -p {vhost_name} {username} ".*" ".*" ".*"
Each site should use a different vhost (but they can use the same user).
Then add this to your Django settings.py:
BROKER_URL = 'amqp://username:password@localhost:5672/vhost_name'
Some more info here:
Using celeryd as a daemon with multiple django apps?
Running multiple Django Celery websites on same server
Run Multiple Django Apps With Celery On One Server With Rabbitmq VHosts
I have a wsgi.ini file in my project, and I use uwsgi wsgi.ini to run my project. But when I change the Django code, I want to restart the project instead of killing uwsgi and starting it again. The official uWSGI documentation provides the following methods:
# using kill to send the signal
kill -HUP `cat /tmp/project-master.pid`
# or the convenience option --reload
uwsgi --reload /tmp/project-master.pid
# or if uwsgi was started with touch-reload=/tmp/somefile
touch /tmp/somefile
But I don't have a project-master.pid file in the /tmp directory on my system (CentOS).
My questions:
How do I use uwsgi to restart Django instead of killing it and starting it again?
If I use the method from the uWSGI documentation, how do I create the .pid file, and what should it contain?
I found the answer. project-master.pid is set in the wsgi.ini file: you should set pidfile=/tmp/project-master.pid first, then start the server with uwsgi wsgi.ini. After it starts, you will see a project-master.pid file in the /tmp directory. When you want to reload the uWSGI server, you can restart it with: uwsgi --reload /tmp/project-master.pid.
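A minimal wsgi.ini sketch showing where the pidfile option goes (the paths and module name are only placeholders), together with the optional touch-reload trigger mentioned in the docs:
[uwsgi]
# placeholder project directory and WSGI module
chdir = /path/to/project
module = project.wsgi:application
master = true
# written at startup; target of `uwsgi --reload /tmp/project-master.pid`
pidfile = /tmp/project-master.pid
# optional: `touch /tmp/somefile` also triggers a graceful reload
touch-reload = /tmp/somefile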
I found a simpler answer, in my opinion: you can just kill your uwsgi processes and then spawn them again:
killall uwsgi
And then just run your uwsgi command again.
You don't need to use a uWSGI server for local development. Apache/uWSGI are meant for production, and having them restart implicitly at every code change is not often desirable. In fact, a production server not restarting when the code changes often acts as a safety net, so that you don't end up restarting the server before finalising the deployment.
Just use the built-in development server Django provides; it reloads automatically when your code changes:
python manage.py runserver 8000
We're using Django + Gunicorn + Nginx on our server. The problem is that after a while we see lots of gunicorn worker processes that have become orphans, and a lot of others that have become zombies. We can also see that some Gunicorn worker processes spawn other Gunicorn workers. Our best guess is that these workers become orphans after their parent workers have died.
Why do Gunicorn workers spawn child workers? Why do they die? And how can we prevent this?
I should also mention that we've set Gunicorn's log level to debug and we still don't see anything significant, other than the periodic log of the worker count, which reports the number of workers we asked for.
UPDATE
This is the line we used to run gunicorn:
gunicorn --env DJANGO_SETTINGS_MODULE=proj.settings proj.wsgi --name proj --workers 10 --user proj --group proj --bind 127.0.0.1:7003 --log-level=debug --pid gunicorn.pid --timeout 600 --access-logfile /home/proj/access.log --error-logfile /home/proj/error.log
In my case I deploy on Ubuntu servers (LTS releases, now mostly 14.04 LTS) and I have never had problems with gunicorn daemons. I create a gunicorn.conf.py and launch gunicorn with this config from Upstart, with a script like this in /etc/init/djangoapp.conf:
description "djangoapp website"
start on startup
stop on shutdown
respawn
respawn limit 10 5
script
cd /home/web/djangoapp
exec /home/web/djangoapp/bin/gunicorn -c gunicorn.conf.py -u web -g web djangoapp.wsgi
end script
I configure gunicorn with a .py config file, set up some options (details below), and deploy my app (with virtualenv) in /home/web/djangoapp, and I have no problems with zombie or orphan gunicorn processes.
I checked your options. The timeout can be a problem, but another one is that you don't set max_requests in your config; by default it is 0, so workers are never automatically restarted, which can lead to memory leaks (http://gunicorn-docs.readthedocs.org/en/latest/settings.html#max-requests).
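As a reference, here is a sketch of what such a gunicorn.conf.py might contain (the numbers and paths are only placeholders to tune for your app):
# gunicorn.conf.py (sketch)
bind = "127.0.0.1:7003"
workers = 10
timeout = 60              # kill a silent worker after this many seconds
graceful_timeout = 30     # grace period for a worker to finish on restart
max_requests = 1000       # recycle each worker after this many requests
max_requests_jitter = 50  # stagger recycling so workers don't restart together
accesslog = "/home/proj/access.log"
errorlog = "/home/proj/error.log"
loglevel = "info"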
We will use a .sh file to start the gunicorn process, and then a supervisord configuration file. What is supervisord? There is some external how-to information about installing supervisord with Django, Nginx and Gunicorn Here.
gunicorn_start.sh (remember to make the file executable with chmod +x):
#!/bin/sh
NAME="myDjango"
DJANGODIR="/var/www/html/myDjango"
NUM_WORKERS=3
echo "Starting myDjango -- Django Application"
cd $DJANGODIR
exec gunicorn -w $NUM_WORKERS $NAME.wsgi:application --bind 127.0.0.1:8001
mydjango_django.conf: Remember to install supervisord on your OS, and copy this into supervisor's configuration folder:
[program:myDjango]
command=/var/www/html/myDjango/gunicorn_start.sh
user=root
autorestart=true
redirect_stderr=true
Later on, use these commands.
Reload the daemon's configuration files, without adding or removing processes (no restarts):
supervisorctl reread
Start all processes (note that restart does not reread config files; for that, use reread and update):
supervisorctl start all
Get status info for all processes:
supervisorctl status
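After a reread, any new or changed program sections are applied with the standard update command:
supervisorctl update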
This sounds like a timeout issue.
You have multiple timeouts going on, and they all need to be in descending order from the outside in (Nginx's the largest, then Gunicorn's, and so on), so that the innermost layer gives up first. It seems they may not be.
For example:
Nginx has a default timeout of 60 seconds
Gunicorn has a default timeout of 30 seconds
Django has a default timeout of 300 seconds
Postgres' default timeout is complicated, but let's say 60 seconds for this example.
In this example, when 30 seconds have passed, Django is still waiting for Postgres to respond. Gunicorn tells Django to stop, which in turn should tell Postgres to stop. Gunicorn will wait a certain amount of time for this to happen before it kills Django, leaving the Postgres query running as an orphan. The user will then re-initiate their request, and this time the query will take longer because the old one is still running.
I see that you have set your Gunicorn timeout to 300 seconds.
This would probably mean that Nginx tells Gunicorn to stop after 60 seconds; Gunicorn may keep waiting for Django, which waits for Postgres or other underlying processes, and when Nginx gets tired of waiting, it kills Gunicorn, leaving Django hanging.
This is still just a theory, but it is a very common problem and hopefully leads you and any others experiencing similar problems, to the right place.
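As an illustrative sketch only (placeholder values; the right numbers depend on your stack), you can cap the Postgres statement time below Gunicorn's timeout through the Django database options, and then keep nginx's proxy_read_timeout above Gunicorn's timeout:
# settings.py (sketch): the database gives up first, then Gunicorn, then nginx
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'proj',  # placeholder database name
        'OPTIONS': {
            # passed to libpq: abort any statement running longer than 30 s
            'options': '-c statement_timeout=30000',
        },
    }
}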
I have a Django app with Celery functionality, and I can run Celery successfully like this:
celery -A tasks worker --loglevel=info
But since it is well known that it needs to run as a daemon, I have written the celery.conf file below inside the /etc/supervisor/conf.d/ folder:
; ==================================
; celery worker supervisor example
; ==================================
[program:celery]
; Set full path to celery program if using virtualenv
command=/root/Envs/proj/bin/celery -A app.tasks worker --loglevel=info
user=root
environment=C_FORCE_ROOT="yes"
environment=HOME="/root",USER="root"
directory=/root/apps/proj/structure
numprocs=1
stdout_logfile=/var/log/celery/worker.log
stderr_logfile=/var/log/celery/worker.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
But when I tried to update supervisor with supervisorctl reread and supervisorctl update, I got this message from supervisorctl status:
celery FATAL Exited too quickly (process log may have details)
So I went to the worker.log file and saw the error message below:
Running a worker with superuser privileges when the
worker accepts messages serialized with pickle is a very bad idea!
If you really want to continue then you have to set the C_FORCE_ROOT
environment variable (but please think about this before you do).
User information: uid=0 euid=0 gid=0 egid=0
So why is it complaining about C_FORCE_ROOT even though we set it as an environment variable inside the supervisor conf file? What am I doing wrong in the above conf file?
I had the same problem, so I added
environment=C_FORCE_ROOT="yes"
to my program config, but it didn't work. So I used
environment=C_FORCE_ROOT="true"
instead, and it's working.
You'll need to run celery under a non-superuser account. Please remove the following lines from your config:
user=root
environment=C_FORCE_ROOT="yes"
environment=HOME="/root",USER="root"
And then add these lines to your config; I assume that you use django as the non-superuser account and developers as the user group:
user=django
group=developers
Note that subprocesses will inherit the environment variables of the
shell used to start supervisord except for the ones overridden here
and within the program's environment option (see the supervisord documentation).
So please note that when you change environment variables via supervisor config files, the changes won't be applied by running supervisorctl reread and supervisorctl reload. You should start supervisor from scratch with the following command:
supervisord -c /path/to/config/file.conf
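Putting the answer together, the relevant part of the program section might then look like this (the virtualenv and project paths are only placeholders, since a non-root user normally can't read /root):
[program:celery]
; hypothetical paths owned by the unprivileged user
command=/home/django/Envs/proj/bin/celery -A app.tasks worker --loglevel=info
directory=/home/django/apps/proj/structure
user=django
group=developers
; ...keep the logging, autorestart, stopwaitsecs and killasgroup options from the original config...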
From this other thread on Stack Overflow, I managed to add the following settings and it worked for me:
app.conf.update(
CELERY_ACCEPT_CONTENT = ['json'],
CELERY_TASK_SERIALIZER = 'json',
CELERY_RESULT_SERIALIZER = 'json',
)
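If you are on a newer Celery release (4.x or later), the same settings are usually spelled with lowercase names; a sketch of the equivalent call:
# equivalent lowercase setting names used by Celery 4.x and later
app.conf.update(
    accept_content=['json'],
    task_serializer='json',
    result_serializer='json',
)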
I'm trying to find some information on the correct way of setting up multiple Django sites on a Linode (Ubuntu 12.04.3 LTS (GNU/Linux 3.9.3-x86_64-linode33 x86_64)).
Here is what I have now:
Webserver: nginx
Every site is contained in a .virtualenv
Django and other packages are installed using pip in each .virtualenv
RabbitMQ is installed using sudo apt-get install rabbitmq-server, and a new user and vhost are created for each site.
Each site is started using a supervisor script:
[group:<SITENAME>]
programs=<SITENAME>-gunicorn, <SITENAME>-celeryd, <SITENAME>-celerycam
[program:<SITENAME>-gunicorn]
directory = /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/gunicorn <PROJECT>.wsgi:application -c /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/server_conf/<SITENAME>-gunicorn.py
user=<USER>
autostart = true
autorestart = true
stderr_events_enabled = true
redirect_stderr = true
logfile_maxbytes=5MB
[program:<SITENAME>-celeryd]
directory=/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/python /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/manage.py celery worker -E -n <SITENAME> --broker=amqp://<SITENAME>:<SITENAME>@localhost:5672//<SITENAME> --loglevel=ERROR
environment=HOME='/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/',DJANGO_SETTINGS_MODULE='<PROJECT>.settings.staging'
user=<USER>
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
[program:<SITENAME>-celerycam]
directory=/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/python /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/manage.py celerycam
environment=HOME='/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/',DJANGO_SETTINGS_MODULE='<PROJECT>.settings.staging'
user=<USER>
autostart=true
autorestart=true
startsecs=10
Question 1: Is this the correct way, or is there a better way to do this?
Question 2: I have tried to install Celery Flower, but how does that work with multiple sites? Do I need to install the flower package in each .virtualenv, or can I use one install for every site? How do I set up nginx to display the Flower page(s) on my server?
Answer 1
There are - as so often :) - several ways to go. We set it up in a similar way.
For the supervisor configuration I would suggest a slightly less verbose approach; below is an example for running the web process and tasks for 'example.com':
/etc/supervisor/conf.d/example.com.conf
(we usually have the config files in the repository as well, and just symlink them. So this file could be a symlink to:
/var/www/example.com/conf/supervisord.conf )
[group:example.com]
programs=web, worker, flower
[program:web]
command=/srv/example.com/bin/gunicorn project.wsgi -c /var/www/example.com/app/gunicorn.conf.py
directory=/var/www/example.com/app/
user=<USER>
autostart=true
autorestart=true
redirect_stderr=True
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=5
stdout_logfile=/var/log/apps/web.example.com.log
[program:worker]
command=/srv/example.com/bin/celery -A project worker -l info
directory=/var/www/example.com/app/
user=<USER>
autostart=true
autorestart=true
redirect_stderr=True
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=5
stdout_logfile=/var/log/apps/web.example.com.log
[program:flower]
command=/srv/example.com/bin/celery flower -A project --broker=amqp://guest:guest@localhost:5672//example.com/ --url_prefix=flower --port 5001
directory=/var/www/example.com/app/
...
So you have less to type and it is easier to read.
# restart all 'programs'
supervisorctl restart example.com:*
# restart web/django
supervisorctl restart example.com:web
etc.
Answer 2
Not totally sure if it is the best way, but what I would do here (and usually do):
run Flower separately for every app (see config above)
each with its respective vhost (and url_prefix)
add an nginx reverse proxy (a location with the same name as the url_prefix)
/etc/nginx/sites-enabled/example.conf
server {
...
location /flower {
proxy_pass http://127.0.0.1:5001;
...
Access the flower interface at example.com/flower