Supervisor shuts down when I have multiple services in the .conf file - django

I have a Linux server running Ubuntu 22.10 on DigitalOcean, where I plan to host multiple virtualenvs to run isolated Django applications.
I'm using Gunicorn to serve the applications and Nginx as the reverse proxy,
with Supervisor to control the processes.
I started with one Django application and everything was working fine.
On adding the second .conf file for the second application, Supervisor shuts down and subsequent supervisorctl commands return this error: unix:///var/run/supervisor.sock refused connection
This is my Supervisor configuration:
[program:brotherstech]
command=/home/webapps/brotherstech/bin/gunicorn_start
user=ops
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/home/webapps/brotherstech/logs/gunicorn-error.log
[program:officebook]
command=/home/webapps/officebook/bin/gunicorn_start
user=ops
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/home/webapps/officebook/logs/gunicorn-error.log

Related

Gunicorn sync workers spawning processes

We're using Django + Gunicorn + Nginx on our server. The problem is that after a while we see lots of Gunicorn worker processes that have become orphans, and a lot of others that have become zombies. We can also see that some Gunicorn worker processes spawn other Gunicorn workers. Our best guess is that these workers become orphans after their parent workers have died.
Why do Gunicorn workers spawn child workers? Why do they die? And how can we prevent this?
I should also mention that we've set Gunicorn's log level to debug and still don't see anything significant, other than the periodic log of the worker count, which reports the number of workers we asked for.
UPDATE
This is the line we used to run gunicorn:
gunicorn --env DJANGO_SETTINGS_MODULE=proj.settings proj.wsgi --name proj --workers 10 --user proj --group proj --bind 127.0.0.1:7003 --log-level=debug --pid gunicorn.pid --timeout 600 --access-logfile /home/proj/access.log --error-logfile /home/proj/error.log
In my case I deploy on Ubuntu servers (LTS releases, now mostly 14.04 LTS) and I have never had problems with Gunicorn daemons. I create a gunicorn.conf.py and launch Gunicorn with this config from upstart, with a script like this in /etc/init/djangoapp.conf:
description "djangoapp website"
start on startup
stop on shutdown
respawn
respawn limit 10 5
script
cd /home/web/djangoapp
exec /home/web/djangoapp/bin/gunicorn -c gunicorn.conf.py -u web -g web djangoapp.wsgi
end script
I configure Gunicorn with a .py config file, set up some options (see the sketch below), and deploy my app (with virtualenv) in /home/web/djangoapp, and I have had no problems with zombie or orphaned Gunicorn processes.
I looked over your options; the timeout can be a problem, but another issue is that you don't set max-requests in your config. It defaults to 0, so workers are never automatically restarted, which can let memory leaks accumulate (http://gunicorn-docs.readthedocs.org/en/latest/settings.html#max-requests)
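A minimal gunicorn.conf.py along those lines might look like this (illustrative values only, not the original answerer's actual config):
# gunicorn.conf.py -- illustrative values, adjust to your app
bind = "127.0.0.1:7003"
workers = 10
timeout = 600
# recycle each worker after this many requests; guards against slow memory leaks
# (the default of 0 means workers are never restarted automatically)
max_requests = 1000
loglevel = "info"
accesslog = "/home/proj/access.log"
errorlog = "/home/proj/error.log"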
We will use a .sh file to start the Gunicorn process, and later a supervisord configuration file. (What is supervisord? For background on installing supervisord with Django, Nginx and Gunicorn, see the external guide linked Here.)
gunicorn_start.sh (remember to make the file executable with chmod +x):
#!/bin/sh
NAME="myDjango"
DJANGODIR="/var/www/html/myDjango"
NUM_WORKERS=3
echo "Starting myDjango -- Django Application"
cd $DJANGODIR
exec gunicorn -w $NUM_WORKERS $NAME.wsgi:application --bind 127.0.0.1:8001
mydjango_django.conf: Remember to install supervisord on your OS, then copy this into the configuration folder (typically /etc/supervisor/conf.d/):
[program:myDjango]
command=/var/www/html/myDjango/gunicorn_start.sh
user=root
autorestart=true
redirect_stderr=true
Then use these commands:
Reload the daemon's configuration files, without adding/removing programs (no restarts):
supervisorctl reread
Start all processes (note: restart does not reread config files; for that, see reread and update):
supervisorctl start all
Get status info for all processes:
supervisorctl status
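Note that reread only reports which config files changed; to actually apply those changes (adding, removing and restarting the affected programs) the usual follow-up is:
supervisorctl update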
This sounds like a timeout issue.
You have multiple timeouts in play, and they all need to be in descending order. It seems they may not be.
For example:
Nginx has a default timeout of 60 seconds
Gunicorn has a default timeout of 30 seconds
Django has a default timeout of 300 seconds
Postgres default timeout is complicated but let's pose 60 seconds for this example.
In this example, after 30 seconds have passed Django is still waiting for Postgres to respond, so Gunicorn tells Django to stop, which in turn should tell Postgres to stop. Gunicorn waits a certain amount of time for this to happen before it kills Django, leaving the Postgres query running as an orphan. The user then re-initiates their request, and this time the query takes even longer because the old one is still running.
I see that you have set your Gunicorn timeout to 300 seconds.
This would probably mean that Nginx tells Gunicorn to stop after 60 seconds, Gunicorn may wait for Django who waits for Postgres or any other underlying processes, and when Nginx gets tired of waiting, it kills Gunicorn, leaving Django hanging.
This is still just a theory, but it is a very common problem and hopefully leads you and any others experiencing similar problems, to the right place.
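As a rough sketch of lining those timeouts up in descending order (values and addresses are illustrative, not a recommendation for any particular app):
# nginx: let the proxy wait slightly longer than Gunicorn's worker timeout
location / {
    proxy_pass http://127.0.0.1:7003;
    proxy_read_timeout 620s;
}
# gunicorn: worker timeout below nginx's read timeout, above your slowest expected request
gunicorn proj.wsgi --bind 127.0.0.1:7003 --workers 10 --timeout 600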

Correct setup for multiple web sites with nginx, django and celery

I'm trying to find some information on the correct way of setting up multiple Django sites on a Linode (Ubuntu 12.04.3 LTS (GNU/Linux 3.9.3-x86_64-linode33 x86_64)).
Here is what I have now:
Webserver: nginx
Every site is contained in a .virtualenv
Django and other packages are installed using pip in each .virtualenv
RabbitMQ is installed using sudo apt-get install rabbitmq-server, and a new user and vhost are created for each site.
Each site is started using a supervisor script:
[group:<SITENAME>]
programs=<SITENAME>-gunicorn, <SITENAME>-celeryd, <SITENAME>-celerycam
[program:<SITENAME>-gunicorn]
directory = /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/gunicorn <PROJECT>.wsgi:application -c /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/server_conf/<SITENAME>-gunicorn.py
user=<USER>
autostart = true
autorestart = true
stderr_events_enabled = true
redirect_stderr = true
logfile_maxbytes=5MB
[program:<SITENAME>-celeryd]
directory=/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/python /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/manage.py celery worker -E -n <SITENAME> --broker=amqp://<SITENAME>:<SITENAME>@localhost:5672//<SITENAME> --loglevel=ERROR
environment=HOME='/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/',DJANGO_SETTINGS_MODULE='<PROJECT>.settings.staging'
user=<USER>
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
[program:<SITENAME>-celerycam]
directory=/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/
command=/home/<USER>/.virtualenvs/<SITENAME>/bin/python /home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/manage.py celerycam
environment=HOME='/home/<USER>/.virtualenvs/<SITENAME>/<PROJECT>/',DJANGO_SETTINGS_MODULE='<PROJECT>.settings.staging'
user=<USER>
autostart=true
autorestart=true
startsecs=10
Question 1: Is this the correct way? Or is it a better way to do this?
Question 2: I have tried to install celery flower, but how does that work with multiple sites? Do I need to install one flower-package for each .virtualenv, or could I use one install for every site? How do I setup nginx to display the flower-page(s) on my server?
Answer 1
There are - as so often :) - several ways to go. We do set it up in a similar way.
For the supervisor configuration I would suggest a slightly less verbose approach; below is an example for running the web/task processes for 'example.com':
/etc/supervisor/conf.d/example.com.conf
(we usually have the config files in the repository as well, and just symlink them. So this file could be a symlink to:
/var/www/example.com/conf/supervisord.conf )
[group:example.com]
programs=web, worker, cam
[program:web]
command=/srv/example.com/bin/gunicorn project.wsgi -c /var/www/example.com/app/gunicorn.conf.py
directory=/var/www/example.com/app/
user=<USER>
autostart=true
autorestart=true
redirect_stderr=True
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=5
stdout_logfile=/var/log/apps/web.example.com.log
[program:worker]
command=/srv/example.com/bin/celery -A project worker -l info
directory=/var/www/example.com/app/
user=<USER>
autostart=true
autorestart=true
redirect_stderr=True
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=5
stdout_logfile=/var/log/apps/web.example.com.log
[program:flower]
command=/srv/example.com/bin/celery flower -A project --broker=amqp://guest:guest@localhost:5672//example.com/ --url_prefix=flower --port 5001
directory=/var/www/example.com/app/
...
So you have less to type and it is easier to read.
# restart all 'programs'
supervisorctl restart example.com:*
# restart web/django
supervisorctl restart example.com:web
etc.
Answer 2
Not totally sure if it is the best way, but what I would do here (and usually do):
run flower separately for every app (see config above)
with respective vhost (and url_prefix)
add an nginx reverse proxy (a location with the same name as the url_prefix)
/etc/nginx/sites-enabled/example.conf
server {
    ...
    location /flower {
        proxy_pass http://127.0.0.1:5001;
        ...
    }
}
Access the flower interface at example.com/flower
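For a second site you would repeat the same pattern with its own vhost, url_prefix and port (the names and port below are placeholders, not from the original setup):
[program:flower]
command=/srv/othersite.com/bin/celery flower -A project --broker=amqp://guest:guest@localhost:5672//othersite.com/ --url_prefix=flower --port 5002
directory=/var/www/othersite.com/app/
...
and in that site's nginx server block:
location /flower {
    proxy_pass http://127.0.0.1:5002;
}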

Running Gunicorn behind chrooted nginx inside virtualenv

I can get this setup to work if I start Gunicorn manually, or if I add gunicorn to my Django INSTALLED_APPS. But when I try to start Gunicorn with systemd, the Gunicorn socket and service start fine but they don't serve anything to Nginx; I get a 502 Bad Gateway.
Nginx is running under the "http" user/group in a chroot jail. I used pythonbrew to set up the virtualenvs, so Gunicorn is installed in my home directory under .pythonbrew. The virtualenv directory is owned by my user and the adm group.
I'm pretty sure there is a permission issue somewhere, because everything works if I start Gunicorn myself but not if systemd starts it. I've tried changing the user and group directives inside the gunicorn.service file, but nothing worked; if root starts the server I get no errors and a 502, and if my user starts it I get no errors and a 504.
I have checked the Nginx logs and there are no errors, so I'm sure it's a gunicorn issue. Should I have the virtualenv in the app directory? Who should be the owner of the app directory? How can I narrow down the issue?
/usr/lib/systemd/system/gunicorn-app.service
#!/bin/sh
[Unit]
Description=gunicorn-app
[Service]
ExecStart=/home/noel/.pythonbrew/venvs/Python-3.3.0/nlp/bin/gunicorn_django
User=http
Group=http
Restart=always
WorkingDirectory = /home/noel/.pythonbrew/venvs/Python-3.3.0/nlp/bin
[Install]
WantedBy=multi-user.target
/usr/lib/systemd/system/gunicorn-app.socket
[Unit]
Description=gunicorn-app socket
[Socket]
ListenStream=/run/unicorn.sock
ListenStream=0.0.0.0:9000
ListenStream=[::]:8000
[Install]
WantedBy=sockets.target
I realize this is kind of a sprawling question, but I'm sure I can pinpoint the issue with a few pointers. Thanks.
Update
I'm starting to narrow this down. When I run Gunicorn manually and then run ps aux | grep gunicorn, I see two processes: a master and a worker. But when I start Gunicorn with systemd, only one process is started. I tried adding Type=forking to my gunicorn.service file, but then I get an error when loading the service. I thought that maybe Gunicorn wasn't running under the virtualenv, or the venv isn't getting activated?
Does anyone know what I'm doing wrong here? Maybe gunicorn isn't running in the venv?
I had a similar problem on OSX with launchd.
The issue was that I needed to allow the process to spawn subprocesses.
Try adding Type=forking:
[Unit]
Description=gunicorn-app
[Service]
Type=forking
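One caveat with this route (not from the original answer): Type=forking expects the started process to daemonize itself, so with Gunicorn you would normally also pass --daemon and tell systemd where the pid file lives. A rough sketch, reusing the path from the question; the pid file location is an assumption:
[Service]
Type=forking
PIDFile=/run/gunicorn-app.pid
ExecStart=/home/noel/.pythonbrew/venvs/Python-3.3.0/nlp/bin/gunicorn_django --daemon --pid /run/gunicorn-app.pid
User=http
Group=http
Restart=always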
I know this isn't the best way, but I was able to get it working by adding gunicorn to the list of django INSTALLED_APPS. Then I just created a new systemd service:
[Unit]
Description=hack way to start gunicorn and django
[Service]
User=http
Group=http
ExecStart=/srv/http/www/nlp.com/nlp/bin/python /srv/http/www/nlp.com/nlp/nlp/manage.py run_gunicorn
Restart=always
[Install]
WantedBy=multi-user.target
There must be a better way, but judging by the lack of responses not many people know what that better way is.

Restarting SuperVisor with code changes

I am running a Django project with Gunicorn and Nginx under Supervisor. Everything worked fine, but when I made some changes to the code they are not picked up; Supervisor still serves the old code. Can you please help me? I tried restarting via supervisorctl, but it didn't work.
If you're talking about Python code changes, just use supervisorctl:
supervisorctl restart gunicorn (or whatever you called this)
If you're talking about Supervisor configuration changes, use supervisorctl reread before starting your program via supervisorctl start foo.
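A typical sequence after editing the config might look like this (assuming the program section is named gunicorn; supervisorctl update applies whatever reread detected):
supervisorctl reread
supervisorctl update
supervisorctl restart gunicorn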
"You can gracefully reload your application in Gunicorn by sending HUP signal: $ kill -HUP masterpid", http://docs.gunicorn.org/en/stable/faq.html
For example, pkill -HUP gunicorn
"Sending HUP signal to the Master Gunicorn process -- Reload the configuration, start the new worker processes with a new configuration and gracefully shutdown older workers.", http://docs.gunicorn.org/en/stable/signals.html

Running multiple Django Celery websites on same server

I'm running multiple Django/Apache/WSGI websites on the same server using apache2 virtual hosts, and I would like to use Celery. But if I start celeryd for multiple websites, all the websites use the configuration (logs, DB, etc.) of the last celeryd instance I started.
Is there a way to run multiple celeryd instances (one for each website), or one celeryd for all of them? It seems like it should be doable, but I can't figure out how.
This problem was a big headache; I didn't notice @Crazyshezy's comment when I first came here. I accomplished this by changing the broker URL in settings.py for each web app.
app1.settings.py
BROKER_URL = 'redis://localhost:6379/0'
app2.settings.py
BROKER_URL = 'redis://localhost:6379/1'
Yes there is a way.
We use supervisor to start a celery daemon for every project that needs it.
The supervisor config file looks something like this:
[program:PROJECTNAME]
command=python manage.py celeryd --loglevel=INFO --beat
environment=PATH=/home/www-data/projects/PROJECTNAME/env/bin:/usr/bin:/bin
directory=/home/www-data/projects/PROJECTNAME/
user=www-data
numprocs=1
umask=022
stdout_logfile=/home/www-data/logs/%(program_name)s.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
stderr_logfile=/home/www-data/logs/%(program_name)s.error.log
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=10
autorestart=true
autostart=True
startsecs=10
stopwaitsecs = 60
priority=998
There is also another advantage to this setup: the celery daemons run entirely in user space.
Remember to use different broker backends for your projects. It won't work if you use the same RabbitMQ virtual host or the same Redis database for every project.
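A sketch of giving each project its own RabbitMQ user and vhost (names and password are placeholders):
rabbitmqctl add_user project1 project1password
rabbitmqctl add_vhost project1
rabbitmqctl set_permissions -p project1 project1 ".*" ".*" ".*"
# then point that project's settings at it, e.g.
# BROKER_URL = 'amqp://project1:project1password@localhost:5672/project1'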