I am using Django to handle fairly long HTTP POST requests, and I am wondering whether my setup has limitations when I receive many requests at the same time.
lighttpd.conf fcgi:
fastcgi.server = (
"a.fcgi" => (
"main" => (
# Use host / port instead of socket for TCP fastcgi
"host" => "127.0.0.1",
"port" => 3033,
"check-local" => "disable",
"allow-x-send-file" => "enable"
))
)
Django init.d script start section:
start-stop-daemon --start --quiet \
--pidfile /var/www/tmp/a.pid \
--chuid www-data --exec /usr/bin/env -- python \
/var/www/a/manage.py runfcgi \
host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
Starting Django using the script above results in several Django server processes:
www-data 342 7873 0 04:58 ? 00:01:04 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
www-data 343 7873 0 04:58 ? 00:01:15 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
www-data 378 7873 0 Feb14 ? 00:04:45 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
www-data 382 7873 0 Feb12 ? 00:14:53 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
www-data 386 7873 0 Feb12 ? 00:12:49 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
www-data 7873 1 0 Feb12 ? 00:00:24 python /var/www/a/manage.py runfcgi host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid
In the lighttpd error.log I do see load = 10, which shows I am getting many requests at the same time; this happens a few times a day:
2010-02-16 05:17:17: (mod_fastcgi.c.2979) got proc: pid: 0 socket: tcp:127.0.0.1:3033 load: 10
Is my setup suited to handling many long HTTP POST requests (each can last a few minutes) at the same time?
I think you may want to configure your FastCGI worker to run multi-process or multi-threaded.
From manage.py runfcgi help:
method=IMPL prefork or threaded (default prefork)
[...]
maxspare=NUMBER max number of spare processes / threads
minspare=NUMBER min number of spare processes / threads.
maxchildren=NUMBER hard limit number of processes / threads
So your start command would be:
start-stop-daemon --start --quiet \
--pidfile /var/www/tmp/a.pid \
--chuid www-data --exec /usr/bin/env -- python \
/var/www/a/manage.py runfcgi \
host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid \
method=prefork maxspare=4 minspare=4 maxchildren=8
You will want to adjust the number of processes as needed. Note that memory usage grows roughly linearly with the number of FCGI processes. Also, if your processes are CPU-bound, having more processes than available CPU cores won't help much with concurrency.
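As a rough starting point you can derive those numbers from the machine's core count. A sketch (the 2x ratio is just an assumption to tune from; long POSTs that mostly wait on I/O tolerate more children than cores):
# hypothetical sizing helper; nproc reports the available CPU cores
CORES=$(nproc)
MAXCHILDREN=$((CORES * 2))   # assumed cap for mostly-I/O-bound requests
start-stop-daemon --start --quiet \
    --pidfile /var/www/tmp/a.pid \
    --chuid www-data --exec /usr/bin/env -- python \
    /var/www/a/manage.py runfcgi \
    host=127.0.0.1 port=3033 pidfile=/var/www/tmp/a.pid \
    method=prefork minspare=$CORES maxspare=$CORES maxchildren=$MAXCHILDREN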
Related
I am running the setup that follows (showing just the beat setup for simplicity) to daemonize my celery and beat workers on Elastic Beanstalk. I am able to daemonize the processes successfully; however, too many processes are being spawned.
Current Output
root 20409 0.7 9.1 473560 92452 ? S 02:59 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
root 20412 0.6 7.8 388152 79228 ? S 02:59 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp beat --loglevel=INFO
root 20509 0.0 7.1 388748 72412 ? S 02:59 0:00 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
root 20585 0.6 7.7 387624 78340 ? S 03:00 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp beat --loglevel=INFO
root 20679 1.1 9.1 473560 92584 ? S 03:01 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
root 20685 0.0 7.1 388768 72460 ? S 03:01 0:00 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
Desired output, as achieved by running kill -9 $(pgrep celery) after the environment deploys:
root 20794 20.6 7.7 387624 78276 ? S 03:03 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp beat --loglevel=INFO
root 20797 24.3 9.1 473560 92564 ? S 03:03 0:01 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
root 20806 0.0 7.1 388656 72272 ? S 03:03 0:00 /opt/python/run/venv/bin/python3.6 /opt/python/run/venv/bin/celery -A djangoApp worker --loglevel=INFO
celeryBeat.sh
#!/usr/bin/env bash
/opt/python/run/venv/bin/celery -A djangoApp beat --loglevel=INFO
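(A side note on this wrapper: supervisor sends its stop signal to the process it spawned, so wrapping celery in a shell script means the shell, not beat, receives the signal unless the script execs the final command. A sketch of the same wrapper with exec:)
#!/usr/bin/env bash
# exec replaces the shell with celery, so supervisor's TERM on a
# restart/redeploy reaches the beat process directly
exec /opt/python/run/venv/bin/celery -A djangoApp beat --loglevel=INFO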
supervisor.conf
[unix_http_server]
file=/opt/python/run/supervisor.sock ; (the path to the socket file)
;chmod=0700 ; socket file mode (default 0700)
;chown=nobody:nogroup ; socket file uid:gid owner
[supervisord]
logfile=/opt/python/log/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=10MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/opt/python/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
directory=/opt/python/current/app ; (default is not to cd during start)
;nocleanup=true ; (don't clean up tempfiles at start;default false)
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///opt/python/run/supervisor.sock
[program:httpd]
command=/opt/python/bin/httpdlaunch
numprocs=1
directory=/opt/python/current/app
autostart=true
autorestart=unexpected
startsecs=1 ; number of secs prog must stay running (def. 1)
startretries=3 ; max # of serial start failures (default 3)
exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
killasgroup=false ; SIGKILL the UNIX process group (def false)
redirect_stderr=false
[include]
files: celery.conf
celery.conf
[program:beat]
; Set full path to celery program if using virtualenv
command=sh /opt/python/etc/celeryBeat.sh
directory=/opt/python/current/app
; user=nobody
numprocs=1
stdout_logfile=/var/log/celery-beat.log
stderr_logfile=/var/log/celery-beat.log
autostart=true
autorestart=true
startsecs=60
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 60
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
container commands
container_commands:
01_celery_tasks:
command: "cat .ebextensions/files/celery_configuration.txt > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
leader_only: true
02_celery_tasks_run:
command: "cat .ebextensions/files/beat_configuration.txt > /opt/python/etc/beat.sh && chmod 744 /opt/python/etc/celeryBeat.sh"
leader_only: true
03_celery_tasks_run:
command: "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
leader_only: true
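If the 03_celery_tasks_run hook starts celery directly on every deploy while supervisord keeps the previous processes alive, that alone would explain the duplicates. One way to make the hook idempotent (a sketch; the supervisord config path and program name are assumptions based on the files above):
# at the end of run_supervised_celeryd.sh: re-read the config and restart
# the supervised programs in place instead of spawning fresh processes
supervisorctl -c /opt/python/etc/supervisord.conf reread
supervisorctl -c /opt/python/etc/supervisord.conf update
supervisorctl -c /opt/python/etc/supervisord.conf restart beat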
I am trying to daemonize my celery/redis workers on Ubuntu 18.04 and I am making progress! Celery is now running, but it does not appear to be communicating with my Django app. I found that after removing the Type=forking directive from the celery.service file, celery started working.
# systemctl status celery.service
● celery.service - Celery Service
Loaded: loaded (/etc/systemd/system/celery.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2020-12-17 18:35:19 MST; 1min 52s ago
Main PID: 21509 (code=exited, status=1/FAILURE)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/celery.service
Dec 17 18:35:17 t-rex systemd[1]: Starting Celery Service...
Dec 17 18:35:19 t-rex sh[24331]: celery multi v4.3.0 (rhubarb)
Dec 17 18:35:19 t-rex sh[24331]: > Starting nodes...
Dec 17 18:35:19 t-rex sh[24331]: > w1@t-rex: OK
Dec 17 18:35:19 t-rex sh[24331]: > w2@t-rex: OK
Dec 17 18:35:19 t-rex sh[24331]: > w3@t-rex: OK
Dec 17 18:35:19 t-rex systemd[1]: Started Celery Service.
When I test celery from the Python prompt in my app's virtualenv, the test fails. This is the test I use in my app before I call a celery task:
>>> celery_app.control.broadcast('ping', reply=True, limit=1)
[]
My celery.service file (straight from the celery docs), with a few local changes:
[Unit]
Description=Celery Service
After=network.target redis.service
Requires=redis.service
[Service]
#Type=forking
User=www-data
Group=www-data
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/home/mark/python-projects/archive
ExecStart=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --loglevel="${CELERYD_LOG_LEVEL}"'
ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
Restart=always
[Install]
WantedBy=multi-user.target
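(One possible explanation for the status=1/FAILURE shown above: the pidfile directory /var/run/celery has to exist and be writable by www-data before the workers start. systemd can create it for you; a sketch of extra [Service] lines, assuming systemd 235+ for LogsDirectory:)
[Service]
# systemd creates /run/celery and /var/log/celery owned by User/Group
# before ExecStart runs
RuntimeDirectory=celery
LogsDirectory=celery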
and my environment file (also from the same celery docs):
# Names of the nodes to start
# here we have three nodes
CELERYD_NODES="w1 w2 w3"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/home/mark/.virtualenvs/archive/bin/celery"
#CELERY_BIN="/virtualenvs/def/bin/celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="MemorabiliaJSON"
# or fully qualified:
#CELERY_APP="proj.tasks:app"
# How to call manage.py
CELERYD_MULTI="multi"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# - %n will be replaced with the first part of the nodename.
# - %I will be replaced with the current child process index
# and is important when using the prefork pool to avoid race conditions.
CELERYD_PID_FILE="/var/run/celery/%n.pid"
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_LOG_LEVEL="DEBUG"
The redis server is running, so that should not be the issue. I am not sure whether redis is talking to my daemonized celery or not.
I start celery with "celery -A MemorabiliaJSON worker -l debug" when using Django's runserver, and I am not sure whether my daemonized celery needs something else to make it talk to my Django app.
Is there any magic needed to get Django/Apache/WSGI to work with daemonized celery? There is nothing in the celery log files when I try my test above.
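A quick way to check whether the daemonized workers are reachable over the broker at all is to ping them as the same user the daemon runs as (a sketch, using the paths from the files above):
# run from the project directory so the app module is importable;
# workers that are connected to the broker will answer the ping
cd /home/mark/python-projects/archive
sudo -u www-data /home/mark/.virtualenvs/archive/bin/celery -A MemorabiliaJSON inspect ping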
Thanks for any assistance you can give me in debugging this problem!
Mark
I'm running a Django app with uWSGI in Docker with docker-compose. I get the same error every time I:
Send a POST request with AJAX
In handling said request in my view, I use Python's requests module, e.g. r = requests.get(some_url)
uWSGI says the following:
!!! uWSGI process 13 got Segmentation Fault !!!
DAMN ! worker 1 (pid: 13) died :( trying respawn ...
Respawned uWSGI worker 1 (new pid: 24)
spawned 4 offload threads for uWSGI worker 1
The console in the browser says net::ERR_EMPTY_RESPONSE
I've tried using the requests module in different places, and wherever I put it I get the same segmentation fault. I'm also able to run everything fine outside of Docker with no errors, so I've narrowed it down to: Docker + requests module = error.
Is there something that could be blocking the requests sent with the requests module from within the docker container? Thanks in advance for your help.
Here's my uwsgi.ini file:
[uwsgi]
chdir = %d
module = my_project.wsgi:application
master = true
processes = 2
http = 0.0.0.0:8000
vacuum = true
pidfile = /tmp/my_project.pid
daemonize = %d/my_project.log
check-static = %d
static-expires = /* 7776000
offload-threads = %k
uid = 1000
gid = 1000
# there is no /etc/mime.types on the Alpine image
mime-file = %d/mime.types
Dockerfile:
FROM alpine:3.8
ENV PYTHONUNBUFFERED 1
RUN mkdir /my_project
WORKDIR /my_project
RUN apk add build-base python3-dev py3-pip python3
# deps for python cryptography
RUN apk add libffi-dev musl-dev openssl-dev
# dep for uwsgi
RUN apk add linux-headers
ADD requirements.txt /my_project/
RUN pip3 install -r requirements.txt
ADD . /my_project/
ENTRYPOINT ./start.sh
docker-compose.yml:
version: '3'
services:
web:
build: .
entrypoint: ./start.sh
volumes:
- .:/my_project
ports:
- "8000:8000"
environment:
- DEBUG_LEVEL=INFO
network_mode: "host"
start.sh:
#!/bin/sh
# uwsgi daemonizes itself (daemonize in uwsgi.ini), so tail the same log
# file it writes to keep the container's foreground process alive
echo '' > my_project.log
uwsgi --ini uwsgi.ini
tail -f my_project.log
Solution: change the base image to Ubuntu 16.04, and everything works fine now.
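For reference, a minimal sketch of the Dockerfile after that change (the Ubuntu package names are my best-guess equivalents of the Alpine ones):
FROM ubuntu:16.04
ENV PYTHONUNBUFFERED 1
RUN mkdir /my_project
WORKDIR /my_project
# glibc toolchain plus the cryptography/uwsgi build deps
RUN apt-get update && apt-get install -y \
    build-essential python3-dev python3-pip \
    libffi-dev libssl-dev
ADD requirements.txt /my_project/
RUN pip3 install -r requirements.txt
ADD . /my_project/
ENTRYPOINT ./start.sh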
I'm using this guide to set up an intranet server. Everything goes OK; the server works, and I can check that it is working on my network.
But when I log out, I get a 404 error.
The sock file is in the path indicated in gunicorn_start.
(cmi2014)javier@sgc:~/workspace/cmi/cmi$ ls -l run/
total 0
srwxrwxrwx 1 javier javier 0 mar 10 17:31 cmi.sock
I can actually see the workers when I list the processes:
(cmi2014)javier@sgc:~/workspace/cmi/cmi$ ps aux | grep cmi
javier 17354 0.0 0.2 14652 8124 ? S 17:27 0:00 gunicorn: master [cmi]
javier 17365 0.0 0.3 18112 10236 ? S 17:27 0:00 gunicorn: worker [cmi]
javier 17366 0.0 0.3 18120 10240 ? S 17:27 0:00 gunicorn: worker [cmi]
javier 17367 0.0 0.5 36592 17496 ? S 17:27 0:00 gunicorn: worker [cmi]
javier 17787 0.0 0.0 4408 828 pts/0 S+ 17:55 0:00 grep --color=auto cmi
And supervisorctl responds that the process is running:
(cmi2014)javier@sgc:~/workspace/cmi/cmi$ sudo supervisorctl status cmi
[sudo] password for javier:
cmi RUNNING pid 17354, uptime 0:29:21
There is an error in the nginx logs:
(cmi2014)javier@sgc:~/workspace/cmi/cmi$ tail logs/nginx-error.log
2014/03/10 17:38:57 [error] 17299#0: *19 connect() to
unix:/home/javier/workspace/cmi/cmi/run/cmi.sock failed (111: Connection refused) while
connecting to upstream, client: 10.69.0.174, server: , request: "GET / HTTP/1.1",
upstream: "http://unix:/home/javier/workspace/cmi/cmi/run/cmi.sock:/", host:
"10.69.0.68:2014"
Again, the error appears only when I log out or close the session; everything works fine when I run or reload supervisor and stay connected.
By the way, nginx, supervisor and gunicorn all run under my uid.
Thanks in advance.
Edit: supervisor conf
[program:cmi]
command = /home/javier/entornos/cmi2014/bin/cmi_start
user = javier
stdout_logfile = /home/javier/workspace/cmi/cmi/logs/cmi_supervisor.log
redirect_stderr = true
autostart=true
autorestart=true
Gunicorn start script
#!/bin/bash
NAME="cmi" # Name of the application
DJANGODIR=/home/javier/workspace/cmi/cmi # Django project directory
SOCKFILE=/home/javier/workspace/cmi/cmi/run/cmi.sock # we will communicate using this unix socket
USER=javier # the user to run as
GROUP=javier # the group to run as
NUM_WORKERS=3 # how many worker processes should Gunicorn spawn
DJANGO_SETTINGS_MODULE=cmi.settings # which settings file should Django use
DJANGO_WSGI_MODULE=cmi.wsgi # WSGI module name
echo "Starting $NAME as `whoami`"
# Activate the virtual environment
cd $DJANGODIR
source /home/javier/entornos/cmi2014/bin/activate
export DJANGO_SETTINGS_MODULE=$DJANGO_SETTINGS_MODULE
export PYTHONPATH=$DJANGODIR:$PYTHONPATH
export CMI_SECRET_KEY='***'
export CMI_DATABASE_HOST='***'
export CMI_DATABASE_NAME='***'
export CMI_DATABASE_USER='***'
export CMI_DATABASE_PASS='***'
export CMI_DATABASE_PORT='3306'
# Create the run directory if it doesn't exist
RUNDIR=$(dirname $SOCKFILE)
test -d $RUNDIR || mkdir -p $RUNDIR
# Start your Django Unicorn
# Programs meant to be run under supervisor should not daemonize themselves (do not use --daemon)
exec /home/javier/entornos/cmi2014/bin/gunicorn ${DJANGO_WSGI_MODULE}:application --name $NAME --workers $NUM_WORKERS --user=$USER --group=$GROUP --log-level=debug --bind=unix:$SOCKFILE
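A quick way to narrow down where the 404 comes from is to bypass nginx and hit the gunicorn socket directly (a sketch; needs curl 7.40+ for --unix-socket):
# if this returns the app while the browser gets 404 / connection refused,
# the problem is between nginx and the socket, not in gunicorn itself
curl --unix-socket /home/javier/workspace/cmi/cmi/run/cmi.sock http://localhost/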
I use Django 1.5.3 with gunicorn 18.0 and lighttpd. I serve my static and dynamic content like this using lighttpd:
$HTTP["host"] == "www.mydomain.com" {
$HTTP["url"] !~ "^/media/|^/static/|^/apple-touch-icon(.*)$|^/favicon(.*)$|^/robots\.txt$" {
proxy.balance = "hash"
proxy.server = ( "" => ("myserver" =>
( "host" => "127.0.0.1", "port" => 8013 )
))
}
$HTTP["url"] =~ "^/media|^/static|^/apple-touch-icon(.*)$|^/favicon(.*)$|^/robots\.txt$" {
alias.url = (
"/media/admin/" => "/var/www/virtualenvs/mydomain/lib/python2.7/site-packages/django/contrib/admin/static/admin/",
"/media" => "/var/www/mydomain/mydomain/media",
"/static" => "/var/www/mydomain/mydomain/static"
)
}
url.rewrite-once = (
"^/apple-touch-icon(.*)$" => "/media/img/apple-touch-icon$1",
"^/favicon(.*)$" => "/media/img/favicon$1",
"^/robots\.txt$" => "/media/robots.txt"
)
}
I already tried to run gunicorn (via supervisord) in many different ways, but I can't get it to handle more than about 1100 concurrent connections. In my project I need about 10,000-15,000 connections. These are the supervisor commands I tried:
command = /var/www/virtualenvs/myproject/bin/python /var/www/myproject/manage.py run_gunicorn -b 127.0.0.1:8013 -w 9 -k gevent --preload --settings=myproject.settings
command = /var/www/virtualenvs/myproject/bin/python /var/www/myproject/manage.py run_gunicorn -b 127.0.0.1:8013 -w 10 -k eventlet --worker-connections=1000 --settings=myproject.settings --max-requests=10000
command = /var/www/virtualenvs/myproject/bin/python /var/www/myproject/manage.py run_gunicorn -b 127.0.0.1:8013 -w 20 -k gevent --settings=myproject.settings --max-requests=1000
command = /var/www/virtualenvs/myproject/bin/python /var/www/myproject/manage.py run_gunicorn -b 127.0.0.1:8013 -w 40 --settings=myproject.settings
On the same server live about 10 other projects, but CPU and RAM are fine, so this shouldn't be a problem, right?
I ran a load test and these are the results:
At about 1100 connections my lighttpd error log says something like the following; that's where the load test shows the drop in connections:
2013-10-31 14:06:51: (mod_proxy.c.853) write failed: Connection timed out 110
2013-10-31 14:06:51: (mod_proxy.c.939) proxy-server disabled: 127.0.0.1 8013 83
2013-10-31 14:06:51: (mod_proxy.c.1316) no proxy-handler found for: /
... after about one minute
2013-10-31 14:07:02: (mod_proxy.c.1361) proxy - re-enabled: 127.0.0.1 8013
These entries also appear every now and then:
2013-10-31 14:06:55: (network_linux_sendfile.c.94) writev failed: Connection timed out 600
2013-10-31 14:06:55: (mod_proxy.c.853) write failed: Connection timed out 110
...
2013-10-31 14:06:57: (mod_proxy.c.828) establishing connection failed: Connection timed out
2013-10-31 14:06:57: (mod_proxy.c.939) proxy-server disabled: 127.0.0.1 8013 45
So how can I tune gunicorn/lighttpd to serve more connections faster? What can I optimize? Do you know of any other/better setup?
Thanks a lot in advance for your help!
Update: Some more server info
root@django ~ # top
top - 15:28:38 up 100 days, 9:56, 1 user, load average: 0.11, 0.37, 0.76
Tasks: 352 total, 1 running, 351 sleeping, 0 stopped, 0 zombie
Cpu(s): 33.0%us, 1.6%sy, 0.0%ni, 64.2%id, 0.4%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 32926156k total, 17815984k used, 15110172k free, 342096k buffers
Swap: 23067560k total, 0k used, 23067560k free, 4868036k cached
root@django ~ # iostat
Linux 2.6.32-5-amd64 (django.myserver.com) 10/31/2013 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
33.00 0.00 2.36 0.40 0.00 64.24
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 137.76 980.27 2109.21 119567783 257268738
sdb 24.23 983.53 2112.25 119965731 257639874
sdc 24.25 985.79 2110.14 120241256 257382998
md0 0.00 0.00 0.00 400 0
md1 0.00 0.00 0.00 284 6
md2 1051.93 38.93 4203.96 4748629 512773952
root@django ~ # netstat -an |grep :80 |wc -l
7129
Kernel Settings:
echo "10152 65535" > /proc/sys/net/ipv4/ip_local_port_range
sysctl -w fs.file-max=128000
sysctl -w net.ipv4.tcp_keepalive_time=300
sysctl -w net.core.somaxconn=250000
sysctl -w net.ipv4.tcp_max_syn_backlog=2500
sysctl -w net.core.netdev_max_backlog=2500
ulimit -n 10240
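For what it's worth, one direction to experiment with (a sketch, not a tested setup): keep an async worker class but raise the per-worker connection budget and listen backlog on the gunicorn side, and lift lighttpd's descriptor and connection caps so the proxy stops disabling the backend:
# supervisor command, gunicorn side: 8 workers x 2000 connections covers
# the 10,000-15,000 target on paper; the exact numbers are starting points
command = /var/www/virtualenvs/myproject/bin/python /var/www/myproject/manage.py run_gunicorn -b 127.0.0.1:8013 -w 8 -k gevent --worker-connections=2000 --backlog=2048 --max-requests=10000 --settings=myproject.settings

# lighttpd.conf, proxy side (values are assumptions to tune, not tested)
server.max-fds = 16384
server.max-connections = 8192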