Django IO-bound performance drop with gunicorn + gevent

I have been working on an IO-bound application written in Django (only simple queries to the database for fetching or updating info). Since the application is mostly DB and Redis queries, I decided to use gunicorn with the async gevent worker class. The weird part is that even though I'm running gunicorn with gevent (and monkey-patched the DB for that purpose), I'm not seeing any performance gain from it; in fact, both requests/s and response time have gotten worse.
Steps taken
To do so, I have done the following:
Installed greenlet, gevent & psycogreen as instructed in the official docs.
Patched psycopg2 with psycogreen.gevent.patch_psycopg (tried wsgi.py, settings.py and post_fork in gunicorn), but to no avail.
Tried running gevent's monkey.patch_all manually, also to no effect.
This is my gunicorn.config.py:
import gunicorn

gunicorn.SERVER_SOFTWARE = ""
gunicorn.SERVER = ""

bind = "0.0.0.0:8000"
worker_class = "gevent"
worker_connections = 1000  # tested with different values
keepalive = 2  # tested with different values
workers = 10  # tested with different values
loglevel = "info"
access_log_format = 'sport %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'

def post_fork(server, worker):  # also tried pre_fork and on_starting
    from psycogreen.gevent import patch_psycopg
    patch_psycopg()
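For what it's worth, one way to confirm the patch actually took is to check psycopg2's wait callback after calling patch_psycopg; this is a hedged sketch, not part of the original post:
# Sketch: verify psycopg2 is gevent-friendly after patching.
import psycopg2.extensions
from psycogreen.gevent import patch_psycopg

patch_psycopg()
# patch_psycopg installs a wait callback; None would mean blocking DB I/O.
assert psycopg2.extensions.get_wait_callback() is not None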
Result
But the app only got slower. Also note that I am using django_prometheus.db.backends.postgresql as the DB backend and django-redis (which uses the gevent-friendly redis-py) for cache, with no other notable third-party library.
Test method
To check and compare performance, I ran the app with several different configs and ran a load test each time against one API endpoint (the API simply queries the database for a list of objects and serializes it). As for specs, I'm running the app on my Intel© Core™ i7-9750HF CPU @ 2.60GHz × 6 laptop with 16GB of memory. For the load test, I am using hey like this:
hey -n 1000 -c 100 -t 0 <api>
And this is the result for sync and async tests:
sync class result
Requests/sec: 101.4230
99% in 1.6029 secs
gevent class result
Requests/sec: 87.9190
99% in 5.2386 secs
I would appreciate it if anyone has an idea of what could be wrong with my config or application.

Related

How to stop Django 3.0 leaking db connections?

In my requirements.txt I only changed django==2.2.17 to django==3.0 (or 3.1.4),
and the gunicorn webserver started leaking postgres DB connections.
(Every request increases the number of connections when I check the list of clients in pgbouncer.)
I use python 3.7 and redis (via django-redis).
How can I stop the leakage?
Is there any way to limit the number of connections to a server?
Update
The leaks also happen with django==2.2.17 if I set 'CONN_MAX_AGE': None, even if I go directly to postgres, avoiding pgbouncer.
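For reference, a minimal sketch of where CONN_MAX_AGE lives; the values are illustrative, not the asker's actual settings:
# settings.py -- persistent connections are controlled per database.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # illustrative
        # 0 closes the connection after each request, an integer keeps it
        # open for that many seconds, and None keeps it open indefinitely.
        "CONN_MAX_AGE": 0,
    }
}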

Flask / UWSGI architecture with worker thread

The following code approximates the real code. The Flask app creates a worker thread on startup; the routing function uses data produced by the worker function.
import os
import platform
import time
from datetime import datetime
from threading import Thread

from flask import Flask

app = Flask(__name__)
timeStr = ""

def loop():
    global timeStr
    while True:
        time.sleep(2)
        timeStr = datetime.now().replace(microsecond=0).isoformat()
        print(timeStr)

ThreadID = Thread(target=loop)
ThreadID.daemon = True
ThreadID.start()

@app.route('/')
def test():
    return os.name + " " + platform.platform() + " " + timeStr

application = app

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8080, debug=True)
The above app works beautifully for days when started likes this:
python3 app.py
However, under uwsgi, even though I have enabled threads, the app is not working: it's not updating the global timeStr.
sudo /usr/local/bin/uwsgi --wsgi-file /home/pi/pyTest/app.py --http :80 --touch-reload /home/pi/pyTest/app.py --enable-threads --stats 127.0.0.1:9191
What do I need to do for the app to function correctly under uWSGI, so that I can create a systemd service the proper way?
Bad news, good news.
I have an app that starts a worker thread in much the same way. It uses a queue.Queue to let routes pass work to the worker thread (a sketch of this pattern appears at the end of this answer). The app has been running happily on my home intranet (on a Raspberry Pi) using the Flask development server. I tried putting my app behind uwsgi and observed the same failure: the worker thread didn't appear to get scheduled. The thread reported is_alive() as True, but I couldn't find a uwsgi switch combination that let it actually run.
Using gunicorn resolved the issue.
virtualenv venv --python=python3
. venv/bin/activate
pip install flask gunicorn
gunicorn -b 0.0.0.0:5000 demo:app
was enough to get my app to work (meaning the worker thread actually ran, and side-effects were noticeable).
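For reference, here's a minimal sketch of the queue.Queue hand-off pattern mentioned above (the names and the route are hypothetical, not from the original app):
import queue
import threading

from flask import Flask

app = Flask(__name__)
work_queue = queue.Queue()

def worker():
    while True:
        job = work_queue.get()  # blocks until a route enqueues work
        print("processing", job)
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

@app.route("/enqueue/<name>")
def enqueue(name):
    work_queue.put(name)
    return "queued " + name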

Wsgi number of process and threads setting in AWS Beanstalk

I have an AWS Beanstalk env with an old WSGI setting (given below). I have no idea how this works internally; can anybody guide me?
NumProcesses: 7 -- number of processes
NumThreads: 5 -- number of threads in each process
How are memory and CPU used with this configuration, given that there are no memory or CPU settings at the AWS Beanstalk level?
These parameters are part of the configuration options for the Python environment, in the
aws:elasticbeanstalk:container:python namespace.
They mean (from docs):
NumProcesses: The number of daemon processes that should be started for the process group when running WSGI applications (default value 1).
NumThreads: The number of threads to be created to handle requests in each daemon process within the process group when running WSGI applications (default value 15).
Internally, these values map to uwsgi or gunicorn configuration options in your EB environment. For example:
uwsgi --http :8000 --wsgi-file application.py --master --processes 4 --threads 2
Their impact on memory and cpu usage of your instance(s) is based on your application and how resource intensive it is. If you are not sure how to set them up, maybe keeping them at default values would be a good start.
The settings are also available in the EB console, under the Software category.
To add on to @Marcin's answer:
Amazon Linux 2 uses gunicorn.
Workers are processes in gunicorn.
Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally, we (gunicorn creators) recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
To see how the option settings map to gunicorn, you can ssh into your EB instance:
$ eb ssh
$ cd /var/app/current/
$ cat Procfile
web: gunicorn --bind 127.0.0.1:8000 --workers=3 --threads=20 api.wsgi:application
--threads
A positive integer generally in the 2-4 x $(NUM_CORES) range. You’ll want to vary this a bit to find the best for your particular application’s work load.
The threads option only applies to the gthread worker type. Gunicorn's default worker class is sync; if you use the sync worker type and set the threads setting to more than 1, the gthread worker type will be used instead automatically.
Based on all of the above, I would personally choose:
workers = (2 x $NUM_CORES) + 1
threads = 4 x $NUM_CORES
For a t3.medium instance, which has 2 cores, that translates to:
workers = 5
threads = 8
Obviously, you need to tweak this for your use case and treat these as defaults that could very well not be right for your particular application; read the refs below to see how to choose the right setup for your use case. A sketch of this sizing rule as a gunicorn config follows.
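A minimal sketch of that sizing rule as a gunicorn config file (the bind address is illustrative):
# gunicorn.conf.py -- derive workers/threads from the core count.
import multiprocessing

cores = multiprocessing.cpu_count()
workers = (2 * cores) + 1  # e.g. 5 on a 2-core t3.medium
threads = 4 * cores        # e.g. 8 on a 2-core t3.medium
# threads > 1 switches gunicorn from the sync to the gthread worker.
bind = "127.0.0.1:8000"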
References:
REF: Gunicorn Workers and Threads
REF: https://medium.com/building-the-system/gunicorn-3-means-of-concurrency-efbb547674b7
REF: https://docs.gunicorn.org/en/stable/settings.html#worker-class

Eventlet is_monkey_patched issues False

Hello, I have a Django app. My whole system configuration is the following: Python 3, Django 1.11, eventlet 0.21.0.
1) Nginx as an upstream server:
upstream proj_server {
    server unix:///tmp/proj1.sock fail_timeout=0;
    server unix:///tmp/proj2.sock fail_timeout=0;
}
2) Supervisor that controls workers. There is a gunicorn worker:
[program:proj]
command=/home/vagrant/.virtualenvs/proj/bin/gunicorn -c /vagrant/proj/proj/proj/deploy/gunicorn.small.conf.py proj.wsgi:application
directory=/vagrant/proj/proj/proj/deploy
user=www-data
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/proj.log
3) This is the gunicorn.small.conf.py content:
bind = ["unix:///tmp/proj1.sock", "unix:///tmp/proj2.sock"]
pythonpath = "/vagrant/proj/proj/proj/deploy"
workers = 2
worker_class = "eventlet"
worker_connections = 10
timeout = 60
graceful_timeout = 60
4) And this is proj.wsgi content:
"""
WSGI config for proj project.
This module contains the WSGI application used by Django's development server
and any production WSGI deployments. It should expose a module-level variable
named ``application``. Django's ``runserver`` and ``runfcgi`` commands discover
this application via the ``WSGI_APPLICATION`` setting.
Usually you will have the standard Django WSGI application here, but it also
might make sense to replace the whole Django WSGI application with a custom one
that later delegates to the Django one. For example, you could introduce WSGI
middleware here, or combine a Django application with an application of another
framework.
"""
import eventlet
eventlet.monkey_patch()
from eventlet import wsgi
import django.core.handlers.wsgi
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "proj.settings")
# This application object is used by any WSGI server configured to use this
# file. This includes Django's development server, if the WSGI_APPLICATION
# setting points here.
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
# Apply WSGI middleware here.
# from helloworld.wsgi import HelloWorldApplication
# application = HelloWorldApplication(application)
So, as you can see, there is a chain: nginx as an upstream server calls one of the gunicorn eventlet workers through the two sockets proj1.sock and proj2.sock.
Note that, according to the eventlet documentation, I try to call eventlet.monkey_patch() as early as possible. The most appropriate place for that is proj.wsgi, which is the first thing gunicorn calls.
However, it seems the libraries aren't monkey patched.
To check this, I added the following code to proj/proj/proj/__init__.py (the first module called by the Django application):
import eventlet
import os
print("monkey patched os is: " + str(eventlet.patcher.is_monkey_patched('os')))
print("monkey patched select is: " + str(eventlet.patcher.is_monkey_patched('select')))
print("monkey patched socket is: " + str(eventlet.patcher.is_monkey_patched('socket')))
print("monkey patched time is: " + str(eventlet.patcher.is_monkey_patched('time')))
print("monkey patched subprocess is: " + str(eventlet.patcher.is_monkey_patched('subprocess')))
Then I issued ./manage.py check and got this answer:
monkey patched os is: False
monkey patched select is: False
monkey patched socket is: False
monkey patched time is: False
monkey patched subprocess is: False
What am I doing wrong?
What if you change the proj.wsgi file content to a single line, raise Exception? That should eliminate eventlet from the suspects.
I'm not good with Django, so here's pure speculation:
Based on its name, proj.wsgi is executed when the WSGI server is about to start.
manage.py check doesn't seem to be related to a remote network service (WSGI); it seems to be a general management command, so it shouldn't execute WSGI-related code.
One possible solution, taken from your question text:
proj/proj/proj/__init__.py (the first module called by the Django application)
Try to put the monkey_patch call in there, as in the sketch below.
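A minimal sketch of that suggestion (the path is taken from the question):
# proj/proj/proj/__init__.py -- patch before anything else imports sockets, etc.
import eventlet
eventlet.monkey_patch()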
P.S.: you don't need supervisor for gunicorn; its master process (the arbiter) is designed to run forever in spite of problems with workers.

uWSGI downtime when restart

I have a problem with uwsgi every time I restart the server after a code update.
When I restart uwsgi using sudo restart accounting, there's a small gap between the stop and start of the instance that results in downtime and kills all in-flight requests.
When I try sudo reload accounting, it works, but my memory usage doubles. When I run ps aux | grep accounting, it shows that I have 10 running processes (accounting.ini) instead of 5, and it freezes up my server when the memory hits the limit.
accounting.ini
I am running
Ubuntu 14.04
Django 1.9
nginx 1.4.6
uwsgi 2.0.12
This is how uwsgi does a graceful reload: it keeps the old processes alive until their requests are served and creates new ones that take over incoming requests.
Read Things that could go wrong
Do not forget, your workers/threads that are still running requests
could block the reload (for various reasons) for more seconds than
your proxy server could tolerate.
And this
Another important step of graceful reload is to avoid destroying
workers/threads that are still managing requests. Obviously requests
could be stuck, so you should have a timeout for running workers (in
uWSGI it is called the “worker’s mercy” and it has a default value of
60 seconds).
So I would recommend trying worker-reload-mercy.
The default value is to wait 60 seconds; lower it to something that your server can handle.
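A hedged sketch of where that option would go, assuming your existing accounting.ini (the value is illustrative):
# accounting.ini -- lower the reload mercy window from the 60-second default
worker-reload-mercy = 30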
Tell me if it worked.
Uwsgi chain reload
This is another attempt to fix your issue. As you mentioned, your uwsgi workers restart in the manner described below:
send SIGHUP signal to the master
Wait for running workers.
Close all of the file descriptors except the ones mapped to sockets.
Call exec() on itself.
One of the cons of this kind of reload might be stuck workers.
Additionally, you report that your server crashes when uwsgi maintains 10 processes (5 old and 5 new ones).
I propose trying chain reload. A direct quote from the documentation explains this kind of reload best:
When triggered, it will restart one worker at time, and the following worker is not reloaded until the previous one is ready to accept new requests.
It means that you will not have 10 processes on your server but only 5.
Config that should work:
# your .ini file
lazy-apps = true
touch-chain-reload = /path/to/reloadFile
Some resources on chain reloading and other reload modes are linked below:
Chain reloading uwsgi docs
uWSGI graceful Python code deploy