How to stop Django 3.0 leaking db connections?

In my requirements.txt I only changed django==2.2.17 to django==3.0 (or 3.1.4),
and the gunicorn web server started leaking postgres DB connections.
(Every request increases the number of connections when I check the list of clients in pgbouncer.)
I use python 3.7 and redis (via django-redis).
How can I stop the leakage?
Is there any way to limit the number of connections to a server?
Update
The leaks also happen with django==2.2.17 if I set 'CONN_MAX_AGE': None, even if I go directly to postgres, avoiding pgbouncer.
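For reference, Django's persistent connections are controlled by CONN_MAX_AGE: None keeps connections open forever, while a finite value (or 0) lets Django close and recycle them at the end of each request. A minimal settings sketch, with illustrative names and values:

# settings.py (sketch) -- CONN_MAX_AGE: None never closes connections;
# a finite value bounds their lifetime, 0 closes them after every request.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",         # illustrative
        "HOST": "127.0.0.1",    # pgbouncer or postgres host
        "PORT": "6432",
        "CONN_MAX_AGE": 60,     # seconds
    }
}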

Related

Django IO bound performance drop with gunicorn + gevent

I have been working on an IO-bound application written in Django (only simple database queries for getting or updating info). As the application is mostly DB and Redis queries, I decided to use gunicorn with the async gevent worker class, but the weird part is that even though I'm running gunicorn with gevent (and monkey-patched the DB for that purpose), I'm not seeing any performance gain from it; in fact, requests/s has dropped and response times have gone up.
Steps taken
To do so, I have done the following:
Installed greenlet, gevent & psycogreen as instructed in official docs.
Patched psycopg2 with psycogreen.gevent.patch_psycopg (tried wsgi.py, settings.py and post_fork in gunicorn), but to no avail.
Tried running gevent's monkey.patch_all manually, also to no effect.
This is my gunicorn.config.py:
import gunicorn
gunicorn.SERVER_SOFTWARE = ""
gunicorn.SERVER = ""
bind = "0.0.0.0:8000"
worker_class = 'gevent'
worker_connections = 1000 # Tested with different values
keepalive = 2 # Tested with different values
workers = 10 # Tested with different values
loglevel = 'info'
access_log_format = 'sport %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
def post_fork(server, worker):  # also tried pre_fork and on_starting
    from psycogreen.gevent import patch_psycopg
    patch_psycopg()
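As a sanity check, psycogreen's patch_psycopg installs a wait callback via psycopg2.extensions.set_wait_callback, so a rough way to confirm the patch is active inside a worker is to read that callback back; a minimal sketch with an illustrative view name:

# debug view (sketch): a non-None wait callback means the gevent patch
# is active in this worker process.
import psycopg2.extensions
from django.http import JsonResponse

def gevent_patch_status(request):
    return JsonResponse({"wait_callback": repr(psycopg2.extensions.get_wait_callback())})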
Result
But the app only got slower. Also note that I am using django_prometheus.db.backends.postgresql as the DB backend, django-redis (which uses the gevent-friendly redis-py) for cache, and no other notable third-party libraries.
Test method
To check and compare performance, I ran the app with several different configs and ran a load test each time against one API endpoint (the API simply queries the database for a list of objects and serializes it). As for specs, I'm running the app on my laptop with an Intel Core i7-9750HF CPU @ 2.60GHz × 6 and 16GB of memory. For the load test, I am using "hey" like this:
hey -n 1000 -c 100 -t 0 <api>
And this is the result for sync and async tests:
sync class result
Requests/sec: 101.4230
99% in 1.6029 secs
gevent class result
Requests/sec: 87.9190
99% in 5.2386 secs
I would appreciate any ideas about what could be wrong with my config or application.

Wsgi number of process and threads setting in AWS Beanstalk

I have an AWS Elastic Beanstalk environment with an old WSGI setting (given below). I have no idea how this works internally; can anybody guide me?
NumProcesses: 7 -- number of processes
NumThreads: 5 -- number of threads in each process
How are memory and CPU used with this configuration, given that there are no memory or CPU settings at the Elastic Beanstalk level?
These parameters are part of the configuration options for the Python environment:
aws:elasticbeanstalk:container:python.
They mean (from docs):
NumProcesses: The number of daemon processes that should be started for the process group when running WSGI applications (default value 1).
NumThreads: The number of threads to be created to handle requests in each daemon process within the process group when running WSGI applications (default value 15).
Internally, these values map to uwsgi or gunicorn configuration options in your EB environment. For example:
uwsgi --http :8000 --wsgi-file application.py --master --processes 4 --threads 2
Their impact on the memory and CPU usage of your instance(s) depends on your application and how resource-intensive it is. If you are not sure how to set them, keeping them at the default values would be a good start.
The settings are also available in the EB console, under the Software category.
To add on to @Marcin's answer:
Amazon Linux 2 uses gunicorn.
Workers are processes in gunicorn.
Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally, we (gunicorn creators) recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
To see how the option settings map to gunicorn, you can SSH into your EB instance and inspect the Procfile:
$ eb ssh
$ cd /var/app/current/
$ cat Procfile
web: gunicorn --bind 127.0.0.1:8000 --workers=3 --threads=20 api.wsgi:application
--threads
A positive integer generally in the 2-4 x $(NUM_CORES) range. You’ll want to vary this a bit to find the best for your particular application’s work load.
The threads option only applies to the gthread worker type. Gunicorn's default worker class is sync; if you use the sync worker type and set the threads setting to more than 1, the gthread worker type will be used instead automatically.
Based on all the above, I would personally choose:
workers = (2 x $NUM_CORES) + 1
threads = 4 x $NUM_CORES
For a t3.medium instance that has 2 cores, that translates to:
workers = 5
threads = 8
Obviously, you need to tweak this for your use case and treat these as defaults that could very well not be right for your particular application; read the refs below to see how to choose the right setup for your use case.
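As a sketch, those formulas could be expressed directly in a gunicorn config file (assuming a recent gunicorn that picks up gunicorn.conf.py from the working directory); on a 2-core t3.medium this yields the 5/8 values above:

# gunicorn.conf.py (sketch) -- derive workers/threads from the core count
import multiprocessing

cores = multiprocessing.cpu_count()
workers = (2 * cores) + 1      # 5 on 2 cores
threads = 4 * cores            # 8 on 2 cores
worker_class = "gthread"       # threads > 1 implies gthread anyway
bind = "127.0.0.1:8000"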
References:
REF: Gunicorn Workers and Threads
REF: https://medium.com/building-the-system/gunicorn-3-means-of-concurrency-efbb547674b7
REF: https://docs.gunicorn.org/en/stable/settings.html#worker-class

how to kill gunicorn server inside request response cycle

I have one Django application and one Flask application.
They're running with two Gunicorn servers inside Docker containers.
My goal is that if the database connection fails (if the DB is down), I want to kill the Django app and the Flask app.
How could I do this?
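One possible direction (a sketch, not a verified solution): each gunicorn worker is forked from the master process, so a worker can send SIGTERM to its parent to trigger a graceful shutdown of the whole server. A hypothetical Django middleware along those lines (the class name and response text are made up; the Flask app would need a similar before_request hook):

# middleware sketch: if the database is unreachable, ask the gunicorn
# master (this worker's parent process) to shut down gracefully.
import os
import signal

from django.db import connection
from django.db.utils import OperationalError
from django.http import HttpResponse

class KillOnDatabaseFailureMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            connection.ensure_connection()  # raises OperationalError when the DB is down
        except OperationalError:
            os.kill(os.getppid(), signal.SIGTERM)  # SIGTERM asks the gunicorn master to stop
            return HttpResponse("database unavailable, shutting down", status=503)
        return self.get_response(request)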

uWSGI downtime when restart

I have a problem with uwsgi every time I restart the server when I have code updates.
When I restart uwsgi using "sudo restart accounting", there's a small gap between stopping and starting the instance that results in downtime and kills all current requests.
When I try "sudo reload accounting", it works, but my memory usage doubles. When I run "ps aux | grep accounting", it shows that I have 10 running processes (accounting.ini) instead of 5, and it freezes up my server when the memory hits the limit.
accounting.ini
I am running
Ubuntu 14.04
Django 1.9
nginx 1.4.6
uwsgi 2.0.12
This is how uWSGI does a graceful reload: it keeps the old processes until their requests are served and creates new ones that will take over incoming requests.
Read "Things that could go wrong":
Do not forget, your workers/threads that are still running requests
could block the reload (for various reasons) for more seconds than
your proxy server could tolerate.
And this
Another important step of graceful reload is to avoid destroying
workers/threads that are still managing requests. Obviously requests
could be stuck, so you should have a timeout for running workers (in
uWSGI it is called the “worker’s mercy” and it has a default value of
60 seconds).
So I would recommend trying worker-reload-mercy.
The default value is to wait 60 seconds; just lower it to something that your server can handle.
Tell me if it worked.
uWSGI chain reload
This is another attempt to fix your issue. As you mentioned, your uwsgi workers are restarting in the manner described below:
Send a SIGHUP signal to the master.
Wait for running workers.
Close all of the file descriptors except the ones mapped to sockets.
Call exec() on itself.
One of the cons of this kind of reload might be stuck workers.
Additionally, you report that your server crashes when uwsgi maintains 10 processes (5 old and 5 new ones).
I propose trying a chain reload. A direct quote from the documentation explains this kind of reload best:
When triggered, it will restart one worker at time, and the following worker is not reloaded until the previous one is ready to accept new requests.
It means that you will not have 10 processes on your server but only 5.
Config that should work:
# your .ini file
lazy-apps = true
touch-chain-reload = /path/to/reloadFile
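The chain reload is then triggered by touching that file, i.e. updating its modification time; for example, from a Python deploy step (path taken from the config above):

# trigger the chain reload by bumping the mtime of the touch file
from pathlib import Path

Path("/path/to/reloadFile").touch()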
Some resources on chain reloading and other kinds of reload are linked below:
Chain reloading uwsgi docs
uWSGI graceful Python code deploy

Nginx "Broken pipe" When Debugging Django?

I have a Django site that uses Gunicorn and Nginx. Occasionally, I'll have a problem that I need to debug. In the past, I would shut down Gunicorn and Nginx, go to my Django project directory and start the Django development server ("python ./manage.py runserver 0:8000"), and then restart Nginx. I could then insert set_trace() commands and do my debugging. When I fixed the problem I'd shut down Nginx and then restart Gunicorn and Nginx. I'm pretty sure this was working.
Recently, though, I've begun having problems. What happens now is that when I've stopped at a breakpoint, after a couple of minutes the web page that I've stopped on will change and display "404 Not Found" and if I take another step in the debugger, I'll see this error:
- Broken pipe from ('127.0.0.1', 43742)
This happens on my development, staging, and production servers which I'm accessing via their domain names, e.g. "web01.example.com" (not really example).
What is the correct way to debug my Django application on my remote servers?
Thanks.
I figured out the problem. First I observed that when I stopped at a breakpoint, the page always timed out after exactly one minute, which suggested that the Nginx connection to the web server was timing out whenever the web server took more than 60 seconds to respond. I then found the Nginx proxy_read_timeout directive, which defines this timeout. Then it was merely a matter of changing the length of the timeout in my Nginx config file:
# /etc/nginx/sites-enabled/example.conf
http {
    server {
        ...
        location @django {
            ...
            # Set timeout to 1 hour
            proxy_read_timeout 3600s;
            ...
        }
        ...
    }
}
Once you've made this change, you need to reload Nginx, not restart it, for the change to take effect. Then start Django as I indicated above, and you can debug your Django application without it timing out. Just be sure to remove the timeout setting when you're done debugging, reload Nginx again, and restart Gunicorn.