Recommended settings for uwsgi - django

I have a mysql + django + uwsgi + nginx application, and I recently had some issues with uwsgi's default configuration, so I want to reconfigure it, but I have no idea what the recommended values are.
Another problem is that I couldn't find the default settings that uwsgi uses, which makes debugging really hard.
Using the default configuration, the site was too slow under real traffic (too many requests stuck waiting for the uwsgi socket). So I used a configuration from some tutorial that had cpu-affinity=1 and processes=4, which fixed the issue. The configuration also had limit-as=512, and now the app gets MemoryErrors, so I guess 512 MB is not enough.
My questions are:
How can I tell what the recommended settings are? I don't need it to be super perfect, just to handle the traffic in a decent way and not to crash from memory errors etc. Specifically, the recommended value for limit-as is what I need most right now.
What are the default values of uwsgi's settings?
Thanks!

We usually run quite small applications... rarely more than 2000 requests per minute. But anyway, it is hard to compare different applications. This is what we use in production:
Recommended by the documentation:
harakiri = 20 # respawn processes taking more than 20 seconds
limit-as = 256 # limit the project to 256 MB
max-requests = 5000 # respawn processes after serving 5000 requests
daemonize = /var/log/uwsgi/yourproject.log # background the process & log
uwsgi_conf.yml
processes: 4
threads: 4
# This part might be important too: this way you limit the log file to 200 MB and
# rotate it once
log-maxsize: 200000000
log-backupname: /var/log/uwsgi/yourproject_backup.log
We use the following project for deployment and configuration of our django apps. (No documentation here, sorry... it's just used internally.)
https://github.com/iterativ/deployit/blob/ubuntu1604/deployit/fabrichelper/fabric_templates/uwsgi.yaml
How can you tell if you have configured it correctly? Since it depends so much on your application, I would recommend using a monitoring tool such as newrelic.com and analysing the results.
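For the limit-as question specifically, one practical approach is to measure what your workers actually use and pick a value with headroom above that. A quick sketch with psutil (a third-party dependency assumed here, not something from the question; the process name may differ on your system):

import psutil  # pip install psutil

# Print the resident memory of each running uwsgi worker, to see what
# the app actually needs before picking a limit-as value
for proc in psutil.process_iter(["name", "memory_info"]):
    if proc.info["name"] == "uwsgi":
        rss_mb = proc.info["memory_info"].rss // (1024 * 1024)
        print(proc.pid, rss_mb, "MB")

Whatever numbers come out, limit-as has to comfortably exceed the per-worker peak; otherwise you trade slowness for MemoryErrors, exactly as the question describes.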

Related

Potential Django/Celery/amqp threading issues

I'm currently working with a system which provides a Django backend to serve up a REST API.
In addition to this we provide updates to a RabbitMQ instance using celery upon change to records within the Django app.
uwsgi is used to host multiple (5) instances of the Django backend, and also has an attach-daemon setting which launches 5 celery worker instances, i.e.:
attach-daemon = %(home)/bin/celery --app=djangoProject.worker.app worker -c 5 -l INFO
We have recently added some functionality which increases the rate of updates in some circumstances, and have discovered we are now regularly getting UnexpectedFrame errors from within the amqp library:
amqp.exceptions.UnexpectedFrame: Received 0x00 while expecting 0xce
We suspect this is some form of threading issue. We're looking for any advice on how to overcome this sort of problem, or at least how to further diagnose where the fault lies.

Optimizing nginx requests per second

I'm trying to optimize a single core 1GB ram Digital Ocean VPS to handle more requests per second. After some tweaking (workers/gzip etc.) it serves about 15 requests per second. I don't have anything to compare it with but I think this number can be higher.
The stack works like this:
VPS -> Docker container -> nginx (ssl) -> Varnish -> nginx -> uwsgi (Django)
I'm aware that this is a long chain and that Docker might cause some overhead. However, almost all requests can be handled by Varnish.
These are my tests results:
ab -kc 100 -n 1000 https://mydomain | grep 'Requests per second'
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Requests per second: 18.87 [#/sec] (mean)
I actually have 3 questions:
Am I correct that 18.87 requests per second is low?
For a simple Varnished Django blog app, what would be an adequate value (indication)?
I already applied the recommended tweaks (tuned for my system) from this tutorial. What more can I tweak, and how do I figure out the bottlenecks?
First, a note about Docker. It is not meant to run multiple processes in a single Docker container. Docker is not a replacement for a VM; it simply allows processes to run in isolation. So the Docker diagram should be:
VPS -> docker nginx container -> docker varnish container -> docker django container
To make your life with multiple Docker containers simpler, I would recommend using Docker Compose. It is not perfect, but it's an excellent start.
There is an old but still fundamentally relevant blog post about this. Note that some of its suggestions are no longer relevant (like nsenter, since the docker exec command is now available), but most of the post is still correct.
As for your performance issues: yes, 18 requests per second is pretty low. However, the issue probably has nothing to do with nginx; it is most likely in your Django application, or possibly Varnish (though that is very unlikely).
To debug performance issues in Django, I would recommend django-debug-toolbar. Most issues in Django are caused by unnecessary SQL queries, and you can see them easily in the debug toolbar. To solve most of them you can use select_related() and prefetch_related(). For more detailed analysis, I would also recommend profiling your application. cProfile is a great start. Some IDEs like PyCharm also include built-in profilers, so it's pretty easy to profile your application and see which functions take most of the time so you can optimize them. Finally, you can use third-party tools to profile your application. Even a free newrelic account will give you quite a bit of information. Alternatively you can use opbeat, which is the new cool kid on the block.
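For example, the classic N+1 query pattern that the debug toolbar makes obvious can usually be fixed with a single related-object hint. A minimal sketch (the Post and Author models here are hypothetical, assuming Post has a ForeignKey to Author):

from myapp.models import Author, Post  # hypothetical app and models

# N+1 pattern: one query for the posts, then one extra query per post
for post in Post.objects.all():
    print(post.author.name)

# select_related() turns it into a single query with a JOIN
for post in Post.objects.select_related("author"):
    print(post.author.name)

# prefetch_related() batches reverse/many-to-many lookups into one
# extra query instead of one per row
for author in Author.objects.prefetch_related("post_set"):
    print(author.name, [p.title for p in author.post_set.all()])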

Configure uwsgi server for performance

I am deploying a uwsgi server for a Django app. Each request has a latency of around 2 seconds, and I need to handle 100 QPS. On a 4-core machine, how should I configure the number of processes and the number of threads? I have tried playing with the values, but I do not understand what I am doing.
Go through the uWSGI Things to know page. 100 requests per second should be easily attainable with uWSGI.
Based on uWSGI behavior I've experienced, I would recommend that you start with only processes and don't use any threads. With both processes and threads, we observed that there seemed to be an affinity to use threads over processes. That resulted in a single process handling all requests until its thread pool was fully occupied, and only then were requests handled by the next process. This resulted in poor utilization of resources, as a single core was maxed out with all others idle. Turning off threading resulted in a massive performance boost for our particular use model.
Your experience may be different. The uWSGI authors stress that there isn't any magic config combination; it's completely dependent on your particular use case. You need to benchmark your app against various configurations to find the sweet spot. Additionally, unless you're able to use benchmarks that perfectly model your actual production load, you'll want to keep monitoring performance and methodically tweaking settings after you deploy.
From the Things to know page:
There is no magic rule for setting the number of processes or threads to use. It is very much application and system dependent. Simple math like processes = 2 * cpucores will not be enough. You need to experiment with various setups and be prepared to constantly monitor your apps. uwsgitop could be a great tool to find the best values.
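That said, Little's law gives a rough lower bound to start experimenting from. A back-of-the-envelope sketch, using only the numbers from the question above:

# Little's law: concurrency = arrival rate x time in system
qps = 100        # target throughput (from the question)
latency = 2.0    # seconds per request (from the question)

in_flight = qps * latency   # requests in progress at any instant
print(in_flight)            # 200.0

# Each synchronous uWSGI worker serves one request at a time, so you
# would need ~200 workers/threads to sustain 100 QPS at 2 s latency.
# That is plausible if the 2 seconds are mostly I/O wait, but not
# achievable on 4 cores if they are CPU time -- which is why
# benchmarking your actual workload matters.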

How to improve the number of requests Django on Heroku can handle at a time?

In my project I use Django, and Heroku to deploy it. On Heroku I use the uWSGI server (in asynchronous mode); the database is MySQL (on AWS RDS). I use 7 dynos to scale the Django app.
When I run a stress test at 600 requests/second with a 30-second timeout, my server returns timeouts for more than 50% of the requests.
Any ideas to help me improve my server's performance?
If your async setup is right (and this is the hardest part), then your only solution is adding more dynos. If you are not sure about django+async (or if you have not done any particular customization to make them work together), you probably have a broken setup (no concurrency at all).
Take into account that uWSGI async mode can mean dozens of different setups (gevent, ugreen, callbacks, greenlets...), so some detail on your configuration would help.
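As an illustration of the kind of customization meant here: with uWSGI's gevent loop, blocking libraries must be monkey-patched, or each request still ties up a whole worker. A minimal sketch of a gevent-friendly WSGI entry point (this assumes the gevent flavor of async mode; "myproject" is a placeholder module name):

# wsgi.py -- the patch must run before anything imports socket-using
# modules, otherwise DB/HTTP calls still block the whole worker
from gevent import monkey
monkey.patch_all()

import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()

uWSGI's --gevent option then sets the number of async cores, and it also offers a gevent-monkey-patch option if you prefer the server to apply the patch itself.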

CPU load and a Django application that makes long-response-time requests to an external API

I'm developing a web application in Python in which each user request makes an API call to an external service and takes about 20 seconds to receive a response. As a result, when several concurrent requests are made, the CPU load goes crazy (>95%) with several idle processes.
The server consists of a 1.6 GHz dual-core Atom 330 with 2 GB RAM.
The web app is developed in Python and served through Apache with mod_wsgi.
My question is the following: will a non-blocking webserver such as Tornado improve CPU load and thus handle more concurrent users (I'm also interested in why)? Can you suggest any other scalable solution?
This really doesn't have anything to do with blocking; well, it does, but it doesn't. A 20-second request blocks one thread, so another has to be utilized for the next request, whereas with quick requests the threads basically round-robin.
However, this really shouldn't be spiking your CPU. Webservers have an upper limit on the number of "workers" that get spawned, and when they're all tied up, they're all tied up. The server won't go past that limit, so unless you've set it (or the default is) higher than what the box is capable of running, it shouldn't push your CPU that high.
Regardless, all of that is merely informational and doesn't really solve your problem. With such a long-running request, you should be offloading it from your webserver as quickly as possible. The webserver should merely hand the request off to another process that can handle it asynchronously, and then employ polling to notify the client when the response is ready. Node.js is used a lot in similar scenarios, but I really don't have enough experience with it to give you any real guidance beyond that.
You should look into using message queues to offload tasks so that your user requests are not blocked.
You could look into the Python libraries kombu and celery to handle messages and tasks.
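A minimal sketch of that hand-off-and-poll pattern with celery (the task, views, broker URL, and external API URL here are all placeholders, not anything from the question):

# tasks.py
import requests
from celery import Celery

app = Celery("myapp", broker="amqp://localhost")

@app.task
def call_external_api(payload):
    # the slow (~20 s) call now runs in a celery worker process,
    # not in an Apache/mod_wsgi thread
    return requests.post("https://api.example.com/slow", json=payload).json()

# views.py
from celery.result import AsyncResult
from django.http import JsonResponse

from tasks import app, call_external_api

def start(request):
    result = call_external_api.delay({"user": request.user.pk})
    return JsonResponse({"task_id": result.id})  # the client polls with this id

def poll(request, task_id):
    result = AsyncResult(task_id, app=app)
    if result.ready():
        return JsonResponse({"done": True, "data": result.get()})
    return JsonResponse({"done": False})

The webserver thread is now occupied only for the milliseconds it takes to enqueue the task, so a handful of workers can absorb many concurrent slow requests.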
You are likely using the prefork MPM with Apache and mod_wsgi embedded mode. This is a bad combination by default, because Apache is set up for PHP, not fat Python web applications. Read:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
which explains this exact sort of issue.
Use mod_wsgi daemon mode at the minimum, and preferably also change to worker MPM for Apache.