How to profile Django's bottlenecks for scaling?

I am using Django and Tastypie for a REST API.
For profiling I am using django-silk; below is a summary of requests:
How do I profile the complete flow? The time taken outside of database queries is (382 - 147) ms on average. How do I find the bottleneck and optimize/scale? I did use @silk_profile() on the get_object_list method for this resource, but even that method doesn't seem to be the bottleneck.
I used caching to bring the response time down, but it didn't help much. What are the other options?
When testing with loader.io, the peak the server can handle is 1000 requests per 30 seconds (which seems very low). Other than caching (which I already tried), what might help?
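For reference, this is roughly how the silk_profile hook mentioned above is attached; a minimal sketch in which the resource and method mirror the question but are otherwise hypothetical:

    # Attach django-silk's profiler to a single Tastypie resource method.
    from silk.profiling.profiler import silk_profile
    from tastypie.resources import ModelResource

    class EntryResource(ModelResource):
        @silk_profile(name="get_object_list")
        def get_object_list(self, request):
            # Silk records a profile for every call to this method.
            return super().get_object_list(request)

silk_profile can also be used as a context manager (with silk_profile(name="block"):) to profile an arbitrary block inside a view.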

Here's a bunch of suggestions:
Bring the number of queries down to fewer than 5 per request (34 per request is really bad); see the sketch below this list.
Install the Django Debug Toolbar and look at where the time is spent.
Run Gunicorn or uWSGI behind a reverse proxy (nginx).
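On the first point, the usual culprit for a high query count is the N+1 pattern, which select_related (for foreign keys) and prefetch_related (for many-to-many relations) collapse. A minimal sketch with hypothetical Author/Book models:

    from django.db import models

    class Author(models.Model):
        name = models.CharField(max_length=100)

    class Book(models.Model):
        title = models.CharField(max_length=200)
        author = models.ForeignKey(Author, on_delete=models.CASCADE)

    # 1 + N queries: one for the books, then one per book for its author.
    for book in Book.objects.all():
        print(book.author.name)

    # 1 query: the author rows are fetched in the same query via a JOIN.
    for book in Book.objects.select_related("author"):
        print(book.author.name)

In a Tastypie resource this would typically be applied to the queryset declared in the resource's Meta.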

You have too many queries; even if each one is relatively fast, you still pay a round trip to the database for every one of them. Also, if you have external cache storage (for example, Redis), connecting to it takes some time as well.
To investigate slow parts of the code you have two options:
Use a profiler. Note that profiling on your local PC may tell you little if the system is distributed across several machines.
Add tracing points to your code that record a message and the current time (something like https://gist.github.com/dbf256/0f1d5d7d2c9aa70bce89). Deploy the patched code, exercise it with your load-testing tool, and check the logs; a sketch follows this list.
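A minimal sketch of such a tracing point, written as a decorator; the label and the decorated function are placeholders:

    import functools
    import logging
    import time

    logger = logging.getLogger(__name__)

    def trace(label):
        """Log how long the wrapped callable takes: one tracing point."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.time()
                try:
                    return func(*args, **kwargs)
                finally:
                    logger.info("%s took %.3f s", label, time.time() - start)
            return wrapper
        return decorator

    # Decorate the suspect functions, deploy, run the load test, grep the logs.
    @trace("expensive_step")
    def expensive_step():
        ...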

Related

Is there a way to compute this amount of data and still serve a responsive website?

Currently I am developing a Django + React website that will (I hope) serve a decent number of users. The project demo is mostly complete, and I am starting to think about the scale required to put this thing into production.
The website essentially does three things:
Grab data from external APIs (e.g. Twitter) for 50,000 unique keywords (the keywords don't change). This process happens every 30 minutes.
Run computation on all of the data and save that computation to the database. Assume that the algorithm is as optimized as possible.
When a user visits the website, it should serve a pretty graph/chart of all of the computed data per keyword.
The issue is that this is far too intense a task to be done by the same application that serves the website; users would be waiting decades to see their data. My current plan is to have a separate API that services the website with the data, which the website can then store in its database. This separate API would process the data without fear of affecting users, and it should be able to finish its current computation in under 30 minutes, in time for the next round of data.
Can anyone help me understand how I can better equip my project to handle the scale? I'd love some ideas.
As a 4th-year CS student I figured it's time to put a real project out into the world, and I am very excited about it and the progress I've made so far. My main worry is that the end users will be negatively affected if I don't figure out some kind of pipeline to make this process happen.
To reiterate my idea:
Django + React - This is the forward facing website
External API - Grabs the data off the internet and processes it, and waits for a GET request from the website
Is there a better way to do this? Or on the other hand am I severely overestimating how computationally heavy this is.
Edit: Including current research
Handling computationally intensive tasks in a Django webapp
Separation of business logic and data access in django
What you want is to have the computation task executed by a different process in the "background".
The most straightforward and popular solution is to use Celery (see its documentation).
The Celery worker(s), which perform the background task, can either run on the same machine as the web application or, when scale becomes an issue, be configured to run on entirely different machines.
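A minimal sketch of what that looks like; the project name, broker URL, and the 30-minute schedule are assumptions matching the question:

    # tasks.py
    from celery import Celery

    app = Celery("myproject", broker="redis://localhost:6379/0")

    @app.task
    def refresh_keyword_data():
        # Fetch the external API data, run the heavy computation, and save
        # the results to the database for the web views to read.
        ...

    # Have Celery beat trigger the task every 30 minutes.
    app.conf.beat_schedule = {
        "refresh-every-30-minutes": {
            "task": "tasks.refresh_keyword_data",
            "schedule": 30 * 60,  # seconds
        },
    }

The web request path then only reads precomputed results from the database, so user-facing latency is decoupled from the computation entirely.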

Apache SuperSet is very slow

Any recommendations on how to make Superset faster?
The cache seems to serve the full dataset from the cache. I thought it would serve only old data from the cache and real-time data from the database; isn't that how it works?
What about some parallel processing?
This answer is valid as of Superset 0.37.0.
At the moment, dashboard performance is affected by a few different factors. I'll enumerate them below along with methods to improve performance:
Database concurrency limits: dashboards load their information in parallel via concurrent web requests, so make sure that the database user provided allows enough concurrency that queries aren't being queued at the database layer.
Cache performance: your caching layer should be able to return multiple results, if not in parallel, then extremely quickly. We've had success leveraging S3 for our cache.
Cache hit percentage: Superset will hit the cache only for queries that exactly match one that has been run recently. Otherwise the full query will fall through to the underlying analytical DB (Druid in this case). You can reduce the query load on Druid by using a less granular resolution on your dashboard; if it's possible to have it update less frequently, say a couple of times a day rather than in real time, every request other than the first in each new period can hit the cache.
Python web process concurrency limits: make sure that your web application server can handle enough parallel requests. The browser will request multiple charts' data at the same time, and the system will need to be able to handle those requests in parallel.
Chart query performance: as data is frequently requested, especially real-time data from a database like Druid, optimizing the queries run by the charts can be very useful. I'd take a look at any virtual datasources that are being leveraged to see if they can be materialized or made more efficient.
Web browser concurrent request limits: by default, most web browsers limit the number of concurrent requests that can be made to the same FQDN. If you have more than 6 charts on the same dashboard, it can be helpful to balance requests across multiple FQDNs running Superset to get around this browser limitation. There's more information on that approach in the issue history on GitHub, but Superset does support this type of configuration.
The community is very interested in improving performance over time, and as such there have been recommendations to move all analytical queries to Celery, as well as other architectural changes, to improve performance. I hope this description helps and that something in here will help you track down the issue!
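On the caching points above: Superset's cache is backed by Flask-Caching and configured via CACHE_CONFIG in superset_config.py. A minimal sketch; the Redis location and the 12-hour timeout are assumptions:

    # superset_config.py
    CACHE_CONFIG = {
        "CACHE_TYPE": "redis",
        "CACHE_DEFAULT_TIMEOUT": 60 * 60 * 12,  # keep results for 12 hours
        "CACHE_KEY_PREFIX": "superset_results",
        "CACHE_REDIS_URL": "redis://localhost:6379/0",
    }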

How can I scale a webapp with long response time, which currently uses django

I am writing a web application with Django on the server side. It takes ~4 seconds for the server to generate a response to the user. It makes use of a weather API, and my application has to make ~50 queries to that API for each user request.
The server side uses Python's urllib to call the weather API. I used Python's threading to speed up the process, because urllib is synchronous. I am using WSGI with Apache. The problem is that the WSGI stack is fully synchronous, and when many users use my application, they have to wait for one another's requests to finish. Since each request takes ~4 seconds, this is unacceptable.
I am kind of stuck, what can I do?
Thanks
If you are using mod_wsgi in a multithreaded configuration, or even a multi process configuration, one request should not block another from being able to do something. They should be able to run concurrently. If using a multithreaded configuration, are you sure that you aren't using some locking mechanism on some resource within your own application which precludes requests running through the same section of code? Another possibility is that you have configured Apache MPM and/or mod_wsgi daemon mode poorly so as to preclude concurrent requests.
Anyway, as mentioned in another answer, you are much better off looking at caching strategies to avoid the weather lookups in the first place, or offloading the work to the client.
50 queries to an outside resource per request is probably a bad place to be, and probably not necessary at all.
The weather doesn't change all that quickly, so you can probably benefit enormously by just caching results for a while. Then it doesn't matter how many requests you're getting; you don't need to do more than a few queries per day.
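A minimal sketch of that caching using Django's cache API; fetch_weather() is a hypothetical stand-in for the external API call, and the 30-minute timeout is an assumption:

    from django.core.cache import cache

    def get_weather(location):
        key = "weather:%s" % location
        data = cache.get(key)
        if data is None:
            # Only hit the external API on a cache miss.
            data = fetch_weather(location)
            cache.set(key, data, timeout=30 * 60)  # keep for 30 minutes
        return data

With this in place, the ~50 lookups per request collapse into cache reads, except for the occasional refresh.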
If that's not your situation, you might be able to get the client to do the work for you: refactor the code so that the weather API aggregation happens on the client in JavaScript, rather than funneling it all through the server.
Edit: based on the comments you've posted, what you are asking for probably cannot be optimized within the constraints of the API you are using. The problem is that the service does a good job of abstracting away the differences in the many sources of weather information it aggregates into a nearest-location query. After all, weather stations provide only point data.
If you talk directly to the technical support people that provide the API, you might find that they are willing to support more complex queries (bounding box), for which they will give you instructions. More likely, though, they abstract that away because they don't want to actually reveal the resolution that their API actually provides, or because there is some technical reason in the way that they model their data or perform their calculations that would make such queries too difficult to support.
Without that or caching, you are just out of luck.

Django Performance

I am using Django with Apache mod_wsgi. My site has dynamic data on every page, and all of the media (CSS, images, JS) is stored in Amazon S3 buckets, linked via "http://bucket.domain.com/images/*.jpg" inside the markup. My question is: can Varnish still help me speed up my web server?
I am trying to knock down all the stumbling blocks here. Is there something else I should be looking at? I have made a query profiler for my code, and each page renders in about 0.120 CPU seconds, which seems quick enough, but when I use ab -c 5 -n 100 http://mysite.com/ the result is only Requests per second: 12.70 [#/sec] (mean).
I realize that there are lots of variables at play, but I am looking for some guidance on things I can do and thought Varnish might be the answer.
UPDATE
Here is a screenshot of my profiler:
The only way you can improve your performance is to measure what is slowing you down. Though it's not the best profiler in the world, Django has good integration with the hotshot profiler (described here), and with it you can figure out where those 0.120 CPU seconds are going.
Are you using 2 CPUs? If so, then perhaps the limitation is the DB when you use ab. I only say that because 0.120 * 12.70 ≈ 1.5 CPU-seconds of work per wall-clock second, which on 2 CPUs leaves about 0.5 seconds of every second waiting for something. That could also be IO.
Adding another layer for no reason, such as Varnish, is generally not a good idea. The only case where something like Varnish would help is if you have slow clients with poor connections holding onto threads, but the ab test does not hit that condition, and frankly it's not a large enough issue to warrant the extra layer.
Now, the next topic is caching, which Varnish can help with. Are your pages customized for each user, or can they be static for long periods of time? Often pages are static except for a simple login-status box; in that case, consider offloading the login status to JavaScript with cookies. If you are able to cache entire pages, they will be extremely fast in ab. However, the next problem is that ab is not really a good benchmark of your site, since users aren't going to just sit at one page and hit F5 repeatedly.
A few things to think about before you go installing varnish:
First off, have you enabled Django's page caching middleware? (A settings sketch follows this list.)
Are etags set up and working properly?
Is your database server performing optimally?
Have you considered setting up caching using memcached within your code for common results? (particularly front pages and static pages displayed to non-logged-in users)
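For the first item, the per-site cache takes a cache backend plus two middleware entries. A minimal settings sketch; the memcached location and the timeout are assumptions (older Django versions spell the list MIDDLEWARE_CLASSES and the backend MemcachedCache):

    # settings.py
    CACHES = {
        "default": {
            "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
            "LOCATION": "127.0.0.1:11211",
        }
    }

    MIDDLEWARE = [
        "django.middleware.cache.UpdateCacheMiddleware",     # must come first
        "django.middleware.common.CommonMiddleware",
        "django.middleware.cache.FetchFromCacheMiddleware",  # must come last
    ]

    CACHE_MIDDLEWARE_SECONDS = 300  # serve cached public pages for 5 minutes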
Except for query-heavy dynamic pages that absolutely must display substantially different data for each user, 0.12 seconds seems like a long time to serve a page. Look at how you can work caching into your app to improve that performance. If you have a page that is largely static other than a username or something similar, cache the computed part of the page.
When Django's caching is working properly, ab on public pages should be lightning fast. If you aren't using any of the other features of Apache, consider using something lighter and faster like lighttpd or nginx.
Eric Florenzano has put together a pretty useful project called django-newcache that implements some advanced caching behavior. If you're running into the limitations of the built-in caching, consider it. http://github.com/ericflo/django-newcache

How to evaluate the performance of web servers?

I'm planning to deploy a Django-powered site, but I feel confused about the choice of web server, which includes Apache, lighttpd, nginx, and others.
I've read some articles about the performance of each of these choices, but it seems no one agrees. So I'm wondering: why not test the performance myself?
I can't find information about the best approach to performance-testing web servers, so my questions are:
Is there any easy approach to testing the performance without a production site?
Or is there a way to simulate heavy traffic so that the test is fair?
How can I keep my test fair and close to production situation?
After the test, I want to figure out:
Why some people say nginx performs better when serving static files.
The cpu and memory needs of each web server.
My best choice.
Tools like ab are commonly used to test how much load you can take from a battering of requests at once; alongside cacti/munin/your system-monitoring tool of choice, you can generate data on system load and requests/sec. The problem is that many people benchmarking don't realise that they need to make many different kinds of requests, since different parts of your code take varying amounts of time to execute. Profiling and benchmarking the code, and not just the requests, is also important, and plenty of folk have already done so for Django; benchrun is not a bad tool either.
The other issue is how many HTTP requests each page view takes. The fewer the requests, and the quicker they can be processed, the more traffic a website can sustain: the quicker you can finish and close connections, the quicker you free up resources for new ones.
In terms of the general speed of web servers, it goes without saying that a proxy server (running in reverse at your end) will always perform faster than a web server serving the static content itself. As for Apache vs nginx with regard to your Django app, it seems that mod_python is indeed faster than nginx/lighty + FastCGI, but that's no surprise, because CGI, regardless of any speed-ups, is still slow. Executing and caching code at the web server and letting it manage it is always faster (mod_perl vs CGI, mod_php vs CGI, etc.) if you do it right.
Apache JMeter is an excellent tool for stress-testing web applications. It can be used with any web server, not just Apache.
You need to set up the web server + website of your choice on a machine somewhere, preferably a physical machine with similar hardware specs to the one you will eventually be deploying to.
You then need to use a load testing framework, for example The Grinder (free), to simulate many users using your site at the same time.
The load testing framework should be on separate machine(s) and you should monitor the network and CPU usage of those machines as well to make sure that the limiting factor of your testing is in fact the web server and not your load injectors.
Other than that, it's just about altering the content and monitoring response times, throughput, memory, and CPU use, etc., to see how they change depending on which web server you use and what sort of content you are hosting.
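If you want something lighter than The Grinder or JMeter to start with, a load injector can be sketched with nothing but the Python standard library; the URL, concurrency, and request count below are assumptions. Run it from a machine other than the one serving the site:

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://your-test-server/"
    CONCURRENCY = 10
    REQUESTS = 200

    def fetch(_):
        started = time.time()
        with urllib.request.urlopen(URL) as resp:
            resp.read()
        return time.time() - started

    start = time.time()
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        timings = list(pool.map(fetch, range(REQUESTS)))
    elapsed = time.time() - start

    print("requests/sec: %.1f" % (REQUESTS / elapsed))
    print("mean latency: %.3f s" % (sum(timings) / len(timings)))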