My Django application has become painfully slow in production. It's probably due to some complex or unindexed queries.
Is there any Django-ish way to profile my application?
Try the Django Debug Toolbar. It will show you which queries are executed on each page and how much time they take. It's a really useful, powerful, and easy-to-use tool.
Also, read recommendations about Django performance in Database access optimization from the documentation.
And the Django performance tips by Jacob Kaplan-Moss.
Just type "django-profiling" on google, you'll get these links (and more):
http://code.djangoproject.com/wiki/ProfilingDjango
http://code.google.com/p/django-profiling/
http://www.rkblog.rk.edu.pl/w/p/django-profiling-hotshot-and-kcachegrind/
Personally, I'm using the middleware approach: each user can toggle a "profiling" flag stored in the session, and if my profiling middleware notices that the flag is set, it uses Python's hotshot module like this:
import hotshot

class ProfilingMiddleware(object):  # old-style middleware; class name is illustrative
    def process_view(self, request, view_func, view_args, view_kwargs):
        # Setup things here, along with settings.DEBUG = True
        # to get a SQL dump in connection.queries.
        if not request.session.get('profiling'):
            return None  # flag not set - let Django call the view as usual
        profiler = hotshot.Profile(fname)  # fname: path for the profile log file
        response = profiler.runcall(view_func, request, *view_args, **view_kwargs)
        profiler.close()
        # process results
        return response
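To turn the raw log into readable output afterwards, hotshot ships a companion stats module. A minimal sketch, assuming fname holds the path used above (note that hotshot is Python 2 only; it was removed in Python 3, where cProfile is the replacement):
import hotshot.stats

stats = hotshot.stats.load(fname)  # load the profile log written by runcall
stats.sort_stats('time', 'calls')
stats.print_stats(20)              # show the 20 most expensive entries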
EDIT: For profiling SQL queries, http://github.com/robhudson/django-debug-toolbar (mentioned by Konstantin) is a nice thing - but if your queries are really slow (probably because there are hundreds or thousands of them), then you'll wait an insane amount of time until it loads in the browser - and then it will be hard to browse due to the slowness. Also, django-debug-toolbar is by design unable to give useful insight into the internals of AJAX requests.
EDIT2: django-extensions has a great profiling command built in:
https://github.com/django-extensions/django-extensions/blob/master/docs/runprofileserver.rst
Just do this and voila:
$ mkdir /tmp/my-profile-data
$ ./manage.py runprofileserver --kcachegrind --prof-path=/tmp/my-profile-data
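Each request handled by the profiled server then leaves a profile file in that directory, which you can open in KCacheGrind (the filename below is illustrative; use whatever runprofileserver actually wrote there):
$ kcachegrind /tmp/my-profile-data/some-request.prof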
For profiling data access (which is where the bottleneck is most of the time) check out django-live-profiler. Unlike Django Debug Toolbar it collects data across all requests simultaneously and you can run it in production without too much performance overhead or exposing your app internals.
Shameless plug here, but I recently made https://github.com/django-silk/silk for this purpose. It's somewhat similar to the Django Debug Toolbar, but with history, code profiling, and more fine-grained control over everything.
For all you KCacheGrind fans, I find it's very easy to use the shell in tandem with Django's fantastic test Client for generating profile logs on-the-fly, especially in production. I've used this technique now on several occasions because it has a light touch — no pesky middleware or third-party Django applications are required!
For example, to profile a particular view that seems to be running slow, you could crack open the shell and type this code:
from django.test import Client
import hotshot
c = Client()
profiler = hotshot.Profile("yourprofile.prof") # saves a logfile to your pwd
profiler.runcall(c.get, "/pattern/matching/your/view/")
profiler.close()
To visualize the resulting log, I've used hotshot2cachegrind:
http://kcachegrind.sourceforge.net/html/ContribPython.html
But there are other options as well:
http://www.vrplumber.com/programming/runsnakerun/
https://code.djangoproject.com/wiki/ProfilingDjango
I needed to profile a Django app recently and tried many of these suggestions. I ended up using pyinstrument instead, which can be added to a Django app using a single update to the middleware list and provides a stack-based view of the timings.
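For reference, that single change looks roughly like this (a sketch based on pyinstrument's documented Django integration; check the docs for your version):
# settings.py
MIDDLEWARE = [
    # ... your existing middleware ...
    'pyinstrument.middleware.ProfilerMiddleware',
]
With that in place, per pyinstrument's docs, appending ?profile to a request URL renders the profiler output for that request instead of the normal response.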
Quick summary of my experience with some other tools:
Django Debug Toolbar is great if the issue is due to SQL queries, and it works well in combination with pyinstrument
django-silk works well, but requires adding a context manager or decorator to each part of the stack where you want sub-request timings. It also provides an easy way to access cProfile timings and automatically displays ajax timings, both of which can be really helpful.
djdt-flamegraph looked promising, but the page never actually rendered on my system.
Compared to the other tools I tried, pyinstrument was dramatically easier to install and to use.
When the views return something other than HTML, for example JSON, use simple middleware methods for profiling.
Here are a couple of examples:
https://gist.github.com/1229685 - captures all the SQL calls made in the view
https://gist.github.com/1229681 - profiles all method calls used to create the view
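A minimal sketch of the second approach, written as a modern-style middleware with a hypothetical ?prof query parameter as the trigger (the gists above use the older middleware hooks, but the idea is the same):
import cProfile
import io
import pstats

class ProfilingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if 'prof' not in request.GET:  # hypothetical trigger parameter
            return self.get_response(request)
        profiler = cProfile.Profile()
        response = profiler.runcall(self.get_response, request)
        # Dump the hot spots to the console; the JSON response stays untouched
        out = io.StringIO()
        pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(20)
        print(out.getvalue())
        return response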
You can use line_profiler.
It displays a line-by-line analysis of your code, with the time spent alongside each line (when a line is hit several times, the time is summed up as well).
It's normally used on non-Django Python code, but there's a little trick to use it on Django: https://stackoverflow.com/a/68163807/1937033
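Outside Django, basic usage looks like this (a sketch; kernprof injects the @profile decorator at runtime, so the script needs no import for it):
# example.py -- run with: kernprof -l -v example.py
@profile  # provided by kernprof when run with -l; not a normal import
def slow_function():
    total = 0
    for i in range(100000):
        total += i * i  # the per-line timings will show this as the hot line
    return total

if __name__ == '__main__':
    slow_function()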
I'm using silk for live profiling and inspection of a Django application. It's a great tool; have a look at it:
https://github.com/jazzband/django-silk
Launching my second-ever Django site.
I've had problems in the past with Django's ORM - basically, the SQL it was generating just wasn't what I wanted, and even using things like select_related() I couldn't wrangle it into what it should've been - so I ended up writing all my DB queries by hand in my views and using this function, taken from the Django docs, to turn the cursor's responses into usable dictionaries:
def dictfetchall(cursor, returnMultiDictAnyway=False):
    "Returns all rows from a cursor as a dict"
    desc = cursor.description
    rows = [
        dict(zip([col[0] for col in desc], row))
        for row in cursor.fetchall()
    ]
    if len(rows) == 1 and not returnMultiDictAnyway:
        return rows[0]
    return rows
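For reference, a typical call site looks like this (the SQL and table name are hypothetical):
from django.db import connection

cursor = connection.cursor()
cursor.execute("SELECT id, title FROM music_album LIMIT 10")  # hypothetical table
albums = dictfetchall(cursor, returnMultiDictAnyway=True)     # always a list of dicts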
I'm almost ready to launch my site but I'm finding pretty huge performance problems on the two different webservers I've tried hosting the app with.
Locally, it doesn't run blazingly fast, but I generally put this down to my machine in general being a little slow. I don't have the numbers to hand (will add later on) but the SQL times aren't crazily high and I've made the effort to optimise MySQL (adding missing indexes etc).
Here's the app, running on two different webhosts (using bit.ly to avoid Google spidering these URLs, sorry!):
http://bit.ly/10iEWYt (hosted on Dreamhost, using Passenger WSGI)
http://bit.ly/UZ9adS (hosted on WebFaction, also using WSGI)
At the moment I have DEBUG = False on both of those hosts (so there shouldn't be a loading penalty) and a file-based cache of 15 minutes for each one. On the Dreamhost one I have an experimental cronjob hitting the homepage every 15 minutes to see whether this keeps the Python server alive - it doesn't seem to have done much.
If you try those links you should see how long it takes for the server to respond as you click around, even including the cache (try going from the homepage to another page then back home).
I've tried this profiling middleware, but I'm not really sure how to interpret the results (I can add them to this post later when I'm home) - in any case, the functions/lines it pointed to were all inside Django's own code, so I struggled to relate that to my own views etc.
Is it likely that the dictfetchall() method above could be an issue here? I use that to work with the results of every DB query on the site (~5-10 per page, most on the homepage). I do have a few included templates but nothing too crazy. I have a context processor for common things like showing album reviews, which I use all over the place. I'm stumped about what else could be causing this slowness.
Thanks, hope this is enough info to be helpful.
EDIT: okay, here's a profiling trace of the site homepage: http://pastebin.com/raw.php?i=c7kHNXAZ -- struggling to interpret it, to be honest.
Also, I looked at the Debug Toolbar stats: 8 SQL queries in 246ms (looking currently at further optimising these), but total time for render of 3235ms (locally). This is what's confusing me.
Does the Pyramid debug toolbar show SQL query timings the way the Django Debug Toolbar does? And if not, how can I check them? I'm using SQLAlchemy.
Thanks!
Both projects use Werkzeug as far as I can tell. Pyramid does use it, and I've only heard of Django using it but never tried it. That said, the two toolbars should be quite different, because they depend on different projects.
If you want the query time for SQLAlchemy, there are a couple of ways to do that, as discussed here:
How can I profile a SQLAlchemy powered application?
With plain old Python logging you can estimate the time between queries if you enable debugging. The Pyramid toolbar also allows profiling, so you can check there how much time was spent in which functions.
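For example, SQLAlchemy's engine events can time each query directly. This follows the standard recipe from the SQLAlchemy docs; the logger name is arbitrary:
import logging
import time

from sqlalchemy import event
from sqlalchemy.engine import Engine

logging.basicConfig()
logger = logging.getLogger("sqltime")  # arbitrary logger name
logger.setLevel(logging.DEBUG)

@event.listens_for(Engine, "before_cursor_execute")
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    # Record when the statement started, per connection
    conn.info.setdefault("query_start_time", []).append(time.time())

@event.listens_for(Engine, "after_cursor_execute")
def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    total = time.time() - conn.info["query_start_time"].pop(-1)
    logger.debug("query took %.4fs: %s", total, statement)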
According to the docs, the Pyramid debug toolbar displays the time in ms for SQLAlchemy queries:
http://docs.pylonsproject.org/projects/pyramid_debugtoolbar/en/latest/api.html
I have a Django project, and a problem: it eats a lot of memory and puts too much load on the hosting.
How can I find the places in the project that eat a lot of memory?
If you're using Django with DEBUG = True then Django logs every database query which can quickly mount up and use a substantial amount of memory.
If you're not running in DEBUG mode, then take a look at the gc module, and in particular try adding gc.set_debug(gc.DEBUG_LEAK) to settings.py. This will show you a great deal of information about which objects are using memory.
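For instance, at the bottom of settings.py:
import gc
gc.set_debug(gc.DEBUG_LEAK)  # print info about objects the collector can't free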
In general for debugging/profiling, I suggest django-debug-toolbar as a starting location as well as the various tips in:
http://docs.djangoproject.com/en/dev/topics/db/optimization/
However this won't give memory usage info. If you really need that, you can try some middleware using pympler to log memory usage while debugging and run the development server.
http://www.rkblog.rk.edu.pl/w/p/profiling-django-object-size-and-memory-usage-pympler/?c=1
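A minimal sketch of such a middleware, using pympler's muppy/summary API (the class name and placement are illustrative; run this only under the development server, as noted below):
from pympler import muppy, summary

class MemorySummaryMiddleware:
    """Prints a summary of live objects after each request (dev server only)."""
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        all_objects = muppy.get_objects()  # every object the GC currently tracks
        summary.print_(summary.summarize(all_objects))  # grouped by type, with sizes
        return response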
I've found that doing this grinds my webapps to a near-halt and then there are the problems from using the dev-webserver (e.g., media files not getting served).
But as others said your best bet is to set DEBUG=False:
http://docs.djangoproject.com/en/dev/faq/models/#why-is-django-leaking-memory
As Andrew Wilkinson stated, this might have to do with the DEBUG = True setting. However, it might also be important to know whether you're running this project stand-alone or as a webserver.
With DEBUG = True, Django will automatically keep a record of every query run during a request and drop the references when the request returns. Since there are no requests in a stand-alone project, the references are never deleted, and hence every query ever executed gets saved.
To fix the stand-alone issue, simply call django.db.reset_queries() after you've done a bunch of requests. This allows the query log to be garbage collected and fixes your leak.
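For instance, in a long-running stand-alone loop (the loop body is hypothetical):
import django.db

for item in work_items:        # hypothetical batch of work
    process(item)              # runs ORM queries under DEBUG = True
    django.db.reset_queries()  # drop the logged queries so they can be freed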
We've got some really strange extraneous DB hits happening in our project. Is there any way to monitor where the requests are coming from, possibly by line number? The SQL printing middleware helps, but we've looked everywhere those kinds of requests might be generated and can't find the source.
If the above isn't possible, any pointers on narrowing down the source would be greatly appreciated.
To find the code executing queries, you can install django-debug-toolbar to figure out what commands are being executed and which tables they're operating on.
Once you've done that, try hooking into the appropriate Django signals for those models and using print and assert to narrow the code.
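A quick sketch of that idea, printing the call stack whenever a suspect model is written (the model import is hypothetical):
import traceback

from django.db.models.signals import pre_save
from django.dispatch import receiver

from myapp.models import SuspectModel  # hypothetical model to watch

@receiver(pre_save, sender=SuspectModel)
def trace_saves(sender, instance, **kwargs):
    # The stack trace shows exactly which view or helper triggered the write
    traceback.print_stack()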
I'm sure there's a better way to do some of this (a python debugger?) but this is the first thing that comes to mind and probably what I would end up doing myself.
If you want to track SQL queries for performance optimization and debugging, and to see how to monitor query calls in Django, this blog post will help you out:
Tracking SQL Queries for a Request using Django
I am going to outline my workflow and I would like some suggestions on how to improve the efficiency of this. It seems right now a bit cumbersome and repetitive (something I hate), so I am looking for some improvements. Keep in mind I'm still new to django and how it works but I'm a pretty fluent coder (IMHO). So here goes...
Tools (I use these everyday so I'm not inclined to shift):
Mac OSX Leopard
TextMate
Terminal w/tabs
Perforce
Assumptions
Django Basics (Did the tutorials/bought the books)
Python Fluent (running 2.6 with IDLE Support)
Starting my first Application working on models.py
Starting out
Create a TextMate Project with the entire django Tree inside of it.
(Screenshot of the TextMate project: http://img.skitch.com/20090821-g48cpt38pyfwk4u95mf4gk1m7d.jpg)
In the first tab of the terminal start the server
python ./manage.py runserver
In the second tab of the terminal window start the shell
python ./manage.py shell
This spawns IPython and lets me start the development workflow
Workflow
Create and build a basic model in models.py:
class P4Change(models.Model):
    """This simply expands out 'p4 describe'."""
    change = models.IntegerField(primary_key=True)
    client = models.ForeignKey(P4Client)
    user = models.ForeignKey(P4User)
    files = models.ManyToManyField(P4Document)
    desc = models.TextField()
    status = models.CharField(max_length=128)
    time = models.DateField(auto_now_add=True)

    def __unicode__(self):
        return str(self.change)

admin.site.register(P4Change)
In the first terminal (running the server), stop it with ^C, run syncdb, and restart the server:
> python ./manage.py syncdb
Creating table perforce_p4change
Installing index for perforce.P4Change model
In the shell terminal window, load it:
> python ./manage.py shell
Python 2.6.2 (r262:71600, Apr 23 2009, 14:22:01)
Type "copyright", "credits" or "license" for more information.
IPython 0.10 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.
In [1]: from perforce.models import *
In [2]: c = P4Client.objects.get_or_create("nellie")
Did it break? If it did not work, then:
Stop the shell
Clear the database
Rebuild the database
Fix the code
Reload the shell
Reload the modules
PRAY...
Issues / Comments / Thoughts
Is it me or does this seem terribly inefficient?
It seems like I should be able to do a reload(module) but I can't figure out how to do this.. Anyone?
It would seem as though I should be able to test this from within TextMate?? Anyone??
Even to just get out of the shell I have to verify I want to leave..
The point of this is for all of you geniuses out there to show me the light on a more productive way to work. I am completely open to reasonable suggestions. I'm not inclined to shift tools but I am open to criticisms.
First of all, no need to do a ./manage.py runserver until your models are in place.
Second, clear the database/rebuild the database should be done after fixing the code, and can be done in one fell swoop with ./manage.py reset perforce
Third, the things that you are typing out in the shell each time (import models, try creating an object) should be written in a test suite instead. Then you can do ./manage.py test perforce instead of firing up the shell and typing it again. Actually, if you're using the test suite, you won't need to, because it will create a clean dummy db each time, and break it down for you when it's done.
Fourth, instead of "PRAY...", try "Watch tests pass."
I find it smoother to write unit tests more often and only use the shell when something is failing and it's not obvious why and you want to poke around to figure it out. It is a little more inefficient at the very beginning, but quickly becomes a wonderful way to work.
I also tend to concentrate on getting the model more or less stable and complete (at least as far as what will affect table structure) before I work on the views and need to run the server. That tends to front-load as many resets as possible so you're doing them when it's cheap.
Thanks to everyone who read this and is looking for a better way. I think unit tests are definitely the simpler approach.
So, according to the docs, you simply need to create a tests.py file parallel to models.py and put your tests in there:
import P4  # Perforce Python API; needed for P4.P4() below

from django.test import TestCase
from perforce.models import P4User, P4Client

class ModelTests(TestCase):
    def setUp(self):
        self.p4 = P4.P4()
        self.p4.connect()

    def test_BasicP4(self):
        """
        Make sure we are running 2009.1 == 65
        """
        self.failUnlessEqual(self.p4.api_level, 65)

    def test_P4User_get_or_retrieve(self):
        """
        This will simply verify we can get a user and push it into the model
        """
        user = self.p4.run(("users"))[0]
        dbuser = P4User.objects.get_or_retrieve(user.get('User'))
        # Did it get loaded into the db?
        self.assertEqual(dbuser[1], True)
        # Do it again, but hey, it already exists..
        dbuser = P4User.objects.get_or_retrieve(user.get('User'))
        # Did it get loaded into the db?
        self.assertEqual(dbuser[1], False)
        # Verify one field of the data matches
        dbuser = dbuser[0]
        self.assertEqual(dbuser.email, user.get("Email"))
Now you can simply fire up the terminal and run python manage.py test, which will run the tests. But again, that's a pretty limited view and still requires you to swap in and out of programs. So here is how to do this directly from TextMate using ⌘R.
Add an import line at the top and a few lines at the bottom:
from django.test.simple import run_tests

#
# Unit tests from above
#

if __name__ == '__main__':
    run_tests(None, verbosity=1, interactive=False)
And now ⌘R will work directly from TextMate.
OK, I'll bite :-) Here's what I use:
MAMP. You get a fully functional Apache + MySQL + PHP + phpMyAdmin stack to manage the web and DB layers. It's great for apps that go beyond basic SQLite. Basic version is free but I went ahead and popped for Pro because I use it so much and wanted to support the devs. A good way to test and make sure everything works is start with the Django test server, then deploy and test under MAMP on your own machine, and finally push it out to your deployment site. (You could try to automate the process with something like Fabric).
Eclipse + PyDev + PyDev extensions. Once configured properly, you get Python code completion, a nice development environment, and full debugging. You can configure it so it runs the Django test server for you, and you can set breakpoints on any line in Django source or your own code. The thing I like about Eclipse is that once you get used to the environment, you can also use it for C/C++, Java, JavaScript, Python, and Flex coding.
Aptana for Eclipse. It helps when developing AJAX front-ends and editing Django templates to have a decent Javascript + HTML editor/debugger.
TextMate. I've created a TextMate project that includes all of Django sources and saved it in the Django source directory. This way, I can quickly do project searches through Django source and single-click open the source file. You can also set it up so you can go back and forth between Eclipse and TextMate editors and have them auto-reload.
A decent MySQL or SQLite editor. phpMyAdmin is OK, but sometimes it's good to have a standalone tool. SequelPro (formerly CocoaMySQL) and Navicat are both pretty good for MySQL. One advantage is that once your app is deployed, you can use these tools to remotely access the deployment DB server and tweak it from your desktop. On the SQLite side, SQLiteManager and Base are good commercial tools, as is the freebie Firefox SQLite Manager. At the very least you can watch what Django's doing under the hood.
I use Subversion for version control mostly because it runs on a standalone Mac Mini which saves to a Drobo RAID array plus auto-backups everything to a couple other external drives. This on top of Time Machine (yes, I'm paranoid :-) I used to use Eclipse's SVN support but now I'm a big fan of Versions. At some point when I can figure out a good mirroring scheme I'll switch to Mercurial, Git, or Bazaar, but for now this works pretty well.
Terminal plus a bunch of shell scripts. Everyone has their own version of this. I'm pretty lazy when it comes to these things so I set up a bunch of bash shortcuts to help speed up repetitive Django admin tasks. I posted these up a while back.
Most of these can be had for free or a moderate fee (< $100). But if I had to pick the 'must have' items for Django development on the Mac, it would be Eclipse and PyDev.
I'm sure there are some I've missed. Be great to hear what tools everyone else is using.