How to time Django queries

I've always used Python's timeit library to time my little Python programs.
Now I'm developing a Django app and I was wondering how to time my Django functions, especially queries.
For example, I have a def index(request) in my views.py which does a bunch of stuff when I load the index page.
How can I use timeit to time this particular function without altering my existing functions too much?

If your Django project is in debug mode, you can see your database queries (and their times) using:
>>> from django.db import connection
>>> connection.queries
I know this won't satisfy your need to profile functions, but hope it helps for the queries part!
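Each entry in connection.queries is a dict with the raw SQL and its execution time as a string of seconds, so the output looks roughly like this (the SQL shown is only an illustration, and DEBUG must be True):
>>> connection.queries
[{'sql': 'SELECT "myapp_item"."id" FROM "myapp_item"', 'time': '0.002'}]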

The Django Debug Toolbar is what you want; it helps you time each of your queries.
Alternatively, this snippet works too:
http://djangosnippets.org/snippets/93/
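If you have not installed the toolbar yet, the setup is roughly the following sketch (the exact setting and middleware names depend on your django-debug-toolbar and Django versions, and newer releases also need a URL entry, so check its docs):
# settings.py
INSTALLED_APPS += ['debug_toolbar']
MIDDLEWARE += ['debug_toolbar.middleware.DebugToolbarMiddleware']
INTERNAL_IPS = ['127.0.0.1']   # the toolbar only shows up for these client IPs
Its SQL panel then lists every query a page ran, together with its time.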

The best option is the Django Debug Toolbar; it also gives you additional functionality for query optimization, which will help you optimize your DB queries.
Here is another solution: you can use connection.queries. It returns the SQL for the queries that were executed just before you inspect connection.queries. After reading the time of the previous query you can clear the log with reset_queries(); using reset_queries() is not mandatory.
Suppose you have a model named Device. You can measure the query time like this:
>>> from django.db import connection, reset_queries
>>> from appname.models import Device
>>> devices = Device.objects.all()
>>> list(devices)  # querysets are lazy, so force evaluation to make the query actually run
>>> connection.queries
>>> reset_queries()
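Once the queryset has actually run, each entry in that list is a dict with 'sql' and 'time' keys, so a rough sketch of pulling the numbers out (still assuming DEBUG is on) is:
>>> connection.queries[-1]['time']                       # time of the most recent query, as a string of seconds
>>> sum(float(q['time']) for q in connection.queries)    # total DB time recorded since the last reset_queries()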

Anyone stumbling onto this, check out Sentry's approach:
https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/integrations/django/__init__.py#L476
You can replace execute and executemany with your own functions that track how long execute takes to return.
A simple approach is to create a custom context manager that starts a timer and, on exit, appends the timer's final value to a list you pass in.
Then you can just check the list.
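A minimal sketch of that idea (the names here are illustrative, not taken from Sentry's code, and the Device model is borrowed from the earlier answer):
import time
from contextlib import contextmanager

@contextmanager
def timed(timings):
    # appends the elapsed wall-clock time, in seconds, to the list you pass in
    start = time.monotonic()
    try:
        yield
    finally:
        timings.append(time.monotonic() - start)

timings = []
with timed(timings):
    list(Device.objects.all())   # force the queryset to execute inside the timer
print(timings)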


Can't use assertTemplateUsed with unittest.TestCase

Please help, I'm fairly new to Django and not sure of the best way to proceed with my unit tests.
I have a large Django app with dozens of view methods, and the PostgreSQL schemas get pretty complex. I've read that if I use "from django.test import TestCase" then the test database is flushed after running each unit test. I wanted to prevent flushing the DB between unit tests within the same class, so I started using "from unittest import TestCase". That did the trick and the DB is preserved between unit tests, but now the statement
self.assertTemplateUsed(response, 'samplepage.html') gives me the error AttributeError: 'TestViews' object has no attribute 'assertTemplateUsed'.
What can I do? Is there an alternative to 'assertTemplateUsed' that can be used with unittest.TestCase? Many thanks in advance!
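For what it's worth, one possible workaround sketch (the URL is illustrative): when running under Django's test runner, the test client records the templates it rendered on the response, so you can assert against response.templates yourself even from a plain unittest.TestCase:
from unittest import TestCase
from django.test import Client

class TestViews(TestCase):
    def setUp(self):
        # plain unittest.TestCase does not provide self.client, so create one
        self.client = Client()

    def test_uses_samplepage_template(self):
        response = self.client.get('/')   # illustrative URL
        self.assertIn('samplepage.html', [t.name for t in response.templates])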

Optimized Django Queryset

I have the following function to determine who downloaded a certain book:
# on the model, with: from django.utils.functional import cached_property
@cached_property
def get_downloader_info(self):
    return self.downloaders.select_related('user').values(
        'user__username', 'user__full_name')
Since I'm only using two fields, does it make sense to use .defer() on the remaining fields?
I tried to use .only(), but I get an error that some fields are not JSON serializable.
I'm open to all suggestions, if any, for optimizing this queryset.
Thank you!
Before you try every possible optimization, you should get your hands on the SQL query generated by the ORM (you can print it to stdout or use something like django-debug-toolbar) and see what is slow about it. After that, I suggest you run that query with EXPLAIN ANALYZE and find out what is slow about it at the database level. If the query is slow because a lot of data has to be transferred, then using only or defer makes a lot of sense. Using only and defer (or values) gives you better performance only if you need to retrieve a lot of data, but it does not make your database's job much easier (unless you really have to read a lot of data, of course).
Since you are using Django and Postgresql, you can get a psql session with manage.py dbshell and get query timings with \timing
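For example, a quick way to get at the SQL the ORM generates (book here is an illustrative model instance with the downloaders relation from the question):
qs = book.downloaders.select_related('user').values('user__username', 'user__full_name')
print(qs.query)     # roughly the SQL that will be sent to the database
Then, in the psql session from manage.py dbshell:
\timing
EXPLAIN ANALYZE <paste the SQL printed above>;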

Importing a CSV using celery

I want to allow my users to upload a CSV of contact data that will populate a model called contacts. I have used django-csv-importer and this seems to work OK. However, I would like to use something like celery so that users can upload and just forget about waiting (at the moment it can take 5 minutes).
Are there any projects that do what django-csv-importer does but with celery integration? If so, could someone give me an example, or is there a better way?
Many thanks.
Happily I've worked with the author of django-csv-importer, and can report there's a newer version in the form of django-adaptors (https://github.com/anthony-tresontani/django-adaptors); it's the same project but renamed, so it might have some new stuff.
As for your specific question, joshua's answer is correct. But if you want a ridiculously rich implementation complete with audit trails, take a look at this: http://codeinthehole.com/writing/use-models-for-uploads/
In tasks.py:
from celery.task import task

@task
def import_csv(filename):
    # MyCsvModel is the csv-importer model you already defined
    my_csv_list = MyCsvModel.import_data(data=open(filename))
    ...
Then just call import_csv.delay(filename) in your view.
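A rough sketch of the view side (the module path, form field name, URL name, and file handling are all illustrative; in practice the file needs to live somewhere the celery worker can read it):
import tempfile
from django.shortcuts import redirect
from myapp.tasks import import_csv   # wherever your tasks.py lives

def upload_contacts(request):
    if request.method == 'POST':
        upload = request.FILES['csv_file']
        # write the upload to disk so the worker can open it by path
        with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as tmp:
            for chunk in upload.chunks():
                tmp.write(chunk)
        import_csv.delay(tmp.name)   # hand the slow import off to the worker and return immediately
        return redirect('upload_done')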

Changing settings for the tests in django (search engine for haystack to be specific)

I have a project that uses a SOLR search engine through django-haystack. The search engine is on a different live server, and touching it during the test run is undesirable (actually, it's impossible, since access to that host is firewalled).
I'm using the standard Django test runner. Luckily, it gives me a test-settings object I can modify to my liking, but it turns out that's not the end of the story.
A lot of stuff in django-haystack is instantiated at import time, so by the time I change the test settings in my test runner it is too late, and despite the fact that I change SEARCH_BACKEND to dummy, the tests still make calls to SOLR. The problem is not specific to haystack; the same issue happens with mongoengine. Any class-level statements (e.g. CharField(default=Blah.objects.find(...))) are executed at instantiation time, before Django has a chance to change settings.
Of course, the root of the problem is the fact that Django settings are a scary, globally mutable mess and that Django provides no centralized place for instantiation code. Given that, are there any suggestions on which testing solution would be easiest? At the moment I'm thinking about a shell script which changes the DJANGO_SETTINGS_MODULE environment variable to test_settings and runs ./manage.py test afterwards. It would be nicer if I could still do things via ./manage.py, though.
Any better ideas? People with similar problems?
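For what it's worth, the shell-script idea from the question is essentially a one-liner, and every management command also accepts --settings, so you can stay inside ./manage.py (the settings module name here is illustrative):
DJANGO_SETTINGS_MODULE=myproject.test_settings ./manage.py test
# or equivalently
./manage.py test --settings=myproject.test_settings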
I took the answer from here and modified it slightly. This works great for me:
from contextlib import contextmanager

@contextmanager
def connection(**kwargs):
    from haystack import connections
    for key, new_value in kwargs.items():
        setattr(connections, key, new_value)
    connections['default'].options['URL'] = connections.connections_info['default']['URL']
    yield
My test, then, looks like:
def test_job_detail_by_title_slug_job_id(self):
    with connection(connections_info=solr_settings.HAYSTACK_CONNECTIONS):
        resp = self.client.get('/0/rts-crb-unix-production-engineer/27216666/job/')
        self.assertEqual(resp.status_code, 404)
        resp = self.client.get('/indianapolis/indiana/usa/jobs/')
        self.assertEqual(resp.status_code, 200)

Django: Loaddata command after syncdb fails

I'm trying to use fixtures as a DB-agnostic way to get the data into my database, but this is much harder than it should be. I'm wondering what I'm doing wrong...
Specifically, when I do a syncdb followed by a migrate followed by a loaddata I run into trouble, since syncdb already creates data that loaddata then tries to load again from the dump. This leads to duplicate entries and hence a crashing script.
This seems to be the same problem as described here: https://code.djangoproject.com/ticket/15926
But it's weird to me that this seems to be an ignored issue. Are fixtures not meant to actually put real (live) data in?
If so: is there any Django-format that is meant for this? Or is everyone just dumping data as SQL? And, if so, how would one migrate development data in SQLite to a production database?
syncdb will also load data from fixtures if you have the fixtures named correctly and in the correct location. See this link for more info.
https://docs.djangoproject.com/en/1.3/howto/initial-data/#automatically-loading-initial-data-fixtures
If you do not want the data to load on every syncdb then you will need to change the name of the fixture.
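For example, with the automatic loading described in that link, a fixture called initial_data (say appname/fixtures/initial_data.json) is picked up on every syncdb, while a fixture with any other name is only loaded when you ask for it explicitly (the file name here is illustrative):
./manage.py loaddata contacts_dump.json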
Fixtures are an OK way to load your data; I have used them on a number of projects. On some projects, when I have a ton of data, I write a special load script that takes the data from my data source and loads up my new Django models. The custom script is a little more work, but gives you more flexibility.
I tend to stay away from using SQL to load data if I can, since SQL is usually DB-specific; if you have to worry about loading on different database versions, stay away if you can.
"In general, using a fixture is a cleaner method since it’s database-agnostic, but initial SQL is also quite a bit more flexible."
OP here; this is what I came up with so far:
# some_app/management/commands/delete_all_objects.py
from django.core.management.base import BaseCommand, CommandError
from django.db.models import get_models


class Command(BaseCommand):
    help = 'Deletes all objects'

    def handle(self, *args, **options):
        for model in get_models():
            model.objects.all().delete()
And then just run delete_all_objects after syncdb & migrate and before loaddata. I'm not sure I like it, and I'm very surprised it's necessary, but it works.
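For reference, the full sequence would then look something like this (the fixture name is illustrative):
./manage.py syncdb
./manage.py migrate
./manage.py delete_all_objects
./manage.py loaddata my_data_dump.json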