I'm reaching the end of the first page of the Django tutorial. I tried a quick experiment, and since it hasn't worked I'm confused. Following along with the tutorial, I have a variable p:
p = Poll.objects.get(pk=1)
Rather than creating a poll using p.choice_set.create(choice='Not much', votes=0) as the tutorial instructs, I tried:
x = Choice(choice='Not much', votes=0, poll=p)
Having done this I would have thought that p.choice_set.all() would return something more than an empty list. But it does return an empty list.
(However, if I try x.poll then I get <Poll: What's up?> as I would have expected, so the relationship is only working one way it seems.)
I'm sure there's a good reason why this doesn't work, even though it seems like it ought to! (please bear in mind I have no database experience)
Any thoughts welcome
x = Choice(choice='Not much', votes=0, poll=p) creates an instance of a Choice model but it is not yet saved to the database. p.choice_set.all() queries the database for choices which are associated with the given poll. Since x was not saved to the DB it will not be found.
Related
My django app uses update_or_create() to update a bunch of records. In some cases, updates are really few within a ton of records, and it would be nice to know what got updated within those records. Is it possible to know what got updated (i.e fields whose values got changed)? If not, does any one has ideas of workarounds to achieve that?
This will be invoked from the shell, so ideally it would be nice to be prompted for confirmation just before a value is being changed within update_or_create(), but if not that, knowing what got changed will also help.
Update (more context): Thought I'd give more context here. The data in this Django app gets updated through various means (through users coming on the web site, through the admin page, through scripts (run from the shell) that populate data from a csv etc.). The above question is important mostly for the shell scripts that update data from csvs, hence a solution at the database/trigger/signal level may not be helpful here (I guess).
This is what I ended up doing:
for row in reader:
school_obj0, created = Org.objects.get_or_create(school_id = row[0])
if (school_obj0.name != row[1]):
print (school_obj0.name, '==>', row[1])
confirmation = input('proceed? [y/n]: ')
if (confirmation == 'y'):
school_obj1, created = Org.objects.update_or_create(
school_id = row[0], defaults={"name": row[1],})
Happy to know about improvements to this approach (please see the update in the question with more context)
This will be invoked from the shell, so ideally it would be nice to be
prompted for confirmation just before a value is being changed
Unfortunately, databases don't work like that. It's the responsibility of applications to provide this functionality. And django isn't an application. You can however use django to write an application that provides this functionality.
As for finding out whether an object was updated or created, that's what the return value gives you. A tuple where the second value is a flag for update or create
I have made a django app that creates models and database tables on the fly. This is, as far as I can tell, the only viable way of doing what I need. The problem arises of how to pass a dynamically created model between pages.
I can think of a few ways of doing such but they all sound horrible. The methods I can think of are:
Use global variables within views.py. This seems like a horrible hack and likely to cause conflicts if there are multiple simultaneous users.
Pass a reference in the URL and use some eval hackery to try and refind the model. This is probably stupid as the model could potentially be garbage collected en route.
Use a place-holder app. This seems like a bad idea due to conflicts between multiple users.
Having an invisible form that posts the model when a link is clicked. Again very hacky.
Is there a good way of doing this, and if not, is one of these methods more viable than the others?
P.S. In case it helps my app receives data (as a json string) from a pre-existing database, and then caches it locally (i.e. on the webserver) creating an appropriate model and table on the fly. The idea is then to present this data and do various filtering and drill downs on it with-out placing undue strain on the main database (as each query returns a few hundred results out of a database of hundreds of millions of data points.) W.R.T. 3, the tables are named based on a hash of the query and time stamp, however a place-holder app would have a predetermined name.
Thanks,
jhoyla
EDITED TO ADD: Thanks guys, I have now solved this problem. I ended up using both answers together to give a complete answer. As I can only accept one I am going to accept the contenttypes one, sadly I don't have the reputation to give upvotes yet, however if/when I ever do I will endeavor to return and upvote appropriately.
The solution in it's totality,
from django.contrib.contenttypes.models import ContentType
view_a(request):
model = create_model(...)
request.session['model'] = ContentType.objects.get_for_model(model)
...
view_b(request):
ctmodel = request.session.get('model', None)
if not ctmodel:
return Http404
model = ctmodel.model_class()
...
My first thought would be to use content types and to pass the type/model information via the url.
You could also use Django's sessions framework, e.g.
def view_a(request):
your_model = request.session.get('your_model', None)
if type(your_model) == YourModel
your_model.name = 'something_else'
request.session['your_model'] = your_model
...
def view_b(request):
your_model = request.session.get('your_model', None)
...
You can store almost anything in the session dictionary, and managing it is also easy:
del request.session['your_model']
Can or should I ever do this in a view?
a = SomeTable.objects.all()
for r in a:
if r.some_column == 'foo':
r.some_column = 'bar'
It worked like a champ, but I tried a similar thing somewhere else and I was getting strange results, implying that QuerySet objects don't like to be trifled with. And, I didn't see anything in the docs good or bad for this sort of trick.
I know there are other ways to do this, but I'm specifically wanting to know if this is a bad idea, why it's bad, and if it is indeed bad, what the 'best' most django/pythonic way to change values on the fly would be.
This is fine as long as you don't do anything later that will cause the queryset to be re-evaluated - for example, slicing it. That will make another query to the database, and all your modified objects will be replaced with fresh ones.
A way to protect yourself against that would be to convert to a list first:
a = list(SomeTable.objects.all())
This way, further slicing etc won't cause a fresh db call, and any modifications will be preserved.
Yup. See docs here
SomeTable.objects.filter(some_column='foo').update(some_column='bar')
I would go with Django's idiom. It executes the SQL with a single statement with 'where' and 'update' rather sending multiple SQL statements like your code would. This saves time. Check with Django's 'connection' to test SQL time.
If I am creating a list of new model objects based on some form input, e.g.,
new_items = []
for name, value in self.cleaned_data.items():
if name.startswith('content_item_'):
new_items.append(ContentItem(item=value))
# can I add the entire new_items list to the database in one swoop?
I'm having trouble finding whether this in the docs, which generally refer to creating objects one at a time via the .save() method. But one-at-a-time seems inefficient when you have a whole list of objects to add.
Thanks!
https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create
Edit: Unfortunately this is not on 1.3
Original Answer
Thank god for bulk_create!
You could then do something like this:
ContentItem.objects.bulk_create(new_items)
For those too lazy to click the link, here is the example from the docs:
>>> Entry.objects.bulk_create([
... Entry(headline="Django 1.0 Released"),
... Entry(headline="Django 1.1 Announced"),
... Entry(headline="Breaking: Django is awesome")
... ])
I believe Brandon Konkle's reply to a similar question is still valid: Question about batch save objects in Django
In summary: Sadly, no, you'll have to use django.db.cursor with a manual query to do so. If the dataset is small, or the performance is of less importance though, looping through isn't really THAT bad, and is the simplest solution.
Also, see this ticket: https://code.djangoproject.com/ticket/661
How can I prevent Django, for testing purposes, from automatically fetching related tables not specified in the select_related() call during the intial query?
I have a large application where I make significant use of
select_related() to bring in related model data during each original
query. All select_related() calls are used to specify the specific related models, rather than relying on the default, e.g. select_related('foo', 'bar', 'foo__bar')
As the application has grown, the select_related calls haven't
completely kept up, leaving a number of scenarios where Django happily
and kindly goes running off to the database to fetch related model
rows. This significantly increases the number of database hits, which
I obviously don't want.
I've had some success in tracking these down by checking the queries
generated using the django.db.connection.queries collection, but some
remain unsolved.
I've tried to find a suitable patch location in the django code to raise an
exception in this scenario, making the tracking much easier, but tend
to get lost in the code.
Thanks.
After some more digging, I've found the place in the code to do this.
The file in question is django/db/models/fields/related.py
You need to insert two lines into this file.
Locate class "SingleRelatedObjectDescriptor". You need to change the function __get__() as follows:
def __get__(self, instance, instance_type=None):
if instance is None:
return self
try:
return getattr(instance, self.cache_name)
except AttributeError:
raise Exception("Automated Database Fetch on %s.%s" % (instance._meta.object_name, self.related.get_accessor_name()))
# leave the old code here for when you revert!
Similarly, in class "ReverseSingleRelatedObjectDescriptor" further down the code, you again need to change __get__() to:
def __get__(self, instance, instance_type=None):
if instance is None:
return self
cache_name = self.field.get_cache_name()
try:
return getattr(instance, cache_name)
except AttributeError:
raise Exception("Automated Database Fetch on %s.%s" % (instance._meta.object_name, self.field.name))
# BEWARE: % parameters are different to previous class
# leave old code here for when you revert
Once you've done this, you'll find that Django raises an exception every time it performs an automatic database lookup. This is pretty annoying when you first start, but it will help you track down those pesky database lookups. Obviously, when you've found them all, it's probably best to revert the database code back to normal. I would only suggest using this during a debugging/performance investigation phase and not in the live production code!
So, you're asking how to stop a method from doing what it's specifically designed to do? I don't understand why you would want to do that.
However, one thing to know about select_related is that it doesn't automatically follow relationships which are defined as null=True. So if you can set your FKs to that for now, the relationship won't be followed.