How can I manually modify models retrieved from the database in Django? - django

I wish to do something such as the following:
people = People.objects.filter(date=date)
person = people[0]
person['salary'] = 45000
The last line results in an error:
object does not support item assignment
To debug something like this I always find it easier to start with something working and modify line by line until something breaks.
I want to modify the object for rendering in the template. If I try:
person.salary = 45000
There is no error but trying
print person.salary
Immediately afterwards results in the original value being printed. Update:
In my code I was actually doing:
people[0].salary = 45000
Which doesn't work. For some reason
person = people[0]
person.salary = 45000
Does work. I thought the two pieces of code would be exactly the same

person is a model instance, not a dictionary, so set the attribute and then save it:
person.salary = 45000
person.save()
You should read How to work with models.

Looking at the IDs, it seems that every people[0] lookup returns a new object, so the variable you assigned and a later people[0] are different instances:
In [11]: people = People.objects.filter(salary=100)
In [12]: person = people[0]
In [13]: person.salary = 5000
In [14]: print person.salary
5000
In [15]: people[0].salary
Out[15]: 100
In [16]: id(people[0])
Out[16]: 35312400
In [17]: id(person)
Out[17]: 35313104
So, let's look at what happens in depth.
You know that in Django QuerySets are evaluated only when you need their results (lazy evaluation). To quote the Django documentation:
Slicing. As explained in Limiting QuerySets, a QuerySet can be sliced,
using Python’s array-slicing syntax. Slicing an unevaluated QuerySet
usually returns another unevaluated QuerySet, but Django will execute
the database query if you use the “step” parameter of slice syntax,
and will return a list. Slicing a QuerySet that has been evaluated
(partially or fully) also returns a list.
In particular, looking at the 'django.db.models.query' source code,
def __getitem__(self, k):
    """
    Retrieves an item or slice from the set of results.
    """
    # some stuff here ...
    if isinstance(k, slice):
        qs = self._clone()
        if k.start is not None:
            start = int(k.start)
        else:
            start = None
        if k.stop is not None:
            stop = int(k.stop)
        else:
            stop = None
        qs.query.set_limits(start, stop)
        return k.step and list(qs)[::k.step] or qs
    qs = self._clone()
    qs.query.set_limits(k, k + 1)
    return list(qs)[0]
You can see that indexing or slicing a QuerySet calls the __getitem__ method.
self._clone() then gives you a different instance of the same QuerySet, and list(qs)[0] runs the query again and builds a new model object. This is the reason you are getting different results.
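The effect can be reproduced without Django. The following is a minimal sketch, not Django's actual implementation: FakeQuerySet and Person are hypothetical stand-ins, and the _rows list plays the role of the database table. Each index access builds a new object from the stored row, which is roughly what an unevaluated QuerySet does when it re-runs the query.

```python
class Person:
    """Hypothetical stand-in for a model instance."""

    def __init__(self, salary):
        self.salary = salary


class FakeQuerySet:
    """Toy imitation of an unevaluated QuerySet (not Django code)."""

    def __init__(self, rows):
        # Plays the role of the database table.
        self._rows = rows

    def __getitem__(self, k):
        # Like list(qs)[0] in the quoted source: every lookup "re-queries"
        # and builds a brand-new object from the stored row.
        return Person(**self._rows[k])


people = FakeQuerySet([{"salary": 100}])

# The assignment lands on a temporary object that is discarded at once.
people[0].salary = 45000
lost = people[0].salary

# Keeping a reference preserves the change on that one object.
person = people[0]
person.salary = 45000
kept = person.salary

print(lost, kept)
```

Here lost still holds the stored value while kept holds the new one. In real Django code, person.save() would still be needed to write the change back to the database.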

The object-relational mapping that Django models provide hides the fact that you are interacting with a DB by providing an object-oriented interface for retrieving and manipulating data.
Unfortunately, the ORM abstraction is not perfect, and there are various cases where ORM semantics do not match intuition. In these cases you need to investigate what is going on in the underlying SQL layer to find the cause of the trouble.
Your problem arises from the fact that:
people = People.objects.filter(date=date)
does not execute any SQL query.
people[0]
executes SELECT a, b, c, .. FROM T WHERE filter. If you then modify the resulting object by calling:
people[0].salary = 45000
the modification won't be saved to the DB, because the save() method was not called. The next call to people[0] again executes a SQL query, which does not return the unsaved modification.
When you encounter problems like this, Django Debug Toolbar can greatly help to identify which statements execute what SQL queries.

Related

Queryset in Django if empty field returns all elements

I want to filter a queryset in Django based on a form submission.
If the user types a value for Var, the query should filter on that value; if the field is left blank, it should return all the elements.
How can I do that?
I am new to Django.
if request.GET.get('Var'):
    Var = request.GET.get('Var')
else:
    Var = WHAT SHOULD I PUT HERE TO FILTER ALL THE ELEMENTS IN THE CODE BELOW
models.objects.filter(Var=Var)
It's not a great idea from a security standpoint to allow users to input data directly into search terms (and should DEFINITELY not be done for raw SQL queries if you're using any of those.)
With that note in mind, you can take advantage of more dynamic filter creation using a dictionary syntax, or revise the queryset as it goes along:
Option 1: Dictionary Syntax
def my_view(request):
    query = {}
    if request.GET.get('Var'):
        query['Var'] = request.GET.get('Var')
    if request.GET.get('OtherVar'):
        query['OtherVar'] = request.GET.get('OtherVar')
    if request.GET.get('thirdVar'):
        # Say you wanted to add in some further processing
        thirdVar = request.GET.get('thirdVar')
        if int(thirdVar) > 10:
            query['thirdVar'] = 10
        else:
            query['thirdVar'] = int(thirdVar)
    if request.GET.get('lessthan'):
        lessthan = request.GET.get('lessthan')
        query['fieldname__lte'] = int(lessthan)
    results = MyModel.objects.filter(**query)
If nothing has been added to the query dictionary and it's empty, that'll be the equivalent of doing MyModel.objects.all()
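One way to keep Option 1 tidy is to build the dictionary from an explicit allow-list of parameters. The following is a minimal sketch under assumptions: build_filters and ALLOWED_PARAMS are hypothetical names, and a plain dict stands in for request.GET so the snippet runs without Django.

```python
# Hypothetical mapping from query-string parameter to ORM lookup.
ALLOWED_PARAMS = {
    "Var": "Var",
    "OtherVar": "OtherVar",
    "lessthan": "fieldname__lte",
}


def build_filters(params, allowed=ALLOWED_PARAMS):
    """Return filter kwargs for the non-empty, allow-listed parameters."""
    query = {}
    for name, lookup in allowed.items():
        value = params.get(name)
        if value:  # skips both missing and blank values
            query[lookup] = value
    return query


# A plain dict stands in for request.GET here.
filters = build_filters({"Var": "abc", "OtherVar": "", "evil__regex": ".*"})
empty = build_filters({})
print(filters, empty)
```

In a view, the result would be passed on as MyModel.objects.filter(**build_filters(request.GET)). Parameters outside the allow-list are ignored, and an empty result behaves like MyModel.objects.all().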
My security note from above applies if you wanted to try to do something like this (which would be a bad idea):
MyModel.objects.filter(**request.GET)
Django has a good security track record, but this is less safe than anticipating the types of queries that your users will have. This could also be a huge issue if your schema is known to a malicious site user who could adapt their query syntax to make a heavy query along non-indexed fields.
Option 2: Revising the Queryset
Alternatively, you can start off with a queryset for everything and then filter accordingly
def my_view(request):
    results = MyModel.objects.all()
    if request.GET.get('Var'):
        results = results.filter(Var=request.GET.get('Var'))
    if request.GET.get('OtherVar'):
        results = results.filter(OtherVar=request.GET.get('OtherVar'))
    return results
A simpler and more explicit way of doing this would be:
if request.GET.get('Var'):
    data = models.objects.filter(Var=request.GET.get('Var'))
else:
    data = models.objects.all()

Filter a django QuerySet with the query from another QuerySet: Possible?

Say I create a QuerySet like:
q0 = Thing.objects.all()
fq0 = q0.filter(x=y)
at time t0. Then I add some new things to the Thing table. These things form the QuerySet:
q1 = Thing.objects.filter(created__gt=t0)
I want to generate the QuerySet:
fq = (q0 | q1).filter(x=y)
without having to know what x or y are. In other words, I'd like to be able to do something like this:
fq1 = q1.filter(query=fq0.query)
fq = fq0 | fq1
Is this possible? Manually setting
q1.query = fq0.query
merely sets q1 == fq0. I've seen some people asking about extracting the sql from a queryset, but this won't really help me.
How about something along these lines:
Thing.objects.filter(field__in=Another_Thing.objects.filter())
Django will run this as a single query with a subquery.
Many years later I found a solution.
You can use the __dict__ attribute of your queryset's query. For instance:
o = Model.objects.filter(arg=param)
o2 = Model.objects.all()
o2.query.__dict__ = o.query.__dict__
o2 = o2.filter(arg2=param2)
o2 is now filtered by both arg and arg2!
I used this to pass a ModelChoiceField filter to django-jet autocomplete views (fork at https://github.com/paulorsbrito/django-jet).
Hope this helps someone in trouble.
As far as I can tell from looking through the QuerySet and Query modules, Django doesn't keep a running record of the arguments you send into a queryset. It translates everything directly into lower level query fragments then discards the tokens you've given it. Thus, figuring out how a queryset has been filtered without prior knowledge is a non-trivial task.
You could do it manually by hacking together something like the following:
q0 = Thing.objects.all()
filter_kwargs = {'x': y}
fq0 = q0.filter(**filter_kwargs)
fq0.saved_filter_kwargs = filter_kwargs
##### snip #####
fq1 = q1.filter(**fq0.saved_filter_kwargs)
Kind of nasty though. Probably better to try solving this problem in a different way. I'd recommend you post another question and include what you're trying to achieve in the big picture, and we might be able to help you come up with a better architecture.

Django ORM and hitting DB

When I do something like
I. objects = Model.objects.all()
and then
II. objects.filter(field_1=some_condition)
I hit the DB every time I run step II with a different condition. Is there any way to fetch all the data in the first step and then work only with that result?
You actually don't hit the DB until you evaluate the queryset, because queries are lazy.
Read more here.
edit:
After re-reading your question it becomes apparent you were asking how to prevent db hits when filtering for different conditions.
qs = SomeModel.objects.all()
qs1 = qs.filter(some_field='some_value')
qs2 = qs.filter(some_field='some_other_value')
Usually you would want the database to do the filtering for you.
You could force an evaluation of the qs by converting it to a list. This would prevent further db hits, however it would likely be worse than having your db return you results.
qs_l = list(qs)
qs1_l = [element for element in qs_l if element.some_field == 'some_value']
qs2_l = [element for element in qs_l if element.some_field == 'some_other_value']
Of course you will hit db every time. filter() transforms to SQL statement which is executed by your db, you can't filter without hitting it. So you can retrieve all the objects you need with values() or list(Model.objects.all()) and, as zeekay suggested, use Python expressions (like list comprehensions) for additional filtering.
Why don't you just do objs = Model.objects.filter(field=condition)? That said, once the SQL query is executed you can use Python expressions to do further filtering/processing without incurring additional database hits.
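The in-Python approach can be sketched as follows. This is an illustration under assumptions: Row is a hypothetical stand-in for model instances, and the rows list stands in for list(Model.objects.all()), so it runs without a database. Grouping the rows once avoids rescanning the list for every condition.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for model instances fetched with one query,
# i.e. the result of list(Model.objects.all()).
Row = namedtuple("Row", ["id", "some_field"])
rows = [
    Row(1, "some_value"),
    Row(2, "some_other_value"),
    Row(3, "some_value"),
]

# Group once in Python; later lookups need no further database hits.
by_field = defaultdict(list)
for row in rows:
    by_field[row.some_field].append(row)

qs1_l = by_field["some_value"]
qs2_l = by_field["some_other_value"]
print([r.id for r in qs1_l], [r.id for r in qs2_l])
```

This only pays off when the whole table comfortably fits in memory; otherwise letting the database filter is usually the better choice, as noted above.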

QuerySet subscripting won't work as expected

When you get a QuerySet from filter(), it is tempting to use the following code to change and save a record:
qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
last_login_time = qs[0].last_login_time
qs[0].last_login_time = datetime.now() # I expect it can assign the new value, but it won't
assertEquals(qs[0].last_login_time, last_login_time) # YES, it doesn't change
qs[0].save() #So it won't update the old record
And after figuring this out, the following code will be used instead and it works:
qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
obj = qs[0]
last_login_time = obj.last_login_time
obj.last_login_time = datetime.now() # I expect it to assign the new value, and it does
assertNotEquals(obj.last_login_time, last_login_time) # YES, it does change
obj.save() #So it will update the old record as expected
I have seen some of my friends and colleagues use the first approach to update records. IMO it is natural and easy to reach for (qs[0] and obj have the same type).
After reading the code (db.models.query), the reason becomes clear: subscripting the QuerySet runs qs = self._clone() and a new query, so the assignment lands on a throwaway object and changes nothing.
Possible solutions:
make assignment work for a subscripted QuerySet
announce that the first approach above is wrong and let the users know it
So I want to ask:
Is this a real issue for Django? (I'm wondering why the Django developers did not make it work as expected.)
What's your suggestion about this issue? And what's your preferred way of handling it?
Use update() for updating fields in a queryset.
I'm not really sure what you're asking here. Are you saying this is a bug? I don't think so, it's clearly defined behaviour: the queryset is lazy, but is evaluated when you iterate or slice it. Each time you do slice it, you get a new object. This is the logical consequence of the fact that slicing by itself doesn't cause the non-sliced queryset to be evaluated - if the result isn't already cached, slicing will perform a single database call with a LIMIT 1 to only get one result. Otherwise, you're left with extremely undesirable side-effects.
Now, if you think this could be better explained in the docs, you're welcome - and encouraged - to submit a bug with a patch that explains it better.

fast lookup for the last element in a Django QuerySet?

I've a model called Valor. Valor has a Robot. I'm querying like this:
Valor.objects.filter(robot=r).reverse()[0]
to get the last Valor for the robot r. Valor.objects.filter(robot=r).count() is about 200000 and getting the last item takes about 4 seconds on my PC.
How can I speed it up? Am I querying the wrong way?
The optimal MySQL syntax for this problem would be something along the lines of:
SELECT * FROM table WHERE x=y ORDER BY z DESC LIMIT 1
The django equivalent of this would be:
Valor.objects.filter(robot=r).order_by('-id')[:1][0]
Notice how this solution utilizes django's slicing method to limit the queryset before compiling the list of objects.
If none of the earlier suggestions are working, I'd suggest taking Django out of the equation and running this raw SQL against your database. I'm guessing at your table names, so you may have to adjust accordingly:
SELECT * FROM valor v WHERE v.robot_id = [robot_id] ORDER BY id DESC LIMIT 1;
Is that slow? If so, make your RDBMS (MySQL?) explain the query plan to you. This will tell you if it's doing any full table scans, which you obviously don't want with a table that large. You might also edit your question and include the schema for the valor table for us to see.
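Checking the query plan can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: it uses SQLite rather than MySQL, and the valor table layout and the idx_valor_robot index are made up for illustration. Plan output differs between database engines, so the point is the workflow rather than the exact text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real valor table will differ.
conn.execute(
    "CREATE TABLE valor (id INTEGER PRIMARY KEY, robot_id INTEGER, value REAL)"
)
conn.execute("CREATE INDEX idx_valor_robot ON valor (robot_id)")
conn.executemany(
    "INSERT INTO valor (robot_id, value) VALUES (?, ?)",
    [(i % 3, float(i)) for i in range(1000)],
)

sql = (
    "SELECT id, robot_id, value FROM valor "
    "WHERE robot_id = ? ORDER BY id DESC LIMIT 1"
)

# Fetch the newest row for robot 1.
last_row = conn.execute(sql, (1,)).fetchone()

# Ask the engine how it would run the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
for step in plan:
    print(step)
print(last_row)
```

A plan step that reports a full scan of valor instead of an index lookup would point at a missing index on robot_id.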
Also, you can see the SQL that Django is generating by doing this (using the query set provided by Peter Rowell):
qs = Valor.objects.filter(robot=r).order_by('-id')[:1]  # [:1] keeps a QuerySet; [0] would return a model instance with no .query
print(qs.query)
Make sure that SQL is similar to the 'raw' query I posted above. You can also make your RDBMS explain that query plan to you.
It sounds like your data set is going to be big enough that you may want to denormalize things a little bit. Have you tried keeping track of the last Valor object in the Robot object?
class Robot(models.Model):
    # ...
    # on_delete is a required argument since Django 2.0
    last_valor = models.ForeignKey('Valor', null=True, blank=True, on_delete=models.SET_NULL)
And then use a post_save signal to make the update.
from django.db.models.signals import post_save
def record_last_valor(sender, **kwargs):
    if kwargs.get('created', False):
        instance = kwargs.get('instance')
        instance.robot.last_valor = instance
        # Persist the change; without save() the assignment is lost
        instance.robot.save()
post_save.connect(record_last_valor, sender=Valor)
You will pay the cost of an extra db transaction when you create the Valor objects but the last_valor lookup will be blazing fast. Play with it and see if the tradeoff is worth it for your app.
Well, there's no order_by clause so I'm wondering about what you mean by 'last'. Assuming you meant 'last added',
Valor.objects.filter(robot=r).order_by('-id')[0]
might do the job for you.
django 1.6 introduces .first() and .last():
https://docs.djangoproject.com/en/1.6/ref/models/querysets/#last
So you could simply do:
Valor.objects.filter(robot=r).last()
This should also be quite fast:
qs = Valor.objects.filter(robot=r) # <-- it doesn't hit the database
count = qs.count() # <-- first hit the database, compute a count
last_item = qs[ count-1 ] # <-- second hit the database, get specified rownum
So, in practice you execute only 2 SQL queries ;)
Model_Name.objects.first()  # get the first element
Model_Name.objects.last()  # get the last element
In my case last() did not work because there was only one row in the database.
Maybe this is helpful for you too :)
Is there a limit clause in django? This way you can have the db, simply return a single record.
mysql
select * from table where x = y limit 1
sql server
select top 1 * from table where x = y
oracle
select * from table where x = y and rownum = 1
I realize this isn't translated into django, but someone can come back and clean this up.
The correct way of doing this is to use the built-in QuerySet method latest(), passing it the column (field name) it should sort by. The drawback is that it can only sort by a single db column.
The implementation at the time of writing looks like this and is optimized in the same sense as @Aaron's suggestion.
def latest(self, field_name=None):
    """
    Returns the latest object, according to the model's 'get_latest_by'
    option or optional given field_name.
    """
    latest_by = field_name or self.model._meta.get_latest_by
    assert bool(latest_by), "latest() requires either a field_name parameter or 'get_latest_by' in the model"
    assert self.query.can_filter(), \
        "Cannot change a query once a slice has been taken."
    obj = self._clone()
    obj.query.set_limits(high=1)
    obj.query.clear_ordering()
    obj.query.add_ordering('-%s' % latest_by)
    return obj.get()