Django: Query with F() into an object not behaving as expected - django

I am trying to navigate into the Price model to compare prices, but I am met with an unexpected result.
My model:
class ProfitableBooks(models.Model):
    price = models.ForeignKey('Price', primary_key=True)
In my view:
foo = ProfitableBooks.objects.filter(price__buy__gte=F('price__sell'))
Producing this error:
'ProfitableBooks' object has no attribute 'sell'

Is this your actual model or a simplification? I think the problem may lie in having a model whose only field is a primary key that is also a foreign key. If I try to parse that out, it seems to imply that the model is essentially a field acting as a proxy for a queryset: you could never have more profitable books than prices, because of the nature of primary keys. It would also seem to mean that your elided books must have no overlap in prices, due to the implied uniqueness constraints.
If I understand correctly, you're trying to compare two values in another model, price.buy vs. price.sell, to determine whether this unpictured Book model is profitable. While I'm not sure exactly how the F() object breaks down here, my intuition is that F() is intended to facilitate efficient querying and updating where you compare or adjust a model value based on another value in the database. It may not be equipped to deal with a 'shell' model like this, which has no fields except a joint primary/foreign key, where the comparison involves two values that are both external to the model the query is conducted from (and also distinct from the Book model, which I presume holds the identifying info about books).
The documentation says you can use a join in an F() object as long as you are filtering and not updating, and I assume your Price model has buy and sell fields, so it seems to qualify. So I'm not 100% sure where this breaks down behind the scenes. But from a practical perspective, if you want to accomplish exactly the result implied here, you could just do a simple query on your Price model, because, again, there's no distinct data in the ProfitableBooks model (it only returns prices), and you're also implying that each price.buy and price.sell pair has exactly one corresponding book. So Price.objects.filter(buy__gte=F('sell')) gives the result you've requested in your snippet.
If you want results that are Book objects, you should do a query like the one you've got here, but starting from your Book model instead. You could put that query in a custom manager method called profitable_books or something, if you wanted to give it a permanent home.
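A minimal sketch of that manager approach, assuming a Book model with a title field and a foreign key to Price (these names are assumptions, not taken from the question):

from django.db import models
from django.db.models import F

class BookManager(models.Manager):
    def profitable_books(self):
        # Follow the FK into Price and compare its two columns in SQL.
        return self.filter(price__buy__gte=F('price__sell'))

class Book(models.Model):
    title = models.CharField(max_length=200)
    price = models.ForeignKey('Price', on_delete=models.CASCADE)

    objects = BookManager()

Book.objects.profitable_books() then returns Book instances rather than prices.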


What is the most Django-appropriate way to combine multiple database columns into one model field?

I have several times come across the need for a Django model field that comprises multiple database columns, and I am wondering what the most Django-appropriate way to build one would be.
Three use cases come specifically to mind.
I want to provide a field that wraps another field, keeping record of whether the wrapped field has been set or not. A use case for this particular field would be for dynamic configuration. A new configuration value is introduced, and a view marks itself as dependent upon a configuration value, redirecting if the value isn't set. Storing whether it's been set yet or not allows for easy indefinite caching of the state. This also lets the configuration value itself be not-nullable, and the application can ignore any value it might have when unset.
I want to provide a money field that combines a decimal (or integer) value, and a currency.
I want to provide a file field with a link to some manner of access rule to determine whether the request should include it/a request for it should succeed.
For each of these use cases there exists a workaround, but each seems less elegant.
Define the configuration fields as nullable. This is undesirable for a few reasons: it removes the validity of NULL as a value for the configuration itself, so tristates and other valid use cases for NULL have to become a pair of fields, a different data type, or an edge case; null=True on the fields allows them to be set back to None in ModelForms and the admin unless you write a custom form field for them every time; and every nullable column in a database is arguably bad design.
Define the field as a subclass of DecimalField with an argument accepting a string, and use that to contribute another field to the model. (This is what django-money does.) Again, this is undesirable: fields appear "as if by magic" on the model, and configuring the currency field is not obvious.
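For illustration, a rough sketch of that pattern (a simplification, not django-money's actual implementation; migration support via deconstruct() is omitted):

from django.db import models

class MoneyField(models.DecimalField):
    def __init__(self, *args, currency_field_name=None, **kwargs):
        self.currency_field_name = currency_field_name
        super().__init__(*args, **kwargs)

    def contribute_to_class(self, cls, name, **kwargs):
        if not cls._meta.abstract:
            # The second column appears on the model "as if by magic".
            currency_name = self.currency_field_name or '%s_currency' % name
            cls.add_to_class(currency_name, models.CharField(max_length=3, default='XXX'))
        super().contribute_to_class(cls, name, **kwargs)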
Define the combined file+rule field instead as an entire model, and one-to-one to it from the model where you want to have the field. This is a solution to all use cases, but again comes with downsides: there's an extra JOIN required for every instance of the field - one can imagine a User with profile_picture, cv, passport, private_key etc.; there's an implicit requirement to .select_related(*fields) on every query that would ever want to access the fields; and the layout of the related model is going to have cold data interleaved with hot data all over the place given that it's reused everywhere.
In addition to solution 3, there's the option to define a mixin factory that produces the multiple fields with matching names and whatever desired properties and methods. Again, this isn't perfect, because the user ends up with fields defined not only in the model body but also above it, in the inheritance list.
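A minimal sketch of such a factory, using the money use case (all names invented for illustration):

from django.db import models

def money_mixin(prefix):
    # Returns an abstract model defining <prefix> and <prefix>_currency.
    class Meta:
        abstract = True
    attrs = {
        'Meta': Meta,
        '__module__': __name__,
        prefix: models.DecimalField(max_digits=12, decimal_places=2),
        prefix + '_currency': models.CharField(max_length=3, default='XXX'),
    }
    return type(prefix.capitalize() + 'MoneyMixin', (models.Model,), attrs)

class Product(money_mixin('price'), models.Model):
    name = models.CharField(max_length=100)

Product instances get price and price_currency columns, but the field definitions live partly in the inheritance list, as noted.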
I think the main reason this keeps sending me in circles is that custom Django model fields are always defined in terms of a single base field, since customization is done by inheritance.
What is the accepted way to achieve this end?

What are the alternatives to using django signals?

I am working on a project where I need to recalculate values based on whether fields have changed or not. Here is an example:
class Model1(models.Model):
    field_a = models.DateTimeField()
    calculated_field_1 = models.ForeignKey('Model2', on_delete=models.CASCADE)

class Model2(models.Model):
    field_j = models.DateTimeField()
If field_a changes on Model1, I have to recalculate the value of calculated_field_1 to see if it needs to change as well. The calculations that are done require me to query the database, check the values of other models, and then determine whether the value of the calculated field needs to change.
Example: if field_a changes, then I would have to do this calculation:

# instance is the Model1 row whose field_a changed
result = Model2.objects.filter(field_j__gte=instance.field_a)
if result.exists():
    instance.calculated_field_1 = result.first()
    instance.save(update_fields=('calculated_field_1',))
This is the most basic example I could think of and the queries can be much more complicated than this.
The project started out with one calculation when a field changed, so I decided the best approach was to use Django signals. Months later, the requirements for the project have changed, and I have had to implement several other calculations very similar to the example above. I have noticed that my post_save function is getting out of hand, and I am wondering what alternatives there are to using signals. Although the post_save calculations I do now take far less than half a second, for the sake of my question pretend they take a second or more.
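For reference, a minimal sketch of the signal pattern in question (the handler name is made up; note the guard needed so the nested save() does not re-trigger the handler):

from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=Model1)
def recalculate_on_save(sender, instance, update_fields=None, **kwargs):
    # Skip the save issued by this handler itself.
    if update_fields and 'calculated_field_1' in update_fields:
        return
    result = Model2.objects.filter(field_j__gte=instance.field_a)
    if result.exists():
        instance.calculated_field_1 = result.first()
        instance.save(update_fields=('calculated_field_1',))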
A valid answer cannot include doing these calculations on the fly when I pull the values from the db. We use a validation framework that requires me to set these values on the model; querying on the fly is an approach we attempted, but it was not viable for performance reasons. Also, one requirement is that on field change the user needs to see the results of the calculated field, so this has to happen synchronously.
What are some alternative approaches to using this pattern?

Performance: Store likes in PostgreSQL ArrayField (Django example)

I have 2 models, Post and Comment, each of which can be liked by a User.
Naturally, total likes should be rendered somewhere near each Post or Comment.
But each User should also have a page listing all the content they have liked.
The most obvious way is to do it with an m2m field, but that seems like it will lead to lots of problems in the future.
And what about this?
The Post and Comment models would each have a field like:
users_liked_ids = ArrayField(models.IntegerField())
The User model would also have such fields:
posts_liked_ids = ArrayField(models.IntegerField())
comments_liked_ids = ArrayField(models.IntegerField())
And each time a User likes something, two actions are performed:
the User's id is added to the Post's/Comment's users_liked_ids field
the Post's/Comment's id is added to the User's posts_liked_ids/comments_liked_ids field
The questions are:
1) Is it a good plan?
2) Will lookups in this approach be efficient for answering "was this Post/Comment liked by the current user"?
3) Would it be better to store likes in some separate table rather than on the liked model, but still in an ArrayField?
4) Or is it probably better to stay with the obvious m2m?
1) No.
2) Definitely not.
3) Absolutely, incredibly not. Don't split your data up even further.
4) Yes.
Here are some of the problems:
no referential integrity, since you can't create foreign keys on array elements, meaning you could easily have garbage values in an ID array
data duplication with posts having user ids and users having post ids means it's possible for information to get out of sync (what happens when a user or post is deleted?)
inefficient membership lookups in arrays (your #2)
Don't, under any circumstances, do this. You may want to combine your "post" and "comment" models to simplify the relationship, but this is what junction tables are for. Arrays are good for use cases that don't involve foreign keys or the potential for extreme length.
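For comparison, a minimal sketch of the conventional junction-table approach (field and related names are assumptions):

from django.conf import settings
from django.db import models

class Post(models.Model):
    body = models.TextField()
    liked_by = models.ManyToManyField(settings.AUTH_USER_MODEL, related_name='liked_posts', blank=True)

class Comment(models.Model):
    body = models.TextField()
    liked_by = models.ManyToManyField(settings.AUTH_USER_MODEL, related_name='liked_comments', blank=True)

post.liked_by.add(user) records a like, user.liked_posts.all() backs the "liked content" page, and post.liked_by.count() gives the total, with referential integrity enforced by the database throughout.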

Django ORM: Optimizing queries involving many-to-many relations

I have the following model structure:
class Container(models.Model):
    name = models.CharField(max_length=100)

class Generic(models.Model):
    name = models.CharField(max_length=100, unique=True)
    cont = models.ManyToManyField(Container, blank=True)
    # It is possible to have a Generic object not associated with any
    # container; that's why the relation is optional.

class Specific1(Generic):
    ...

class Specific2(Generic):
    ...

...

class SpecificN(Generic):
    ...
Say I need to retrieve all Specific-type objects that have a relationship with a particular Container.
The SQL for that is more or less trivial, but that is not the question. Unfortunately, I am not very experienced at working with ORMs (Django's ORM in particular), so I might be missing a pattern here.
When done in a brute-force manner:

c = Container.objects.get(name='somename')  # this gets me the container
items = c.generic_set.all()
# This gets me all Generic objects that are related to the container.
# Now what? I need to get to the actual Specific objects, so I need to somehow
# get the type of the underlying Specific object and fetch it.
for item in items:
    spec = getattr(item, item.get_my_specific_type())

this results in a ton of db hits (one for each Generic record that relates to the Container), so this is obviously not the way to do it. Now, it could perhaps be done by getting the SpecificX objects directly:
s = Specific1.objects.filter(cont__name='somename')
# This gets me all Specific1 objects for the specified container
...
# do it for every Specific type
that way the db will be hit once for each Specific type (acceptable, I guess).
I know that .select_related() doesn't work with m2m relationships, so it is not of much help here.
To reiterate, the end result has to be a collection of SpecificX objects (not Generic).
I think you've already outlined the two easy possibilities. Either you do a single filter query against Generic and then cast each item to its Specific subtype (results in n+1 queries, where n is the number of items returned), or you make a separate query against each Specific table (results in k queries, where k is the number of Specific types).
It's actually worth benchmarking to see which of these is faster in reality. The second seems better because it's (probably) fewer queries, but each one of those queries has to perform a join with the m2m intermediate table. In the former case you only do one join query, and then many simple ones. Some database backends perform better with lots of small queries than fewer, more complex ones.
If the second is actually significantly faster for your use case, and you're willing to do some extra work to clean up your code, it should be possible to write a custom manager method for the Generic model that "pre-fetches" all the subtype data from the relevant Specific tables for a given queryset, using only one query per subtype table; similar to how this snippet optimizes generic foreign keys with a bulk prefetch. This would give you the same queries as your second option, with the DRYer syntax of your first option.
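A rough sketch of what such a manager method might look like (names invented; this assumes multi-table inheritance, where each Specific row shares its parent Generic row's primary key):

from django.db import models

class GenericQuerySet(models.QuerySet):
    def with_specifics(self, *subtypes):
        # One query for the Generic rows plus one per Specific table,
        # instead of one extra query per row.
        items = list(self)
        ids = [item.pk for item in items]
        specific_by_pk = {}
        for subtype in subtypes:
            for obj in subtype.objects.filter(pk__in=ids):
                specific_by_pk[obj.pk] = obj
        return [specific_by_pk.get(item.pk, item) for item in items]

Attached to Generic with objects = GenericQuerySet.as_manager(), this could be called as Generic.objects.filter(cont__name='somename').with_specifics(Specific1, Specific2).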
Not a complete answer, but you can avoid a great number of hits by doing this:
items = list(items)
for item in items:
    spec = getattr(item, item.get_my_specific_type())
instead of this:
for item in items:
    spec = getattr(item, item.get_my_specific_type())
Indeed, by forcing a cast to a Python list, you force the Django ORM to load all the elements in your queryset, which it does in one query.
I accidentally stumbled upon the following post, which pretty much answers your question:
http://lazypython.blogspot.com/2008/11/timeline-view-in-django.html

Sorting Records using Foreign Keys

From my question at Get Foreign Key Value, I managed to get the desired output...only one last bit remains. I want to sort my records by the year, make, then model in that order. I thought it'd be as simple as Vehicle.objects.all().order_by('common_vehicle') but this doesn't sort anything.
You have to order by specific fields in the related class. You do this by using the double-underscore format. So, for example:
Vehicle.objects.order_by('common_vehicle__year', 'common_vehicle__series__model__model')
to sort by the year value of the CommonVehicle class, then the model value of the Model class which is related via the Series class.
Note that this is a lot of joins, and could make your query performance quite slow. It may be fine for your needs, but just a heads-up that this is a potential source of slowness down the line.
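For context, the lookup path implies a chain of related models roughly like the following (field names are guesses based on the question; the question's Model class is renamed VehicleModel here to avoid clashing with django.db.models.Model):

from django.db import models

class VehicleModel(models.Model):  # the "Model" class from the question
    model = models.CharField(max_length=50)

class Series(models.Model):
    model = models.ForeignKey(VehicleModel, on_delete=models.CASCADE)

class CommonVehicle(models.Model):
    year = models.IntegerField()
    series = models.ForeignKey(Series, on_delete=models.CASCADE)

class Vehicle(models.Model):
    common_vehicle = models.ForeignKey(CommonVehicle, on_delete=models.CASCADE)

Each double underscore in order_by('common_vehicle__series__model__model') follows one of these relations, which is where the joins come from.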