Say I have a model that is
class Bottles(models.Model)
BottleCode = models.IntegerField()
class Labels(models.Model)
LabelCode = models.IntegerField()
How do I get a queryset of Bottles where the BottleCode and LabelCode are equal? (i.e. Bottles and Labels with no common Code are excluded)
It can be achieved via extra():
Bottles.objects.extra(where=["Bottles.BottleCode in (select Labels.LabelCode from Labels)"])
You may also need to add an app name prefix to the table names, e.g. app_bottles instead of bottles.
Though #danihp has a point here, if you would often encounter queries like these, when you are trying to relate unrelated models - you should probably think about changing your model design.
Related
I have three models, let's take an imaginary example:
class Entity(models.Model):
name = models.CharField()
class EntityAssociation(models.Model):
buddy1 = models.ForeignKey(Entity, related_name='+')
buddy2 = models.ForeignKey(Entity, related_name='+')
class EntityPhoto(models.Model):
entity = models.ForeignKey(Entity, null=True)
association = models.ForeignKey(EntityAssociation, null=True)
title = ...
We have some people (Entity), that can share personal photos of themselves. We also have some relations between entities (represented by EntityAssociation) that can also share photos of them together.
For a single entity, I can retrieve all the photo associated to an entity, either directly or through an association, doing so:
obj = Entity.objects.last()
EntityPhoto.objects.filter(
Q(entity=obj) | Q(association__buddy1=obj) | Q(association__buddy2=obj)
)
What I want is being able to prefetch all the photos of a set of entities selected. A typical use-case would be:
for entity in Entity.objects.all().prefetch(???):
print(entity.name, 'has', len(entity.photos_prefetched), 'photos')
print([x.title for x in entity.photos_prefetched])
And this should be returning all the photos. A solution with three queries (Entity listing, prefetch through entity, prefetch through association attr ; two would be perfect) would satisfy me but the more important is to be able to iterate through a single list, on each entity
I tried to look the internal code of Prefetch but it looks like a prefetch is tied to a lookup, plus I don't know how to make the Q query in this case (what should be the right operand in Q(entity__in=...)?)
Notice: The point here is not about refactoring the database structure (EntityAssociation is used for a plenty of other things so it can't be reduced to a M2M of EntityPhoto for example.) but optimizing this specific use-case, if possible.
I'm currently playing a bit with Prefetch. In my opinion it's in a pretty good shape and quite powerful.
My approach to your problem would probably be something like:
entities = Entity.objects.all().prefetch(Prefetch(
'entity_photo__set',
EntityPhoto.objects.filter(
Q(entity=obj) | Q(association__buddy1=obj) | Q(association__buddy2=obj
)
to_attr="photos_prefetched",
))
You can then access those photos with for entity in entities: entity.photos_prefetched.
If this is not working, the problem might be that you're referencing back (Q(entity=obj)) to the actual entity. Not sure if this is properly possible. I'm having trouble with references back to the object, might be a bug in Django.
I've got django 1.8.5 and Python 3.4.3, and trying to create a subquery that constrains my main data set - but the subquery itself (I think) needs a join in it. Or maybe there is a better way to do it.
Here's a trimmed down set of models:
class Lot(models.Model):
lot_id = models.CharField(max_length=200, unique=True)
class Lot_Country(models.Model):
lot = models.ForeignKey(Lot)
country = CountryField()
class Discrete(models.Model):
discrete_id = models.CharField(max_length=200, unique=True)
master_id = models.ForeignKey(Inventory_Master)
location = models.ForeignKey(Location)
lot = models.ForeignKey(Lot)
I am filtering on various attributes of Discrete (which is discrete supply) and I want to go "up" through Lot, over the Lot_Country, meaning "I only want to get rows from Discrete if the Lot associated with that row has an entry in Lot_Country for my appropriate country (let's say US.)
I've tried something like this:
oklots=list(Lot_Country.objects.filter(country='US'))
But, first of all that gives me the str back, which I don't really want (and changed it to be lot_id, but that's a hack.)
What's the best way to constrain Discrete through Lot and over to Lot_Country? In SQL I would just join in the subquery (or even in the main query - maybe that's what I need? I guess I don't know how to join up to a parent then down into that parent's other child...)
Thanks in advance for your help.
I'm not sure what you mean by "it gives me the str back"... Lot_Country.objects.filter(country='US') will return a queryset. Of course if you print it in your console, you will see a string.
I also think your models need refactoring. The way you have currently defined it, you can associate multiple Lot_Countrys with one Lot, and a country can only be associated with one lot.
If I understand your general model correctly that isn't what you want - you want to associate multiple Lots with one Lot_Country. To do that you need to reverse your foreign key relationship (i.e., put it inside the Lot).
Then, for fetching all the Discrete lots that are in a given country, you would do:
discretes_in_us = Discrete.objects.filter(lot__lot_country__country='US')
Which will give you a queryset of all Discretes whose Lot is in the US.
So first off, I want to clarify that I am trying to make One-To-Many relationships, not Many-to-One. I already understand how ForeignKeys work.
For the sake of the discussion, I've simplified my models; they're much more field-rich than this in the real implementation.
I have a model, called a ColumnDefinition:
class ColumnDefinition(Model):
column_name = CharField(max_length=32)
column_type = PositiveSmallIntegerField()
column_size = PositiveSmallIntegerField(null=True, blank=True)
I think have a registry. Each registry has a separate set of columns for it's input and output definition. I've put the theoretical "OneToManyField" in there to demonstrate what I'm trying to do.
class Registry(Model):
input_dictionary = OneToManyField(ColumnDefinition)
output_dictionary = OneToManyField(ColumnDefinition)
created_date = DateTimeField(auto_now_add=True, editable=False)
A ColumnDefinition is only ever related to one Registry ever. So it's not a Many to Many relationship. If I put a ForeignKey on the ColumnDefinition instead to create a reverse relationship, it can only create a single reverse, whereas I need both an input and output reverse.
I don't want to have to do anything kludgey like adding a "column_registry_type" field onto ColumnDefinition if I can get around it.
Does anyone have any good ideas on how to solve this problem?
Thanks!
You can add two ForeignKeys on ColumnDefinition, one for input and one for output, and give them separate related_names:
class ColumnDefinition(Model):
...
input_registry = models.ForeignKey(Registry, related_name='input_columns')
output_registry = models.ForeignKey(Registry, related_name='output_columns')
You can then access the set of columns like registry.input_columns.
You can and should define two different ForeignKey fields on ColumnDefinition. Just make sure to specify a related_name value for at least one of them.
https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.ForeignKey.related_name
I have a group of related companies that share items they own with one-another. Each item has a company that owns it and a company that has possession of it. Obviously, the company that owns the item can also have possession of it. Also, companies sometimes permanently transfer ownership of items instead of just lending it, so I have to allow for that as well.
I'm trying to decide how to model ownership and possession of the items. I have a Company table and an Item table.
Here are the options as I see them:
Inventory table with entries for each Item - Company relationship. Has a company field pointing to a Company and has Boolean fields is_owner and has_possession.
Inventory table with entries for each Item. Has an owner_company field and a possessing_company field that each point to a Company.
Two separate tables: ItemOwner and ItemHolder**.
So far I'm leaning towards option three, but the tables are so similar it feels like duplication. Option two would have only one row per item (cleaner than option one in this regard), but having two fields on one table that both reference the Company table doesn't smell right (and it's messy to draw in an ER diagram!).
Database design is not my specialty (I've mostly used non-relational databases), so I don't know what the best practice would be in this situation. Additionally, I'm brand new to Python and Django, so there might be an obvious idiom or pattern I'm missing out on.
What is the best way to model this without Company and Item being polluted by knowledge of ownership and possession? Or am I missing the point by wanting to keep my models so segregated? What is the Pythonic way?
Update
I've realized I'm focusing too much on database design. Would it be wise to just write good OO code and let Django's ORM do it's thing?
Is there a reason why you don't want your item to contain the relationship information? It feels like the owner and possessor are attributes of the item.
class Company(models.Model):
pass
class Item(models.Model):
...
owner = models.ForeignKey(Company, related_name='owned_items')
holder = models.ForeignKey(Company, related_name='held_items')
Some examples:
company_a = Company.objects.get(pk=1)
company_a.owned_items.all()
company_a.held_items.all()
items_owned_and_held_by_a=Items.objects.filter(owner=company_a, holder=company_a)
items_on_loan_by_a=Items.objects.filter(owner=company_a).exclude(holder=company_a)
#or
items_on_loan_by_a=company_a.owned_items.exclude(holder=company_a)
items_a_is_borrowing=Items.objects.exclude(owner=company_a).filter(holder=company_a)
#or
items_a_is_borrowing=company_a.held_items.exclude(owner=company_a)
company_b = Company.objects.get(pk=2)
items_owned_by_a_held_by_b=Items.objects.filter(owner=company_a, holder=company_b)
#or
items_owned_by_a_held_by_b=company_a.owned_items.filter(holder=company_b)
#or
items_owned_by_a_held_by_b=company_b.held_items.filter(owner=company_a)
I think if your items are only owned by a single company and held by a single company, a separate table shouldn't be needed. If the items can have multiple ownership or multiple holders, a m2m table through an inventory table would make more sense.
class Inventory(models.Model):
REL = (('O','Owns'),('P','Possesses'))
item = models.ForeignKey(Item)
company = models.ForeignKey(Company)
relation = models.CharField(max_length=1,choices=REL)
Could be one implementation, instead of using booleans. So I'd go for the first. This could even serve as an intermediate table if you ever decide to use a 'through' to relate items to company like this:
Company:
items = models.ManyToManyField(Item, through=Inventory)
Option #1 is probably the cleanest choice. An Item has only one owner company and is possessed by only one possessing company.
Put two FK to Company in Item, and remember to explicitly define the related_name of the two inverses to be different each other.
As you want to avoid touching the Item model, either add the FKs from outside, like in field.contribute_to_class(), or put a new model with a one-to-one rel to Item, plus the foreign keys.
The second method is easier to implement but the first will be more natural to use once implemented.
I'm curious if there's any way to do a query in Django that's not a "SELECT * FROM..." underneath. I'm trying to do a "SELECT DISTINCT columnName FROM ..." instead.
Specifically I have a model that looks like:
class ProductOrder(models.Model):
Product = models.CharField(max_length=20, promary_key=True)
Category = models.CharField(max_length=30)
Rank = models.IntegerField()
where the Rank is a rank within a Category. I'd like to be able to iterate over all the Categories doing some operation on each rank within that category.
I'd like to first get a list of all the categories in the system and then query for all products in that category and repeat until every category is processed.
I'd rather avoid raw SQL, but if I have to go there, that'd be fine. Though I've never coded raw SQL in Django/Python before.
One way to get the list of distinct column names from the database is to use distinct() in conjunction with values().
In your case you can do the following to get the names of distinct categories:
q = ProductOrder.objects.values('Category').distinct()
print q.query # See for yourself.
# The query would look something like
# SELECT DISTINCT "app_productorder"."category" FROM "app_productorder"
There are a couple of things to remember here. First, this will return a ValuesQuerySet which behaves differently from a QuerySet. When you access say, the first element of q (above) you'll get a dictionary, NOT an instance of ProductOrder.
Second, it would be a good idea to read the warning note in the docs about using distinct(). The above example will work but all combinations of distinct() and values() may not.
PS: it is a good idea to use lower case names for fields in a model. In your case this would mean rewriting your model as shown below:
class ProductOrder(models.Model):
product = models.CharField(max_length=20, primary_key=True)
category = models.CharField(max_length=30)
rank = models.IntegerField()
It's quite simple actually if you're using PostgreSQL, just use distinct(columns) (documentation).
Productorder.objects.all().distinct('category')
Note that this feature has been included in Django since 1.4
User order by with that field, and then do distinct.
ProductOrder.objects.order_by('category').values_list('category', flat=True).distinct()
The other answers are fine, but this is a little cleaner, in that it only gives the values like you would get from a DISTINCT query, without any cruft from Django.
>>> set(ProductOrder.objects.values_list('category', flat=True))
{u'category1', u'category2', u'category3', u'category4'}
or
>>> list(set(ProductOrder.objects.values_list('category', flat=True)))
[u'category1', u'category2', u'category3', u'category4']
And, it works without PostgreSQL.
This is less efficient than using a .distinct(), presuming that DISTINCT in your database is faster than a python set, but it's great for noodling around the shell.
Update:
This is answer is great for making queries in the Django shell during development. DO NOT use this solution in production unless you are absolutely certain that you will always have a trivially small number of results before set is applied. Otherwise, it's a terrible idea from a performance standpoint.