Filter for elements using exists through a reverse foreign key relationship - django

A relevant image of my model is here: http://i.stack.imgur.com/xzsVU.png
I need to make a queryset that contains all cats who have an associated person with a role of "owner" and a name of "bob".
The SQL for this would look something like the following:
select * from cat where exists
(select 1 from person inner join role where
person.name="bob" and role.name="owner");
This problem can be solved in two SQL queries with the following Django filters:
people = Person.objects.filter(name="bob", role__name="owner")
ids = [p.id for p in people]
cats = Cat.objects.filter(id__in=ids)
My actual setup is more complex than this and is dealing with a large dataset. Is there a way to do this with one query? If it is impossible, what is the efficient alternative?

I'm pretty sure this is your query:
cats = Cat.objects.filter(person__name='bob', person__role__name='owner')
See the Django documentation on lookups that span relationships.
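For reference, the EXISTS form the question asks about can be sketched directly in SQL. This is only a minimal sketch using Python's built-in sqlite3 module. The table and column names (cat, person, role, person.cat_id, person.role_id) are assumptions, since the real model is only shown in the linked image. It also shows why the join form can return the same cat more than once, which is why .distinct() is often added to a filter that spans a reverse foreign key.

```python
import sqlite3

# Hypothetical schema: each person points at one cat and one role.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cat (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT,
        cat_id INTEGER REFERENCES cat(id),
        role_id INTEGER REFERENCES role(id)
    );
    INSERT INTO cat VALUES (1, 'tom'), (2, 'felix');
    INSERT INTO role VALUES (1, 'owner'), (2, 'vet');
    -- Two owners named bob for cat 1, and a vet named bob for cat 2.
    INSERT INTO person VALUES
        (1, 'bob', 1, 1), (2, 'bob', 1, 1), (3, 'bob', 2, 2);
""")

# Correlated EXISTS: each cat appears at most once.
exists_rows = conn.execute("""
    SELECT cat.id FROM cat
    WHERE EXISTS (
        SELECT 1 FROM person
        JOIN role ON role.id = person.role_id
        WHERE person.cat_id = cat.id
          AND person.name = 'bob' AND role.name = 'owner'
    )
    ORDER BY cat.id
""").fetchall()

# Plain join: one row per matching person, so cat 1 appears twice.
join_rows = conn.execute("""
    SELECT cat.id FROM cat
    JOIN person ON person.cat_id = cat.id
    JOIN role ON role.id = person.role_id
    WHERE person.name = 'bob' AND role.name = 'owner'
    ORDER BY cat.id
""").fetchall()

print(exists_rows)
print(join_rows)
```

Note that the subquery has to be correlated with the outer cat row (person.cat_id = cat.id); without that condition every cat would match as soon as any bob who is an owner exists.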

Related

Select all foreign keys from the foreign key table in django

Let's imagine I have 2 models:
class Tree(models.Model):
    title = models.CharField(max_length=255)
class Apple(models.Model):
    tree = models.ForeignKey(Tree, related_name="apples")
How do I select all the Trees that have Apples?
In other words, starting from the Tree model, I want to select every Tree that is referenced by at least one Apple.
I think I want to execute this query:
SELECT DISTINCT tree.id, tree.title
FROM apple JOIN tree ON apple.tree = tree.id
Until now I have written two queries. Both work, but I don't think they are the best way to do it:
Tree.objects.filter(
    apples__tree__in=Apple.objects.all().values_list("tree")
).distinct()
Tree.objects.filter(apples__tree__isnull=False).distinct()
You can query the relation for NULL directly:
Tree.objects.filter(apples__isnull=False).distinct()
P.S. If you want the exact query, you can write it like this (but you'll get dictionaries, not Tree objects):
Apple.objects.order_by().values('tree__id', 'tree__title').distinct()
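You can also use Django aggregates. This example, written against a different pair of models, counts the users that have at least two related pages: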
from django.db.models import Count
User.objects.annotate(page_count=Count('page')).filter(page_count__gte=2).count()
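The answers above map roughly onto two SQL shapes: a join with DISTINCT, and a per-tree count with a HAVING clause. The following is a minimal sketch with Python's built-in sqlite3 module. The table names and the apple.tree_id column are assumptions about what the models would produce, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE apple (
        id INTEGER PRIMARY KEY,
        tree_id INTEGER REFERENCES tree(id)
    );
    INSERT INTO tree VALUES (1, 'barren'), (2, 'gala'), (3, 'fuji');
    -- Tree 2 has two apples, tree 3 has one, tree 1 has none.
    INSERT INTO apple VALUES (1, 2), (2, 2), (3, 3);
""")

# Join plus DISTINCT, roughly the shape of filter(apples__isnull=False).distinct().
distinct_rows = conn.execute("""
    SELECT DISTINCT tree.id, tree.title
    FROM tree JOIN apple ON apple.tree_id = tree.id
    ORDER BY tree.id
""").fetchall()

# Count per tree, roughly the shape of annotate(n=Count('apples')).filter(n__gte=1).
counted_rows = conn.execute("""
    SELECT tree.id, tree.title
    FROM tree LEFT JOIN apple ON apple.tree_id = tree.id
    GROUP BY tree.id, tree.title
    HAVING COUNT(apple.id) >= 1
    ORDER BY tree.id
""").fetchall()

print(distinct_rows)
print(counted_rows)
```

Both queries return the same trees here. The counting form is mainly useful when the threshold is something other than "at least one".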

Django queryset behind the scenes

Difference between creating a foreign key for consistency and for joins
I am comfortable using ForeignKey and the QuerySet API in Django.
I just want to understand a little more deeply how they work behind the scenes.
The Django manual says:
a database index is automatically created on the ForeignKey. You can
disable this by setting db_index to False. You may want to avoid the
overhead of an index if you are creating a foreign key for consistency
rather than joins, or if you will be creating an alternative index
like a partial or multiple column index.
The phrase "creating a foreign key for consistency rather than joins" is the part that confuses me.
I expected a query that goes through a foreign key to use the JOIN keyword, like below:
SELECT
*
FROM
vehicles
INNER JOIN users ON vehicles.car_owner = users.user_id
For example,
class Place(models.Model):
    name = models.CharField(max_length=50)
    address = models.CharField(max_length=50)
class Comment(models.Model):
    place = models.ForeignKey(Place)
    content = models.CharField(max_length=50)
If I use a queryset like Comment.objects.filter(place=1), I expected to see the JOIN keyword in the low-level SQL.
But when I checked by printing queryset.query in the console, it showed the following.
(I simplified the model above for the explanation. The output below shows all the attributes of my real model; you can ignore the extra ones.)
SELECT
"bfm_comment"."id", "bfm_comment"."content", "bfm_comment"."user_id", "bfm_comment"."place_id", "bfm_comment"."created_at"
FROM "bfm_comment" WHERE "bfm_comment"."place_id" = 1
creating a foreign key for consistency vs creating a foreign key for joins
Simply put, I thought that using any queryset meant using the foreign key for joins, because you can easily get the parent table's data with c = Comment.objects.get(id=1) followed by c.place.name. I thought it joined the two tables behind the scenes. But the result of print(queryset.query) didn't show the JOIN keyword; it finds the rows with WHERE instead.
What I understood from an answer:
Case 1:
Comment.objects.filter(place=1)
result
SELECT
"bfm_comment"."id", "bfm_comment"."content", "bfm_comment"."user_id", "bfm_comment"."place_id", "bfm_comment"."created_at"
FROM "bfm_comment"
WHERE "bfm_comment"."place_id" = 1
Case 2:
Comment.objects.filter(place__name="df")
result
SELECT "bfm_comment"."id", "bfm_comment"."content", "bfm_comment"."user_id", "bfm_comment"."place_id", "bfm_comment"."created_at"
FROM "bfm_comment" INNER JOIN "bfm_place" ON ("bfm_comment"."place_id" = "bfm_place"."id")
WHERE "bfm_place"."name" = df
Case 1 searches for rows whose place_id column is 1, using only the Comment table.
But in Case 2, it needs the Place table's 'name' attribute, so it has to use the JOIN keyword to check values in a column of the Place table. Right?
So is it right to think that I am creating a foreign key for joins if I use querysets like Case 2, and that it is then better to keep the index on the foreign key?
For the question above, I think I can take the answer from the Django manual:
Consider adding indexes to fields that you frequently query using
filter(), exclude(), order_by(), etc. as indexes may help to speed up
lookups. Note that determining the best indexes is a complex
database-dependent topic that will depend on your particular
application. The overhead of maintaining an index may outweigh any
gains in query speed
In conclusion, it really depends on how my application works with the data.
If you execute the following command, the mystery will be revealed:
./manage.py sqlmigrate myapp 0001
Take care to replace myapp with your app name (bfm I think) and 0001 with the actual migration where the Comment model is created.
The generated SQL will reveal that the table is actually created with a place_id int column rather than a place column of type Place. That is because the RDBMS doesn't know anything about models; models exist only at the application level. It's the job of the Django ORM to fetch the data from the RDBMS and convert it into model instances. That's why you always get a place member on each of your Comment instances, and that place member in turn gives you access to the members of the related Place instance.
So what happens when you do this?
Comment.objects.filter(place=1)
Django is smart enough to know that you are referring to place_id, because 1 is obviously not an instance of Place. If you passed a Place instance instead, the result would be the same. So there is no join here; only the Comment table is queried. This query would definitely benefit from an index on place_id, but it gains nothing from the foreign key constraint itself.
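The difference between the constraint and the index can be seen outside Django too. The following is a minimal sketch with Python's built-in sqlite3 module, using simplified table names (place, comment) and a made-up index name (comment_place_idx) rather than the ones Django would generate. It asks SQLite for its query plan before and after the index exists. The exact wording of the plan differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT);
    -- The REFERENCES clause is the consistency constraint; SQLite does not
    -- create an index on place_id because of it.
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        place_id INTEGER REFERENCES place(id),
        content TEXT
    );
""")


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM comment WHERE place_id = 1"

plan_without_index = plan(query)   # full scan of the comment table
conn.execute("CREATE INDEX comment_place_idx ON comment (place_id)")
plan_with_index = plan(query)      # lookup through comment_place_idx

print(plan_without_index)
print(plan_with_index)
```

The query never touches the place table in either case, which matches the SQL that Django printed for Comment.objects.filter(place=1).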
If you want a join, try this:
Comment.objects.filter(place__name='my home')
Queries of this nature, with the __ (double underscore), often result in joins, but sometimes they result in a subquery.
Querysets are lazy.
https://docs.djangoproject.com/en/1.10/topics/db/queries/#querysets-are-lazy
QuerySets are lazy – the act of creating a QuerySet doesn’t involve
any database activity. You can stack filters together all day long,
and Django won’t actually run the query until the QuerySet is
evaluated.
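The example from the documentation is not reproduced here. As a rough illustration of the same idea, here is a toy class in plain Python. LazyQuery is a hypothetical stand-in and is not how Django implements QuerySet. It only shows the pattern: chaining filters records conditions, and the data is read only when the results are iterated.

```python
class LazyQuery:
    """Toy stand-in for a lazy query object; not Django's implementation."""

    def __init__(self, rows, predicates=(), log=None):
        self._rows = rows
        self._predicates = tuple(predicates)
        # Shared list recording each time the "database" is actually read.
        self.log = log if log is not None else []

    def filter(self, predicate):
        # Chaining only records the condition and returns a new object.
        return LazyQuery(self._rows, self._predicates + (predicate,), self.log)

    def __iter__(self):
        # Evaluation happens here, when the results are first needed.
        self.log.append("executed")
        return (r for r in self._rows if all(p(r) for p in self._predicates))


rows = [{"headline": "What now", "year": 2016}, {"headline": "Hello", "year": 2016}]
q = LazyQuery(rows)
q = q.filter(lambda r: r["headline"].startswith("What"))
q = q.filter(lambda r: r["year"] <= 2016)
runs_before = len(q.log)   # nothing has been read yet
results = list(q)          # iterating triggers the single evaluation
runs_after = len(q.log)
print(runs_before, results, runs_after)
```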

Open JPA how do I get back results from foreign key relations

Good morning. I have been looking all over trying to answer this question.
If you have a table that has foreign keys to another table and you want results from both tables, in plain SQL you would do an inner join on the foreign key and get back everything you asked for. When you generate your JPA entities, the foreign key columns get a @OneToOne, @OneToMany, @ManyToMany, @ManyToOne, etc. annotation. I have @OneToMany over the foreign keys and a corresponding @ManyToOne over the primary key in the related table's column. I also have a @joinedON annotation over the correct column... I also have a basic named query that selects everything from the first table.
Will I need to do a join to get the information from both tables, as I would in plain SQL? Or will the annotations pull those records back for me? To be clear: if table A is related to table B by a foreign key and I want the records from both tables, I would join A to B on the foreign key with something like
Select * From A inner Join B on A.column2 = B.column1
or some such nonsense (pardon my SQL if it is not exactly correct, but you get the idea)...
That query would have selected all columns from A and B where those two selected columns...
Here is the named query that I am using:
@NamedQuery(name="getQuickLaunch", query = "SELECT q FROM QuickLaunch q")
This is how I am calling it in my stateless session bean:
try
{
    System.out.println("testing 1..2..3");
    listQL = emf.createNamedQuery("getQuickLaunch").getResultList();
    System.out.println("What is the size of this list: number "+listQL.size());
    qLaunchArr = listQL.toArray(new QuickLaunch[listQL.size()]);
}
Now that call returns all the columns of table A, but it lacks the columns of table B. My first instinct would be to change the query to join the two tables. But that makes me wonder what the point of using JPA is, if I am just writing the same queries I would be writing anyway, only in a different place. Plus, I don't want to overlook something simple. So what say you, Stack Overflow enthusiasts? How does one get back all the data of a joined query using JPA?
Suppose you have a Person entity with a OneToMany association to the Contact entity.
When you get a Person from the entityManager, calling any method on its collection of contacts will lazily load the list of contacts of that person:
person.getContacts().size();
// triggers a query select * from contact c where c.personId = ?
If you want to use a single query to load a person and all its contacts, you need a fetch join in the JPQL query:
select p from Person p
left join fetch p.contacts
where ...
You can also mark the association itself as eager-loaded, using @OneToMany(fetch = FetchType.EAGER), but then every time a person is loaded (via em.find() or any query), its contacts will also be loaded.

How to join non-relational models in Django 1.3 on 2 fields

I've got 2 existing models that I need to join that are non-relational (no foreign keys). These were written by other developers and cannot be modified by me.
Here's a quick description of them:
Model Process
Field filename
Field path
Field somethingelse
Field bar
Model Service
Field filename
Field path
Field servicename
Field foo
I need to join all instances of these two models on the filename and path columns. I've got existing filters I have to apply to each of them before this join occurs.
Example:
A = Process.objects.filter(somethingelse=231)
B = Service.objects.filter(foo='abc')
result = A.filter(filename=B.filename,path=B.path)
This sucks, but your best bet is to iterate over all instances of one model and issue queries to get the matching instances of the other.
The other alternative is to run a raw SQL query that performs the join and retrieves the IDs of each pair of model objects, and then retrieve each joined pair based on those IDs. This is more efficient at run time, but it will need to be maintained by hand if your schema evolves.
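The raw SQL alternative could look roughly like this. It is a minimal sketch with Python's built-in sqlite3 module. The table names (process, service), the id columns, and the sample rows are assumptions, since the real tables belong to the existing models. The two per-model filters from the question are applied in the WHERE clause and the join matches on both filename and path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE process (
        id INTEGER PRIMARY KEY, filename TEXT, path TEXT,
        somethingelse INTEGER, bar TEXT
    );
    CREATE TABLE service (
        id INTEGER PRIMARY KEY, filename TEXT, path TEXT,
        servicename TEXT, foo TEXT
    );
    INSERT INTO process VALUES
        (1, 'a.exe', '/bin', 231, 'x'),
        (2, 'b.exe', '/bin', 231, 'y'),
        (3, 'a.exe', '/opt', 999, 'z');
    INSERT INTO service VALUES
        (10, 'a.exe', '/bin', 'svc-a', 'abc'),
        (11, 'b.exe', '/bin', 'svc-b', 'other'),
        (12, 'a.exe', '/opt', 'svc-c', 'abc');
""")

# Apply each table's own filter, then match rows on both filename and path.
pairs = conn.execute("""
    SELECT process.id, service.id
    FROM process
    JOIN service
      ON service.filename = process.filename
     AND service.path = process.path
    WHERE process.somethingelse = ? AND service.foo = ?
    ORDER BY process.id
""", (231, "abc")).fetchall()

print(pairs)
```

In a Django project the same statement could be run through django.db.connection.cursor(), and the resulting ID pairs turned back into model instances with Process.objects.in_bulk(...) and Service.objects.in_bulk(...).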

Sort by number of matches on queries based on m2m field

I hope the title is not misleading.
Anyway, I have two models, both have m2m relationships with a third model.
class Model1: keywords = m2m(Keyword)
class Model2: keywords = m2m(Keyword)
Given the keywords for a Model2 instance like this:
keywords2 = model2_instance.keywords.all()
I need to retrieve the Model1 instances which have at least a keyword that is in keywords2, something like:
Model1.objects.filter(keywords__in=keywords2)
and sort them by the number of keywords that match (I don't think that is possible via the 'in' field lookup). The question is, how do I do this?
I'm thinking of just manually iterating through each of the Model1 instances, appending them to a dictionary of results for every match, but I need this to scale to, say, tens of thousands of records. Here is how I imagined it would look:
result = {}
keywords2_ids = model2.keywords.all().values_list('id', flat=True)
for model1 in Model1.objects.all():
    keywords_matched = model1.keywords.filter(id__in=keywords2_ids).count()
    # Group the Model1 instances by how many keywords matched.
    # (list.append() returns None, so its result must not be assigned back.)
    result.setdefault(str(keywords_matched), []).append(model1)
There must be a faster way to do this. Any ideas?
You can just switch to raw SQL. Write a custom manager for Model1 that returns the ids of Model1 objects sorted by keyword match count. The SQL is as simple as joining the two many-to-many tables (Django automatically creates a table to represent a many-to-many relationship) on keyword id, grouping by Model1 id, and applying the COUNT function. An ORDER BY clause on those counts then produces the sorted list of Model1 ids you need. In MySQL:
SELECT appname_model1_keywords.model1_id, count(*) as match_count FROM appname_model1_keywords
JOIN appname_model2_keywords
ON (appname_model1_keywords.keyword_id = appname_model2_keywords.keyword_id)
WHERE appname_model2_keywords.model2_id = model2_object_id
GROUP BY appname_model1_keywords.model1_id
ORDER BY match_count
Here model2_object_id is the id of model2_instance. This should be considerably faster and scale better than looping over the instances in Python.
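To try the query out without a Django project, here is a minimal sketch using Python's built-in sqlite3 module. The two join tables are simplified stand-ins for the ones Django generates (the automatic id column is left out) and the sample rows are made up. One deliberate change: the sketch orders by match_count DESC so that the best matches come first, and breaks ties by model1_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE appname_model1_keywords (model1_id INTEGER, keyword_id INTEGER);
    CREATE TABLE appname_model2_keywords (model2_id INTEGER, keyword_id INTEGER);
    -- Model1 #1 has keywords 1, 2, 3; #2 has keyword 3; #3 has keyword 9.
    INSERT INTO appname_model1_keywords VALUES (1,1),(1,2),(1,3),(2,3),(3,9);
    -- Model2 #7 has keywords 1, 2, 3.
    INSERT INTO appname_model2_keywords VALUES (7,1),(7,2),(7,3);
""")

model2_object_id = 7
ranked = conn.execute("""
    SELECT m1.model1_id, COUNT(*) AS match_count
    FROM appname_model1_keywords AS m1
    JOIN appname_model2_keywords AS m2 ON m1.keyword_id = m2.keyword_id
    WHERE m2.model2_id = ?
    GROUP BY m1.model1_id
    ORDER BY match_count DESC, m1.model1_id
""", (model2_object_id,)).fetchall()

print(ranked)
```

Model1 #3 shares no keyword with the Model2 instance, so it does not appear in the result at all.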