Model A has a ForeignKey to model B - I would like to fetch A instances and compare them to each other where the key to B is one of the comparison parameters.
Django defers fetching B related info, so if I want to optimize my code and fetch in advance the info I need I can do one of the following:
Use .select_related('B') - which will fetch all related B instances
Use .select_related('B__id') - which will fetch only the ids of all related B instances
AFAIK both require a join, where all I really needed was A.B_id which is a column in the database, as that is all I wanted to compare.
Am I missing something straightforward here? Can I fetch A.B_id directly?

Firstly, your assertion is wrong: select_related('B__id') doesn't do anything. The double-underscore in a select_related call is only for following subsequent joins: so if B had a ForeignKey to C, select_related('B__C') would follow the second JOIN as well.
Secondly, I'm confused by your optimisation requirement. As you say, you just want B_id: so no JOIN is required, and neither is any optimisation. If you just get your A objects in the normal way, you can refer to the b_id field on each of them directly:
a_objects = A.objects.all()
for obj in a_objects:
    print(obj.b_id)
Here only a single db call is made, with no JOINs.
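To see why no JOIN is involved, here is a minimal sketch in plain sqlite3 rather than Django. The table names a and b and their rows are assumptions made up for illustration; the only point is that the foreign key value is an ordinary column on A's own table.

```python
import sqlite3

# Hypothetical tables standing in for models A and B; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE a (id INTEGER PRIMARY KEY,
                    b_id INTEGER NOT NULL REFERENCES b(id));
    INSERT INTO b (id, name) VALUES (1, 'first'), (2, 'second');
    INSERT INTO a (id, b_id) VALUES (10, 1), (11, 1), (12, 2);
""")

# The foreign key value lives on table a, so a single-table SELECT is enough.
rows = conn.execute("SELECT id, b_id FROM a ORDER BY id").fetchall()

# Group the A rows by the B they point to, using only the b_id column.
by_b = {}
for a_id, b_id in rows:
    by_b.setdefault(b_id, []).append(a_id)

print(by_b)
```

In Django the same column is available as obj.b_id on each instance, or in bulk through A.objects.values_list('id', 'b_id').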


Adding to a Many to Many relationship given primary keys

Given primary keys to two different objects with a many to many relationship, what is the most effective way to add the relationship that results in the least amount of database hits?
I'm thinking something like the below, but it results in hitting the database twice.
ob = A.objects.get(pk=pk_a)
Is it possible to only hit the database once?
Yes. You can just create a model object of the through model, like:
A.b.through.objects.create(a_id=pk_a, b_id=pk_b)
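This assumes the model is A, b is the name of its ManyToManyField, and the related model is named B, so that the automatically created through model exposes the columns a_id and b_id.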
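At the SQL level, creating the through row amounts to one INSERT into the join table. The following is a rough sqlite3 sketch of that idea, not Django code: the table a_b, the run helper, and the sample keys are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY);
    -- Hypothetical join table, playing the role of A.b.through.
    CREATE TABLE a_b (
        id INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a(id),
        b_id INTEGER NOT NULL REFERENCES b(id),
        UNIQUE (a_id, b_id)
    );
    INSERT INTO a (id) VALUES (1);
    INSERT INTO b (id) VALUES (7);
""")

statements = []  # every statement issued while adding the relationship


def run(sql, params=()):
    """Execute one statement and record it, so the hits can be counted."""
    statements.append(sql)
    return conn.execute(sql, params)


pk_a, pk_b = 1, 7
# One INSERT into the join table; neither a nor b is SELECTed first.
run("INSERT INTO a_b (a_id, b_id) VALUES (?, ?)", (pk_a, pk_b))

links = conn.execute("SELECT a_id, b_id FROM a_b").fetchall()
print(len(statements), links)
```

Because neither object is loaded first, nothing in application code checks that the two primary keys exist; that is left to the database's foreign key constraints.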

Doctrine2 DQL join with unrelated tables to fetch both entities

My DQL query returns only the FROM object, which is nice if the other object were related, but it isn't.
My Query:
$query = $this->em->createQuery('SELECT c, s FROM MyBundle:Person c, MyBundle:Spot s
JOIN s.geo_data g JOIN g.features f WHERE = true AND
ST_Distance(f.location, c.location) < :distance GROUP BY c, s');
This works perfectly in SQL, giving me all the spots and all the persons within :distance of them. But in DQL, it only returns the person object, and since on the database level they are not related, I have no way to fetch the correct spot.
My database setup is correct, I'm using a PostGIS backend and spots and persons are not related in any way. They just happen to be on the same map and I'm querying for spatial relationships.
According to documentation, it's intended behaviour, from what I read, s is being hydrated, but not returned anywhere at all, good job!
How can I teach DQL to please return me what I told it in SELECT? Where's the "I mean what I say, stop being a smartass" switch?
Doctrine cannot give you both entities if they are not related; if they were related, you would get the first entity c and could reach s through the relation.
What you can try is selecting all fields of both entities like
SELECT c.location, ..., s.geo_data, ...
This will give you an array for each row, containing the selected fields from both entities.
Maybe you can use result set mapping to get the entities if desired.
If you want to stick with Doctrine, you have to define a OneToMany relation between places and people. That way, you could set up the PeopleRepository with a method like getPeopleByLocationAndMaxDistance(Location $location, $distance) built around a query such as:
FROM People AS p
LEFT JOIN Places AS pl
WHERE ST_Distance(p.location, pl.location) < :distance

Behavior of querysets with foreign keys in Django

When a model object is an aggregate of many other objects, whether via Foreign Key or Many To Many, does iterating over the queryset of that object result in individual queries to the related objects?
Let's say I have
class aggregateObj(models.Model):
    parentorg = models.ForeignKey(Parentorgs)
    contracts = models.ForeignKey(Contracts)
    plans = models.ForeignKey(Plans)
and execute
objs = aggregateObj.objects.all()
If I iterate over objs, does every comparison made within the parentorg, contracts or plans fields result in an individual query to that object?
Yes, by default each access to a related object that has not been loaded yet triggers an individual query. To get around that, you can use the select_related QuerySet method (or prefetch_related if the relationship is in the 'backwards' direction) to fetch all the related objects up front. The documentation describes select_related as follows:
Returns a QuerySet that will automatically “follow” foreign-key relationships, selecting that additional related-object data when it executes its query. This is a performance booster which results in (sometimes much) larger queries but means later use of foreign-key relationships won’t require database queries.
Yes. To prevent that, use select_related to fetch the related data via a JOIN at query time.
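The difference between the two access patterns can be sketched with plain sqlite3 and a query counter. This is an illustration of the general idea, not of the exact SQL Django emits; the tables, the query helper, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parentorg (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE aggregate (id INTEGER PRIMARY KEY,
                            parentorg_id INTEGER NOT NULL REFERENCES parentorg(id));
    INSERT INTO parentorg (id, name) VALUES (1, 'north'), (2, 'south');
    INSERT INTO aggregate (id, parentorg_id) VALUES (1, 1), (2, 2), (3, 1);
""")

hits = []  # one entry per SELECT sent to the database


def query(sql, params=()):
    """Run a SELECT, record it as one database hit, and return all rows."""
    hits.append(sql)
    return conn.execute(sql, params).fetchall()


# Lazy pattern: one query for the rows, then one more per row for its parent.
lazy_names = []
for agg_id, parent_id in query("SELECT id, parentorg_id FROM aggregate ORDER BY id"):
    (name,) = query("SELECT name FROM parentorg WHERE id = ?", (parent_id,))[0]
    lazy_names.append(name)
lazy_queries = len(hits)

# Eager pattern: a single JOIN brings the parent data along.
del hits[:]
eager_names = [name for _, name in query(
    "SELECT a.id, p.name FROM aggregate a "
    "JOIN parentorg p ON p.id = a.parentorg_id ORDER BY a.id")]
eager_queries = len(hits)

print(lazy_queries, eager_queries)
```

With three rows the lazy pattern costs four queries and the JOIN costs one; the gap grows linearly with the number of rows and with the number of foreign keys that are followed.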

Circular dependency in Django ForeignKey?

I have two models in Django:
class A(models.Model):
    b = ForeignKey("B")
class B(models.Model):
    a = ForeignKey(A)
I want these ForeignKeys to be non-NULL.
However, I cannot create the objects because they don't have a PrimaryKey until I save(). But I cannot save without having the other objects PrimaryKey.
How can I create an A and B object that refer to each other?
I don't want to permit NULL if possible.
If this is really a bootstrapping problem and not something that will reoccur during normal usage, you could just create a fixture that will prepopulate your database with some initial data. The fixture-handling code includes workarounds at the database layer to resolve the forward-reference issue.
If it's not a bootstrapping problem, and you're going to want to regularly create these circular relations among new objects, you should probably reconsider your schema; one of the foreign keys is probably unnecessary.
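One database-level mechanism that allows this kind of forward reference is a deferred foreign key constraint, which is only checked when the transaction commits. The sketch below demonstrates it with plain sqlite3. It is not Django's fixture loader, and whether a given Django backend relies on deferred constraints is an assumption you would need to verify for your database.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY,
        b_id INTEGER NOT NULL REFERENCES b(id) DEFERRABLE INITIALLY DEFERRED);
    CREATE TABLE b (id INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a(id) DEFERRABLE INITIALLY DEFERRED);
""")

# Both rows go in within one transaction; the foreign keys are only checked
# at COMMIT, by which time each row's partner exists.
conn.execute("BEGIN")
conn.execute("INSERT INTO a (id, b_id) VALUES (1, 1)")
conn.execute("INSERT INTO b (id, a_id) VALUES (1, 1)")
conn.execute("COMMIT")

pair = conn.execute(
    "SELECT a.id, a.b_id, b.id, b.a_id FROM a JOIN b ON b.id = a.b_id"
).fetchone()

# A dangling reference is still rejected once the transaction tries to commit.
conn.execute("BEGIN")
conn.execute("INSERT INTO a (id, b_id) VALUES (2, 99)")
try:
    conn.execute("COMMIT")
    dangling_rejected = False
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")
    dangling_rejected = True

print(pair, dangling_rejected)
```

Both columns stay NOT NULL here, but the primary keys must be known before the inserts, which is easy in a fixture and awkward with auto-generated ids.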
It sounds like you're talking about a one-to-one relationship, in which case it is unnecessary to store the foreign key on both tables. In fact, Django provides nice helpers in the ORM to reference the corresponding object.
Using Django's OneToOneField:
class A(models.Model):
    pass
class B(models.Model):
    # on_delete is required since Django 2.0
    a = models.OneToOneField(A, on_delete=models.CASCADE)
Then you can simply reference them like so:
a = A()
b = B(a=a)
print(a.b)
print(b.a)
In addition, you may look into django-annoying's AutoOneToOneField, which will auto-create the associated object on save if it doesn't exist on the instance.
If your problem is not a one-to-one relationship, you should clarify because there is almost certainly a better way to model the data than mutual foreign keys. Otherwise, there is not a way to avoid setting a required field on save.
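To illustrate why one stored key is enough for a one-to-one relationship, here is a small sqlite3 sketch. The tables a and b are hypothetical stand-ins for the two models: only b carries a key, a UNIQUE constraint keeps the relation one-to-one, and both directions are resolved from that single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    -- Only b stores a key; UNIQUE makes the relation one-to-one.
    CREATE TABLE b (id INTEGER PRIMARY KEY,
                    a_id INTEGER NOT NULL UNIQUE REFERENCES a(id));
""")

# The save order is unambiguous: a first, then b pointing at it.
a_pk = conn.execute("INSERT INTO a DEFAULT VALUES").lastrowid
b_pk = conn.execute("INSERT INTO b (a_id) VALUES (?)", (a_pk,)).lastrowid

# Forward direction (b.a): read the stored key.
(a_of_b,) = conn.execute("SELECT a_id FROM b WHERE id = ?", (b_pk,)).fetchone()
# Reverse direction (a.b): look the same column up the other way round.
(b_of_a,) = conn.execute("SELECT id FROM b WHERE a_id = ?", (a_pk,)).fetchone()

# A second b for the same a violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO b (a_id) VALUES (?)", (a_pk,))
    second_b_rejected = False
except sqlite3.IntegrityError:
    second_b_rejected = True

print(a_of_b == a_pk, b_of_a == b_pk, second_b_rejected)
```

Neither column has to allow NULL, and no row ever needs a key that does not exist yet.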

Django ORM: Optimizing queries involving many-to-many relations

I have the following model structure:
class Container(models.Model):
    name = models.CharField(max_length=100, unique=True)  # implied by the name lookups below
class Generic(models.Model):
    name = models.CharField(max_length=100, unique=True)  # max_length is illustrative
    cont = models.ManyToManyField(Container, null=True)
    # It is possible to have a Generic object not associated with any container,
    # that's why null=True
class Specific1(Generic):
    pass
class Specific2(Generic):
    pass
class SpecificN(Generic):
    pass
Say, I need to retrieve all Specific-type models, that have a relationship with a particular Container.
The SQL for that is more or less trivial, but that is not the question. Unfortunately, I am not very experienced at working with ORMs (Django's ORM in particular), so I might be missing a pattern here.
When done in a brute-force manner, -
c = Container.objects.get(name='somename') # this gets me the container
items = c.generic_set.all()
# this gets me all Generic objects, that are related to the container
# Now what? I need to get to the actual Specific objects, so I need to somehow
# get the type of the underlying Specific object and get it
for item in items:
    spec = getattr(item, item.get_my_specific_type())
this results in a ton of db hits (one for each Generic record, that relates to a Container), so this is obviously not the way to do it. Now, it could, perhaps, be done by getting the SpecificX objects directly:
s = Specific1.objects.filter(cont__name='somename')
# This gets me all Specific1 objects for the specified container
# do it for every Specific type
that way the db will be hit once for each Specific type (acceptable, I guess).
I know, that .select_related() doesn't work with m2m relationships, so it is not of much help here.
To reiterate, the end result has to be a collection of SpecificX objects (not Generic).
I think you've already outlined the two easy possibilities. Either you do a single filter query against Generic and then cast each item to its Specific subtype (results in n+1 queries, where n is the number of items returned), or you make a separate query against each Specific table (results in k queries, where k is the number of Specific types).
It's actually worth benchmarking to see which of these is faster in reality. The second seems better because it's (probably) fewer queries, but each one of those queries has to perform a join with the m2m intermediate table. In the former case you only do one join query, and then many simple ones. Some database backends perform better with lots of small queries than fewer, more complex ones.
If the second is actually significantly faster for your use case, and you're willing to do some extra work to clean up your code, it should be possible to write a custom manager method for the Generic model that "pre-fetches" all the subtype data from the relevant Specific tables for a given queryset, using only one query per subtype table; similar to how this snippet optimizes generic foreign keys with a bulk prefetch. This would give you the same queries as your second option, with the DRYer syntax of your first option.
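The bulk pre-fetch idea can be sketched as follows in plain sqlite3, outside of any Django manager. Everything here is an assumption for illustration: the table layout mimics multi-table inheritance with a generic_ptr_id column, SUBTYPE_TABLES is a hypothetical fixed list, and with_specifics is a made-up helper. Note that this variant issues one query for the Generic rows plus one per subtype table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE container (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE generic (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE generic_cont (generic_id INTEGER, container_id INTEGER);
    -- One table per subtype, keyed by the parent row.
    CREATE TABLE specific1 (generic_ptr_id INTEGER PRIMARY KEY, extra TEXT);
    CREATE TABLE specific2 (generic_ptr_id INTEGER PRIMARY KEY, extra TEXT);
    INSERT INTO container VALUES (1, 'somename');
    INSERT INTO generic VALUES (1, 'g1'), (2, 'g2'), (3, 'g3');
    INSERT INTO generic_cont VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO specific1 VALUES (1, 's1-a'), (3, 's1-b');
    INSERT INTO specific2 VALUES (2, 's2-a');
""")

SUBTYPE_TABLES = ["specific1", "specific2"]  # hypothetical, fixed list


def with_specifics(container_name):
    """Return (id, name, subtype, extra) tuples using 1 + k queries."""
    generics = conn.execute(
        "SELECT g.id, g.name FROM generic g "
        "JOIN generic_cont gc ON gc.generic_id = g.id "
        "JOIN container c ON c.id = gc.container_id "
        "WHERE c.name = ? ORDER BY g.id", (container_name,)).fetchall()
    ids = [gid for gid, _ in generics]
    if not ids:
        return []
    marks = ", ".join("?" for _ in ids)
    subtype = {}
    for table in SUBTYPE_TABLES:
        # Table names come from the fixed list above, never from user input.
        sql = "SELECT generic_ptr_id, extra FROM {} WHERE generic_ptr_id IN ({})".format(
            table, marks)
        for gid, extra in conn.execute(sql, ids):
            subtype[gid] = (table, extra)
    # Attach the subtype data to each generic row in memory.
    return [(gid, name) + subtype.get(gid, (None, None)) for gid, name in generics]


result = with_specifics("somename")
print(result)
```

The per-subtype queries filter on primary keys only, so the join with the m2m table happens once rather than once per subtype.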
Not a complete answer, but you can avoid a great number of hits by doing this:
items = list(items)
for item in items:
    spec = getattr(item, item.get_my_specific_type())
instead of this:
for item in items:
    spec = getattr(item, item.get_my_specific_type())
Indeed, by forcing a cast to a Python list, you force the Django ORM to load all elements in your queryset. It then does this in one query.
