SQL Index on Django Generic Relation - django

Is it possible/sensible to create an SQL index on a GenericForeignKey in a Django model?
I want to perform a lookup on a large number of (~1 million) objects in my PostgreSQL database. My lookup is based on a GenericForeignKey on the relevant model, which is actually stored as two fields: object_id (the pk of the object that is being linked to) and content_type (a FK to the Django ContentType model representing the type of object being linked to).
In SQL terms this is essentially:
WHERE ("my_model"."content_type_id" = x AND "my_model"."object_id" = y)
object_id is a non-unique field: since the generic FK can link to multiple models, it's possible that objects of different types will have the same pk.
I am wondering whether I can speed up my query times by creating a non-unique index on my_model.object_id. My knowledge of indexing is limited, so I may not have understood their use correctly, but I know that Django automatically creates indexes on normal ForeignKey relations so I assume there is an associated speedup.
Has anyone had any experience creating indexes for GenericForeignKeys? Did you find a resulting performance increase? Any help or insight is much appreciated.
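For anyone wanting to try this out, here is a minimal sketch of the idea. It uses an in-memory SQLite database in place of PostgreSQL, so it only illustrates the principle. The table and column names come from the query above; the index name my_model_ct_obj_idx is made up for the example. Because the lookup filters on both columns, the sketch builds one composite, non-unique index on (content_type_id, object_id) instead of an index on object_id alone, then asks the planner how it would run the query.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here; the table and column
# names follow the question, the index name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_model ("
    " id INTEGER PRIMARY KEY,"
    " content_type_id INTEGER NOT NULL,"
    " object_id INTEGER NOT NULL)"
)

# One composite, non-unique index covering both halves of the generic FK.
conn.execute(
    "CREATE INDEX my_model_ct_obj_idx"
    " ON my_model (content_type_id, object_id)"
)

# Ask the planner how it would run the lookup from the question.
plan = conn.execute(
    "EXPLAIN QUERY PLAN"
    " SELECT id FROM my_model"
    " WHERE content_type_id = ? AND object_id = ?",
    (7, 42),
).fetchall()

# The last column of each plan row is a human-readable description.
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)
```

In a Django model the same index would normally be declared in Meta.indexes, for example models.Index(fields=["content_type", "object_id"]). On PostgreSQL, running EXPLAIN on the real query is the way to check whether the index is actually used.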

Related

Django, is filtering by string faster than SQL relationships?

Is it a major flaw if I'm querying my user's information by their user_id (string) rather than creating a Profile model and linking them to other models using SQL relationships?
Example 1: (user_id is stored in django sessions.)
class Information(models.Model):
    user_id = models.CharField(...)
    ...
# also applies for .filter() operations.
information = Information.objects.get(user_id=request.getUser['user_id'])
note: I am storing the user's profile information on Auth0.
Example 2: (user_id is stored in Profile.)
class Profile(models.Model):
    user_id = models.CharField(...)
class Information(models.Model):
    profile = models.ForeignKey(Profile, ...)
    ...
information = Information.objects.get(profile=request.getProfile)
note: With this method Profile will only have one field, user_id.
In Django, will using a string instead of a query object affect performance when retrieving items?
Performance is not an issue here as noted by Dirk; as soon as a column is indexed, the performance difference between data types should be negligible when compared to other factors. Here's a related SO question for more perspective.
What you should take care of is to prevent the duplication of data whose integrity you then would have to take care of on your own instead of relying on well-tested integrity checks in the database.
Another aspect is that if you do have relations between your data, you absolutely should make sure that they are accurately represented in your models using Django's relationships. Otherwise there's really not much point in using Django's ORM at all. Good luck!

Django Postgres ArrayField vs One-to-Many relationship

For a model in my database I need to store around 300 values for a specific field. What would be the drawbacks, in terms of performance and simplicity in query, if I use Postgres-specific ArrayField instead of a separate table with One-to-Many relationship?
If you use an array field:
- Each row in your DB is going to be fairly large, so Postgres is likely to rely much more heavily on TOAST storage (http://www.postgresql.org/docs/9.5/static/storage-toast.html).
- Every time you fetch the row, unless you specifically defer (https://docs.djangoproject.com/en/1.9/ref/models/querysets/#defer) the field or otherwise exclude it from the query via only, values or similar, you are paying the cost of loading all those values. If that's what you need then so be it.
- Filtering based on values in that array, while possible, isn't going to be as nice, and the Django ORM doesn't make it as obvious as it does for M2M tables.
If you use M2M:
- You can filter more easily on those related values.
- The related rows are not loaded by default; you can use prefetch_related if you need them, and then get fancy if you want only a subset of those values loaded.
- Total storage in the DB is going to be slightly higher with M2M because of the keys and extra id fields.
- The cost of the joins in this case should be negligible because the join columns are indexed keys.
Personally I'd say go with the M2M tables, but I don't know your specific application. If you're going to be working with a massive amount of data it's likely worth grabbing a representative dataset and testing both methods with it.

Behavior of querysets with foreign keys in Django

When a model object is an aggregate of many other objects, whether via Foreign Key or Many To Many, does iterating over the queryset of that object result in individual queries to the related objects?
Let's say I have
class aggregateObj(models.Model):
    parentorg = models.ForeignKey(Parentorgs)
    contracts = models.ForeignKey(Contracts)
    plans = models.ForeignKey(Plans)
and execute
objs = aggregateObj.objects.all()
If I iterate over objs, does every comparison made on the parentorg, contracts or plans fields result in an individual query to that object?
Yes, by default each access to a related object that has not been loaded yet will create an individual query. To get around that, you can make use of the select_related (or prefetch_related, if the relationship is in the 'backwards' direction) QuerySet method to fetch the related objects up front. From the documentation for select_related:
Returns a QuerySet that will automatically “follow” foreign-key relationships, selecting that additional related-object data when it executes its query. This is a performance booster which results in (sometimes much) larger queries but means later use of foreign-key relationships won’t require database queries.
Yes. To prevent that, use select_related to fetch the related data via a JOIN at query time.
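To make the difference concrete without a Django project, here is a rough sketch in plain sqlite3 that mimics the two access patterns. It is not what Django executes verbatim: the table names aggregate_obj and parentorgs, the sample rows, and the run helper are all assumptions for illustration. The first loop issues one extra query per row, as lazy foreign-key access does; the second fetches the same data with a single JOIN, which is roughly what select_related generates.

```python
import sqlite3

# Hypothetical schema loosely modelled on the question's aggregateObj.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parentorgs (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE aggregate_obj (
        id INTEGER PRIMARY KEY,
        parentorg_id INTEGER NOT NULL REFERENCES parentorgs(id)
    );
    INSERT INTO parentorgs (id, name) VALUES (1, 'Org A'), (2, 'Org B');
    INSERT INTO aggregate_obj (id, parentorg_id) VALUES (1, 1), (2, 2), (3, 1);
""")

query_log = []


def run(sql, params=()):
    """Execute one statement and record it so the queries can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# Lazy pattern: one query for the rows, then one more per related object.
lazy_names = []
for obj_id, parentorg_id in run(
    "SELECT id, parentorg_id FROM aggregate_obj ORDER BY id"
):
    rows = run("SELECT name FROM parentorgs WHERE id = ?", (parentorg_id,))
    lazy_names.append(rows[0][0])
lazy_query_count = len(query_log)

# Joined pattern: a single query that brings the related data along.
query_log.clear()
joined_rows = run(
    "SELECT a.id, p.name FROM aggregate_obj AS a"
    " JOIN parentorgs AS p ON p.id = a.parentorg_id ORDER BY a.id"
)
joined_names = [name for _, name in joined_rows]
joined_query_count = len(query_log)

print(lazy_query_count, joined_query_count)  # prints: 4 1
```

With three rows the lazy loop needs 1 + 3 queries while the JOIN needs one; the gap grows linearly with the number of rows iterated.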

How do I express a Django ManyToMany relationship?

I'm hitting a wall here and I know this is a simple question, but I was unable to find it here.
In an ER diagram, what would the relationship be between two objects that have a ManyToMany relationship, in terms of the intermediary table?
Example:
item ---- item_facts ---- fact
I feel like it should be one to one but I'm not completely sure.
user --many2many-- group
user 1----n user_group n---1 group
The Django documentation states:
A many-to-many relationship. Requires a positional argument: the class to which the model is related. This works exactly the same as it does for ForeignKey, including all the options regarding recursive and lazy relationships.
Behind the scenes, Django creates an intermediary join table to represent the many-to-many relationship. By default, this table name is generated using the name of the many-to-many field and the model that contains it. Since some databases don't support table names above a certain length, these table names will be automatically truncated to 64 characters and a uniqueness hash will be used. This means you might see table names like author_books_9cdf4; this is perfectly normal. You can manually provide the name of the join table using the db_table option.
And ForeignKey definition is like:
A many-to-one relationship. Requires a positional argument: the class to which the model is related.
So, ManyToMany relations created by Django produce an intermediary table that has a 1-to-N relationship with each of the two models.
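As an illustration of the user 1----n user_group n---1 group layout, here is a small sketch in plain sqlite3. The table names demo_user, demo_group and demo_user_group and the sample rows are invented for the example; the join table has roughly the shape of the one Django generates (its own id plus one foreign key per side), but this is not Django's exact DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE demo_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Intermediary table: each row points at exactly one user and one
    -- group, so it sits on the "n" side of two one-to-many relations.
    CREATE TABLE demo_user_group (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES demo_user(id),
        group_id INTEGER NOT NULL REFERENCES demo_group(id),
        UNIQUE (user_id, group_id)
    );

    INSERT INTO demo_user (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO demo_group (id, name) VALUES (1, 'admins'), (2, 'editors');
    INSERT INTO demo_user_group (user_id, group_id)
        VALUES (1, 1), (1, 2), (2, 2);
""")

# One user, many join rows: all groups that alice belongs to.
groups_of_alice = [name for (name,) in conn.execute(
    "SELECT g.name FROM demo_group AS g"
    " JOIN demo_user_group AS ug ON ug.group_id = g.id"
    " WHERE ug.user_id = 1 ORDER BY g.name"
)]

# One group, many join rows: all users in the editors group.
users_in_editors = [name for (name,) in conn.execute(
    "SELECT u.name FROM demo_user AS u"
    " JOIN demo_user_group AS ug ON ug.user_id = u.id"
    " WHERE ug.group_id = 2 ORDER BY u.name"
)]

print(groups_of_alice)
print(users_in_editors)
```

Read through the join table, the two one-to-many relations combine into the many-to-many relation between users and groups.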
Not sure what the question is here. You say that the two objects have a many-to-many relationship.
If two objects (entities, tables) have a many-to-many relationship, whether or not you include the intermediate table in the diagram is irrelevant. They still have a many-to-many relationship.

unidirectional one-to-many and many-to-many in django

I'm new to django.
I have 2 simple objects, let's call them File and FileGroup:
- A FileGroup can hold a list of files, sorted according to an 'order' field.
- Each file can be associated with multiple groups.
so basically, the db tables would be:
1) File
2) File_Group
3) File_Group_Mapping table that has a column named "order" in addition to the fk to the file and file group.
There is a many-to-many relationship here, but the File object is not supposed to be aware of the existence of the FileGroup (doesn't make sense in my case)
My questions -
Is there a way to create a unidirectional many-to-many/one-to-many relationship here? How can I model it with django?
I couldn't find a way to make it unidirectional via django.
I saw a solution that uses something like -
class FileGroup(...):
    files = models.ManyToManyField(File, through='FileGroupMapping')
but this will make the File object aware of the FileGroup.
I can also do this via mapping the File_Group_Mapping table in the models file like this -
class FileGroupMapping(...):
    files = models.ForeignKey(File)
    groups = models.ForeignKey(FileGroup)
    order = models...
What is the best way to do this via django?
Thanks
I am also very much a Hibernate user, and I understand what you are looking for. Be aware that the attribute "symmetrical = False" only applies to a many-to-many relationship that points to 'self', so it will not make this relationship unidirectional. To stop Django from creating the reverse accessor on File, set related_name to '+' instead:
class FileGroup(models.Model):
    files = models.ManyToManyField(File, related_name='+')
This should do the trick!
Your two approaches are identical. Behind the scenes, Django creates a lookup table for a ManyToManyField. From the ORM perspective, you can put the ManyToManyField on either model, although it makes a difference in the admin, and if you wish to use the 'limit_choices_to' option. Using 'through' lets you add columns to the lookup table to further define the relationship between the two models, which is exactly what you've done by manually creating the lookup table.
Either way, you can still 'get' the FileGroup that a particular File belongs to, as Django querysets will follow a FK relationship bidirectionally.
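The point about the lookup table can be sketched in plain sqlite3, outside Django. The table names file, file_group and file_group_mapping follow the question; the sample rows are made up. The same mapping table answers both directions: the files of a group sorted by the extra "order" column, and the groups a given file belongs to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE file_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Mapping table with the extra "order" column from the question.
    -- ORDER is an SQL keyword, so the column name has to be quoted.
    CREATE TABLE file_group_mapping (
        id INTEGER PRIMARY KEY,
        file_id INTEGER NOT NULL REFERENCES file(id),
        group_id INTEGER NOT NULL REFERENCES file_group(id),
        "order" INTEGER NOT NULL
    );

    INSERT INTO file (id, name) VALUES (1, 'a.txt'), (2, 'b.txt'), (3, 'c.txt');
    INSERT INTO file_group (id, name) VALUES (1, 'docs'), (2, 'archive');
    INSERT INTO file_group_mapping (file_id, group_id, "order")
        VALUES (2, 1, 1), (1, 1, 2), (1, 2, 1), (3, 2, 2);
""")

# Forward direction: files of the 'docs' group, sorted by "order".
files_in_docs = [name for (name,) in conn.execute(
    'SELECT f.name FROM file AS f'
    ' JOIN file_group_mapping AS m ON m.file_id = f.id'
    ' WHERE m.group_id = 1 ORDER BY m."order"'
)]

# Reverse direction: groups that contain a.txt, using the same table.
groups_of_a = [name for (name,) in conn.execute(
    'SELECT g.name FROM file_group AS g'
    ' JOIN file_group_mapping AS m ON m.group_id = g.id'
    ' WHERE m.file_id = 1 ORDER BY g.name'
)]

print(files_in_docs)
print(groups_of_a)
```

At the database level the relationship is therefore always navigable both ways; what can be hidden is only the attribute Django adds on the File side.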