Django: How to create ordered siblings - django

I want to create a model that will order its children models in the appropriate way. For instance, a Book has many Chapters, but the Chapters have to be in a specific order.
I assume that I need to put an IntegerField on the Chapter model that specifies the order of the Chapters like the following question suggests: Ordered lists in django
My main issue is that whenever I want to insert a new Chapter in between two existing chapters or reorder them in any way, I have to update (almost) every Chapter in the Book. Is there a way (perhaps in the Django Admin, which I'm using) to avoid having to manually change every index on every Chapter whenever I change the order?
I'm not a big fan of creating a "Linked List" style model, as proposed in the above-linked question, as I am under the impression that's not good practice for database creation.
What is the "right" way to model this relationship?

The answer you alluded to is probably the most efficient way to handle this. It probably requires a raw SQL statement such as UPDATE Chapter SET order = order + 1 WHERE book_id = <id_for_book> AND order >= <insert_index_location>. For Django 1.1+: you could use F() to write this in a single line as follows. It runs as a single UPDATE query, although that query still touches O(n) rows, and it should be wrapped in a transaction together with the insert.
Book.objects.get(id=<id_of_book>).chapter_set.filter(order__gte=<place_to_insert>).update(order=F('order')+1)
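To make the "shift, then insert" idea concrete outside of Django, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column name position (used instead of order, which is a reserved word in SQL) and the helper insert_chapter are all assumptions made for illustration, not part of the original answer.

```python
import sqlite3

# Hypothetical schema for illustration; "position" stands in for the
# question's "order" column, since ORDER is a reserved word in SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chapter (id INTEGER PRIMARY KEY, book_id INTEGER, "
    "position INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO chapter (book_id, position, title) VALUES (?, ?, ?)",
    [(1, 1, "Intro"), (1, 2, "Middle"), (1, 3, "End")],
)


def insert_chapter(conn, book_id, position, title):
    """Insert a chapter at `position`, shifting later chapters by one."""
    with conn:  # commit both statements together, or roll back on error
        # One UPDATE statement moves every chapter at or after the slot.
        conn.execute(
            "UPDATE chapter SET position = position + 1 "
            "WHERE book_id = ? AND position >= ?",
            (book_id, position),
        )
        conn.execute(
            "INSERT INTO chapter (book_id, position, title) VALUES (?, ?, ?)",
            (book_id, position, title),
        )


insert_chapter(conn, 1, 2, "New chapter")
rows = conn.execute(
    "SELECT position, title FROM chapter WHERE book_id = 1 ORDER BY position"
).fetchall()
print(rows)
```

The application issues only two statements regardless of how many chapters the book has; the database still rewrites every shifted row.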

Use a float instead of an integer to avoid your problem of updating multiple items when you insert between two.
So if you want to insert an item between item 42 and item 43, you can give it an order value halfway between the two (42.5), and you won't have to update any other items.
Insert z between x and y...
z.order = (y.order - x.order) / 2 + x.order
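A small plain-Python sketch of the midpoint idea follows. The helper names order_between and splits_until_collision are made up for this example. The second helper illustrates a caveat: floats have finite precision, so repeatedly inserting into the same gap eventually produces a key equal to one of its neighbours, at which point the items would need to be renumbered.

```python
def order_between(before, after):
    """Return a sort key halfway between two neighbouring keys."""
    return before + (after - before) / 2


# Insert z between x (42) and y (43) without touching x or y.
chapters = [{"title": "x", "order": 42.0}, {"title": "y", "order": 43.0}]
chapters.append({"title": "z", "order": order_between(42.0, 43.0)})
chapters.sort(key=lambda c: c["order"])
titles = [c["title"] for c in chapters]
print(titles)


def splits_until_collision(low=42.0, high=43.0):
    """Count how often the same gap can be halved before keys collide."""
    count = 0
    while True:
        mid = order_between(low, high)
        if mid == low or mid == high:
            return count  # no distinct key fits in the gap any more
        high = mid
        count += 1


print(splits_until_collision())
```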

Related

Ordered ManyToMany relation in Django with custom Field

In Django, I would like to have an ordered many-to-many relation. Assume I have, say, the models OrderedList and Item. I want to be able to insert Item()s into an OrderedList() at a specific index, to retrieve the Item()s of an OrderedList() in their order, and to change the order of the Item()s on an OrderedList().
I already found Define an order for ManyToManyField with django and https://github.com/gregmuellegger/django-sortedm2m
Both the github repo and the accepted answer in the SO question are working with the same architecture: They create an additional integer field, say order_index, on the junction ("Through") table which represents the position of the Item() on the OrderedList().
Honestly, I do not like that too much. If I see this correctly, having the order stored on the junction table can create inefficiency when I want to reorder Item()s: Imagine, I want to change the position of an Item() on an OrderedList() which has n Item()s. This means O(n) database updates to reorganize the order indices.
I would like to avoid this. I think of an architecture where I have an ordinary many-to-many-relation and one additional column on the OrderedList table which holds a list of Item ids, say items_order. In this architecture, I need one database update and one list operation on items_order - which should be way faster, I guess.
I believe the best way to do this is to create a custom model Field. The docs explain how to create a custom model Field (https://docs.djangoproject.com/en/2.1/howto/custom-model-fields/), and I can create my items_order field that way. But I did not find out how to make a custom Field which, besides creating the items_order column, also creates the junction table and takes care of updating items_order whenever a related Item() is added to or removed from the relation. I think I should subclass ManyToManyField (https://docs.djangoproject.com/en/2.1/_modules/django/db/models/fields/related/#ManyToManyField). But I don't know how to do this, so could you give me some guidance here?
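The "one list operation plus one update" part of this idea can be sketched in plain Python. This is only an illustration: storing items_order as a JSON string and the helper move_item are assumptions, and the sketch does not cover the custom Field or junction-table handling asked about above.

```python
import json


def move_item(items_order, item_id, new_index):
    """Return a new order list with item_id moved to new_index."""
    if item_id not in items_order:
        raise ValueError(f"item {item_id} is not in the list")
    reordered = [i for i in items_order if i != item_id]
    reordered.insert(new_index, item_id)
    return reordered


# What the hypothetical items_order column would hold for one OrderedList.
stored = json.dumps([11, 12, 13, 14])

order = json.loads(stored)
order = move_item(order, 14, 1)  # the single list operation
stored = json.dumps(order)       # the single value to write back
print(order)
```

One trade-off to keep in mind is that the id list and the junction table are two copies of the membership information, so every add or remove has to update both.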

Django filtering with F and Q operations

I have a model class in my django project:
*user_id
*amount
*net_balance
*created_on
I have a list of user_ids (let's say 3). I need to get the last row for each user_id, then do some operation and create a new row for each user id. How can I do this efficiently? I can certainly do it with 6 transactions (if there are 3 items in the list of user ids).
If you want the most recent entry then
YourModel.objects.filter(user=user_id).latest('created_on')
If I understand your question correctly then you need to get all the user_ids (presumably you have a separate User model?) and then loop through them - for each user getting the most recent entry and then create the new row.
You need (at least) 1 select query for all the records you are interested in and 1 insert query for each record returned.
The select query can be generated with the ORM (aggregation), or you can use raw SQL if you feel comfortable with it. If you use PostgreSQL, you can use distinct on a field (which I recommend):
Model.objects.order_by('user_id', '-created_on').distinct('user_id')
or you can use aggregation to get each user's most recent timestamp (grouping on user_id alone, so that Max is computed per user):
Model.objects.filter(user_id__in=[1, 2, 3]).values('user_id').annotate(last_row=Max('created_on'))
The correct answer depends on your Django version and database, but Django has lots of good features for this kind of thing.
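As a database-level illustration of "one select, then one insert per user", here is a hedged sketch with Python's built-in sqlite3 module rather than the Django ORM. The table name entry, the sample data and the "+ 100" operation are invented for the example; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "amount INTEGER, net_balance INTEGER, created_on TEXT)"
)
conn.executemany(
    "INSERT INTO entry (user_id, amount, net_balance, created_on) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 10, 10, "2020-01-01"),
        (1, 5, 15, "2020-01-02"),
        (2, 7, 7, "2020-01-01"),
        (3, 1, 1, "2020-01-03"),
        (3, 2, 3, "2020-01-04"),
    ],
)

user_ids = [1, 2, 3]
placeholders = ", ".join("?" for _ in user_ids)

# One SELECT: join each user's rows against that user's newest timestamp.
latest = conn.execute(
    f"""
    SELECT e.user_id, e.net_balance
    FROM entry e
    JOIN (
        SELECT user_id, MAX(created_on) AS last_row
        FROM entry
        WHERE user_id IN ({placeholders})
        GROUP BY user_id
    ) m ON m.user_id = e.user_id AND m.last_row = e.created_on
    ORDER BY e.user_id
    """,
    user_ids,
).fetchall()

# One INSERT per user (batched with executemany); the "+ 100" deposit is a
# made-up operation standing in for the real business logic.
new_rows = [(uid, 100, balance + 100, "2020-02-01") for uid, balance in latest]
with conn:
    conn.executemany(
        "INSERT INTO entry (user_id, amount, net_balance, created_on) "
        "VALUES (?, ?, ?, ?)",
        new_rows,
    )
total = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
print(latest, total)
```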

postgresql django: how to store an array of instances of variable type?

Suppose that you're creating a blog and each blogpost consists of an array of interleaving text fragments and fragments of svg (for instance).
You store each of those fragments in a custom django field (e.g. HTMLField and SVGField).
What's the best way to organize this?
How to maintain the order of fragments? This solution looks ugly:
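class Post(models.Model):
    title = models.CharField(max_length=1000)
class Fragment(models.Model):
    index = models.IntegerField()
    html = HTMLField()  # custom field
    svg = SVGField()  # custom field
    post = models.ForeignKey(Post, on_delete=models.CASCADE)  # on_delete is required since Django 2.0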
As discussed, a separate model is a feasible way to record all the fragments. We use one IntegerField to record the fragment order, so that later on the whole Post can be reassembled.
Some caveats here:
Use order_by, latest or slice n dice operations to sort/find elements.
When insert/delete operations are needed, they break the overall sequence, so we need to increase/decrease the index of multiple elements to maintain the order. Use a queryset update with an F() expression to change multiple records at once, as described in another SO post here.
There are some imperfections in this approach, but it's the best solution I could come up with so far (I encountered a similar situation before). A linked list is a good idea in principle, but it's not database-friendly: to get all fragments in order we would need O(n) queries, following one link at a time, instead of a single ordered queryset.

What is the best way to use query with a list and keep the list order? [duplicate]

This question already has answers here:
Django: __in query lookup doesn't maintain the order in queryset
I've searched online and could only find one blog that seemed like a hackish attempt to keep the order of a query list. I was hoping to query using the ORM with a list of strings, but doing it that way does not keep the order of the list.
From what I understand, in_bulk only works if you have the ids of the items you want to query.
Can anybody recommend an ideal way of querying by a list of strings and making sure the objects are kept in their proper order?
So in a perfect world I would be able to query a set of objects by doing something like this...
Entry.objects.filter(id__in=['list', 'of', 'strings'])
However, the results do not keep that order, so 'strings' could come before 'list', and so on.
The only workaround I see (I may just be tired, or this may be perfectly acceptable, I'm not sure) is doing this:
myIterableCorrectOrderedList = []
for i in listOfStrings:
    obj = Object.objects.get(title=str(i))
    myIterableCorrectOrderedList.append(obj)
Thank you,
The problem with your solution is that it does a separate database query for each item.
This answer gives the right solution if you're using ids: use in_bulk to create a map between ids and items, and then reorder them as you wish.
If you're not using ids, you can just create the mapping yourself:
values = ['list', 'of', 'strings']
# one database query
entries = Entry.objects.filter(field__in=values)
# one trip through the list to create the mapping
entry_map = {entry.field: entry for entry in entries}
# one more trip through the list to build the ordered entries
ordered_entries = [entry_map[value] for value in values]
(You could save yourself a line by using index, as in this example, but since index is O(n) the performance will not be good for long lists.)
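The mapping above assumes that every value is found and that field is unique per entry. Here is a runnable sketch of the same reordering with stand-in objects instead of a real queryset; the namedtuple and the sample data are assumptions for illustration. It skips values that have no matching entry rather than raising KeyError.

```python
from collections import namedtuple

# Stand-in for model instances; in Django these would come from
# Entry.objects.filter(field__in=values), in arbitrary order.
Entry = namedtuple("Entry", ["id", "field"])
fetched = [Entry(3, "strings"), Entry(1, "list"), Entry(2, "of")]

values = ["list", "of", "strings", "missing"]

# One pass to build the lookup, one pass to emit entries in the wanted order.
entry_map = {entry.field: entry for entry in fetched}
ordered_entries = [entry_map[v] for v in values if v in entry_map]
print([entry.field for entry in ordered_entries])
```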
Remember that ultimately this is all done to a database; these operations get translated down to SQL somewhere.
Your Django query loosely translated into SQL would be something like:
SELECT * FROM entry_table e WHERE e.title IN ('list', 'of', 'strings');
So, in a way, your question is equivalent to asking how to ORDER BY the order something was specified in a WHERE clause. (Needless to say, I hope, this is a confusing request to write in SQL -- NOT the way it was designed to be used.)
You can do this in a couple of ways, as documented in some other answers on StackOverflow [1] [2]. However, as you can see, both rely on adding (temporary) information to the database in order to sort the selection.
Really, this should suggest the correct answer: the information you are sorting on should be in your database. Or, back in high-level Django-land, it should be in your models. Consider revising your models to save a timestamp or an ordering when the user adds favorites, if that's what you want to preserve.
Otherwise, you're stuck with one of two options: grabbing the unordered data from the db and then "fixing" it in Python, or constructing your own SQL query and implementing your own ugly hack from one of the solutions I linked (don't do this).
tl;dr The "right" answer is to keep the sort order in the database; the "quick fix" is to massage the unsorted data from the database to your liking in Python.
EDIT: Apparently MySQL has some weird feature that will let you do this, if that happens to be your backend.

Django ORM: Optimizing queries involving many-to-many relations

I have the following model structure:
class Container(models.Model):
    pass
class Generic(models.Model):
    name = models.CharField(max_length=255, unique=True)  # max_length is an illustrative value
    cont = models.ManyToManyField(Container, blank=True)
    # It is possible to have a Generic object not associated with any container,
    # that's why blank=True (null has no effect on a ManyToManyField)
class Specific1(Generic):
    ...
class Specific2(Generic):
    ...
...
class SpecificN(Generic):
    ...
Say, I need to retrieve all Specific-type models, that have a relationship with a particular Container.
The SQL for that is more or less trivial, but that is not the question. Unfortunately, I am not very experienced at working with ORMs (Django's ORM in particular), so I might be missing a pattern here.
When done in a brute-force manner, -
c = Container.objects.get(name='somename') # this gets me the container
items = c.generic_set.all()
# this gets me all Generic objects, that are related to the container
# Now what? I need to get to the actual Specific objects, so I need to somehow
# get the type of the underlying Specific object and get it
for item in items:
    spec = getattr(item, item.get_my_specific_type())
this results in a ton of db hits (one for each Generic record, that relates to a Container), so this is obviously not the way to do it. Now, it could, perhaps, be done by getting the SpecificX objects directly:
s = Specific1.objects.filter(cont__name='somename')
# This gets me all Specific1 objects for the specified container
...
# do it for every Specific type
that way the db will be hit once for each Specific type (acceptable, I guess).
I know, that .select_related() doesn't work with m2m relationships, so it is not of much help here.
To reiterate, the end result has to be a collection of SpecificX objects (not Generic).
I think you've already outlined the two easy possibilities. Either you do a single filter query against Generic and then cast each item to its Specific subtype (results in n+1 queries, where n is the number of items returned), or you make a separate query against each Specific table (results in k queries, where k is the number of Specific types).
It's actually worth benchmarking to see which of these is faster in reality. The second seems better because it's (probably) fewer queries, but each one of those queries has to perform a join with the m2m intermediate table. In the former case you only do one join query, and then many simple ones. Some database backends perform better with lots of small queries than fewer, more complex ones.
If the second is actually significantly faster for your use case, and you're willing to do some extra work to clean up your code, it should be possible to write a custom manager method for the Generic model that "pre-fetches" all the subtype data from the relevant Specific tables for a given queryset, using only one query per subtype table; similar to how this snippet optimizes generic foreign keys with a bulk prefetch. This would give you the same queries as your second option, with the DRYer syntax of your first option.
Not a complete answer, but you can avoid a great number of hits by doing this:
items = list(items)
for item in items:
    spec = getattr(item, item.get_my_specific_type())
instead of this:
for item in items:
    spec = getattr(item, item.get_my_specific_type())
Indeed, by forcing a cast to a Python list, you force the Django ORM to load all elements in your queryset, which it does in one query.
I accidentally stumbled upon the following post, which pretty much answers your question:
http://lazypython.blogspot.com/2008/11/timeline-view-in-django.html