Ordering models using Django AutoField with custom step size - django

Consider a model Section that is displayed on a site and created / edited by a user using the Django admin interface. I would like to add a field that allows the user to easily control the order in which sections are displayed on the site. The easiest option seems to be to allow for an integer field that is auto-incremented but can be edited by the user -- akin to what the built-in AutoField does.
However, to make editing the order easier, I would like to increment the fields default value by 10 every time, to allow the user to shift sections around more easily. The first section would get order=1, the next one order=11 and so on, that way a section can be wedged in between those first two by giving it, e.g., order=6.
Is there a way I can reuse AutoField to achieve this purpose? And if no, how could I best achieve this type of ordering scheme?
Ideally, what I'd like to achieve should look like this:
from django.db import models
class Section(models.Model):
text = models.TextField()
order = AutoField(step=10)
class Meta:
ordering = ('order',)

Autofield won't work. Not editable, needs to be primary key.
I also suggest to solve this visually using drag and drop in the UI and then reorder all sections within the whole, rather then allow wedging. If two people at the same time wedge 25 within 20 and 30, you still have the same problem. Reordering on save is a much cleaner solution, especially when using select_for_update:
Returns a queryset that will lock rows until the end of the transaction, generating a SELECT ... FOR UPDATE SQL statement on supported databases.

Related

What are the alternatives to using django signals?

I am working on a project where I need to recalculate values based on if fields changed or not. Here is an example:
Model1:
field_a = DatetimeField()
calculated_field_1 = ForeignKey(Model2)
Model2:
field_j = DatetimeField()
If field_a changes on model1 I have to recalculate the value for field calculated_field_1 to see if it needs to change as well. The calculations that are done require me querying the database to check values of other models and then determining if the value of the calculated field needs to change.
Example) field_a changes then I would have to do this calculation
result = Model2.objects.filter(field_j__gte=Model1.field_a)
If result.exists():
Model1.field_a = result.first()
Model1.save(update_fields=(‘field_a’,))
This is the most basic example I could think of and the queries can be much more complicated than this.
The project started out with one calculation when a field changed so I decided the best approach was to use django signals. Months later the requirements have changed for the project and now there are several other calculations that I had to implement that are very similar to the example above. I have noticed that my post_save function is getting out of hand and I am just wondering what alternatives there are to using signals. Although the post_save calculations I do now take far less than half a second, for the sake of my question pretend they took a second or more.
A valid answer cannot include doing these calculations on the fly when I pull them from the db. We use a validation framework that requires me to set these values on the model and querying on the fly has been an approach we attempted but for performance reasons it was not viable. Also, on field change one of the requirements is that the user needs to see the results of the calculated field so this has to happen synchronously.
What are some alternative approaches to using this pattern?

Ordering Django querysets using a JSONField's properties

I have a model that kinda looks like this:
class Person(models.Model):
data = JSONField()
The data field has 2 properties, name, and age. Now, lets say I want to get a paginated queryset (each page containing 20 people), with a filter where age is greater than 25, and the queryset is to be ordered in descending order. In a usual setup, that is, a normalized database, I can write this query like so:
person_list_page_1 = Person.objects.filter(age > 25).order_by('-age')[:20]
Now, what is the equivalence of the above when filtering and ordering using keys stored in the JSONField? I have researched into this, and it seems it was meant to be a feature for 2.1, but I can't seem to find anything relevant.
Link to the ticket about it being implemented in the future
I also have another question. Lets say we filter and order using the JSONField. Will the ORM have to get all the objects, filter, and order them before sending the first 20 in such a case? That is, will performance be legitimately slower?
Obviously, I know a normalized database is far better for these things, but my hands are kinda tied.
You can use the postgresql sql syntax to extract subfields. Then they can be used just as any other field on the model in queryset filters.
from django.db.models.expressions import RawSQL
Person.objects.annotate(
age=RawSQL("(data->>'age')::int", [])
).filter(age__gte=25).order_by('-age')[:20]
See the postgresql docs for other operators and functions.
In some cases, you might have to add explicit typecasts (::int, for example)
https://www.postgresql.org/docs/current/static/functions-json.html
Performance will be slower than with a proper field, but it's not bad.

Do Django models really need a single unique key field

Some of my models are only unique in a combination of keys. I don't want to use an auto-numbering id as the identifier as subsets of the data will be exported to other systems (such as spreadsheets), modified and then used to update the master database.
Here's an example:
class Statement(models.Model):
supplier = models.ForeignKey(Supplier)
total = models.DecimalField("statement total", max_digits=10, decimal_places=2)
statement_date = models.DateField("statement date")
....
class Invoice(models.Model):
supplier = models.ForeignKey(Supplier)
amount = models.DecimalField("invoice total", max_digits=10, decimal_places=2)
invoice_date = models.DateField("date of invoice")
statement = models.ForeignKey(Statement, blank=True, null=True)
....
Invoice records are only unique for a combination of supplier, amount and invoice_date
I'm wondering if I should create a slug for Invoice based on supplier, amount and invoice_date so that it is easy to identify the correct record.
An example of the problem of having multiple related fields to identify the right record is django-csvimport which assumes there is only one related field and will not discriminate on two when building the foreign key links.
Yet the slug seems a clumsy option and needs some kind of management to rebuild the slugs after adding records in bulk.
I'm thinking this must be a common problem and maybe there's a best practice design pattern out there somewhere.
I am using PostgreSQL in case anyone has a database solution. Although I'd prefer to avoid that if possible, I can see that it might be the way to build my slug if that's the way to go, perhaps with trigger functions. That just feels a bit like hidden functionality though, and may cause a headache for setting up on a different server.
UPDATE - after reading initial replies
My application requires that data may be exported, modified remotely, and merged back into the master database after review and approval. Hidden autonumber keys don't easily survive that consistently. The relation invoices[2417] is part of statements[265] is not persistent if the statement table was emptied and reloaded from a CSV.
If I use the numeric autonumber pk then any process that is updating the database would need to refresh the related key numbers or by using the multiple WITH clause.
If I create a slug that is based on my 3 keys but easy to reproduce then I can use it as the key - albeit clumsily. I'm thinking of a slug along the lines:
u'%s %s %s' % (self.supplier,
self.statement_date.strftime("%Y-%m-%d"),
self.total)
This seems quite clumsy and not very DRY as I expect I may have to recreate the slug elsewhere duplicating the algorithm (maybe in an Excel formula, or an Access query)
I thought there must be a better way I'm missing but it looks like yuvi's reply means there should be, and there will be, but not yet :-(
What you're talking about it a multi-column primary key, otherwise known as "composite" or "compound" keys. Support in django for composite keys today is still in the works, you can read about it here:
Currently Django models only support a single column in this set,
denying many designs where the natural primary key of a table is
multiple columns [...] Current state is that the issue is
accepted/assigned and being worked on [...]
The link also mentions a partial implementation which is django-compositekeys. It's only partial and will cause you trouble with navigating between relationships:
support for composite keys is missing in ForeignKey and
RelatedManager. As a consequence, it isn't possible to navigate
relationships from models that have a composite primary key.
So currently it isn't entirely supported, but will be in the future. Regarding your own project, you can make of that what you will, though my own suggestion is to stick with the fully supported default of a hidden auto-incremented field that you don't even need to think about (and use unique_together to enforce the uniqness of the described fields instead of making them your primary keys).
I hope this helps!
No.
Model needs to have one field that is primary_key = True. By default this is the (hidden) autofield which stores object Id. But you can set primary_key to True at any other field.
I've done this in cases, Where i'm creating django project upon tables which were previously created manually or through some other frameworks/systems.
In reality - you can use whatever means you can think of, for joining objects together in queries. As long as query returns bunch of data that can be associated with models you have - it does not really matter which field you are using for joins. Just keep in mind, that the solution you use should be as effective as possible.
Alan

Django .order_by() with .distinct() using postgres

I have a Read model that is related to an Article model. What I would like to do is make a queryset where articles are unique and ordered by date_added. Since I'm using postgres, I'd prefer to use the .distinct() method and specify the article field. Like so:
articles = Read.objects.order_by('article', 'date_added').distinct('article')
However this doesn't give the desired effect and orders the queryset by the order they were created. I am aware of the note about .distinct() and .order_by() in Django's documentation, but I don't see that it applies here since the side effect it mentions is there will be duplicates and I'm not seeing that.
# To actually sort by date added I end up doing this
articles = sorted(articles, key=lambda x: x.date_added, reverse=True)
This executes the entire query before I actually need it and could potentially get very slow if there are lots of records. I've already optimized using select_related().
Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?
UPDATE
The output would ideally be a queryset of Read instances where their related article is unique in the queryset and only using the Django orm (i.e. sorting in python).
Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?
Possibily. It's hard to say without the full picture, but my assumption is that you are using Read to track which articles have and have not been read, and probably tying this to User instance to determine if a particular user has read an article or not. If that's the case, your approach is flawed. Instead, you should do something like:
class Article(models.Model):
...
read_by = models.ManyToManyField(User, related_name='read_articles')
Then, to get a particular user's read articles, you can just do:
user_instance.read_articles.order_by('date_added')
That takes the need to use distinct out of the equation, since there will not be any duplicates now.
UPDATE
To get all articles that are read by at least one user:
Article.objects.filter(read_by__isnull=False)
Or, if you want to set a threshold for popularity, you can use annotations:
from django.db.models import Count
Article.objects.annotate(read_count=Count('read_by')).filter(read_count__gte=10)
Which would give you only articles that have been read by at least 10 users.

Django: limit queryset to a condition in the latest instance of a related set

I have an hierarchy of models that consists of four levels, all for various good reasons but which to explain would be beyond the scope of this question, I assume.
So here it goes in pseudo python:
class Base(models.Model):
...
class Top(models.Model):
base = FK(Base)
class Middle(models.Model):
top = FK(Top)
created_at = DateTime(...)
flag = BooleanField(...)
class Bottom(models.Model):
middle = FK(Middle)
stored_at = DateTime(...)
title = CharField(...)
Given a title, how do I efficiently find all instances of Base for which that title is met only in the latest (stored_at) Bottom instance of the latest (created_at) Middle instance that has flag set to True?
I couldn't find a way using the ORM, and the way I've seen it, .latest() isn't useful to me on the model that I want to query. The same holds for any convenience methods on the Base model. As I'm no SQL expert, I'd like to make use of the ORM as well as avoid denormalization as much as possible.
Thanks!
So, apparently, without heavily dropping into (some very unwieldy) SQL and not finding any alternative solution, I saw myself forced to resort to denormalized fields on the Base model, just as many as were required for efficiently getting the wanted (filtered) querysets of said model.
These fields were then updated at creation/modificatin time of respective Bottom instances.