"SELECT DISTINCT field_name from table" Django using raw sql - django

How can I run the SQL query SELECT DISTINCT field_name FROM table; in Django as raw SQL?
When I try Table.objects.raw("""SELECT DISTINCT field_name from table"""), I get an exception:
InvalidQuery: Raw query must include the primary key

If you don't need the model instances (which are useless if you want a single field), you can as well just use a plain db-api cursor:
from django.db import connection
cursor = connection.cursor()
cursor.execute("select distinct field from table")
for row in cursor:
    print(row[0])
But for your example use case you don't need SQL at all: the ORM has values() and values_list() querysets and a distinct() modifier too:
queryset = YourModel.objects.values_list("field", flat=True).order_by("field").distinct()
print(str(queryset.query))
# > 'SELECT DISTINCT `table`.`field` FROM `table` ORDER BY `table`.`field` ASC'
for value in queryset:
    print(value)
NB:
1/ Since we want a single field, I use the flat=True argument to get a flat list of values instead of a list of one-element tuples.
2/ I explicitly set the ordering on the field; otherwise any default ordering defined in the model's Meta could force the ordering field into the generated query too.

It looks like you have to use a workaround that also returns a primary key, aliased so that raw() can map it:
SELECT field_name, MAX(id) AS id
FROM table_name
GROUP BY field_name;
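
To see what the GROUP BY workaround returns, here is a minimal sketch using the standard-library sqlite3 module rather than Django. The table contents are made up for illustration, and MAX(id) is aliased to id on the assumption that raw() matches result columns to model fields by name. In Django the same SQL string would be passed to Table.objects.raw().

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY, field_name TEXT)")
conn.executemany(
    "INSERT INTO table_name (id, field_name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "a"), (4, "c")],
)


def distinct_with_pk(connection):
    """Return one (field_name, id) row per distinct field_name."""
    return connection.execute(
        "SELECT field_name, MAX(id) AS id "
        "FROM table_name GROUP BY field_name ORDER BY field_name"
    ).fetchall()


rows = distinct_with_pk(conn)
```

Each distinct value appears once, paired with the highest id that carries it, so every row still has a primary key.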

Related

Inner join on Django with where clause?

I'm trying to filter a ModelForm so that it only displays dropdown values tied to a specific user.
I've got three tables tied together:
User, Project, ProjectUser.
One user can have many projects, and one project can have many users. The table ProjectUser is just a join table between User and Project, e.g.
id | project_id | user_id
 1 |          5 |       1
 2 |          6 |       2
 3 |          6 |       1
How would I write the following inner join query in Django ORM?
SELECT name
FROM projectuser
INNER JOIN project
ON projectuser.project_id = project.id
WHERE user_id = <request.user here>
When the Django ORM applies a filter to a field specified as a foreign key, it understands that it is a related table and joins it.
Project.objects.filter(projectuser__user=user)
You can join multiple tables, or even filter across the reverse side of a foreign key, as here. If the ForeignKey defines a related_name, use that name in the lookup instead.
Your original SQL:
SELECT name
FROM projectuser
INNER JOIN project
ON projectuser.project_id = project.id
WHERE user_id = <request.user here>
As I read your SQL, you want the list of names from projectuser for a specific user. If so, here is the answer:
ProjectUser.objects.filter(user_id=user).values_list('name', flat=True)
I see you accepted the answer with Project.objects.filter(projectuser__user=user).
For that answer, your SQL should look like this:
SELECT name
FROM project
INNER JOIN projectuser
ON projectuser.project_id = project.id
WHERE projectuser.user_id = <request.user here>
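
The join in the last SQL statement can be tried outside Django. The following is a small sketch with the standard-library sqlite3 module, using the ProjectUser rows from the question. The project names 'Alpha' and 'Beta' are invented for illustration, and the user table is left out because only user_id is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projectuser (
        id INTEGER PRIMARY KEY,
        project_id INTEGER REFERENCES project(id),
        user_id INTEGER
    );
    -- Project names are hypothetical; the join rows come from the question.
    INSERT INTO project VALUES (5, 'Alpha'), (6, 'Beta');
    INSERT INTO projectuser VALUES (1, 5, 1), (2, 6, 2), (3, 6, 1);
""")


def project_names_for(user_id):
    """Names of all projects linked to user_id through projectuser."""
    rows = conn.execute(
        "SELECT project.name FROM project "
        "INNER JOIN projectuser ON projectuser.project_id = project.id "
        "WHERE projectuser.user_id = ? ORDER BY project.name",
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

With this data, project_names_for(1) yields both projects and project_names_for(2) yields only the second one, which is the result Project.objects.filter(projectuser__user=user) is meant to produce.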

How to integrate postgresql 10/11 declarative table partitioning (i.e. PARTITION BY clause) in a Django model?

PostgreSQL 10 introduces declarative table partitioning with the PARTITION BY clause, and I would like to use it in a Django model.
In principle, all I would need to do is add the PARTITION BY clause at the end of the CREATE TABLE statement that the Django ORM creates.
CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
Is it possible to insert this clause into the model? I thought that maybe there is a way to append custom SQL to the query that the ORM generates, e.g. using the Meta:
class Measurement(models.Model):
    ...
    class Meta:
        append = "PARTITION BY RANGE (logdate)"
As far as I can tell, the above is not possible. I have also looked into the architect library, but it does not use the new PARTITION BY clause. Instead, it uses inheritance and triggers, so its code does not suggest any way in which I could append the clause (nor does it for other databases, e.g. MySQL).
I have also thought of customizing the migrations by adding an ALTER TABLE ... operation, e.g.:
operations = [
    migrations.RunSQL(
        "ALTER TABLE measurement PARTITION BY RANGE (logdate)",
    ),
]
Unfortunately, the above (or similar) doesn't seem to be supported by the PostgreSQL ALTER TABLE statement (at least not yet).
A final idea would be to retrieve the CREATE TABLE statement that the Django model generates before sending the query, e.g. sql = Measurement.get_statement() where Measurement is the model. Then I could append the PARTITION BY clause and send the query directly. I couldn't find any method that returns the statement. I went through the Django create_model code, and the SQL is generated and sent directly to the database, so it would not be easy to extract the statement from there.
Does anybody have a clue how this could be achieved in a way that still lets me use the benefits of the Django ORM?
An approach I suggest trying is to use a SQL capturing schema editor to collect the SQL necessary to perform the create_model. That's what powers the sqlmigrate feature by the way.
from django.db.migrations import CreateModel

class CreatePartitionedModel(CreateModel):
    def __init__(self, name, fields, partition_sql, **kwargs):
        self.partition_sql = partition_sql
        super().__init__(name, fields, **kwargs)

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        # Run the normal CreateModel against a schema editor that only
        # collects SQL instead of executing it.
        collector = type(schema_editor)(
            schema_editor.connection, collect_sql=True, atomic=False
        )
        with collector:
            super().database_forwards(
                app_label, collector, from_state, to_state
            )
        collected_sql = collector.collected_sql
        schema_editor.deferred_sql.extend(
            collector.deferred_sql
        )

        model = to_state.apps.get_model(app_label, self.name)
        create_table = 'CREATE TABLE %s' % schema_editor.quote_name(
            model._meta.db_table
        )
        # Replay the collected statements, appending the partition clause
        # to the CREATE TABLE statement for this model.
        for sql in collected_sql:
            if str(sql).startswith(create_table):
                sql = '%s PARTITION BY %s' % (sql.rstrip(';'), self.partition_sql)
            schema_editor.execute(sql)
From that point you should simply have to replace your makemigrations auto-generated CreateModel with a CreatePartitionedModel operation and make sure to specify partition_sql='RANGE (logdate)'.
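
The string rewrite at the heart of that operation can be tried on its own, without Django. Below is a minimal sketch: the helper name add_partition_clause and the two sample statements are hypothetical and only stand in for what the collecting schema editor would return.

```python
def add_partition_clause(statements, quoted_table, partition_sql):
    """Append PARTITION BY to the CREATE TABLE statement for quoted_table.

    All other statements are passed through unchanged.
    """
    prefix = "CREATE TABLE %s" % quoted_table
    result = []
    for sql in statements:
        sql = str(sql)
        if sql.startswith(prefix):
            # Drop the trailing semicolon before appending the clause.
            sql = "%s PARTITION BY %s" % (sql.rstrip(";"), partition_sql)
        result.append(sql)
    return result


# Hypothetical collected SQL, shaped like typical schema editor output.
collected = [
    'CREATE TABLE "measurement" ("id" serial NOT NULL, "logdate" date NOT NULL);',
    'CREATE INDEX "measurement_logdate_idx" ON "measurement" ("logdate");',
]
rewritten = add_partition_clause(collected, '"measurement"', "RANGE (logdate)")
```

Only the CREATE TABLE statement gains the clause; the index statement is left as it was.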

change raw query into django models

I want to use Django models to execute this SQL query:
SELECT COUNT(DISTINCT ques_id), title FROM contest_assignment WHERE grp_id = 60 GROUP BY title;
I tried this, but it did not give me the proper result:
from django.db.models import Count
assignment.objects.values_list('title').annotate(count=Count('ques')).values('title', 'count')
How can I do this with the Django ORM?
You shouldn't use both .values() and .values_list(). The count field is implicitly added to the set of fields that is returned.
Django supports a COUNT(DISTINCT <field>) query by using .annotate(count=Count('<field>', distinct=True)). That would leave you with the following query:
(assignment.objects.filter(grp_id=60).values('title')
    .annotate(count=Count('ques_id', distinct=True)))
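
To see why distinct=True matters, here is a small sketch with the standard-library sqlite3 module that runs the target SQL with and without DISTINCT. The rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contest_assignment ("
    "id INTEGER PRIMARY KEY, grp_id INTEGER, ques_id INTEGER, title TEXT)"
)
# Hypothetical data: question 1 is assigned twice under the title 'quiz'.
conn.executemany(
    "INSERT INTO contest_assignment (grp_id, ques_id, title) VALUES (?, ?, ?)",
    [(60, 1, "quiz"), (60, 1, "quiz"), (60, 2, "quiz"), (60, 3, "exam"), (61, 4, "quiz")],
)


def counts(distinct):
    """Per-title question counts for grp_id 60, optionally de-duplicated."""
    expr = "COUNT(DISTINCT ques_id)" if distinct else "COUNT(ques_id)"
    return conn.execute(
        "SELECT title, %s FROM contest_assignment "
        "WHERE grp_id = 60 GROUP BY title ORDER BY title" % expr
    ).fetchall()
```

Comparing counts(True) with counts(False) shows that the duplicate assignment of question 1 is counted once instead of twice.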

How to filter a query set with the results of another query set in Django?

I have a Django queryset that I want to filter by the result of another queryset. Currently I am doing this as follows (and it works):
field = 'content_object__pk'
values = other_queryset.values_list(field, flat=True)
objects = queryset.filter(pk__in=values)
where field is the name of a foreign key to the pk in queryset. The ORM is smart enough to run the above in one query.
I was trying to simplify this to the following (i.e. to filter with the objects themselves rather than having to explicitly say pk):
field = 'content_object'
objects = queryset & other_queryset.values_list(field, flat=True)
but this gives the following error:
AssertionError: Cannot combine queries on two different base models.
What is the right way to do this type of filtering?
You can do the following:
result = MyModel.objects.filter(field__in=other_query)
The result will be the objects in the model whose field is a foreign key to the model in other_query.
You can also chain filters in Django:
qs = entry.objects.filter(...).filter(...)
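
When an __in lookup is given a queryset, Django compiles it to a nested subquery, which is why the filter runs as one query. The sketch below shows that SQL shape with the standard-library sqlite3 module. The item and tag tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (
        id INTEGER PRIMARY KEY,
        content_object_id INTEGER REFERENCES item(id),
        label TEXT
    );
    INSERT INTO item VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO tag VALUES (1, 1, 'keep'), (2, 3, 'keep'), (3, 2, 'drop');
""")


def items_tagged(label):
    """Names of items whose id appears in the tag subquery for label."""
    rows = conn.execute(
        "SELECT name FROM item WHERE id IN "
        "(SELECT content_object_id FROM tag WHERE label = ?) "
        "ORDER BY id",
        (label,),
    ).fetchall()
    return [name for (name,) in rows]
```

The inner SELECT plays the role of other_queryset.values_list(field, flat=True), and the outer WHERE ... IN plays the role of pk__in.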

Ordering in Django Admin based on a custom callable in list display

I have a custom callable in my list display. I want to be able to sort by it, but it does not correspond to a single field, so I cannot use admin_order_field on its own.
I would like to be able to alter the ordering of the queryset to reflect this, if it is selected as a field. However, it looks like the ChangeList view calls get_ordering after running it via the model admin's get_ordering call, and then loops through the given sorted fields (in the format a.b.c.etc.y.z where a, b, c, etc. are all integers corresponding to one or more fields in the display list).
In this example, I have an order page where the customer can be a company/organization or a person. I want to be able to sort it so that all orders by people are listed first, followed by organizations, and all in alphabetical order.
Let's use this model admin setup as an example:
from django.contrib import admin

class OrderAdmin(admin.ModelAdmin):
    list_display = ('pk', 'date_ordered', 'customer')

    def customer(self, obj):
        return obj.organization or "%s %s" % (obj.first_name, obj.last_name)
At the moment, I can't sort because a sort field is only made available if the callable has an admin_order_field attached:
def customer(self, obj):
    return obj.organization or "%s %s" % (obj.first_name, obj.last_name)
customer.admin_order_field = 'customer'
The thing is, ideally I would like to be able to intercept the default code and say, "if one of the fields is 'customer', remove that field from the list and instead sort it using ["organization", "last_name", "first_name"]". But as far as I can tell there is no way to do this.
I suspect extra(select={'customer':...}) would work, except that I'm using django-pyodbc as this is a SQL Server database, and the generated SQL simply does not work and throws an error:
SELECT *
FROM (SELECT
        ( COALESCE(organization, firstname + ' ' + lastname) ) AS [customer],
        ...,
        ( Row_number()
            OVER (
              ORDER BY [customer] ASC, [orders].[date_created] DESC,
                       [orders].[order_id] ASC
            ) ) AS [rn]
      FROM [orders]) AS X
WHERE X.rn BETWEEN 1 AND 100
The error being:
Invalid column name 'customer'.
Short of rewriting django-pyodbc, using .extra is not a solution.
So I'm wondering if there is anything else I can do, or if I just have to give up on using customer name on its own as a sorting field, and replace it with separate organization, last name, and first name columns.
Please note that admin_order_field must name a field in the SQL result set. So what you need to do is override get_queryset in the admin so that the queryset has a "field" that works for your sort.
https://docs.djangoproject.com/en/dev/ref/contrib/admin/#django.contrib.admin.ModelAdmin.get_queryset
You will probably need Q() and F() functions as well. I do not know if they work in pyodbc.
Now, this "customer" field should really be just for sorting. It is accessible via obj.customer.
Then you need a function (it could be called "customer" as well) that sorts on that field. And in it you do the logic you wanted to do with COALESCE.
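
The ordering the question asks for (people first, then organizations, each alphabetical) can be checked at the SQL level before wiring it into the admin. This is a rough sketch with the standard-library sqlite3 module, not SQL Server or django-pyodbc. The sample rows are invented, and it relies on SQLite sorting NULL before other values in ascending order, which should be verified for any other backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, organization TEXT, first_name TEXT, last_name TEXT)"
)
# Hypothetical orders: two people (organization is NULL) and two organizations.
conn.executemany(
    "INSERT INTO orders (organization, first_name, last_name) VALUES (?, ?, ?)",
    [
        (None, "Ann", "Zed"),
        ("Acme", None, None),
        (None, "Bob", "Young"),
        ("Beta Corp", None, None),
    ],
)


def customers_in_order():
    """Display names sorted by organization, then last and first name."""
    rows = conn.execute(
        "SELECT COALESCE(organization, first_name || ' ' || last_name) AS customer "
        "FROM orders ORDER BY organization, last_name, first_name"
    ).fetchall()
    return [customer for (customer,) in rows]
```

Here the display value is built with COALESCE, while the sort uses the three underlying columns, so the ORDER BY never has to reference the computed customer alias.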