Integrate Raw SQL Query into Django ORM with Aggregations - django

I'm trying to integrate this raw query into a Django ORM query, but I'm having trouble reproducing the raw query's grouping and aggregation with the ORM.
The original query which works fine with postgres querytools:
"SELECT SUM(counter),type, manufacturer
FROM sells GROUP BY manufacturer, type"
Now I tried to integrate this into a django-orm query like this:
res_postgres = Sells.objects.all().values('manufacturer','type','counter').aggregate(cnter=Sum('counter'))
But all I get back is the single counter cnter ...
What I need is the result from the raw query, which looks like this
What I also tried is to use values() with the field names, like Sells.objects.values('manufacturer', ...).aggregate(cnter=Sum('counter')).
But then Django builds a query with GROUP BY id, which is not what I need. I need an aggregation over the entire data set, not at the object level, while keeping the information from the other fields.
When I use Cars.objects.raw() it asks me about primary keys, which I also don't need.
Any hints here? Is that possible with the Django ORM at all?

Use annotate(...) instead of aggregate()
res_postgres = Sells.objects.values('manufacturer','type').annotate(cnter=Sum('counter'))
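Calling values() before annotate() turns the annotation into a per-group aggregate, so this produces the same GROUP BY manufacturer, type as the raw query. A minimal sketch of how the result might be consumed, assuming the model and field names from the question:

from django.db.models import Sum

# values() before annotate() makes the annotation a per-group aggregate,
# i.e. SELECT manufacturer, type, SUM(counter) ... GROUP BY manufacturer, type
res_postgres = (
    Sells.objects
    .values('manufacturer', 'type')
    .annotate(cnter=Sum('counter'))
)

for row in res_postgres:
    # each row is a dict: {'manufacturer': ..., 'type': ..., 'cnter': ...}
    print(row['manufacturer'], row['type'], row['cnter'])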

Related

Does Django support setting the beginning value for an id column?

I have seen several questions and answers on SO, most were three years old or older and I looked at the Django documentation (hoping I didn't miss it). I have to have a 9+ digit number for an id. Most responses were to do this at the database. I am guessing that means to create the model in Django and then go back to the database and change the id column Django created with a new starting/next value attribute on the column.
If not, how can I create a database table from Django, code first, with an id column that starts at 100000000? And can it be done with the stock model/object methods in Django? I don't really want to do a special hack; if that's what it takes, I can go to the database and fix the column. I was trying to adhere to the code-first ideas of Django (though I prefer database first, and am afraid using inspectdb will make a mess).
Edit: I didn't want to use UUID. I believe BigAutoField is best.
You should be able to do this in two steps:
1 - Specify your primary key explicitly using primary_key=True in your model definition. See the Django docs for more info. You can then use BigAutoField or whatever other type you want for the primary key.
2A - If you're populating the database up front, just set pk: 100000000 in your fixture.
OR
2B - If you're not populating the database up front, use Django Model Migration Operations RunSQL as detailed here. For your SQL use ALTER TABLE tableName AUTO_INCREMENT=100000000.
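A rough sketch of the two steps, assuming a hypothetical app myapp with a model Account (note that ALTER TABLE ... AUTO_INCREMENT is MySQL syntax; on PostgreSQL you would use ALTER SEQUENCE ... RESTART WITH 100000000 on the id sequence instead):

# models.py
from django.db import models

class Account(models.Model):
    id = models.BigAutoField(primary_key=True)  # explicit big-integer auto PK
    name = models.CharField(max_length=100)

# migrations/0002_set_id_start.py
from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [('myapp', '0001_initial')]
    operations = [
        # Bump the auto-increment counter so new rows start at 100000000 (MySQL)
        migrations.RunSQL('ALTER TABLE myapp_account AUTO_INCREMENT = 100000000;'),
    ]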

Overriding models.save() in Django ORM to use a stored procedure?

I'm putting together a partitioned table in postgres which will be used by an API written in Django. Postgres has a number of issues with this, most of them having to do with the RETURNING clause in SQL returning NULL or creating duplicate records (google postgres partition returning if you want to learn more).
I believe the solution is to override the save() method in the ORM to use a stored procedure or custom SQL, but how do I map the incoming arguments to a custom SQL statement?
Ideally it would look like this, but instead of calling the super method it would map the args to a custom SQL statement.
The simplest way is to create a BEFORE INSERT/UPDATE trigger on the PostgreSQL side.
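If you do want to handle it in Django instead, a hedged sketch of what overriding save() to hand the field values to a stored procedure might look like; the model, its fields, and the procedure name insert_reading are all hypothetical:

from django.db import connection, models

class Reading(models.Model):  # hypothetical model
    sensor = models.CharField(max_length=50)
    value = models.FloatField()

    def save(self, *args, **kwargs):
        # Instead of calling super().save(), pass the instance's fields to a
        # stored procedure that performs the partitioned insert itself.
        with connection.cursor() as cursor:
            cursor.execute(
                'SELECT insert_reading(%s, %s)',
                [self.sensor, self.value],
            )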

How can I dynamically get the connector table name of a Many-To-Many relationship?

Unfortunately I have a situation where I have to write a raw SQL query. I have a model that will have many Many-to-Many relationships and I'm trying to do a generic function to get the correct information in a query.
Each of the many-to-many relationships will be set up with something along the lines of shared = models.ManyToManyField(SharedResource). In my raw query function, I'm given the model that has this defined, and need to build the table name to do the raw join.
How can I reliably get the Many-To-Many connector table name?
The table name for a model is available via Model._meta.db_table, and you can get to the through table for an M2M field via Model.field.through. So, in your case:
MyModel.shared.through._meta.db_table
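Since the question asks for a generic helper, here is a small sketch using the public _meta API; the helper name and usage example are my own:

def m2m_table_name(model, field_name):
    # Return the connector (through) table name for a ManyToManyField on `model`.
    field = model._meta.get_field(field_name)
    # remote_field.through is the auto-generated (or explicit) through model
    return field.remote_field.through._meta.db_table

# usage: m2m_table_name(MyModel, 'shared') -> e.g. 'myapp_mymodel_shared'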

manipulating Django database directly

I was wondering whether this could produce any problems if I directly add or remove rows in a model's table. I thought maybe Django records the number of rows in all tables, or that this could mess up the auto-generated ids?
I don't think it matters, but I'm using MySQL.
No, it's not a problem, because Django does the same thing you do "directly" to the database: it executes SQL statements. The auto-generated id is handled by the database server (the MySQL server in this case), no matter where the SQL comes from, whether the MySQL client or Django.
Since you can have Django work on a pre-existing database (one that wasn't created by Django), I don't think you will have problems if you access/write the tables of your own app (you might want to avoid modifying Django's internal tables like auth, permission, content_type etc. until you are familiar with them).
When you create a model through Django, Django doesn't store the row count or anything (unless your app does), so it's okay to create the model with Django on the SQL database and then have another app write/read from that same SQL table.
If you use Django signals, those will not be triggered by modifying the SQL table directly through the DB, so you might want to pay attention to side effects like that.
Your RDBMS handles its own auto-generated IDs, referential integrity, counts, etc., so you don't have to worry about messing that up.

Warehousing records from a flat item table: Django Signals or PostgreSQL Triggers?

I have a Django website with a PostgreSQL database. There is a Django app and model for a 'flat' item table with many records being inserted regularly, up to millions of inserts per month. I would like to use these records to automatically populate a star schema of fact and dimension tables (initially also modeled in the Django models.py), in order to efficiently do complex queries on the records, and present data from them on the Django site.
Two main options keep coming up:
1) PostgreSQL Triggers: Configure the database directly to insert the appropriate rows into fact and dimension tables on creation or update of a record, possibly using PL/Python or PL/pgSQL and row-level AFTER triggers. Pros: works with inputs from outside Django; might be expected to be more efficient. Cons: splits business logic into another location; triggered inserts may not be expected by other input sources.
2) Django Signals: Use the Signals feature to do the inserts upon creation or update of a record, with the built-in signal django.db.models.signals.post_save. Pros: easier to build and maintain. Cons: Have to repeat some code or stay inside the Django site/app environment to support new input sources.
Am I correct in thinking that Django's built-in signals are the way to go for maintaining the fact table and the dimension tables? Or is there some other, significant option that is being missed?
I ended up using Django Signals. With a flat table "item_record" containing fields "title" and "description", the code in models.py looks like this:
from django.db.models.signals import post_save
def create_item_record_history(instance, created, **kwargs):
    if created:
        ItemRecordHistory.objects.create(
            title=instance.title,
            description=instance.description,
            created_at=instance.created_at,
        )
post_save.connect(create_item_record_history, sender=ItemRecord)
It is running well for my purposes. Although it's just creating an annotated flat table (new field "created_at"), the same method could be used to build out a star schema.