I have a partitioned table with an index on a datetime field.
In the Django ORM I have two variants of the same query.
First variant
MyModel.objects.filter(my_datetime_field__gt=date(2022, 1, 1))
This query takes 3.5 to 5 seconds.
Second variant
MyModel.objects.filter(my_datetime_field__date__gt=date(2022, 1, 1))
This query takes 0.05 seconds.
Question
Previously, both queries completed in about the same time. What could have happened?
Some information
django: 2.0
postgres: 12.3
index_type: btree
I have tried the following:
VACUUM (VERBOSE, ANALYZE) my_table
REINDEX INDEX my_index
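Since there is no plan output in the question, the first diagnostic step would be to compare the two plans with EXPLAIN ANALYZE on the SQL each queryset generates (str(queryset.query) shows it). As a rough, self-contained illustration of how the shape of a predicate decides whether an index is usable, here is a SQLite sketch (SQLite, not Postgres, so only the principle carries over; the table and index names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (ts TEXT)")
cur.execute("CREATE INDEX t_ts ON t (ts)")

# A comparison against the bare column can be answered from the index...
plan_raw = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE ts > '2022-01-01'"
).fetchall()

# ...while wrapping the column in a function forces a full scan,
# because no index on the expression date(ts) exists.
plan_fn = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE date(ts) > '2022-01-01'"
).fetchall()

print(plan_raw[0][3])  # e.g. SEARCH t USING COVERING INDEX t_ts (ts>?)
print(plan_fn[0][3])   # e.g. SCAN t
```

In the question the casted variant is the faster one, so on the Postgres side the explanation may be the reverse: the `__date` cast might happen to line up with the partition key or an expression index, which is exactly what comparing EXPLAIN output for both statements would reveal.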
Related
Today I was just bitten in the rear end by something I didn't expect. Here's a little script to reproduce the issue:
create temporary table aaa_state(id int, amount int);
create temporary table aaa_changes(id int, delta int);
insert into aaa_state(id, amount) values (1, 0);
insert into aaa_changes(id, delta) values (1, 5), (1, 7);
update aaa_changes c join aaa_state s on (c.id=s.id) set s.amount=s.amount+c.delta;
select * from aaa_state;
The final result in the aaa_state table is:

 id | amount
----+--------
  1 |      5

Whereas I would expect it to be:

 id | amount
----+--------
  1 |     12
What gives? I checked the docs but cannot find anything that would hint at this behavior. Is this a bug that I should report, or is this by design?
The behavior you are seeing is consistent with two updates happening on the aaa_state table. One update assigns the amount to 7, and then this amount is clobbered by the second update, which sets it to 5. This could be explained by MySQL using a snapshot of the aaa_state table to fetch the amount for each step of the update. If true, the actual steps would look something like this:
1. Join the two tables.
2. Update the amount using the "first" row from the changes table. Now the cached result for the amount is 7, but this value will not actually be written out to the underlying table until AFTER the entire update.
3. Update the amount using the "second" row from the changes table. Now the cached amount is 5.
4. The update is over; write 5 out as the actual amount.
Your syntax is not really correct for what you want to do (note also that AS is a reserved word, so it cannot be used as a table alias). You should be using something like the following:
UPDATE aaa_state s
INNER JOIN
(
    SELECT id, SUM(delta) AS delta_sum
    FROM aaa_changes
    GROUP BY id
) ac
ON ac.id = s.id
SET
    s.amount = s.amount + ac.delta_sum;
Here we are doing a proper aggregation of the delta values for each id in a separate bona fide subquery. This means the delta sums are fully computed and materialized in the subquery before MySQL performs the join to update the first table.
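The aggregate-first idea can be sketched end to end with a self-contained script. This uses Python's built-in SQLite, whose multi-table UPDATE syntax differs from MySQL's, so the sum is applied through a correlated subquery instead of a JOIN; the principle is the same, though: compute SUM(delta) per id before touching aaa_state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE aaa_state(id INT, amount INT);
    CREATE TABLE aaa_changes(id INT, delta INT);
    INSERT INTO aaa_state(id, amount) VALUES (1, 0);
    INSERT INTO aaa_changes(id, delta) VALUES (1, 5), (1, 7);
""")

# Aggregate the deltas per id first, then apply the sum in one assignment,
# so no intermediate value can be clobbered.
cur.execute("""
    UPDATE aaa_state
    SET amount = amount + (
        SELECT SUM(delta) FROM aaa_changes c WHERE c.id = aaa_state.id
    )
    WHERE id IN (SELECT id FROM aaa_changes)
""")

amount = cur.execute("SELECT amount FROM aaa_state WHERE id = 1").fetchone()[0]
print(amount)  # 12
```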
Please help me with this error:
File "/Users/ecommerce/proshopvenv/lib/python3.8/site-packages/django/db/backends/utils.py",
line 84, in _execute
return self.cursor.execute(sql, params)
psycopg2.errors.CannotCoerce: cannot cast type date to time without time zone
LINE 1: ..." ALTER COLUMN "createdAt" TYPE time USING "createdAt"::time
The above exception was the direct cause of the following exception:
"a lot of code, and:"
django.db.utils.ProgrammingError: cannot cast type date to time without time zone
LINE 1: ..." ALTER COLUMN "createdAt" TYPE time USING "createdAt"::time
In models I have:
createdAt = models.DateTimeField(auto_now_add=True)
and, as I understand from the docs https://docs.djangoproject.com/en/3.2/ref/models/fields/
"The auto_now and auto_now_add options will always use the date in the default timezone at the moment of creation or update."
But the zone doesn't get created, or the error is in something else. I am creating a new database, so I can delete all the tables.
Maybe there is a step in Django where it initializes the time zone?
I read the topic "postgreSQL alter column data type to timestamp without time zone" and many others, but I can't get 'migrate' to run without errors.
Yes, there is a problem modifying a date to a time, with or without time zone. The problem is that in Postgres a date does not have a time component, so any successful attempt to get a time results in '00:00:00'. Try:
select current_date::timestamp::time;
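The same point can be made with a rough analogy in Python's datetime module: a date carries no time of day, so the only time you can manufacture from it is midnight.

```python
from datetime import date, datetime, time

d = date(2022, 1, 1)
# A date has no time component; combining it with the start of the day is
# the closest analogue of the Postgres date -> timestamp -> time cast chain.
as_timestamp = datetime.combine(d, time.min)
print(as_timestamp.time())  # 00:00:00
```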
Given that, and assuming your objective is to change the type, just drop and re-add the column:
alter table test_table drop createdat;
alter table test_table add createdat time default '00:00:00'::time;
Of course, I have never figured out what good a time is without a date. Which came first, '10:15:00' or '11:15:00'? Well, if 10:15 was today and 11:15 was yesterday, then 11:15 came first. Unless you define it as hours-since-midnight. So perhaps a better option:
alter table test_table
alter column createdat
set data type timestamp
using createdat::timestamp;
See the demo. Yes, timestamp takes more space, but it is much more flexible.
First I want to say thanks to Belayer, but I have no rating to vote for his answer.
I needed to alter columns from type "date" to "timestamp with time zone", because my Django model configuration wanted them like this: createdAt = models.DateTimeField(auto_now_add=True)
So I ran alter table "base_product" alter column "createdAt" type timestamp with time zone;
But there was a strange thing for me: in all the other tables the "date" columns were modified to "timestamp with time zone" (which was good, because there were a few other places where I needed that), but not in the table "base_product" where I ran the query, and it didn't want to modify it automatically any more.
So then I ran ALTER TABLE "base_product" ALTER COLUMN "createdAt" DROP DEFAULT, ALTER COLUMN "createdAt" TYPE timestamp with time zone USING timestamp with time zone 'epoch' + "createdAt", ALTER COLUMN "createdAt" SET DEFAULT now();
I got that from the Postgres docs https://www.postgresql.org/docs/9.1/sql-altertable.html
In my case, what I did was: delete the migrations to get back to where everything worked fine, then delete the field, apply a migration, and then add it back as a new column. And it works. Done this way, Django doesn't know a column was altered; it thinks I created a new one. 😁
I have encountered a strange error on one of our tables ("nav") with the following structure:
position | character varying(400)
timestamp | timestamp without time zone
If I run the following query, I get back, as expected, all matching results for the 2 columns:
SELECT * FROM nav WHERE timestamp >= '2019-03-24' AND timestamp < '2019-03-25';
However, if I run the following query, I get back the same number of results, but each field/row is empty:
SELECT position FROM nav WHERE timestamp >= '2019-03-24' AND timestamp < '2019-03-25';
I ran the same query on similar tables and experienced no issues. Any ideas as to what may be causing the second query to return blank rows?
Problem:
After a successful data migration from CSV files to a Django/Postgres application, when I try to add a new record via the application interface I get: duplicate key value violates unique constraint. (Since I had ids in my CSV files, I used them as keys.)
Basically, the app tries to generate ids that were already migrated.
After each attempt the ID increments by one, so if I have 160 records I have to get this error 160 times, and then on attempt 161 the record saves OK.
Any ideas how to solve it?
PostgreSQL doesn't have an actual AUTO_INCREMENT column, at least not in the way that MySQL does. Instead it has the special SERIAL pseudo-type. This creates a four-byte INT column and attaches a DEFAULT to it that draws values from a sequence created alongside the column: if you don't supply a value for the ID column, PostgreSQL takes the next value from that sequence.
You can see this by:
SELECT
TABLE_NAME, COLUMN_NAME, COLUMN_DEFAULT
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME='<your-table>' AND COLUMN_NAME = '<your-id-column>';
You should see something like:
table_name | column_name | column_default
--------------+---------------------------+-------------------------------------
<your-table> | <your-id-column> | nextval('<table-name>_<your-id-column>_seq'::regclass)
(1 row)
To resolve your particular issue, you're going to need to reset the value of the sequence (named <table-name>_<your-id-column>_seq) to reflect the current index.
ALTER SEQUENCE your_name_your_id_column_seq RESTART WITH 161;
Credit where credit is due.
Sequence syntax is here.
Finding the name is here.
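If you would rather compute the restart value from the data instead of hard-coding 161 (for instance, inside a Django data migration), a small helper can build the statement; the table and sequence names below are hypothetical:

```python
def restart_statement(sequence_name, existing_ids):
    """Build an ALTER SEQUENCE statement that resumes past the largest id."""
    # The sequence must restart just past the highest id already in the table;
    # an empty table restarts at 1.
    next_id = max(existing_ids, default=0) + 1
    return f"ALTER SEQUENCE {sequence_name} RESTART WITH {next_id};"

# With 160 migrated rows (ids 1..160), the sequence must restart at 161.
print(restart_statement("my_table_id_seq", range(1, 161)))
# ALTER SEQUENCE my_table_id_seq RESTART WITH 161;
```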
I have a table with one of the columns as date. It can have multiple entries for each date.
date .....
----------- -----
2015-07-20 ..
2015-07-20 ..
2015-07-23 ..
2015-07-24 ..
I would like to get data in the following form using Django ORM with PostgreSQL as database backend:
date count(date)
----------- -----------
2015-07-20 2
2015-07-21 0 (missing after aggregation)
2015-07-22 0 (missing after aggregation)
2015-07-23 1
2015-07-24 1
Corresponding PostgreSQL Query:
WITH RECURSIVE date_view(start_date, end_date)
AS ( VALUES ('2015-07-20'::date, '2015-07-24'::date)
UNION ALL SELECT start_date::date + 1, end_date
FROM date_view
WHERE start_date < end_date )
SELECT start_date, count(date)
FROM date_view LEFT JOIN my_table ON date=start_date
GROUP BY date, start_date
ORDER BY start_date ASC;
I'm having trouble translating this raw query to Django ORM query.
It would be great if someone can give a sample ORM query with/without a workaround for Common Table Expressions using PostgreSQL as database backend.
The simple reason is quoted here:
My preference is to do as much data processing in the database, short of really involved presentation stuff. I don't envy doing this in application code, just as long as it's one trip to the database
As per this answer, Django doesn't support CTEs natively, but that answer seems quite outdated.
References:
MySQL: Select All Dates In a Range Even If No Records Present
WITH Queries (Common Table Expressions)
Thanks
I do not think you can do this with the pure Django ORM, and I am not even sure it can be done neatly with extra(). The Django ORM is incredibly good at handling the usual stuff, but for more complex SQL statements and requirements, especially DBMS-specific ones, it is just not quite there yet. You might have to go lower and execute raw SQL directly, or offload the requirement to the application layer.
You can always generate the missing dates using Python, but that will be incredibly slow if the range and number of elements are huge. If this is being requested by AJAX for other use (e.g. charting), then you can offload that to Javascript.
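As a sketch of the generate-the-gaps-in-Python approach, assuming the per-day counts have already been fetched into a dict (model and queryset details aside, this is plain stdlib code):

```python
from datetime import date, timedelta

def fill_missing_dates(counts, start, end):
    """Expand {date: count} into one (date, count) pair per day, 0 for gaps."""
    out = []
    d = start
    while d <= end:
        out.append((d, counts.get(d, 0)))
        d += timedelta(days=1)
    return out

# Counts as they might come back from an aggregation query, with gaps.
counts = {date(2015, 7, 20): 2, date(2015, 7, 23): 1, date(2015, 7, 24): 1}
for day, n in fill_missing_dates(counts, date(2015, 7, 20), date(2015, 7, 24)):
    print(day, n)
```

As noted above, this is fine for small ranges but will not scale to huge ones.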
from datetime import date, timedelta
from django.db.models.functions import Trunc
from django.db.models.expressions import Value
from django.db.models import Count, DateField

# A is the model
start_date = date(2022, 5, 1)
end_date = date(2022, 5, 10)
result = (
    A.objects
    .annotate(date=Trunc('created', 'day', output_field=DateField()))
    .filter(date__gte=start_date, date__lte=end_date)
    .values('date')
    .annotate(count=Count('id'))
    .union(
        A.objects.extra(select={
            'date': 'unnest(Array[%s]::date[])' % ','.join(
                map(lambda d: "'%s'::date" % d.strftime('%Y-%m-%d'),
                    set(start_date + timedelta(n)
                        for n in range((end_date - start_date).days + 1)) -
                    set(A.objects
                        .annotate(date=Trunc('created', 'day',
                                             output_field=DateField()))
                        .values_list('date', flat=True))))
        })
        .annotate(count=Value(0))
        .values('date', 'count')
    )
    .order_by('date')
)
Instead of the recursive CTE you could use generate_series() to construct a calendar table:
SELECT calendar, count(mt.zdate) as THE_COUNT
FROM generate_series('2015-07-20'::date
, '2015-07-24'::date
, '1 day'::interval) calendar
LEFT JOIN my_table mt ON mt.zdate = calendar
GROUP BY 1
ORDER BY 1 ASC;
BTW: I renamed date to zdate. DATE is a bad name for a column (it is the name of a data type).