Does transaction.atomic roll back increments to a pk sequence - django

I'm using Django 2.2 and my question is: does transaction.atomic roll back increments to a pk sequence?
Below is the background bug I wrote up that led me to this question.
I'm facing a really weird issue that I can't figure out and I'm hoping someone has faced a similar issue.
An insert using the django ORM .create() function is returning django.db.utils.IntegrityError: duplicate key value violates unique constraint "my_table_pkey" DETAIL: Key (id)=(5795) already exists.
Fine. But then I look at the table and no record with id=5795 exists!
SELECT * from my_table where id=5795;
shows (0 rows)
A look at the sequence my_table_id_seq shows that it has nonetheless incremented to last_value = 5795, as if the above record had been inserted. Moreover, the issue does not always occur: a successful insert with different data lands at id=5796. (I tried resetting the pk sequence, but that didn't do anything, since it doesn't seem to be the problem anyway.)
I'm quite stumped by this, and it has caused us a lot of issues on one specific table. I finally realized that the call is wrapped in transaction.atomic and that a particular scenario may be causing a double insert with the same pk.
So my theory is: The transaction atomic is not rolling back the increment of the

Postgres sequences do not roll back. Every call to nextval() advances the sequence, whether or not the surrounding statement or transaction succeeds. For more information, see the Notes section of the PostgreSQL CREATE SEQUENCE documentation.
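To make the effect concrete, here is a minimal toy model in plain Python. It does not talk to Postgres; ToySequence and ToyTable are made-up names, and the only thing the sketch assumes is the behaviour described above: rows are restored on rollback, the sequence counter is not.

```python
class ToySequence:
    """Stand-in for a Postgres sequence: nextval() is never undone."""

    def __init__(self):
        self.last_value = 0

    def nextval(self):
        self.last_value += 1
        return self.last_value


class ToyTable:
    """Stand-in for a table whose pk default is nextval(seq)."""

    def __init__(self):
        self.rows = {}
        self.seq = ToySequence()

    def insert_in_transaction(self, data, fail=False):
        snapshot = dict(self.rows)   # transactional state: the rows only
        pk = self.seq.nextval()      # the sequence advances outside that state
        self.rows[pk] = data
        if fail:
            self.rows = snapshot     # rollback restores rows, not the sequence
            return None
        return pk


table = ToyTable()
table.insert_in_transaction("a")             # gets pk 1
table.insert_in_transaction("b", fail=True)  # rolled back, but pk 2 is consumed
table.insert_in_transaction("c")             # gets pk 3
print(sorted(table.rows), table.seq.last_value)
```

This is the same picture as in the question: last_value points at an id for which no row exists, and the next successful insert lands one id further on.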

Related

Oracle FK constraints are not enforced

I'm using an Oracle 12c database and want to test out one problem.
When carrying out a web service request, it returns an underlying ORA-02292 error on the constraint name (YYY.FK_L_TILSYNSOBJEKT_BEGRENSNING).
Here is SQL of the table with the constraint:
CONSTRAINT "FK_L_TILSYNSOBJEKT_BEGRENSNING" FOREIGN KEY ("BEGRENSNING")
REFERENCES "XXX"."BEGRENSNING" ("IDSTRING") DEFERRABLE INITIALLY DEFERRED ENABLE NOVALIDATE
The problem is that when I manually delete a row with a valid IDSTRING (present in both tables) from the parent table, the delete succeeds.
What causes it to behave this way? Is there any other info I should give?
Not sure if it helps anyone, since it was a fairly stupid mistake, but I'll try to make it useful since people demand answers.
DEFERRABLE INITIALLY DEFERRED means that the constraint is enforced at commit, not when the statement runs. INITIALLY IMMEDIATE, by contrast, performs the check right after you issue each statement. That is why a manual DELETE appears to succeed: the violation only surfaces once the transaction is committed. Immediate checking makes bulk updates a bit slower, since every statement in the transaction has to be checked against the constraint; with a deferred constraint, if it turns out there is an issue, the whole batch is rolled back, no additional unnecessary queries are issued, and something can be done about it. Hence INITIALLY IMMEDIATE is used less often than INITIALLY DEFERRED.
Error ORA-02292, however, is raised only for DELETE statements; knowing that makes it a bit easier to debug your statements.
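The difference in timing can be reproduced with a small sketch. Note the assumptions: this uses SQLite through Python's sqlite3 module as a stand-in, not Oracle, and the table names and the violation_step helper are invented for illustration. SQLite also accepts DEFERRABLE INITIALLY DEFERRED on foreign keys, so it shows the same "the DELETE works, the COMMIT fails" pattern.

```python
import sqlite3


def violation_step(deferred):
    """Return the step ('delete' or 'commit') at which the FK violation surfaces."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual BEGIN/COMMIT
    conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default
    mode = "DEFERRABLE INITIALLY DEFERRED" if deferred else ""
    conn.execute("CREATE TABLE parent (idstring TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, "
        f"parent_id TEXT REFERENCES parent (idstring) {mode})"
    )
    conn.execute("INSERT INTO parent VALUES ('A')")
    conn.execute("INSERT INTO child (parent_id) VALUES ('A')")

    conn.execute("BEGIN")
    try:
        # Immediate constraint: this statement itself fails.
        conn.execute("DELETE FROM parent WHERE idstring = 'A'")
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "delete"
    try:
        # Deferred constraint: the check only happens here.
        conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "commit"
    return "none"


print(violation_step(deferred=True), violation_step(deferred=False))
```

With the deferred constraint the DELETE goes through and the error only shows up on COMMIT, which would match a manual delete that looks successful while the web service, which commits, reports the violation.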

Adding data to database in django issue

I have code like this:
form = TestForm(request.POST)
form.save(commit=False).save()
This code sometimes works and sometimes doesn't. The problem is in the auto-increment id.
When I have some data in the db that was not written by Django and I want to add data from Django, I get an IntegrityError saying the id already exists.
If I have 2 rows in the db (not added by Django), I need to click "add data" 3 times. After the third time, when the id increments to 3, all is OK.
How do I solve this?
These integrity errors appear when the table's sequence is out of sync with the data, i.e. it was not advanced when a row was created. For example, you import items from some source and the items carry their own ids, which are higher than the sequence's current value. I have not seen a case where Django messes sequences up.
So my guess is that the other source that inserts data into your database also inserts explicit ids, and the sequence is not updated. Fix that and your problems should disappear.
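The "click three times" behaviour follows directly from a sequence that lags behind the data. A rough sketch in plain Python (a toy model with a made-up helper name, not real database access), assuming every failed insert still consumes one sequence value:

```python
def inserts_until_success(existing_ids, last_value=0):
    """Count insert attempts until the sequence yields an unused id.

    existing_ids: ids already in the table (inserted with explicit ids).
    last_value:   the sequence's current value, lagging behind the data.
    Returns (number_of_attempts, id_finally_used).
    """
    taken = set(existing_ids)
    attempts = 0
    while True:
        attempts += 1
        last_value += 1              # nextval advances even if the insert fails
        if last_value not in taken:  # no duplicate key, so the insert succeeds
            return attempts, last_value


print(inserts_until_success({1, 2}))  # two rows added outside Django
```

With two externally inserted rows and an untouched sequence, the first two attempts collide with ids 1 and 2 and the third succeeds with id 3, which is what the question describes.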

Occasional IntegrityError on m2m fields using PostgreSQL

I can't detect any pattern; maybe 1 in every 1000 edits of a certain model raises an IntegrityError on an m2m field. Most of the time this field wasn't even modified. When a model is saved, I believe Django always wipes the m2m field and then re-adds the items, right? I saw Django call clear() and then add() the items.
My code then fails with:
IntegrityError: duplicate key value violates unique constraint
"app_model_m2m_field_key" DETAIL: Key (model1_id, model2_id)=(597,
1009) already exists.
It seems like the add of items is performed before the items are cleared, which is very weird. I've tried to reproduce it but it's very hard, only happens occasionally. Any idea what could cause it? Could maybe setting auto commit solve this problem?
Thanks in advance
Most likely, you have two requests racing to commit similar changes at the same time.
Request 1 begins a transaction and DELETEs the existing M2M rows.
Request 2 begins a transaction and DELETEs the M2M rows with the same where clause. This blocks waiting for request 1's transaction to commit.
Request 1 re-INSERTs all the M2M rows and commits.
Request 2 resumes, and the delete succeeds without deleting any rows, because all rows that existed when the statement began have already been deleted.
Request 2 tries to re-INSERT an M2M row, but the database detects that it already exists and returns an error.
It's possible to fix this by upgrading to the SERIALIZABLE isolation level (instead of PostgreSQL's default of READ COMMITTED) but at the cost of even more exciting potential failure modes and worse performance.
I'm assuming you're right that Django is performing a DELETE followed by a series of INSERTs, although that wouldn't be a very good plan precisely because it exacerbates this kind of race.
The best plan is to identify what has actually changed and only ask the database to make those changes, because then if you get an integrity error it's because there was a real conflict that you probably couldn't do anything about anyway.
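As a sketch of that last suggestion, the change set can be computed with plain set arithmetic before touching the database. The helper name m2m_changes is made up for this example; how the current ids are loaded is left out.

```python
def m2m_changes(current_ids, desired_ids):
    """Return (to_add, to_remove) so that only real changes hit the database."""
    current, desired = set(current_ids), set(desired_ids)
    to_add = sorted(desired - current)      # missing rows: INSERT only these
    to_remove = sorted(current - desired)   # stale rows: DELETE only these
    return to_add, to_remove


to_add, to_remove = m2m_changes(current_ids=[1009, 1010], desired_ids=[1009, 1011])
print(to_add, to_remove)
```

In Django this would presumably translate into calling the related manager's add(*to_add) and remove(*to_remove) instead of clear() followed by add(). If the field was not modified at all, both lists are empty and no statement is issued, so there is nothing to race on.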

How can I forward a primary key sequence in Django safely?

Using Django with a PostgreSQL (8.x) backend, I have a model where I need to skip a block of ids, e.g. after giving out 49999 I want the next id to be 70000 not 50000 (because that block is reserved for another source where the instances are added explicitly with id - I know that's not a great design but it's what I have to work with).
What is the correct/safest place for doing this?
I know I can set the sequence with
SELECT SETVAL(
(SELECT pg_get_serial_sequence('myapp_mymodel', 'id')),
70000,
false
);
but when does Django actually pull a number from the sequence?
Do I override MyModel.save(), call its super and then grab me a cursor and check with
SELECT currval(
(SELECT pg_get_serial_sequence('myapp_mymodel', 'id'))
);
?
I believe that a sequence may be advanced by django even if saving the model fails, so I want to make sure whenever it hits that number it advances - is there a better place than save()?
P.S.: Even if that was the way to go - can I actually figure out the currval for save()'s session like this? if I grab me a connection and cursor, and execute that second SQL statement, wouldn't I be in another session and therefore not get a currval?
Thank you for any pointers.
EDIT: I have a feeling that this must be done at database level (concurrency issues) and posted a corresponding PostgreSQL question - How can I forward a primary key sequence in PostgreSQL safely?
As I haven't found an "automated" way of doing this yet, I'm thinking of the following workaround - it would be feasible for my particular situation:
Set the sequence with a MAXVALUE 49999 NO CYCLE
When 49999 is reached, the next save() will run into a postgres error
Catch that exception and reraise as a form error "you've run out of numbers, please reset to the next block then try again"
Provide a view where the user can activate the next block, i.e. execute "ALTER SEQUENCE my_seq RESTART WITH 70000 MAXVALUE 89999"
I'm uneasy about doing the restart automatically when catching the exception:
try:
    instance.save()
except RunOutOfIdsException:
    restart_id_sequence()
    instance.save()
as I fear two concurrent save()'s running out of ids will lead to two separate restarts, and a subsequent violation of the unique constraint. (basically same concept as original problem)
My next thought was to not use a sequence for the primary key, but rather always specify the id explicitly from a separate counter table which I check/update before using its latest number - that should be safe from concurrency issues. The only problem is that although I have a single place where I add model instances, other parts of django or third-party apps may still rely on an implicit id, which I don't want to break.
But that same mechanism happens to be easily implemented on postgres level - I believe this is the solution:
Don't use SERIAL for the primary key, use DEFAULT my_next_id()
Follow the same logic as for "single level gapless sequence" - http://www.varlena.com/GeneralBits/130.php - my_next_id() does an update followed by a select
Instead of just increasing by 1, check if a boundary was crossed and if so, increase even further
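The boundary check in the last step could look roughly like the following. This is only the arithmetic, sketched in Python for readability; in the approach described above it would live inside the my_next_id() database function, next to the update and select on the counter row. RESERVED_BLOCKS and next_id are names invented for this example.

```python
# Hypothetical reserved ranges (inclusive) that the counter must never hand out.
RESERVED_BLOCKS = [(50000, 69999)]


def next_id(last_id, reserved=RESERVED_BLOCKS):
    """Return the id following last_id, jumping over any reserved block."""
    candidate = last_id + 1
    for start, end in sorted(reserved):
        if start <= candidate <= end:
            candidate = end + 1  # boundary crossed: skip the whole block
    return candidate


print(next_id(49998), next_id(49999), next_id(70000))
```

Sorting the blocks first means that several adjacent reserved ranges are skipped in one go. The concurrency safety does not come from this function; it comes from doing the update and the select on the counter inside the database.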

Django ORM misreading PostgreSQL sequences?

Background: Running a PostgreSQL database for a Django app (Django 1.1.1, Python2.4, psycopg2 and Postgres 8.1) I've restored the database from a SQL dump several times. Each time I do that and then try to add a new row, either shell, admin, or site front end, I get this error:
IntegrityError: duplicate key violates unique constraint "app_model_pkey"
The data dump is fine and is resetting the sequences. But if I try adding the row again, it's successful! So I can just try jamming a new row into every table and then everything seems to be copacetic.
Question: Given that (1) the SQL dump is good and Postgres is reading it in correctly (per earlier question), and (2) Django's ORM does not seem to be failing systemically getting next values, what is going on in this specific instance?
Django doesn't hold or directly read the sequence values in any way. I've explained it, for example, in this question: 2088210/django-object-creation-and-postgres-sequences.
PostgreSQL does increment the sequence when you try to add a row, even if the operation fails (raises a duplicate key error); the sequence increment is not rolled back. That's why it works the second time you try adding a row.
I don't know why your sequences are not set properly. Could you check the sequence value before the dump and after the restore, and do the same with the max() pk of the table? Maybe it's an 8.1 bug with the restore? I don't know. What I'm sure of: it's not Django's fault.
I am guessing that your sequence is out of date.
You can fix that like this:
select setval('app_model_id_seq', max(id)) from app_model;
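If you have to do this for several tables, a small helper that builds the statement may be handy. This is a sketch under assumptions: resync_sequence_sql is a made-up name, the sequence is assumed to follow the usual table_id_seq naming, and the three-argument form of setval is used so that an empty table is handled explicitly (the next nextval then returns 1).

```python
import re

# Accept only plain identifiers, since the names are interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def resync_sequence_sql(table, sequence, pk="id"):
    """Build a statement that moves the sequence to the table's current max pk."""
    for name in (table, sequence, pk):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return (
        f"SELECT setval('{sequence}', COALESCE(MAX({pk}), 1), "
        f"MAX({pk}) IS NOT NULL) FROM {table};"
    )


print(resync_sequence_sql("app_model", "app_model_id_seq"))
```

Django can also print similar statements for all models of an app with the sqlsequencereset management command, which may be more convenient after restoring a dump.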