objects.get_or_create() or transactions in Django views

OK, objects.get_or_create(), when called, will create a new record in the database (if no matching record exists). But what if the code throws an exception or otherwise fails AFTER objects.get_or_create() has been called?
Basically, I end up with a new record in the database which should not be there. To put it differently, shouldn't the whole thing be wrapped in a transaction which is rolled back if there is a problem? Is it possible?

As Ignacio suggests, the answer (in much greater detail than I could provide) is available in the Django docs:
http://docs.djangoproject.com/en/dev/topics/db/transactions
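In practice, a minimal sketch might look like the following (MyModel and do_something_that_might_fail are placeholders; commit_on_success was the API of that era, replaced by transaction.atomic in Django 1.6+):

from django.db import transaction
from django.http import HttpResponse

@transaction.commit_on_success  # use @transaction.atomic on Django 1.6+
def my_view(request):
    # get_or_create and everything after it share one transaction;
    # any exception raised below rolls the new record back.
    obj, created = MyModel.objects.get_or_create(name=request.POST['name'])
    do_something_that_might_fail(obj)  # hypothetical follow-up work
    return HttpResponse("ok")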

Related

Why should I care about django-reversion operations being atomic?

I want to start using django-reversion. It seems the easiest way is to use their middleware. But it gives the following warning:
Warning: Due to changes in the Django 1.6 transaction handling, revision data will be saved in a separate database transaction to the one used to save your models, even if you set ATOMIC_REQUESTS = True.
What are the caveats if the requests are not atomic? It seems to indicate that there might be some kind of race condition. What could it look like? What do I need to watch out for?
Thank you for your time, and sorry for any spelling mistakes; I'm not a native speaker.
As mentioned in the warning, due to changes in the way Django handles transactions since 1.6, the middleware is no longer wrapped in the same transaction as the view function.
This is discussed in the following issue at django-reversion.
In practice, since the RevisionMiddleware runs outside of the transaction where the models are saved, no strict guarantee can be provided at the database level that reversion data will also be saved.
The usage of RevisionMiddleware is therefore discouraged, and the following practice is advised instead:
If you need to ensure that your models and revisions are saved in the same transaction, please use the reversion.create_revision() context manager or decorator in combination with transaction.atomic()
This way, you can be sure that revision data will always be saved alongside model data. I hope this helps.
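For illustration, the recommended combination might look like this sketch (obj and its title field stand in for your own model instance and data):

import reversion
from django.db import transaction

# Model save and revision data are written in one database transaction.
with transaction.atomic(), reversion.create_revision():
    obj.title = "new title"  # hypothetical field
    obj.save()
    reversion.set_user(request.user)  # optional revision metadata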

How to handle requests for non-existent dynamic segments in Ember?

If a user modifies the dynamic segment (object ID) in the URL of an Ember app with Ember Data, what's the best practice to handle these URLs, given that they might refer to non-existent model entries?
In a minimal example one can observe that, for each call with a non-existent ID (for example http://emberjs.jsbin.com/hurozaju/9#/color/30), an empty object is added to the local Ember Data store. This is easily observable by the increasing number of "dots" in the output.
The error-action of App.ColorRoute redirects (as intended) to "colors" in case there is a 404 occurring while fetching the model by ID.
Why is there a "new" Object in the store?
Shouldn't the data be left unmodified?
Is there a chance to prevent the creation of new objects in this case?
I spent some time on this problem and I think it is an ember-data beta 7 bug. Please report the issue on GitHub.
Here is example code showing how to work around the issue: jsbin. It is tested with ember-data beta 7, where it works; with beta 4 it does not.
Sorry for not waiting as announced...
This issue is now reported to ember-data on github.

How to update just 1 record in Ember.js with Ember-data? Currently save() and commit() on 1 record actually update all records of the model

Premise: My question is based on my research of Ember-data, which may or may not be correct. So please correct me if I have any misunderstanding. The examples are running with the latest ember as of July 2, 2013.
To edit a record of my model, just one record, you need to call this.get('store').commit() or this.get('model').save(). However, downstream, either of these calls actually updates all of the records of the model, even those left untouched. This is quite inefficient, especially if the model has numerous records.
What's the best way to update one and only one record when I'm saving the edits of one record?
UPDATE: this problem only occurs for the local-storage-adapter, not the RESTAdapter.
UPDATE #2: I did have a huge misunderstanding. Everything is okay: save() and commit() both update just one record. I was fooled by the local storage adapter _saveData's JSON.stringify(this._data), which printed out all records. I assumed that whatever it printed was the data being changed, but it turns out that in _saveData's callers, the records in updateRecords and _didSaveRecords were just the single record I was changing. The statements below about different objects containing "all records of the model" can no longer be reproduced. I guess I misread the debugging information.
It makes sense, because _saveData uses localStorage, which currently can only setItem for an entire object, which in my case is the model containing all the records. Since localStorage can't update individual entries of that object, the JSON must contain all the records.
Details:
Running Examples:
this.get('store').commit() is used in doneEditing when updating a post in this jsbin.
this.get('model').save() is used in acceptChanges when updating a todo in this jsbin.
If you turn on Chrome debug and walk into the above two functions, you'll see something similar to below:
Downstream, there is currentTransaction or defaultTransaction, and both have all records of the model inside.
In the case of get('store').commit(), it eventually calls DS.Store's commit, which in turn calls: (see github)
get(this, 'defaultTransaction').commit();
In the case of get('model').save(), it eventually calls DS.Store's scheduleSave and flushSavedRecords, which call: (see github)
get(this, 'currentTransaction').add(record);
get(this, 'currentTransaction').commit();
Note at the end a commit() is called on xxxTransaction, this is DS.Transaction's commit().
DS.Transaction's commit() has a commitDetails, which is based on xxxTransaction, so commitDetails also has all the records of the data. (github)
Then DS.Adapter's commit and save are called and indeed every single record is updated (github):
this.groupByType(commitDetails.updated).forEach(function(type, set) {
  this.updateRecords(store, type, filter(set));
}, this);
(minor side note/question: when was commitDetails set to "updated"?)
I know that DS.Adapter can be customized, but clearly the saving of more than one record (i.e. all of the model's entries) is set up in DS.Store's commit, via defaultTransaction and currentTransaction.
Somehow I feel it would be a bad idea to tinker with DS.Store or anything upstream of DS.Adapter, like making my own version of save() and commit(). Basically I am reluctant to customize anything I'm told not to, since there might be ugly side effects.
So what should I do if I want to use Ember data but can only afford to update one record only at a time?
You can create a new transaction just for managing that record, using the transaction() method of the store. This transaction has the same API as the defaultTransaction.
var transaction = this.get('store').transaction();
transaction.add(model);   // only this record joins the transaction
transaction.commit();     // commits just the record(s) added above
Committing this transaction won't affect other changes. See this blog post for further ideas.

Django transaction.commit_on_success - commit still happening despite error/exception, so how to debug?

Using Django 1.3 with PostgreSQL 9.0, I have a multi-step object creation function/view, where:
The main object is created (I have tried both MyModel.objects.create() and manually calling object.save()), and
then the m2m relationships are set up (they must follow the main object creation so that said object has an id to relate to).
Some of those relationships may fail, or some other problem may arise, thus I need the entire function to behave atomically.
I've tried wrapping the function with the transaction.commit_on_success decorator, and also tried using commit_manually (setting the commit point at the end of the function); but neither works. That is, the main object is created and saved in the database even when an exception is raised later on in the function. This leaves the database in an inconsistent state, to put it politely. So how can I debug this? I've seen similar questions, but they had to do with MySQL, whereas this kind of broken transaction is not supposed to happen with Postgres. There were tickets on the Django Trac about this issue from years back, but they were supposedly fixed/resolved. Could any Djangonauts out there provide enlightenment, please?
See this ticket: https://code.djangoproject.com/ticket/6669
I think for now you'll just need to call transaction.rollback() explicitly when you get an IntegrityError.
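Under Django 1.3 that could look roughly like this sketch with commit_manually (MyModel, Tag, and the m2m field tags are placeholders):

from django.db import transaction

@transaction.commit_manually
def create_with_relations(request):
    try:
        obj = MyModel.objects.create(name=request.POST['name'])
        obj.tags.add(*Tag.objects.filter(name__in=request.POST.getlist('tags')))
    except Exception:
        transaction.rollback()  # undo the main object creation as well
        raise
    else:
        transaction.commit()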
I don't know if this applies to you, but the problem that brought me here was a failure to read the manual with regard to Django testing.
If you are testing code with transactions in it, you need to use TransactionTestCase instead of TestCase; failure to do so will result in the tests showing the behavior you describe.
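A minimal sketch of such a test (the helper and model names are hypothetical):

from django.test import TransactionTestCase

class AtomicCreationTests(TransactionTestCase):
    # Plain TestCase wraps each test in a transaction it rolls back,
    # which masks the real commit/rollback behavior under test.
    def test_nothing_is_committed_on_error(self):
        try:
            create_with_relations_that_fails()  # hypothetical helper that raises
        except Exception:
            pass
        self.assertEqual(MyModel.objects.count(), 0)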

Django: How can I protect against concurrent modification of database entries

Is there a way to protect against concurrent modifications of the same database entry by two or more users?
It would be acceptable to show an error message to the user performing the second commit/save operation, but data should not be silently overwritten.
I think locking the entry is not an option, as a user might use the "Back" button or simply close the browser, leaving the lock in place forever.
This is how I do optimistic locking in Django:
from django.db.models import Q

updated = Entry.objects.filter(Q(id=e.id) & Q(version=e.version)) \
    .update(updated_field=new_value, version=e.version + 1)
if not updated:
    raise ConcurrentModificationException()
The code listed above can be implemented as a method on a custom manager; a sketch follows the warning below.
I am making the following assumptions:
filter().update() will result in a single database query because filter is lazy
a database query is atomic
These assumptions are enough to ensure that no one else has updated the entry before. If multiple rows are updated this way you should use transactions.
WARNING (Django docs):
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn't run any save() methods on your models, or emit the pre_save or post_save signals.
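A sketch of that custom-manager idea, assuming an Entry model with an integer version field (all names are illustrative):

from django.db import models

class ConcurrentModificationException(Exception):
    pass

class EntryManager(models.Manager):
    def update_checked(self, entry, **new_values):
        # Single UPDATE guarded by the version the caller read earlier.
        updated = self.filter(id=entry.id, version=entry.version) \
            .update(version=entry.version + 1, **new_values)
        if not updated:
            raise ConcurrentModificationException()

class Entry(models.Model):
    body = models.TextField()
    version = models.IntegerField(default=0)
    objects = EntryManager()

Usage would then be Entry.objects.update_checked(e, body="new text").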
This question is a bit old and my answer a bit late, but from what I understand this has been fixed in Django 1.4 using:
select_for_update(nowait=True)
see the docs
Returns a queryset that will lock rows until the end of the transaction, generating a SELECT ... FOR UPDATE SQL statement on supported databases.
Usually, if another transaction has already acquired a lock on one of the selected rows, the query will block until the lock is released. If this is not the behavior you want, call select_for_update(nowait=True). This will make the call non-blocking. If a conflicting lock is already acquired by another transaction, DatabaseError will be raised when the queryset is evaluated.
Of course this will only work if the backend supports the "select for update" feature, which for example SQLite doesn't. Unfortunately, nowait=True is not supported by MySQL; there you have to use nowait=False, which will simply block until the lock is released.
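A sketch of the non-blocking variant (assuming Django 1.6+ so transaction.atomic is available; entry_id, new_body, and the body field are placeholders):

from django.db import transaction, DatabaseError

try:
    # select_for_update must run inside a transaction
    with transaction.atomic():
        entry = Entry.objects.select_for_update(nowait=True).get(pk=entry_id)
        entry.body = new_body
        entry.save()
except DatabaseError:
    # another transaction currently holds the row lock
    print("Entry is being edited by someone else; try again later.")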
Actually, transactions don't help you much here ... unless you want to have transactions running over multiple HTTP requests (which you most probably don't want).
What we usually use in those cases is "Optimistic Locking". The Django ORM doesn't support that as far as I know. But there has been some discussion about adding this feature.
So you are on your own. Basically, what you should do is add a "version" field to your model and pass it to the user as a hidden field. The normal cycle for an update is:
read the data and show it to the user
the user modifies the data
the user posts the data back
the app saves it in the database
To implement optimistic locking, when you save the data, you check if the version that you got back from the user is the same as the one in the database, and then update the database and increment the version. If they are not, it means that there has been a change since the data was loaded.
You can do that with a single SQL call, with something like:
UPDATE ... WHERE version = 'version_from_user';
This call will update the database only if the version is still the same.
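Put together as a Django view, the whole optimistic-locking cycle might look like this sketch (Entry and its body field are illustrative):

from django.db.models import F
from django.http import HttpResponse

def save_entry(request, entry_id):
    # The version the user originally read, echoed back via a hidden form field.
    user_version = int(request.POST['version'])
    updated = Entry.objects.filter(pk=entry_id, version=user_version).update(
        body=request.POST['body'],
        version=F('version') + 1,
    )
    if not updated:
        return HttpResponse("Entry was changed by someone else; please re-merge.", status=409)
    return HttpResponse("Saved.")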
Django 1.11 has three convenient options to handle this situation depending on your business logic requirements:
Something.objects.select_for_update() will block until the model becomes free
Something.objects.select_for_update(nowait=True) and catch DatabaseError if the model is currently locked for update
Something.objects.select_for_update(skip_locked=True) will not return the objects that are currently locked
In my application, which has both interactive and batch workflows on various models, I found these three options to solve most of my concurrent processing scenarios.
The "waiting" select_for_update is very convenient in sequential batch processes - I want them all to execute, but let them take their time. The nowait is used when an user wants to modify an object that is currently locked for update - I will just tell them it's being modified at this moment.
The skip_locked is useful for another type of update, when users can trigger a rescan of an object - and I don't care who triggers it, as long as it's triggered, so skip_locked allows me to silently skip the duplicated triggers.
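For instance, the skip_locked batch case might look like this sketch (needs_rescan and rescan are hypothetical):

from django.db import transaction

with transaction.atomic():
    # Take whichever rows are free right now; concurrently locked ones are skipped.
    pending = Something.objects.select_for_update(skip_locked=True).filter(needs_rescan=True)
    for obj in pending:
        rescan(obj)  # hypothetical per-object work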
For future reference, check out https://github.com/RobCombs/django-locking. It does locking in a way that doesn't leave everlasting locks, through a mixture of JavaScript unlocking when the user leaves the page and lock timeouts (e.g. in case the user's browser crashes). The documentation is pretty complete.
You should probably use the Django transaction middleware at least, quite apart from this problem.
As to your actual problem of having multiple users editing the same data... yes, use locking. OR:
Check what version a user is updating against (do this securely, so users can't simply hack the system to say they were updating the latest copy!), and only update if that version is current. Otherwise, send the user back a new page with the original version they were editing, their submitted version, and the new version(s) written by others. Ask them to merge the changes into one, completely up-to-date version. You might try to auto-merge these using a toolset like diff+patch, but you'll need to have the manual merge method working for failure cases anyway, so start with that. Also, you'll need to preserve version history, and allow admins to revert changes, in case someone unintentionally or intentionally messes up the merge. But you should probably have that anyway.
There's very likely a django app/library that does most of this for you.
Another thing to look for is the word "atomic". An atomic operation means that your database change will either happen successfully, or fail obviously. A quick search shows this question asking about atomic operations in Django.
The idea above
updated = Entry.objects.filter(Q(id=e.id) & Q(version=e.version)) \
    .update(updated_field=new_value, version=e.version + 1)
if not updated:
    raise ConcurrentModificationException()
looks great and should work fine even without serializable transactions.
The problem is how to augment the default .save() behavior so that you don't have to do manual plumbing to call the .update() method.
I looked at the Custom Manager idea.
My plan is to override the Manager _update method that is called by Model.save_base() to perform the update.
This is the current code in Django 1.3
def _update(self, values, **kwargs):
    return self.get_query_set()._update(values, **kwargs)
What needs to be done IMHO is something like:
def _update(self, values, **kwargs):
    # TODO: get the version field value
    v = self.get_version_field_value(values[0])
    return self.get_query_set().filter(Q(version=v))._update(values, **kwargs)
Similar thing needs to happen on delete. However delete is a bit more difficult as Django is implementing quite some voodoo in this area through django.db.models.deletion.Collector.
It is weird that a modern tool like Django lacks guidance for optimistic concurrency control.
I will update this post when I solve the riddle. Hopefully the solution will be nice and pythonic, not involving tons of coding, weird views, or skipping essential pieces of Django.
To be safe, the database needs to support transactions.
If the fields are free-form, e.g. text, and you need to allow several users to edit the same fields (you can't give a single user ownership of the data), you could store the original data in a variable.
When the user commits, check whether the input data has changed from the original data (if not, you don't need to bother the DB by rewriting old data).
If the original data is the same as the current data in the DB, you can save; if it has changed, you can show the user the difference and ask the user what to do.
If the fields are numbers, e.g. an account balance or the number of items in a store, you can handle it more automatically: calculate the difference between the original value (stored when the user started filling out the form) and the new value, then start a transaction, read the current value, add the difference, and end the transaction. If you can't have negative values, abort the transaction if the result would be negative, and tell the user.
I don't know django, so I can't give you teh cod3s.. ;)
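In Django terms, the numeric-delta idea might look like this sketch (Account and its balance field are hypothetical):

from django.db.models import F

def apply_delta(account_id, original_value, submitted_value):
    delta = submitted_value - original_value  # original captured when the form was rendered
    # One atomic UPDATE that also refuses changes pushing the balance negative.
    updated = Account.objects.filter(
        pk=account_id, balance__gte=-delta,
    ).update(balance=F('balance') + delta)
    if not updated:
        raise ValueError("change rejected: missing account or balance would go negative")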
From here:
How to prevent overwriting an object someone else has modified
I'm assuming that the timestamp will be held as a hidden field in the form you're trying to save the details of.
def save(self, *args, **kwargs):
    if self.id:
        foo = Foo.objects.get(pk=self.id)
        if foo.timestamp > self.timestamp:
            raise Exception("trying to save outdated Foo")
    super(Foo, self).save(*args, **kwargs)