Django test case db giving inconsistent responses, caching or transaction culprit?

I am seeing some really surprising and frustrating behavior with Django testing. Model objects are being "found" by a related lookup, but no model objects exist. (I apologize for the weird description here...the behavior is bizarre enough that I don't know quite how to describe it. Do the objects exist? Do I exist? Do you??)
I need them to exist, so I have a method in place that creates them if they don't exist. The problem is that on one line, Django finds that they do exist, and therefore they are not created...and then on the next line we can confirm that no such objects exist.
My tests are giving Errors in test_something() related to the absence of the necessary TaskMetadata object.
# the model
class TaskMetadata(models.Model):
    task = models.OneToOneField(ContentType)
    ...
# the test
class SimpleTest(TestCase):
    def setUp(self):
        some_utility_function()
    def test_something(self):
        ...something that requires TaskMetadata...
def some_utility_function():
    task = ...whatever...
    ctype = ContentType.objects.get_for_model(task)
    try:
        ctype.taskmetadata
    except TaskMetadata.DoesNotExist:
        ...create TaskMetadata...
        print "Created TaskMetadata object for %s" % task.__name__
    else:
        print "TaskMetadata object already exists for %s" % task.__name__
    print ctype.taskmetadata
    print "ALL OF THEM!! %s" % TaskMetadata.objects.all()
and the printed result of some_utility_function():
TaskMetadata object already exists for SomeTask
some task
ALL OF THEM!! [] # <-- NOTE EMPTY QUERYSET
In summary: "Yes, TaskMetadata object exists. Yes, TaskMetadata object exists. No, there are no TaskMetadata objects at all!!"
So, seriously, what on earth is going on here? Is this a cache problem? I tried clearing the cache (wild guess; I don't have CACHES configured in settings.py)
def setUp(self):
    cache.clear()
    some_utility_function()
Does not help. Transactions maybe? I'm stumped. How do I even debug this?
UPDATE:
See a minimal django project that replicates the issue here.
When the first testcase runs, TaskMetadata.objects.all() is NOT an empty queryset (it is in fact populated with objects, as I would expect); when the second testcase (exactly the same as the first) runs, it is empty.
I suspect this has something to do with database flushing between testcases that is clearing out the TaskMetadata objects, but the related lookup is cached, and so the next time some_utility_function() is called for the next testcase, it doesn't create any TaskMetadata objects. 1) Is that plausible? 2) How to work around it? 3) This is a Django bug, right?
Django bug ticket

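In your tearDown method you need to call ContentType.objects.clear_cache(). This is because Django caches the results of ContentType.objects.get_for_model, so the same ContentType instance, along with the related taskmetadata object it has already looked up, is reused after the test database has been reset. Having a one-to-one to ContentType is a bit unusual, so I don't think Django needs to make any changes for this, especially as it should be a one-line fix for you.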
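The mechanism behind this answer can be illustrated without Django. The sketch below is a toy model, not Django's implementation: ToyContentType, ToyContentTypeManager, ensure_metadata and the table list are all hypothetical stand-ins. It shows a manager-level cache handing back the same instance after the table has been emptied, so a related object remembered on that instance still looks present until the cache is cleared.

```python
class ToyContentType:
    """Stand-in for a ContentType row; remembers its related object once seen."""

    def __init__(self, name):
        self.name = name
        self.cached_metadata = None  # plays the role of the cached one-to-one lookup


class ToyContentTypeManager:
    """Stand-in for a manager that memoizes get_for_model() results."""

    def __init__(self):
        self._cache = {}

    def get_for_model(self, name):
        # Returns the same instance on every call until clear_cache() runs.
        if name not in self._cache:
            self._cache[name] = ToyContentType(name)
        return self._cache[name]

    def clear_cache(self):
        self._cache.clear()


manager = ToyContentTypeManager()
table = []  # stand-in for the TaskMetadata table


def ensure_metadata(name):
    """Create metadata unless the (possibly stale) cached lookup says it exists."""
    ctype = manager.get_for_model(name)
    if ctype.cached_metadata is None:
        ctype.cached_metadata = "metadata for %s" % name
        table.append(ctype.cached_metadata)
        return "created"
    return "already exists"


first = ensure_metadata("SomeTask")   # first test case: the row is created
table.clear()                         # the test runner resets the table between test cases
stale = ensure_metadata("SomeTask")   # the cached instance still claims the row exists
rows_after_stale = len(table)

manager.clear_cache()                 # the step the tearDown fix performs
fresh = ensure_metadata("SomeTask")   # the row is created again
rows_after_fix = len(table)
```

In the real test case, the equivalent of manager.clear_cache() is the ContentType.objects.clear_cache() call in tearDown.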

The problem here is the "finally" clause.
A finally clause is always executed before leaving the try statement, whether an exception has occurred or not.
http://docs.python.org/2/tutorial/errors.html
So, the finally clause containing the print statements will always be executed.

Related

Why does django not hit the database when we try to access queryset object attributes?

Knowing that QuerySets are lazy, and a db query is made only when we try to access the queryset, I noticed that a db query is not made even if we iteratively try to access queryset object attributes (however, a query is made when we try to access the object itself). So a sample example for demonstration purposes.
from django.db import connection, reset_queries
reset_queries()
tests = Test.objects.using('db').filter(is_test=True)
print(len(connection.queries))  # returns 0
for obj in tests:
    print(obj.id)
    print(obj.is_test)
print(len(connection.queries))  # returns 0 again
The attributes are printed correctly, but how can they be if it shows that no query was made? Again, if we do print(obj) instead, a query will be made. Any explanation is appreciated.
Edit:
This problem arises when we try to query from a nondefault database, which I guess somewhat explains the issue.

In your example, it hits the database only once, at:
for obj in tests:
    ...
That is where the QuerySet is evaluated. From that point forward, accessing the objects does not query the database; it uses the data the QuerySet has already fetched.
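The evaluate-once behaviour described in this answer can be sketched in plain Python. This is a toy illustration, not Django's QuerySet: LazyRows, fake_fetch and query_count are hypothetical names introduced for the example.

```python
class LazyRows:
    """Toy lazy collection: fetches once, on first iteration, then reuses the result."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._result_cache = None
        self.query_count = 0

    def __iter__(self):
        if self._result_cache is None:
            # Only the first iteration reaches the data source.
            self.query_count += 1
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)


def fake_fetch():
    # Hypothetical data source standing in for the database.
    return [{"id": 1, "is_test": True}, {"id": 2, "is_test": True}]


rows = LazyRows(fake_fetch)
count_before = rows.query_count                 # nothing has been fetched yet
seen_ids = [row["id"] for row in rows]          # first loop triggers the single fetch
count_after_first_loop = rows.query_count
seen_flags = [row["is_test"] for row in rows]   # second loop reuses the cached rows
count_after_second_loop = rows.query_count
```

Separately, django.db.connection refers to the default database only. Queries sent through using('db') are recorded on django.db.connections['db'].queries (and only when DEBUG is true), which would explain the count of 0 in the question.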

django.db.transaction.TransactionManagementError: cannot perform saving of other object in model within transaction

Can't seem to find much info about this. This is NOT happening in a django test. I'm using DATABASES = { ATOMIC_REQUESTS: True }. Within a method (in mixin I created) called by the view, I'm trying to perform something like this:
def process_valid(self, view):
    old_id = view.object.id
    view.object.id = None  # need a new instance in db
    view.object.save()
    old_fac = Entfac.objects.get(id=old_id)
    new_fac = view.object
    old_dets = Detfac.objects.filter(fk_ent__id__exact=old_fac.id)
    new_formset = view.DetFormsetClass(view.request.POST, instance=view.object, save_as_new=True)
    if new_formset.is_valid():
        new_dets = new_formset.save()
    new_fac.fk_cancel = old_fac  # need a fk reference to initial fac in new one
    old_fac.fk_cancel = new_fac  # need a fk reference to new in old fac
    # any save() action after this crashes with TransactionManagementError
    new_fac.save()
I do not understand this error. I already created & saved a new object in db (when I set the object.id to None & saved that). Why would creating other objects create an issue for further saves?
I have tried not instantiating the new_dets objects with the Formset, but instead explicitly defining them:
new_det = Detfac(...)
new_det.save()
But then again, any further save after that raises the error.
Further details:
Essentially, I have an Entfac model, and a Detfac model that has a foreign key to Entfac. I need to instantiate a new Entfac (distinct in db), as well as corresponding new Detfac objects for the new Entfac. Then I need to change some values in some of the fields for both new & old objects, and save all that to db.

Ah. The code above is fine.
But it turns out signals can be bad. I had forgotten that upon saving a Detfac, there is a signal that goes to another class and, depending on the circumstances, adds a record to another table (a sort of history table).
That signal is just a single operation, something like this:
@receiver(post_save, sender=Detfac)
def quantity_adjust_detfac(sender, **kwargs):
    try:
        detfac_qty = kwargs["instance"].qte
        product = kwargs["instance"].fk_produit
        if kwargs["created"]:
            initial = {}  # bunch of values (elided)
            adjustment = HistoQuantity(**initial)
            adjustment.save()
        else:
            pass  # else branch elided in the original
    except TypeError as ex:
        logger.error(f"....")
    except AttributeError as ex:
        logger.error(f"....")
In itself, the fact that this wasn't marked as atomic isn't problematic. But if one of those exceptions is thrown, then I get the TransactionManagementError. I am still not 100% sure why, though the Django docs do mention that when wrapping a whole view in atomic (or any chunk of code, for that matter), a try/except within that block can yield unexpected results, because Django relies on exceptions to decide whether or not to commit the transaction as a whole. And the data I was testing with actually threw the exception (a TypeError when creating the HistoQuantity object).
Wrapping the try/except in a transaction.atomic context manager worked, however. My guess is that this handled the failure inside its own block, so the outer atomic could carry on.
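Django documents that a nested atomic block creates a savepoint, and a savepoint is what lets a failed inner step be undone without abandoning the outer transaction. The sketch below shows that mechanism with the standard library's sqlite3 module rather than Django. The table names, columns and save_detfac_with_history are assumptions made for illustration. Note that SQLite does not abort the whole transaction on a failed statement the way PostgreSQL does, so this only demonstrates the savepoint mechanics.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN/COMMIT
# and the savepoint statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE detfac (id INTEGER PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("CREATE TABLE histo (id INTEGER PRIMARY KEY, qty INTEGER NOT NULL)")


def save_detfac_with_history(qty, history_qty):
    """Insert a detfac row, then try the history insert inside its own savepoint."""
    conn.execute("INSERT INTO detfac (qty) VALUES (?)", (qty,))
    conn.execute("SAVEPOINT history")
    try:
        conn.execute("INSERT INTO histo (qty) VALUES (?)", (history_qty,))
    except sqlite3.IntegrityError:
        # Undo only the work done since the savepoint; the outer transaction survives.
        conn.execute("ROLLBACK TO SAVEPOINT history")
    conn.execute("RELEASE SAVEPOINT history")


conn.execute("BEGIN")
save_detfac_with_history(5, None)  # history insert violates NOT NULL and is rolled back
save_detfac_with_history(7, 7)     # both inserts succeed
conn.execute("COMMIT")

detfac_rows = conn.execute("SELECT COUNT(*) FROM detfac").fetchone()[0]
histo_rows = conn.execute("SELECT COUNT(*) FROM histo").fetchone()[0]
```

Both detfac rows are committed even though one history insert failed, which mirrors the effect of wrapping the signal handler's body in its own transaction.atomic() block.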

How to handle "matching query does not exist" when getting an object

When I want to select objects with a get() function like
personalProfile = World.objects.get(ID=personID)
If the get function doesn't find a value, a "matching query does not exist." error occurs.
If I don't want this error, I use try and except:
try:
    personalProfile = World.objects.get(ID=personID)
except:
    pass
But I think this is not the best way, since I use
except:
    pass
Please recommend some ideas or a code sample to deal with this issue.

That depends on what you want to do if it doesn't exist.
There's get_object_or_404:
Calls get() on a given model manager, but it raises Http404 instead of the model’s DoesNotExist exception.
get_object_or_404(World, ID=personID)
Which is very close to the try/except code you currently have.
Otherwise there's get_or_create:
personalProfile, created = World.objects.get_or_create(ID=personID)
If you choose to continue with your current approach, at least make sure the except is limited to the correct error, and then handle it as necessary:
try:
    personalProfile = World.objects.get(ID=personID)
except World.DoesNotExist:
    raise Http404("No World matches the given query.")
The above try/except is similar to what is found in the docs for get_object_or_404...

A get_or_none() function has been proposed multiple times now. The reason given for rejecting it is feature creep, which you might or might not agree with. The functionality is present, with slightly different semantics, in the first() queryset method.
But first things first:
The manager throws World.DoesNotExist, a specialized subclass of ObjectDoesNotExist when a World object was not found:
try:
    personalProfile = World.objects.get(ID=personID)
except World.DoesNotExist:
    pass
There's also get_object_or_404() which raises a Http404 exception when the object was not found.
You can also roll your own get_or_none(). A possible implementation could be:
def get_or_none(queryset, *args, **kwargs):
    try:
        return queryset.get(*args, **kwargs)
    except ObjectDoesNotExist:
        return None
Note that this still raises MultipleObjectsReturned when more than one matching object is found. If you always want the first object regardless of any others, you can simplify using first(), which returns None when the queryset is empty:
def get_or_none(queryset, *args, **kwargs):
    return queryset.filter(*args, **kwargs).first()
Note, however, that which object you get then depends on the queryset's ordering. first() orders an otherwise unordered queryset by primary key, so give the queryset an explicit order_by() if you need a particular object among several matches.
Use both, however, only when the use of the object to retrieve is strictly optional for the further program flow. When failure to retrieve an object is an error, use get_object_or_404(). When an object should be created when it does not exist, use get_or_create(). In those cases, both are better suited to simplify program flow.

As alasdair mentioned you could use the built in first() method.
It returns the object if it exists, or None if it does not:
personalProfile = World.objects.filter(ID=personID).first()

django orm: Check if obj is in queryset

How can I check whether an obj is in a queryset or not?
I tried this:
self.assertIn(obj, list(MyModel.objects.filter(...)))
But it does not work in my case.
AssertionError: <MyModel 137 'unclassified'> not found in
[<MyModel 1676 'foo'>, ..., <MyModel 137 'unclassified'>, ...]
I don't understand it, since it is in the list.

How about
self.assertTrue(MyModel.objects.filter(...).filter(pk=obj.pk).exists())

First of all, it should be MyModel.objects.filter(...). If you omit the .objects, you should've gotten a different error, so I'm assuming you did include it but just forgot it in the question.
If obj is actually in the QuerySet returned, what you did should have worked, as Django model instances provide an equality comparison that checks both the type and the primary key. list() is not required around the QuerySet, though it should still work if you use it.
From the Django 1.5 source:
def __eq__(self, other):
    return isinstance(other, self.__class__) and self._get_pk_val() == other._get_pk_val()
If it still doesn't work, there are a few possible causes:
1. The type doesn't match. In this case, it doesn't matter even if the object pk exists in the QuerySet's object pks. (You cannot compare apples to oranges.)
2. There is no such object in the database (i.e. it hasn't been saved yet).
3. The type matches but the object is not in the QuerySet (i.e. filtered out).
4. You overrode the __eq__ method and did something weird. Or you overrode the default manager .objects with some custom filter. This scenario is outside the scope of this answer and if you did this, you should probably know how to fix it.
To help you diagnose which is the case, try this:
self.assertTrue(isinstance(obj, MyModel))
# 1. If it fails here, your object is an incorrect type
# Warning: the following tests can be very slow if you have a lot of data
self.assertIn(obj.pk, MyModel.objects.values_list('pk', flat=True))
# 2. If it fails here, the object doesn't exist in the database
self.assertIn(obj.pk, MyModel.objects.filter(...).values_list('pk', flat=True))
# 3. If it fails here, the object did not pass your filter conditions.
self.assertIn(obj, MyModel.objects.filter(...))
# 4. If it fails here, you probably messed with the Django ORM internals. Tsk tsk.
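The type-and-pk equality rule quoted from the Django 1.5 source can be tried out in isolation. The sketch below is a toy, not Django code: ToyModel, Apple and Orange are hypothetical classes that copy only the comparison rule, and a plain list stands in for an evaluated queryset.

```python
class ToyModel:
    """Toy base class using the same rule as the __eq__ quoted above."""

    def __init__(self, pk):
        self.pk = pk

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.pk == other.pk

    def __hash__(self):
        # Defining __eq__ removes the default __hash__, so restore one based on pk.
        return hash(self.pk)


class Apple(ToyModel):
    pass


class Orange(ToyModel):
    pass


fetched = [Apple(1), Apple(137)]  # stand-in for an evaluated queryset

same_row_new_instance = Apple(137) in fetched  # equal by type and pk, not by identity
wrong_type_same_pk = Orange(137) in fetched    # same pk, different class (cause 1)
missing_pk = Apple(999) in fetched             # pk not among the results (cause 3)
```

A separately constructed instance with the same class and pk is found, while a matching pk on a different class is not, which is why the isinstance check comes first in the diagnosis above.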

Just use in; note the .all():
queryset_result = MyModel.objects.filter(...).all()
if obj in queryset_result:
    pass  # obj is in the queryset

The "in" fails because the objects aren't actually equal, as equality is object identity by default. If you want "in" to work, you'd have to implement __eq__ accordingly on your model.
If you don't want to do that, you can check by comparing the pk, like so
self.assertIn(obj.pk, [o.pk for o in MyModel.objects.filter(...)])

I think this is a very simple way to find out whether an object is present in a queryset.
First Example:
obj_list = MyModel.objects.filter(...)
if obj in obj_list:
    print("obj in queryset")
else:
    print("not in queryset")
Second Example:
obj_list = MyModel.objects.filter(...)
try:
    obj_list.get(pk=obj.id)
except MyModel.DoesNotExist:
    # If the get() succeeds, obj is present in the queryset; otherwise this except block runs.
    pass

How do I deal with this race condition in django?

This code is supposed to get or create an object and update it if necessary. The code is in production use on a website.
In some cases - when the database is busy - it will throw the exception "DoesNotExist: MyObj matching query does not exist".
# Model:
class MyObj(models.Model):
    thing = models.ForeignKey(Thing)
    owner = models.ForeignKey(User)
    state = models.BooleanField()
    class Meta:
        unique_together = (('thing', 'owner'),)
# Update or create myobj
@transaction.commit_on_success
def create_or_update_myobj(owner, thing, state):
    try:
        myobj, created = MyObj.objects.get_or_create(owner=owner, thing=thing)
    except IntegrityError:
        # Will sometimes throw "DoesNotExist: MyObj matching query does not exist"
        myobj = MyObj.objects.get(owner=owner, thing=thing)
    myobj.state = state
    myobj.save()
I use an innodb mysql database on ubuntu.
How do I safely deal with this problem?

This could be an off-shoot of the same problem as here:
Why doesn't this loop display an updated object count every five seconds?
Basically get_or_create can fail - if you take a look at its source, there you'll see that it's: get, if-problem: save+some_trickery, if-still-problem: get again, if-still-problem: surrender and raise.
This means that if there are two simultaneous threads (or processes) running create_or_update_myobj, both trying to get_or_create the same object, then:
1. The first thread tries to get it, but it doesn't exist yet,
2. so the thread tries to create it, but before the object is created...
3. ...the second thread tries to get it, and this obviously fails.
4. Now, because of the default AUTOCOMMIT=OFF for the MySQLdb database connection and the REPEATABLE READ isolation level, both threads have frozen their views of the MyObj table.
5. Subsequently, the first thread creates its object and returns it gracefully, but...
6. ...the second thread cannot create anything, as it would violate the unique constraint.
7. What's funny is that a subsequent get on the second thread doesn't see the object created in the first thread, due to the frozen view of the MyObj table.
So, if you want to safely get_or_create anything, try something like this:
@transaction.commit_on_success
def my_get_or_create(...):
    try:
        obj = MyObj.objects.create(...)
    except IntegrityError:
        transaction.commit()
        obj = MyObj.objects.get(...)
    return obj
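The create-first, fall-back-to-get pattern can be exercised end to end with the standard library's sqlite3 module. This is a minimal single-connection sketch under assumed names (the myobj table layout and create_or_get are illustrative). It shows only how the unique constraint drives the fallback and does not reproduce the REPEATABLE READ snapshot behaviour described for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myobj ("
    "id INTEGER PRIMARY KEY, owner TEXT, thing TEXT, state INTEGER DEFAULT 0, "
    "UNIQUE (thing, owner))"
)


def create_or_get(owner, thing):
    """Try the INSERT first; fall back to SELECT if the unique constraint fires."""
    try:
        with conn:  # commits on success, rolls back if the INSERT raises
            cursor = conn.execute(
                "INSERT INTO myobj (owner, thing) VALUES (?, ?)", (owner, thing)
            )
        return cursor.lastrowid, True
    except sqlite3.IntegrityError:
        row = conn.execute(
            "SELECT id FROM myobj WHERE owner = ? AND thing = ?", (owner, thing)
        ).fetchone()
        return row[0], False


first_id, first_created = create_or_get("alice", "widget")
second_id, second_created = create_or_get("alice", "widget")
```

The second call hits the unique constraint and returns the row the first call created, instead of raising.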
Edited on 27/05/2010
There is also a second solution to the problem: using the READ COMMITTED isolation level instead of REPEATABLE READ. But it's less tested (at least in MySQL), so there might be more bugs/problems with it. At least it allows tying views to transactions without committing in the middle.
Edited on 22/01/2012
Here are some good blog posts (not mine) about MySQL and Django, related to this question:
http://www.no-ack.org/2010/07/mysql-transactions-and-django.html
http://www.no-ack.org/2011/05/broken-transaction-management-in-mysql.html

Your exception handling is masking the error. You should pass a value for state in get_or_create(), or set a default in the model and database.

One (dumb) way might be to catch the error and simply retry once or twice after waiting a small amount of time. I'm not a DB expert, so there might be a signaling solution.

Since 2012, Django has had select_for_update, which locks rows until the end of the transaction.
To avoid race conditions in Django + MySQL under the default settings (REPEATABLE READ in MySQL, READ COMMITTED in Django), you can use this:
with transaction.atomic():
    instance = YourModel.objects.select_for_update().get(id=42)
    instance.evolve()
    instance.save()
The second thread will wait for the first thread to release the lock, and only once the first is done will the second read the data saved by the first, so it works on up-to-date data.
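The wait-then-read behaviour can be pictured with an in-process analogy. The sketch below is not database code: ToyRow and evolve are hypothetical, and a threading.Lock stands in for the row lock that select_for_update() takes inside a transaction.

```python
import threading


class ToyRow:
    """Stand-in for a database row; the lock plays the role of the row lock."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()


def evolve(row, times):
    for _ in range(times):
        with row.lock:               # wait until the current holder is done
            current = row.value      # read the latest saved value
            row.value = current + 1  # modify and save before releasing the lock


row = ToyRow()
workers = [threading.Thread(target=evolve, args=(row, 10000)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

final_value = row.value
```

Because every read-modify-write happens while the lock is held, no increment is lost, which is the guarantee the row lock gives the second thread in the answer above.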
Then together with get_or_create:
def select_for_update_or_create(...):
    instance = YourModel.objects.filter(
        ...
    ).select_for_update().first()
    if instance is None:
        instance = YourModel.objects.create(...)
    return instance
The function must be inside transaction block, otherwise, you will get from Django:
TransactionManagementError: select_for_update cannot be used outside of a transaction
It is also sometimes good to use refresh_from_db().
In a case like:
instance = YourModel.objects.create(**kwargs)
response = do_request_which_lasts_few_seconds(instance)
instance.attr = response.something
you would rather write:
instance = YourModel.objects.create(**kwargs)
response = do_request_which_lasts_few_seconds(instance)
instance.refresh_from_db()  # 3
instance.attr = response.something
The refresh at # 3 greatly narrows the time window in which a race condition can occur, and with it the chance of one.
