How to test Models in Django with Foreign Keys - django

I want to make sure I am testing Models/Objects in isolation and not as one huge system.
If I have an Order object and it has Foreign Keys to Customers, Payments, OrderItems, etc. and I want to test Order functionality, I need to create fixtures for all of that related data, or create it in code. I think what I really need to be doing is mocking out the other items, but I don't see an easy (or possible) solution for that if I am doing queries on these Foreign Keys.
The common solutions (fixtures) don't really let me test one Object at a time. I am sure this is partly caused by my app being way over coupled.
I am trying my darndest to adopt TDD as my main method of working, but the way things work with Django, it seems you can either run very trivial unit tests, or these massive integration tests.
[Edit] Better explicit example and a some more humility
What I mean is that I seem to be able to only run trivial unit tests. I have seen people with very well tested and granular modules. I am certain some of this can be followed back to poor design.
Example:
I have a model call Upsell which is linked to a Product model. Then I have a Choices model which are children of Upsell (do you want what's behind door #1, #2, #3).
The Upsell model has several methods on it that derive items necessary to render the template from their choices. The most important one is that it creates a URL for each choice. It does this through some string mangling etc. If I wanted to test the Upsell.get_urls() method, I want to have it not depend on the values of choices in the fixtures, and I want to not have it depend on the value of Product in the fixtures.
Right now I populate the db in the setUp method for the tests, and that works well with the way Django backs out the transaction every time, but only outside of setUp and tearDown. This works fairly well except some of the Models are fairly complex to set up, while I actually only need to get one attribute for it.
I can't give you an example of that, since I can't accomplish it, but here is the type of thing I am doing now. Basically I input an entire order, create the A/B experiment it was attached to, etc. And that's not counting Product, Categories, etc. all set up by fixtures. It's not the extra work I am concerned as I can't even test one database-based object at a time. The Tests below are important, but they are integration tests. I would like to be building up to something like this by testing each item separately. As you pointed out, maybe I shouldn't have chosen a framework so closely tied to the db. Does any sort of dependency injection exist with something like this? (beyond my testing, but the code itself as well)
class TestMultiSinglePaySwap(TestCase):
fixtures = ['/srv/asm/fixtures/alchemysites.json','/srv/asm/fixtures/catalog.json','/srv/asm/fixtures/checkout_smallset.json','/srv/asm/fixtures/order-test-fixture.json','/srv/asm/fixtures/offers.json']
def setUp(self):
self.o = Order()
self.sp = SiteProfile.objects.get(pk=1)
self.c = Customer.objects.get(pk=1)
signals.post_save.disconnect(order_email_first, sender=Order)
self.o.customer = self.c
p = Payment()
p.cc_number = '4444000011110000'
p.cc_exp_month = '12'
p.cc_type = 'V'
p.cc_exp_year = '2020'
p.cvv2 = '123'
p.save()
self.o.payment = p
self.o.site_profile = self.sp
self.o.save()
self.initial_items = []
self.main_kit = Product.objects.get(pk='MOA1000D6')
self.initial_items.append(self.main_kit)
self.o.add_item('MOA1000D6', 1, False)
self.item1 = Product.objects.get(pk='MOA1041A-6')
self.initial_items.append(self.item1)
self.o.add_item('MOA1041A-6', 1, False)
self.item2 = Product.objects.get(pk='MOA1015-6B')
self.initial_items.append(self.item2)
self.o.add_item('MOA1015-6B', 1, False)
self.item3 = Product.objects.get(pk='STP1001-6E')
self.initial_items.append(self.item3)
self.o.add_item('STP1001-6E', 1, False)
self.swap_item1 = Product.objects.get(pk='MOA1041A-1')
def test_single_pay_swap_wholeorder(self):
o = self.o
swap_all_skus(o)
post_swap_order = Order.objects.get(pk = o.id)
swapped_skus = ['MOA1000D','MOA1041A-1','MOA1015-1B','STP1001-1E']
order_items = post_swap_order.get_all_line_items()
self.assertEqual(order_items.count(), 4)
pr1 = Product()
pr1.sku = 'MOA1000D'
item = OrderItem.objects.get(order = o, sku = 'MOA1000D')
self.assertTrue(item.sku.sku == 'MOA1000D')
pr2 = Product()
pr2.sku = 'MOA1015-1B'
item = OrderItem.objects.get(order = o, sku = 'MOA1015-1B')
self.assertTrue(item.sku.sku == 'MOA1015-1B')
pr1 = Product()
pr1.sku = 'MOA1041A-1'
item = OrderItem.objects.get(order = o, sku = 'MOA1041A-1')
self.assertTrue(item.sku.sku == 'MOA1041A-1')
pr1 = Product()
pr1.sku = 'STP1001-1E'
item = OrderItem.objects.get(order = o, sku = 'STP1001-1E')
self.assertTrue(item.sku.sku == 'STP1001-1E')
Note that I have never actually used a Mock framework though I have tried. So I may also just be fundamentally missing something here.

Look into model mommy. It can automagically create objects with Foreign Keys.

This will probably not answer your question but it may give you food for thought.
In my opinion when you are testing a database backed project or application there is a limit to what you can mock. This is especially so when you are using a framework and an ORM such as the one Django offers. In Django there is no distinction between the business model class and the persistence model class. If you want such a distinction then you'll have to add it yourself.
Unless you are willing to add that additional layer of complexity yourself it becomes tricky to test the business objects alone without having to add fixtures etc. If you must do so you will have to tackle some of the auto magic vodoo done by Django.
If you do choose to grit your teeth and dig in then Michael Foord's Python Mock library will come in quite handy.
I am trying my darndest to adopt TDD as my main method of working, but the way things work with Django, it seems you can either run very trivial unit tests, or these massive integration tests.
I have used Django unit testing mechanism to write non-trivial unit tests. My requirements were doubtless very different from yours. If you can provide more specific details about what you are trying to accomplish then users here would be able to suggest other alternatives.

Related

Django Model function behavior changes based on how many tests being run

I have a need for a uniqueID within my Django code. I wrote a simple model like this
class UniqueIDGenerator(models.Model):
nextID = models.PositiveIntegerField(blank=False)
#classmethod
def getNextID(self):
if(self.objects.filter(id=1).exists()):
idValue = self.objects.get(id=1).nextID
idValue += 1
self.objects.filter(id=1).update(nextID=idValue)
return idValue
tempObj = self(nextID=1)
tempObj.save()
return tempObj.nextID
Then I wrote a unit test like this:
class ModelWorking(TestCase):
def setUp(self):
return None
def test_IDGenerator(self):
returnValue = UniqueIDGenerator.getNextID()
self.assertEqual(returnValue, 1)
returnValue = UniqueIDGenerator.getNextID()
self.assertEqual(returnValue, 2)
return None
When I run this test by itself, it runs fine. No issues.
When I run this test as a suite, which includes a bunch of other unit tests as well (which include calls to getNextID() as well), this test fails. The getNextID() always returns 1. Why would that be happening?
I figured it out.
Django runs each test in a transaction to provide isolation. Doc link.
Since my other tests make a call to getNextID(), the first row gets deleted after the first test that makes such a call is complete. Subsequent tests never find (id=1), due to which all subsequent calls return the value 1.
Even though I don't think I would face that situation in production, I went I ahead and changed my code to use .first() instead of (id=1). Like this
def getNextID(self):
firstRow = self.objects.first()
if(firstRow):
That way I believe it would better handle any future scenario when the database table might be emptied.

How to mock spacy models / Doc objects for unit tests?

Loading spacy models slows down running my unit tests. Is there a way to mock spacy models or Doc objects to speed up unit tests?
Example of a current slow tests
import spacy
nlp = spacy.load("en_core_web_sm")
def test_entities():
text = u"Google is a company."
doc = nlp(text)
assert doc.ents[0].text == u"Google"
Based on the docs my approach is
Constructing the Vocab and Doc manually and setting the entities as tuples.
from spacy.vocab import Vocab
from spacy.tokens import Doc
def test()
alphanum_words = u"Google Facebook are companies".split(" ")
labels = [u"ORG"]
words = alphanum_words + [u"."]
spaces = len(words) * [True]
spaces[-1] = False
spaces[-2] = False
vocab = Vocab(strings=(alphanum_words + labels))
doc = Doc(vocab, words=words, spaces=spaces)
def get_hash(text):
return vocab.strings[text]
entity_tuples = tuple([(get_hash(labels[0]), 0, 1)])
doc.ents = entity_tuples
assert doc.ents[0].text == u"Google"
Is there a cleaner more Pythonic solution for mocking spacy objects for unit tests for entities?
This is a great question actually! I'd say your instinct is definitely right: If all you need is a Doc object in a given state and with given annotations, always create it manually wherever possible. And unless you're explicitly testing a statistical model, avoid loading it in your unit tests. It makes the tests slow, and it introduces too much unnecessary variance. This is also very much in line with the philosophy of unit testing: you want to be writing independent tests for one thing at a time (not one thing plus a bunch of third-party library code plus a statistical model).
Some general tips and ideas:
If possible, always construct a Doc manually. Avoid loading models or Language subclasses.
Unless your application or test specifically needs the doc.text, you do not have to set the spaces. In fact, I leave this out in about 80% of the tests I write, because it really only becomes relevant when you're putting the tokens back together.
If you need to create a lot of Doc objects in your test suite, you could consider using a utility function, similar to the get_doc helper we use in the spaCy test suite. (That function also shows you how the individual annotations are set manually, in case you need it.)
Use (session-scoped) fixtures for the shared objects, like the Vocab. Depending on what you're testing, you might want to explicitly use the English vocab. In the spaCy test suite, we do this by setting up an en_vocab fixture in the conftest.py.
Instead of setting the doc.ents to a list of tuples, you can also make it a list of Span objects. This looks a bit more straightforward, is easier to read, and in spaCy v2.1+, you can also pass a string as a label:
def test_entities(en_vocab):
doc = Doc(en_vocab, words=["Hello", "world"])
doc.ents = [Span(doc, 0, 1, label="ORG")]
assert doc.ents[0].text == "Hello"
If you do need to test a model (e.g. in the test suite that makes sure that your custom models load and run as expected) or a language class like English, put them in a session-scoped fixture. This means that they'll only be loaded once per session instead of once per test. Language classes are lazy-loaded and may also take some time to load, depending on the data they contain. So you only want to do this once.
# Note: You probably don't have to do any of this, unless you're testing your
# own custom models or language classes.
#pytest.fixture(scope="session")
def en_core_web_sm():
return spacy.load("en_core_web_sm")
#pytest.fixture(scope="session")
def en_lang_class():
lang_cls = spacy.util.get_lang_class("en")
return lang_cls()
def test(en_lang_class):
doc = en_lang_class("Hello world")

How to reuse template in Flask-appbuilder with exposed custom handlers?

It is a very specific question regarding Flask-appbuilder. During my development, I found FAB's ModelView is suitable for admin role, but need more user logic handlers/views for complex designs.
There is a many to many relationship between devices and users, since each device could be shared between many users, and each user could own many device. So there is a secondary table called accesses, describes the access control between devices and users. In this table, I add "isHost" to just if the user owns the device. Therefore, we have two roles: host and (regular) user. However, these roles are not two roles defined as other applications, since one man can be either host or user in same time. In a very simple application, enforce the user to switch two roles are not very convinient. That makes things worse.
Anyway, I need design some custom handlers with traditional Flask/Jinja2 templates. For example:
class PageView(ModelView):
# FAB default URL: "/pageview/list"
datamodel = SQLAInterface(Page)
list_columns = ['name', 'date', 'get_url']
#expose("/p/<string:url>")
def p(self, url):
title = urllib.unquote(url)
r = db.session.query(Page).filter_by(name = title).first()
if r:
md = r.markdown
parser = mistune.Markdown()
body = parser(md)
return self.render_template('page.html', title = title, body = body)
else:
return self.render_template('404.html'), 404
Above markdown page URL is simple, since it is a seperate UI. But if I goes to DeviceView/AccountView/AccessView for list/show/add/edit operations. I realized that I need a unique styles of UI.
So, now how can I reuse the existing templates/widgets of FAB with custom sqlalchemy queries? Here is my code for DeviceView.
class DeviceView(ModelView):
datamodel = SQLAInterface(Device)
related_views = [EventView, AccessView]
show_template = 'appbuilder/general/model/show_cascade.html'
edit_template = 'appbuilder/general/model/edit_cascade.html'
#expose('/host')
#has_access
def host(self):
base_filters = [['name', FilterStartsWith, 'S'],]
#if there is not return, FAB will throw error
return "host view:{}".format(repr(base_filters))
#expose('/my')
#has_access
def my(self):
# A pure testing method
rec = db.session.query(Access).filter_by(id = 1).all()
if rec:
for r in rec:
print "rec, acc:{}, dev:{}, host:{}".format(r.account_id, r.device_id, r.is_host)
return self.render_template('list.html', title = "My Accesses", body = "{}".format(repr(r)))
else:
return repr(None)
Besides sqlalchemy code with render_template(), I guess base_filters can also help to define custom queries, however, I have no idea how to get query result and get them rendered.
Please give me some reference code or example if possible. Actually I have grep keywords of "db.session/render_template/expoaw"in FAB's github sources. But no luck.

FK validation within Django

Good afternoon,
I have my django server running with a REST api on top to serve my mobile devices. Now, at some point, the mobile device will communicate with Django.
Let's say the device is asking Django to add an object in the database, and within that object, I need to set a FK like this:
objectA = ObjectA.objects.create(title=title,
category_id = c_id, order = order, equipment_id = e_id,
info_maintenance = info_m, info_security = info_s,
info_general = info_g, alphabetical_notation = alphabetical_notation,
allow_comments = allow_comments,
added_by_id = user_id,
last_modified_by_id = user_id)
If the e_id and c_id is received from my mobile devices, should I check before calling this creation if they actually still exists in the DB? That is two extra queries... but if they can avoid any problems, I don't mind!
Thanks a lot!
It think that Django creates constraint on Foreign Key by default ( might depend on database though ). This means that if your foreign keys point to something that does not exist, then saving will fail ( resulting in Exception on Python side ).
You can reduce it to a single query (it should be a single query at least, warning I haven't tested the code):
if MyObject.objects.filter(id__in=[e_id, c_id]).distinct().count() == 2:
# create the object
ObjectA.objects.create(...)
else:
# objects corresponding e_id and c_id do not exist, do NOT create ObjectA
You should always validate any information that's coming from a user or that can be altered by a determined user. It wouldn't be difficult for someone to sniff the traffic and start constructing their own REST requests to your server. Always clean and validate external data that's being added to the system.

DRY in django queries, reverse queries, and predicates

I am frustrated that in Django I often end up having to write methods on a custom Manager:
class EntryManager(Manager):
def filter_beatle(self, beatle):
return self.filter(headline__contains=beatle)
... and repeat pretty much the same method in a different Manager for a reverse query:
class BlogManager(Manager):
def filter_beatle(self, beatle):
return self.filter(entry__headline__contains=beatle)
... and a predicate on Entry:
def headline_contains(self, beatle):
return self.headline.find(beatle) != -1
[Note that the predicate on Entry will work on Entry objects that haven't even been saved yet.]
This feels like a violation of DRY. Is there some way to express this once and use it in all three places?
What I would like to be able to do is write something like:
q = Q(headline__contains="Lennon")
lennon_entries = Entry.objects.filter(q)
lennon_blogs = Blog.objects.filter(q.reverse(Entry))
is_lennon = entry.would_filter(q)
... where 'headline__contains="Lennon"' expresses exactly once what it means to be 'an Entry about "Lennon"', and this can be used to construct reverse queries and a predicate.
The best place for this is a custom manager. According to django's guidelines a manager class is the best place for code that is affecting more than one object of a class.
class EntryManager(models.Manager):
def filter_lennons(self):
return self.get_query_set().filter(headline__contains='Lennon')
class Entry(models.Model):
headline = models.CharField(max_length=100)
objects = EntryManager()
lennons = Entry.objects.filter_lennons()
You should never rarely have to do the following:
if entry.headline.find('Lennon') >= 0:
because the filter should take care of restricting the result set to the instances you're interested in.
If you're going to be using the same filter multiple times, you can create a custom manager or a simple class method.
class Entry(models.Model):
...
# this really should be on a custom manager, but this was quicker to demonstrate
#classmethod
def find_headlines(cls, text):
return cls.objects.filter(headline__contains=text)
entries = Entry.find_headlines('Lennon')
But really, the DRYness has already been contained within the Queryset API. How often are you really going to be hard coding the string 'Lennon' into a query? Usually, the search parameter will be passed into a view from a GET or POST. Perfectly DRY.
So, what is the actual problem? Other than exploring the queryset API, have you ever had to hard code lookup values in multiple queries like your question?
For the "reverse filter" case you can use a subquery:
Blog.objects.filter(entries__in=Entry.objects.filter_beatle("Lennon"))
Reusing or generating predicates is not possible (in general) as there are predicates that cannot be expressed as queries and queries that cannot be expressed as predicates without db access.
My most common use for the predicate seems to be in asserts. Often something like:
class Thing(Model):
class QuerySet(query.QuerySet):
def need_to_be_whacked():
# ... code ...
def needs_to_be_whacked(self):
return Thing.objects.need_to_be_whacked().filter(id=self.id).exists()
def whack(self):
assert self.needs_to_be_whacked()
for thing in Thing.objects.need_to_be_whacked():
thing.whack()
I want to make sure that no other code is calling whack() in state where it doesn't need to be whacked. It costs a database hit, but it works.