Flask test client and unittest: mocking global object?

I have a flask application which uses a global object data_loader.
The main Flask file (let's call it main.py) starts as follows:
app = Flask('my app')
...
data_loader = DataLoader(...)
Later on, this global data_loader object is called in the route methods of the webserver:
class MyClass(Resource):
    def get(self):
        data_loader.load_some_data()
        # ... process data, etc
Using unittest, I want to be able to patch the load_some_data() method. I'm using the flask test_client:
import unittest
from my_module.main import app
class MyTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        cls.client = app.test_client('my test client')
How can I patch the data_loader method in subsequent tests in MyTest? I have tried this approach, but it does not work (although the data_loader seems to be replaced at some point):
@unittest.mock.patch('my_module.main.DataLoader')
def my_test(self, DataLoaderMock):
    data_loader = DataLoaderMock.return_value
    data_loader.my_method.return_value = 'new results (patched)'
    with app.test_client() as client:
        response = client.get(f'/some/http/get/request/to/MyTest/route',
                              query_string={...})
        # ... some assertions to be tested ...
It seems the data_loader is never truly replaced in the Flask app.
Also, is it considered "good practice" to have a global variable in the Flask server, or is the app supposed to store it internally?
Thanks

About mocking, patch.object can be used to modify object attributes:
@unittest.mock.patch.object(data_loader, 'my_method')
def my_test(self, my_method_mock):
    my_method_mock.return_value = 'new results (patched)'
    with app.test_client() as client:
        response = client.get(f'/some/http/get/request/to/MyTest/route',
                              query_string={...})
        my_method_mock.assert_called() # ok!
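Why the class patch in the question has no effect while patch.object does can be shown with a small stand-in. This is only a sketch: fake_main is a hypothetical module built in memory to play the role of my_module.main, and view() stands in for the route method.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for my_module/main.py: a class plus a global instance
# that is created once, at import time.
main = types.ModuleType("fake_main")
exec(
    "class DataLoader:\n"
    "    def load_some_data(self):\n"
    "        return 'real data'\n"
    "\n"
    "data_loader = DataLoader()\n"
    "\n"
    "def view():\n"
    "    return data_loader.load_some_data()\n",
    main.__dict__,
)
sys.modules["fake_main"] = main

# Patching the class name swaps fake_main.DataLoader, but the existing
# data_loader instance was already built from the original class.
with patch("fake_main.DataLoader") as loader_cls:
    loader_cls.return_value.load_some_data.return_value = "patched"
    class_patch_result = main.view()

# Patching the attribute on the instance changes what the view actually calls.
with patch.object(main.data_loader, "load_some_data", return_value="patched") as m:
    instance_patch_result = main.view()
    m.assert_called_once_with()
```

Here class_patch_result still holds the real data, while instance_patch_result holds the patched value. The original method is restored once the with block exits.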
My solution would be to patch the method on the class, either by starting and stopping a patcher or by using it as a context manager:
import unittest
from unittest.mock import patch
from my_module.main import app  # the Flask app from the question
class MyTest(unittest.TestCase):
    def test_get(self):
        client = app.test_client('my test client')
        # Option 1: start and stop the patcher explicitly
        patcher = patch('{package_here}.{module_here}.DataLoader.load_some_data', return_value={'status': 1})
        patcher.start()
        self.assertDictEqual(client.get('/').json, {'status': 1})
        patcher.stop()
        # Option 2: use the patcher as a context manager
        with patch('{package_here}.{module_here}.DataLoader.load_some_data', return_value={'status': 1}):
            self.assertDictEqual(client.get('/').json, {'status': 1})
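One caveat with an explicit start()/stop() pair is that a failing assertion skips the stop() call. A common way around this is to register the stop with addCleanup. The sketch below assumes a local DataLoader class and global instance standing in for the ones in the question, with no Flask involved.

```python
import unittest
from unittest.mock import patch


class DataLoader:
    def load_some_data(self):
        return {"status": 0}


data_loader = DataLoader()  # global instance, as in the question


class MyTest(unittest.TestCase):
    def setUp(self):
        # Patch the method on the class; existing instances see the change
        # because methods are looked up on the class at call time.
        patcher = patch.object(DataLoader, "load_some_data", return_value={"status": 1})
        self.load_mock = patcher.start()
        # Runs even if the test fails, so the patch never leaks.
        self.addCleanup(patcher.stop)

    def test_patched(self):
        self.assertEqual(data_loader.load_some_data(), {"status": 1})
        self.load_mock.assert_called_once_with()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

After the run, data_loader.load_some_data() returns the original value again.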
About "good practice" and global variables: yes, I have seen global variables in various projects, but I don't recommend them, because:
They can lead to circular imports and dependency hell. I have worked on a large Flask application with circular imports. It is really painful, and you can't fix all the problems quickly.
Tests that mock global variables make refactoring harder, especially in a fairly large service.
Separating imports from initialization is simpler and more configurable. Everything then flows in one direction: import all dependencies -> load config -> initialize -> run. Otherwise you end up with import -> new instance -> new instance -> import -> ....
They are another source of memory leaks.
Global variables may be fine for standalone packages and modules, but not for a whole project. I also recommend using some additional tools. This will not only make it easier to write tests, but also save you headaches.
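The one-directional flow described above (import dependencies, load config, initialize, run) can be sketched without any framework. This is an illustration only, not Flask's API: Service, create_service, FakeLoader and the DATA_SOURCE key are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


class DataLoader:
    """Stand-in for the real loader; it receives its settings explicitly."""

    def __init__(self, source: str):
        self.source = source

    def load_some_data(self) -> dict:
        return {"source": self.source}


@dataclass
class Service:
    """Holds its dependencies instead of reaching for module globals."""

    data_loader: object

    def get(self) -> dict:
        return self.data_loader.load_some_data()


def create_service(config: dict, data_loader=None) -> Service:
    """Initialization step: build dependencies from config, or accept an override."""
    if data_loader is None:
        data_loader = DataLoader(config["DATA_SOURCE"])
    return Service(data_loader=data_loader)


class FakeLoader:
    def load_some_data(self) -> dict:
        return {"status": 1}


# Production-style wiring: config in, service out.
real = create_service({"DATA_SOURCE": "db"})
# Test-style wiring: inject a fake, no patching required.
fake = create_service({}, data_loader=FakeLoader())
```

Because nothing is created at import time, a test can build its own service with a fake loader instead of patching a module-level object.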

Related

How to import the same flask limiter in a structured flask app

I'm trying to organize my Flask app, as it's getting quite big at close to 1000 lines.
I am trying to separate the REST API from my main app, by using the approach shown here: https://flask-restx.readthedocs.io/en/latest/scaling.html#multiple-apis-with-reusable-namespaces
What remains in my main.py is something like
from apiv1 import blueprint as api1
REST_API = Flask(__name__)
REST_API.wsgi_app = ProxyFix(REST_API.wsgi_app, x_for=1)
REST_API.register_blueprint(api1)
However, in my app I am using Flask-Limiter:
# Very basic DOS prevention
try:
    limiter = Limiter(
        REST_API,
        key_func=get_remote_address,
        storage_uri="redis://localhost:6379/1",
        # storage_options={"connect_timeout": 30},
        strategy="fixed-window", # or "moving-window"
        default_limits=["90 per minute"]
    )
# Allow local workstation run
except:
    limiter = Limiter(
        REST_API,
        key_func=get_remote_address,
        default_limits=["90 per minute"]
    )
The limiter is also applied as a decorator on my various API methods:
decorators = [limiter.limit("30/minute")]
def post(self, server_id = ''):
    # [..]
Now that I am moving my REST API endpoints out of the file that creates the app, I don't know how to pass the limiter object to them. The REST_API variable exists only in my main.py.
How should I handle passing the limiter variable, or any other global object for that matter?
I worked on this for a few hours yesterday, and I finally understood the Pythonic way to do this sort of thing.
I just couldn't wrap my head around how imports function, so I was struggling with questions like "how do I pass the variable during import" etc.
Finally it clicked for me that I need to follow a "pull" method with my imports, instead of trying to push variables into them. That is, I set up a central location in my package's __init__, which imports my limiter module, and then my other modules import THAT limiter variable from there.
So in my app's __init__, I have
from .limiter import limiter
And in the app/apis/v1.py I have
from .. import limiter
And this seems to finally work. I don't know if this is the expected way, meaning playing with relative module paths, so if there's a more elegant way, please let me know.
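The "pull" layout can be tried out in isolation. The sketch below writes a throwaway package to a temporary directory. The package name demo_app and the plain object() standing in for the real Limiter(...) instance are assumptions made for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, written to a temp dir so the sketch is runnable.
files = {
    # One module owns the shared object (a stand-in for Limiter(...)).
    "demo_app/limiter.py": "limiter = object()\n",
    # The package __init__ pulls it in BEFORE importing the API modules.
    "demo_app/__init__.py": "from .limiter import limiter\nfrom .apis import v1\n",
    "demo_app/apis/__init__.py": "",
    # API modules pull the shared object from the package.
    "demo_app/apis/v1.py": "from .. import limiter\n\nshared = limiter\n",
}

root = Path(tempfile.mkdtemp())
for rel_path, body in files.items():
    path = root / rel_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)

sys.path.insert(0, str(root))
demo_app = importlib.import_module("demo_app")
v1 = importlib.import_module("demo_app.apis.v1")

# Both names refer to the single object created in demo_app/limiter.py.
same_object = v1.shared is demo_app.limiter
```

The import order inside __init__ matters in this layout: the shared object has to be bound on the package before any module that does from .. import limiter is loaded.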

Python Mocking - How to obtain call arguments from a mock that is passed to another mock as a function argument?

I am not sure about the title of this question, as it is not easy to describe the issue with a single sentence. If anyone can suggest a better title, I'll edit it.
Consider this code that uses smbus2 to communicate with an I2C device:
# device.py
import smbus2
def set_config(bus):
    write = smbus2.i2c_msg.write(0x76, [0x00, 0x01])
    read = smbus2.i2c_msg.read(0x76, 3)
    bus.i2c_rdwr(write, read)
I wish to unit-test this without accessing I2C hardware, by mocking the smbus2 module as best I can (I've tried mocking out the entire smbus2 module, so that it doesn't even need to be installed, but had no success, so I'm resigned to importing smbus2 in the test environment even if it's not actually used - no big deal so far, I'll deal with that later):
# test_device.py
# Depends on pytest-mock
import device
def test_set_config(mocker):
    mocker.patch('device.smbus2')
    smbus = mocker.MagicMock()
    device.set_config(smbus)
    # assert things here...
    breakpoint()
At the breakpoint, I'm inspecting the bus mock in pdb:
(Pdb) p smbus
<MagicMock id='140160756798784'>
(Pdb) p smbus.method_calls
[call.i2c_rdwr(<MagicMock name='smbus2.i2c_msg.write()' id='140160757018400'>, <MagicMock name='smbus2.i2c_msg.read()' id='140160757050688'>)]
(Pdb) p smbus.method_calls[0].args
(<MagicMock name='smbus2.i2c_msg.write()' id='140160757018400'>, <MagicMock name='smbus2.i2c_msg.read()' id='140160757050688'>)
(Pdb) p smbus.method_calls[0].args[0]
<MagicMock name='smbus2.i2c_msg.write()' id='140160757018400'>
Unfortunately, at this point, the arguments that were passed to write() and read() have been lost. They do not seem to have been recorded in the smbus mock and I've been unable to locate them in the data structure.
Interestingly, if I break in the set_config() function, just after write and read assignment, and inspect the mocked module, I can see:
(Pdb) p smbus2.method_calls
[call.i2c_msg.write(118, [160, 0]), call.i2c_msg.read(118, 3)]
(Pdb) p smbus2.method_calls[0].args
(118, [160, 0])
So the arguments have been stored as a method_call in the smbus2 mock, but not copied over to the smbus mock that is passed into the function.
Why is this information not retained? Is there a better way to test this function?
I think this can be summarised as this:
In [1]: from unittest.mock import MagicMock
In [2]: foo = MagicMock()
In [3]: bar = MagicMock()
In [4]: w = foo.write(1, 2)
In [5]: r = foo.read(1, 2)
In [6]: bar.func(w, r)
Out[6]: <MagicMock name='mock.func()' id='140383162348976'>
In [7]: bar.method_calls
Out[7]: [call.func(<MagicMock name='mock.write()' id='140383164249232'>, <MagicMock name='mock.read()' id='140383164248848'>)]
Note that the bar.method_calls list contains calls to the functions .write and .read (good), but the parameters that were passed to those functions are missing (bad). This seems to undermine the usefulness of such mocks, since they don't interact as I would expect. Is there a better way to handle this?
The reason you can't access the calls to write and read is that they themselves are the return_value of another mock. What you are trying to do is access the mock "parent" (Using the terminology here: https://docs.python.org/3/library/unittest.mock.html).
It actually is possible to access the parent, but I'm not sure it's a good idea, since it uses an undocumented and private attribute of the MagicMock object, _mock_new_parent.
def test_set_config(mocker):
    """Using the undocumented _mock_new_parent attribute"""
    mocker.patch('device.smbus2')
    smbus = mocker.MagicMock()
    device.set_config(smbus)
    # Retrieving the `write` and `read` values passed to `i2c_rdwr`.
    mocked_write, mocked_read = smbus.i2c_rdwr.call_args[0]
    # Making some assertions about how the corresponding functions were called.
    mocked_write._mock_new_parent.assert_called_once_with(0x76, [0x00, 0x01])
    mocked_read._mock_new_parent.assert_called_once_with(0x76, 3)
You can check that the assertions work by using some bogus values instead, and you'll see the pytest assertion errors.
A simpler, and more standard approach IMO is to look at the calls from the module mock directly:
def test_set_config_2(mocker):
    """ Using the module mock directly"""
    mocked_module = mocker.patch('device.smbus2')
    smbus = mocker.MagicMock()
    device.set_config(smbus)
    mocked_write = mocked_module.i2c_msg.write
    mocked_read = mocked_module.i2c_msg.read
    mocked_write.assert_called_once_with(0x76, [0x00, 0x01])
    mocked_read.assert_called_once_with(0x76, 3)
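The same idea can be tried with only the standard library, and without smbus2 installed, by placing a mock in sys.modules before the import runs. This is a sketch under assumptions: set_config is copied inline from device.py so the example is self-contained, and fake_smbus2 is a hypothetical name.

```python
import sys
from unittest.mock import MagicMock, patch

fake_smbus2 = MagicMock(name="smbus2")

# While the patch is active, "import smbus2" resolves to the mock.
with patch.dict(sys.modules, {"smbus2": fake_smbus2}):
    import smbus2

    def set_config(bus):  # same body as device.set_config in the question
        write = smbus2.i2c_msg.write(0x76, [0x00, 0x01])
        read = smbus2.i2c_msg.read(0x76, 3)
        bus.i2c_rdwr(write, read)

    bus = MagicMock()
    set_config(bus)

# Arguments are recorded on the mock that received the call...
fake_smbus2.i2c_msg.write.assert_called_once_with(0x76, [0x00, 0x01])
fake_smbus2.i2c_msg.read.assert_called_once_with(0x76, 3)

# ...while the bus mock only knows it was handed the two return values.
bus.i2c_rdwr.assert_called_once_with(
    fake_smbus2.i2c_msg.write.return_value,
    fake_smbus2.i2c_msg.read.return_value,
)
```

Each mock records only the calls made on it, so the write and read arguments are checked on the module mock and the wiring is checked on the bus mock.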
I just realized that you are using dependency injection, and you should take advantage of it.
That would be the clean approach.
Mocks can behave in unexpected or nasty ways (which does not mean they are evil, only sometimes counterintuitive).
I would recommend the following test structure:
# test_device.py
import device
def test_set_config():
    dummy_bus = DummyBus()
    device.set_config(dummy_bus)
    # assert things here...
    assert dummy_bus.read_data == 'foo'
    assert dummy_bus.write_data == 'bar'
    breakpoint()
class DummyBus:
    def __init__(self):
        self.read_data = None
        self.write_data = None
    def i2c_rdwr(self, write_input, read_input):
        # Record what set_config passed in so the test can inspect it
        self.read_data = read_input
        self.write_data = write_input
For anyone coming across this, I posed a variation of this problem in another question, and the result was quite satisfactory:
https://stackoverflow.com/a/73739343/
In a nutshell, create a TraceableMock class, derived from MagicMock, that returns a new mock that keeps track of its parent, as well as the parameters of the function call that led to this mock being created. Together, there is enough information to verify that the correct function was called, and the correct parameters were supplied.

How to mock spacy models / Doc objects for unit tests?

Loading spacy models slows down running my unit tests. Is there a way to mock spacy models or Doc objects to speed up unit tests?
Example of a current slow test:
import spacy
nlp = spacy.load("en_core_web_sm")
def test_entities():
    text = u"Google is a company."
    doc = nlp(text)
    assert doc.ents[0].text == u"Google"
Based on the docs my approach is
Constructing the Vocab and Doc manually and setting the entities as tuples.
from spacy.vocab import Vocab
from spacy.tokens import Doc
def test():
    alphanum_words = u"Google Facebook are companies".split(" ")
    labels = [u"ORG"]
    words = alphanum_words + [u"."]
    spaces = len(words) * [True]
    spaces[-1] = False
    spaces[-2] = False
    vocab = Vocab(strings=(alphanum_words + labels))
    doc = Doc(vocab, words=words, spaces=spaces)
    def get_hash(text):
        return vocab.strings[text]
    entity_tuples = tuple([(get_hash(labels[0]), 0, 1)])
    doc.ents = entity_tuples
    assert doc.ents[0].text == u"Google"
Is there a cleaner more Pythonic solution for mocking spacy objects for unit tests for entities?
This is a great question actually! I'd say your instinct is definitely right: If all you need is a Doc object in a given state and with given annotations, always create it manually wherever possible. And unless you're explicitly testing a statistical model, avoid loading it in your unit tests. It makes the tests slow, and it introduces too much unnecessary variance. This is also very much in line with the philosophy of unit testing: you want to be writing independent tests for one thing at a time (not one thing plus a bunch of third-party library code plus a statistical model).
Some general tips and ideas:
If possible, always construct a Doc manually. Avoid loading models or Language subclasses.
Unless your application or test specifically needs the doc.text, you do not have to set the spaces. In fact, I leave this out in about 80% of the tests I write, because it really only becomes relevant when you're putting the tokens back together.
If you need to create a lot of Doc objects in your test suite, you could consider using a utility function, similar to the get_doc helper we use in the spaCy test suite. (That function also shows you how the individual annotations are set manually, in case you need it.)
Use (session-scoped) fixtures for the shared objects, like the Vocab. Depending on what you're testing, you might want to explicitly use the English vocab. In the spaCy test suite, we do this by setting up an en_vocab fixture in the conftest.py.
Instead of setting the doc.ents to a list of tuples, you can also make it a list of Span objects. This looks a bit more straightforward, is easier to read, and in spaCy v2.1+, you can also pass a string as a label:
from spacy.tokens import Doc, Span
def test_entities(en_vocab):
    doc = Doc(en_vocab, words=["Hello", "world"])
    doc.ents = [Span(doc, 0, 1, label="ORG")]
    assert doc.ents[0].text == "Hello"
If you do need to test a model (e.g. in the test suite that makes sure that your custom models load and run as expected) or a language class like English, put them in a session-scoped fixture. This means that they'll only be loaded once per session instead of once per test. Language classes are lazy-loaded and may also take some time to load, depending on the data they contain. So you only want to do this once.
# Note: You probably don't have to do any of this, unless you're testing your
# own custom models or language classes.
import pytest
import spacy
@pytest.fixture(scope="session")
def en_core_web_sm():
    return spacy.load("en_core_web_sm")
@pytest.fixture(scope="session")
def en_lang_class():
    lang_cls = spacy.util.get_lang_class("en")
    return lang_cls()
def test(en_lang_class):
    doc = en_lang_class("Hello world")

Unmocking a mocked object in Django unit tests

I have several TestCase classes in my Django application. On some of them, I mock out a function which calls external resources by decorating the class with @mock.patch, which works great. One TestCase in my test suite, let's call it B(), depends on that external resource so I don't want it mocked out and I don't add the decorator. It looks something like this:
@mock.patch("myapp.external_resource_function", new=mock.MagicMock)
class A(TestCase):
    # tests here
class B(TestBase):
    # tests here which depend on external_resource_function
When I test B independently, things work as expected. However, when I run both tests together, A runs first but the function is still mocked out in B. How can I unmock that call? I've tried reloading the module, but it didn't help.
Patch has start and stop methods. Based on what I can see from the code you have provided, I would remove the decorator and instead start and stop the patch in your classes' setUp and tearDown methods, as shown in the link:
class A(TestCase):
    def setUp(self):
        self.patcher1 = patch('myapp.external_resource_function', new=mock.MagicMock)
        self.MockClass1 = self.patcher1.start()
    def tearDown(self):
        self.patcher1.stop()
    def test_something(self):
        ...
>>> A('test_something').run()
Great answer. With regard to Ethereal's question, patch objects are pretty flexible in their use.
Here's one way to approach tests that require different patches. You could still use setUp and tearDown, but not for the patch start/stop part.
You start() the patches in each test and use a finally clause to make sure they get stopped.
Patches also support the context manager protocol, so that's another option, not shown here.
class A(TestCase):
    patcher1 = patch('myapp.external_resource_function', new=mock.MagicMock)
    patcher2 = patch('myapp.something_else', new=mock.MagicMock)
    def test_something(self):
        li_patcher = [self.patcher1]
        for patcher in li_patcher:
            patcher.start()
        try:
            pass
        finally:
            for patcher in li_patcher:
                patcher.stop()
    def test_something_else(self):
        li_patcher = [self.patcher1, self.patcher2]
        for patcher in li_patcher:
            patcher.start()
        try:
            pass
        finally:
            for patcher in li_patcher:
                patcher.stop()
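For completeness, here is a sketch of the context manager option, which removes the need for explicit start() and stop() calls. The myapp namespace below is a hypothetical stand-in for the real module, and contextlib.ExitStack is used to handle a variable number of patches.

```python
import types
import unittest
from contextlib import ExitStack
from unittest.mock import patch

# Hypothetical stand-in for the myapp module from the question.
myapp = types.SimpleNamespace(
    external_resource_function=lambda: "real resource",
    something_else=lambda: "real something",
)


class A(unittest.TestCase):
    def test_something(self):
        # A single patch: undone automatically when the block exits.
        with patch.object(myapp, "external_resource_function", return_value="fake"):
            self.assertEqual(myapp.external_resource_function(), "fake")
        self.assertEqual(myapp.external_resource_function(), "real resource")

    def test_something_else(self):
        # Several patches: ExitStack unwinds all of them, even if the test fails.
        with ExitStack() as stack:
            for name in ("external_resource_function", "something_else"):
                stack.enter_context(patch.object(myapp, name, return_value="fake"))
            self.assertEqual(myapp.something_else(), "fake")
        self.assertEqual(myapp.something_else(), "real something")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(A)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Since each patch lives only inside its with block, nothing stays mocked for a later test class such as B.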

Django test not loading fixture data

I have written tests for a Django project that I am working on, but one particular fixture fails to load.
The fixture was generated using dumpdata and I haven't fiddled with it at all.
I can load the data using manage.py on that fixture without errors. I have verified that the data actually loaded by querying it in the shell.
This is driving me nuts; any help would be much appreciated.
Here is my test file (irrelevant portions removed):
class ViewsFromUrls(TestCase):
    fixtures = [
        'centers/fixtures/test_data.json',
        'intranet/fixtures/test_data.json',
        'training/fixtures/test_data.json', #The one that fails to load
    ]
    def setUp(self):
        self.c = Client()
        self.c.login(username='USER', password='PASS')
    ...
    def test_ViewBatch(self):
        b = Batch.objects.all()[0].ticket_number
        response = self.c.get(reverse('training.views.view_batch', kwargs={'id':b}))
        # assertEqual compares the values; assertTrue(x, 200) would treat 200 as the failure message
        self.assertEqual(response.status_code, 200)
    ...
Import the TestCase from django.test:
from django.test import TestCase
class test_something(TestCase):
    fixtures = ['one.json', 'two.json']
    ...
Not: import unittest
Not: import django.utils.unittest
But: import django.test
That's a day of frustration right there.
Stop complaining - it's in the docs :-/
I am not sure if this fixes your problem, but on this site:
https://code.djangoproject.com/wiki/Fixtures
I found an interesting remark:
you see that Django searches for appnames/fixtures and
settings.FIXTURE_DIRS and loads the first match. So if you use names
like testdata.json for your fixtures you must make sure that no other
active application uses a fixture with the same name. If not, you can
never be sure what fixtures you actually load. Therefore it is
suggested that you prefix your fixtures with the application names,
e.g. myapp/fixtures/myapp_testdata.json .
Applying this (renaming the fixtures with appname as prefix in the filename), solved my problem (I had the same issue as described here)
Check if the fixture is really in the right place. From the docs:
Django will search in three locations
for fixtures:
In the fixtures directory of every installed application
In any directory named in the FIXTURE_DIRS setting
In the literal path named by the fixture
One thing to note, when creating the FIXTURE_DIRS constant in your settings file, be sure to leave out the leading '/' if you have a general fixtures directory off of the root of your project.
Ex:
'/actual/path/to/my/app/fixtures/'
Now, in the settings.py file:
Will NOT work:
FIXTURE_DIRS = '/fixtures/'
Will work:
FIXTURE_DIRS = 'fixtures/'
It's possible this depends on how your other routes are configured, but it was a gotcha that had me scratching my head for a little while. Hope this is useful. Cheers.
A simple mistake I made was adding a custom setUpClass() and forgetting to call super().setUpClass() in it (which, of course, is where Django's logic for loading fixtures lives).