Data migration only executed for the first test - django

I have a simple data migration that creates a Group and looks like this:
from django.db import migrations


def make_manager_group(apps, schema_editor):
    Group = apps.get_model("auth", "Group")
    managers_group = Group(name="managers")
    managers_group.save()


class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0001_initial'),
        ('auth', '0006_require_contenttypes_0002'),
    ]

    operations = [
        migrations.RunPython(make_manager_group, reverse_code=lambda *args, **kwargs: True)
    ]
and a simple functional test app containing the following tests:
from django.contrib.auth.models import Group
from django.contrib.staticfiles.testing import StaticLiveServerTestCase


class FunctionalTest(StaticLiveServerTestCase):

    def setUp(self):
        print("Groups : {}".format(Group.objects.all()))

    def test_2(self):
        pass

    def test_1(self):
        pass
When I run the tests, I get:
Creating test database for alias 'default'...
Groups : [<Group: managers>]
.Groups : []
.
Clearly, the group is being created when the test db is created, but when the db is reset between tests, it is reset to an empty database and not to the state it was in after all the migrations were applied.
The model itself doesn't contain anything special (I only created one so that this migration wouldn't be the first one, as in the project I'm working on, but I'm not sure it's needed at all).
Is this a bug, or am I missing something about data migrations that would let me have my group created before every single test starts?
Edit 1: I'm using Django 1.8.3
Edit 2: Quick'n'dirty hack added to the setUp of the test class:
from django.contrib.auth.models import Group, Permission

if not Group.objects.all():
    managers_group = Group(name="managers")
    managers_group.save()
    managers_group.permissions.add(
        Permission.objects.get(codename='add_news'),
        Permission.objects.get(codename='change_news'),
        Permission.objects.get(codename='delete_news')
    )
This is anything but DRY, but until now I couldn't find another way...

To answer my own question:
It seems to be a filed bug which has since become documented behaviour.
It says that with TransactionTestCase and its subclasses (in my case LiveServerTestCase), the data created by data migrations is not present before every test; it is only there for the first one.
It also says that we could set serialized_rollback to True, which should restore the fully migrated database after each test. But in my case, I'm getting the same error as the last message in the bug report.
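For reference, this is roughly what the documented serialized_rollback option looks like on the test class (a minimal sketch, reusing the FunctionalTest class from above):

from django.contrib.staticfiles.testing import StaticLiveServerTestCase


class FunctionalTest(StaticLiveServerTestCase):
    # Serialize the migrated database state and restore it after each
    # test's flush, instead of leaving the tables empty.
    serialized_rollback = True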
So I'm going to stick with my dirty hack for now, and maybe create a data fixture, as the docs say that fixtures are loaded before every test.
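If I go down the fixture route, it would presumably look something like this (groups.json is a hypothetical fixture file containing the managers group):

class FunctionalTest(StaticLiveServerTestCase):
    # Unlike data migrations on TransactionTestCase subclasses,
    # fixtures are loaded before every test.
    fixtures = ['groups.json']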

Related

Referencing external variables in Django data migrations

When using models in migrations, in Django we can use apps.get_model() to make sure that the migration will use the right "historical" version of the model (as it was when the migration was defined).
But how do we deal with "regular" variables (not models) imported from the codebase?
If we import a variable from another module, we will probably face issues in the future. For example:
If someday in the future we delete the variable (because we changed the implementation), this will break the migration, so we won't be able to re-run migrations locally to re-create a database from scratch.
If we modify the variable (e.g. we change the values in a list), this will produce unexpected effects when we run the reverse operation on an existing db.
So the question is: what's the best practice for writing migrations? Should we always hard-code values without importing external variables?
Example
Suppose I want to simply modify the value of a field with a variable that I've defined somewhere in the codebase. For example, I want to turn all normal users into admins. I stored user roles in an enum (UserRoles). One way to write the migration would be this:
from django.db import migrations
from user_roles import UserRoles


def change_user_role(apps, schema_editor):
    User = apps.get_model('users', 'User')
    users = User.objects.filter(role=UserRoles.NORMAL_USER.value)
    for user in users:
        user.role = UserRoles.ADMIN.value
    User.objects.bulk_update(users, ["role"])


def revert_user_role_changes(apps, schema_editor):
    User = apps.get_model('users', 'User')
    users = User.objects.filter(role=UserRoles.ADMIN.value)
    for user in users:
        user.role = UserRoles.NORMAL_USER.value
    User.objects.bulk_update(users, ["role"])


class Migration(migrations.Migration):

    dependencies = [
        ('users', '0015_auto_20220612_0824'),
    ]

    operations = [
        migrations.RunPython(change_user_role, revert_user_role_changes)
    ]
As you see, this will have the issues I mentioned above.
I've used the enum example, but this can be applied to every variable referenced inside the migrations.
So the question again: what's the best practice for migrations? Should we always hard-code values without referencing external variables that might change?
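For comparison, a minimal sketch of the hard-coded alternative being weighed here; the literal role values are illustrative placeholders for whatever UserRoles currently contains:

from django.db import migrations

# Values frozen at the time the migration was written, so later changes
# to UserRoles cannot affect this migration or its reverse operation.
NORMAL_USER = "normal_user"
ADMIN = "admin"


def change_user_role(apps, schema_editor):
    User = apps.get_model('users', 'User')
    users = User.objects.filter(role=NORMAL_USER)
    for user in users:
        user.role = ADMIN
    User.objects.bulk_update(users, ["role"])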

Django data migration fails unless it's run separately

I've run into this a couple of other times and can't figure out why it happens. When I run the migrations all together through ./manage.py migrate, the last migration (a data migration) fails. The solution is to run the data migration on its own after the other migrations have completed. How can I run them all automatically with no errors?
I have a series of migrations:
fulfillment/0001.py
order/0041.py (dependency: fulfillment/0001.py)
order/0042.py
order/0043.py
I followed this RealPython article to move a model to a new app, which works perfectly and is covered by migrations #1 to #3. Migration #3 also adds a GenericForeignKey field. Migration #4 is a data migration that simply populates the GenericForeignKey field from the existing ForeignKey field.
from django.db import migrations, models


def copy_to_generic_fk(apps, schema_editor):
    ContentType = apps.get_model('contenttypes.ContentType')
    Order = apps.get_model('order.Order')
    pickup_point_type = ContentType.objects.get(
        app_label='fulfillment',
        model='pickuppoint'
    )
    Order.objects.filter(pickup_point__isnull=False).update(
        content_type=pickup_point_type,
        object_id=models.F('pickup_point_id')
    )


class Migration(migrations.Migration):

    dependencies = [
        ('order', '0042'),
    ]

    operations = [
        migrations.RunPython(copy_to_generic_fk, reverse_code=migrations.RunPython.noop)
    ]
Running the sequence together I get an error:
fake.DoesNotExist: ContentType matching query does not exist.
If I run the migration to #3 then run #4 by itself everything works properly. How can I get them to run in sequence with no errors?
There are two things that might fix the problem. First, look into run_before: https://docs.djangoproject.com/en/3.1/howto/writing-migrations/#controlling-the-order-of-migrations
If you add it to fulfillment #1 and make sure it runs before order #4, it should fix the problem.
Another thing you can do is move your data migration into a new fulfillment #2; that way you know for sure that all the order migrations and fulfillment #1 have finished.
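A rough sketch of the run_before suggestion, assuming the migration names from the list above (order/0043.py being data migration #4):

# fulfillment/0001.py
from django.db import migrations


class Migration(migrations.Migration):

    # Ensure this migration is applied before the order data migration.
    run_before = [
        ('order', '0043'),
    ]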
Instead of getting the ContentType through .get(), you have to retrieve the model through the apps argument and then use get_for_model().
def copy_to_generic_fk(apps, schema_editor):
    ContentType = apps.get_model('contenttypes', 'ContentType')
    PickupPoint = apps.get_model('fulfillment', 'pickuppoint')
    pickup_point_type = ContentType.objects.get_for_model(PickupPoint)
    ...

Can't db.drop_all() when creating tables with SQLAlchemy op.create_table

I'm building a Flask service that uses SQLAlchemy Core for database operations, but I'm not using the ORM, just dispatching raw SQL to the PostgreSQL db. In order to track database migrations I'm using Alembic.
My migration looks roughly like this:
from alembic import op
from sqlalchemy import Column, DateTime, sql, text
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import Session


def upgrade():
    # Add the ossp extension to enable use of UUIDs
    add_extension_command = 'create EXTENSION if not EXISTS "uuid-ossp";'
    bind = op.get_bind()
    session = Session(bind=bind)
    session.execute(add_extension_command)
    # Create tables
    op.create_table(
        "account",
        Column(
            "id", UUID(as_uuid=True), primary_key=True, server_default=text("uuid_generate_v4()"),
        ),
        Column("created_at", DateTime, server_default=sql.func.now()),
        Column("deleted_at", DateTime, default=None),
        Column("modified_at", DateTime, server_default=sql.func.now()),
    )
This works great in general. The main issue I'm having is with testing: after each test, I want to be able to drop and rebuild the DB to clean out the data. To do this, I'm using pytest, and created the following app fixture:
@pytest.fixture
def app():
    app = create_app("testing")
    with app.app_context():
        db.init_app(app)
        Migrate(app, db)
        upgrade()
        yield app
        db.drop_all()
The general idea here was that each time we need the app context, we apply the database migrations, yield the app, then when the test is done we drop all the tables.
The issue is, db.drop_all() does nothing. I believe this is because the db object is not bound to any MetaData. The research I did led here, which mentions that the create_table command does not create MetaData, which I assume is why the app is not aware of which tables are available to drop.
I'm a bit stuck here as to what the right path forward is. Should I change how I'm building these migrations? Is this not the right pattern to make sure I remove test data from the DB between tests?
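One possible teardown, not from the original post but a common workaround: since the tables created by Alembic are not registered on db.metadata, reflect whatever actually exists in the database into a fresh MetaData and drop that instead of calling db.drop_all():

from sqlalchemy import MetaData


@pytest.fixture
def app():
    app = create_app("testing")
    with app.app_context():
        db.init_app(app)
        Migrate(app, db)
        upgrade()
        yield app
        # Reflect the current schema (including alembic_version) and drop it,
        # so the next upgrade() starts from an empty database.
        meta = MetaData()
        meta.reflect(bind=db.engine)
        meta.drop_all(bind=db.engine)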

How can I initialise group names in Django every time the program runs?

I have this code, and I want it to create the groups every time the program runs, so that if the database is deleted the program is still self-sufficient and nobody has to create the groups again. Do you know an easy way to do this?
system_administrator = Group.objects.get_or_create(name='system_administrator')
manager = Group.objects.get_or_create(name='manager')
travel_advisor = Group.objects.get_or_create(name='travel_advisor')
If you lose your DB, you'd have to rerun migrations on a fresh db before the program could run again, so I think data migrations might be a good solution for this. A data migration is a migration that runs Python code to alter the data in the DB, not the schema as a normal migration does.
You could do something like this:
In a new migration file (you can run python manage.py makemigrations --empty yourappname to create an empty migration file for an app)
from django.db import migrations


def generate_groups(apps, schema_editor):
    # Group lives in django.contrib.auth, so fetch the historical model from there
    Group = apps.get_model('auth', 'Group')
    Group.objects.get_or_create(name="system_administrator")
    Group.objects.get_or_create(name="manager")
    Group.objects.get_or_create(name="travel_advisor")


class Migration(migrations.Migration):

    dependencies = [
        ('yourappname', 'previous migration'),
    ]

    operations = [
        migrations.RunPython(generate_groups),
    ]
Worth reading the docs on this https://docs.djangoproject.com/en/3.0/topics/migrations/#data-migrations
You can do it in the ready method of one of your apps.
from django.apps import AppConfig


class YourApp(AppConfig):

    def ready(self):
        # important: do the import inside the method
        from django.contrib.auth.models import Group
        Group.objects.get_or_create(name='system_administrator')
        Group.objects.get_or_create(name='manager')
        Group.objects.get_or_create(name='travel_advisor')
The problem with the data migration approach is that it is only useful for populating the database the first time. If the groups are deleted after the data migration has run, you will need to populate them again.
Also remember that get_or_create returns a tuple.
group, created = Group.objects.get_or_create(name='manager')
# group is an instance of Group
# created is a boolean

Model instance fixtures not persisted on the database

I have a test class with two methods, and want to share a saved model instance between both methods.
My fixtures:
@pytest.fixture(scope='class')
def model_factory():
    class ModelFactory(object):
        def get(self):
            x = Model(email='test@example.org',
                      name='test')
            x.save()
            return x
    return ModelFactory()


@pytest.fixture(scope='class')
def model(model_factory):
    m = model_factory.get()
    return m
My expectation is to receive only the model fixture on (both) my test methods and have it be the same, persisted on the database:
@pytest.mark.django_db
class TestModel(object):

    def test1(self, model):
        assert model.pk is not None
        Model.objects.get(pk=model.pk)  # Works, instance is in the db

    def test2(self, model):
        assert model.pk is not None  # model.pk is the same as in test1
        Model.objects.get(pk=model.pk)  # Fails:
        # *** DoesNotExist: Model matching query does not exist
I've verified using --pdb that at the end of test1, running Model.objects.all() returns the single instance I created. Meanwhile, psql shows no record:
test_db=# select * from model_table;
 id | ··· fields
(0 rows)
Running the Model.objects.all() in pdb at the end of test2 returns an empty list, which is presumably right considering that the table is empty.
Why isn't my model being persisted, while the query still returns an instance anyway?
Why isn't the instance returned by the query in the second test, if my model fixture is marked scope='class' and saved? (This was my original question until I found out saving the model didn't do anything on the database)
Using django 1.6.1, pytest-django 2.9.1, pytest 2.8.5
Thanks
Tests must be independent of each other. To ensure this, Django - like most frameworks - clears the db after each test. See the documentation.
By looking at the postgres log I've found that pytest-django by default does a ROLLBACK after each test to keep things clean (which makes sense, as tests shouldn't depend on state possibly modified by earlier tests).
By decorating the test class with django_db(transaction=True) I could indeed see the data committed at the end of each test from psql, which answers my first question.
Same as before, the test runner ensures no state is kept between tests, which is the answer to my second point.
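For reference, the decoration described above is just this (a minimal sketch; the test bodies are unchanged):

import pytest


@pytest.mark.django_db(transaction=True)
class TestModel(object):
    ...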
The scope argument is in this case a bit misleading; however, if you were to write your code like this:
@pytest.fixture(scope='class')
def model_factory(db, request):
    # body
then you would get an error basically saying that the database fixture has to be implemented with 'function' scope.
I would like to add that this is currently being worked on and might be a killer feature in the future ;) github pull request