How to introduce unique together with conflicting data?

How to introduce unique together with conflicting data? - django

I worte a little qna script for my website and to prevent users from starting a discussion I want every user only to be able to reply once.
class Q(models.Model):
text = models.TextField()
user = models.ForeignKeyField('auth.User')
class A(models.Model):
text = models.TextField()
user = models.ForeignKeyField('auth.User')
q = models.ForeignKeyField('Q')
class Meta:
unique_together = (('user','q'),)
Now the the migration gives me:
return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: columns user_id, q_id are not unique
Of course the unique clashes with existing data. What I need to know now is how to tell the migration to delete the conflicting answers. A stupid solution like keeping the first one found would already be a big help. Even better would be a way to compare conflicting A by a custom function.
I'm running Django-1.7 with the new migration system - not South.
Thanks you for your help!

You just need to create a data migration, where you can indeed write a custom function to use any logic you want. See the documentation for the details.
As an example, a data migration that kept only the lowest Answer id (which should be a good proxy for the earliest answer), might look like this:
from django.db import models, migrations
def make_unique(apps, schema_editor):
A = apps.get_model("yourappname", "A")
# get all unique pairs of (user, question)
all_user_qs = A.objects.values_list("user_id", "q_id").distinct()
# for each pair, delete all but the first
for user_id, q_id in all_user_qs:
late_answers = A.objects.filter(user_id=user_id, q_id=q_id).order_by('id')[1:]
A.objects.filter(id__in=list(late_answers)).delete()
class Migration(migrations.Migration):
dependencies = [
('yourappname', '0001_initial'),
]
operations = [
migrations.RunPython(make_unique),
]
(This is off the top of my head, and is of course destructive, so please just take this as an example. You might want to look into backing up your data before doing all this deleting.)
To recap: Delete the migration you're trying to run and get rid of the unique constraint. Create an empty data migration as described in the documentation. Write a function to delete the non-unique data that currently exists in the database. Then add the unique constraint back in and run makemigrations. Finally, run both migrations with migrate.

Related

When using natural_keys in Django, how can I distinguish between creates & updates?

In Django, I often copy model fixtures from one database to another. My models use natural_keys (though I'm not sure that's relevant) during serialization. How can I ensure the instances that have been updated in one database are not inserted into the other database?
Consider the following code:
models.py:
class AuthorManager(models.Manager):
def get_by_natural_key(self, name):
return self.get(name=name)
class Author(models.Model):
objects = AuthorManager()
name = models.CharField(max_length=100)
def natural_key(self):
return (self.name,)
Now if I create an author called "William Shakespeare" and dump it to a fixture via python manage.py dumpdata --natural_keys I will wind up w/ the following sort of file:
[
{
"model": "myapp.author",
"fields": {
"name": "Wiliam Shakespeare"
}
}
]
I can load that into another db and it will create a new Author named "William Shakespeare".
But, if I rename that author to "Bill Shakespeare" in the original database and recreate the fixture and load it into the other database then it will create another new Author named "Bill Shakespeare" instead of update the existing Author's name.
Any ideas on how to approach this?

You're using fixtures for what it's not made for: synchronizing databases. It's made for populating empty databases. In particular, "deletion" cannot be expressed in fixtures. An update based on natural keys could be expressed as an insertion + a deletion.
Now you can work around this by simply not using natural keys, but then the primary keys must be identical between databases. If the target database receives inserts from another source, then this is a problem as updates will be occur at the wrong object.
In short: use synchronization/replication tools to synchronize databases, use fixtures for migrations and tests. Trying to use fixtures for synchronization is error prone.

What is a "state operation" and "database operation" in SeparateDatabaseAndState?

This is an extension to this question: Move models between Django (1.8) apps with required ForeignKey references
The Nostalg.io's answer worked perfectly. But I still can't get what is a "state operation" and "database operation" and what actually is going on when using SeparateDatabaseAndState.

A bit tough topic and not enough clear explanations out there, so here are my 50 cents.
It's all about state of your application models. What's a state? Imagine you have an app with a model MyModel and some migrations:
# app/migrations/
# 001.py
CreateModel(name='MyModel')
# 002.py
AddField(name='total', field=models.IntegerField)
# 003.py
RemoveField(name='total', field=models.IntegerField)
When you call manage.py makemigrations app, Django looks through all changes in app/migrations/001.py...003.py to get what's expected to be in your model - i.e. the state of the model. So, state is roughly a combined result of your migrations. If it differs from what you have in your MyModel class in app/models.py, makemigrations creates new migration with corresponding change. Like if MyModel class currently has total field, while in last migration it was removed, Django creates a migration with AddField() operation. Unlike what people often think, makemigrations does NOT look at actual database table.
Normal migration operation changes both state and database: CreateModel() tells Django "hey Django, this migration adds new table" and performs "CREATE TABLE" on the database.
SeparateDatabaseAndState is needed when you need to do different things to the state and to the database (or may be you need to modify just one of them).
Let's look at example from Django docs, where they change existing ManyToMany relation to "through" model. They had model Author and model Book with M-M relation:
class Book(models.Model):
authors = models.ManyToManyField(Author)
But now you want to have a through-model AuthorBook - optionally with some extra fields in it:
class AuthorBook(models.Model):
...
class Book(models.Model):
authors = models.ManyToManyField(Author, through=AuthorBook)
But you don't want new table - you want to use existing core_book_authors table which Django created automatically, and you want data in it.
So you create a migration with SeparateDatabaseAndState operation with database_operations and state_operations.
operations = [
migrations.SeparateDatabaseAndState(
database_operations=[
migrations.RunSQL(
sql='ALTER TABLE core_book_authors RENAME TO core_authorbook',
reverse_sql='ALTER TABLE core_authorbook RENAME TO core_book_authors',
),
],
state_operations=[
migrations.CreateModel(name='AuthorBook'),
]
),
]
In database_operations they rename existing M-M table according to new model name. That's the only thing you actually need to do with the database, unless you're adding new field to AuthorBook at the same time.
state_operations with CreateModel() tells Django: "this migration adds new table" (actually it does not - as we know from database_operations, it renames existing table). But because of this state_operations our new model gets into the state and Django knows this model is "created" and will not try to create it on next makemigrations call.
So, in SeparateDatabaseAndState operation, database_operations affect the database and not the state, state_operations affect the state and not the database.

Data migrations in Django

I am working on a data migration for a Django app to populate
the main table in the db with data that will form the mainstay of
the app - this is persistent/permanent data that may added to but
never deleted.
My reference is the Django 1.7 documentation and in particular an
example on page
https://docs.djangoproject.com/en/1.7/ref/migration-operations/#django.db.migrations.operations.RunPython
with a custom method called forward_funcs:
def forwards_func(apps, schema_editor):
# We get the model from the versioned app registry;
# if we directly import it, it'll be the wrong version
Country = apps.get_model("myapp", "Country")
db_alias = schema_editor.connection.alias
Country.objects.using(db_alias).bulk_create([
Country(name="USA", code="us"),
Country(name="France", code="fr"),])
I am assuming the argument to bulk_create is a list of Country model objects not namedtuple objects, although the format looks exactly the same. Is this the case, and could someone please explain what db_alias is?
Also, if I wish to change or remove existing entries in a table using a data migration what are the methods corresponding to bulk_create to do this?
Thanks in advance for any help.

Country is just the same as you would do from app.models import Country. Only thing different, the import always gives you the latest model and apps.get_model in a migration gives you the model at the time of the migration. It continues to edit the model within the initial migration.
About bulk_create; its argument is indeed a list of unsaved Country objects and uses it to do an huge insert into your db. More information about bulk_create can be found here; https://docs.djangoproject.com/en/1.7/ref/models/querysets/#bulk-create.
About db_alias, it is the name of the database you set within your settings. Most of the time it is default, so you can leave it in your code if you just use one database. The function will probably will called more than once if you have more databases set within your settings. More info about using; https://docs.djangoproject.com/en/1.7/ref/models/querysets/#using.
An bulk delete is actually quite simple, you just filter your Countries and call delete on the queryset. So something like;
Country.objects.filter(continent="Europe").delete()
About the persistent/permanent data question, I don't really have a solution for that one. One thing you can do, I think, is overwrite the .delete() function on the model and Manager.

Slugfield in tango with django

I'm reading this tutorial to lear the basics of django and I'm facing a problem I can't solve.
I'm at this page http://www.tangowithdjango.com/book17/chapters/models_templates.html at the sluglify function approach.
In the tutorial the author says we have to create a new line in our category model for th slugfield. I folow strictly all the steps just as I show here:
from django.db import models
from django.template.defaultfilters import slugify
class Category(models.Model):
name = models.CharField(max_length=128, unique=True)
likes = models.IntegerField(default=0)
views = models.IntegerField(default=0)
slug = models.SlugField(unique=True)
def __unicode__(self):
return self.name
class Page(models.Model):
category = models.ForeignKey(Category)
title = models.CharField(max_length=128)
url = models.URLField()
views = models.IntegerField(default=0)
def __unicode__(self):
return self.title
When I run the "makemgiration" command everything works as expected: I choose the first option and provide ‘’ . BUT when I run "migrate" I get:
django.db.utils.Integrity error: Slug column is not unique
What is going on here? I've repeated several times the migrations and tried other default codes but with the same ending. I can't figure out what im doing wrong. They only thing left is that i'm giving something else instead of ‘’ (Firstly I thoughtthey were '' or ").
Thankyou for your time and help!

Delete db.sqlite3 and re run
python manage.py syncdb
Then perform the migrations
python manage.py makemigrations rango
python manage.py migrate
and re run your population script. This is a downside of Django that whenever you change models.py the db must be recreated.

I am also going through the tutorial and I was having the same issue a couple days ago.
I believe the problem is that you are trying to add a new field (slug) and have it be unique for each of the elements in your table but if I'm not mistaken you already have some data in your table for which that value does not exist and therefore the value that this field would get for those rows is not unique, hence the "django.db.utils.Integrity error: Slug column is not unique".
What I did was simply to delete the data in the table (because it was only a couple of fields, so no big deal) and THEN perform the addition of the field to the model. After doing that, you can put the data back in the table (if you're following the tutorial you should have a script for automated table population so you just run that script and you're golden). Once you have done that, all the rows in the table should have a unique slug field (since it is automatically generated based on the category name) and that solves the problem.
Note that if your table is very large, this may not be a very good solution because of the deletion of data so perhaps there is a better way, like adding that field to the model without it being unique, then running a script that sets the slug field for every row to an unique value and then setting the model's field as unique but I'm not very knowledgeable on SQL and the first solution worked just fine for me so it should work for you as well.

try deleting the sqlite3.db file from the project's directory. i was stuck on a similar problem and that really works..also even if you modify your population script, you have to delete and the recreate the db file to see the changes made....

Django: How do I move part of the data from one instance to another

We have 2 identical Django instances and one is trial the other is for production. Recently one client on trial purchased full product and I need to move ONLY their data from trial to production. I don't know if there's a convenient way to do that, since:
If I use Django fixtures then it might overwrite the existing data in production system because of the default id that Django assigned to each entry(I might be wrong but I think fixtures are only good for initialization).
Using sql to dump the DB might not help either because of the similar problem with the first approach, and it's also complex because there are other customers in trial but I only need to move this client's data.
Please give me some advice if you have similar experience.

The issue is you cannot transfer the primary keys (IDs) from the trial to the production DB, isn't it ? So 2 solutions:
1) A tank to kill an ant
You do a SQL export of your trial database, and you increase every primary key and foreign key by a number (for ex: 10000). This number needs to high enough to avoid unicity constraint violation when you will import it in the DB
2) The smart solution
If, and only if, your model is well designed: for every model you can find a set of its columns that could make a substitute primary key to the ID. If you have attributes with unique = True, or models with unique_together = (...) it's perfect: you can use natural keys !
In every model of your source code, you add the method get_by_natural_key:
class Person(models.Model):
firstname = models.CharField...
last_name = models.CharField...
class Meta:
unique_together = ("first_name", "last_name")
def get_by_natural_key(self, first_name, last_name):
return self.get(first_name=first_name, last_name=last_name)
Then you can use Django dumpdata command to export the trial database with the IDs replaced by the natural keys ! Then with the same code, you use the loaddata command to import these data files.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to introduce unique together with conflicting data? - django

Related

When using natural_keys in Django, how can I distinguish between creates & updates?

What is a "state operation" and "database operation" in SeparateDatabaseAndState?

Data migrations in Django

Slugfield in tango with django

Django: How do I move part of the data from one instance to another

Categories

Resources