Database normalization in django - django

I need an optimally normalized database structure to achieve the following requirement.
models.py
class Learning_Institute(models.Model):
name = models.TextField()
user = models.ManyToManyField(settings.AUTH_USER_MODEL)
class Course(models.Model):
title = models.CharField(max_length=50)
instructor = models.ForeignKey(User, on_delete=models.PROTECT, related_name='courses_taught')
institute = models.ForeignKey(Learning_Institute, on_delete=models.PROTECT, related_name='courses')
I need the instructor field in the Course table to be limited to the set of users in Learning_Institute instead of all the users in the system.
How do I achieve this on the DB level?

I don't think that you can limit in the model itself.
One of the things that you can do is on form save to have validations using form clearing methods like so
And you can create a check that does something like this:
def clean_ instructor(self):
instructor = self.cleaned_data['instructor']
if instructor.type != "instructor":
raise forms.ValidationError("The user is not instructor!")
Another option is to create another User object that will inherit User and you can call it InstrcutorUsers
I have used this tutorial to extend the user model in django

I don't know if it's suitable for your scenario but changing the relations slightly may achieve what you want.
Removing the many to many for User and create a concrete association model for it, will
at least make sure the Course can only have users that also are instructors, by design.
Consider the following model structure:
class LearningInstitute(models.Model):
name = models.TextField()
class InstituteInstructor(models.Model):
class Meta:
unique_together=('user','institute')
user = models.ForeignKey(User, on_delete=models.PROTECT)
institute = models.ForeighKey(LearningInstitute, on_delete=models.PROTECT)
class Course(models.Model):
title = models.CharField(max_length=50)
instructor = models.ForeignKey(InstituteInstructor, on_delete=models.PROTECT)
You have LearningInstitutes
A user can be an instructor with a related institute, a User can only be related to the same institute once
A Course can only have an instructor (and by that also the related institute)
Design can easily be extended to let Courses have multiple instructors.
By design the Course can only have users that are also instructors.

There is a possibility in Django to achieve this in your model class. The option that can be used in models.ForeignKey is called limit_choices_to.
First I'd very strongly recommend to rename the field user in the class LearningInstitute to users. It is a many to many relation, which means an institute can have many users, and a user can perform some work in many institutes.
Naming it correctly in plural helps to better understand the business logic.
Then you can adapt the field instructor in the class Course:
instructor = models.ForeignKey(
'User', # or maybe settings.AUTH_USER_MODEL
on_delete=models.PROTECT,
related_name='courses_taught',
limit_choices_to=~models.Q(learning_institute_set=None)
)
This is not tested and probably will need some adjustment. The idea is to get all User objects, where the field learning_institute_set (default related name, since you haven't specified one) is not (the ~ sign negates the query) None.
This has however nothing to do with normalisation on the database level. The implementation is solely in the application code, and the database has no information about that.
As suggested by #TreantBG, a good approach would be to extend the class User and create class Instructor (or similar). This approach would affect the database by creating an appropriate table for Instructor.

Related

Creating a model with two optional, but one mandatory foreign key

My problem is that I have a model that can take one of two foreign keys to say what kind of model it is. I want it to take at least one but not both. Can I have this still be one model or should I split it into two types. Here is the code:
class Inspection(models.Model):
InspectionID = models.AutoField(primary_key=True, unique=True)
GroupID = models.ForeignKey('PartGroup', on_delete=models.CASCADE, null=True, unique=True)
SiteID = models.ForeignKey('Site', on_delete=models.CASCADE, null=True, unique=True)
#classmethod
def create(cls, groupid, siteid):
inspection = cls(GroupID = groupid, SiteID = siteid)
return inspection
def __str__(self):
return str(self.InspectionID)
class InspectionReport(models.Model):
ReportID = models.AutoField(primary_key=True, unique=True)
InspectionID = models.ForeignKey('Inspection', on_delete=models.CASCADE, null=True)
Date = models.DateField(auto_now=False, auto_now_add=False, null=True)
Comment = models.CharField(max_length=255, blank=True)
Signature = models.CharField(max_length=255, blank=True)
The problem is the Inspection model. This should be linked to either a group or a site, but not both. Currently with this set up it needs both.
I'd rather not have to split this up into two nearly identical models GroupInspection and SiteInspection, so any solution that keeps it as one model would be ideal.
I would suggest that you do such validation the Django way
by overriding the clean method of Django Model
class Inspection(models.Model):
...
def clean(self):
if <<<your condition>>>:
raise ValidationError({
'<<<field_name>>>': _('Reason for validation error...etc'),
})
...
...
Note, however, that like Model.full_clean(), a model’s clean() method is not invoked when you call your model’s save() method.
it needs to be called manually to validate model's data, or you can override model's save method to make it always call the clean() method before triggering the Model class save method
Another solution that might help is using GenericRelations,
in order to provide a polymorphic field that relates with more than one table, but it can be the case if these tables/objects can be used interchangeably in the system design from the first place.
As mentionned in comments, the reason that "with this set up it needs both" is just that you forgot to add the blank=True to your FK fields, so your ModelForm (either custom one or the default generated by the admin) will make the form field required. At the db schema level, you could fill both, either one or none of those FKs, it would be ok since you made those db fields nullable (with the null=True argument).
Also, (cf my other comments), your may want to check that your really want to FKs to be unique. This technically turns your one to many relationship into a one to one relationship - you're only allowed one single 'inspection' record for a given GroupID or SiteId (you can't have two or more 'inspection' for one GroupId or SiteId). If that's REALLY what you want, you may want to use an explicit OneToOneField instead (the db schema will be the same but the model will be more explicit and the related descriptor much more usable for this use case).
As a side note: in a Django Model, a ForeignKey field materializes as a related model instance, not as a raw id. IOW, given this:
class Foo(models.Model):
name = models.TextField()
class Bar(models.Model):
foo = models.ForeignKey(Foo)
foo = Foo.objects.create(name="foo")
bar = Bar.objects.create(foo=foo)
then bar.foo will resolve to foo, not to foo.id. So you certainly want to rename your InspectionID and SiteID fields to proper inspection and site. BTW, in Python, the naming convention is 'all_lower_with_underscores' for anything else than class names and pseudo-constants.
Now for your core question: there's no specific standard SQL way of enforcing a "one or the other" constraint at the database level, so it's usually done using a CHECK constraint, which is done in a Django model with the model's meta "constraints" option.
This being said, how constraints are actually supported and enforced at the db level depends on your DB vendor (MySQL < 8.0.16 just plain ignore them for example), and the kind of constraint you will need here will not be enforced at the form or model level validation, only when trying to save the model, so you also want to add validation either at the model level (preferably) or form level validation, in both cases in the (resp.) model or form's clean() method.
So to make a long story short:
first double-check that you really want this unique=True constraint, and if yes then replace your FK field with a OneToOneField.
add a blank=True arg to both your FK (or OneToOne) fields
add the proper check constraint in your model's meta - the doc is succint but still explicit enough if you know to do complex queries with the ORM (and if you don't it's time you learn ;-))
add a clean() method to your model that checks your have either one or the other field and raises a validation error else
and you should be ok, assuming your RDBMS respects check constraints of course.
Just note that, with this design, your Inspection model is a totally useless (yet costly !) indirection - you'd get the exact same features at a lower cost by moving the FKs (and constraints, validation etc) directly into InspectionReport.
Now there might be another solution - keep the Inspection model, but put the FK as a OneToOneField on the other end of the relationship (in Site and Group):
class Inspection(models.Model):
id = models.AutoField(primary_key=True) # a pk is always unique !
class InspectionReport(models.Model):
# you actually don't need to manually specify a PK field,
# Django will provide one for you if you don't
# id = models.AutoField(primary_key=True)
inspection = ForeignKey(Inspection, ...)
date = models.DateField(null=True) # you should have a default then
comment = models.CharField(max_length=255, blank=True default="")
signature = models.CharField(max_length=255, blank=True, default="")
class Group(models.Model):
inspection = models.OneToOneField(Inspection, null=True, blank=True)
class Site(models.Model):
inspection = models.OneToOneField(Inspection, null=True, blank=True)
And then you can get all the reports for a given Site or Group with yoursite.inspection.inspectionreport_set.all().
This avoids having to add any specific constraint or validation, but at the cost of an additional indirection level (SQL join clause etc).
Which of those solution would be "the best" is really context-dependent, so you have to understand the implications of both and check how you typically use your models to find out which is more appropriate for your own needs. As far as I'm concerned and without more context (or in doubt) I'd rather use the solution with the less indirection levels, but YMMV.
NB regarding generic relations: those can be handy when you really have a lot of possible related models and / or don't know beforehand which models one will want to relate to your own. This is specially useful for reusable apps (think "comments" or "tags" etc features) or extensible ones (content management frameworks etc). The downside is that it makes querying much heavier (and rather impractical when you want to do manual queries on your db). From experience, they can quickly become a PITA bot wrt/ code and perfs, so better to keep them for when there's no better solution (and/or when the maintenance and runtime overhead is not an issue).
My 2 cents.
Django has a new (since 2.2) interface for creating DB constraints: https://docs.djangoproject.com/en/3.0/ref/models/constraints/
You can use a CheckConstraint to enforce one-and-only-one is non-null. I use two for clarity:
class Inspection(models.Model):
InspectionID = models.AutoField(primary_key=True, unique=True)
GroupID = models.OneToOneField('PartGroup', on_delete=models.CASCADE, blank=True, null=True)
SiteID = models.OneToOneField('Site', on_delete=models.CASCADE, blank=True, null=True)
class Meta:
constraints = [
models.CheckConstraint(
check=~Q(SiteID=None) | ~Q(GroupId=None),
name='at_least_1_non_null'),
),
models.CheckConstraint(
check=Q(SiteID=None) | Q(GroupId=None),
name='at_least_1_null'),
),
]
This will only enforce the constraint at the DB level. You will need to validate inputs in your forms or serializers manually.
As a side note, you should probably use OneToOneField instead of ForeignKey(unique=True). You'll also want blank=True.
It might be late to answer your question, but I thought my solution might fit to some other person's case.
I would create a new model, let's call it Dependency, and apply the logic in that model.
class Dependency(models.Model):
Group = models.ForeignKey('PartGroup', on_delete=models.CASCADE, null=True, unique=True)
Site = models.ForeignKey('Site', on_delete=models.CASCADE, null=True, unique=True)
Then I would write the logic to be applicable very explicitly.
class Dependency(models.Model):
group = models.ForeignKey('PartGroup', on_delete=models.CASCADE, null=True, unique=True)
site = models.ForeignKey('Site', on_delete=models.CASCADE, null=True, unique=True)
_is_from_custom_logic = False
#classmethod
def create_dependency_object(cls, group=None, site=None):
# you can apply any conditions here and prioritize the provided args
cls._is_from_custom_logic = True
if group:
_new = cls.objects.create(group=group)
elif site:
_new = cls.objects.create(site=site)
else:
raise ValueError('')
return _new
def save(self, *args, **kwargs):
if not self._is_from_custom_logic:
raise Exception('')
return super().save(*args, **kwargs)
Now you just need to create a single ForeignKey to your Inspection model.
In your view functions, you need to create a Dependency object and then assign it to your Inspection record. Make sure that you use create_dependency_object in your view functions.
This pretty much makes your code explicit and bug proof. The enforcement can be bypassed too very easily. But the point is that it needs prior knowledge to this exact limitation to be bypassed.
I think you're talking about Generic relations, docs.
Your answer looks similar to this one.
Sometime ago I needed to use Generic relations but I read in a book and somewhere else that the usage should be avoided, I think it was Two Scoops of Django.
I ended up creating a model like this:
class GroupInspection(models.Model):
InspectionID = models.ForeignKey..
GroupID = models.ForeignKey..
class SiteInspection(models.Model):
InspectionID = models.ForeignKey..
SiteID = models.ForeignKey..
I‘m not sure if it is a good solution and as you mentioned you'd rather not use it, but this is worked in my case.

Django - How to build an intermediate m2m model that fits? / best practice

First of all I have to admit that I'm quite new to all this coding stuff but as I couldn't find a proper solution doing it myself and learning to code is probably the best way.
Anyway, I'm trying to build an app to show different titleholders, championships and stuff like that. After reading the Django documentation I figured out I have to use intermediate models as a better way. My old models.py looks like this:
class Person(models.Model):
first_name = models.CharField(max_length=64)
last_name = models.CharField(max_length=64)
[...]
class Team(models.Model):
name = models.CharField(max_length=64)
team_member_one = models.ForeignKey(Person)
team_member_two = models.ForeignKey(Person)
class Championship(models.Model):
name = models.CharField(max_length=128)
status = models.BooleanField(default=True)
class Titleholder(models.Model):
championship = models.ForeignKey(Championship)
date_won = models.DateField(null=True,blank=True)
date_lost = models.DateField(null=True,blank=True)
titleholder_one = models.ForeignKey(Person,related_name='titleholder_one',null=True,blank=True)
titleholder_two = models.ForeignKey(Person,related_name='titleholder_two',null=True,blank=True)
Championships can be won by either individuals or teams, depending if it's a singles or team championship, that's why I had to foreign keys in the Titleholder class. Looking at the Django documentation this just seems false. On the other hand, for me as a beginner, the intermediate model in the example just doesn't seem to fit my model in any way.
Long story short: can anyone point me in the right direction on how to build the model the right way? Note: this is merely a question on how to build the models and displaying it in the Django admin, I don't even talk about building the templates as of now.
Help would be greatly appreciated! Thanks in advance guys.
So I will take it up from scratch. You seem to be somewhat new to E-R Database Modelling. If I was trying to do what you do, I would create my models the following way.
Firstly, Team would've been my "corner" model (I use this term to mean models that do not have any FK fields), and then Person model would come into play as follows.
class Team(models.Model):
name = models.CharField(max_length=64)
class Person(models.Model):
first_name = models.CharField(max_length=64)
last_name = models.CharField(max_length=64)
team = models.ForeignKey(to=Team, null=True, blank=True, related_name='members')
This effectively makes the models scalable, and even if you are never going to have more than two people in a team, this is good practice.
Next comes the Championship model. I would connect this model directly with the Person model as a many-to-many relationship with a 'through' model as follows.
class Championship(models.Model):
name = models.CharField(max_length=64)
status = models.BooleanField(default=False) # This is not a great name for a field. I think should be more meaningful.
winners = models.ManyToManyField(to=Person, related_name='championships', through='Title')
class Title(models.Model):
championship = models.ForeignKey(to=Championship, related_name='titles')
winner = models.ForeignKey(to=Person, related_name='titles')
date = models.DateField(null=True, blank=True)
This is just the way I would've done it, based on what I understood. I am sure I did not understand everything that you're trying to do. As my understanding changes, I might modify these models to suit my need.
Another approach that can be taken is by using a GenericForeignKey field to create a field that could be a FK to either the Team model or the Person model. Or another thing that can be changed could be you adding another model to hold details of each time a championship has been held. There are many ways to go about it, and no one correct way.
Let me know if you have any questions, or anything I haven't dealt with. I will try and modify the answer as per the need.

Django many-to-many lookup from different models

I have some models that represents some companies and their structure. Also all models can generate some Notifications (Notes). User can see own Notes, and, of course, can't see others.
class Note(models.Model):
text = models.CharField(...)
class Company(models.Model):
user = models.ForeignKey(User)
note = models.ManyToManyField(Note, blank='True', null='True')
class Department(models.Model):
company = models.ForeignKey(Company)
note = models.ManyToManyField(Note, blank='True', null='True')
class Worker(models.Model):
department = models.ForeignKey(Department)
note = models.ManyToManyField(Note, blank='True', null='True')
class Document(models.Model)
company = models.ForeignKey(Company)
note = models.ManyToManyField(Note, blank='True', null='True')
The question is how I can collect all Notes for particular user to show them?
I can do:
Note.objects.filter(worker__company__user=2)
But its only for Notes that was generated by Workers. What about another? I can try hardcoded all existing models, but if do so dozen of kittens will die!
I also tried to use backward lookups but got "do not support nested lookups". May be I did something wrong.
EDIT:
As I mentioned above I know how to do this by enumerating all models (Company, Worker, etc. ). But if I will create a new model (in another App for example) that also can generate Notes, I have to change code in the View in another App, and that's not good.
You can get the Notes of a user by using the following query:
For example let us think that a user's id is 1 and we want to keep it in variable x so that we can use it in query. So the code will be like this:
>>x = 1
>>Note.objects.filter(Q(**{'%s_id' % 'worker__department__company__user' : x})|Q(**{'%s_id' % 'document__company__user' : x})|Q(**{'%s_id' % 'company__user' : x})|Q(**{'%s_id' % 'department__company__user' : x})).distinct()
Here I am running OR operation using Q and distinct() at the end of the query to remove duplicates.
EDIT:
As I mentioned above I know how to do this by enumerating all models
(Company, Worker, etc. ). But if I will create a new model (in another
App for example) that also can generate Notes, I have to change code
in the View in another App, and that's not good.
In my opinion, if you write another model, how are you suppose to get the notes from that model without adding new query? Here each class (ie. Department, Worker) are separately connected to Company and each of the classes has its own m2m relation with Note and there is no straight connection to User with Note's of other classes(except Company). Another way could be using through but for that you have change the existing model definitions.
Another Solution:
As you have mentioned in comments, you are willing to change the model structure if it makes your query easier, then you can try the following solution:
class BaseModel(models.Model):
user = models.Foreignkey(User)
note = models.ManyToManyField(Note)
reports_to = models.ForeignKey('self', null=True, default=None)
class Company(BaseModel):
class Meta:
proxy = True
class Document(BaseModel):
class Meta:
proxy = True
#And so on.....
Advantages: No need to create separate table for document/company etc.
object creation:
>>c= Company.objects.create(user_id=1)
>>c.note.add(Note.objects.create(text='Hello'))
>>d = Document.objects.create(user_id=1, related_to=c)
>>d.note.add(Note.objects.create(text='Hello World'))

Django Model Design

I've previously asked questions for this project, regarding the Django Admin, Inlines, Generics, etc. Thanks again to the people who contributed answers to those. For background, those questions are here:
Django - Designing Model Relationships - Admin interface and Inline
Django Generic Relations with Django Admin
However, I think I should probably review my model again, to make sure it's actually "correct". So the focus here is on database design, I guess.
We have a Django app with several objects - Users (who have a User Profile), Hospitals, Departments, Institutions (i.e. Education Institutions) - all of whom have multiple addresses (or no addresses). It's quite likely many of these will have multiple addresses.
It's also possible that multiple object may have the same address, but this is rare, and I'm happy to have duplication for those instances where it occurs.
Currently, the model is:
class Address(models.Model):
street_address = models.CharField(max_length=50)
suburb = models.CharField(max_length=20)
state = models.CharField(max_length=3, choices=AUSTRALIAN_STATES_CHOICES)
...
content_type = models.ForeignKey(ContentType)
object_id = models.PositiveIntegerField()
content_object = generic.GenericForeignKey()
...
class Hospital(models.Model):
name = models.CharField(max_length=20)
address = generic.GenericRelation(Address)
...
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
address = generic.GenericRelation(Address)
...
Firstly - am I doing it right, with the FK field on each object that has address(es)?
Secondly, is there a better alternative to using Generic Relations for this case, where different objects all have addresses? In the other post, somebody mentioned using abstract classes - I thought of that, but there seems to be quite a bit of duplication there, since the addresses model is basically identical for all.
Also, we'll be heavily using the Django admin, so it has to work with that, preferrably also with address as inlines.(That's where I was hitting an issue - because I was using generic relationships, the Django admin was expecting something in the content_type and object_id fields, and was erroring out when those were empty, instead of giving some kind of lookup).
Cheers,
Victor
I think what you've done is more complicated than using some kind of subclassing (the base class can either be abstract or not).
class Addressable(models.Model):
... # common attributes to all addressable objects
class Hospital(Addressable):
name = models.CharField(max_length=20)
...
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
class Address(models.Model):
street_address = models.CharField(max_length=50)
suburb = models.CharField(max_length=20)
state = models.CharField(max_length=3, choices=AUSTRALIAN_STATES_CHOICES)
...
owner = models.ForeignKey(Addressable)
You should consider making the base class (Addressable) abstract, if you don't want a seperate table for it in your database.

does custom user class break applications in django?

Let's say that I have subclassed User model (CustomUser) properly (as explained here: http://scottbarnham.com/blog/2008/08/21/extending-the-django-user-model-with-inheritance/)
and installed the comments app.
to access the user of a comment in the template I write:
{{comment.user}} # which provides User, not my CustomUser
and therefore,
{{comment.user.CustomProperty}} #does not work.
How can I work around it?
The ForeignKey from comments.Comment is to django's built-in User object, so querying comment.user will give you the parent object (ie: the base User model). However, django inheritance does provide a way to get the subclass version from the superclass:
{{ comment.user.customeruser }}
Which would then allow you to do:
{{ comment.user.customeruser.customproperty }}
I happen to think this is a weakness in Django's inheritance implementation since it doesn't exactly mirror the behaviour you'd expect from object inheritance in Python generally, but at least there is a workaround. Since I'm hardly rushing out to submit a patch with my version of the right behaviour, I can't complain :-)
I agree with Carl Meyer's comment: it could expensive to automatically fetch the subclass without altering the parent model's db table, and returning the instance of the subclass from the parent class query would be inconsistent with Django's promise that a queryset returns the model on which the queryset was run.
I still find in practice, however, that Django's inheritance leads to some awkward extra steps on occasion. Having used Django since 0.91, and knowing that all the different strategies for resolving object-relational mapping issues have tradeoffs, I'm very happy to have inheritance in Django now, and feel that the current implementation is excellent... so I'd hate for my original answer to be construed as a slight against the project.
As such, I thought I would edit this answer to link to an answer Carl himself provided on a solution in cases where you don't know what type the subclass is: How do I access the child classes of an object in Django without knowing the name of the child class?. He offers advice there for using the ContentType framework. Still some indirection involved, but a good, generalizable, option in the toolkit.
As you can see in the comments of that post, it is still controversially discussed, what is the best way.
I tried it by subclassing too, but I ran in many problems, while using profiles work perfectly for me.
class IRCUser(models.Model):
user = models.ForeignKey(User, unique=True)
name = models.CharField(max_length=100, blank= True, null = True )
friends = models.ManyToManyField("IRCUser", blank= True, null = True)
dataRecieved= models.BooleanField(default=False)
creating an IRCUser works like this:
>>> IRCUser(user = User.objects.get(username='Kermit')).save()
EDIT: Why are user_profiles elegant:
Let's assume, we are writing a webapp, that will behave as a multi-protocol chat. The users can provide their accounts on ICQ, MSN, Jabber, FaceBook, Google Talk .....
We are free to create a custom user class by inheritance, that will hold all the additional informations.
class CustomUser(User):
irc_username = models.CharField(blank=True, null=True)
irc_password = models.PasswordField(blank=True, null=True)
msn_username = models.CharField(blank=True, null=True)
msn_password = models.PasswordField(blank=True, null=True)
fb_username = models.CharField(blank=True, null=True)
fb_password = models.PasswordField(blank=True, null=True)
gt_username = models.CharField(blank=True, null=True)
gt_password = models.PasswordField(blank=True, null=True)
....
....
this leads to
data-rows with a lot of zero-values
tricky what-if-then validation
the impossibility, to have more the one account with the same service
So now let's do it with user_profiles
class IRCProfile(models.Model):
user = models.ForeignKey(User, unique=True, related_name='ircprofile')
username = models.CharField()
password = models.PasswordField()
class MSNProfile(models.Model):
user = models.ForeignKey(User, unique=True, related_name='msnprofile')
username = models.CharField()
password = models.PasswordField()
class FBProfile(models.Model):
user = models.ForeignKey(User, unique=True, related_name='fbprofile')
username = models.CharField()
password = models.PasswordField()
the result:
User_profiles can be created when needed
the db isn't flooded by zero-values
n profiles of same type can be assigned to one user
validation is easy
this may lead to a more cryptic syntax in the templates, but we are free to have some shortcuts in our views/template_tags or to use {% with ... %} to flavour it as we want.
I don’t think there is a way around it, because as you said, comment.user is a User, not a CustomUser.
See http://docs.djangoproject.com/en/dev/topics/db/models/#proxy-models:
QUERYSETS STILL RETURN THE MODEL THAT WAS REQUESTED
There is no way to have Django return, say, a MyUser object whenever you query for User objects. A queryset for User objects will return those types of objects. The whole point of proxy objects is that code relying on the original User will use those and your own code can use the extensions you included (that no other code is relying on anyway). It is not a way to replace the User (or any other) model everywhere with something of your own creation.
Maybe in this case, the old way of using UserProfile would be a better choice?
Sorry if I’m not helping much.