I'd like to know how I can design my Django model to achieve the following:
Road -> Category (required): Highway (select list)
Road -> Attribute (optional): Traffic -> Heavy + Moderate (checkboxes)
Road -> Attribute (optional): Condition -> Smooth + Rough + Average (checkboxes)
Does it make sense to include TRAFFIC_CHOICES, CONDITION_CHOICES under the Road class vs creating separate classes for each set of choices vs creating a generic Attribute class?
How do I display choices as checkboxes?
The end goal of this model is to be able to create queries such as "Highway roads that are smooth with no traffic"
Here is my attempt:
class Category(models.Model):
    CATEGORY_CHOICES = (
        ('highway', 'Highway'),
        ('parkway', 'Parkway'),
    )
    name = models.CharField(max_length=1, choices=CATEGORY_CHOICES, blank=False)

class Road(models.Model):
    name = models.TextField(blank=False)
    TRAFFIC_CHOICES = (
        ('moderate', 'Moderate'),
        ('busy', 'Busy'),
    )
    traffic = models.CharField(max_length=1, choices=TRAFFIC_CHOICES)
    CONDITION_CHOICES = (
        ('smooth', 'Smooth'),
        ('rough', 'Rough'),
        ('average', 'Average'),
    )
    condition = models.CharField(max_length=1, choices=CONDITION_CHOICES)
First, change the first models.TextField to a CharField like the others.
Category does not have to be a separate model, unless you intend to add new categories after your application is finished, in which case it must be a separate model and you should use a ForeignKey relationship from the Road to the Category and ditch the CATEGORY_CHOICES.
Assuming you do not ever intend to add new categories, you can get rid of the Category model entirely and put the CATEGORY_CHOICES into Road. Then change name = to category = and put that into Road too.
You have a max_length of 1 for all of those fields, which is fine, but in that case you need to make the CHOICES map to single characters so they fit in the field. For instance:
CATEGORY_CHOICES = (
    ('H', 'Highway'),
    ('P', 'Parkway'),
)
Why do you want to use checkboxes for your traffic and condition choices? Checkboxes mean that you can select multiple answers instead of being forced to select just one. This doesn't make sense conceptually for road condition or traffic, and it isn't compatible with a CharField (because a CharField can only store one value, without some very contrived setup). You could keep the system you have and use radio buttons if you prefer them over dropdowns, but you can't use checkboxes without getting rid of the choices and instead making each possible checked box its own BooleanField (or BooleanField(null=True) if you need an "unknown" state; the older NullBooleanField was removed in Django 4.0).
I usually put my choices outside of the class definition. Defining them inside the class also works (the Django documentation does it that way, and you can then refer to them as Road.TRAFFIC_CHOICES), so this is a matter of preference. I'm going to move them outside the class definition for my example.
To summarize (and I'm changing the name field to a CharField because TextField is only for really large blocks and not for names):
CATEGORY_CHOICES = (
    ('H', 'Highway'),
    ('P', 'Parkway'),
)
TRAFFIC_CHOICES = (
    ('M', 'Moderate'),
    ('B', 'Busy'),
)
CONDITION_CHOICES = (
    ('S', 'Smooth'),
    ('R', 'Rough'),
    ('V', 'Varying'),
)

class Road(models.Model):
    name = models.CharField(max_length=512, blank=False)
    category = models.CharField(max_length=1, choices=CATEGORY_CHOICES, blank=False)
    traffic = models.CharField(max_length=1, choices=TRAFFIC_CHOICES)
    condition = models.CharField(max_length=1, choices=CONDITION_CHOICES)
Edit: If you really are sure that checkboxes are the best fit, then you have two main choices. As above, if you intend to add choices later after your application is finished, then your strategy is different (you need to use a ManyToManyField). Otherwise you could implement them like this:
class Road(models.Model):
    name = models.CharField(max_length=512, blank=False)
    category = models.CharField(max_length=1, choices=CATEGORY_CHOICES, blank=False)
    # BooleanField(null=True) replaces NullBooleanField, which was removed in Django 4.0.
    moderate_traffic = models.BooleanField(null=True)
    heavy_traffic = models.BooleanField(null=True)
    smooth_condition = models.BooleanField(null=True)
    rough_condition = models.BooleanField(null=True)
    varying_condition = models.BooleanField(null=True)
The part where you display them as grouped checkboxes specifically happens when you're displaying your form, not in the model.
You could also use some kind of bit field, but that's not included in Django by default -- you'd have to install an extension.
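To make the bit-field idea concrete, here is a minimal plain-Python sketch. It is not the Django extension mentioned above; the names Condition, pack and unpack are made up for illustration, and it assumes you would store the resulting integer in an ordinary integer column.

```python
from enum import IntFlag


class Condition(IntFlag):
    """Each checkbox gets its own bit, so any combination fits in one integer."""
    SMOOTH = 1
    ROUGH = 2
    VARYING = 4


def pack(*flags):
    """Combine the checked boxes into a single integer for storage."""
    value = Condition(0)
    for flag in flags:
        value |= flag
    return int(value)


def unpack(value):
    """Return the names of the boxes that are checked in a stored integer."""
    stored = Condition(value)
    return [flag.name for flag in Condition if flag & stored]


stored = pack(Condition.SMOOTH, Condition.ROUGH)
checked = unpack(stored)
```

The trade-off is that the database only sees an opaque integer, so filtering on a single checkbox needs bitwise operations rather than a plain column comparison.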
Related
Let's say I have the following models:
class Color(models.Model):
    name = models.CharField(max_length=255, unique=True)
    users = models.ManyToManyField(User, through="UserColor", related_name="colors")

class UserColor(models.Model):
    class Meta:
        unique_together = (("user", "color"), ("user", "rank"))

    user = models.ForeignKey(User, on_delete=models.CASCADE)
    color = models.ForeignKey(Color, on_delete=models.CASCADE)
    rank = models.PositiveSmallIntegerField()
I want to fetch all users from the database with their respective colors and color ranks. I know I can do this by traversing across the through model, which makes a total of 3 DB hits:
from django.db.models import Prefetch

users = User.objects.prefetch_related(
    Prefetch(
        "usercolor_set",
        queryset=UserColor.objects.order_by("rank").prefetch_related(
            Prefetch("color", queryset=Color.objects.only("name"))
        ),
    )
)
for user in users:
    for usercolor in user.usercolor_set.all():
        print(user, usercolor.color.name, usercolor.rank)
I discovered another way to do this by annotating the rank onto the Color objects, which makes sense because we have a unique constraint on user and color.
from django.db.models import F, Prefetch

users = User.objects.prefetch_related(
    Prefetch(
        "colors",
        queryset=(
            Color.objects.annotate(rank=F("usercolor__rank"))
            .order_by("rank")
            .distinct()
        ),
    )
)
for user in users:
    for color in user.colors.all():
        print(user, color, color.rank)
This approach comes with several benefits:
Makes only 2 DB hits instead of 3.
Don't have to deal with the through object, which I think is more intuitive.
However, it only works if I chain distinct() (otherwise I get duplicate objects) and I'm worried this may not be a legit approach (maybe I just came up with a hack that may not work in all cases).
So is the second solution legit? Is there a better way to do it? Or should I stick to the first one?
I have some models in Django:
# models.py, simplified here
class Category(models.Model):
    """The category an inventory item belongs to. Examples: car, truck, airplane"""
    name = models.CharField(max_length=255)

class UserInterestCategory(models.Model):
    """
    How interested is a user in a given category. `interest` can be set by any method, maybe a neural network or something like that
    """
    user = models.ForeignKey(User, on_delete=models.CASCADE)  # user is the stock Django user
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    interest = models.PositiveIntegerField(default=0, validators=[MinValueValidator(0)])

class Item(models.Model):
    """This is a product that we have in stock, which we are trying to get a User to buy"""
    model_number = models.CharField(max_length=40, default="New inventory item")
    product_category = models.ForeignKey(Category, null=True, blank=True, on_delete=models.SET_NULL, verbose_name="Category")
I have a list view showing items, and I'm trying to sort by user_interest_category for the currently logged in user.
I have tried a couple different querysets and I'm not thrilled with them:
primary_queryset = Item.objects.all()

# this one works, and it's fast, but only finds items the user ALREADY has an interest in --
primary_queryset = primary_queryset.filter(
    product_category__userinterestcategory__user=self.request.user
).annotate(
    recommended=F('product_category__userinterestcategory__interest')
)

# this one works great but the baby jesus weeps at its slowness
# probably because we are iterating through every user, item, and userinterestcategory in the db
primary_queryset = primary_queryset.annotate(
    recommended=Case(
        When(
            product_category__userinterestcategory__user=self.request.user,
            then=F('product_category__userinterestcategory__interest'),
        ),
        default=Value(0),
        output_field=IntegerField(),
    )
)

# this one works, but it's still a bit slow -- 2-3 seconds per query:
interest = Subquery(
    UserInterestCategory.objects.filter(
        category=OuterRef('product_category'), user=self.request.user
    ).values('interest')[:1]  # a scalar subquery must return at most one row
)
# annotate() needs a keyword name for a Subquery expression
primary_queryset = primary_queryset.annotate(recommended=interest)
The third method is workable, but it doesn't seem like the most efficient way to do things. Isn't there a better method than this?
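For reference, the result being asked for (every item, with the current user's interest or 0 when there is none) corresponds in SQL to a LEFT JOIN whose ON clause carries the user condition. The sketch below shows that shape with the standard-library sqlite3 module; the table names, column names and sample rows are assumptions made up for illustration, not the schema Django would generate. Django's FilteredRelation is intended for this kind of conditional join, though whether it would be faster here would need measuring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, model_number TEXT, product_category_id INTEGER);
CREATE TABLE userinterestcategory (
    id INTEGER PRIMARY KEY, user_id INTEGER, category_id INTEGER, interest INTEGER
);
-- hypothetical sample data: categories 1 and 2, users 7 and 8
INSERT INTO item VALUES (1, 'C-100', 1), (2, 'T-200', 2), (3, 'X-300', NULL);
INSERT INTO userinterestcategory VALUES (1, 7, 1, 5), (2, 8, 1, 9), (3, 8, 2, 4);
""")


def recommended_for(user_id):
    """Every item with this user's interest in its category, or 0 if there is none."""
    return conn.execute(
        """
        SELECT i.model_number, COALESCE(u.interest, 0) AS recommended
        FROM item i
        LEFT JOIN userinterestcategory u
               ON u.category_id = i.product_category_id AND u.user_id = ?
        ORDER BY recommended DESC, i.model_number
        """,
        (user_id,),
    ).fetchall()


rows_user_7 = recommended_for(7)
rows_user_8 = recommended_for(8)
```

Because the user condition sits in the ON clause rather than in WHERE, items without a matching interest row are kept and fall back to 0 instead of being filtered out.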
I am running into a somewhat unique problem and wanted to see which solution fits best practice, or whether I am missing anything in my design.
I have a model - it has a field on it that represents a metric. That metric is a foreign key to an object which can come from several database tables.
Idea one:
Multiple ForeignKey fields. I'll have the benefits of the cascade options, direct access to the foreign key model instance from MyModel (although that's an easy property to add), and the related lookups. Pitfalls include needing to check an arbitrary number of fields on the model for an FK. Another is the logic to make sure that only one FK field has a value at a given time (easy to check pre-save, although .update() poses a problem because it bypasses that check). Then there's the added space in the database from all of the columns, although that is less concerning.
class MyModel(models.Model):
    source_one = models.ForeignKey(
        SourceOne,
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        db_index=True
    )
    source_two = models.ForeignKey(
        SourceTwo,
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        db_index=True
    )
    source_three = models.ForeignKey(
        SourceThree,
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        db_index=True
    )
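The "only one FK field has a value at a time" check from idea one can be sketched in plain Python as below. The helper name pick_single_source is hypothetical, and calling it from a save() or clean() method is an assumption about how you would wire it in. As the question notes, QuerySet.update() would skip it, so a database-level check constraint may be worth considering as well.

```python
def pick_single_source(**sources):
    """Return the name of the one source that is set, or None if none is set.

    Raises ValueError when more than one source has a value.
    """
    populated = [name for name, value in sources.items() if value is not None]
    if len(populated) > 1:
        raise ValueError(
            "only one source may be set, got: " + ", ".join(sorted(populated))
        )
    return populated[0] if populated else None


# Illustrative call with primary keys standing in for related objects.
active = pick_single_source(source_one=None, source_two=42, source_three=None)
```

The same helper also answers "which field currently holds the FK", which is the other pitfall mentioned for this design.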
Idea two:
Store a source_id and source on the model. Biggest concern I have with this is needing to maintain logic to set these fields to null if the source is deleted. It otherwise seems like a cleaner solution, but not sure if the overhead to make sure the data is accurate is worth it. I can probably write some logic in a delete hook on the fk models to clean MyModel up if necessary.
class MyModel(models.Model):
    ONE = 1
    TWO = 2
    THREE = 3
    SOURCES = (
        (ONE, "SourceOne"),
        (TWO, "SourceTwo"),
        (THREE, "SourceThree")
    )
    source_id = models.PositiveIntegerField(null=True, blank=True)
    source = models.PositiveIntegerField(null=True, blank=True, choices=SOURCES)
I would love the community's opinion.
Your second idea seems fragile as the integrity is not ensured by the database as you have pointed out yourself.
Without knowing more about the use case, it's difficult to give well-informed advice. However, if your "metric" object is referred to by many other tables, I wonder if you should consider approaching this the other way round, i.e. defining the relationships from the models consuming this metric.
To illustrate, let's say that your project is a photo gallery and that your model represents a tag. Tags could be associated with photos, photo albums or users (e.g. the tags they want to follow).
The approach would be as follows:
class Tag(models.Model):
    pass

class Photo(models.Model):
    tags = models.ManyToManyField(Tag)

class Album(models.Model):
    tags = models.ManyToManyField(Tag)

class User(AbstractUser):
    followed_tags = models.ManyToManyField(Tag)
You may even consider factoring this relationship into an abstract model, as outlined below:
class Tag(models.Model):
    pass

class TaggedModel(models.Model):
    tags = models.ManyToManyField(Tag)

    class Meta:
        abstract = True

class Photo(TaggedModel):
    pass
As mentioned in the comments, you are looking for a Generic Relation:
from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType

class SourceA(models.Model):
    name = models.CharField(max_length=45)

class SourceB(models.Model):
    name = models.CharField(max_length=45)

class MyModel(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    source = GenericForeignKey('content_type', 'object_id')
There are three parts to setting up a Generic Relation:
Give your model a ForeignKey to ContentType. The usual name for this field is “content_type”.
Give your model a field that can store primary key values from the models you’ll be relating to. For most models, this means a PositiveIntegerField. The usual name for this field is “object_id”.
Give your model a GenericForeignKey, and pass it the names of the two fields described above. If these fields are named “content_type” and “object_id”, you can omit this – those are the default field names GenericForeignKey will look for.
Now you can pass any Source instance to the source field of MyModel, regardless of which model it belongs to:
source_a = SourceA.objects.first()
source_b = SourceB.objects.first()
MyModel.objects.create(source=source_a)
MyModel.objects.create(source=source_b)
I have a non-ForeignKey ID from an external system that acts as my join key. I'd like to do Django ORM style queries using this ID.
My desired query is:
results = MyModel.objects.filter(level='M', children__name__contains='SOMETHING')
My model looks like this:
class MyModel(BaseModel):
    LEVELS = (
        ('I', 'Instance'),
        ('M', 'Master'),
        ('J', 'Joined')
    )
    level = models.CharField(max_length=2, choices=LEVELS, default='I')
    parent = models.ForeignKey('self', blank=True, null=True, related_name='children', on_delete=models.SET_NULL)
    master_id = models.CharField(max_length=200)
    name = models.CharField(max_length=300, blank=True, null=True)
This works fine with parent as a field, but parent is redundant with the master_id field: master_id indicates which children belong to which master node. I'd like to get rid of parent (primarily because the dataset is fairly large and setting the parent IDs when importing data takes a long time).
The SQL equivalent of what I'm looking for is:
SELECT DISTINCT s_m.master_id
FROM mytable s_m
JOIN mytable s_i
  ON s_i.level = 'I' AND s_m.level = 'M' AND s_i.master_id = s_m.master_id
WHERE s_i.name LIKE '%SOMETHING%';
I believe there's a way to use Manager or QuerySet to enable clean querying of children (in this case, the children's names) within the Django ORM framework, but I can't figure out how. Any pointers would be appreciated.
Can you try something like this?
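# The name filter belongs on the instance ('I') rows, matching the SQL above;
# both sides of the join are the same model.
MyModel.objects.filter(
    level='M',
    master_id__in=MyModel.objects.filter(level='I', name__contains='SOMETHING').values('master_id'),
).values_list('master_id', flat=True).distinct()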
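To check that filtering masters with an IN subquery over the instance rows gives the same master_ids as the self-join in the question, here is a small sketch using the standard-library sqlite3 module. The table layout mirrors the question's SQL; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, level TEXT, master_id TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO mytable (level, master_id, name) VALUES (?, ?, ?)",
    [
        ("M", "m1", "master one"),
        ("I", "m1", "has SOMETHING inside"),
        ("I", "m1", "SOMETHING again"),
        ("M", "m2", "master two"),
        ("I", "m2", "no match here"),
    ],
)

# Self-join, as written in the question.
join_rows = conn.execute(
    """
    SELECT DISTINCT s_m.master_id
    FROM mytable s_m
    JOIN mytable s_i ON s_i.master_id = s_m.master_id
    WHERE s_m.level = 'M' AND s_i.level = 'I' AND s_i.name LIKE '%SOMETHING%'
    """
).fetchall()

# IN subquery, the shape a master_id__in=<queryset> filter produces.
subquery_rows = conn.execute(
    """
    SELECT DISTINCT master_id
    FROM mytable
    WHERE level = 'M'
      AND master_id IN (
          SELECT master_id FROM mytable
          WHERE level = 'I' AND name LIKE '%SOMETHING%'
      )
    """
).fetchall()
```

Both queries return only the master whose instances match, even though that master has two matching children.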
The basic idea is that I want to track training and have a roster for each training session. I would like to also track who entered each person in the roster hence a table rather than just an M2M to the Member model within Training.
So, here is what I currently have:
class Training( models.Model ):
    name = models.CharField( max_length=100, db_index=True )
    date = models.DateField( db_index=True )
    roster = models.ManyToManyField( Member, through='TrainingRoster' )

class TrainingRoster( models.Model ):
    training = models.ForeignKey( Training )
    member = models.ForeignKey( Member )
    ## auto info
    entered_by = models.ForeignKey( Member, related_name='training_roster_entered_by' )
    entered_on = models.DateTimeField( auto_now_add = True )
The problem is that Django doesn't like the "roster = models.ManyToManyField( Member, through='TrainingRoster' )" line, as there are two fields in TrainingRoster with a ForeignKey to Member. I understand why it is unhappy, but is there not a way to specify something like through='TrainingRoster.member'? That doesn't work, but it seems like it should.
[I will admit that I am wondering if the "entered_by" and "entered_on" fields are the best fit for these models. I want to track who is entering each piece of information, but possibly a log table might be better than having the two extra fields in the TrainingRoster table. But that is a whole separate question, though it would make this question easier. :-) ]