Customizing the entry uniqueness in Django

Customizing the entry uniqueness in Django - django

I have a database containing a list of ingredients. I'd like to avoid duplicate entries in this table. I don't want to use the unique keyword for 2 reasons :
My uniqueness constraints are a bit more sophisticated than a mere =
I don't want to raise an exception when a pre-existing ingredient model is created, instead I just want to return that model, so that I can write Ingredient(ingredient_name='tomato') and just go on with my day rather than encapsulating all of that in a try clause. This will allow me to easily add ingredients to my recipe table on the fly.
One solution is simply to have a wrapper function like create_ingredient, but I don't find that to be particularly elegant and more specifically it's not robust to some other developer down the line simply forgetting to use the wrapper. So instead, I'm playing around with the pre_init and post_init signals.
Here's what I have so far :
class Ingredient(models.Model):
ingredient_name = models.CharField(max_length=200)
recipes = models.ManyToManyField(Recipe,related_name='ingredients')
def __str__(self):
return self.ingredient_name
class Name(models.Model):
main_name = models.CharField(max_length=200, default=None)
equivalent_name = models.CharField(max_length=200, primary_key=True, default=None)
def _add_ingredient(sender, args, **kwargs):
if 'ingredient_name' not in kwargs['kwargs'] :
return
kwargs['kwargs']['ingredient_name'] = kwargs['kwargs']['ingredient_name'].lower()
# check if equivalent name exists, make this one the main one otherwise
try:
kwargs['kwargs']['ingredient_name'] = Name.objects.filter(
equivalent_name=kwargs['kwargs']['ingredient_name']
)[0].main_name
except IndexError:
name = Name(main_name=kwargs['kwargs']['ingredient_name'],
equivalent_name=kwargs['kwargs']['ingredient_name'])
name.save()
pre_init.connect(_add_ingredient, Ingredient)
So far so good. This actually works and will replace ingredient_name when needed before the model is initialized. Now what I'd like is to check if the ingredient in question already exists and have the initializer return it if it does. I think I need to play around with post_init to do this but I don't know how to modify the particular instance that's being created. Here's what I mean by that :
def _finalize_ingredient(sender, instance, **kwargs):
try:
# doesn't work because of python's "pass arguments in python's super unique way of doing things" thing
instance = Ingredient.objects.filter(ingredient_name=instance.ingredient_name)[0]
except IndexError:
pass
post_init.connect(_finalize_ingredient, Ingredient)
As I've commented, I don't expect this to work because instance = ... doesn't actually modify instance, it just reassigns the variable name (incidentally if you try to run this all sorts of terrible things happen which I don't care to understand because I know this is flat out wrong). So how do I actually do this ? I really hope wrapper functions aren't the cleanest option here. I'm a big fan of OOP and gosh darn it I want an OOP solution to this (which, as I've said, I think in the long run would be much more robust and safer than wrappers).
I realize of course that I can add an add_ingredient method to Recipe which will do all of this for me, but I really like the idea of containing all of this in my Ingredient class as it will guarantee the proper database behavior under any circumstance. I'm also curious as to know if/how the post_init method can be used to completely override the created object for a given circumstance.
By the way, some of you may be wondering why I don't have a ForeignKey entry in my Name class that would connect the Name table to the Ingredient table. After all, isn't this what my check is essentially accomplishing in my _add_ingredient method ? One of the reasons is that if I do this then I end up with the same problem I'm trying to solve here : If I want to create an ingredient on the fly to add it to my recipe, I could simply create a Name object when creating an Ingredient object, but that would raise an exception if it corresponds to a main_name that is already in use (rather than simply returning the object I need).

I believe you are looking for get_or_create(), which is already a built-in in Django.
You mention:
One solution is simply to have a wrapper function like create_ingredient, but I don't find that to be particularly elegant and more specifically it's not robust to some other developer down the line simply forgetting to use the wrapper.
Well, look at it the other way around. What if you actually need to create a "duplicate" ingredient? Then it is nice to have the possibility.

I've come up with something that is as elegant and robust as I think it's possible to be given what I'm after. I've still had to define an add_ingredient method, but I still have the robustness that I need. I've made it so that it can be generalized to any class with a primary key, and the Name table will contain the info that will define the name uniqueness of any table :
class Name(models.Model):
main_name = models.CharField(max_length=200, default=None)
equivalent_name = models.CharField(max_length=200, primary_key=True, default=None)
def _pre_init_unique_fetcher(sender, args, **kwargs):
pk_name = sender._meta.pk.name
if pk_name not in kwargs['kwargs'] :
return
kwargs['kwargs'][pk_name] = kwargs['kwargs'][pk_name].lower()
# check if equivalent name exists, make this one the main one otherwise
try:
kwargs['kwargs'][pk_name] = Name.objects.filter(
equivalent_name=kwargs['kwargs'][pk_name]
)[0].main_name
except IndexError:
name = Name(main_name=kwargs['kwargs'][pk_name],
equivalent_name=kwargs['kwargs'][pk_name])
name.save()
sender._input_dict = kwargs['kwargs']
def _post_init_unique_fetcher(sender, instance, **kwargs):
pk_name = sender._meta.pk.name
pk_instance = instance.__dict__[pk_name]
filter_dict = {}
filter_dict[pk_name] = pk_instance
try:
post_init.disconnect(_post_init_unique_fetcher,sender)
instance.__dict__ = sender.objects.filter(**filter_dict)[0].__dict__
post_init.connect(_post_init_unique_fetcher, sender)
for key in sender._input_dict:
instance.__dict__[key] = sender._input_dict[key]
del sender._input_dict
except IndexError:
post_init.connect(_post_init_unique_fetcher, sender)
except:
post_init.connect(_post_init_unique_fetcher, sender)
raise
unique_fetch_models = [Ingredient, Recipe, WeekPlan]
for unique_fetch_model in unique_fetch_models :
pre_init.connect(_pre_init_unique_fetcher, unique_fetch_model)
post_init.connect(_post_init_unique_fetcher, unique_fetch_model)
Now what this will do is load up any new model with the pre-existing data of the previous model (rather than the default values) if one with the same name exists. The reason I still need an add_ingredient method in my Recipe class is because I can't call Ingredient.objects.create() for a pre-existing ingredient without raising an exception despite the fact that I can create the model and immediately save it. This has to do with how Django handles the primary_key designation : if you create the model then save it, it assumes you're just updating the entry if it already exists with that key, and yet if you create it, it tries to add another entry and that conflicts with the primary_key designation. So now I can do things like recipe.add_ingredient(Ingredient(ingredient_name='tomato', vegetarian=True)).

Related

django distinct query using custom equivalence

Say that my model looks like this:
class Alert(models.Model):
datetime_alert = models.DateTimeField()
alert_type = models.ForeignKey(Alert_Type, on_delete=models.CASCADE)
dismissed = models.BooleanField(default=False)
datetime_dismissed = models.DateTimeField(null=True)
auid = models.CharField(max_length=64, unique=True)
entities = models.ManyToManyField(to='Entity', through='Entity_To_Alert_Map')
objects = Alert_Manager()
def __eq__(self, other):
return isinstance(other,
self.__class__) and self.alert_type == other.alert_type and \
self.entities.all() == other.entities().all() and self.dismissed == other.dismissed
def __ne__(self, other):
return not self.__eq(other)
what I'm trying to accomplish is say this: two alert objects are equivalent if the dismissed status, alert type, and the associated entities are the same. Using this idea, is it possible to write a query to ask for all the distinct alerts based off that criteria? Selecting all of them and then filtering them out doesn't seem appealing.

You mention one method to do it, and I don't think it is very bad. I'm not aware of anything in Django that can do this.
However, I want you to think why this problem arises? If two alerts are equal if message, status and type is the same, then maybe this should be it's own class. I would consider creating another class DistinctAlert (or some better name) and have a foreign key to this class from Alert. Or even better, have one class that is Alert, and one that is called AlertEvent(your Alert class).
Would this solve your problem?
Edit:
Actually, there is a way to do this. You can combine values() and distinct(). This way, your query will be
Alert.objects.all().values("alert_type", "dismissed", "entities").distinct()
This will return a dictionary.
See more in the documentation of values()

Implementing multiple person relationship

I've done a facebook like model, but I want the Personne to have more than one link with another Personne.
I have an intermediary table PersonneRelation with a custom save method. The idea is: when I add a relation to a person, I want to create another relation the other way. The problem is that if I try to save in the save method it's a recursive call. So my idea was to create a variable of the class and set it to True only when I want to avoid recursion.
Here's how I did:
class Personne(models.Model):
user = models.OneToOneField(User)
relations = models.ManyToManyField('self', through='PersonneRelation',
symmetrical=False)
class PersonneRelation(models.Model):
is_saving = False
# TAB_TYPES omitted for brevity
type_relation = models.CharField(max_length=1,
choices=[(a, b) for a, b in
list(TAB_TYPES.items())],
default=TYPE_FRIEND)
src = models.ForeignKey('Personne', related_name='src')
dst = models.ForeignKey('Personne', related_name='dst')
opposite = models.ForeignKey('PersonneRelation',
null=True, blank=True, default=None)
def save(self, *args, **kwargs):
if self.is_saving:
return super(PersonneRelation, self).save(args, kwargs)
old = None
if self.pk and self.opposite:
old = self.type_relation
retour = super(PersonneRelation, self).save(args, kwargs)
if old:
PersonneRelation.objects.filter(
src=self.dst, dst=self.src, opposite=self, type_relation=old
).update(type_relation=self.type_relation)
if self.opposite is None:
self.opposite = PersonneRelation(
src=self.dst, dst=self.src, opposite=self,
type_relation=self.type_relation, is_reverse=True)
self.opposite.save()
self.is_saving = True
self.save()
self.is_saving = False
return retour
My question is: is it safe to do so (using a class variable is_saving) (I dont know how Python deals with such variables)? If not, why? I feel like it's not ok, so what are the other possibilities to implement multiple many to many relationship that should behave like that?

Unfortunately, it's not safe, because it's not thread-safe. When two simultaneous Django threads will try to save your model, the behaviour can be unpredictable.
If you want to have more reliable locking, take a look, for example, at the Redis locking.
But to be honest, I'd try to implement it using plain reverse relations, maybe incapsulating the complexity into the ModelManager.

Here's how I modified it: I totally removed the save method and used the post_save message to check:
if it was created without opposite side, I create here with opposite side as the one created (and I can do it here without any problem!) then I update the one created with the "opposite"
if it wasn't created, this is an update, so just make sure the opposite side is changed as well.
I did this because I'll almost never have to change relationships between people, and when I'll create new ones there wont be any possible race conditions, because of the context where I will create new relationships
#receiver(post_save, sender=PersonneRelation)
def signal_receiver(sender, **kwargs):
created = kwargs['created']
obj = kwargs['instance']
if created and not obj.opposite:
opposite = PersonneRelation(
src=obj.dst, dst=obj.src, opposite=obj,
type_relation=obj.type_relation, is_reverse=True)
opposite.save()
obj.opposite = opposite
obj.save()
elif not created and obj.type_relation != obj.opposite.type_relation:
obj.opposite.type_relation = obj.type_relation
obj.opposite.save()

If I get the idea behind your code, then:
Django automatically makes relation available on both ends so you can get from src Personne to dst Personne via PersonneRelation and reverse dst -> src in your code. Therefore no need for additional opposite field in PersonneRelation.
If you need to have both symmetrical and asymmetrical realtions, i.e. src -> dst, but not dst -> src for particaular record, then I would suggest to add boolean field:
class PersonneRelation(models.Model):
symmetrical = models.BooleanField(default=False)
this way you can check if symmetrical is True when accessing relation in your code to identify if it's scr -> dst only or both src -> dst and dst -> src. In facebook terms: if symmetrical is False you get src is subscriber of dst, if it's True you get mutual friendship between src and dst. You might want to define custom manager to incapsulate this behavior, though it's more advanced topic.
If you need to check if the model instance is being saved or updated, there's no need in is_saving boolean field. Since you're using automatic primary key field, you can just check if pk on model instance is None. In Django before the model instance is first time saved to DB ('created') pk is None, when the instance is 'updated' (it has been read from DB before and is being saved now with some field values changed) it's pk is set to pk value from DB. This is the way Django ORM decides if it should update or create new record.
In general when redefining Save method on a model, or when using signals like pre_save/post_save take into consideration, that those functions you define on them might not be called by Django in some circumstances, i.e. when the model is updated in bulk. See Django docs for more info.

Django : Validate data by querying the database in a model form (using custom clean method)

I am trying to create a custom cleaning method which look in the db if the value of one specific data exists already and if yes raises an error.
I'm using a model form of a class (subsystem) who is inheriting from an other class (project).
I want to check if the sybsystem already exists or not when i try to add a new one in a form.
I get project name in my view function.
class SubsytemForm(forms.ModelForm):
class Meta:
model = Subsystem
exclude = ('project_name')
def clean(self,project_name):
cleaned_data = super(SubsytemForm, self).clean(self,project_name)
form_subsystem_name = cleaned_data.get("subsystem_name")
Subsystem.objects.filter(project__project_name=project_name)
subsystem_objects=Subsystem.objects.filter(project__project_name=project_name)
nb_subsystem = subsystem_objects.count()
for i in range (nb_subsystem):
if (subsystem_objects[i].subsystem_name==form_subsystem_name):
msg = u"Subsystem already existing"
self._errors["subsystem_name"] = self.error_class([msg])
# These fields are no longer valid. Remove them from the
# cleaned data.
del cleaned_data["subsystem_name"]
return cleaned_data
My view function :
def addform(request,project_name):
if form.is_valid():
form=form.save(commit=False)
form.project_id=Project.objects.get(project_name=project_name).id
form.clean(form,project_name)
form.save()
This is not working and i don't know how to do.
I have the error : clean() takes exactly 2 arguments (1 given)
My model :
class Project(models.Model):
project_name = models.CharField("Project name", max_length=20)
Class Subsystem(models.Model):
subsystem_name = models.Charfield("Subsystem name", max_length=20)
projects = models.ForeignKey(Project)

There are quite a few things wrong with this code.
Firstly, you're not supposed to call clean explicitly. Django does it for you automatically when you call form.is_valid(). And because it's done automatically, you can't pass extra arguments. You need to pass the argument in when you instantiate the form, and keep it as an instance variable which your clean code can reference.
Secondly, the code is actually only validating a single field. So it should be done in a specific clean_fieldname method - ie clean_subsystem_name. That avoids the need for mucking about with _errors and deleting the unwanted data at the end.
Thirdly, if you ever find yourself getting a count of something, iterating through a range, then using that index to point back into the original list, you're doing it wrong. In Python, you should always iterate through the actual thing - in this case, the queryset - that you're interested in. However, in this case that is irrelevant anyway as you should query for the actual name directly in the database and check if it exists, rather than iterating through checking for matches.
So, putting it all together:
class SubsytemForm(forms.ModelForm):
class Meta:
model = Subsystem
exclude = ('project_name')
def __init__(self, *args, **kwargs):
self.project_name = kwargs.pop('project_name', None)
super(SubsystemForm, self).__init__(*args, **kwargs)
def clean_subsystem_name(self):
form_subsystem_name = self.cleaned_data.get("subsystem_name")
existing = Subsystem.objects.filter(
project__project_name=self.project_name,
subsytem_name=form_subsystem_name
).exists()
if existing:
raise forms.ValidationError(u"Subsystem already existing")
return form_subsystem_name

When you do form=form.save(commit=False) you store a Subsystem instance in the variable form but the clean method is defined in SubsystemForm. Isn't it?

Django QuerySets - with a class method

Below is a stripped down model and associated method. I am looking for a simple way upon executing a query to get all of the needed information in a single answer without having to re-query everything. The challenge here is the value is dependent upon the signedness of value_id.
class Property(models.Model):
property_definition = models.ForeignKey(PropertyDefinition)
owner = models.IntegerField()
value_id = models.IntegerField()
def get_value(self):
if self.value_id < 0: return PropertyLong.objects.get(id=-self.value_id)
else: return PropertyShort.objects.get(id=self.value_id)
Right now to get the "value" I need to do this:
object = Property.objects.get(property_definition__name="foo")
print object.get_value()
Can someone provide a cleaner way to solve this or is it "good" enough? Ideally I would like to simply just do this.
object = Property.objects.get(property_definition__name="foo")
object.value
Thanks

Given this is a bad design. You can use the builtin property decorator for your method to make it act as a property.
class Property(models.Model):
property_definition = models.ForeignKey(PropertyDefinition)
owner = models.IntegerField()
value_id = models.IntegerField()
#property
def value(self):
if self.value_id < 0: return PropertyLong.objects.get(id=-self.value_id)
else: return PropertyShort.objects.get(id=self.value_id)
This would enable you to do what you'd ideally like to do: Property.objects.get(pk=1).value
But I would go as far as to call this "cleaner". ;-)
You could go further and write your own custom model field by extending django.models.Field to hide the nastiness in your schema behind an API. This would at least give you the API you want now, so you can migrate the nastiness out later.
That or the Generic Keys mentioned by others. Choose your poison...

this is a bad design. as Daniel Roseman said, take a look at generic foreign keys if you must reference two different models from the same field.
https://docs.djangoproject.com/en/1.3/ref/contrib/contenttypes/#generic-relations

Model inheritance could be used since value is not a Field instance.

Django - Lazy results with a context processor

I am working on a django project that requires much of the common page data be dynamic. Things that appear on every page, such as the telephone number, address, primary contact email etc all need to be editable via the admin panel, so storing them in the settings.py file isn't going to work.
To get around this, I created a custom context processor which acts as an abstract reference to my "Live Settings" model. The model looks like this:
class LiveSetting(models.Model):
id = models.AutoField(primary_key=True)
title = models.CharField(max_length=255, blank=False, null=False)
description = models.TextField(blank=True, null=True)
key = models.CharField(max_length=100, blank=False, null=False)
value = models.CharField(max_length=255, blank=True)
And the context processor like so:
from livesettings.models import LiveSetting
class LiveSettingsProcessor(object):
def __getattr__(self, request):
val = LiveSetting.objects.get(request)
setattr(self, val.key, val.value)
return val.value
l = LiveSettingsProcessor()
def livesetting_processors(request):
return {'settings':l}
It works really nicely, and in my template I can use {{ settings.primary_email }} and get the corresponding value from the database.
The problem with the above code is it handles each live setting request individually and will hit the database each time my {{ settings.*}} tag is used in a template.
Does anyone have any idea if I could make this process lazy, so rather than retrieve the value and return it, it instead updates a QuerySet then returns the results in one hit just before the page is rendered?

You are trying to invent something complex and these is no reason for that. Something as simple as this will work fork you good enough:
def livesetting_processors(request):
settings = LiveSetting.objects.get(request)
return {'settings':settings}
EDIT:
This is how you will solve your problem in current implementation:
class LiveSettingsProcessor(object):
def __getattr__(self, request):
val = getattr(self, '_settings', LiveSetting.objects.get(request))
setattr(self, val.key, val.value)
return val.value
#Hanpan, I've updated my answer to show how you can to solve your problem, but what I want to say is that things you are trying to achieve does not give any practical win, however it increase complexity ant it takes your time. It might also be harder to setup cache on all of this later. And with caching enabled this will not give any improvements in performance at all.
I don't know if you heard this before: premature optimization is the root of evil. I think this thread on SO is useful to read: https://stackoverflow.com/questions/211414/is-premature-optimization-really-the-root-of-all-evil

Maybe you could try Django's caching?
In particular, you may want to check out the low-level caching feature. It seems like it would be a lot less work than what you plan on.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js