Model integrity check in django

Model integrity check in django - django

I have a model named Entry that has the following fields
from django.contrib.auth.models import User
class Entry(models.Model):
start = models.DateTimeField()
end = models.DateTimeField()
creator = models.ForeignKey(User)
canceled = models.BooleanField(default=False)
When I create a new entry I don't want to be created if the creator has allready a created event between the same start and end dates. So my Idea was when user posts data from the creation form
if request.method == 'POST':
entryform = EntryAddForm(request.POST)
if entryform.is_valid():
entry = entryform.save(commit=False)
entry.creator = request.user
#check if an entry exists between start and end
if Entry.objects.get(creator=entry.creator, start__gte=entry.start, end__lte=entry.end, canceled=False):
#Add to message framework that an entry allready exists there
return redirect_to('create_form_page')
else:
#go on with creating the entry
I was thinking that maybe a unique field and checking properly db integrity would be better but it is the canceled field that's troubling me in to how to choose the unique fields. Do you think there will be somethng wrong with my method? I mean does it make sure that no entry will be set between start and end date for user is he has allready saved one? Do you think its better this code do go to the pre-save? The db will start empty, so after entering one entry, everything will take its course (assuming that...not really sure...)

You need to use Q for complex queries.
from django.db.models import Q
_exists = Entry.objects.filter(Q(
Q(Q(start__gte=entry.start) & Q(start__lte=entry.end)) |
Q(Q(end__gte=entry.start) & Q(end__lte=entry.end)) |
Q(Q(start__lte=enrty.start) & Q(end__gte=entry.end))
))
if _exists:
"There is existing event"
else:
"You can create the event"
Since I do not test this, I use Q objects whereever I thought would be necessary.
Using this query, you will not need any unique check.

Related

How to Update models field based on another field of same models

class PurchaseOrder(models.Model):
purchase_order_id = models.AutoField(primary_key=True)
purchase_order_number = models.CharField(unique=True)
vendor = models.ForeignKey(Vendor)
i am creating Purchase Order(po) table. when po created i have to update purchase_order_number as "PO0"+purchase_order_id ex PO0123 (123 is Primary key). so i am using def save in models to accomplish this
def save(self):
if self.purchase_order_id is not None:
self.purchase_order_number = "PO"+str(self.purchase_order_id)
return super(PurchaseOrder, self).save()
It is working fine with single creation but when i try to create bulk of data using locust(Testing tool) its giving an error duplicate entry for PurchseOrdernumber Can we modify field value in models itself some thing like this
purchase_order_number = models.CharField(unique=True,default=("PO"+self.purchase_order_id )

To be honest, I don't think it should work when you create multiple instances. Because as I can see from the code:
if self.purchase_order_id is not None:
self.purchase_order_number = "PO"+str(self.purchase_order_id)
Here purchase_order_id will be None when you are creating new instance. Also, until you call super(PurchaseOrder, self).save(), it will not generate purchase_order_id, meaning purchase_order_number will be empty.
So, what I would recommend is to not store this information in DB. Its basically the same as purchase_order_id with PO in front of it. Instead you can use a property method to get the same value. Like this:
class PurchaseOrder(models.Model):
purchase_order_id = models.AutoField(primary_key=True)
# need to remove `purchase_order_number = models.CharField(unique=True)`
...
#property
def purchase_order_number(self):
return "PO{}".format(self.purchase_order_id)
So, you can also see the purchase_order_number like this:
p = PurchaseOrder.objects.first()
p.purchase_order_number
Downside of this solution is that, you can't make any query on the property field. But I don't think it would be necessary anyway, because you can do the same query for the purchase_order_id, ie PurchaseOrder.objects.filter(purchase_order_id=1).

Checking for overlapping TimeField ranges

I have this model:
class Task(models.Model):
class Meta:
unique_together = ("campaign_id", "task_start", "task_end", "task_day")
campaign_id = models.ForeignKey(Campaign, on_delete=models.DO_NOTHING)
playlist_id = models.ForeignKey(PlayList, on_delete=models.DO_NOTHING)
task_id = models.AutoField(primary_key=True, auto_created=True)
task_start = models.TimeField()
task_end = models.TimeField()
task_day = models.TextField()
I need to write a validation test that checks if a newly created task time range overlaps with an existing one in the database.
For example:
A task with and ID 1 already has a starting time at 5:00PM and ends at 5:15PM on a Saturday. A new task cannot be created between the first task's start and end time. Where should I write this test and what is the most efficent way to do this? I also use DjangoRestFramework Serializers.

When you receive the form data from the user, you can:
Check the fields are consistent: user task_start < user task_end, and warn the user if not.
Query (SELECT) the database to retrieve all existing tasks which intercept the user time,
Order the records by task_start (ORDER BY),
Select only records which validate your criterion, a.k.a.:
task_start <= user task_start <= task_end, or,
task_start <= user task_end <= task_end.
warn the user if at least one record is found.
Everything is OK:
Construct a Task instance,
Store it in database.
Return success.
Implementation details:
task_start and task_end could be indexed in your database to improve selection time.
I saw that you also have a task_day field (which is a TEXT).
You should really consider using UTC DATETIME fields instead of TEXT, because you need to compare date AND time (and not only time): consider a task which starts at 23:30 and finish at 00:45 the day after…

This is how I solved it. It's not optimal by far, but I'm limited to python 2.7 and Django 1.11 and I'm also a beginner.
def validate(self, data):
errors = {}
task_start = data.get('task_start')
task_end = data.get('task_end')
time_filter = Q(task_start__range=[task_start, task_end])
| Q(task_end__range=[task_start, task_end])
filter_check = Task.objects.filter(time_filter).exists()
if task_start > task_end:
errors['error'] = u'End time cannot be earlier than start time!'
raise serializers.ValidationError(errors)
elif filter_check:
errors['errors'] = u'Overlapping tasks'
raise serializers.ValidationError(errors)
else:
pass
return data

Child model updates Parent model in Django ForeignKey relationship

Assuming the following models schema,
Parent model:
class Batch(models.Model):
start = models.DateTimeField()
end = models.DateTimeField()
One of many child models:
class Data(models.Model):
batch = models.ForeignKey(Batch, on_delete=models.ON_CASCADE)
timestamp = models.DateTimeField()
My goals is the following: to have a start field of parent model that is always updated when any child model is modified.
Basically, if the timestamp of a newly data instance is older than the start field I want the start field to be updated to that instance timestamp value. In the case of deletion of the data instance which is the oldest time reference point I want batch start field to be updated to the second oldest. Vice-versa for the end field.

One of the possible way to do this is to add post or pre-save signal of relative models and Update your necessary fields according to this. Django official documentation for signal, link. I want to add another link, one of the best blog post i have seen regarding django signal.
Edit for André Guerra response
One of easiest way to do a get call and bring Batch instance. What i want to say
#receiver(post_save,sender=Data)
def on_batch_child_saving(sender,instance,**kwargs):
batch_instance = Batch.objects.get(pk=instance.batch)
if (instance.timestamp < batch_instance.start):
batch_instance.start = instance.timestamp
batch_instance.save()
elif (instance.timestamp > batch_instance.end):
batch_instance.end = instance.timestamp
batch_instance.save()

Based on Shakil suggestion, I come up with this: (my doubt here was on how to save the parent model)
#receiver(post_save,sender=Data)
def on_batch_child_saving(sender,instance,**kwargs):
if (instance.timestamp < instance.batch.start):
instance.batch.start = instance.timestamp
instance.batch.save()
elif (instance.timestamp > instance.batch.end):
instance.batch.end = instance.timestamp
instance.batch.save()

Django: object creation in atomic transaction

I have a simple Task model:
class Task(models.Model):
name = models.CharField(max_length=255)
order = models.IntegerField(db_index=True)
And a simple task_create view:
def task_create(request):
name = request.POST.get('name')
order = request.POST.get('order')
Task.objects.filter(order__gte=order).update(order=F('order') + 1)
new_task = Task.objects.create(name=name, order=order)
return HttpResponse(new_task.id)
View shifts existing tasks that goes after newly created by + 1, then creates a new one.
And there are lots of users of this method, and I suppose something will go wrong one day with ordering because update and create definitely should be performed together.
So, I just want to be shure, will it be enough to avoid any data corruptions:
from django.db import transaction
def task_create(request):
name = request.POST.get('name')
order = request.POST.get('order')
with transaction.atomic():
Task.objects.select_for_update().filter(order__gte=order).update(order=F('order') + 1)
new_task = Task.objects.create(name=name, order=order)
return HttpResponse(new_task.id)
1) Probably, something more should be done in task creation line like select_for_update before filter of existing Task.objects?
2) Does it matter where return HttpResponse() is located? Inside transaction block or outside?
Big thx

1) Probably, something more should be done in task creation line like select_for_update before filter of existing Task.objects?
No - what you have currently looks fine and should work the way you want it to.
2) Does it matter where return HttpResponse() is located? Inside transaction block or outside?
Yes, it does matter. You need to return a response to the client regardless of whether your transaction was successful or not - so it definitely needs to be outside of the transaction block. If you did it inside the transaction, the client would get a 500 Server Error if the transaction failed.
However if the transaction fails, then you will not have a new task ID and cannot return that in your response. So you probably need to return different responses depending on whether the transaction succeeds, e.g,:
from django.db import IntegrityError, transaction
try:
with transaction.atomic():
Task.objects.select_for_update().filter(order__gte=order).update(
order=F('order') + 1)
new_task = Task.objects.create(name=name, order=order)
except IntegrityError:
# Transaction failed - return a response notifying the client
return HttpResponse('Failed to create task, please try again!')
# If it succeeded, then return a normal response
return HttpResponse(new_task.id)

You could also try to change your model so you don't need to update so many other rows when inserting a new one.
For example, you could try something resembling a double-linked list.
(I used long explicit names for fields and variables here).
# models.py
class Task(models.Model):
name = models.CharField(max_length=255)
task_before_this_one = models.ForeignKey(
Task,
null=True,
blank=True,
related_name='task_before_this_one_set')
task_after_this_one = models.ForeignKey(
Task,
null=True,
blank=True,
related_name='tasks_after_this_one_set')
Your task at the top of the queue would be the one that has the field task_before_this_one set to null. So to get the first task of the queue:
# these will throw exceptions if there are many instances
first_task = Task.objects.get(task_before_this_one=None)
last_task = Task.objects.get(task_after_this_one=None)
When inserting a new instance, you just need to know after which task it should be placed (or, alternatively, before which task). This code should do that:
def task_create(request):
new_task = Task.objects.create(
name=request.POST.get('name'))
task_before = get_object_or_404(
pk=request.POST.get('task_before_the_new_one'))
task_after = task_before.task_after_this_one
# modify the 2 other tasks
task_before.task_after_this_one = new_task
task_before.save()
if task_after is not None:
# 'task_after' will be None if 'task_before' is the last one in the queue
task_after.task_before_this_one = new_task
task_after.save()
# update newly create task
new_task.task_before_this_one = task_before
new_task.task_after_this_one = task_after # this could be None
new_task.save()
return HttpResponse(new_task.pk)
This method only updates 2 other rows when inserting a new row. You might still want to wrap the whole method in a transaction if there is really high concurrency in your app, but this transaction will only lock up to 3 rows, not all the others as well.
This approach might be of use to you if you have a very long list of tasks.
EDIT: how to get an ordered list of tasks
This can not be done at the database level in a single query (as far as I know), but you could try this function:
def get_ordered_task_list():
# get the first task
aux_task = Task.objects.get(task_before_this_one=None)
task_list = []
while aux_task is not None:
task_list.append(aux_task)
aux_task = aux_task.task_after_this_one
return task_list
As long as you only have a few hundered tasks, this operation should not take that much time so that it impacts the response time. But you will have to try that out for yourself, in your environment, your database, your hardware.

Django: Adding objects to a related set without saving to DB

I'm trying to write an internal API in my application without necessarily coupling it with the database.
class Product(models.Model):
name=models.CharField(max_length=4000)
price=models.IntegerField(default=-1)
currency=models.CharField(max_length=3, default='INR')
class Image(models.Model):
# NOTE -- Have changed the table name to products_images
width=models.IntegerField(default=-1)
height=models.IntegerField(default=-1)
url=models.URLField(max_length=1000, verify_exists=False)
product=models.ForeignKey(Product)
def create_product:
p=Product()
i=Image(height=100, widght=100, url='http://something/something')
p.image_set.add(i)
return p
Now, when I call create_product() Django throws up an error:
IntegrityError: products_images.product_id may not be NULL
However, if I call p.save() & i.save() before calling p.image_set.add(i) it works. Is there any way that I can add objects to a related object set without saving both to the DB first?

def create_product():
product_obj = Product.objects.create(name='Foobar')
image_obj = Image.objects.create(height=100, widght=100, url='http://something/something', product=product_obj)
return product_obj
Explanation:
Product object has to be created first and then assign it to the Image object because id and name here is required field.
I am wondering why wouldn't you not require to make a product entry in DB in first case? If there is any specific reason then i may suggest you some work around?
EDIT: Okay! i think i got you, you don't want to assign a product to an image object initially. How about creating a product field as null is equal to true.
product = models.ForeignKey(Product, null=True)
Now, your function becomes something like this:
def create_product():
image_obj = Image.objects.create(height=100, widght=100, url='http://something/something')
return image_obj
Hope it helps you?

I got same issue with #Saurabh Nanda
I am using Django 1.4.2. When I read in django, i see that
# file django/db/models/fields/related.py
def get_query_set(self):
try:
return self.instance._prefetched_objects_cache[rel_field.related_query_name()]
except (AttributeError, KeyError):
db = self._db or router.db_for_read(self.model, instance=self.instance)
return super(RelatedManager,self).get_query_set().using(db).filter(**self.core_filters)
# file django/db/models/query.py
qs = getattr(obj, attname).all()
qs._result_cache = vals
# We don't want the individual qs doing prefetch_related now, since we
# have merged this into the current work.
qs._prefetch_done = True
obj._prefetched_objects_cache[cache_name] = qs
That 's make sese, we only need to set property _prefetched_objects_cache for the object.
p = Product()
image_cached = []
for i in xrange(100):
image=Image(height=100, widght=100, url='http://something/something')
image_cached.append(image)
qs = p.images.all()
qs._result_cache = image_cached
qs._prefetch_done = True
p._prefetched_objects_cache = {'images': qs}

Your problem is that the id isn't set by django, but by the database (it's represented in the database by an auto-incremented field), so until it's saved there's no id. More about this in the documentation.
I can think of three possible solutions:
Set a different field of your Image model as the primary key (documented here).
Set a different field of your Production model as the foreign key (documented here).
Use django's database transactions API (documented here).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Model integrity check in django - django

Related

How to Update models field based on another field of same models

Checking for overlapping TimeField ranges

Child model updates Parent model in Django ForeignKey relationship

Django: object creation in atomic transaction

Django: Adding objects to a related set without saving to DB

Categories

Resources