Finding objects according to related objects - django

I am trying to build a messaging application with Django. The reason I don’t use postman is that I need messaging between other objects than users and I don’t need most of the postman’s features.
Here are my models:
class Recipient(models.Model):
...
def get_unread_threads():
see below...
class Thread(models.Model):
author = models.ForeignKey(Recipient, related_name='initiated_threads')
recipients = models.ManyToManyField(
Tribe,
related_name='received_thread',
through='ThreadReading')
subject = models.CharField(max_length=64)
class Meta:
app_label = 'game'
class Message(models.Model):
author = models.ForeignKey(Recipient, related_name='sent_messages')
date_add = models.DateTimeField(auto_now_add=True)
date_edit = models.DateTimeField(blank=True)
content = models.TextField(max_length=65535)
thread = models.ForeignKey(Thread)
class Meta:
get_latest_by = 'date'
class ThreadReading(models.Model):
thread = models.ForeignKey(Thread)
recipient = models.ForeignKey(Recipient)
date_last_reading = models.DateTimeField(auto_now=True)
My problem is about get_unread_threads. I can’t really find out how to do that. Here is a first try:
def get_unread_threads(self):
"""
return a queryset corresponding to the threads
which at least one message is unread by the recipient
"""
try:
query = self.received_thread.filter(
message__latest__date_add__gt=\
self.threadreading_set.get(thread_id=F('id')).date_last_reading)
except ObjectDoesNotExist:
query = None
return query
But obviously it doesn’t work because lookup can’t follow method latest.

Here you go:
# Get all the readings of the user
thread_readings = recipient.threadreading_set.all()
# Build a query object including all messages who's last reading is after the
# last edit that was made AND whose thread is the thread of the current
# iteration's thread reading
q = models.Q()
for thread_reading in thread_readings:
q = q | models.Q(
models.Q(
date_edit__lte=thread_reading.date_last_reading
& models.Q(
thread=thread_reading.thread
)
)
)
# Get a queryset of all the messages, including the threads (via a JOIN)
queryset = Message.objects.select_related('thread')
# Now, exclude from the queryset every message that matches the above query
# (edited prior to last reading) OR that is in the list of last readings
queryset = queryset.exclude(
q | models.Q(
thread__pk__in=[thread_reading.pk for thread_reading in thread_readings]
)
)
# Make an iterator (to pretend that this is efficient) and return a generator of it
iterator = queryset.iterator()
return (message.thread for message in iterator)
:)
Now, don't ever actually do this - rethink your models. I would read a book called "Object Oriented Analysis and Design with Applications". It'll teach you a great deal about how to think when you're data modelling.

Related

Django raise custom error if invalid API parameters

I am writing an API endpoint that queries for an Event in the time window given.
The Event model has a start and end time, and the query accepts a search start time and an search end time. Any Event that overlaps between these two windows is returned. This works well.
However, if the user asks to search using a search_time_start that is after the search_time_end, it returns no results (obviously), with no indication why.
Instead I would like to return an Error message indicating what is wrong.
I have a rough idea of how to do this, but not how to plug it into my existing code.
Model
class Event(models.Model):
name = model.CharField(max_length=64)
start_time = models.DateTimeField(null=False)
end_time = models.DateTimeField(null=False)
Serializer
class EventSerializer(serializers.ModelSerializer):
start_time = serializers.DateTimeField()
end_time = serializers.DateTimeField()
class Meta:
model = Event
fields = ("start_time", "end_time")
Filter and View
class EventFilterSet(GeoFilterSet):
search_time_start = DateTimeFilter(
field_name="end_time",
lookup_expr="gte",
help_text=(
"The time for the start of the search. Limits events to those that end after the start time."
),
)
search_time_end = DateTimeFilter(
field_name="start_time",
lookup_expr="lte",
help_text=(
"The time for the end of the search. Limits events to those that start before the end time."
),
)
class Meta:
model = Event
fields = ["search_time_start", "search_time_end"]
class BaseEventViewSet(GenericViewSet):
queryset = Event.objects.all()
serializer_class = EventSerializer
filterset_class = EventFilterSet
ordering = ("start_time",)
I have written this snippet which I think should work, but I am unsure if it should go in a view or serializer, and how to make sure the method is actually called. I suspect I may be incorrectly mixing Class-based View and Function-based views here... so perhaps there's a better way to accomplish this.
def check_valid_start_end_search(self, request):
start = request.query_params.get("start_time")
end = request.query_params.get("end_time")
if not start < end:
return Response({"error": "The start time must be after the end time"}, status=HTTP_400_BAD_REQUEST)
I've seen some similar quesitons here, but they all deal with function-based views or handling the same error but on a model field, not a query param
You can raise custom errors from your serializers also to validate start time and end time.
try this solution.
class EventSerializer(serializers.ModelSerializer):
start_time = serializers.DateTimeField()
end_time = serializers.DateTimeField()
class Meta:
model = Event
fields = ("start_time", "end_time")
def validate(self,attrs):
start_time=attrs.get('start_time')
end_time=attrs.get('end_time')
if start_time < end_time:
raise serializers.ValidationError('The start time must be after the end time')
return attrs

How to get all ancestors of a Django object when ForeignKey is not self-referencing?

I have a model Person and another model Relation.
Before creating a new relation I want to check if this relation is possible or not.
This post and some other similar posts provides a solution but for self-referencing models, my model is not self referencing.
Django self-recursive foreignkey filter query for all childs
class Person(models.Model):
identifier = IntegerField(null=True)
title = CharField(max_length=100, null=True)
def get_all_parent_ids(self):
# I want to get all the ancestors of this person
# but this method only gets immediate parents right now
return [parent.parent.id for parent in self.parenting.all()]
def get_all_children_ids(self):
# I want to get all the descendants of this person
# but this method only gets immediate children right now
return [child.children.id for child in self.baby_sitting.all()]
class Relation(models.Model):
name = CharField(max_length=50, null=True)
parent = ForeignKey(Person, on_delete=models.PROTECT, related_name="parenting")
children = ForeignKey(Person, on_delete=models.PROTECT, related_name="baby_sitting")
class Meta:
unique_together = ('parent', 'children')
def is_relation_possible(new_parent_id, new_children_id):
new_parent_person = Person.objects.get(pk=new_parent_id)
all_parents = new_parent_person.get_all_parent_ids()
if new_children_id in all_parents:
return False
else:
return True
For example: Existing relaions
- A to B
- B to C
- C to D
- D to E
I want is_relation_possible(E, A) to return False, as E has an ancestor A.
Currently it only check immediate parents and not all parents.
You should to use recursion:
def is_relation_possible(new_parent_id, new_children_id):
new_parent_person = Person.objects.get(pk=new_parent_id)
all_parents = new_parent_person.get_all_parent_ids()
related = ( new_children_id in all_parents
or
any( is_relation_possible( ancestor_id, new_children_id)
for ancestor_id in all_parents ) # (*1)
)
return related
(*1) Here the trick: is related if itself is related or is related through their ancestors.
Notice 1: If you are working with graphs and not with hierarchies, it may turn on an infinite loop.
Notice 2: Not tested. Test it and come back :)

django querset filter foreign key select first record

I have a History model like below
class History(models.Model):
class Meta:
app_label = 'subscription'
ordering = ['-start_datetime']
subscription = models.ForeignKey(Subscription, related_name='history')
FREE = 'free'
Premium = 'premium'
SUBSCRIPTION_TYPE_CHOICES = ((FREE, 'Free'), (Premium, 'Premium'),)
name = models.CharField(max_length=32, choices=SUBSCRIPTION_TYPE_CHOICES, default=FREE)
start_datetime = models.DateTimeField(db_index=True)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
cancelled_datetime = models.DateTimeField(blank=True, null=True)
Now i have a queryset filtering like below
users = get_user_model().objects.all()
queryset = users.exclude(subscription__history__end_datetime__lt=timezone.now())
The issue is that in the exclude above it is checking end_datetime for all the rows for a particular history object. But i only want to compare it with first row of history object.
Below is how a particular history object looks like. So i want to write a queryset filter which can do datetime comparison on first row only.
You could use a Model Manager method for this. The documentation isn't all that descriptive, but you could do something along the lines of:
class SubscriptionManager(models.Manager):
def my_filter(self):
# You'd want to make this a smaller query most likely
subscriptions = Subscription.objects.all()
results = []
for subscription in subscriptions:
sub_history = subscription.history_set.first()
if sub_history.end_datetime > timezone.now:
results.append(subscription)
return results
class History(models.Model):
subscription = models.ForeignKey(Subscription)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
objects = SubscriptionManager()
Then: queryset = Subscription.objects().my_filter()
Not a copy-pastable answer, but shows the use of Managers. Given the specificity of what you're looking for, I don't think there's a way to get it just via the plain filter() and exclude().
Without knowing what your end goal here is, it's hard to say whether this is feasible, but have you considered adding a property to the subscription model that indicates whatever you're looking for? For example, if you're trying to get everyone who has a subscription that's ending:
class Subscription(models.Model):
#property
def ending(self):
if self.end_datetime > timezone.now:
return True
else:
return False
Then in your code: queryset = users.filter(subscription_ending=True)
I have tried django's all king of expressions(aggregate, query, conditional) but was unable to solve the problem so i went with RawSQL and it solved the problem.
I have used the below SQL to select the first row and then compare the end_datetime
SELECT (end_datetime > %s OR end_datetime IS NULL) AS result
FROM subscription_history
ORDER BY start_datetime DESC
LIMIT 1;
I will select my answer as accepted if not found a solution with queryset filter chaining in next 2 days.

Django Tests: setUpTestData on Postgres throws: "Duplicate key value violates unique constraint"

I am running into a database issue in my unit tests. I think it has something to do with the way I am using TestCase and setUpData.
When I try to set up my test data with certain values, the tests throw the following error:
django.db.utils.IntegrityError: duplicate key value violates unique constraint
...
psycopg2.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_productgroup_product_name_48ec6f8d_uniq"
DETAIL: Key (product_name)=(Almonds) already exists.
I changed all of my primary keys and it seems to be running fine. It doesn't seem to affect any of the tests.
However, I'm concerned that I am doing something wrong. When it first happened, I reversed about an hour's worth of work on my app (not that much code for a noob), which corrected the problem.
Then when I wrote the changes back in, the same issue presented itself again. TestCase is pasted below. The issue seems to occur after I add the sortrecord items, but corresponds with the items above it.
I don't want to keep going through and changing primary keys and urls in my tests, so if anyone sees something wrong with the way I am using this, please help me out. Thanks!
TestCase
class DetailsPageTest(TestCase):
#classmethod
def setUpTestData(cls):
cls.product1 = ProductGroup.objects.create(
product_name="Almonds"
)
cls.variety1 = Variety.objects.create(
product_group = cls.product1,
variety_name = "non pareil",
husked = False,
finished = False,
)
cls.supplier1 = Supplier.objects.create(
company_name = "Acme",
company_location = "Acme Acres",
contact_info = "Call me!"
)
cls.shipment1 = Purchase.objects.create(
tag=9,
shipment_id=9999,
supplier_id = cls.supplier1,
purchase_date='2015-01-09',
purchase_price=9.99,
product_name=cls.variety1,
pieces=99,
kgs=999,
crackout_estimate=99.9
)
cls.shipment2 = Purchase.objects.create(
tag=8,
shipment_id=8888,
supplier_id=cls.supplier1,
purchase_date='2015-01-08',
purchase_price=8.88,
product_name=cls.variety1,
pieces=88,
kgs=888,
crackout_estimate=88.8
)
cls.shipment3 = Purchase.objects.create(
tag=7,
shipment_id=7777,
supplier_id=cls.supplier1,
purchase_date='2014-01-07',
purchase_price=7.77,
product_name=cls.variety1,
pieces=77,
kgs=777,
crackout_estimate=77.7
)
cls.sortrecord1 = SortingRecords.objects.create(
tag=cls.shipment1,
date="2015-02-05",
bags_sorted=20,
turnout=199,
)
cls.sortrecord2 = SortingRecords.objects.create(
tag=cls.shipment1,
date="2015-02-07",
bags_sorted=40,
turnout=399,
)
cls.sortrecord3 = SortingRecords.objects.create(
tag=cls.shipment1,
date='2015-02-09',
bags_sorted=30,
turnout=299,
)
Models
from datetime import datetime
from django.db import models
from django.db.models import Q
class ProductGroup(models.Model):
product_name = models.CharField(max_length=140, primary_key=True)
def __str__(self):
return self.product_name
class Meta:
verbose_name = "Product"
class Supplier(models.Model):
company_name = models.CharField(max_length=45)
company_location = models.CharField(max_length=45)
contact_info = models.CharField(max_length=256)
class Meta:
ordering = ["company_name"]
def __str__(self):
return self.company_name
class Variety(models.Model):
product_group = models.ForeignKey(ProductGroup)
variety_name = models.CharField(max_length=140)
husked = models.BooleanField()
finished = models.BooleanField()
description = models.CharField(max_length=500, blank=True)
class Meta:
ordering = ["product_group_id"]
verbose_name_plural = "Varieties"
def __str__(self):
return self.variety_name
class PurchaseYears(models.Manager):
def purchase_years_list(self):
unique_years = Purchase.objects.dates('purchase_date', 'year')
results_list = []
for p in unique_years:
results_list.append(p.year)
return results_list
class Purchase(models.Model):
tag = models.IntegerField(primary_key=True)
product_name = models.ForeignKey(Variety, related_name='purchases')
shipment_id = models.CharField(max_length=24)
supplier_id = models.ForeignKey(Supplier)
purchase_date = models.DateField()
estimated_delivery = models.DateField(null=True, blank=True)
purchase_price = models.DecimalField(max_digits=6, decimal_places=3)
pieces = models.IntegerField()
kgs = models.IntegerField()
crackout_estimate = models.DecimalField(max_digits=6,decimal_places=3, null=True)
crackout_actual = models.DecimalField(max_digits=6,decimal_places=3, null=True)
objects = models.Manager()
purchase_years = PurchaseYears()
# Keep manager as "objects" in case admin, etc. needs it. Filter can be called like so:
# Purchase.objects.purchase_years_list()
# Managers in docs: https://docs.djangoproject.com/en/1.8/intro/tutorial01/
class Meta:
ordering = ["purchase_date"]
def __str__(self):
return self.shipment_id
def _weight_conversion(self):
return round(self.kgs * 2.20462)
lbs = property(_weight_conversion)
class SortingModelsBagsCalulator(models.Manager):
def total_sorted(self, record_date, current_set):
sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
SortingRecords['date'] <= record_date]
return sum(sorted)
class SortingRecords(models.Model):
tag = models.ForeignKey(Purchase, related_name='sorting_record')
date = models.DateField()
bags_sorted = models.IntegerField()
turnout = models.IntegerField()
objects = models.Manager()
def __str__(self):
return "%s [%s]" % (self.date, self.tag.tag)
class Meta:
ordering = ["date"]
verbose_name_plural = "Sorting Records"
def _calculate_kgs_sorted(self):
kg_per_bag = self.tag.kgs / self.tag.pieces
kgs_sorted = kg_per_bag * self.bags_sorted
return (round(kgs_sorted, 2))
kgs_sorted = property(_calculate_kgs_sorted)
def _byproduct(self):
waste = self.kgs_sorted - self.turnout
return (round(waste, 2))
byproduct = property(_byproduct)
def _bags_remaining(self):
current_set = SortingRecords.objects.values().filter(~Q(id=self.id), tag=self.tag)
sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
SortingRecords['date'] <= self.date]
remaining = self.tag.pieces - sum(sorted) - self.bags_sorted
return remaining
bags_remaining = property(_bags_remaining)
EDIT
It also fails with integers, like so.
django.db.utils.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_purchase_pkey"
DETAIL: Key (tag)=(9) already exists.
UDPATE
So I should have mentioned this earlier, but I completely forgot. I have two unit test files that use the same data. Just for kicks, I matched a primary key in both instances of setUpTestData() to a different value and sure enough, I got the same error.
These two setups were working fine side-by-side before I added more data to one of them. Now, it appears that they need different values. I guess you can only get away with using repeat data for so long.
I continued to get this error without having any duplicate data but I was able to resolve the issue by initializing the object and calling the save() method rather than creating the object via Model.objects.create()
In other words, I did this:
#classmethod
def setUpTestData(cls):
cls.person = Person(first_name="Jane", last_name="Doe")
cls.person.save()
Instead of this:
#classmethod
def setUpTestData(cls):
cls.person = Person.objects.create(first_name="Jane", last_name="Doe")
I've been running into this issue sporadically for months now. I believe I just figured out the root cause and a couple solutions.
Summary
For whatever reason, it seems like the Django test case base classes aren't removing the database records created by let's just call it TestCase1 before running TestCase2. Which, in TestCase2 when it tries to create records in the database using the same IDs as TestCase1 the database raises a DuplicateKey exception because those IDs already exists in the database. And even saying the magic word "please" won't help with database duplicate key errors.
Good news is, there are multiple ways to solve this problem! Here are a couple...
Solution 1
Make sure if you are overriding the class method tearDownClass that you call super().tearDownClass(). If you override tearDownClass() without calling its super, it will in turn never call TransactionTestCase._post_teardown() nor TransactionTestCase._fixture_teardown(). Quoting from the doc string in TransactionTestCase._post_teardown()`:
def _post_teardown(self):
"""
Perform post-test things:
* Flush the contents of the database to leave a clean slate. If the
class has an 'available_apps' attribute, don't fire post_migrate.
* Force-close the connection so the next test gets a clean cursor.
"""
If TestCase.tearDownClass() is not called via super() then the database is not reset in between test cases and you will get the dreaded duplicate key exception.
Solution 2
Override TransactionTestCase and set the class variable serialized_rollback = True, like this:
class MyTestCase(TransactionTestCase):
fixtures = ['test-data.json']
serialized_rollback = True
def test_name_goes_here(self):
pass
Quoting from the source:
class TransactionTestCase(SimpleTestCase):
...
# If transactions aren't available, Django will serialize the database
# contents into a fixture during setup and flush and reload them
# during teardown (as flush does not restore data from migrations).
# This can be slow; this flag allows enabling on a per-case basis.
serialized_rollback = False
When serialized_rollback is set to True, Django test runner rolls back any transactions inserted into the database beween test cases. And batta bing, batta bang... no more duplicate key errors!
Conclusion
There are probably many more ways to implement a solution for the OP's issue, but these two should work nicely. Would definitely love to have more solutions added by others for clarity sake and a deeper understanding of the underlying Django test case base classes. Phew, say that last line real fast three times and you could win a pony!
The log you provided states DETAIL: Key (product_name)=(Almonds) already exists. Did you verify in your db?
To prevent such errors in the future, you should prefix all your test data string by test_
I discovered the issue, as noted at the bottom of the question.
From what I can tell, the database didn't like me using duplicate data in the setUpTestData() methods of two different tests. Changing the primary key values in the second test corrected the problem.
I think the problem here is that you had a tearDownClass method in your TestCase without the call to super method.
In this way the django TestCase lost the transactional functionalities behind the setUpTestData so it doesn't clean your test db after a TestCase is finished.
Check warning in django docs here:
https://docs.djangoproject.com/en/1.10/topics/testing/tools/#django.test.SimpleTestCase.allow_database_queries
I had similar problem that had been caused by providing the primary key value to a test case explicitly.
As discussed in the Django documentation, manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict.
I have solved it by altering the sequence manually:
from django.db import connection
class MyTestCase(TestCase):
#classmethod
def setUpTestData(cls):
Model.objects.create(id=1)
with connection.cursor() as c:
c.execute(
"""
ALTER SEQUENCE "app_model_id_seq" RESTART WITH 2;
"""
)

Reducing queries for manytomany models in django

EDIT:
It turns out the real question is - how do I get select_related to follow the m2m relationships I have defined? Those are the ones that are taxing my system. Any ideas?
I have two classes for my django app. The first (Item class) describes an item along with some functions that return information about the item. The second class (Itemlist class) takes a list of these items and then does some processing on them to return different values. The problem I'm having is that returning a list of items from Itemlist is taking a ton of queries, and I'm not sure where they're coming from.
class Item(models.Model):
# for archiving purposes
archive_id = models.IntegerField()
users = models.ManyToManyField(User, through='User_item_rel',
related_name='users_set')
# for many to one relationship (tags)
tag = models.ForeignKey(Tag)
sub_tag = models.CharField(default='',max_length=40)
name = models.CharField(max_length=40)
purch_date = models.DateField(default=datetime.datetime.now())
date_edited = models.DateTimeField(auto_now_add=True)
price = models.DecimalField(max_digits=6, decimal_places=2)
buyer = models.ManyToManyField(User, through='Buyer_item_rel',
related_name='buyers_set')
comments = models.CharField(default='',max_length=400)
house_id = models.IntegerField()
class Meta:
ordering = ['-purch_date']
def shortDisplayBuyers(self):
if len(self.buyer_item_rel_set.all()) != 1:
return "multiple buyers"
else:
return self.buyer_item_rel_set.all()[0].buyer.name
def listBuyers(self):
return self.buyer_item_rel_set.all()
def listUsers(self):
return self.user_item_rel_set.all()
def tag_name(self):
return self.tag
def sub_tag_name(self):
return self.sub_tag
def __unicode__(self):
return self.name
and the second class:
class Item_list:
def __init__(self, list = None, house_id = None, user_id = None,
archive_id = None, houseMode = 0):
self.list = list
self.house_id = house_id
self.uid = int(user_id)
self.archive_id = archive_id
self.gen_balancing_transactions()
self.houseMode = houseMode
def ret_list(self):
return self.list
So after I construct Itemlist with a large list of items, Itemlist.ret_list() takes up to 800 queries for 25 items. What can I do to fix this?
Try using select_related
As per a question I asked here
Dan is right in telling you to use select_related.
select_related can be read about here.
What it does is return in the same query data for the main object in your queryset and the model or fields specified in the select_related clause.
So, instead of a query like:
select * from item
followed by several queries like this every time you access one of the item_list objects:
select * from item_list where item_id = <one of the items for the query above>
the ORM will generate a query like:
select item.*, item_list.*
from item a join item_list b
where item a.id = b.item_id
In other words: it will hit the database once for all the data.
You probably want to use prefetch_related
Works similarly to select_related, but can deal with relations selected_related cannot. The join happens in python, but I've found it to be more efficient for this kind of work than the large # of queries.
Related reading on the subject