Django query hitting the db for every iteration

I have a models.py like so:
class Muestraonline(models.Model):
    accessionnumber = models.ForeignKey(Muestra, related_name='online_accessionnumber')
    muestraid = models.ForeignKey(Muestra)
    taxonid = models.ForeignKey(Taxon, null=True, blank=True)
    ...

class Muestra(models.Model):
    muestraid = models.IntegerField(primary_key=True, db_column='MuestraID')  # Field name made lowercase.
    latitudedecimal = models.DecimalField(decimal_places=6, null=True, max_digits=20, db_column='LatitudeDecimal', blank=True)  # Field name made lowercase.
    longitudedecimal = models.DecimalField(decimal_places=6, null=True, max_digits=20, db_column='LongitudeDecimal', blank=True)  # Field name made lowercase.
    ...
In my views.py I want to get the unique specimen and find all specimens that share its taxonid. For the related specimens I just need the lat/long info:
def specimen_detail(request, accession_number=None, specimen_id=None):
    specimen = Muestraonline.objects.get(accessionnumber=accession_number, muestraid=specimen_id)
    related_specimens = Muestraonline.objects.filter(taxonid=specimen.taxonid).exclude(id=specimen.id)
    # create array for the related specimen points
    related_coords = []
    # loop through results and populate array
    for relative in related_specimens:
        latlon = (format(relative.muestraid.latitudedecimal), format(relative.muestraid.longitudedecimal))
        related_coords.append(latlon)
    related_coords = simplejson.dumps(related_coords)
But when I loop through related_specimens it ends up hitting the db once for each relative. Shouldn't I be able to get the latitudedecimal and longitudedecimal values in the format I need with only one extra db query? I know I'm missing something very basic in my approach here; I'm just not sure of the best way to get this done.
Any help would be much appreciated.

Just use select_related in your QuerySet:
related_specimens = Muestraonline.objects.filter(taxonid=specimen.taxonid).exclude(id=specimen.id).select_related('muestraid')
That will join your Muestraonline model to Muestra behind the scenes and each Muestraonline instance returned by the QuerySet will also contain a cached instance of Muestra.
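To see why this helps, here is a minimal sketch that uses the standard-library sqlite3 module instead of Django, with a simplified two-table schema made up for illustration. The first function mimics the loop in the question (one extra query per related specimen); the second does roughly what select_related asks the database to do, a single JOIN. The executed list, the run helper and both function names are assumptions of this sketch, not Django APIs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE muestra (
        muestraid INTEGER PRIMARY KEY,
        latitudedecimal REAL,
        longitudedecimal REAL
    );
    CREATE TABLE muestraonline (
        id INTEGER PRIMARY KEY,
        muestraid INTEGER REFERENCES muestra(muestraid),
        taxonid INTEGER
    );
    INSERT INTO muestra VALUES (1, 10.5, -84.1), (2, 9.9, -83.7), (3, 10.1, -84.5);
    INSERT INTO muestraonline VALUES (1, 1, 7), (2, 2, 7), (3, 3, 7);
""")

executed = []  # every SQL statement sent through run()


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def coords_one_query_per_row(taxon_id, exclude_id):
    """Mimics the loop in the question: one lookup per related specimen."""
    rows = run(
        "SELECT muestraid FROM muestraonline "
        "WHERE taxonid = ? AND id != ? ORDER BY id",
        (taxon_id, exclude_id),
    )
    coords = []
    for (muestra_id,) in rows:
        coords.extend(run(
            "SELECT latitudedecimal, longitudedecimal FROM muestra "
            "WHERE muestraid = ?",
            (muestra_id,),
        ))
    return coords


def coords_joined(taxon_id, exclude_id):
    """Fetches the same coordinates with a single JOIN."""
    return run(
        "SELECT m.latitudedecimal, m.longitudedecimal "
        "FROM muestraonline o JOIN muestra m ON m.muestraid = o.muestraid "
        "WHERE o.taxonid = ? AND o.id != ? ORDER BY o.id",
        (taxon_id, exclude_id),
    )
```

Calling each function and checking len(executed) afterwards shows the difference: the looped version issues one statement plus one per related row, while the joined version issues a single statement for the same result.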

Related

Get the top n rows of each day from same table in Django

I am sure this is not a new problem. I looked into other solutions but could not find one that solves this.
I have a model like
class Deal(models.Model):
    title = models.CharField(max_length=1500)
    viewCounter = models.SmallIntegerField(default=0)
    thumbsUpCounter = models.SmallIntegerField(default=0)
    createDateTime = models.DateTimeField(auto_now_add=True)
    owner = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, null=False, related_name='deal_owner',
                              editable=False)
Now I want to get the top 10 (or fewer) deals of every day, ordered by thumbsUpCounter and viewCounter. I looked into Subquery and OuterRef, but could not figure out how to get the right results.
Note: I am using MySQL.
Thanks in advance.

Try:
from django.db.models.functions import TruncDate

query = (
    Deal.objects.annotate(date=TruncDate('createDateTime'))  # extract date
    .values('date')  # keep only the date column
    .order_by('-thumbsUpCounter')  # order by thumbs up, descending
    [:10]  # slice first 10
)
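Note that a slice at the end of a queryset limits the whole result, so the query above yields at most 10 rows overall, not 10 per day. One possible workaround, assuming the candidate rows fit in memory, is to fetch them once and do the per-day cut in Python. The sketch below works on plain dictionaries shaped like the rows Deal.objects.values(...) would return; the function name top_n_per_day is hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime


def top_n_per_day(deals, n=10):
    """Group deal dicts by calendar day and keep the n best of each day.

    "Best" means highest thumbsUpCounter, with viewCounter as tie-breaker.
    """
    by_day = defaultdict(list)
    for deal in deals:
        by_day[deal["createDateTime"].date()].append(deal)
    return {
        day: sorted(
            group,
            key=lambda d: (d["thumbsUpCounter"], d["viewCounter"]),
            reverse=True,
        )[:n]
        for day, group in by_day.items()
    }


# Illustrative rows only; in a real project these would come from the database.
sample_deals = [
    {"title": "a", "thumbsUpCounter": 5, "viewCounter": 1, "createDateTime": datetime(2020, 1, 1, 9)},
    {"title": "b", "thumbsUpCounter": 7, "viewCounter": 0, "createDateTime": datetime(2020, 1, 1, 18)},
    {"title": "c", "thumbsUpCounter": 5, "viewCounter": 3, "createDateTime": datetime(2020, 1, 1, 20)},
    {"title": "d", "thumbsUpCounter": 1, "viewCounter": 9, "createDateTime": datetime(2020, 1, 2, 8)},
]
top_two = top_n_per_day(sample_deals, n=2)
```

This trades database work for application work, so it is only reasonable when the number of deals in the selected date range is modest.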

Django annotation on compoundish primary key with filter ignoring primary key, resulting in too many annotated items

Please see EDIT1 below, as well.
Using Django 3.0.6 and Python 3.8, given the following models:
class Plants(models.Model):
    plantid = models.TextField(primary_key=True, unique=True)

class Pollutions(models.Model):
    pollutionsid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    pollutant = models.TextField()
    releasesto = models.TextField(blank=True, null=True)
    amount = models.FloatField(db_column="amount", blank=True, null=True)

    class Meta:
        managed = False
        db_table = 'pollutions'
        unique_together = (('plantid', 'releasesto', 'pollutant', 'year'))

class Monthp(models.Model):
    monthpid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    month = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    power = models.IntegerField(null=False)

    class Meta:
        managed = False
        db_table = 'monthp'
        unique_together = ('plantid', 'year', 'month')
I'd like to annotate each plant, based on a foreign key relationship and a filter, with its amount of CO2 and the sum of its power for a given year. For debugging I replaced Sum with Count in the following query:
annotated = tmp.all().annotate(
    energy=Count('monthp__power', filter=Q(monthp__year=YEAR)),
    co2=Count('pollutions__amount', filter=Q(pollutions__year=YEAR, pollutions__pollutant="CO2", pollutions__releasesto="Air")))
However, this returns too many items (or a wrong total when using Sum):
annotated.first().co2 # 60, but it should be 1
annotated.first().energy # 252, but it should be 1
although my database guarantees, as declared above, that (plantid, year, month) and (plantid, releasesto, pollutant, year) are unique together, which is easy to demonstrate:
pl = annotated.first().plantid
testplant = Plants.objects.get(pk=pl) # plant object
pco2 = Pollutions.objects.filter(plantid=testplant, year=YEAR, pollutant="CO2", releasesto="Air")
len(pco2) # 1, as expected
Why does Django return too many results, and how can I tell Django to limit the annotated elements to the 'current primary key', in other words to only annotate the elements whose foreign key matches the primary key?
I can achieve what I intend to do by using distinct and Max:
energy=Sum('yearly__power', distinct=True, filter=Q(yearly__year=YEAR)),
co2=Max('pollutions__amount', ...
However, the performance is unacceptable.
I have tested using model_to_dict and appending the wanted values "by hand" to the dict. This works for the values themselves, but not for sorting the resulting dicts (e.g. by energy). Even so, it is actually faster than the workaround directly above.
It strikes me as conceptually odd that the manual approach is faster than letting the database do what it is intended to do.
Is this a limitation of Django's ORM, or am I missing something?
EDIT1:
This behaviour has been a known bug for 11 years.
Even others "spent a whole day on this".
I am now trying subqueries. However, the foreign key I am using is not the primary key of its table, so the "usual" approach of using "pk=''" does not work. More precisely, trying:
tmp = Plants.objects.filter(somefilter)
subq1 = Subquery(Yearly.objects.filter(pk=OuterRef('plantid'), year=YEAR))
tmp1 = tmp.all().annotate(
    energy=Count(Subquery(subq1))
)
returns
OperationalError at /xyz
no such column: U0.yid
Which definitely makes sense, because Plants has no clue what a yid is; it only knows plantids. How do I adjust the subquery for that?
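A likely explanation for the inflated numbers: when two multi-valued relations are aggregated in the same annotate() call, both tables are joined to the parent at once, so every monthp row is paired with every pollutions row of the same plant before the aggregates run. The sketch below reproduces that effect with plain SQL through the standard-library sqlite3 module rather than Django. The schema is simplified, the data is invented (12 monthly rows and 3 pollutant rows for one plant), and the CASE WHEN expressions only approximate the SQL the ORM generates for filtered aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plants (plantid TEXT PRIMARY KEY);
    CREATE TABLE monthp (plantid TEXT, year INTEGER, month INTEGER, power INTEGER);
    CREATE TABLE pollutions (plantid TEXT, year INTEGER, pollutant TEXT, amount REAL);
    INSERT INTO plants VALUES ('p1');
""")
# Invented data: 12 monthly power rows and 3 pollutant rows for plant p1.
conn.executemany(
    "INSERT INTO monthp VALUES ('p1', 2019, ?, 100)",
    [(month,) for month in range(1, 13)],
)
conn.executemany(
    "INSERT INTO pollutions VALUES ('p1', 2019, ?, 5.0)",
    [("CO2",), ("NOx",), ("SO2",)],
)

# Both relations joined at once: 12 x 3 = 36 rows feed the aggregates.
joined = conn.execute("""
    SELECT
        SUM(CASE WHEN m.year = 2019 THEN m.power END),
        COUNT(CASE WHEN po.year = 2019 AND po.pollutant = 'CO2' THEN po.amount END)
    FROM plants p
    LEFT JOIN monthp m ON m.plantid = p.plantid
    LEFT JOIN pollutions po ON po.plantid = p.plantid
    GROUP BY p.plantid
""").fetchone()

# One correlated subquery per aggregate: each table is scanned on its own.
separate = conn.execute("""
    SELECT
        (SELECT SUM(m.power) FROM monthp m
          WHERE m.plantid = p.plantid AND m.year = 2019),
        (SELECT COUNT(*) FROM pollutions po
          WHERE po.plantid = p.plantid AND po.year = 2019 AND po.pollutant = 'CO2')
    FROM plants p
""").fetchone()

print(joined)    # (3600, 12): power summed 3 times over, CO2 row counted 12 times
print(separate)  # (1200, 1)
```

In ORM terms the second form corresponds to one Subquery per annotation. Since Monthp.plantid is the foreign key pointing at Plants, the correlation would presumably be written as plantid=OuterRef('pk') inside the subquery, rather than pk=OuterRef('plantid').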

Query intermediate through fields in django

I have a simple Relation model, where a user can follow a tag, just like on Stack Overflow.
class Relation(models.Model):
    user = AutoOneToOneField(User)
    follows_tag = models.ManyToManyField(Tag, blank=True, null=True, through='TagRelation')

class TagRelation(models.Model):
    user = models.ForeignKey(Relation, on_delete=models.CASCADE)
    following_tag = models.ForeignKey(Tag, on_delete=models.CASCADE)
    pub_date = models.DateTimeField(default=timezone.now)

    class Meta:
        unique_together = ['user', 'following_tag']
Now, to get the results of all the tags a user is following:
kakar = CustomUser.objects.get(email="kakar#gmail.com")
tags_following = kakar.relation.follows_tag.all()
This is fine.
But, to access intermediate fields I have to go through a big list of other queries. Suppose I want to display when the user started following a tag, I will have to do something like this:
kakar = CustomUser.objects.get(email="kakar#gmail.com")
kakar_relation = Relation.objects.get(user=kakar)
t1 = kakar.relation.follows_tag.all()[0]
kakar_t1_relation = TagRelation.objects.get(user=kakar_relation, following_tag=t1)
kakar_t1_relation.pub_date
As you can see, just to get the date I have to run several queries. Is this the only way to get intermediate values, or can this be optimized? Also, I am not sure if this model design is the way to go, so if you have any recommendation or advice I would be very grateful. Thank you.

You need to use a double underscore ( __ ) to follow a ForeignKey in a lookup, like this:
user_tags = TagRelation.objects.filter(user__user__email="kakar#gmail.com").values("following_tag__name", "pub_date")
If you need the name of the tag, use following_tag__name in the query; if you need its id, use following_tag__id.
Then iterate over the resulting queryset like this:
for item in user_tags:
    print(item['following_tag__name'])
    print(item['pub_date'])
One more thing: values() returns a queryset of dictionaries, which you can iterate over as shown above. If you use values_list() instead, you get tuples. See the Django QuerySet documentation for more.
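As a rough illustration of what such a lookup amounts to, here is a sketch in plain SQL via the standard-library sqlite3 module, not Django. The table layout is a simplified guess at the models in the question, and the email address, tag names and dates are invented. It starts from the intermediate table and joins outward once, then returns the rows either as tuples (similar in shape to values_list) or as dictionaries (similar in shape to values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customuser (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE relation (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tagrelation (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,           -- points at relation.id
        following_tag_id INTEGER,  -- points at tag.id
        pub_date TEXT
    );
    INSERT INTO customuser VALUES (1, 'user@example.com');
    INSERT INTO relation VALUES (1, 1);
    INSERT INTO tag VALUES (1, 'django'), (2, 'python');
    INSERT INTO tagrelation VALUES (1, 1, 1, '2015-01-01'), (2, 1, 2, '2015-02-01');
""")

# One query: start at the intermediate table and join outward.
SQL = """
    SELECT t.name AS following_tag__name, tr.pub_date AS pub_date
    FROM tagrelation tr
    JOIN relation r ON r.id = tr.user_id
    JOIN customuser u ON u.id = r.user_id
    JOIN tag t ON t.id = tr.following_tag_id
    WHERE u.email = ?
    ORDER BY tr.pub_date
"""


def as_tuples(email):
    """Rows as tuples, similar in shape to values_list()."""
    return conn.execute(SQL, (email,)).fetchall()


def as_dicts(email):
    """Rows as dictionaries keyed by column alias, similar in shape to values()."""
    cursor = conn.execute(SQL, (email,))
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]
```

Either function returns the tag name together with the follow date from a single statement, which is the point of querying the through model directly.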

how to get the latest in django model

In this model:
class Rank(models.Model):
    User = models.ForeignKey(User)
    Rank = models.ForeignKey(RankStructure)
    date_promoted = models.DateField()

    def __str__(self):
        return self.Rank.Name.order_by('promotion__date_promoted').latest()
I'm getting the error:
Exception Value:
'str' object has no attribute 'order_by'
I want the latest Rank as default. How do I set this?
Thanks.
Update #1
Added Rank Structure
class RankStructure(models.Model):
    RankID = models.CharField(max_length=4)
    SName = models.CharField(max_length=5)
    Name = models.CharField(max_length=125)
    LongName = models.CharField(max_length=512)
    GENRE_CHOICES = (
        ('TOS', 'The Original Series'),
        ('TMP', 'The Motion Picture'),
        ('TNG', 'The Next Generation'),
        ('DS9', 'Deep Space Nine'),
        ('VOY', 'VOYAGER'),
        ('FUT', 'FUTURE'),
        ('KTM', 'KELVIN TIMELINE')
    )
    Genre = models.CharField(max_length=3, choices=GENRE_CHOICES)
    SPECIALTY_OPTIONS = (
        ('CMD', 'Command'),
        ('OPS', 'Operations'),
        ('SCI', 'Science'),
        ('MED', 'Medical'),
        ('ENG', 'Engineering'),
        ('MAR', 'Marine'),
        ('FLT', 'Flight Officer'),
    )
    Specialty = models.CharField(max_length=25, choices=SPECIALTY_OPTIONS)
    image = models.FileField(upload_to=image_upload_handler, blank=True)
This is the RankStructure model referenced by the Rank field in class Rank.
The User foreign key goes to the standard User table.

The reason you’re getting an error is that self.Rank.Name is a string, not a model manager, so you can’t call order_by on it. You’ll need to go through a manager such as Rank.objects if you want to order or filter. We can’t help you with the exact query you want unless you also post the model definitions, as requested by several commenters. That said, I suspect that what you want is something like:
def __str__(self):
    # latest() needs the field to order by; the standard User model exposes username
    return Rank.objects.filter(Rank_id=self.Rank_id).latest('date_promoted').User.username

Django annotate fails when querying unique values

For a Django project I need to combine two parts lists into one.
models.py:
class UserBuild(models.Model):
    project = models.ForeignKey(Project)
    created = models.DateTimeField(auto_now_add=True)
    updated = models.DateTimeField(auto_now=True)
    part = models.ForeignKey(Parts)
    part_quantity = models.IntegerField(max_length=5, null=True, blank=True)
    suggested_quantity = models.IntegerField(
        max_length=5, null=True, blank=True
    )
views.py:
def CombineProjects(request, template_name='combined_projects.html'):
    ...
    build_set = UserBuild.objects.values(
        #'pk',
        #'project__pk',
        'part__number',
        'part__part_type__name',
        'part__price',
        'part__description',
        'part__category__name'
    ).filter(
        project__in=projects
    ).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
Basically I want to group all parts that are the same and sum their quantities. The code above works, but if I uncomment either the pk or project__pk argument, the parts are no longer grouped (I assume because those values differ even when the part is the same).
Is there some way that I can keep the grouping but also include the pk and project__pk values?

The problem is that your list of values is what you group and sum by; so you really want to group by project and part number only.
I would try:
build_set = UserBuild.objects.values(
    'project__pk',
    'part__number',
).filter(
    project__in=projects
).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
then merge the other attributes into this list later on. I'm not sure this will work however. You might need to do:
build_set = UserBuild.objects.values(
    'project_id',
    'part_id',
).filter(
    project__in=projects
).order_by('part__category', 'part__part_type').annotate(total=Sum('part_quantity'))
And do more work by hand afterwards.
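The "by hand" step could look roughly like the sketch below. It assumes the grouped rows are dictionaries with project_id, part_id and total keys (the shape the second query would produce) and that the part attributes have been loaded separately into a dictionary keyed by part id. The function name combine_parts and the sample values are made up for illustration.

```python
def combine_parts(rows, parts):
    """Sum per-project totals into one entry per part.

    rows:  dicts with 'project_id', 'part_id' and 'total' (total may be None)
    parts: dict mapping part id to a dict of that part's attributes
    """
    combined = {}
    for row in rows:
        entry = combined.setdefault(row["part_id"], {"total": 0, "project_ids": []})
        entry["total"] += row["total"] or 0  # treat a NULL sum as zero
        entry["project_ids"].append(row["project_id"])
    # Merge the part attributes back in, keeping first-seen part order.
    return [
        {"part_id": part_id, **parts[part_id], **entry}
        for part_id, entry in combined.items()
    ]


# Illustrative input only.
sample_rows = [
    {"project_id": 1, "part_id": 10, "total": 2},
    {"project_id": 2, "part_id": 10, "total": 3},
    {"project_id": 1, "part_id": 11, "total": None},
]
sample_parts = {
    10: {"number": "R-100", "price": 0.5},
    11: {"number": "C-22", "price": 1.25},
}
combined_parts = combine_parts(sample_rows, sample_parts)
```

Each resulting entry keeps the list of project ids it came from, so the grouping by part is preserved while the project information is still available.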
