NDB - querying repeated structured property for attribute - python-2.7

I have two models:
from google.appengine.ext import ndb
class Author(ndb.Model):
    email = ndb.StringProperty(indexed=True)
class Course(ndb.Model):
    student = ndb.StructuredProperty(Author, repeated=True)
I am trying to query Course to find where a student's email matches that of user.email_address. Is it possible to structure this as a single query?

You can filter on the sub-property directly:
query = Course.query(Course.student.email == 'my@email.com')
But this form is only correct when you are filtering on a single sub-property. To match several sub-properties on the same student, the official documentation suggests filtering on an Author instance instead:
query = Course.query(Course.student == Author(email='my@email.com'))
See https://cloud.google.com/appengine/docs/standard/python/ndb/queries#filtering_structured_properties for more information

Related

Get models in Django that have all of the values in ManyToMany field (AND-query, no reverse lookups allowed)

I have such a model in Django:
class VariantTag(models.Model):
    saved_variants = models.ManyToManyField('SavedVariant')
I need to get all VariantTag models that have saved_variants ManyToMany field with exact ids, say (250, 251), no more, no less. By the nature of the code that I am dealing with there is no way I can do reverse lookup with _set. So, I am looking for a query (or several queries + additional python code filtering) that will get me there but in such a way:
query = Q(...)
tag_queryset = VariantTag.objects.filter(query)
How can this be achieved?
I should probably stress: the supplied saved variants (e.g. (250, 251)) should be ANDed, not ORed.
Use the __in lookup (this returns tags that have any of the given ids):
tag_queryset = VariantTag.objects.filter(saved_variants__in=[250,251])
So far I have been able to get the AND result with the following code:
tag_ids = VariantTag.objects.filter(
    variant_tag_type__name=tag_data['tag'],
    saved_variants__in=saved_variant_ids,
).values_list('id', flat=True).distinct()
for tag_id in tag_ids:
    saved_variants = list(VariantTag.objects.get(id=tag_id).saved_variants.all().values_list('id', flat=True))
    if all(s in saved_variant_ids for s in saved_variants) and len(saved_variants) == len(saved_variant_ids):
        return VariantTag.objects.get(id=tag_id)
So, I am doing the following:
1. Getting the OR result.
2. Iterating over the ids of the retrieved models and, for each one, getting all of the ids in its ManyToMany field.
3. Checking that all of the obtained ManyToMany ids are in the required list (saved_variant_ids) and that the two lists have the same length.
4. If yes, getting the model by its id: VariantTag.objects.get(id=tag_id)
In my case there will be only one such model that has the required ids in its ManyToMany field. If that is not the case for you, append the model ids (in my case tag_id) to a list and then make a single query for all of them.
If anyone has a more concise way of doing an AND ManyToMany query, I would be interested to see it.
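One possibly more concise route is to let the database do the exact-set check by grouping the join table. The sketch below only illustrates that idea with the standard-library sqlite3 module. The table tag_variants, its columns and the helper tags_with_exact_variants are hypothetical stand-ins, not the names Django generates, and it assumes each (tag, variant) pair appears at most once.

```python
import sqlite3

# Hypothetical stand-in for the many-to-many join table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag_variants ("
    "tag_id INTEGER, variant_id INTEGER, UNIQUE (tag_id, variant_id))"
)
conn.executemany(
    "INSERT INTO tag_variants VALUES (?, ?)",
    [
        (1, 250), (1, 251),            # exactly the wanted set
        (2, 250),                      # subset only
        (3, 250), (3, 251), (3, 252),  # superset
        (4, 251), (4, 252),            # right size, wrong members
    ],
)


def tags_with_exact_variants(conn, variant_ids):
    """Return ids of tags linked to exactly `variant_ids`, no more, no less."""
    wanted = sorted(set(variant_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT tag_id FROM tag_variants GROUP BY tag_id "
        # The tag must have as many variants as we asked for...
        "HAVING COUNT(*) = ? "
        # ...and that same number of them must be in the wanted list.
        "AND SUM(variant_id IN ({})) = ? "
        "ORDER BY tag_id"
    ).format(placeholders)
    params = [len(wanted)] + wanted + [len(wanted)]
    return [row[0] for row in conn.execute(sql, params)]


exact_tags = tags_with_exact_variants(conn, [250, 251])
```

With the sample rows above, only tag 1 qualifies for (250, 251). The two HAVING conditions together play the role of the all(...) and len(...) checks in the loop.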

Comparing JSONFields in Django

If two models both have JSONFields, is there a way to match one against the other? Say I have two models:
class Crabadoodle(Model):
    classification = CharField()
    metadata = JSONField()
class Glibotz(Model):
    rating = IntegerField()
    metadata = JSONField()
If I have a Crabadoodle and want to fetch all the Glibotz objects with identical metadata fields, how would I go about that? If I know specific contents, I can filter simple enough, but how do you go about matching on the whole field?
There is no implementation of this in Django, but it is possible with a raw query using the PostgreSQL jsonb containment operators (@> and <@).
Something along the lines of the following:
select *
from someapp_crabadoodle crab
join someapp_glibotz glib
  on crab.metadata @> glib.metadata and crab.metadata <@ glib.metadata
where crab.id = 1

Sorting by many to many relationship

In the simplified version of my problem I have a model Document that has a many-to-many relationship to Tag. I would like a query that, given a list of tags, sorts the Documents by how many of the tags they match, i.e. documents that match more tags are displayed first and documents that match fewer tags are displayed later. I know how to do this with a large plain SQL query, but I'm having difficulty getting it to work with querysets. Could anyone help?
class Document(models.Model):
    title = CharField(max_length=20)
    content = TextField()
class Tag(models.Model):
    display_name = CharField(max_length=10)
    documents = ManyToManyField(Document, related_name="tags")
I would like to do something like the following:
documents = Document.objects.all().order_by(count(tags__in = ["java", "python"]))
and get first the documents that match both "java" and "python", then the documents that match only one of them and finally the documents that don't match any.
Thanks in advance for your help.
Have a look at this question: How to sort by annotated Count() in a related model in Django
Relevant documentation: https://docs.djangoproject.com/en/1.6/topics/db/aggregation/#order-by
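For reference, the idea behind that pointer (count the matching tags per document, then order by the count) can be sketched in plain SQL. This uses the standard-library sqlite3 module so it runs on its own. The table names, the sample rows and the helper documents_by_tag_matches are assumptions made for illustration and are not the tables Django would create. In the ORM this would presumably become an annotate() with Count() followed by order_by().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE tag_documents (tag_id INTEGER, document_id INTEGER);
    INSERT INTO document VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO tag VALUES (1, 'java'), (2, 'python'), (3, 'ruby');
    -- A has java and python, B has java, C has ruby, D has no tags
    INSERT INTO tag_documents VALUES (1, 1), (2, 1), (1, 2), (3, 3);
""")


def documents_by_tag_matches(conn, tag_names):
    """Return (title, matches) rows, documents with most matching tags first."""
    placeholders = ", ".join("?" for _ in tag_names)
    sql = (
        "SELECT d.title, COUNT(t.id) AS matches "
        "FROM document d "
        # LEFT JOINs keep documents that match none of the tags.
        "LEFT JOIN tag_documents td ON td.document_id = d.id "
        "LEFT JOIN tag t ON t.id = td.tag_id "
        "AND t.display_name IN ({}) "
        "GROUP BY d.id ORDER BY matches DESC, d.title"
    ).format(placeholders)
    return conn.execute(sql, list(tag_names)).fetchall()


ranked = documents_by_tag_matches(conn, ["java", "python"])
```

Here ranked starts with document A (two matches), then B (one match), then C and D with zero. COUNT(t.id) ignores the NULLs produced by the LEFT JOIN, which is why non-matching documents get a count of 0 instead of being dropped.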

Efficiently select latest items of different categories

Consider the following model:
class Data(models.Model):
    created_at = models.DateTimeField()
    category = models.CharField(max_length=7)
I want to select the latest object for all categories.
Following this question, I'm selecting the distinct categories and then making a separate query for each of them:
categories = Data.objects.distinct('category').values_list('category', flat=True)
for category in categories:
    latest_obj = Data.objects.filter(category=category).latest('created_at')
The downside of the approach is that it makes lots of queries (1 for the distinct categories, and then a separate query per category).
Is there a way to do this with a single query?
Typically, you would use a GROUP BY in a relational database. Django has an aggregation API
(https://docs.djangoproject.com/en/dev/topics/db/aggregation/#aggregation) which allows you to do the following:
from django.db.models import Max
Data.objects.values('category').annotate(latest=Max('created_at'))
This will perform a single query and return a list like this:
[{'category': 'cat1', 'latest': '01/01/01'}, {'category': 'cat2', 'latest': '02/02/02'}]
But I guess you might want to retrieve the data record id as well within this list. Django does not make things simple for you in this case. The problem is that Django uses all fields in the values() clause for the grouping, so you cannot return extra columns from the query.
EDIT: I originally proposed to add a second values() clause to the end of the query based on web resources but this does not add extra columns to the result set.
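To show what a single query that also returns the record id can look like at the SQL level, here is a minimal sketch that joins the table to its own grouped maximum. It uses the standard-library sqlite3 module with a hypothetical data table and made-up rows. It is not something the aggregation call above produces, and if two rows in a category share the same latest timestamp it returns both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER PRIMARY KEY, created_at TEXT, category TEXT);
    INSERT INTO data VALUES
        (1, '2001-01-01', 'cat1'),
        (2, '2001-03-01', 'cat1'),
        (3, '2002-02-02', 'cat2'),
        (4, '2002-01-01', 'cat2');
""")

# Inner query: latest timestamp per category (the GROUP BY part).
# Outer join: pick the full rows whose timestamp equals that maximum.
LATEST_PER_CATEGORY = """
    SELECT d.id, d.category, d.created_at
    FROM data d
    JOIN (
        SELECT category, MAX(created_at) AS latest
        FROM data GROUP BY category
    ) m ON m.category = d.category AND m.latest = d.created_at
    ORDER BY d.category
"""

latest_rows = conn.execute(LATEST_PER_CATEGORY).fetchall()
```

With the sample rows, latest_rows holds row 2 for cat1 and row 3 for cat2. A query like this could be run through a raw query if the ORM form proves too awkward.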

Limit django queryset by another related table

Let's say I have two Django models like this:
class Spam(models.Model):
    somefield = models.CharField()
class Eggs(models.Model):
    parent_spam = models.ForeignKey(Spam)
    child_spam = models.ForeignKey(Spam)
Given a "Spam" object as input, what would the Django query look like that:
1. Limits the query based on the parent_spam field in the "Eggs" table
2. Gives me the corresponding child_spam field
3. Returns a set of "Spam" objects
In SQL:
SELECT * FROM Spam WHERE id IN (SELECT child_spam FROM Eggs WHERE parent_spam = 'input_id')
I know this is only an example, but this model setup doesn't actually validate as it is - you can't have two separate ForeignKeys pointing at the same model without specifying a related_name. So, assuming the related names are egg_parent and egg_child respectively, and your existing Spam object is called my_spam, this would do it:
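# egg_parent is a reverse manager over Eggs rows, so collect child_spam from each row
[egg.child_spam for egg in my_spam.egg_parent.select_related('child_spam')]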
or
Spam.objects.filter(egg_child__parent_spam=my_spam)
Even better, define a ManyToManyField('self') on the Spam model, which handles all this for you, then you would do:
my_spam.other_spams.all()
According to your SQL code, you need something like this:
Spam.objects.filter(
    id__in=Eggs.objects.values_list('child_spam').filter(parent_spam='input_id')
)
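One practical difference between the subquery form and a join-based filter such as filter(egg_child__parent_spam=my_spam) is worth keeping in mind: a join can return the same Spam more than once when several Eggs rows link the same pair, in which case adding distinct() should help. The sketch below shows that difference in plain SQL with the standard-library sqlite3 module. The table layout, the _id column names and both helper functions are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spam (id INTEGER PRIMARY KEY, somefield TEXT);
    CREATE TABLE eggs (
        id INTEGER PRIMARY KEY,
        parent_spam_id INTEGER REFERENCES spam(id),
        child_spam_id INTEGER REFERENCES spam(id)
    );
    INSERT INTO spam VALUES
        (1, 'parent'), (2, 'child a'), (3, 'child b'), (4, 'unrelated');
    -- child 2 is linked to parent 1 twice, child 3 once
    INSERT INTO eggs (parent_spam_id, child_spam_id)
        VALUES (1, 2), (1, 2), (1, 3), (4, 1);
""")


def children_via_subquery(conn, parent_id):
    """Mirror of the SQL in the question: WHERE id IN (subquery)."""
    rows = conn.execute(
        "SELECT id FROM spam WHERE id IN "
        "(SELECT child_spam_id FROM eggs WHERE parent_spam_id = ?) "
        "ORDER BY id",
        (parent_id,),
    )
    return [row[0] for row in rows]


def children_via_join(conn, parent_id):
    """Join-based equivalent; yields one row per matching eggs row."""
    rows = conn.execute(
        "SELECT spam.id FROM spam "
        "JOIN eggs ON eggs.child_spam_id = spam.id "
        "WHERE eggs.parent_spam_id = ? ORDER BY spam.id",
        (parent_id,),
    )
    return [row[0] for row in rows]


subquery_ids = children_via_subquery(conn, 1)
join_ids = children_via_join(conn, 1)
```

With the sample data, the subquery form lists each child once, while the join form lists child 2 twice because two eggs rows point at it.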