Django complicated queryset - django

I am trying to figure out how to solve this problem, without any luck. An Author has many books divided into genres, and I would like a query for an author to return the author together with book objects grouped by genre.
The Author object would have these properties:
name
fantasy - would hold one book, chosen by a given date
crime - would hold one book, chosen by a given date
romance - would hold one book, chosen by a given date
Is there a sane way to achieve this without adding thousands of foreign keys (if I had that many genres) to the Author model?
class Author(models.Model):
    name = models.CharField(u'Name', max_length=100)

GENRE = (
    (0, u'Fantasy'),
    (1, u'Crime'),
    (2, u'Romance'),
)

class Book(models.Model):
    author = models.ForeignKey(Author)
    name = models.CharField(u'Name', max_length=100)
    genre = models.SmallIntegerField(u'Genre', choices=GENRE)
    date = models.DateField(u'Publish date')
EDIT:
After closer inspection, sgarza62's example seems to perform badly with large amounts of data.
So I tried Prefetch, a new feature in Django 1.7:
# Prefetch lives in django.db.models; the lookup is the reverse accessor
# "book_set", and the filter uses the "date" field from the Book model above.
authors = Author.objects.all().prefetch_related(
    Prefetch("book_set", queryset=Book.objects.filter(genre=0, date__lte=datetime.datetime.now()), to_attr='x_fantasy'),
    Prefetch("book_set", queryset=Book.objects.filter(genre=1, date__lte=datetime.datetime.now()), to_attr='x_crime'),
    Prefetch("book_set", queryset=Book.objects.filter(genre=2, date__lte=datetime.datetime.now()), to_attr='x_romance')
)
But I have two issues with this: how to prefetch only one object (the latest book in this example), and how to apply ordering based on the prefetched values.

If you're querying all or several authors, I recommend prefetching the related objects. This fetches all related books in one additional query, instead of one query per author, and caches them on the objects in the queryset.
authors = Author.objects.all().prefetch_related('book_set')
for author in authors:
    # accessing the related field will not cause a hit to the db,
    # because the values were prefetched and cached
    for book in author.book_set.all():
        print(book.genre)
If you're only querying one author, then it's not such a big deal.
author = Author.objects.get(pk=1)
her_books = author.book_set.all()  # the related manager itself is not iterable
for book in her_books:
    print(book.genre)
Edit
I'm having a bit of trouble understanding exactly what you're going to do. But, if you're looking for the latest book of each genre, for a given author:
author = Author.objects.get(pk=1)
author_books = author.book_set.order_by('-date')  # most recent first
author_genres = set(b.genre for b in author_books)
for g in author_genres:
    print(next((b for b in author_books if b.genre == g), None))
Keep in mind that these operations are all on the Queryset, and are not hitting the database each time. This is good, because querying the database is an expensive operation, and most authors have a relatively small list of works, so the Querysets will generally be small.
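As a rough illustration of the "latest book per genre" idea, here is a plain-Python sketch that makes a single pass over an author's books. It does not use the Django ORM: the Book namedtuple and the latest_per_genre helper are hypothetical stand-ins, and the cutoff date plays the role of the "given date" from the question.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for Book rows; not the Django model itself.
Book = namedtuple("Book", ["name", "genre", "date"])


def latest_per_genre(books, on_or_before):
    """Return {genre: most recent book dated on or before the cutoff}."""
    latest = {}
    for book in books:
        if book.date > on_or_before:
            continue  # published after the cutoff, so ignore it
        current = latest.get(book.genre)
        if current is None or book.date > current.date:
            latest[book.genre] = book
    return latest


books = [
    Book("A", 0, date(2014, 1, 1)),
    Book("B", 0, date(2014, 6, 1)),
    Book("C", 1, date(2013, 3, 1)),
    Book("D", 2, date(2015, 1, 1)),
]
result = latest_per_genre(books, date(2014, 12, 31))
print({genre: book.name for genre, book in result.items()})
```

Genres with no book on or before the cutoff are simply absent from the result, so callers would use result.get(genre) to handle that case.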

Related

How to set the range of the primary-key of django objects based on the foreign key range and is it advisable?

I have the below models:
class Author(model):
    name = CharField(...)
class Post(model):
    author = ForeignKey(Author)
    title = CharField(...)
Suppose we have at most 100 authors, so the primary keys of the authors would be in the range 1 to 100.
I want the primary keys of the Post model to be based on the primary key of the post's author.
I mean, if the author's primary key is 34, then his/her posts' primary keys would be 34000001, 34000002, 34000003.
Is it advisable to do this, and how can I do it?
This is generally not advisable and might tempt you to follow (or invent) antipatterns. If your goal is to be able to easily access a particular author's posts, for example, you would be safest using Django's normal ORM patterns. For example:
# models.py
class Author(model):
    name = CharField(...)
class Post(model):
    author = ForeignKey(Author, related_name='posts')
    title = CharField(...)
Then anywhere you want, you can...
sally = Author.objects.get(id=1)
sallys_posts = sally.posts.all()
This is much safer than alternatives you might otherwise be tempted by, such as Post.objects.filter(pk__startswith=sally.pk), which I think would lead to a large number of bugs down the line and would also mean missing out on many benefits of the normal Django ORM patterns.
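One concrete way such a prefix scheme can go wrong, sketched in plain Python rather than through the ORM: a string-prefix match on the key cannot tell author 3 from author 34. The posts_with_pk_prefix helper is hypothetical and only models the prefix comparison; how a real pk__startswith lookup behaves on an integer column depends on the database backend.

```python
def posts_with_pk_prefix(post_pks, author_pk):
    """Model a string-prefix match on primary keys (hypothetical helper)."""
    prefix = str(author_pk)
    return [pk for pk in post_pks if str(pk).startswith(prefix)]


# Under the proposed scheme: author 3 owns 3000001, author 34 owns 34000001 and 34000002.
post_pks = [3000001, 34000001, 34000002]

matches_for_author_3 = posts_with_pk_prefix(post_pks, 3)
print(matches_for_author_3)  # also picks up author 34's posts
```

With a ForeignKey, author.posts.all() filters on the author column itself, so no such collision can occur.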

Django models: foreign key or multiple data in a field

Actually, this question has puzzled me for a long time.
Say, I have two models, Course and CourseDate, as follows:
class Course(models.Model):
    name = models.CharField()
class CourseDate(models.Model):
    course = models.ForeignKey(Course)
    date = models.DateField()
where CourseDate holds the dates on which a certain course will take place.
I could also define Course as follows and discard CourseDate:
class Course(models.Model):
    name = models.CharField()
    dates = models.CharField()
where the dates field contains dates represented by strings. For example:
dates = '2016-10-1,2016-10-12,2016-10-30'
I don't know if the second solution is kind of "cheating". So, which one is better?
I don't know about cheating, but it certainly goes against good database design. More to the point, it prevents you from doing almost all kinds of useful queries on that field. What if you wanted to know all courses that had dates within two days of a specific date? Almost impossible to do that with solution 2, but simple with solution 1.
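To make the "within two days of a specific date" point concrete, here is a small sketch using the standard-library sqlite3 module instead of Django. The table and column names are assumptions chosen to mirror solution 1, and the sample rows are made up. With one row per date, the query is a plain range filter.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course_date (course_id INTEGER REFERENCES course(id), date TEXT);
    INSERT INTO course VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO course_date VALUES
        (1, '2016-10-01'), (1, '2016-10-12'), (2, '2016-10-30');
""")

target = date(2016, 10, 11)
low = (target - timedelta(days=2)).isoformat()
high = (target + timedelta(days=2)).isoformat()

# ISO-formatted dates sort correctly as strings, so BETWEEN works on the TEXT column.
rows = conn.execute(
    "SELECT DISTINCT c.name FROM course c "
    "JOIN course_date d ON d.course_id = c.id "
    "WHERE d.date BETWEEN ? AND ? ORDER BY c.name",
    (low, high),
).fetchall()
names = [name for (name,) in rows]
print(names)
```

With solution 2, the same question would require loading every course and splitting the comma-separated string in application code.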

Many to Many Exclude on Multiple Objects

I have the following models:
class Deal(models.Model):
    date = models.DateTimeField(auto_now_add=True)
    retailer = models.ForeignKey(Retailer, related_name='deals')
    description = models.CharField(max_length=255)
    ...etc
class CustomerProfile(models.Model):
    saved_deals = models.ManyToManyField(Deal, related_name='saved_by_customers', null=True, blank=True)
    dismissed_deals = models.ManyToManyField(Deal, related_name='dismissed_by_customers', null=True, blank=True)
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
I'm having trouble wrapping my head around the many-to-many relationship and am having no luck figuring out how to do this query. I'm assuming I should use an exclude on Deal.objects() but all the examples I see for exclude are excluding one item, not what amounts to multiple items.
When I naively tried just:
deals = Deal.objects.exclude(customer.saved_deals).all()
I get the error: "'ManyRelatedManager' object is not iterable"
If I say:
deals = Deal.objects.exclude(customer.saved_deals.all()).all()
I get "Too many values to unpack" (though I feel I should note there are only 5 deals and 2 customers in the database right now)
We (our client) presumes that he/she will have thousands of customers and tens of thousands of deals in the future, so I'd like to stay performance oriented as best I can. If this setup is incorrect, I'd love to know a better way.
Also, I am running django 1.5 as this is deployed on App Engine (using CloudSQL)
Where am I going wrong?
I suggest using customer.saved_deals to get the list of deal ids to exclude (use values_list to quickly convert it to a flat list).
This should save you from excluding by a field in a joined table.
deals = Deal.objects.exclude(id__in=customer.saved_deals.values_list('id', flat=True))
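For intuition, an exclude on id__in with a list of ids corresponds roughly to a NOT IN filter in SQL. The sketch below shows that shape with the standard-library sqlite3 module, applied to dismissed deals as the question asks. The table names, column names and sample rows are assumptions for illustration, not the tables Django would create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deal (id INTEGER PRIMARY KEY, description TEXT);
    -- hypothetical join table behind a dismissed_deals many-to-many field
    CREATE TABLE dismissed (customer_id INTEGER, deal_id INTEGER);
    INSERT INTO deal VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO dismissed VALUES (10, 2), (11, 3);
""")

customer_id = 10
rows = conn.execute(
    "SELECT id FROM deal WHERE id NOT IN "
    "(SELECT deal_id FROM dismissed WHERE customer_id = ?) ORDER BY id",
    (customer_id,),
).fetchall()
visible_deal_ids = [row[0] for row in rows]
print(visible_deal_ids)
```

The exclusion happens inside the database, so the amount of data sent back stays proportional to the deals the customer can still see.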
You'd want to change this:
deals = Deal.objects.exclude(customer.saved_deals).all()
To something like this:
deals = Deal.objects.exclude(customer__id__in=[1,2,etc..]).all()
Basically, customer is the many-to-many foreign key, so you can't use it directly with an exclude.
Deals saved and deals dismissed are two fields describing almost the same thing. There is also a risk that too many columns are used in the database if these two fields are allowed to store NULL values. It is worth considering removing dismissed_deals altogether and using a single saved field holding True or False.
Another thing to think about is moving saved_deals out of the CustomerProfile class and into the Deal class. Saved deals are about deals, so the field arguably belongs on Deal.
class Deal(models.Model):
    saved = models.BooleanField()
    ...
A real deal would be made by one customer/buyer rather than several. A real customer can have millions of deals, so relating each deal to a customer would be a good approach.
class Deal(models.Model):
    saved = models.BooleanField()
    customer = models.ForeignKey(CustomerProfile)
    ....
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
deals_for_customer = Deal.objects.filter(customer__name="John")
There is a double underscore between customer and name (customer__name), which lets you filter across the relation: customer is the field pointing to the CustomerProfile model, and name is a field on that model (assuming the CustomerProfile class has a name attribute).
deals_saved = deals_for_customer.filter(saved=True)
That's it. I hope I could help. Let me know if not.

Django using the values method with m2m relationships / filtering m2m tables using django

class Book(models.Model):
    name = models.CharField(max_length=127, blank=False)
class Author(models.Model):
    name = models.CharField(max_length=127, blank=False)
    books = models.ManyToManyField(Book)
I am trying to filter the authors so I can return a result set of authors like:
[{id: 1, name: 'Grisham', books : [{name: 'The Client'},{name: 'The Street Lawyer'}], ..]
Before I had the m2m relationship on author I was able to query for any number of author records and get all of the values I needed using the values method with only one db query.
But it looks like
Author.objects.all().values('name', 'books')
would return something like:
[{id: 1, name: 'Grisham', books :{name: 'The Client'}},{id: 1, name: 'Grisham', books :{name: 'The Street Lawyer'}}]
Looking at the docs it doesn't look like that is possible with the values method.
https://docs.djangoproject.com/en/dev/ref/models/querysets/
Warning Because ManyToManyField attributes and reverse relations can
have multiple related rows, including these can have a multiplier
effect on the size of your result set. This will be especially
pronounced if you include multiple such fields in your values() query,
in which case all possible combinations will be returned.
I want to get a result set of size n with the fewest possible database hits; calling authorObject.books.all() for each author would result in at least n db hits.
Is there a way to do this in django?
I think one way of doing this with the least amount of database hits would be to :
author_ids = Author.objects.all().values_list('id', flat=True)
q = Q()
for id in author_ids:
    q = q | Q(author__id=id)
# m2m author book table.. from my understanding it is
# not accessible in the django QuerySet
author_author_books.filter(q)  # grab all of the book ids and author ids with one query
Is there a built-in way to query the m2m author_author_books table, or am I going to have to write the SQL? Is there a way to take advantage of Q() for doing OR logic in raw SQL?
Thanks in advance.
I think you want prefetch_related. Something like this:
authors = Author.objects.prefetch_related('books').all()
More on this here.
If you want to query your author_author_books table, I think you need to specify a "through" table:
class BookAuthor(models.Model):
    book = models.ForeignKey(Book)
    author = models.ForeignKey('Author')  # string reference, because Author is defined below
class Author(models.Model):
    name = models.CharField(max_length=127, blank=False)
    books = models.ManyToManyField(Book, through=BookAuthor)
and then you can query BookAuthor like any other model.
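Conceptually, prefetch_related runs a separate query for the relationship and does the joining in Python. The sketch below imitates that with the standard-library sqlite3 module, using two queries in total and grouping in application code, to build the nested structure the question asks for. The table names and sample rows are assumptions, and this is only an illustration of the idea, not Django's actual implementation.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE author_books (author_id INTEGER, book_id INTEGER);
    INSERT INTO author VALUES (1, 'Grisham'), (2, 'King');
    INSERT INTO book VALUES (1, 'The Client'), (2, 'The Street Lawyer'), (3, 'It');
    INSERT INTO author_books VALUES (1, 1), (1, 2), (2, 3);
""")

# Query 1: the authors themselves.
authors = conn.execute("SELECT id, name FROM author ORDER BY id").fetchall()

# Query 2: every linked book for those authors, in one round trip.
author_ids = [author_id for author_id, _ in authors]
placeholders = ",".join("?" for _ in author_ids)
links = conn.execute(
    "SELECT ab.author_id, b.name FROM author_books ab "
    "JOIN book b ON b.id = ab.book_id "
    "WHERE ab.author_id IN (%s) ORDER BY b.id" % placeholders,
    author_ids,
).fetchall()

# "Join" in Python: group the book rows by author id.
books_by_author = defaultdict(list)
for author_id, book_name in links:
    books_by_author[author_id].append({"name": book_name})

result = [
    {"id": author_id, "name": name, "books": books_by_author[author_id]}
    for author_id, name in authors
]
print(result)
```

The number of queries stays at two regardless of how many authors are returned, which is the property the question is after.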

Django Filter Return Many Values

I'm new to django and I think this is a simple question -
I have an intermediate class which is coded as follows -
class Link_Book_Course(models.Model):
    book = models.ForeignKey(Book)
    course = models.ForeignKey(Course)
    image = models.CharField(max_length=200, null=True)
    rating = models.CharField(max_length=200, null=True)
    def __unicode__(self):
        return self.title
    def save(self, *args, **kwargs):
        self.date_created = datetime.now()
        super(Link_Book_Course, self).save(*args, **kwargs)
I'm making this call because I'd like to get all of the authors of the books (Book is another model, with author as a CharField):
storeOfAuthorNames = Link_Book_Course.objects.filter(book__author)
However, it doesn't return a querySet of all of the authors, in fact, it throws an error.
I think it's because book__author has multiple values- how can I get all of them?
Thanks!
I don't think you're using the right queryset method. filter() filters by its arguments - so the expected usage is:
poe = Author.objects.get(name='Edgar Allan Poe')
course_books_by_poe = Link_Book_Course.objects.filter(book__author=poe)
It looks like you're trying to pull a list of the names of all the authors of books used in a particular course (or maybe all courses?). Maybe you're looking for .values() or .values_list()?
all_authors_in_courses = Link_Book_Course.objects.values_list(
    'book__author', flat=True
).distinct()
(Edit: Updated per @ftartaggia's suggestion)
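In SQL terms, asking for distinct author names across the link table amounts to roughly a SELECT DISTINCT over a join. Here is a sketch with the standard-library sqlite3 module. The table names, columns and sample rows are assumptions, with author stored as plain text on the book table as described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE link_book_course (
        id INTEGER PRIMARY KEY, book_id INTEGER, course_id INTEGER
    );
    INSERT INTO book VALUES (1, 'Poe'), (2, 'Poe'), (3, 'King'), (4, 'Austen');
    INSERT INTO link_book_course (book_id, course_id) VALUES (1, 1), (2, 1), (3, 2);
""")

rows = conn.execute(
    "SELECT DISTINCT b.author FROM link_book_course l "
    "JOIN book b ON b.id = l.book_id ORDER BY b.author"
).fetchall()
author_names = [row[0] for row in rows]
print(author_names)
```

Authors whose books are not linked to any course (Austen in this sample data) do not appear, and an author with several linked books appears once.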
As others have already explained, filter returns a subset of the objects of the model it is called on; it does not return instances of other models, related or not.
If you want Author model instances back from the Django ORM and can use the aggregation API, you might want to do something like this:
from django.db.models import Count
Author.objects.annotate(num_books=Count('book')).filter(num_books__gt=1)
The filter call you are trying to use translates more or less into SQL like this:
SELECT * FROM Link_Book_Course INNER JOIN Book ON (...) WHERE Book.author = ;
So as you see your query has an incomplete where clause.
Anyway, it's not the query you are looking for.
What about something like (assuming author is a simple text field of Book and you want only authors of books referred from Link_Book_Course instances):
Book.objects.filter(pk__in=Link_Book_Course.objects.all().values_list("book", flat=True)).values_list("author", flat=True)
To start with, a filter statement filters on a field matching some pattern. So if Book has a simple ForeignKey to Author, you could have
storeOfAuthorNames = Link_Book_Course.objects.filter(book__author="Stephen King"), but not just
storeOfAuthorNames = Link_Book_Course.objects.filter(book__author).
Once you get past that, I am guessing Book has author as a ManyToManyField, not a ForeignKey (because a book can have multiple authors, and an author can publish multiple books). In that case, filter(book__author="Stephen King") alone will still not be enough. Try Link_Book_Course.objects.filter(book__author__in=myBookObject.author.all())