Django model manager live-object issues - django

I am using Django. I am having a few issues with caching of QuerySets for news/category models:
class Category(models.Model):
title = models.CharField(max_length=60)
slug = models.SlugField(unique=True)
class PublishedArticlesManager(models.Manager):
def get_query_set(self):
return super(PublishedArticlesManager, self).get_query_set() \
.filter(published__lte=datetime.datetime.now())
class Article(models.Model):
category = models.ForeignKey(Category)
title = models.CharField(max_length=60)
slug = models.SlugField(unique = True)
story = models.TextField()
author = models.CharField(max_length=60, blank=True)
published = models.DateTimeField(
help_text=_('Set to a date in the future to publish later.'))
created = models.DateTimeField(auto_now_add=True, editable=False)
updated = models.DateTimeField(auto_now=True, editable=False)
live = PublishedArticlesManager()
objects = models.Manager()
Note - I have removed some fields to save on complexity...
There are a few (related) issues with the above.
Firstly, when I query for LIVE objects in my view via Article.live.all() if I refresh the page repeatedly I can see (in MYSQL logs) the same database query being made with exactly the same date in the where clause - ie - the datetime.datetime.now() is being evaluated at compile time rather than runtime. I need the date to be evaluated at runtime.
Secondly, when I use the articles_set method on the Category object this appears to work correctly - the datetime used in the query changes each time the query is run - again I can see this in the logs. However, I am not quite sure why this works, since I don't have anything in my code to say that the articles_set query should return LIVE entries only!?
Finally, why is none of this being cached?
Any ideas how to make the correct time be used consistently? Can someone please explain why the latter setup appears to work?
Thanks
Jay
P.S - database queries below, note the date variations.
SELECT LIVE ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE `news_article`.`published` <= '2011-05-17 21:55:41' ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
SELECT LIVE ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE `news_article`.`published` <= '2011-05-17 21:55:41' ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
CATEGORY SELECT ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE (`news_article`.`published` <= '2011-05-18 21:21:33' AND `news_article`.`category_id` = 1 ) ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
CATEGORY SELECT ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE (`news_article`.`published` <= '2011-05-18 21:26:06' AND `news_article`.`category_id` = 1 ) ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;

You should check out conditional view processing.
def latest_entry(request, article_id):
return Article.objects.latest("updated").updated
#conditional(last_modified_func=latest_entry)
def view_article(request, article_id)
your view code here
This should cache the page rather than reloading a new version every time.
I suspect that if you want the now() to be processed at runtime, you should do use raw sql. I think this will solve the compile/runtime issue.
class PublishedArticlesManager(models.Manager):
def get_query_set(self):
return super(PublishedArticlesManager, self).get_query_set() \
.raw("SELECT * FROM news_article WHERE published <= CURRENT_TIMESTAMP")
Note that this returns a RawQuerySet which may differ a bit from a normal QuerySet

I have now fixed this issue. It appears the problem was that the queryset returned by Article.live.all() was being cached in my urls.py! I was using function-based generic-views:
url(r'^all/$', object_list, {
'queryset' : Article.live.all(),
}, 'news_all'),
I have now changed this to use the class-based approach, as advised in the latest Django documentation:
url(r'^all/$', ListView.as_view(
model=Article,
), name="news_all"),
This now works as expected - by specifying the model attribute rather than the queryset attribute the query is QuerySet is created at compile-time instead of runtime.

Related

django querset filter foreign key select first record

I have a History model like below
class History(models.Model):
class Meta:
app_label = 'subscription'
ordering = ['-start_datetime']
subscription = models.ForeignKey(Subscription, related_name='history')
FREE = 'free'
Premium = 'premium'
SUBSCRIPTION_TYPE_CHOICES = ((FREE, 'Free'), (Premium, 'Premium'),)
name = models.CharField(max_length=32, choices=SUBSCRIPTION_TYPE_CHOICES, default=FREE)
start_datetime = models.DateTimeField(db_index=True)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
cancelled_datetime = models.DateTimeField(blank=True, null=True)
Now i have a queryset filtering like below
users = get_user_model().objects.all()
queryset = users.exclude(subscription__history__end_datetime__lt=timezone.now())
The issue is that in the exclude above it is checking end_datetime for all the rows for a particular history object. But i only want to compare it with first row of history object.
Below is how a particular history object looks like. So i want to write a queryset filter which can do datetime comparison on first row only.
You could use a Model Manager method for this. The documentation isn't all that descriptive, but you could do something along the lines of:
class SubscriptionManager(models.Manager):
def my_filter(self):
# You'd want to make this a smaller query most likely
subscriptions = Subscription.objects.all()
results = []
for subscription in subscriptions:
sub_history = subscription.history_set.first()
if sub_history.end_datetime > timezone.now:
results.append(subscription)
return results
class History(models.Model):
subscription = models.ForeignKey(Subscription)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
objects = SubscriptionManager()
Then: queryset = Subscription.objects().my_filter()
Not a copy-pastable answer, but shows the use of Managers. Given the specificity of what you're looking for, I don't think there's a way to get it just via the plain filter() and exclude().
Without knowing what your end goal here is, it's hard to say whether this is feasible, but have you considered adding a property to the subscription model that indicates whatever you're looking for? For example, if you're trying to get everyone who has a subscription that's ending:
class Subscription(models.Model):
#property
def ending(self):
if self.end_datetime > timezone.now:
return True
else:
return False
Then in your code: queryset = users.filter(subscription_ending=True)
I have tried django's all king of expressions(aggregate, query, conditional) but was unable to solve the problem so i went with RawSQL and it solved the problem.
I have used the below SQL to select the first row and then compare the end_datetime
SELECT (end_datetime > %s OR end_datetime IS NULL) AS result
FROM subscription_history
ORDER BY start_datetime DESC
LIMIT 1;
I will select my answer as accepted if not found a solution with queryset filter chaining in next 2 days.

Query latest votes per user per post in Django

I have a voting system where a user can vote up or down on a post. The votes will be used in a calculation, so I need to store them in a log format, ie, I am saving each vote in it's own table.
Something like this:
class PointLog(models.Model):
post = models.ForeignKey(Post, db_index=True)
points = models.IntegerField(db_index=True)
user = models.ForeignKey(User, blank=True, null=True)
time = models.DateTimeField(auto_now_add=True, db_index=True)
data = models.IntegerField() # -1, 0 or 1
Now I need to display 20 Posts, together with the last vote the user did.
I am using django-rest-framework, so I can use a serializer field that looks like this; uservote = serializers.SerializerMethodField(), together with a function like:
def get_uservote(self, obj):
user = self.context['request'].user
vote = PointLog.objects.only('data').filter(user=user, post=obj).last()
return vote.data if vote else 0
But that will do a db-query 1 time per post, which I hope there is a better solution for.
I could save a db-query for each time the get_uservote is ran by saving the queryset in self.context, so that part is covered.
But how can I do a query that based on a list of items returns all the latest data from another table.
A start would be PointLog.objects.filter(user=user, post__in=posts), but what next? Is this even possible using raw SQL in 1 query?
Update 1:
PointLog.objects.filter(...).order_by('post__id').distinct('post__id') would kinda do it, except that I don't think I am guaranteed to get the newest vote. If I use order_by('pk') (or 'time'), I can't use distinct('post__id') as I will get an sql error (ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions)
I found a solution that only added 1 extra query.
The full get_uservote that works would be;
def get_uservote(self, obj):
user = self.context['request'].user
if user.is_authenticated():
if not self.context.get('get_uservote_data'):
self.context['get_uservote_data'] = {i.post_id: i.data for i in PointLog.objects.filter(user=user, post__in=self.instance).order_by('post__id', '-pk').distinct('post__id')}
return self.context['get_uservote_data'].get(obj.id, 0)
else:
return 0 # Return 0 as "not voted" if not logged in
Note that we are doing a couple of tricks here;
Storing the data in self.context['get_uservote_data'] so we only haveto run the query one time.
Using i.post_id as the key for our data, and not i.post.id. Asking for i.post.id here would trigger 1 query per item.
self.instance is the queryset with all the posts we want to filter on.
This all resulted in a query that looks like this:
SELECT DISTINCT ON ("post_pointlog"."post_id") "post_pointlog"."id",
"post_pointlog"."post_id",
"post_pointlog"."data"
FROM "post_pointlog"
WHERE ( "post_pointlog"."user_id" = 20
AND "post_pointlog"."post_id" IN ( 1, 2, 3, 4 ) )
ORDER BY "post_pointlog"."post_id" ASC,
"post_pointlog"."id" DESC

Django queryset - Adding HAVING constraint

I have been using Django for a couple of years now but I am struggling today with adding a HAVING constraint to a GROUP BY.
My queryset is the following:
crm_models.Contact.objects\
.filter(dealercontact__dealer__pk__in=(265,),
dealercontact__activity='gardening',
date_data_collected__gte=datetime.date(2012,10,1),
date_data_collected__lt=datetime.date(2013,10,1))\
.annotate(nb_rels=Count('dealercontact'))
which gives me the following MySQL query:
SELECT *
FROM `contact`
LEFT OUTER JOIN `dealer_contact` ON (`contact`.`id_contact` = `dealer_contact`.`id_contact`)
WHERE (`dealer_contact`.`active` = True
AND `dealer_contact`.`activity` = 'gardening'
AND `contact`.`date_data_collected` >= '2012-10-01'
AND `contact`.`date_data_collected` < '2013-10-01'
AND `dealer_contact`.`id_dealer` IN (265))
GROUP BY `contact`.`id_contact`
ORDER BY NULL;
I would get exactly what I need with this HAVING constraint:
HAVING SUM(IF(`dealer_contact`.`type`='customer', 1, 0)) = 0
How can I get this fixed with a Django Queryset? I need a queryset in this instance.
Here I am using annotate only in order to get the GROUP BY on contact.id_contact.
Edit: My goal is to get the Contacts who have no "customer" relation in dealercontact but have "ref" relation(s) (according to the WHERE clause of course).
Models
class Contact(models.Model):
id_contact = models.AutoField(primary_key=True)
title = models.CharField(max_length=255L, blank=True, choices=choices_custom_sort(TITLE_CHOICES))
last_name = models.CharField(max_length=255L, blank=True)
first_name = models.CharField(max_length=255L, blank=True)
[...]
date_data_collected = models.DateField(null=True, db_index=True)
class Dealer(models.Model):
id_dealer = models.AutoField(primary_key=True)
address1 = models.CharField(max_length=45L, blank=True)
[...]
class DealerContact(Auditable):
id_dealer_contact = models.AutoField(primary_key=True)
contact = models.ForeignKey(Contact, db_column='id_contact')
dealer = models.ForeignKey(Dealer, db_column='id_dealer')
activity = models.CharField(max_length=32, choices=choices_custom_sort(ACTIVITIES), db_index=True)
type = models.CharField(max_length=32, choices=choices_custom_sort(DEALER_CONTACT_TYPE), db_index=True)
I figured this out by adding two binary fields in DealerContact: is_ref and is_customer.
If type='ref' then is_ref=1 and is_customer=0.
Else if type='customer' then is_ref=0 and is_customer=1.
Thus, I am now able to use annotate(nb_customers=Sum('is_customer')) and then use filter(nb_customers=0).
The final queryset consists in:
Contact.objects.filter(dealercontact__dealer__pk__in=(265,),
dealercontact__activity='gardening',
date_data_collected__gte=datetime.date(2012,10,1),
date_data_collected__lt=datetime.date(2013,10,1))\
.annotate(nb_customers=Sum('dealercontact__is_customer'))\
.filter(nb_customers=0)
Actually there is a way you can add your own custom HAVING and GROUP BY clauses if you need.
Just use my example with caution - if Django ORM code/paths will change in future Django versions, you will have to update your code too.
Image you have Book and Edition models, where for each book there can be multiple editions and you want to select first US edition date within Book queryset.
Adding custom HAVING and GROUP BY clauses in Django 1.5+:
from django.db.models import Min
from django.db.models.sql.where import ExtraWhere, AND
qs = Book.objects.all()
# Standard annotate
qs = qs.annotate(first_edition_date=Min("edition__date"))
# Custom HAVING clause, to limit annotation by US country only
qs.query.having.add(ExtraWhere(['"app_edition"."country"=%s'], ["US"]), AND)
# Custom GROUP BY clause will be needed too
qs.query.group_by.append(("app_edition", "country"))
ExtraWhere can contain not just fields, but any raw sql conditions and functions too.
Are you not using raw query just because you want orm object? Using Contact.objects.raw() generate instances similar filter. Refer to https://docs.djangoproject.com/en/dev/topics/db/sql/ for more help.
My goal is to get the Contacts who have no "customer" relation in
dealercontact but have "ref" relation(s) (according to the WHERE
clause of course).
This simple query fulfills this requirement:
Contact.objects.filter(dealercontact__type="ref").exclude(dealercontact__type="customer")
Is this enough, or do you need it to do something more?
UPDATE: if your requirement is
Contacts that have a "ref" relations, but do not have "customer"
relations with the same dealer
you can do this:
from django.db.models import Q
Contact.objects.filter(Q(dealercontact__type="ref") & ~Q(dealercontact__type="customer"))

"SELECT...AS..." with related model data in Django

I have an application where users select their own display columns. Each display column has a specified formula. To compute that formula, I need to join few related columns (one-to-one relationship) and compute the value.
The models are like (this is just an example model, actual has more than 100 fields):
class CompanyCode(models.Model):
"""Various Company Codes"""
nse_code = models.CharField(max_length=20)
bse_code = models.CharField(max_length=20)
isin_code = models.CharField(max_length=20)
class Quarter(models.Model):
"""Company Quarterly Result Figures"""
company_code = models.OneToOneField(CompanyCode)
sales_now = models.IntegerField()
sales_previous = models.IntegerField()
I tried doing:
ratios = {'growth':'quarter__sales_now / quarter__sales_previous'}
CompanyCode.objects.extra(select=ratios)
# raises "Unknown column 'quarter__sales_now' in 'field list'"
I also tried using raw query:
query = ','.join(['round((%s),2) AS %s' % (formula, ratio_name)
for ratio_name, formula in ratios.iteritems()])
companies = CompanyCode.objects.raw("""
SELECT `backend_companycode`.`id`, %s
FROM `backend_companycode`
INNER JOIN `backend_quarter` ON ( `backend_companycode`.`id` = `backend_companyquarter`.`company_code_id` )
""", [query])
#This just gives empty result
So please give me a little clue as to how I can use related columns preferably using 'extra' command. Thanks.
By now the Django documentation says that one should use extra as a last resort.
So here is a query without extra():
from django.db.models import F
CompanyCode.objects.annotate(
growth=F('quarter__sales_now') / F('quarter__sales_previous'),
)
Since the calculation is being done on a single Quarter instance, where's the need to do it in the SELECT? You could just define a ratio method/property on the Quarter model:
#property
def quarter(self):
return self.sales_now / self.sales_previous
and call it where necessary
Ok, I found it out. In above using:
CompanyCode.objects.select_related('quarter').extra(select=ratios)
solved the problem.
Basically, to access any related model data through 'extra', we just need to ensure that that model is joined in our query. Using select_related, the query automatically joins the mentioned models.
Thanks :).

Ordering by a custom model field in django

I am trying to add an additional custom field to a django model. I have been having quite a hard time figuring out how to do the following, and I will be awarding a 150pt bounty for the first fully correct answer when it becomes available (after it is available -- see as a reference Improving Python/django view code).
I have the following model, with a custom def that returns a video count for each user --
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
positions = models.ManyToManyField('Position', through ='PositionTimestamp', blank=True)
def count(self):
from django.db import connection
cursor = connection.cursor()
cursor.execute(
"""SELECT (
SELECT COUNT(*)
FROM videos_video v
WHERE v.uploaded_by_id = p.id
OR EXISTS (
SELECT NULL
FROM videos_videocredit c
WHERE c.video_id = v.id
AND c.profile_id = p.id
)
) AS Total_credits
FROM userprofile_userprofile p
WHERE p.id = %d"""%(int(self.pk))
)
return int(cursor.fetchone()[0])
I want to be able to order by the count, i.e., UserProfile.objects.order_by('count'). Of course, I can't do that, which is why I'm asking this question.
Previously, I tried adding a custom model Manager, but the problem with that was I also need to be able to filter by various criteria of the UserProfile model: Specifically, I need to be able to do: UserProfile.objects.filter(positions=x).order_by('count'). In addition, I need to stay in the ORM (cannot have a raw sql output) and I do not want to put the filtering logic into the SQL, because there are various filters, and would require several statements.
How exactly would I do this? Thank you.
My reaction is that you're trying to take a bigger bite than you can chew. Break it into bite size pieces by giving yourself more primitives to work with.
You want to create these two pieces separately so you can call on them:
Does this user get credit for this video? return boolean
For how many videos does this user get credit? return int
Then use a combination of #property, model managers, querysets, and methods that make it easiest to express what you need.
For example you might attach the "credit" to the video model taking a user parameter, or the user model taking a video parameter, or a "credit" manager on users which adds a count of videos for which they have credit.
It's not trivial, but shouldn't be too tricky if you work for it.
"couldn't you use something like the "extra" queryset modifier?"
see the docs
I didn't put this in an answer at first because I wasn't sure it would actually work or if it was what you needed - it was more like a nudge in the (hopefully) right direction.
in the docs on that page there is an example
query
Blog.objects.extra(
select={
'entry_count': 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'
},
)
resulting sql
SELECT blog_blog.*, (SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id) AS entry_count
FROM blog_blog;
Perhaps doing something like that and accessing the user id which you currently have as p.id as appname_userprofile.id
note:
Im just winging it so try to play around a bit.
perhaps use the shell to output the query as sql and see what you are getting.
models:
class Positions(models.Model):
x = models.IntegerField()
class Meta:
db_table = 'xtest_positions'
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
positions = models.ManyToManyField(Positions)
class Meta:
db_table = 'xtest_users'
class Video(models.Model):
usr = models.ForeignKey(UserProfile)
views = models.IntegerField()
class Meta:
db_table = 'xtest_video'
result:
test = UserProfile.objects.annotate(video_views=Sum('video__views')).order_by('video_views')
for t in test:
print t.video_views
doc: https://docs.djangoproject.com/en/dev/topics/db/aggregation/
This is either what you want, or I've completely misunderstood!.. Anywhoo... Hope it helps!