Ordering by a custom model field in django - django

I am trying to add an additional custom field to a django model. I have been having quite a hard time figuring out how to do the following, and I will be awarding a 150pt bounty for the first fully correct answer when it becomes available (after it is available -- see as a reference Improving Python/django view code).
I have the following model, with a custom def that returns a video count for each user --
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
positions = models.ManyToManyField('Position', through ='PositionTimestamp', blank=True)
def count(self):
from django.db import connection
cursor = connection.cursor()
cursor.execute(
"""SELECT (
SELECT COUNT(*)
FROM videos_video v
WHERE v.uploaded_by_id = p.id
OR EXISTS (
SELECT NULL
FROM videos_videocredit c
WHERE c.video_id = v.id
AND c.profile_id = p.id
)
) AS Total_credits
FROM userprofile_userprofile p
WHERE p.id = %d"""%(int(self.pk))
)
return int(cursor.fetchone()[0])
I want to be able to order by the count, i.e., UserProfile.objects.order_by('count'). Of course, I can't do that, which is why I'm asking this question.
Previously, I tried adding a custom model Manager, but the problem with that was I also need to be able to filter by various criteria of the UserProfile model: Specifically, I need to be able to do: UserProfile.objects.filter(positions=x).order_by('count'). In addition, I need to stay in the ORM (cannot have a raw sql output) and I do not want to put the filtering logic into the SQL, because there are various filters, and would require several statements.
How exactly would I do this? Thank you.

My reaction is that you're trying to take a bigger bite than you can chew. Break it into bite size pieces by giving yourself more primitives to work with.
You want to create these two pieces separately so you can call on them:
Does this user get credit for this video? return boolean
For how many videos does this user get credit? return int
Then use a combination of #property, model managers, querysets, and methods that make it easiest to express what you need.
For example you might attach the "credit" to the video model taking a user parameter, or the user model taking a video parameter, or a "credit" manager on users which adds a count of videos for which they have credit.
It's not trivial, but shouldn't be too tricky if you work for it.

"couldn't you use something like the "extra" queryset modifier?"
see the docs
I didn't put this in an answer at first because I wasn't sure it would actually work or if it was what you needed - it was more like a nudge in the (hopefully) right direction.
in the docs on that page there is an example
query
Blog.objects.extra(
select={
'entry_count': 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'
},
)
resulting sql
SELECT blog_blog.*, (SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id) AS entry_count
FROM blog_blog;
Perhaps doing something like that and accessing the user id which you currently have as p.id as appname_userprofile.id
note:
Im just winging it so try to play around a bit.
perhaps use the shell to output the query as sql and see what you are getting.

models:
class Positions(models.Model):
x = models.IntegerField()
class Meta:
db_table = 'xtest_positions'
class UserProfile(models.Model):
user = models.ForeignKey(User, unique=True)
positions = models.ManyToManyField(Positions)
class Meta:
db_table = 'xtest_users'
class Video(models.Model):
usr = models.ForeignKey(UserProfile)
views = models.IntegerField()
class Meta:
db_table = 'xtest_video'
result:
test = UserProfile.objects.annotate(video_views=Sum('video__views')).order_by('video_views')
for t in test:
print t.video_views
doc: https://docs.djangoproject.com/en/dev/topics/db/aggregation/
This is either what you want, or I've completely misunderstood!.. Anywhoo... Hope it helps!

Related

Bin a queryset using Django?

Let's say we have the following simplistic models:
class Category(models.Model):
name = models.CharField(max_length=264)
def __str__(self):
return self.name
class Meta:
verbose_name_plural = "categories"
class Status(models.Model):
name = models.CharField(max_length=264)
def __str__(self):
return self.name
class Meta:
verbose_name_plural = "status"
class Product(models.Model):
title = models.CharField(max_length=264)
description = models.CharField(max_length=264)
category = models.ForeignKey(Category, on_delete=models.CASCADE)
price = models.DecimalField(max_digits=10)
status = models.ForeignKey(Status, on_delete=models.CASCADE)
My aim is to get some statistics, like total products, total sales, average sales etc, based on which price bin each product belongs to.
So, the price bins could be something like 0-100, 100-500, 500-1000, etc.
I know how to use pandas to do something like that:
Binning column with python pandas
I am searching for a way to do this with the Django ORM.
One of my thoughts is to convert the queryset into a list and apply a function to get the apropriate price bin and then do the statistics.
Another thought which I am not sure how to impliment, is the same as the one above but just apply the bin function to the field in the queryset I am interested in.
There are three pathways I can see.
First is composing the SQL you want to use directly and putting it to your database with a modification of your models manager class. .objects.raw("[sql goes here]"). This answer shows how to define group with a simple function on the content - something like that could work?
SELECT FLOOR(grade/5.00)*5 As Grade,
COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1
Second is that there is no reason you can't move the queryset (with .values() or .values_list()) into a pandas dataframe or similar and then bin it, as you mentioned. There is probably a bit of an efficiency loss in terms of getting the queryset into a dataframe and then processing it, but I am not sure that it would certainly or always be bad. If its easier to compose and maintain, that might be fine.
The third way I would try (which I think is what you really want) is chaining .annotate() to label points with the bin they belong in, and the aggregate count function to count how many are in each bin. This is more advanced ORM work than I've done, but I think you'd start looking at something like the docs section on conditional aggregation. I've adapted this slightly to create the 'price_class' column first, with annotate.
Product.objects.annotate(price_class=floor(F('price')/100).aggregate(
class_zero=Count('pk', filter=Q(price_class=0)),
class_one=Count('pk', filter=Q(price_class=1)),
class_two=Count('pk', filter=Q(price_class=2)), # etc etc
)
I'm not sure if that 'floor' is going to work, and you may need 'expression wrapper' to ensure the push price_class into the write type of output_field. All the best.

How to filter multiple fields with list of objects

I want to build an webapp like Quora or Medium, where a user can follow users or some topics.
eg: userA is following (userB, userC, tag-Health, tag-Finance).
These are the models:
class Relationship(models.Model):
user = AutoOneToOneField('auth.user')
follows_user = models.ManyToManyField('Relationship', related_name='followed_by')
follows_tag = models.ManyToManyField(Tag)
class Activity(models.Model):
actor_type = models.ForeignKey(ContentType, related_name='actor_type_activities')
actor_id = models.PositiveIntegerField()
actor = GenericForeignKey('actor_type', 'actor_id')
verb = models.CharField(max_length=10)
target_type = models.ForeignKey(ContentType, related_name='target_type_activities')
target_id = models.PositiveIntegerField()
target = GenericForeignKey('target_type', 'target_id')
tags = models.ManyToManyField(Tag)
Now, this would give the following list:
following_user = userA.relationship.follows_user.all()
following_user
[<Relationship: userB>, <Relationship: userC>]
following_tag = userA.relationship.follows_tag.all()
following_tag
[<Tag: tag-job>, <Tag: tag-finance>]
To filter I tried this way:
Activity.objects.filter(Q(actor__in=following_user) | Q(tags__in=following_tag))
But since actor is a GenericForeignKey I am getting an error:
FieldError: Field 'actor' does not generate an automatic reverse relation and therefore cannot be used for reverse querying. If it is a GenericForeignKey, consider adding a GenericRelation.
How can I filter the activities that will be unique, with the list of users and list of tags that the user is following? To be specific, how will I filter GenericForeignKey with the list of the objects to get the activities of the following users.
You should just filter by ids.
First get ids of objects you want to filter on
following_user = userA.relationship.follows_user.all().values_list('id', flat=True)
following_tag = userA.relationship.follows_tag.all()
Also you will need to filter on actor_type. It can be done like this for example.
actor_type = ContentType.objects.get_for_model(userA.__class__)
Or as #Todor suggested in comments. Because get_for_model accepts both model class and model instance
actor_type = ContentType.objects.get_for_model(userA)
And than you can just filter like this.
Activity.objects.filter(Q(actor_id__in=following_user, actor_type=actor_type) | Q(tags__in=following_tag))
What the docs are suggesting is not a bad thing.
The problem is that when you are creating Activities you are using auth.User as an actor, therefore you can't add GenericRelation to auth.User (well maybe you can by monkey-patching it, but that's not a good idea).
So what you can do?
#Sardorbek Imomaliev solution is very good, and you can make it even better if you put all this logic into a custom QuerySet class. (the idea is to achieve DRY-ness and reausability)
class ActivityQuerySet(models.QuerySet):
def for_user(self, user):
return self.filter(
models.Q(
actor_type=ContentType.objects.get_for_model(user),
actor_id__in=user.relationship.follows_user.values_list('id', flat=True)
)|models.Q(
tags__in=user.relationship.follows_tag.all()
)
)
class Activity(models.Model):
#..
objects = ActivityQuerySet.as_manager()
#usage
user_feed = Activity.objects.for_user(request.user)
but is there anything else?
1. Do you really need GenericForeignKey for actor? I don't know your business logic, so probably you do, but using just a regular FK for actor (just like for the tags) will make it possible to do staff like actor__in=users_following.
2. Did you check if there isn't an app for that? One example for a package already solving your problem is django-activity-steam check on it.
3. IF you don't use auth.User as an actor you can do exactly what the docs suggest -> adding a GenericRelation field. In fact, your Relationship class is suitable for this purpose, but I would really rename it to something like UserProfile or at least UserRelation. Consider we have renamed Relation to UserProfile and we create new Activities using userprofile instead. The idea is:
class UserProfile(models.Model):
user = AutoOneToOneField('auth.user')
follows_user = models.ManyToManyField('UserProfile', related_name='followed_by')
follows_tag = models.ManyToManyField(Tag)
activies_as_actor = GenericRelation('Activity',
content_type_field='actor_type',
object_id_field='actor_id',
related_query_name='userprofile'
)
class ActivityQuerySet(models.QuerySet):
def for_userprofile(self, userprofile):
return self.filter(
models.Q(
userprofile__in=userprofile.follows_user.all()
)|models.Q(
tags__in=userprofile.relationship.follows_tag.all()
)
)
class Activity(models.Model):
#..
objects = ActivityQuerySet.as_manager()
#usage
#1st when you create activity use UserProfile
Activity.objects.create(actor=request.user.userprofile, ...)
#2nd when you fetch.
#Check how `for_userprofile` is implemented this time
Activity.objects.for_userprofile(request.user.userprofile)
As stated in the documentation:
Due to the way GenericForeignKey is implemented, you cannot use such fields directly with filters (filter() and exclude(), for example) via the database API. Because a GenericForeignKey isn’t a normal field object, these examples will not work:
You could follow what the error message is telling you, I think you'll have to add a GenericRelation relation to do that. I do not have experience doing that, and I'd have to study it but...
Personally I think this solution is too complex to what you're trying to achieve. If only the user model can follow a tag or authors, why not include a ManyToManyField on it. It would be something like this:
class Person(models.Model):
user = models.ForeignKey(User)
follow_tag = models.ManyToManyField('Tag')
follow_author = models.ManyToManyField('Author')
You could query all followed tag activities per Person like this:
Activity.objects.filter(tags__in=person.follow_tag.all())
And you could search 'persons' following a tag like this:
Person.objects.filter(follow_tag__in=[<tag_ids>])
The same would apply to authors and you could use querysets to do OR, AND, etc.. on your queries.
If you want more models to be able to follow a tag or author, say a System, maybe you could create a Following model that does the same thing Person is doing and then you could add a ForeignKey to Follow both in Person and System
Note that I'm using this Person to meet this recomendation.
You can query seperately for both usrs and tags and then combine them both to get what you are looking for. Please do something like below and let me know if this works..
usrs = Activity.objects.filter(actor__in=following_user)
tags = Activity.objects.filter(tags__in=following_tag)
result = usrs | tags
You can use annotate to join the two primary keys as a single string then use that to filter your queryset.
from django.db.models import Value, TextField
from django.db.models.functions import Concat
following_actor = [
# actor_type, actor
(1, 100),
(2, 102),
]
searchable_keys = [str(at) + "__" + str(actor) for at, actor in following_actor]
result = MultiKey.objects.annotate(key=Concat('actor_type', Value('__'), 'actor_id',
output_field=TextField()))\
.filter(Q(key__in=searchable_keys) | Q(tags__in=following_tag))

"SELECT...AS..." with related model data in Django

I have an application where users select their own display columns. Each display column has a specified formula. To compute that formula, I need to join few related columns (one-to-one relationship) and compute the value.
The models are like (this is just an example model, actual has more than 100 fields):
class CompanyCode(models.Model):
"""Various Company Codes"""
nse_code = models.CharField(max_length=20)
bse_code = models.CharField(max_length=20)
isin_code = models.CharField(max_length=20)
class Quarter(models.Model):
"""Company Quarterly Result Figures"""
company_code = models.OneToOneField(CompanyCode)
sales_now = models.IntegerField()
sales_previous = models.IntegerField()
I tried doing:
ratios = {'growth':'quarter__sales_now / quarter__sales_previous'}
CompanyCode.objects.extra(select=ratios)
# raises "Unknown column 'quarter__sales_now' in 'field list'"
I also tried using raw query:
query = ','.join(['round((%s),2) AS %s' % (formula, ratio_name)
for ratio_name, formula in ratios.iteritems()])
companies = CompanyCode.objects.raw("""
SELECT `backend_companycode`.`id`, %s
FROM `backend_companycode`
INNER JOIN `backend_quarter` ON ( `backend_companycode`.`id` = `backend_companyquarter`.`company_code_id` )
""", [query])
#This just gives empty result
So please give me a little clue as to how I can use related columns preferably using 'extra' command. Thanks.
By now the Django documentation says that one should use extra as a last resort.
So here is a query without extra():
from django.db.models import F
CompanyCode.objects.annotate(
growth=F('quarter__sales_now') / F('quarter__sales_previous'),
)
Since the calculation is being done on a single Quarter instance, where's the need to do it in the SELECT? You could just define a ratio method/property on the Quarter model:
#property
def quarter(self):
return self.sales_now / self.sales_previous
and call it where necessary
Ok, I found it out. In above using:
CompanyCode.objects.select_related('quarter').extra(select=ratios)
solved the problem.
Basically, to access any related model data through 'extra', we just need to ensure that that model is joined in our query. Using select_related, the query automatically joins the mentioned models.
Thanks :).

Django model manager live-object issues

I am using Django. I am having a few issues with caching of QuerySets for news/category models:
class Category(models.Model):
title = models.CharField(max_length=60)
slug = models.SlugField(unique=True)
class PublishedArticlesManager(models.Manager):
def get_query_set(self):
return super(PublishedArticlesManager, self).get_query_set() \
.filter(published__lte=datetime.datetime.now())
class Article(models.Model):
category = models.ForeignKey(Category)
title = models.CharField(max_length=60)
slug = models.SlugField(unique = True)
story = models.TextField()
author = models.CharField(max_length=60, blank=True)
published = models.DateTimeField(
help_text=_('Set to a date in the future to publish later.'))
created = models.DateTimeField(auto_now_add=True, editable=False)
updated = models.DateTimeField(auto_now=True, editable=False)
live = PublishedArticlesManager()
objects = models.Manager()
Note - I have removed some fields to save on complexity...
There are a few (related) issues with the above.
Firstly, when I query for LIVE objects in my view via Article.live.all() if I refresh the page repeatedly I can see (in MYSQL logs) the same database query being made with exactly the same date in the where clause - ie - the datetime.datetime.now() is being evaluated at compile time rather than runtime. I need the date to be evaluated at runtime.
Secondly, when I use the articles_set method on the Category object this appears to work correctly - the datetime used in the query changes each time the query is run - again I can see this in the logs. However, I am not quite sure why this works, since I don't have anything in my code to say that the articles_set query should return LIVE entries only!?
Finally, why is none of this being cached?
Any ideas how to make the correct time be used consistently? Can someone please explain why the latter setup appears to work?
Thanks
Jay
P.S - database queries below, note the date variations.
SELECT LIVE ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE `news_article`.`published` <= '2011-05-17 21:55:41' ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
SELECT LIVE ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE `news_article`.`published` <= '2011-05-17 21:55:41' ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
CATEGORY SELECT ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE (`news_article`.`published` <= '2011-05-18 21:21:33' AND `news_article`.`category_id` = 1 ) ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
CATEGORY SELECT ARTICLES, query #1:
SELECT `news_article`.`id`, `news_article`.`category_id`, `news_article`.`title`, `news_article`.`slug`, `news_article`.`teaser`, `news_article`.`summary`, `news_article`.`story`, `news_article`.`author`, `news_article`.`published`, `news_article`.`created`, `news_article`.`updated` FROM `news_article` WHERE (`news_article`.`published` <= '2011-05-18 21:26:06' AND `news_article`.`category_id` = 1 ) ORDER BY `news_article`.`published` DESC, `news_article`.`slug` ASC;
You should check out conditional view processing.
def latest_entry(request, article_id):
return Article.objects.latest("updated").updated
#conditional(last_modified_func=latest_entry)
def view_article(request, article_id)
your view code here
This should cache the page rather than reloading a new version every time.
I suspect that if you want the now() to be processed at runtime, you should do use raw sql. I think this will solve the compile/runtime issue.
class PublishedArticlesManager(models.Manager):
def get_query_set(self):
return super(PublishedArticlesManager, self).get_query_set() \
.raw("SELECT * FROM news_article WHERE published <= CURRENT_TIMESTAMP")
Note that this returns a RawQuerySet which may differ a bit from a normal QuerySet
I have now fixed this issue. It appears the problem was that the queryset returned by Article.live.all() was being cached in my urls.py! I was using function-based generic-views:
url(r'^all/$', object_list, {
'queryset' : Article.live.all(),
}, 'news_all'),
I have now changed this to use the class-based approach, as advised in the latest Django documentation:
url(r'^all/$', ListView.as_view(
model=Article,
), name="news_all"),
This now works as expected - by specifying the model attribute rather than the queryset attribute the query is QuerySet is created at compile-time instead of runtime.

How can i get a list of objects from a postgresql view table to display

this is a model of the view table.
class QryDescChar(models.Model):
iid_id = models.IntegerField()
cid_id = models.IntegerField()
cs = models.CharField(max_length=10)
cid = models.IntegerField()
charname = models.CharField(max_length=50)
class Meta:
db_table = u'qry_desc_char'
this is the SQL i use to create the table
CREATE VIEW qry_desc_char as
SELECT
tbl_desc.iid_id,
tbl_desc.cid_id,
tbl_desc.cs,
tbl_char.cid,
tbl_char.charname
FROM tbl_desC,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
i dont know if i need a function in models or views or both. i want to get a list of objects from that database to display it. This might be easy but im new at Django and python so i having some problems
Django 1.1 brought in a new feature that you might find useful. You should be able to do something like:
class QryDescChar(models.Model):
iid_id = models.IntegerField()
cid_id = models.IntegerField()
cs = models.CharField(max_length=10)
cid = models.IntegerField()
charname = models.CharField(max_length=50)
class Meta:
db_table = u'qry_desc_char'
managed = False
The documentation for the managed Meta class option is here. A relevant quote:
If False, no database table creation
or deletion operations will be
performed for this model. This is
useful if the model represents an
existing table or a database view that
has been created by some other means.
This is the only difference when
managed is False. All other aspects of
model handling are exactly the same as
normal.
Once that is done, you should be able to use your model normally. To get a list of objects you'd do something like:
qry_desc_char_list = QryDescChar.objects.all()
To actually get the list into your template you might want to look at generic views, specifically the object_list view.
If your RDBMS lets you create writable views and the view you create has the exact structure than the table Django would create I guess that should work directly.
(This is an old question, but is an area that still trips people up and is still highly relevant to anyone using Django with a pre-existing, normalized schema.)
In your SELECT statement you will need to add a numeric "id" because Django expects one, even on an unmanaged model. You can use the row_number() window function to accomplish this if there isn't a guaranteed unique integer value on the row somewhere (and with views this is often the case).
In this case I'm using an ORDER BY clause with the window function, but you can do anything that's valid, and while you're at it you may as well use a clause that's useful to you in some way. Just make sure you do not try to use Django ORM dot references to relations because they look for the "id" column by default, and yours are fake.
Additionally I would consider renaming my output columns to something more meaningful if you're going to use it within an object. With those changes in place the query would look more like (of course, substitute your own terms for the "AS" clauses):
CREATE VIEW qry_desc_char as
SELECT
row_number() OVER (ORDER BY tbl_char.cid) AS id,
tbl_desc.iid_id AS iid_id,
tbl_desc.cid_id AS cid_id,
tbl_desc.cs AS a_better_name,
tbl_char.cid AS something_descriptive,
tbl_char.charname AS name
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
Once that is done, in Django your model could look like this:
class QryDescChar(models.Model):
iid_id = models.ForeignKey('WhateverIidIs', related_name='+',
db_column='iid_id', on_delete=models.DO_NOTHING)
cid_id = models.ForeignKey('WhateverCidIs', related_name='+',
db_column='cid_id', on_delete=models.DO_NOTHING)
a_better_name = models.CharField(max_length=10)
something_descriptive = models.IntegerField()
name = models.CharField(max_length=50)
class Meta:
managed = False
db_table = 'qry_desc_char'
You don't need the "_id" part on the end of the id column names, because you can declare the column name on the Django model with something more descriptive using the "db_column" argument as I did above (but here I only it to prevent Django from adding another "_id" to the end of cid_id and iid_id -- which added zero semantic value to your code). Also, note the "on_delete" argument. Django does its own thing when it comes to cascading deletes, and on an interesting data model you don't want this -- and when it comes to views you'll just get an error and an aborted transaction. Prior to Django 1.5 you have to patch it to make DO_NOTHING actually mean "do nothing" -- otherwise it will still try to (needlessly) query and collect all related objects before going through its delete cycle, and the query will fail, halting the entire operation.
Incidentally, I wrote an in-depth explanation of how to do this just the other day.
You are trying to fetch records from a view. This is not correct as a view does not map to a model, a table maps to a model.
You should use Django ORM to fetch QryDescChar objects. Please note that Django ORM will fetch them directly from the table. You can consult Django docs for extra() and select_related() methods which will allow you to fetch related data (data you want to get from the other table) in different ways.