Annotate over Multi-table Inheritance in Django - django

I have a base LoggedEvent model and a number of subclass models, as follows:
class LoggedEvent(models.Model):
    user = models.ForeignKey(User, blank=True, null=True)
    timestamp = models.DateTimeField(auto_now_add=True)
class AuthEvent(LoggedEvent):
    good = models.BooleanField()
    username = models.CharField(max_length=12)
class LDAPSearchEvent(LoggedEvent):
    type = models.CharField(max_length=12)
    query = models.CharField(max_length=24)
class PRISearchEvent(LoggedEvent):
    type = models.CharField(max_length=12)
    query = models.CharField(max_length=24)
Users generate these events as they do the related actions. I am attempting to generate a usage-report of how many of each event-type each user has caused in the last month. I am struggling with Django's ORM and while I am close I am running into a problem. Here is the query code:
def usage(request):
    # Calculate date range
    today = datetime.date.today()
    month_start = datetime.date(year=today.year, month=today.month - 1, day=1)
    month_end = datetime.date(year=today.year, month=today.month, day=1) - datetime.timedelta(days=1)
    # Search for how many LDAP events were generated per user, last month
    baseusage = User.objects.filter(loggedevent__timestamp__gte=month_start, loggedevent__timestamp__lte=month_end)
    ldapusage = baseusage.exclude(loggedevent__ldapsearchevent__id__lt=1).annotate(count=Count('loggedevent__pk'))
    authusage = baseusage.exclude(loggedevent__authevent__id__lt=1).annotate(count=Count('loggedevent__pk'))
    return render_to_response('usage.html', {
        'ldapusage': ldapusage,
        'authusage': authusage,
    }, context_instance=RequestContext(request))
Both ldapusage and authusage are lists of users, each user annotated with a .count attribute which is supposed to represent how many events of that particular type the user generated. However, in both lists the .count attributes have the same value. In fact, the annotated 'count' is equal to how many events that user generated, regardless of type. So it would seem that my specific
authusage = baseusage.exclude(loggedevent__authevent__id__lt=1)
isn't excluding by subclass. I have tried id__lt=1, id__isnull=True, and others. Halp.

The key to Django model inheritance is remembering that with a non-abstract base class everything is really an instance of the base class which might happen to have some extra data strapped on the side from a separate table. This means that when you do searches on the base table you get back instances of the base class and there's no way to tell which subclass it is without doing repeated database queries on the subclass tables to see if they contain a record with a matching key ("I have an event. Does it have a record in AuthEvent? No. What about LDAP Event?…"). Among other things this means that you can't easily filter on them in normal queries on the base class without doing a join on every subclass table.
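To make the "extra data strapped on the side" picture concrete, here is a rough sketch using Python's built-in sqlite3 module rather than Django itself. The table and column names are simplified stand-ins (Django would add an app prefix to the table names; the loggedevent_ptr_id column mirrors the parent link Django creates on a subclass table), and the rows are made up. Counting on the base table alone gives the same total for every report, while joining to one subclass table gives the per-type count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: one row per event, whatever its subclass.
    CREATE TABLE loggedevent (id INTEGER PRIMARY KEY, user_id INTEGER);
    -- Subclass tables: only the extra columns, keyed by the base row's id.
    CREATE TABLE authevent (
        loggedevent_ptr_id INTEGER PRIMARY KEY REFERENCES loggedevent(id),
        good INTEGER
    );
    CREATE TABLE ldapsearchevent (
        loggedevent_ptr_id INTEGER PRIMARY KEY REFERENCES loggedevent(id),
        query TEXT
    );
    INSERT INTO loggedevent (id, user_id) VALUES (1, 10), (2, 10), (3, 10), (4, 20);
    INSERT INTO authevent VALUES (1, 1), (2, 0);
    INSERT INTO ldapsearchevent VALUES (3, 'cn=x'), (4, 'cn=y');
""")


def counts_per_user(subclass_table):
    # The table name is interpolated only because it comes from the fixed
    # calls below, never from user input.
    sql = f"""
        SELECT e.user_id, COUNT(*)
        FROM loggedevent e
        JOIN {subclass_table} s ON s.loggedevent_ptr_id = e.id
        GROUP BY e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(sql).fetchall()


# Base table only: every event counts, regardless of type.
all_counts = conn.execute(
    "SELECT user_id, COUNT(*) FROM loggedevent GROUP BY user_id ORDER BY user_id"
).fetchall()
auth_counts = counts_per_user("authevent")
ldap_counts = counts_per_user("ldapsearchevent")

print(all_counts)   # [(10, 3), (20, 1)]
print(auth_counts)  # [(10, 2)]
print(ldap_counts)  # [(10, 1), (20, 1)]
```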
You have a couple of choices. One would simply be to do your queries on the subclass and tally the results (ldap_event_count = LDAPSearchEvent.objects.filter(user=foo).count(), …), which might be sufficient for a single report. I usually recommend adding a content type field to the base class so you can efficiently tell which particular subclass an instance is without having to do another query:
content_type = models.ForeignKey("contenttypes.ContentType")
That allows two major improvements. The most common one is that you can deal with many events generically without having to hit the subclass-specific accessors (e.g. event.authevent or event.ldapsearchevent) and handle DoesNotExist. In this case it would also make it trivial to rewrite your query, since you could just do something like LoggedEvent.objects.values("content_type").annotate(Count("pk")) to get the report values, which becomes particularly handy if your logic gets more complicated ("Event is Auth or LDAP and …").
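A separate detail in the question's usage() view: month=today.month - 1 evaluates to 0 in January, which datetime.date rejects with a ValueError. Below is a minimal stdlib sketch of one way to get the previous calendar month. The helper name previous_month_bounds is made up for illustration, and it returns a half-open range on the assumption that the filter would then use timestamp__gte and timestamp__lt, so that events on the last day of the month are not cut off at midnight:

```python
import datetime


def previous_month_bounds(today):
    """Return (start, end_exclusive) for the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day always lands in the previous month, even in January.
    last_of_previous = first_of_this_month - datetime.timedelta(days=1)
    return last_of_previous.replace(day=1), first_of_this_month


print(previous_month_bounds(datetime.date(2024, 1, 15)))
# (datetime.date(2023, 12, 1), datetime.date(2024, 1, 1))
```

With these bounds the filter would read something like loggedevent__timestamp__gte=start, loggedevent__timestamp__lt=end_exclusive.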

Related

retrieve distinct values of manytomanyfield of another manytomanyfield

I have a simple question but multiple google searches left me without a nice solution. Currently I am doing the following:
allowed_categories = self.allowed_view.all().difference(self.not_allowed_view.all())
users = []
for cat in allowed_categories:
    for member in cat.members.all():
        users.append(member)
return users
I have a ManyToManyField to Objects that also have a ManyToManyField for instances of Users. In the code above, I am trying to get all the users from all those categories and get a list of all Users.
Later I would like the same in a method allowed_to_view(self, user_instance) but that's for later.
How would I achieve this using Django ORM without using nested for-loops?
[edit]
My models are as follows:
class RestrictedView(models.Model):
    allowed_view = models.ManyToManyField(Category)
    not_allowed_view = models.ManyToManyField(Category)
class Category(models.Model):
    name = models.CharField(max_length=30)
    members = models.ManyToManyField(User)
So, I've made the following one-liner with only one query towards the database. It took me some time...
users = User.objects.filter(pk__in=self.allowed_view.all().values("members").difference(self.not_allowed_view.all().values("members")))
This gives me a nice queryset with only the users that are in the allowed_view and explicitly not in the not_allowed_view.
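Note that the nested loop at the top of the question and the one-liner do not compute quite the same thing: the loop subtracts categories first and then collects members, while the one-liner collects members on both sides and then subtracts users. A plain-Python sketch with made-up category and user names (sets standing in for the querysets) shows where they differ:

```python
# Hypothetical data standing in for Category.members: category -> user names.
members = {
    "staff": {"alice", "bob"},
    "editors": {"bob", "carol"},
    "banned": {"bob"},
}
allowed_view = {"staff", "editors", "banned"}
not_allowed_view = {"banned"}


def members_of(categories):
    """Union of the members of every category in `categories`."""
    users = set()
    for name in categories:
        users |= members[name]
    return users


# Loop version: remove disallowed categories, then collect their members.
by_category = members_of(allowed_view - not_allowed_view)
# One-liner version: collect members on both sides, then remove users.
by_user = members_of(allowed_view) - members_of(not_allowed_view)

print(sorted(by_category))  # ['alice', 'bob', 'carol']
print(sorted(by_user))      # ['alice', 'carol']
```

Here bob is a member of a disallowed category, so only the second form excludes him, which matches the "explicitly not in the not_allowed_view" behaviour described above.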
Without seeing your database structure / models.py file it's hard to say, but you can query the member objects directly, like so:
members_queryset = Member.objects.filter(
    category=<allowed categories>,
    ...
)
users += list(members_queryset)

Query optimisation for FK association on inherited model from the base model

I have users who create (or receive) transactions. The transaction hierarchy I have is a multi-table inheritance, with Transaction as the base model containing the common fields between all transaction types, such as User (FK), amount, etc. I have several transaction types, which extend the Transaction model with type specific data.
For the sake of this example, a simplified structure illustrating my problem can be found below.
from model_utils.managers import InheritanceManager
class User(models.Model):
    pass
class Transaction(models.Model):
    DEPOSIT = 'deposit'
    WITHDRAWAL = 'withdrawal'
    TRANSFER = 'transfer'
    TYPES = (
        (DEPOSIT, DEPOSIT),
        (WITHDRAWAL, WITHDRAWAL),
        (TRANSFER, TRANSFER),
    )
    type = models.CharField(max_length=24, choices=TYPES)
    user = models.ForeignKey(User)
    amount = models.PositiveIntegerField()
    objects = InheritanceManager()
    class Meta:
        indexes = [
            models.Index(fields=['user']),
            models.Index(fields=['type'])
        ]
class Withdrawal(Transaction):
    TYPE = Transaction.WITHDRAWAL
    bank_account = models.ForeignKey(BankAccount)
class Deposit(Transaction):
    TYPE = Transaction.DEPOSIT
    card = models.ForeignKey(Card)
class Transfer(Transaction):
    TYPE = Transaction.TRANSFER
    recipient = models.ForeignKey(User)
    class Meta:
        indexes = [
            models.Index(fields=['recipient'])
        ]
I then set each transaction's type in the inherited model's .save() method. This is all fine and well.
The problem comes in when I would like to fetch a user's transactions. Specifically, I require the sub-model instances (deposits, transfers and withdrawals), rather than the base model (transactions). I also require both transactions that the user created themselves AND transfers they have received. For the former I use django-model-utils's fantastic InheritanceManager, which works great. Except that when I include the filtering on the transfer submodel's recipient FK field, the DB query time increases by an order of magnitude.
As illustrated above I have placed indexes on the Transaction user column and the Transfer recipient column. But it appeared to me that what I may need is an index on the Transaction subtype, if that is at all possible. I have attempted to achieve this effect by putting an index on the Transaction type field and including it in the query, as you will see below, but this appears to have no effect. Furthermore, I use .select_related() for the user objects since they are required in the serializations.
The query is structured as such:
from django.db.models import Q
queryset = Transaction.objects.select_related(
    'user',
    'transfer__recipient'
).select_subclasses().filter(
    Q(user=request.user) |
    Q(type=Transaction.TRANSFER, transfer__recipient=request.user)
).order_by('-id')
So my question is, why is there an order of magnitude difference on the DB query when including the Transfer.recipient in the query? Have I missed something? Am I doing something silly? Or is there a way I can optimise this further?
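One thing that may be worth testing (a guess, not a diagnosis of this particular database): an OR whose two sides refer to columns in two different joined tables is often hard for a query planner to serve from either index, whereas the same rows can be fetched as two separately indexed lookups and merged. The sketch below uses Python's built-in sqlite3 module with invented table names (txn, transfer) and invented rows, and only shows that the two formulations return the same ids; whether the second is actually faster would need to be checked with EXPLAIN on the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (id INTEGER PRIMARY KEY, type TEXT, user_id INTEGER);
    CREATE TABLE transfer (
        txn_ptr_id INTEGER PRIMARY KEY REFERENCES txn(id),
        recipient_id INTEGER
    );
    CREATE INDEX txn_user ON txn (user_id);
    CREATE INDEX transfer_recipient ON transfer (recipient_id);
    INSERT INTO txn VALUES
        (1, 'deposit', 1), (2, 'transfer', 1), (3, 'transfer', 2),
        (4, 'withdrawal', 2), (5, 'transfer', 3);
    INSERT INTO transfer VALUES (2, 2), (3, 1), (5, 2);
""")


def ids_with_or(user_id):
    # Same shape as the question's filter: one join, OR across both tables.
    sql = """
        SELECT t.id FROM txn t
        LEFT JOIN transfer tr ON tr.txn_ptr_id = t.id
        WHERE t.user_id = ? OR tr.recipient_id = ?
        ORDER BY t.id DESC
    """
    return [row[0] for row in conn.execute(sql, (user_id, user_id))]


def ids_with_union(user_id):
    # Two single-table lookups, each able to use its own index, then merged.
    sql = """
        SELECT id FROM txn WHERE user_id = ?
        UNION
        SELECT txn_ptr_id FROM transfer WHERE recipient_id = ?
        ORDER BY 1 DESC
    """
    return [row[0] for row in conn.execute(sql, (user_id, user_id))]


print(ids_with_or(1))     # [3, 2, 1]
print(ids_with_union(1))  # [3, 2, 1]
```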

related field nested lookup error

I have the following models:
class Profile(models.Model):
    user = models.ForeignKey(User)  # User can have more than one profile
class Order(models.Model):
    ship_to = models.ForeignKey(Profile)
class Shipping(models.Model):
    order = models.ForeignKey(Order)  # one order can have more than one shipping
    shipping_company = models.ForeignKey(Shipping_company)
class Shipping_company(models.Model):
    name = ...
So now I have the following structure:
User > Receiver > Order > Shipping > Shipping_company
The question is: how can I get all User models who ordered with a specific shipping company?
If I make a query like this
User.objects.filter(receiver__order__shipping__shipping_company__pk=1)
I get
FieldError: Relation fields do not support nested lookups
If I do something like this
sh_comp = items.objects.get(pk=1) # __unicode__ returns "FedEx"
User.objects.filter(receiver__order__shipping__shipping_company=sh_comp)
the result is
ValueError: Cannot query "FedEx": Must be "Receiver" instance.
This seemed to be a simple and trivial task, but I can't make it work.
One approach is the following (I am only considering the four models you have presented in your question).
Shipping has a foreign key to Shipping_company, so you can add a model method on the Shipping_company model.
Take a look at this model method:
class Shipping_company(models.Model):
    fields...
    def get_profiles(self):
        shippings = Shipping.objects.filter(shipping_company=self)
        users = list(set([x.order.ship_to for x in shippings]))
        return users
Explanation:
shippings = Shipping.objects.filter(shipping_company=self)
will return all the shippings for one Shipping company(FedEx in your case). Further loop through the shippings to get ship_to from order field.
PS: You can take it as reference and design your own solution.
Walkthrough:
Let's say there is a shipping company 'FedEx'. So we do,
fedex = Shipping_company.objects.get(name='FedEx')
Now, when you call get_profiles on fedex, like
fedex.get_profiles()
this is what happens:
Inside get_profiles(), self refers to the fedex instance.
Using self (fedex), we filter the shippings down to those handled by fedex.
Then we loop through those shippings to get the order for each shipping; each of those orders has a ship_to (profile) foreign key.
I guess you are getting confused because of the return statement.
Written out in full, the whole function looks something like this:
def get_profiles(self):
    users = list()
    shippings = Shipping.objects.filter(shipping_company=self)
    for shipping in shippings:
        order = shipping.order
        # Now you have an order per shipping, so you do
        if order.ship_to not in users:
            users.append(order.ship_to)
    return users
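Note that get_profiles() returns profiles, one query per shipping, whereas the question asks for users. As a rough illustration of what a single lookup spanning all four relations would have to do in the database, here is a sketch with Python's built-in sqlite3 module. The table names and rows are invented, and the ORM lookup path that would produce such a join depends on the real model and relation names, so treat this as the shape of the query rather than the exact one Django would run:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profile (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE customer_order (id INTEGER PRIMARY KEY, ship_to_id INTEGER);
    CREATE TABLE shipping_company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shipping (
        id INTEGER PRIMARY KEY, order_id INTEGER, shipping_company_id INTEGER
    );
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'ben'), (3, 'cy');
    -- ann has two profiles, so she can appear twice before DISTINCT.
    INSERT INTO profile VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    INSERT INTO customer_order VALUES (1, 1), (2, 2), (3, 3), (4, 4);
    INSERT INTO shipping_company VALUES (1, 'FedEx'), (2, 'UPS');
    INSERT INTO shipping VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2), (4, 4, 1), (5, 4, 2);
""")


def users_for_company(company_id):
    # Walk user -> profile -> order -> shipping and keep each user once.
    sql = """
        SELECT DISTINCT u.name
        FROM app_user u
        JOIN profile p ON p.user_id = u.id
        JOIN customer_order o ON o.ship_to_id = p.id
        JOIN shipping s ON s.order_id = o.id
        WHERE s.shipping_company_id = ?
        ORDER BY u.name
    """
    return [row[0] for row in conn.execute(sql, (company_id,))]


print(users_for_company(1))  # ['ann', 'cy']
print(users_for_company(2))  # ['ben', 'cy']
```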

Many to Many Exclude on Multiple Objects

I have the following models:
class Deal(models.Model):
    date = models.DateTimeField(auto_now_add=True)
    retailer = models.ForeignKey(Retailer, related_name='deals')
    description = models.CharField(max_length=255)
    ...etc
class CustomerProfile(models.Model):
    saved_deals = models.ManyToManyField(Deal, related_name='saved_by_customers', null=True, blank=True)
    dismissed_deals = models.ManyToManyField(Deal, related_name='dismissed_by_customers', null=True, blank=True)
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
I'm having trouble wrapping my head around the many-to-many relationship and am having no luck figuring out how to do this query. I'm assuming I should use an exclude on Deal.objects() but all the examples I see for exclude are excluding one item, not what amounts to multiple items.
When I naively tried just:
deals = Deal.objects.exclude(customer.saved_deals).all()
I get the error: "'ManyRelatedManager' object is not iterable"
If I say:
deals = Deal.objects.exclude(customer.saved_deals.all()).all()
I get "Too many values to unpack" (though I feel I should note there are only 5 deals and 2 customers in the database right now)
We (our client) presumes that he/she will have thousands of customers and tens of thousands of deals in the future, so I'd like to stay performance oriented as best I can. If this setup is incorrect, I'd love to know a better way.
Also, I am running django 1.5 as this is deployed on App Engine (using CloudSQL)
Where am I going wrong?
I suggest you use customer.saved_deals to get the list of deal ids to exclude (use values_list to quickly convert to a flat list).
This should save you from excluding by a field in a joined table.
deals = Deal.objects.exclude(id__in=customer.saved_deals.values_list('id', flat=True))
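The answer above follows the question's attempts in using saved_deals; since the stated goal is to hide dismissed deals, the same pattern would presumably be applied to customer.dismissed_deals instead. As far as I know, an id__in lookup given a values_list queryset is sent to the database as a nested SELECT, so the exclusion is roughly a NOT IN subquery against the many-to-many join table. A sketch of that shape with Python's built-in sqlite3 module (the join table name and all rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deal (id INTEGER PRIMARY KEY, description TEXT);
    -- Hypothetical join table behind a ManyToManyField such as dismissed_deals.
    CREATE TABLE customerprofile_dismissed_deals (
        customerprofile_id INTEGER,
        deal_id INTEGER REFERENCES deal(id)
    );
    INSERT INTO deal VALUES
        (1, 'shoes'), (2, 'hats'), (3, 'socks'), (4, 'belts'), (5, 'ties');
    INSERT INTO customerprofile_dismissed_deals VALUES (1, 2), (1, 5), (2, 1);
""")


def visible_deal_ids(customer_id):
    # All deals except the ones this customer has dismissed.
    sql = """
        SELECT id FROM deal
        WHERE id NOT IN (
            SELECT deal_id FROM customerprofile_dismissed_deals
            WHERE customerprofile_id = ?
        )
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (customer_id,))]


print(visible_deal_ids(1))  # [1, 3, 4]
print(visible_deal_ids(2))  # [2, 3, 4, 5]
```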
You'd want to change this:
deals = Deal.objects.exclude(customer.saved_deals).all()
To something like this:
deals = Deal.objects.exclude(customer__id__in=[1,2,etc..]).all()
Basically, customer is the many-to-many foreign key, so you can't use it directly with an exclude.
saved_deals and dismissed_deals are two fields describing almost the same thing. There is also a risk that too many columns will be used in the database if both fields are allowed to store NULL values. It is worth considering removing dismissed_deals altogether and using only a saved flag that is True or False.
Another thing to think about is moving saved_deals out of the CustomerProfile class and into the Deal class. Saved deals are about deals, so the field arguably belongs on Deal.
class Deal(models.Model):
    saved = models.BooleanField()
    ...
A real deal would be made by one customer / buyer rather than several. A real customer can have millions of deals, so relating each deal to a customer would be a good approach.
class Deal(models.Model):
    saved = models.BooleanField()
    customer = models.ForeignKey(CustomerProfile)
    ....
What I want to do is retrieve deals for a customer, but I don't want to include deals that they have dismissed.
deals_for_customer = Deal.objects.all().filter(customer__name="John")
The double underscore between customer and name (customer__name) lets you filter across the relation: customer is the foreign key to the CustomerProfile model, and name is a field on that model (assuming the CustomerProfile class has a name attribute).
deals_saved = deals_for_customer.filter(saved=True)
That's it. I hope I could help. Let me know if not.

How can i get a list of objects from a postgresql view table to display

This is the model for the view.
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)
    class Meta:
        db_table = u'qry_desc_char'
This is the SQL I use to create the view:
CREATE VIEW qry_desc_char as
SELECT
    tbl_desc.iid_id,
    tbl_desc.cid_id,
    tbl_desc.cs,
    tbl_char.cid,
    tbl_char.charname
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
I don't know if I need a function in models or views or both. I want to get a list of objects from that database to display it. This might be easy, but I'm new to Django and Python so I'm having some problems.
Django 1.1 brought in a new feature that you might find useful. You should be able to do something like:
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)
    class Meta:
        db_table = u'qry_desc_char'
        managed = False
The managed option is described in the Django documentation for model Meta options. A relevant quote:
If False, no database table creation or deletion operations will be performed for this model. This is useful if the model represents an existing table or a database view that has been created by some other means. This is the only difference when managed is False. All other aspects of model handling are exactly the same as normal.
Once that is done, you should be able to use your model normally. To get a list of objects you'd do something like:
qry_desc_char_list = QryDescChar.objects.all()
To actually get the list into your template you might want to look at generic views, specifically the object_list view.
If your RDBMS lets you create writable views and the view you create has exactly the same structure as the table Django would create, I guess that should work directly.
(This is an old question, but is an area that still trips people up and is still highly relevant to anyone using Django with a pre-existing, normalized schema.)
In your SELECT statement you will need to add a numeric "id" because Django expects one, even on an unmanaged model. You can use the row_number() window function to accomplish this if there isn't a guaranteed unique integer value on the row somewhere (and with views this is often the case).
In this case I'm using an ORDER BY clause with the window function, but you can do anything that's valid, and while you're at it you may as well use a clause that's useful to you in some way. Just make sure you do not try to use Django ORM dot references to relations because they look for the "id" column by default, and yours are fake.
Additionally I would consider renaming my output columns to something more meaningful if you're going to use it within an object. With those changes in place the query would look more like (of course, substitute your own terms for the "AS" clauses):
CREATE VIEW qry_desc_char as
SELECT
    row_number() OVER (ORDER BY tbl_char.cid) AS id,
    tbl_desc.iid_id AS iid_id,
    tbl_desc.cid_id AS cid_id,
    tbl_desc.cs AS a_better_name,
    tbl_char.cid AS something_descriptive,
    tbl_char.charname AS name
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
Once that is done, in Django your model could look like this:
class QryDescChar(models.Model):
    iid_id = models.ForeignKey('WhateverIidIs', related_name='+',
                               db_column='iid_id', on_delete=models.DO_NOTHING)
    cid_id = models.ForeignKey('WhateverCidIs', related_name='+',
                               db_column='cid_id', on_delete=models.DO_NOTHING)
    a_better_name = models.CharField(max_length=10)
    something_descriptive = models.IntegerField()
    name = models.CharField(max_length=50)
    class Meta:
        managed = False
        db_table = 'qry_desc_char'
You don't need the "_id" part on the end of the id column names, because you can declare the column name on the Django model with the "db_column" argument and give the field something more descriptive, as I did above (here I only use it to prevent Django from adding another "_id" to the end of cid_id and iid_id, which would add zero semantic value to your code). Also, note the "on_delete" argument. Django does its own thing when it comes to cascading deletes, and on an interesting data model you don't want this; when it comes to views you'll just get an error and an aborted transaction. Prior to Django 1.5 you have to patch it to make DO_NOTHING actually mean "do nothing"; otherwise it will still try to (needlessly) query and collect all related objects before going through its delete cycle, and the query will fail, halting the entire operation.
Incidentally, I wrote an in-depth explanation of how to do this just the other day.
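For anyone who wants to try the row_number() idea from this answer without a PostgreSQL server, here is a small sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added; the rows are invented, and the view orders by (iid_id, cid) purely so that the generated ids are deterministic for this sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_char (cid INTEGER PRIMARY KEY, charname TEXT);
    CREATE TABLE tbl_desc (iid_id INTEGER, cid_id INTEGER, cs TEXT);
    INSERT INTO tbl_char VALUES (1, 'colour'), (2, 'height');
    INSERT INTO tbl_desc VALUES (100, 1, 'red'), (100, 2, 'tall'), (101, 1, 'blue');

    -- The view synthesises an integer id so that each row has a unique key.
    CREATE VIEW qry_desc_char AS
    SELECT
        row_number() OVER (ORDER BY tbl_desc.iid_id, tbl_char.cid) AS id,
        tbl_desc.iid_id AS iid_id,
        tbl_desc.cid_id AS cid_id,
        tbl_desc.cs AS cs,
        tbl_char.charname AS charname
    FROM tbl_desc, tbl_char
    WHERE tbl_desc.cid_id = tbl_char.cid;
""")

rows = conn.execute(
    "SELECT id, iid_id, cid_id, cs, charname FROM qry_desc_char ORDER BY id"
).fetchall()
for row in rows:
    print(row)
# (1, 100, 1, 'red', 'colour')
# (2, 100, 2, 'tall', 'height')
# (3, 101, 1, 'blue', 'colour')
```

One caveat worth keeping in mind: ids generated this way are positions in the current result, so they can change when the underlying rows change and should not be stored or used as stable references.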
You are trying to fetch records from a view. This is not correct as a view does not map to a model, a table maps to a model.
You should use Django ORM to fetch QryDescChar objects. Please note that Django ORM will fetch them directly from the table. You can consult Django docs for extra() and select_related() methods which will allow you to fetch related data (data you want to get from the other table) in different ways.