Django select query with where clause - django

Sheet_Table
id ref_id name data
1 10 A 9078
2 10 AAA 6789
3 12 C 345
Sheet Model have multiple Columns id,ref_id,name,data
Now i want to write this query in django
select data from Sheet_Table where ref_id=10
Here Model/Table name is Sheet_Table

It's pretty explicitly stated in the django doc on queries that filter(foo=bar) evaluates to a WHERE clause. In your specific case, try this to get a list of just the data elements (if your model is actually called Sheet_Table?):
Sheet_Table.objects.filter(ref_id=10).values_list('data', flat=True)
or you can leave off the values_list part if you want to iterate over the model objects (e.g., if you want to examine id as well as data).

Related

Django Effecient way to Perform Query on M2M

class A(models.Model)
results = models.TextField()
class B(models.Model)
name = models.CharField(max_length=20)
res = models.ManyToManyField(A)
Let's suppose we have above 2 models. A model has millions of objects.
I would like to know what would be the best efficient/fastest way to get all the results objects of a particular B object.
Let's suppose we have to retrieve all results for object number 5 of B
Option 1 : A.objects.filter(b__id=5)
(OR)
Option 2 : B.objects.get(id=5).res.all()
Option 1: My Question is filtering by id on A model objects would take lot of time? since there are millions of A model objects.
Option 2: Question: does res field on B model stores the id value of A model objects?
The reason why I'm assuming the option 2 would be a faster way since it stores the reference of A model objects & directly getting those object values first and making the second query to fetch the results. whereas in the first option filtering by id or any other field would take up a lot of time
The first expression will result in one database query. Indeed, it will query with:
SELECT a.*
FROM a
INNER JOIN a_b ON a_b.a_id = a.id
WHERE a_b.b_id = 5
The second expression will result in two queries. Indeed, first Django will query to fetch that specific B object with a query like:
SELECT b.*
FROM b
WHERE b.id = 5
then it will make exactly the same query to retrieve the related A objects.
But retrieving the A object is here not necessary (unless you of course need it somewhere else). You thus make a useless database query.
My Question is filtering by id on A model objects would take lot of time? since there are millions of A model objects.
A database normally stores an index on foreign key fields. This thus means that it will filter effectively. The total number of A objects is usually not (that) relevant (since it uses a datastructure to accelerate search like a B-tree [wiki]). The wiki page has a section named An index speeds the search that explains how this works.

How do I use django's Q with django taggit?

I have a Result object that is tagged with "one" and "two". When I try to query for objects tagged "one" and "two", I get nothing back:
q = Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
print len(q)
# prints zero, was expecting 1
Why does it not work with Q? How can I make it work?
The way django-taggit implements tagging is essentially through a ManytoMany relationship. In such cases there is a separate table in the database that holds these relations. It is usually called a "through" or intermediate model as it connects the two models. In the case of django-taggit this is called TaggedItem. So you have the Result model which is your model and you have two models Tag and TaggedItem provided by django-taggit.
When you make a query such as Result.objects.filter(Q(tags__name="one")) it translates to looking up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has the name="one".
Trying to match for two tag names would translate to looking up up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has both name="one" AND name="two". You obviously never have that as you only have one value in a row, it's either "one" or "two".
These details are hidden away from you in the django-taggit implementation, but this is what happens whenever you have a ManytoMany relationship between objects.
To resolve this you can:
Option 1
Query tag after tag evaluating the results each time, as it is suggested in the answers from others. This might be okay for two tags, but will not be good when you need to look for objects that have 10 tags set on them. Here would be one way to do this that would result in two queries and get you the result:
# get the IDs of the Result objects tagged with "one"
query_1 = Result.objects.filter(tags__name="one").values('id')
# use this in a second query to filter the ID and look for the second tag.
results = Result.objects.filter(pk__in=query_1, tags__name="two")
You could achieve this with a single query so you only have one trip from the app to the database, which would look like this:
# create django subquery - this is not evaluated, but used to construct the final query
subquery = Result.objects.filter(pk=OuterRef('pk'), tags__name="one").values('id')
# perform a combined query using a subquery against the database
results = Result.objects.filter(Exists(subquery), tags__name="two")
This would only make one trip to the database. (Note: filtering on sub-queries requires django 3.0).
But you are still limited to two tags. If you need to check for 10 tags or more, the above is not really workable...
Option 2
Query the relationship table instead directly and aggregate the results in a way that give you the object IDs.
# django-taggit uses Content Types so we need to pick up the content type from cache
result_content_type = ContentType.objects.get_for_model(Result)
tag_names = ["one", "two"]
tagged_results = (
TaggedItem.objects.filter(tag__name__in=tag_names, content_type=result_content_type)
.values('object_id')
.annotate(occurence=Count('object_id'))
.filter(occurence=len(tag_names))
.values_list('object_id', flat=True)
)
TaggedItem is the hidden table in the django-taggit implementation that contains the relationships. The above will query that table and aggregate all the rows that refer either to the "one" or "two" tags, group the results by the ID of the objects and then pick those where the object ID had the number of tags you are looking for.
This is a single query and at the end gets you the IDs of all the objects that have been tagged with both tags. It is also the exact same query regardless if you need 2 tags or 200.
Please review this and let me know if anything needs clarification.
first of all, this three are same:
Result.objects.filter(tags__name="one", tags__name="two")
Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
Result.objects.filter(tags__name_in=["one"]).filter(tags__name_in=["two"])
i think the name field is CharField and no record could be equal to "one" and "two" at same time.
in python code the query looks like this(always false, and why you are geting no result):
from random import choice
name = choice(["abtin", "shino"])
if name == "abtin" and name == "shino":
we use Q object for implement OR or complex queries
Into the example that works you do an end on two python objects (query sets). That gets applied to any record not necessarily to the same record that has one AND two as tag.
ps: Why do you use the in filter ?
q = Result.objects.filter(tags_name_in=["one"]).filter(tags_name_in=["two"])
add .distinct() to remove duplicates if expecting more than one unique object

How to group by AND aggregate with Django

I have a fairly simple query I'd like to make via the ORM, but can't figure that out..
I have three models:
Location (a place), Attribute (an attribute a place might have), and Rating (a M2M 'through' model that also contains a score field)
I want to pick some important attributes and be able to rank my locations by those attributes - i.e. higher total score over all selected attributes = better.
I can use the following SQL to get what I want:
select location_id, sum(score)
from locations_rating
where attribute_id in (1,2,3)
group by location_id order by sum desc;
which returns
location_id | sum
-------------+-----
21 | 12
3 | 11
The closest I can get with the ORM is:
Rating.objects.filter(
attribute__in=attributes).annotate(
acount=Count('location')).aggregate(Sum('score'))
Which returns
{'score__sum': 23}
i.e. the sum of all, not grouped by location.
Any way around this? I could execute the SQL manually, but would rather go via the ORM to keep things consistent.
Thanks
Try this:
Rating.objects.filter(attribute__in=attributes) \
.values('location') \
.annotate(score = Sum('score')) \
.order_by('-score')
Can you try this.
Rating.objects.values('location_id').filter(attribute__in=attributes).annotate(sum_score=Sum('score')).order_by('-score')

Django Model QuerySet Group By Specific Fields

Considering this model & data:
Ad_id + Date = primary key
Ad_id date clicks
------------------------------
3 8/10/12 124
3 7/10/12 433
3 6/10/12 99
4 8/10/12 23
4 7/10/12 80
I'm trying to group by ad_id to return the sum of the over all clicks.
in sql terms:
select Ad_id, date, sum(clicks) from ads group by Ad_id
The problem is the Django automatically do the group by for each field in the model, so the group by is not really working (because each row is unique).
Solutions I've already checked:
I know it is possible to do something like this:
Ad.objects.values('ad_id').annotate(clicks_sum=Sum('clicks'))
But it is not good as it doesn't return the Ad Model, but a dictionary.
I can't use also raw SQL because it is not chain-able
Also I tried to set
MyQuerySet.group_by = ['ad_id']
Not working too..
So I really need to group by only by the fields I need, and that the result will be an Ad Model.
You can perform raw SQL queries using Manager.raw(), in your case that would be:
Ad.objects.raw('select Ad_id, date, sum(clicks) from ads group by Ad_id')
This method method takes a raw SQL query, executes it, and returns a RawQuerySet instance. This RawQuerySet instance can be iterated over just like an normal QuerySet to provide object instances.

Grouping Custom Attributes in a Query

I have an application that allows for "contacts" to be made completely customized. My method of doing that is letting the administrator setup all of the fields allowed for the contact. My database is as follows:
Contacts
id
active
lastactive
created_on
Fields
id
label
FieldValues
id
fieldid
contactid
response
So the contact table only tells whether they are active and their identifier; the fields tables only holds the label of the field and identifier, and the fieldvalues table is what actually holds the data for contacts (name, address, etc.)
So this setup has worked just fine for me up until now. The client would like to be able to pull a cumulative report, but say state of all the contacts in a certain city. Effectively the data would have to look like the following
California (from fields table)
Costa Mesa - (from fields table) 5 - (counted in fieldvalues table)
Newport 2
Connecticut
Wallingford 2
Clinton 2
Berlin 5
The state field might be id 6 and the city field might be id 4. I don't know if I have just been looking at this code way to long to figure it out or what,
The SQL to create those three tables can be found at https://s3.amazonaws.com/davejlong/Contact.sql
You've got an Entity Attribute Value (EAV) model. Use the field and fieldvalue tables for searching only - the WHERE caluse. Then make life easier by keeping the full entity's data in a CLOB off the main table (e.g. Contacts.data) in a serialized format (WDDX is good for this). Read the data column out, deserialize, and work with on the server side. This is much easier than the myriad of joins you'd need to do otherwise to reproduce the fully hydrated entity from an EAV setup.