Convert Boolean expression to django queryset - django

I have tagging in place on my django site, and I'd like to allow urls of the form:
http://example.com/search/tags/(foo+dog)|(goat+cat)
Which in English would mean:
Find the items tagged with (foo AND dog) OR (goat AND cat).
So essentially what I need is a way to pare this down into queries using the Django API. At present, I just want to support AND, OR and parentheses.
I imagine there are libraries for interpreting Booleans of this sort, but I haven't been able to find any outside of a full-blown search engine. Are there any tricks to doing this or good starting points using the Django API?
Right now my code is pretty basic: it supports either OR queries or AND queries, but not the two combined (and thus no parentheses either).
EDIT:
I'm fairly convinced that if I could sort this out into a series of AND and OR queries, I'd be all set...but I can't think through how to go from the randomly parenthesised boolean query to a logically useful understanding of the query.
Here's the code I have so far, in case it's useful. I'm not using the tagging module (though maybe I should), and the code is still drafty, but...
@login_required
def view_opinions_by_tag(request, tagValues):
    '''Displays opinions tagged by a user with certain tags.

    Given a set of tags separated by pluses, pipes, and parentheses, unpack
    the set of tags and display the correct opinions. Currently only supports
    pluses (AND filters), and pipes (OR filters).
    '''
    if '|' in tagValues:
        # it's an or query.
        tagList = tagValues.split('|')
        tags = Tag.objects.filter(tag__in=tagList, user=request.user)\
                          .values_list('pk', flat=True)
        faves = Favorite.objects.filter(tags__in=list(tags), user=request.user).distinct()
    elif '+' in tagValues:
        # it's an and query - not very efficient.
        tagList = tagValues.split('+')
        tagObject = Tag.objects.get(tag=tagList[0], user=request.user)
        faves = Favorite.objects.filter(tags=tagObject, user=request.user)
        for tag in tagList[1:]:
            # get() rather than filter(), so a single Tag is passed on, as for the first tag
            tagObject = Tag.objects.get(tag=tag, user=request.user)
            faves = faves.filter(tags=tagObject, user=request.user).distinct()
    else:
        # it's a single tag
        tag = Tag.objects.get(tag=tagValues, user=request.user)
        faves = Favorite.objects.filter(tags=tag, user=request.user)
From here, I essentially take the faves queryset, and render it. Perhaps not the most efficient, but seems to work so far.

Use Q objects.
For the parsing, just set a regexp in your urls.py that will get you the stuff that you need:
url(r'^search/tags/(?P<query>[a-z\-\+\|]+)$', your_view)
In your views:
def your_view(request, query):
    for and_exp in query.split('|'):
        for tag in and_exp.split('+'):
            # do your stuff with q object
            pass
EDIT:
Actually, this is a generic answer, but I just realized that you are talking about tags, and therefore a many-to-many relation.
It will depend very much on the implementation, so you will have to give us more details.
If you use django-tagging, you can make several queries using Object.tagged.with_all('tag1', 'tag2')
It's different for django-taggit, and different again if you are using your own implementation.
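The splitting approach above handles one level of OR over ANDs but not nested parentheses. As a rough sketch of how the parenthesised form could be turned into a tree first, here is a small recursive-descent parser in plain Python. The names `parse` and `matches` are made up for illustration, `+` is assumed to bind tighter than `|`, and the evaluator works on in-memory tag sets rather than on querysets.

```python
import re

# A token is either one of the operators ( ) + | or a run of other characters (a tag).
TOKEN_RE = re.compile(r"[()+|]|[^()+|\s]+")
OPERATORS = {"(", ")", "+", "|"}


def parse(text):
    """Parse e.g. '(foo+dog)|(goat+cat)' into nested ('or'/'and'/'tag', ...) tuples."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():
        # or_expr := and_expr ('|' and_expr)*
        node = parse_and()
        while peek() == "|":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        # and_expr := atom ('+' atom)*
        node = parse_atom()
        while peek() == "+":
            advance()
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        # atom := TAG | '(' or_expr ')'
        tok = advance()
        if tok == "(":
            node = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if tok is None or tok in OPERATORS:
            raise ValueError("expected a tag, got %r" % (tok,))
        return ("tag", tok)

    tree = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token %r" % (peek(),))
    return tree


def matches(tree, tags):
    """Evaluate a parsed tree against a set of tag names."""
    kind = tree[0]
    if kind == "tag":
        return tree[1] in tags
    if kind == "and":
        return matches(tree[1], tags) and matches(tree[2], tags)
    return matches(tree[1], tags) or matches(tree[2], tags)
```

In a view, the same tree walk could return querysets instead of booleans, for instance by chaining `.filter()` calls for `and` nodes and combining the branch results for `or` nodes. That part depends on how the tags are modelled, so it is left out here.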

Related

How to structure views for database pulling and manipulation

I am having difficulties understanding how best to structure my views.
I am pulling data on various users and creating variables that summarise some of it (such as occurrences per week, etc.) so that I can graph these summary variables in my templates. I am doing quite a lot of different manipulations, which is getting quite messy, and I will need these manipulations for other templates. Can somebody recommend how best to structure views in this case? I think using classes is the way to reuse the same functions for other templates, but I cannot quite understand how. I also feel there must be a better way to structure each manipulation of database data.
def dashboard(request):
    posts= Post.objects.filter(user=request.user)
    posts_count = posts.count()
    post_early = Post.objects.filter(user=request.user).earliest('date') #need to extract the date value from this so I can take the difference
    total_days = (datetime.datetime.now().date()- post_early.date).days
    average_30days= round((posts_count/total_days)*30,2)
    list4=[]
    list5=[]
    i=1
    time3=datetime.datetime.now() + datetime.timedelta(-30)
    while i<32:
        list4.append(days2(time3,request,Post))
        list5.append(time3.strftime('%b %d, %Y'))
        i+=1
        time3=time3 + datetime.timedelta(+1)
def dashboardView(request):
    posts = Post.objects.filter(user=request.user)
    posts_count = posts.count()
    # need to extract the date value from post_early so I can take the difference
    post_early = Post.objects.filter(user=request.user).earliest('date')
    total_days = (datetime.datetime.now().date() - post_early.date).days
    average_30days = round((posts_count / total_days) * 30, 2)

    list_4 = []
    list_5 = []
    i = 1
    time_3 = datetime.datetime.now() + datetime.timedelta(-30)

    while i < 32:
        list_4.append(days2(time_3, request, Post))
        list_5.append(time_3.strftime('%b %d, %Y'))
        i += 1
        time_3 = time_3 + datetime.timedelta(1)
I'd do something like this. There were a few inconsistencies:
- Keep a space before and a space after operators (=, *, -, +, ...).
- I'd consider it good practice to always add the suffix -View to your views, but that's just personal preference.
- Use empty lines to separate blocks of code, not groups of variables. If you have a long list of variable declarations (not the case here) you can use comments to separate and categorize them.
- Use list_3 instead of list3 (and similar cases); it's more readable.
For more, you can always check the official Python style guide: https://www.python.org/dev/peps/pep-0008/
Anyway, if you're consistent and adhere to the coding style used in the Django documentation while learning, you'll be fine.
##### EDIT:
Note: my answer is based on the code you provided, which seems cut (no return statement?) and without the other modules.
You're using a function-based view, which is neither wrong nor right, just one of the possible choices. If you don't like it, or want to try something else, a ListView may work for you: https://docs.djangoproject.com/en/2.1/topics/class-based-views/generic-display/
Example:
from django.views.generic import ListView

class DashboardView(ListView):
    model = Post

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # in a class-based view the request is available as self.request
        context['posts'] = Post.objects.filter(user=self.request.user)
        # add all the data you need to the context dictionary
        return context
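One way to reuse the summary calculations across views, sketched under a few assumptions: move them into plain functions that take dates rather than a request, so any view (function-based or class-based) can call them. The function names below are invented for illustration, the unknown `days2` helper from the question is assumed to count posts on a given day, and the guard against a zero-day span is an addition that the original code does not have.

```python
import datetime
from collections import Counter


def daily_counts(post_dates, today, days=31):
    """Return (labels, counts) for the `days` days ending on `today`.

    `post_dates` is an iterable of datetime.date objects, one per post.
    """
    per_day = Counter(post_dates)
    start = today - datetime.timedelta(days=days - 1)
    labels, counts = [], []
    for offset in range(days):
        day = start + datetime.timedelta(days=offset)
        labels.append(day.strftime('%b %d, %Y'))
        counts.append(per_day[day])  # Counter returns 0 for days without posts
    return labels, counts


def average_per_30_days(post_dates, today):
    """Average number of posts per 30 days since the earliest post."""
    post_dates = list(post_dates)
    if not post_dates:
        return 0.0
    # Assumption: treat a span of zero days as one day to avoid dividing by zero.
    total_days = max((today - min(post_dates)).days, 1)
    return round(len(post_dates) / total_days * 30, 2)
```

A view could then build `post_dates` with something like `Post.objects.filter(user=request.user).values_list('date', flat=True)` and put the returned lists into the template context, assuming `date` is a `DateField` as the question's code suggests.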

Queryset in Django if empty field returns all elements

I want to do a filter in Django based on a value submitted through a form.
If the user types a value for Var, it should query the dataset for that value; if it is left blank, it should bring back all the elements.
How can I do that?
I am new to Django.
if request.GET.get('Var'):
    Var = request.GET.get('Var')
else:
    Var = WHAT SHOULD I PUT HERE TO FILTER ALL THE ELEMENTS IN THE CODE BELOW
models.objects.filter(Var=Var)
It's not a great idea from a security standpoint to allow users to input data directly into search terms (and should DEFINITELY not be done for raw SQL queries if you're using any of those.)
With that note in mind, you can take advantage of more dynamic filter creation using a dictionary syntax, or revise the queryset as it goes along:
Option 1: Dictionary Syntax
def my_view(request):
    query = {}
    if request.GET.get('Var'):
        query['Var'] = request.GET.get('Var')
    if request.GET.get('OtherVar'):
        query['OtherVar'] = request.GET.get('OtherVar')
    if request.GET.get('thirdVar'):
        # Say you wanted to add in some further processing
        thirdVar = request.GET.get('thirdVar')
        if int(thirdVar) > 10:
            query['thirdVar'] = 10
        else:
            query['thirdVar'] = int(thirdVar)
    if request.GET.get('lessthan'):
        lessthan = request.GET.get('lessthan')
        query['fieldname__lte'] = int(lessthan)
    results = MyModel.objects.filter(**query)
If nothing has been added to the query dictionary and it's empty, that'll be the equivalent of doing MyModel.objects.all()
My security note from above applies if you wanted to try to do something like this (which would be a bad idea):
MyModel.objects.filter(**request.GET)
Django has a good security track record, but this is less safe than anticipating the types of queries that your users will have. This could also be a huge issue if your schema is known to a malicious site user who could adapt their query syntax to make a heavy query along non-indexed fields.
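To make the "anticipate the queries" idea concrete, here is a minimal sketch that only accepts parameters from an explicit whitelist and converts each value before it reaches the ORM. The helper name `build_filters` and the `ALLOWED` mapping are hypothetical; the field names are the placeholder ones used above.

```python
def build_filters(params, allowed):
    """Build a dict of ORM filter kwargs from user-supplied parameters.

    `params` is any mapping with a .get() method (such as request.GET).
    `allowed` maps a parameter name to a (lookup, converter) pair; anything
    not listed there is ignored.
    """
    filters = {}
    for name, (lookup, convert) in allowed.items():
        raw = params.get(name)
        if not raw:
            continue  # missing or blank: no filter on this field
        try:
            filters[lookup] = convert(raw)
        except ValueError:
            continue  # value of the wrong type: skip it rather than crash
    return filters


# Hypothetical whitelist: parameter name -> (ORM lookup, converter)
ALLOWED = {
    'Var': ('Var', str),
    'lessthan': ('fieldname__lte', int),
}
```

The result can be passed on as `MyModel.objects.filter(**build_filters(request.GET, ALLOWED))`. An empty dict still behaves like `.all()`, and lookups the user invents (for example a regex lookup on an unindexed column) never reach the database.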
Option 2: Revising the Queryset
Alternatively, you can start off with a queryset for everything and then filter accordingly
def my_view(request):
    results = MyModel.objects.all()
    if request.GET.get('Var'):
        results = results.filter(Var=request.GET.get('Var'))
    if request.GET.get('OtherVar'):
        results = results.filter(OtherVar=request.GET.get('OtherVar'))
    return results
A simpler and more explicit way of doing this would be:
if request.GET.get('Var'):
    data = models.objects.filter(Var=request.GET.get('Var'))
else:
    data = models.objects.all()

Searching a many to many database using Google Cloud Datastore

I am quite new to Google App Engine. I know Google Datastore is not SQL, but I am trying to get many-to-many relationship behaviour in it. As you can see below, I have Gif entities and Tag entities. I want my application to search Gif entities by related tag. Here is what I have done:
class Gif(ndb.Model):
    author = ndb.UserProperty()
    link = ndb.StringProperty(indexed=False)

class Tag(ndb.Model):
    name = ndb.StringProperty()

class TagGifPair(ndb.Model):
    tag_id = ndb.IntegerProperty()
    gif_id = ndb.IntegerProperty()

    @classmethod
    def search_gif_by_tag(cls, tag_name):
        query = cls.query(name=tag_name)
        # I am stuck here ...
Is this a correct start to do this? If so, how can I finish it. If not, how to do it?
You can use repeated properties (https://developers.google.com/appengine/docs/python/ndb/properties#repeated). The sample at that link uses tags on an entity; for your exact use case it would look like this:
class Gif(ndb.Model):
    author = ndb.UserProperty()
    link = ndb.StringProperty(indexed=False)
    # you store array of tag keys here you can also just make this
    # StringProperty(repeated=True)
    tag = ndb.KeyProperty(repeated=True)

    @classmethod
    def get_by_tag(cls, tag_name):
        # a query to a repeated property works the same as if it was a single value
        return cls.query(cls.tag == ndb.Key(Tag, tag_name)).fetch()

# we will put the tag_name as its key.id()
# you only really need this if you wanna keep records of your tags
# you can simply keep the tags as string too
class Tag(ndb.Model):
    gif_count = ndb.IntegerProperty(indexed=False)
Maybe you want to use list? I would do something like this if you only need to search gif by tags. I'm using db since I'm not familiar with ndb.
class Gif(db.Model):
    author = db.UserProperty()
    link = db.StringProperty(indexed=False)
    tags = db.StringListProperty(indexed=True)
Query like this
Gif.all().filter('tags =', tag).fetch(1000)
There are different ways of doing many-to-many relationships. Using ListProperties is one way. The limitation to keep in mind with ListProperties is that there's a limit to the number of indexes per entity, and a limit to the total entity size. This means that there's a limit to the number of entries in the list (depending on whether you hit the index count or the entity size first). See the bottom of this page: https://developers.google.com/appengine/docs/python/datastore/overview
If you believe the number of references will work within this limit, this is a good way to go. Considering that you're not going to have thousands of admins for a Page, this is probably the right way.
The other way is to have an intermediate entity that has reference properties to both sides of your many-to-many. This method will let you scale much higher, but because of all the extra entity writes and reads, this is much more expensive.

Filter by certain pattern in all CharField in Model

I have a search box in my project. A user can enter any keyword, and my ModelForm filters on the fields I explicitly tell it to filter on. I use the following piece of code in my Form:
from itertools import chain

def get_matching_siniestros(self):
    if self.cleaned_data['keywords'] is None:
        return None
    matching = []
    for kw in self.cleaned_data['keywords']:
        numero_ajuste = Siniestro.objects.filter(
            numero_ajuste__icontains=kw
        )
        nombre_contratante = Siniestro.objects.filter(
            poliza__contratante__nombre__icontains=kw
        )
        matching = chain(
            numero_ajuste,
            nombre_contratante,
            matching
        )
    # verify not repeated Siniestro
    non_rep_siniestros = []
    for siniestro in matching:
        if siniestro not in non_rep_siniestros:
            non_rep_siniestros.append(siniestro)
    return non_rep_siniestros
What I want to do is programmatically filter on every CharField in the Model and, if possible, on every CharField of nested relations as well. In this example Siniestro has an FK to poliza, and poliza has an FK to contratante.
You can iterate over every field and do whatever you like, e.g.:
[process(field) for field in model._meta.fields if isinstance(field, CharField)]
where process can be a function, or whatever you require.
That said, I should really point out that the complexity you're trying to involve is bound to get messy. IMO, have a look at django-haystack. Indexing should be the way to go.
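As an illustration of what `process` might produce, the sketch below turns a list of field paths into `icontains` lookups, one per keyword and field. The helper name `icontains_lookups` is hypothetical, and the Django-specific part (collecting the paths and OR-ing the lookups with `Q` objects) is only indicated in comments, since it depends on the models.

```python
def icontains_lookups(field_paths, keywords):
    """Return one {'<path>__icontains': keyword} dict per keyword and field path."""
    return [
        {path + '__icontains': kw}
        for kw in keywords
        for path in field_paths
    ]


# The paths could be collected from the model, for example:
#   [f.name for f in Siniestro._meta.fields if isinstance(f, CharField)]
# plus hand-written paths for related models such as 'poliza__contratante__nombre'.
#
# With Django available, the lookups can then be OR-ed together:
#   from functools import reduce
#   import operator
#   from django.db.models import Q
#   query = reduce(operator.or_, (Q(**lookup) for lookup in lookups))
#   Siniestro.objects.filter(query).distinct()
```

Compared with chaining separate querysets and de-duplicating in Python, a single OR-ed filter with `.distinct()` lets the database remove the duplicates.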

Django most efficient way to do this?

I have developed a few Django apps, all pretty straight-forward in terms of how I am interacting with the models.
I am building one now that has several different views which, for lack of a better term, are "canned" search result pages. These pages all return results from the same model, but they are filtered on different columns. One page we might be filtering on type, another we might be filtering on type and size, and on yet another we may be filtering on size only, etc...
I have written a function in views.py which is used by each of these pages, it takes a kwargs and in that are the criteria upon which to search. The minimum is one filter but one of the views has up to 4.
I am simply seeing if the kwargs dict contains one of the filter types, if so I filter the result on that value (I just wrote this code now, I apologize if any errors, but you should get the point):
def get_search_object(**kwargs):
    q = Entry.objects.all()
    if kwargs.__contains__('the_key1'):
        q = q.filter(column1=kwargs['the_key1'])
    if kwargs.__contains__('the_key2'):
        q = q.filter(column2=kwargs['the_key2'])
    return q.distinct()
Now, according to the Django docs (http://docs.djangoproject.com/en/dev/topics/db/queries/#id3), this is fine, in that the DB will not be hit until the queryset is evaluated. Lately, though, I have heard that this is not the most efficient way to do it and that one should probably use Q objects instead.
I guess I am looking for an answer from other developers out there. My way currently works fine, if my way is totally wrong from a resources POV, then I will change ASAP.
Thanks in advance
Resource-wise, you're fine, but there are a lot of ways it can be stylistically improved to avoid using the double-underscore methods and to make it more flexible and easier to maintain.
If the kwargs being used are the actual column names then you should be able to pretty easily simplify it since what you're kind of doing is deconstructing the kwargs and rebuilding it manually but for only specific keywords.
def get_search_object(**kwargs):
    entries = Entry.objects.filter(**kwargs)
    return entries.distinct()
The main difference there is that it doesn't enforce that the keys be actual columns and pretty badly needs some exception handling in there. If you want to restrict it to a specific set of fields, you can specify that list and then build up a dict with the valid entries.
def get_search_object(**kwargs):
    valid_fields = ['the_key1', 'the_key2']
    filter_dict = {}
    for key in kwargs:
        if key in valid_fields:
            filter_dict[key] = kwargs[key]
    entries = Entry.objects.filter(**filter_dict)
    return entries.distinct()
If you want a fancier solution that just checks that it's a valid field on that model, you can (ab)use _meta:
def get_search_object(**kwargs):
    valid_fields = [field.name for field in Entry._meta.fields]
    filter_dict = {}
    for key in kwargs:
        if key in valid_fields:
            filter_dict[key] = kwargs[key]
    entries = Entry.objects.filter(**filter_dict)
    return entries.distinct()
In this case, your usage is fine from an efficiency standpoint. You would only need to use Q objects if you needed to OR your filters instead of AND.