Is there any acceptable way to chop/recombine Django querysets without using the API? - django

I was forced to use a models.CharField to store some additional flags in one of my models, so I'm abusing each letter of the field as a flag. As an example, 'MM5' would mean "man, married, age above 50" and 'FS2' "female, single, age above 20".
I'm using methods to query/access these flags. Of course I cannot use these methods with the queryset API. I'm using list comprehensions calling the methods to filter an initial queryset and transform them into a plain list, which is good enough for most template feeding:
people = People.objects.filter(name__startswith='J')
people_i_want = [p for p in people if p.myflags_ismale() and p.myflags_isolderthan(30)]
So, is there any ok'ish way to retransform these lists back into a queryset? Or to chop/filter a queryset based on the output of my methods without transforming it to a normal list in the first place?

Trying to "re-transform" your list back into a QuerySet would probably be needlessly complicated as well as bad practice. The best thing to do is to use cleverer QuerySet filtering.
You should use the queryset regex lookup to get the functionality you need. Documentation here.
For instance, the equivalent of ismale() using a regex would be something like:
People.objects.filter(myflags__regex=r'^M')  # matches flags whose first character is 'M', e.g. 'MM5'
# Please note I haven't tested this regex, but the principle is sound
Also, while I'm admittedly not a database guru, I'm fairly certain using this sort of "flags" in a charfield is a rather inefficient way of doing things.
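To make the regex approach above a bit more concrete, here is a minimal sketch of a helper that builds the pattern from the flag layout described in the question (sex, marital status, age decade). The helper name flags_regex and its parameters are hypothetical, and the ORM call appears only as a comment. Django hands the pattern to the database's own regex engine, so only simple constructs (anchor, dot, character class) are used here.

```python
import re

def flags_regex(sex=None, status=None, min_decade=None):
    """Build an anchored pattern for a 3-character flag string such as 'MM5'.

    Assumed layout (taken from the question): position 1 is sex ('M'/'F'),
    position 2 is marital status ('M'/'S'), position 3 is a digit meaning
    "age above <digit>0". Arguments left as None match any character.
    """
    if min_decade is not None and not 0 <= min_decade <= 9:
        raise ValueError("min_decade must be a single digit")
    first = re.escape(sex) if sex else "."
    second = re.escape(status) if status else "."
    third = "[%d-9]" % min_decade if min_decade is not None else "."
    return "^" + first + second + third

# Men whose age flag is 3 or higher, i.e. "male and older than 30".
pattern = flags_regex(sex="M", min_decade=3)

# Hypothetical ORM usage (not executed here):
#   People.objects.filter(name__startswith='J', myflags__regex=pattern)
```

Checking the pattern locally with Python's re module before passing it to the ORM is a cheap way to catch positional mistakes.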

If you must convert a list back into a queryset, then the idiom I use is:
People.objects.filter(pk__in=[x.pk for x in list_of_objects])
Obviously, this hits the database again, but it works if you really need a queryset.

Related

How can I dynamically create multi-level hierarchical forms in Django?

I'm building an advanced search page for a scientific database using Django. The goal is to be able to allow some dynamically created sophisticated searches, joining groups of search terms with and & or.
I got part-way there by following this example, which allows multiple terms anded together. It's basically a single group of search terms, dynamically created, that I can either and-together or or-together. E.g.
<field1|field2> <is|is not|contains|doesn't contain> <search term> <->
<+>
...where <-> will remove a search term row and <+> will add a new row.
But I would like the user to be able to either add another search term row, or add an and-group and an or-group, so that I'd have something like:
<and-group|or-group> <->
<field1|field2> <is|is not|contains|doesn't contain> <search term> <->
<+term|+and-group|+or-group>
A user could then add terms or groups. The result search might end up like:
and-group
compound is lysine
or-group
tissue is brain
tissue is spleen
feeding status is not fasted
Thus the resulting filter would be like the following.
Data.objects.filter(Q(compound="lysine") & (Q(tissue="brain") | Q(tissue="spleen")) & ~Q(feeding_status="fasted"))
Note - I'm not necessarily asking how to get the filter expression above correct - it's just the dynamic hierarchical construction component that I'm trying to figure out. Please excuse me if I got the Q and/or filter syntax wrong. I've made these queries before, but I'm still new to Django, so the chance of getting it right off the top of my head here is pretty much zero. I also skipped the model relationships I spanned here, so let's assume these are all fields in the same model, for simplicity.
I'm not sure how I would dynamically add parentheses to the filter expression, but my current code could easily join individual Q expressions with and or or.
I'm also not sure how I could dynamically create a hierarchical form to create the sub-groups. I'm guessing any such solution would have to be a hack and that there are no established mechanisms for doing something like this...
Here's a screenshot example of what I've currently got working:
UPDATE:
I got really far following this example I found. I forked that fiddle and got this proof of concept working before incorporating it into my Django project:
http://jsfiddle.net/hepcat72/d42k38j1/18/
The console spits out exactly the object I want, and there are no errors. Clicking the search button works for form validation. Any field I leave empty causes a prompt to fill in the field. Here's a demo gif:
Now I need to process the POST input to construct the query (which I think I can handle) and restore the form above the results - which I'm not quite sure how to accomplish - perhaps a recursive function in a custom tag?
Although, is there a way to snapshot the form and restore it when the results load below it? Or maybe have the results load in a different frame?
I don't know if I'm teaching a grandmother to suck eggs, but in case not, one of the features of the Python language may be useful.
foo( bar = 27, baz = None)
can instead be coded
args = {}
a1, a2 = 'bar', 'baz'
args[a1] = 27    # keys can be chosen at runtime
args[a2] = None
foo( **args )
so an arbitrary Q object specified by runtime keys and values can be constructed: q1 = Q(**args)
IIRC q1 & q2 and q1 | q2 are themselves Q objects so you can build up a filter of arbitrary complexity.
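Putting those two ideas together, the nested groups can be folded into a single condition with a short recursive function. The sketch below is only an illustration: the tree format (dicts with "type"/"children" for groups and "field"/"value"/"negate" for terms) is an assumption, not something Django defines, and Pred is a small stand-in for Q so the example runs without Django installed. In a real view, the make_leaf argument would be something like lambda f, v: Q(**{f: v}).

```python
from functools import reduce
import operator

class Pred:
    """Stand-in for django.db.models.Q: supports &, | and ~ like Q does."""
    def __init__(self, fn):
        self.fn = fn
    def __and__(self, other):
        return Pred(lambda row: self.fn(row) and other.fn(row))
    def __or__(self, other):
        return Pred(lambda row: self.fn(row) or other.fn(row))
    def __invert__(self):
        return Pred(lambda row: not self.fn(row))
    def __call__(self, row):
        return self.fn(row)

def build_filter(node, make_leaf):
    """Recursively combine a nested search tree into one condition.

    Groups look like {"type": "and" | "or", "children": [...]}.
    Terms look like {"field": ..., "value": ..., "negate": bool (optional)}.
    make_leaf(field, value) must return an object supporting &, | and ~.
    """
    if "children" in node:
        parts = [build_filter(child, make_leaf) for child in node["children"]]
        if not parts:
            raise ValueError("a group needs at least one child")
        combine = operator.and_ if node["type"] == "and" else operator.or_
        # Each recursive call returns one object, which is what provides
        # the "parentheses" around a sub-group.
        return reduce(combine, parts)
    cond = make_leaf(node["field"], node["value"])
    return ~cond if node.get("negate") else cond

# The example search from the question, in the assumed tree format.
tree = {
    "type": "and",
    "children": [
        {"field": "compound", "value": "lysine"},
        {"type": "or", "children": [
            {"field": "tissue", "value": "brain"},
            {"field": "tissue", "value": "spleen"},
        ]},
        {"field": "feeding_status", "value": "fasted", "negate": True},
    ],
}
condition = build_filter(tree, lambda f, v: Pred(lambda row: row.get(f) == v))
```

Because each sub-group is reduced to a single object before being combined with its siblings, no explicit parentheses ever need to be generated.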
I'll also include a mention of Django-filter which is usually my answer to filtering questions like this one, but I suspect in this case you are after greater power than it easily provides. Basically, it will "and" together a list of filter conditions specified by the user. The built-in ones are simple .filter( key=value), but by adding code you can create custom filters with complex Q expressions related to a user-supplied value.
As for the forms, a Django form is a linear construct, and a formset is a list of similar forms. I think I might resort to JavaScript to build some sort of tree representing a complex query in the browser, and have the submit button encode it as JSON and return it through a single text field (or just pick it out of request.POST without using a form). There may be some JavaScript out there already written to do this, but I'm not aware of it. You'd need to be sure that malicious submission of field names and values you weren't expecting doesn't result in security issues. For a pure filtering operation, this basically amounts to being sure that the user is entitled to get all the data in the database table in any case.
There's a form JSONField in the Django PostgreSQL extensions, which validates that user-supplied (or Javascript-generated) text is indeed JSON, and supplies it to you as Python dicts and lists.
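As a rough sketch of the "unexpected field names" concern, the submitted JSON can be decoded and checked against a whitelist before any query is built from it. Everything named here (parse_search, ALLOWED_FIELDS, the tree layout with "type"/"children" and "field"/"value") is hypothetical and only illustrates the idea; it uses the standard json module rather than the form field mentioned above.

```python
import json

ALLOWED_FIELDS = {"compound", "tissue", "feeding_status"}  # hypothetical whitelist
ALLOWED_GROUPS = {"and", "or"}

def parse_search(raw_text, allowed_fields=ALLOWED_FIELDS):
    """Decode the submitted JSON text and reject anything not whitelisted."""
    try:
        tree = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError("search is not valid JSON") from exc
    _check(tree, allowed_fields)
    return tree

def _check(node, allowed_fields):
    """Walk the tree and raise ValueError on the first unexpected node."""
    if not isinstance(node, dict):
        raise ValueError("each node must be a JSON object")
    if "children" in node:
        if node.get("type") not in ALLOWED_GROUPS:
            raise ValueError("unknown group type")
        children = node["children"]
        if not isinstance(children, list) or not children:
            raise ValueError("a group needs at least one child")
        for child in children:
            _check(child, allowed_fields)
    else:
        field = node.get("field")
        if not isinstance(field, str) or field not in allowed_fields:
            raise ValueError("field not allowed: %r" % (field,))
        if not isinstance(node.get("value"), str):
            raise ValueError("value must be a string")
```

In a view this might be called as parse_search(request.POST["search"]), where "search" is an assumed name for the single text field; a ValueError would then be turned into a form error.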

Ordering Django querysets using a JSONField's properties

I have a model that kinda looks like this:
class Person(models.Model):
data = JSONField()
The data field has 2 properties, name and age. Now, let's say I want to get a paginated queryset (each page containing 20 people), with a filter where age is greater than 25, and the queryset is to be ordered by age in descending order. In a usual setup, that is, a normalized database, I can write this query like so:
person_list_page_1 = Person.objects.filter(age__gt=25).order_by('-age')[:20]
Now, what is the equivalent of the above when filtering and ordering using keys stored in the JSONField? I have researched this, and it seems it was meant to be a feature for 2.1, but I can't seem to find anything relevant.
Link to the ticket about it being implemented in the future
I also have another question. Let's say we filter and order using the JSONField. Will the ORM have to get all the objects, filter, and order them before sending the first 20 in such a case? That is, will performance be noticeably slower?
Obviously, I know a normalized database is far better for these things, but my hands are kinda tied.
You can use PostgreSQL's SQL syntax to extract subfields. Then they can be used just like any other field on the model in queryset filters.
from django.db.models.expressions import RawSQL
Person.objects.annotate(
    age=RawSQL("(data->>'age')::int", [])  # ->> returns text, so cast to int
).filter(age__gt=25).order_by('-age')[:20]
See the postgresql docs for other operators and functions.
In some cases, you might have to add explicit typecasts (::int, for example)
https://www.postgresql.org/docs/current/static/functions-json.html
Performance will be slower than with a proper field, but it's not bad.
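The typecast matters for ordering as well as filtering: the ->> operator returns text, and text is compared character by character rather than numerically. The plain-Python sketch below only mimics that difference with made-up sample values; it does not talk to PostgreSQL, and the function names are hypothetical.

```python
def order_desc_as_text(values):
    """Roughly what ORDER BY (data->>'age') DESC does: compare as strings."""
    return sorted(values, reverse=True)

def order_desc_as_int(values):
    """Roughly what ORDER BY (data->>'age')::int DESC does: compare as numbers."""
    return sorted(values, key=int, reverse=True)

ages = ["9", "25", "100", "30"]  # ages as they come out of ->> (text)
text_order = order_desc_as_text(ages)
int_order = order_desc_as_int(ages)
```

With the text comparison "9" sorts ahead of "100", which is why the cast belongs inside the annotation.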

Django 1.6 How to change a list to queryset or How to write this kind of query?

I want to get all the questions with no answers. I use this:
all_questions=[q for q in Question.objects.all() if not q.answer_set.all()]
It works. But then I need to invoke the order_by method on all_questions, so I need to change it to a queryset. How?
Or, is there a standard method like Question.objects.filter(answer_count=0)? I searched hard but found nothing.
Solution: Change answer_count__gt=0 to answer_count=0.
all_questions=Question.objects.annotate(answer_count=Count('answer')).filter(answer_count=0)
You should be able to use an annotation much more efficiently than doing one query per question.
from django.db.models import Count
Question.objects.annotate(answer_count=Count('answer')).filter(answer_count=0)
That said, you could just add the order_by directly into your Question.objects.all() query. But like I said, it's much less efficient to do a query per question.
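To see why the annotation is a single query, it can help to look at roughly the SQL shape it corresponds to: a LEFT OUTER JOIN with GROUP BY and a HAVING clause on the count. The sketch below runs that shape against an in-memory SQLite database using only the standard library. The table and column names are simplified assumptions (Django would generate its own names), and the exact SQL the ORM emits may differ.

```python
import sqlite3

def unanswered_questions(conn):
    """Return (id, text) for questions with zero answers, in one query."""
    sql = """
        SELECT q.id, q.text
        FROM question AS q
        LEFT OUTER JOIN answer AS a ON a.question_id = q.id
        GROUP BY q.id, q.text
        HAVING COUNT(a.id) = 0
        ORDER BY q.text
    """
    return conn.execute(sql).fetchall()

# Tiny made-up data set: question 2 has answers, questions 1 and 3 do not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE answer (
        id INTEGER PRIMARY KEY,
        question_id INTEGER REFERENCES question(id)
    );
    INSERT INTO question VALUES (1, 'b unanswered'), (2, 'answered'), (3, 'a unanswered');
    INSERT INTO answer VALUES (1, 2), (2, 2);
""")
result = unanswered_questions(conn)
```

The ORDER BY at the end plays the role of chaining .order_by() onto the annotated queryset.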

django queryset ordering

I'm listing queryset results and would like to add an option for choosing the order results are displayed.
I would like to pass the actual data from the database to another page for sorting.
I was able to achieve this by collecting all the object ids and using the Django session to recreate a new queryset based on the order criteria.
I was wondering whether there is any other way to achieve this.
Thanks.
Assuming you are currently displaying the data as a table, you could try a client-side JavaScript table sorter such as tablesorter. There are lots of JavaScript table sorters available.
I'm away from my development machine right now, but I think you could just pass the list of ids to a new Queryset, pk__in=list_of_object_ids, and then use the native order_by function.
For example:
objs = Object.objects.filter(pk__in=list_of_object_ids).order_by('value_to_order_by')
Anyway, that's what I would try first, though I'm sure there are better optimizations.
For example, instead of a list of object ids, you could pass a list of dictionaries, each pairing an object id with the value you want to order by.
For example:
[{'obj_id':1,'obj_value':'foo'},{'obj_id':2,'obj_value':'foo'}]
Then use some lambda function to sort it, like here.
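A minimal sketch of that last step, sorting a list of such dictionaries with a lambda as the key function. The helper name sort_records and the sample values are made up for illustration.

```python
def sort_records(records, key="obj_value", descending=False):
    """Return a new list of dicts ordered by the given dictionary key."""
    return sorted(records, key=lambda record: record[key], reverse=descending)

rows = [
    {"obj_id": 1, "obj_value": "foo"},
    {"obj_id": 2, "obj_value": "bar"},
    {"obj_id": 3, "obj_value": "baz"},
]
ascending_ids = [r["obj_id"] for r in sort_records(rows)]
descending_ids = [r["obj_id"] for r in sort_records(rows, descending=True)]
```

This sorts in Python rather than in the database, so it is only reasonable for result sets small enough to hold in the session.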

Django filter vs exclude

Is there a difference between filter and exclude in django? If I have
self.get_query_set().filter(modelField=x)
and I want to add another criterion, is there a meaningful difference between the following two lines of code?
self.get_query_set().filter(user__isnull=False, modelField=x)
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
Is one considered better practice, or are they the same in both function and performance?
Both are lazily evaluated, so I would expect them to perform equivalently. The SQL is likely different, but with no real distinction.
It depends on what you want to achieve. With boolean values it is easy to switch between .exclude() and .filter(), but what if, for example, you want to get all articles except those from March? You can write the query as
Posts.objects.exclude(date__month=3)
With .filter() it would be (but I'm not sure whether this actually works):
Posts.objects.filter(date__month__in=[1,2,4,5,6,7,8,9,10,11,12])
or you would have to use a Q object.
As the function name already suggests, .exclude() is used to exclude rows from the result set. For boolean values you can easily invert this and use .filter() instead, but for other values this can be trickier.
In general, exclude is the opposite of filter. In this case both examples work the same.
Here:
self.get_query_set().filter(user__isnull=False, modelField=x)
You select entries whose user field is not null and whose modelField has value x.
In this case:
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
First you select entries whose modelField has value x (whether user is null or not), then you exclude the entries whose user field is null.
I think that in this case it would be better to use the first option, since it looks cleaner. But both work the same.
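The equivalence can be checked at the SQL level with a small sketch. It assumes user is a plain nullable foreign-key column on the same table. The two WHERE clauses below only approximate what the ORM would emit for the filter and exclude versions, and the table and column names are made up.

```python
import sqlite3

# Approximate shape of .filter(user__isnull=False, modelField=x)
FILTER_SQL = (
    "SELECT id FROM entry "
    "WHERE user_id IS NOT NULL AND model_field = ? ORDER BY id"
)
# Approximate shape of .filter(modelField=x).exclude(user__isnull=True)
EXCLUDE_SQL = (
    "SELECT id FROM entry "
    "WHERE model_field = ? AND NOT (user_id IS NULL) ORDER BY id"
)

def matching_ids(conn, sql, value):
    """Run one of the two queries and return the matching ids as a list."""
    return [row[0] for row in conn.execute(sql, (value,))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY, user_id INTEGER, model_field TEXT);
    INSERT INTO entry VALUES (1, 10, 'x'), (2, NULL, 'x'), (3, 11, 'y'), (4, 12, 'x');
""")
via_filter = matching_ids(conn, FILTER_SQL, "x")
via_exclude = matching_ids(conn, EXCLUDE_SQL, "x")
```

Both queries return the same rows here, which matches the answers above: the choice is mostly about readability.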