Converting a QuerySet to dictionary without iterating in Python? - django

I would like to store certain fields from a QuerySet object into a python object (e.g. dictionary), but I wonder if it is possible to do this without looping, at least explicitly, in Python?
For example, if the content of the QuerySet are as follows:
>> queryset.values_list('parents', 'children', 'x', 'y')
>> [('parent1', 'child1', 1.1, 1.03), ('parent1', 'child2', 1.4, 1.05), ('parent2', 'child1', 0.1, 0.2), ('parent2', 'child2', 1.3, 2.2)]
Now, I would like to convert this into a dictionary. The actual structure of the dictionary is of secondary importance, as long as all the information is contained. E.g.:
d = {
    'child1': {
        'parent1': [1.1, 1.03],
        'parent2': [0.1, 0.2],
    },
    ...
}
# or
d = {
    'parent1': {
        'child1': [1.1, 1.03],
        'child2': [1.4, 1.05],
    },
    ...
}
# or even:
d = {
    'child1': [
        # parent1    # parent2
        (1.1, 1.03), (0.1, 0.2)
    ],
    ...
}
Of course, I can do this if I loop through the queryset in Python, but is it really necessary to do an explicit for loop? Is there a way to populate the dictionary directly from the queryset, similar to how list(queryset) gives you a list?
I mostly worry that a for loop will not scale well in terms of performance if I need to retrieve tens of thousands of similar entries. I know that possibly iteration is unavoidable, but at least I'd like to avoid doing that explicitly.
Everything I found online shows how to do this by iterating over the queryset in Python, which is clear to me, but I wonder if there are other, perhaps more performant, ways of doing this.

Use Serializers:
from rest_framework import serializers
class SomeModelSerializer(serializers.ModelSerializer):
    class Meta:
        model = SomeModel
        fields = "__all__"
SomeModelSerializer(instance).data
returns
{'auto_now_add': '2018-12-20T21:34:29.494827Z',
'foreign_key': 2,
'id': 1,
'many_to_many': [2],
'normal_value': 1,
'readonly_value': 2}
I think this solution has good performance. I trust the Django REST Framework team :)
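If the goal is one of the nested dictionaries from the question rather than a serialized model, a single pass over the rows is hard to avoid, but it can stay short. The sketch below assumes the rows come from queryset.values_list('parents', 'children', 'x', 'y'); a hard-coded list stands in for the queryset so it runs without Django, and the name group_by_child is made up for illustration.

```python
from collections import defaultdict

# Stand-in for queryset.values_list('parents', 'children', 'x', 'y').
rows = [
    ('parent1', 'child1', 1.1, 1.03),
    ('parent1', 'child2', 1.4, 1.05),
    ('parent2', 'child1', 0.1, 0.2),
    ('parent2', 'child2', 1.3, 2.2),
]

def group_by_child(rows):
    """Build {child: {parent: [x, y]}} in a single pass over the rows."""
    grouped = defaultdict(dict)
    for parent, child, x, y in rows:
        grouped[child][parent] = [x, y]
    return dict(grouped)

d = group_by_child(rows)
```

The loop still exists, but the database does the field selection and Python only touches small tuples, which is usually the cheaper part of the work.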

Related

Combining filters for a Django queryset

Let's say I have models that look like this:
class Sauce(models.Model):
    ...
class Topping(models.Model):
    ...
class Pizza(models.Model):
    sauces = models.ManyToManyField(Sauce, related_name='pizzas')
    toppings = models.ManyToManyField(Topping, related_name='pizzas')
    geo_type = models.CharField(max_length=50, choices=(('NY', 'New York'), ('IT', 'Italy')))
Now, I have an endpoint which accepts URL parameters for filtering the pizza table. For example, one time I might get the following:
{
"sauces": [1, 4],
"toppings": [4, 7],
"geo_type": "NY"
}
Using this, I would simply filter using the following code:
Pizza.objects.filter(sauces__in=url_params["sauces"], toppings__in=url_params["toppings"], geo_type=url_params["geo_type"])
And this would work perfectly fine. However, sometimes I might get URL parameters which look like this:
{
"sauces": [],
"toppings": [4, 7],
"geo_type": "NY"
}
Notice the empty array for the sauces parameter. This means that for this request, I don't care about sauces and it can be anything. Now the query would be something like this:
Pizza.objects.filter(toppings__in=url_params["toppings"], geo_type=url_params["geo_type"])
Once again, this works as expected. However, the issue is that I have a lot of these fields to filter, and the number of combinations is huge. Is there some way to just tell my queryset to ignore a filter if it is an empty array? And if the geo_type is an empty string or null, it should ignore that too. Hopefully I have gotten my point across.
Thanks for any help.
You can omit the empty lists, for example by making a helper function:
def filter_ignore_if_empty(qs, **kwargs):
    return qs.filter(**{k: v for k, v in kwargs.items() if v != []})
and then filter with:
filter_ignore_if_empty(
    Pizza.objects.all(),
    sauces__in=url_params['sauces'],
    toppings__in=url_params['toppings'],
    geo_type=url_params['geo_type']
)
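The helper above only skips empty lists, while the question also asks to ignore an empty string or null geo_type. A minimal sketch of that variant follows; drop_empty_filters and its empty_values parameter are hypothetical names, and the function works on a plain dict so it can be tried without a database.

```python
def drop_empty_filters(filters, empty_values=([], '', None)):
    """Return a copy of `filters` without entries whose value counts as empty."""
    return {k: v for k, v in filters.items() if v not in empty_values}

url_params = {"sauces": [], "toppings": [4, 7], "geo_type": ""}
lookups = {
    "sauces__in": url_params["sauces"],
    "toppings__in": url_params["toppings"],
    "geo_type": url_params["geo_type"],
}
cleaned = drop_empty_filters(lookups)
# With Django available, the result would be used as:
# Pizza.objects.filter(**cleaned)
```

Values such as 0 or False are kept, because only the listed empty values are dropped.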

Django: Getting results from multiple joined tables

I have 3 tables: a flotation has many lightresidues, each of which has many compositions.
I want to make a view for each flotation which can access a list of light residues and a list of associated compositions. There are only ever 2-3 light residues to each flotation and the same for compositions, so it is manageable.
I can get a flotation record and its referenced light residues, but I'm having trouble passing the lightresidue_id to get the compositions. [n.b. I know lightresidue.id is the Django way of doing things, but I opt for this way]
The view code is as follows. I've hard-coded lightresidue.lightresidue_id = 17, which works, but how do I substitute this with lightresidue.lightresidue_id = composition.lightresidue_id?
def botanyoverview(request, flotation_id):
    flotation = get_object_or_404(Flotation, pk=flotation_id)
    lightresidue = LightResidue.objects.filter(flotation_id__flotation_id=flotation_id)
    # composition = Composition.objects.filter(lightresidue.lightresidue_id)
    composition = Composition.objects.filter(lightresidue_id=17)
    return render(request, 'dashboard/botanyoverview.html',
        {
            'flotation': flotation,
            'lightresidue': lightresidue,
            'composition': composition,
        })
You could do this by chaining the compositions of all the light residues using itertools.
from itertools import chain
def botanyoverview(request, flotation_id):
    flotation = get_object_or_404(Flotation, pk=flotation_id)
    lightresidue = LightResidue.objects.filter(flotation_id__flotation_id=flotation_id)
    # Collect one Composition queryset per light residue.
    querysets = []
    for i in lightresidue:
        querysets.append(Composition.objects.filter(lightresidue_id=i.lightresidue_id))
    # Flatten into one list so the template can iterate over it more than once.
    composition = list(chain.from_iterable(querysets))
    return render(request, 'dashboard/botanyoverview.html',
        {
            'flotation': flotation,
            'lightresidue': lightresidue,
            'composition': composition,
        })
The lightresidue queryset may contain one or more objects, so here I am getting all the compositions associated with each light residue separately and combining them using itertools.
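Note that chain.from_iterable expects an iterable of iterables, so the per-residue results have to be collected as separate items rather than merged with +=. A small sketch with plain lists standing in for the Composition querysets (the comp_* strings are placeholders, not real records):

```python
from itertools import chain

# Plain lists stand in for the per-residue Composition querysets.
per_residue = [
    ['comp_a', 'comp_b'],  # compositions of the first light residue
    ['comp_c'],            # compositions of the second light residue
    [],                    # a light residue without compositions
]

# Flatten the nested results into one list, preserving order.
compositions = list(chain.from_iterable(per_residue))
```

Depending on how the models are defined, an __in lookup on the foreign key may fetch the same rows in a single query, which would avoid the loop altogether.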

Custom Django count filtering

A lot of websites will display:
"1.8K pages" instead of "1,830 pages"
or
"43.2M pages" instead of "43,200,123 pages"
Is there a way to do this in Django?
For example, the following code will generate the quantified amount of objects in the queryset (i.e. 3,123):
Books.objects.all().count()
Is there a way to add a custom count filter to return "3.1K pages" instead of "3,123 pages"?
Thank you in advance!
First off, I wouldn't do anything that alters the way the ORM portion of Django works. There are two places this could be done: on the back end or on the frontend. If you are only planning on using it in one place, do it on the frontend. With that said, there are many ways to achieve this result. Just to spout off a few ideas, you could write a property on your model that calls count and then converts the result to something a little more human-readable on the back end. If you want to do it on the frontend, you might want to find a JavaScript library that could do the conversion.
I will edit this later from my computer and add an example of the property.
Edit: To answer your comment, the easier one to implement depends on your skills in python vs in JavaScript. I prefer python so I would probably do it in there somewhere on the model.
Edit2: I have written an example to show you how I would do a classmethod on a base model or on the model that you need these numbers on. I found a Python package called humanize, took its function that converts numbers to a readable form, modified it a bit to allow for thousands, and took out some of the very large number conversions.
def readable_number(value, short=False):
    # Modified from the package `humanize` on PyPI.
    powers = [10 ** x for x in (3, 6, 9, 12, 15, 18)]
    human_powers = ('thousand', 'million', 'billion', 'trillion', 'quadrillion')
    human_powers_short = ('K', 'M', 'B', 'T', 'QD')
    try:
        value = int(value)
    except (TypeError, ValueError):
        return value
    if value < powers[0]:
        return str(value)
    for ordinal, power in enumerate(powers[1:], 1):
        if value < power:
            chopped = value / float(powers[ordinal - 1])
            chopped = format(chopped, '.1f')
            if not short:
                return '{} {}'.format(chopped, human_powers[ordinal - 1])
            return '{}{}'.format(chopped, human_powers_short[ordinal - 1])
class MyModel(models.Model):
    @classmethod
    def readable_count(cls, short=True):
        count = cls.objects.all().count()
        return readable_number(count, short=short)
print(readable_number(62220, True)) # Returns '62.2K'
print(readable_number(6555500)) # Returns '6.6 million'
I would stick that readable_number in some sort of utils and just import it in your models file. Once you have that, you can just stick that string wherever you would like on your frontend.
You would use MyModel.readable_count() to get that value. If you want it under MyModel.objects.readable_count() you will need to make a custom object manager for your model, but that is a bit more advanced.

In Django's template engine, is it possible to run a filter through an entire array?

For example, if I have an array of datetime.date objects, I would like to apply a date format filter to each of its elements, while still making use of the default string representation of the array.
Given a date array that looks like:
[datetime.date(2011, 2, 28), datetime.date(2011, 3, 1), datetime.date(2011, 3, 2)]
Assuming that I already passed it to the template's context, I'd like to do this in the template:
<script>
// ...
var dates = {{ my_date_array|date:'b d, Y' }};
// ...
</script>
so it produces:
var dates = ['Feb 28, 2011', 'Mar 1, 2011', 'Mar 2, 2011'];
..instead of having to loop through the elements of the array.
Is this possible by default, without creating a custom filter?
Looking at the source, I'd say that's not possible using the default date filter.
You will have to either use a loop in your template, or create a custom filter that accepts a list of date objects.
Update:
It should be relatively easy to create your own filter by making use of the existing one. For example:
from django.template.defaultfilters import date
from django import template
register = template.Library()
# Only mildly tested. Use with caution.
def datelist(values, arg=None):
    try:
        outstr = "', '".join([date(v, arg) for v in values])
    except TypeError:  # non-iterable?
        outstr = date(values, arg)
    return "['%s']" % outstr
register.filter('datelist', datelist)
If you don't like that approach for determining iterable objects, you could also use:
# collections.abc requires Python >= 3.3; the old collections.Iterable alias was removed in Python 3.10
from collections.abc import Iterable
if isinstance(values, Iterable):
    # ....
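Another option, if a custom filter feels like too much, is to format the dates in the view and hand the template a ready-made JavaScript array literal. A minimal sketch, assuming ISO date strings are acceptable on the JavaScript side; dates_as_js_array is a hypothetical helper name, and the result would still need to be output unescaped in the template (for example with the safe filter).

```python
import datetime
import json

def dates_as_js_array(dates):
    """Serialize a list of dates as a JSON array of ISO-formatted strings."""
    return json.dumps([d.isoformat() for d in dates])

my_date_array = [
    datetime.date(2011, 2, 28),
    datetime.date(2011, 3, 1),
    datetime.date(2011, 3, 2),
]
js_literal = dates_as_js_array(my_date_array)
```

A JSON array of strings is also a valid JavaScript array literal, so js_literal can be assigned directly to var dates.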

Django: ordering numerical value with order_by

I'm in a situation where I must output quite a large list of objects ordered by a CharField used to store street addresses.
My problem is that the data is obviously ordered by ASCII codes since it's a CharField, with predictable results: it sorts the numbers like this:
1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21....
Now the obvious step would be to change the CharField to the proper field type (IntegerField, let's say), however that cannot work since some addresses might have apartments, like "128A".
I really don't know how I can order this properly.
If you're sure there are only integers in the field, you could get the database to cast it as an integer via the extra method, and order by that:
MyModel.objects.extra(
    select={'myinteger': 'CAST(mycharfield AS INTEGER)'}
).order_by('myinteger')
Django is trying to deprecate the extra() method, but has introduced Cast() in v1.10. In SQLite (at least), CAST can take a value such as 10a and will cast it to the integer 10, so you can do:
from django.db.models import IntegerField
from django.db.models.functions import Cast
MyModel.objects.annotate(
    my_integer_field=Cast('my_char_field', IntegerField())
).order_by('my_integer_field', 'my_char_field')
which will return objects sorted by the street number first numerically, then alphabetically, e.g. ...14, 15a, 15b, 16, 16a, 17...
If you're using PostgreSQL (not sure about MySQL) you can safely use the following code on char/text fields and avoid cast errors:
MyModel.objects.extra(
    select={'myinteger': "CAST(substring(charfield FROM '^[0-9]+') AS INTEGER)"}
).order_by('myinteger')
Great tip! It works for me! :) That's my code:
revisioned_objects = revisioned_objects.extra(select={'casted_object_id': 'CAST(object_id AS INTEGER)'}).extra(order_by = ['casted_object_id'])
I know that I’m late on this, but since it’s strongly related to the question and I had a hard time finding this:
You can put the Cast directly in the ordering option of your model.
from django.db import models
from django.db.models.functions import Cast
class Address(models.Model):
    street_number = models.CharField()
    class Meta:
        ordering = [
            Cast("street_number", output_field=models.IntegerField()),
        ]
From the doc about ordering:
You can also use query expressions.
And from the doc about database functions:
Functions are also expressions, so they can be used and combined with other expressions like aggregate functions. 
The problem you're up against is quite similar to how filenames get ordered when sorting by filename. There, you want "2 Foo.mp3" to appear before "12 Foo.mp3".
A common approach is to "normalize" numbers by expanding them to a fixed number of digits, and then sort based on the normalized form. That is, for purposes of sorting, "2 Foo.mp3" might expand to "0000000002 Foo.mp3".
Django won't help you here directly. You can either add a field to store the "normalized" address, and have the database order_by that, or you can do a custom sort in your view (or in a helper that your view uses) on address records before handing the list of records to a template.
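A minimal sketch of that normalization, done in Python as a custom sort key. The function name normalized and the width of 10 digits are assumptions for illustration; the same function could also be used to fill a stored "normalized" field.

```python
import re

def normalized(text, width=10):
    """Left-pad every run of digits with zeros so string order matches numeric order."""
    return re.sub(r'\d+', lambda m: m.group().zfill(width), text)

addresses = ['10', '2', '128A', '1', '128', '20']
ordered = sorted(addresses, key=normalized)
```

Note that this pads every run of digits in the string, not only the leading one, so digits inside a file extension would be padded as well.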
In my case I have a CharField called name which holds mixed (int + string) values, for example "a1", "f65", "P", "55", etc.
I solved the issue by using the SQL CAST (tested with Postgres and MySQL).
First, I sort by the cast integer value, and then by the original value of the name field.
parking_slots = ParkingSlot.objects.all().extra(
    select={'num_from_name': 'CAST(name AS INTEGER)'}
).order_by('num_from_name', 'name')
This way, in any case, the correct sorting works for me.
In case you need to sort version numbers consisting of multiple numbers separated by a dot (e.g. 1.9.0, 1.10.0), here is a postgres-only solution:
class VersionRecordManager(models.Manager):
    def get_queryset(self):
        return super().get_queryset().extra(
            select={
                'natural_version': "string_to_array(version, '.')::int[]",
            },
        )
    def available_versions(self):
        return self.filter(available=True).order_by('-natural_version')
    def last_stable(self):
        return self.available_versions().filter(stable=True).first()
class VersionRecord(models.Model):
    objects = VersionRecordManager()
    version = models.CharField(max_length=64, db_index=True)
    available = models.BooleanField(default=False, db_index=True)
    stable = models.BooleanField(default=False, db_index=True)
In case you want to allow non-numeric characters (e.g. 0.9.0 beta, 2.0.0 stable):
def get_queryset(self):
    return super().get_queryset().extra(
        select={
            'natural_version':
                "string_to_array( "
                "   regexp_replace( "  # Remove everything except digits
                "       version, '[^\d\.]+', '', 'g' "  # and dots, then split string into
                "   ), '.' "  # an array of integers.
                ")::int[] "
        }
    )
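To see why ordering by an integer array works, here is a rough Python analogue of the SQL above. It is only a sketch of the idea, not a replacement for the database-side ordering: version_key is a hypothetical name, and it assumes every dot-separated part still contains digits after the cleanup.

```python
import re

def version_key(version):
    """Strip everything except digits and dots, then compare as a tuple of ints."""
    cleaned = re.sub(r'[^\d.]+', '', version)
    return tuple(int(part) for part in cleaned.split('.'))

versions = ['1.10.0', '0.9.0 beta', '1.9.0', '2.0.0 stable']
newest_first = sorted(versions, key=version_key, reverse=True)
```

Tuples compare element by element, much like the int[] arrays in PostgreSQL, so 1.10.0 correctly sorts above 1.9.0.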
I was looking for a way to sort the numeric chars in a CharField and my search led me here. The name fields in my objects are CC Licenses, e.g., 'CC BY-NC 4.0'.
Since extra() is going to be deprecated, I was able to do it this way:
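from django.db.models import F, Func, IntegerField, Value
from django.db.models.functions import Cast
MyObject.objects.all().annotate(
    sorting_int=Cast(Func(F('name'), Value(r'\D'), Value(''), Value('g'), function='regexp_replace'), IntegerField())
).order_by('-sorting_int')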
Thus, MyObject with name='CC BY-NC 4.0' now has sorting_int=40.
None of the answers in this thread worked for me because they assume numerical text. I found a solution that will work for a subset of cases. Consider this model:
class Block(models.Model):
    title = models.CharField()
Say I have titles that sometimes have leading characters followed by trailing numerical characters. If I try to order normally:
>>> Block.objects.all().order_by('title')
<QuerySet [<Block: 1>, <Block: 10>, <Block: 15>, <Block: 2>, <Block: N1>, <Block: N12>, <Block: N4>]>
As expected, it's correct alphabetically, but makes no sense for us humans. The trick I used for this particular use case is to replace any text I find with the number 9999, then cast the value to an integer and order by it.
For most cases that have leading characters this will get the desired result. See below:
from django.db.models.expressions import RawSQL
>>> Block.objects.all()\
    .annotate(my_faux_integer=RawSQL("CAST(regexp_replace(title, '[A-Z]+', '9999', 'g') AS INTEGER)", []))\
    .order_by('my_faux_integer', 'title')
<QuerySet [<Block: 1>, <Block: 2>, <Block: 10>, <Block: 15>, <Block: N1>, <Block: N4>, <Block: N12>]>
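The same trick can be reproduced in plain Python to check how a given set of titles will sort before running it against the database. This is only an emulation of the SQL expression, and faux_integer is a hypothetical helper name.

```python
import re

def faux_integer(title):
    """Replace each run of capital letters with 9999, then read the result as an int."""
    return int(re.sub(r'[A-Z]+', '9999', title))

titles = ['1', '10', '15', '2', 'N1', 'N12', 'N4']
# Order by the faux integer first, then by the original title as a tie-breaker.
ordered = sorted(titles, key=lambda t: (faux_integer(t), t))
```

It also makes the limits of the approach visible: a purely numeric title of 99991 or more would interleave with the prefixed titles, since N1 itself maps to 99991.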