Django: serialize an annotated and aggregated queryset to GeoJSON

I am trying to use the Django ORM (or any other way using Django) to execute this query (PostgreSQL) and send the result back to the front end in GeoJSON format.
I am using Django 2.2.15.
SELECT string_agg(name, '; '), geom
FROM appname_gis_observation
where sp_order = 'order1'
GROUP BY geom;
The model looks like this (models.py)
# The GIS-aware models module replaces the plain django.db import, so only this one is needed.
from django.contrib.gis.db import models
class gis_observation(models.Model):
    name = models.CharField(max_length=100, null=True)
    sp_order = models.CharField(max_length=100, null=True)
    geom = models.MultiPointField(srid=4326)
So I thought this would work (views.py)
from django.core.serializers import serialize
from .models import *
from django.shortcuts import render
from django.contrib.postgres.aggregates.general import StringAgg
def show_observation(request):
    results = gis_observation.objects.values('geom').filter(sp_order='order1').annotate(newname=StringAgg('name', delimiter='; '))
    data_geojson = serialize('geojson', results, geometry_field='geom', fields=('newname',))
    return render(request, "visualize.html", {"obs": data_geojson})
The ORM query works fine in the Django shell but Django complains at the serialize step: AttributeError: 'dict' object has no attribute '_meta'.
Even if the serialize step worked, I suspect it would skip my annotated field (by reading other posts)
Apparently I am not the only one who met that same problem but I could not find a solution for it.

This is the solution I came up with. Frankly I'd be glad to accept another answer, so any proposal is still welcome!
In the end, I built a GeoJSON array by looping through the result set. I guess I could just as well have used a raw SQL query with a cursor and skipped the ORM API entirely.
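import json
from django.contrib.postgres.aggregates.general import StringAgg
from django.shortcuts import render
from .models import *
def show_observation(request):
    queryset = gis_species_observation.objects.values('geom').filter(sp_order='order1').annotate(name=StringAgg('name', delimiter='; '))
    mydict = []  # despite the name, this is a list of GeoJSON features
    results = list(queryset)
    for result in results:
        rec = {}
        rec["type"] = "Feature"
        # GEOSGeometry.geojson is a JSON string, so parse it back into a dict
        rec["geometry"] = json.loads(result["geom"].geojson)
        rec["properties"] = {"name": result["name"]}
        mydict.append(rec)
    data_geojson = json.dumps(mydict)
    return render(request, "visualize_romania.html", {"mynames": data_geojson})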
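If the front end expects a single GeoJSON object rather than a bare array of features, the same loop can wrap its output in a FeatureCollection. The following is only a sketch: rows_to_feature_collection and FakeGeometry are hypothetical names, and FakeGeometry stands in for a GEOSGeometry (which exposes a .geojson string) so that the example runs without GeoDjango.

```python
import json


def rows_to_feature_collection(rows, geometry_key="geom"):
    """Build a GeoJSON FeatureCollection string from dict-like rows.

    Each row must hold a geometry object exposing a `.geojson` string;
    every other key in the row becomes a feature property.
    """
    features = []
    for row in rows:
        properties = {k: v for k, v in row.items() if k != geometry_key}
        features.append({
            "type": "Feature",
            "geometry": json.loads(row[geometry_key].geojson),
            "properties": properties,
        })
    return json.dumps({"type": "FeatureCollection", "features": features})


class FakeGeometry:
    """Hypothetical stand-in for a GEOSGeometry, used only for this sketch."""

    def __init__(self, coords):
        self.geojson = json.dumps({"type": "MultiPoint", "coordinates": coords})


# Rows shaped like the output of .values('geom').annotate(name=...)
rows = [{"geom": FakeGeometry([[26.1, 44.4]]), "name": "species a; species b"}]
collection = json.loads(rows_to_feature_collection(rows))
```

In the view, the rows argument would be the values() queryset itself, since it yields dicts with the same keys.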

Related

In Django how can I get result from two tables in a single queryset without having relationship? [duplicate]

I'm trying to build the search for a Django site I am building, and in that search, I am searching in three different models. And to get pagination on the search result list, I would like to use a generic object_list view to display the results. But to do that, I have to merge three querysets into one.
How can I do that? I've tried this:
result_list = []
page_list = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) |
    Q(body__icontains=cleaned_search_term))
article_list = Article.objects.filter(
    Q(title__icontains=cleaned_search_term) |
    Q(body__icontains=cleaned_search_term) |
    Q(tags__icontains=cleaned_search_term))
post_list = Post.objects.filter(
    Q(title__icontains=cleaned_search_term) |
    Q(body__icontains=cleaned_search_term) |
    Q(tags__icontains=cleaned_search_term))
for x in page_list:
    result_list.append(x)
for x in article_list:
    result_list.append(x)
for x in post_list:
    result_list.append(x)
return object_list(
    request,
    queryset=result_list,
    template_object_name='result',
    paginate_by=10,
    extra_context={
        'search_term': search_term},
    template_name="search/result_list.html")
But this doesn't work. I get an error when I try to use that list in the generic view. The list is missing the clone attribute.
How can I merge the three lists, page_list, article_list and post_list?
Concatenating the querysets into a list is the simplest approach. If the database will be hit for all querysets anyway (e.g. because the result needs to be sorted), this won't add further cost.
from itertools import chain
result_list = list(chain(page_list, article_list, post_list))
Using itertools.chain is faster than looping each list and appending elements one by one, since itertools is implemented in C. It also consumes less memory than converting each queryset into a list before concatenating.
Now it's possible to sort the resulting list e.g. by date (as requested in hasen j's comment to another answer). The sorted() function conveniently accepts a generator and returns a list:
result_list = sorted(
    chain(page_list, article_list, post_list),
    key=lambda instance: instance.date_created)
If you're using Python 2.4 or later, you can use attrgetter instead of a lambda. I remember reading about it being faster, but I didn't see a noticeable speed difference for a million item list.
from operator import attrgetter
result_list = sorted(
    chain(page_list, article_list, post_list),
    key=attrgetter('date_created'))
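To see the chain-and-sort approach run outside Django, here is a small sketch in which a hypothetical Record dataclass stands in for the model instances and plain lists stand in for the three querysets. Only the date_created attribute name is taken from the answer above.

```python
from dataclasses import dataclass
from datetime import date
from itertools import chain
from operator import attrgetter


@dataclass
class Record:
    """Hypothetical stand-in for a model instance with a date_created field."""
    title: str
    date_created: date


# Plain lists standing in for the three querysets.
page_list = [Record("page", date(2020, 3, 1))]
article_list = [Record("article", date(2020, 1, 1)),
                Record("late article", date(2020, 5, 1))]
post_list = [Record("post", date(2020, 2, 1))]

# chain() yields items lazily; sorted() consumes them and returns one list.
result_list = sorted(chain(page_list, article_list, post_list),
                     key=attrgetter("date_created"))
titles = [record.title for record in result_list]
```

The result is an ordinary list, so it can be passed to a paginator but no longer supports queryset methods such as filter().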
Try this:
matches = pages | articles | posts
It retains all the functions of the querysets which is nice if you want to order_by or similar.
Please note: this doesn't work on querysets from two different models.
Related, for mixing querysets from the same model, or for similar fields from a few models, starting with Django 1.11 a QuerySet.union() method is also available:
union()
union(*other_qs, all=False)
New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets. For example:
>>> qs1.union(qs2, qs3)
The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True
argument.
union(), intersection(), and difference() return model instances of
the type of the first QuerySet even if the arguments are QuerySets of
other models. Passing different models works as long as the SELECT
list is the same in all QuerySets (at least the types, the names don’t
matter as long as the types in the same order).
In addition, only LIMIT, OFFSET, and ORDER BY (i.e. slicing and
order_by()) are allowed on the resulting QuerySet. Further, databases
place restrictions on what operations are allowed in the combined
queries. For example, most databases don’t allow LIMIT or OFFSET in
the combined queries.
You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.
Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.
from itertools import islice, chain
class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """
    def __init__(self, *subquerysets):
        self.querysets = subquerysets
    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)
    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)
    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)
    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            # next(iterator) works on Python 2.6+ and 3; iterator.next() is Python 2 only
            return next(islice(self._all(), ndx, ndx + 1))
In your example, the usage would be:
pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) |
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)
Then use matches with the paginator like you used result_list in your example.
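To illustrate what a paginator would ask of such a chain (a total count plus one slice per page), here is a condensed, standalone sketch. FakeQuerySet is a hypothetical stand-in that is only iterable and has count(); the QuerySetChain below is a trimmed copy of the class above so that the example runs without Django.

```python
from itertools import chain, islice


class FakeQuerySet:
    """Hypothetical stand-in for a QuerySet: iterable and has .count()."""

    def __init__(self, items):
        self.items = list(items)

    def count(self):
        return len(self.items)

    def __iter__(self):
        return iter(self.items)


class QuerySetChain:
    """Trimmed copy of the class above: just count() and indexing/slicing."""

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        return sum(qs.count() for qs in self.querysets)

    def __getitem__(self, ndx):
        everything = chain(*self.querysets)
        if isinstance(ndx, slice):
            return list(islice(everything, ndx.start, ndx.stop, ndx.step or 1))
        return next(islice(everything, ndx, ndx + 1))


matches = QuerySetChain(FakeQuerySet(["p1", "p2"]),
                        FakeQuerySet(["a1"]),
                        FakeQuerySet(["x1", "x2"]))
page_size = 2
total = matches.count()
second_page = matches[page_size:2 * page_size]  # items 2 and 3 of the chain
```

The second page spans the boundary between the second and third stand-in querysets, which is the case the chained slicing is meant to handle.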
The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.
In case you want to chain a lot of querysets, try this:
from itertools import chain
result = list(chain(*docs))
where: docs is a list of querysets
The big downside of your current approach is its inefficiency with large search result sets, as you have to pull down the entire result set from the database each time, even though you only intend to display one page of results.
In order to only pull down the objects you actually need from the database, you have to use pagination on a QuerySet, not a list. If you do this, Django actually slices the QuerySet before the query is executed, so the SQL query will use OFFSET and LIMIT to only get the records you will actually display. But you can't do this unless you can cram your search into a single query somehow.
Given that all three of your models have title and body fields, why not use model inheritance? Just have all three models inherit from a common ancestor that has title and body, and perform the search as a single query on the ancestor model.
This can be achieved in either of two ways.
1st way to do this
Use the | operator on querysets to take the union of two querysets. This works only if both querysets belong to the same model.
For instance:
pagelist1 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) |
    Q(body__icontains=cleaned_search_term))
pagelist2 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) |
    Q(body__icontains=cleaned_search_term))
combined_list = pagelist1 | pagelist2  # this takes the union of the two querysets
2nd way to do this
Another way to combine two querysets is to use the itertools chain function.
from itertools import chain
combined_results = list(chain(pagelist1, pagelist2))
You can use Union:
qs = qs1.union(qs2, qs3)
But if you want to apply order_by on fields of the foreign models of the combined queryset, then you need to select them beforehand with select_related(), as shown here; otherwise it won't work.
Example
qs = qs1.union(qs2.select_related("foreignModel"), qs3.select_related("foreignModel"))
qs.order_by("foreignModel__prop1")
where prop1 is a property in the foreign model.
DATE_FIELD_MAPPING = {
    Model1: 'date',
    Model2: 'pubdate',
}
def my_key_func(obj):
    return getattr(obj, DATE_FIELD_MAPPING[type(obj)])
And then sorted(chain(Model1.objects.all(), Model2.objects.all()), key=my_key_func)
Quoted from https://groups.google.com/forum/#!topic/django-users/6wUNuJa4jVw. See Alex Gaynor
Requirements:
Django==2.0.2, django-querysetsequence==0.8
In case you want to combine querysets and still come out with a QuerySet, you might want to check out django-queryset-sequence.
One note about it, though: it only takes two querysets as its arguments. But with Python's reduce you can always apply it to multiple querysets.
from functools import reduce
from queryset_sequence import QuerySetSequence
combined_queryset = reduce(QuerySetSequence, list_of_queryset)
And that's it. Below is a situation I ran into and how I employed a list comprehension, reduce and django-queryset-sequence:
from functools import reduce
from django.contrib.auth.models import User
from django.db import models
from django.shortcuts import render
from queryset_sequence import QuerySetSequence
class People(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    mentor = models.ForeignKey('self', null=True, on_delete=models.SET_NULL, related_name='my_mentees')
class Book(models.Model):
    name = models.CharField(max_length=20)
    # Books must point at People for each.book_set below to exist
    owner = models.ForeignKey(People, on_delete=models.CASCADE)
# as a mentor, I want to see all the books owned by all my mentees in one view.
def mentee_books(request):
    template = "my_mentee_books.html"
    mentor = People.objects.get(user=request.user)
    my_mentees = mentor.my_mentees.all()  # returns QuerySet of all my mentees
    mentee_books = reduce(QuerySetSequence, [each.book_set.all() for each in my_mentees])
    return render(request, template, {'mentee_books': mentee_books})
Here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot.
The best option is to use the Django built-in methods:
# Union method
result_list = page_list.union(article_list, post_list)
That will return the union of all the objects in those querysets.
If you want to get just the objects that are in the three querysets, you will love the built-in method of querysets, intersection.
# intersection method
result_list = page_list.intersection(article_list, post_list)
This will do the work without using any other libraries:
result_list = page_list | article_list | post_list
You can use "|"(bitwise or) to combine the querysets of the same model as shown below:
# "store/views.py"
from .models import Food
from django.http import HttpResponse
def test(request):
    #                                        ↓ Bitwise or
    result = Food.objects.filter(name='Apple') | Food.objects.filter(name='Orange')
    print(result)
    return HttpResponse("Test")
Output on console:
<QuerySet [<Food: Apple>, <Food: Orange>]>
[22/Jan/2023 12:51:44] "GET /store/test/ HTTP/1.1" 200 9
And, you can use |= to add the queryset of the same model as shown below:
# "store/views.py"
from .models import Food
from django.http import HttpResponse
def test(request):
    result = Food.objects.filter(name='Apple')
    #      ↓↓ Here
    result |= Food.objects.filter(name='Orange')
    print(result)
    return HttpResponse("Test")
Output on console:
<QuerySet [<Food: Apple>, <Food: Orange>]>
[22/Jan/2023 12:51:44] "GET /store/test/ HTTP/1.1" 200 9
Be careful, if adding the queryset of a different model as shown below:
# "store/views.py"
from .models import Food, Drink
from django.http import HttpResponse
def test(request):
    #        "Food" model                      "Drink" model
    result = Food.objects.filter(name='Apple') | Drink.objects.filter(name='Milk')
    print(result)
    return HttpResponse("Test")
There is an error below:
AssertionError: Cannot combine queries on two different base models.
[22/Jan/2023 13:40:54] "GET /store/test/ HTTP/1.1" 500 96025
But, if adding the empty queryset of a different model as shown below:
# "store/views.py"
from .models import Food, Drink
from django.http import HttpResponse
def test(request):
    #        "Food" model                      Empty queryset of "Drink" model
    result = Food.objects.filter(name='Apple') | Drink.objects.none()
    print(result)
    return HttpResponse("Test")
There is no error below:
<QuerySet [<Food: Apple>]>
[22/Jan/2023 13:51:09] "GET /store/test/ HTTP/1.1" 200 9
Again be careful, if adding the object by get() as shown below:
# "store/views.py"
from .models import Food
from django.http import HttpResponse
def test(request):
    result = Food.objects.filter(name='Apple')
    #                      ↓↓ Object
    result |= Food.objects.get(name='Orange')
    print(result)
    return HttpResponse("Test")
There is an error below:
AttributeError: 'Food' object has no attribute '_known_related_objects'
[22/Jan/2023 13:55:57] "GET /store/test/ HTTP/1.1" 500 95748
This function merges a list of querysets (all of the same model) into one queryset.
def merge_query(ar):
    # Nothing to merge for an empty list
    if len(ar) == 0:
        return None
    # OR the querysets together one at a time
    merged = ar[0]
    for qs in ar[1:]:
        merged = merged | qs
    return merged
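The same pairwise folding can also be written with functools.reduce and operator.or_. This is a sketch only: Python sets are used as stand-ins because, like same-model querysets, they support the | operator, and merge_all is a hypothetical helper name.

```python
from functools import reduce
from operator import or_


def merge_all(items):
    """Fold a non-empty sequence together with the | operator."""
    return reduce(or_, items)


# Sets standing in for querysets of one model.
stand_ins = [{"apple"}, {"orange"}, {"apple", "pear"}]
merged = merge_all(stand_ins)
```

Note that reduce raises TypeError on an empty sequence unless an initial value is given. With real querysets, passing an empty queryset of the same model (for example SomeModel.objects.none(), where SomeModel is a placeholder) as the third argument to reduce should cover that case.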

Django: JSONField + Full Text Search + Indexing -> Seq Scan. How to configure indexing to work?

I'm using Django 2.2 and PostgreSQL 12.
Here is my model:
from django.contrib.postgres.fields import JSONField
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField, SearchVector
from django.db import models
class ProfileUser(models.Model):
    name = JSONField()
    search_vector = SearchVectorField(null=True)
    class Meta:
        indexes = [
            GinIndex(fields=['search_vector'], name='user_full_name_gin_idx')
        ]
    def save(self, *args, **kwargs):
        super(ProfileUser, self).save(*args, **kwargs)
        ProfileUser.objects.update(search_vector=SearchVector('name'))
Here I'm creating a new user and trying to find it:
from apps.profiles.models import ProfileUser
from django.contrib.postgres.search import SearchVector
ProfileUser.objects.create(name=[{'name': 'SomeUser', 'lang': 'en'}])
ProfileUser.objects.annotate(search=SearchVector('name')).filter(search__icontains='someuser').explain()
Result:
"Seq Scan on profiles_user (cost=0.00..81.75 rows=1 width=316)\n
Filter: (upper((to_tsvector(COALESCE((name)::text, ''::text)))::text)
~~ '%someuser%'::text)"
How can I make the index work?
EDIT:
In response to @ivissani's comment, I added 5000 users and tried .filter(search__icontains='someuser') and .filter(search_vector__icontains='someuser'). Same story: Seq Scan.
I think you were not using Django's full-text search module quite correctly.
The main issues I can see in your code are:
updating the search vector field without filtering for your object
executing your search query on an annotated SearchVector using icontains, instead of using the SearchVectorField with your GinIndex
I updated your model code a bit:
from django.contrib.postgres.fields import JSONField
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField, SearchVector
from django.db import models
from django.db.models import F
class ProfileUser(models.Model):
    name = JSONField()
    search_vector = SearchVectorField(null=True)
    class Meta:
        indexes = [GinIndex(fields=["search_vector"], name="user_full_name_gin_idx")]
    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        ProfileUser.objects.annotate(search_vector_name=SearchVector("name")).filter(
            id=self.id
        ).update(search_vector=F("search_vector_name"))
As you can see, I added an annotate and a filter in the save method so that only the search vector field of the saved object is updated (you can find another example of this usage in another answer of mine).
Here you can see the code I used in the Python shell to create a new ProfileUser.
You can see the two SQL queries executed in the save method:
>>> from users.models import ProfileUser
>>> ProfileUser.objects.create(name=[{'name': 'SomeUser', 'lang': 'en'}])
INSERT INTO "users_profileuser" ("name", "search_vector")
VALUES ('[{"name": "SomeUser", "lang": "en"}]', NULL) RETURNING "users_profileuser"."id"
UPDATE "users_profileuser"
SET "search_vector" = to_tsvector(COALESCE(("users_profileuser"."name")::text, ''))
WHERE "users_profileuser"."id" = 1
Below is the code I executed in the Python shell to search for the ProfileUser with the SearchVectorField, using the model's GIN index.
You can see the index scan on the index:
>>> from django.contrib.postgres.search import SearchQuery
>>> ProfileUser.objects.filter(search_vector=SearchQuery('someuser')).explain()
EXPLAIN
SELECT "users_profileuser"."id",
       "users_profileuser"."name",
       "users_profileuser"."search_vector"
FROM "users_profileuser"
WHERE "users_profileuser"."search_vector" @@ (plainto_tsquery('someuser')) = true
"Bitmap Heap Scan on users_profileuser (cost=12.28..21.74 rows=4 width=68)
  Recheck Cond: (search_vector @@ plainto_tsquery('someuser'::text))
  -> Bitmap Index Scan on user_full_name_gin_idx (cost=0.00..12.28 rows=4 width=0)
       Index Cond: (search_vector @@ plainto_tsquery('someuser'::text))"
If you want to know more about full-text search with Django and PostgreSQL, you can read the official documentation about full-text search.
If you are interested in an external article on the subject, here is the one I wrote:
Full-Text Search in Django with PostgreSQL
Based on this article, I found a short solution for Django 2.2+.
Model:
from django.contrib.postgres.fields import JSONField
from django.contrib.postgres.indexes import GinIndex
from django.contrib.postgres.search import SearchVectorField, SearchVector
from django.db import models
class ProfileUser(models.Model):
    name = JSONField()
    search_vector = SearchVectorField(null=True)
    class Meta:
        indexes = [GinIndex(fields=["search_vector"], name="user_full_name_gin_idx")]
    def save(self, *args, **kwargs):
        super(ProfileUser, self).save(*args, **kwargs)
        ProfileUser.objects.filter(pk=self.pk).update(search_vector=SearchVector('name'))
Query:
import re
from django.contrib.postgres.search import SearchQuery
from apps.profiles.models import ProfileUser
ProfileUser.objects.create(name=[{'name': 'Adriano Celentano', 'lang': 'en'}])
partial_name = 'celen'  # or 'celentano adr'
# Replace characters that are special in tsquery syntax with spaces
query = re.sub(r'[!\'()|&]', ' ', partial_name).strip()
if query:
    query = re.sub(r'\s+', ' & ', query)
    query += ':*'  # -> 'celen:*' or 'celentano & adr:*'
# Please note that the `search_type` parameter was added in Django 2.2.
ProfileUser.objects.filter(search_vector=SearchQuery(query, search_type='raw')).explain()
A SearchQuery like this allows partial, case-insensitive name searches using a "starts with" approach: for example, it can find "celen" but cannot find "lent". If you need to match the "lent" part, you probably need Trigram Similarity, as shown in @paolo-melchiorre's article.
"Bitmap Heap Scan on profiles_user (cost=13.03..194.69 rows=101 width=333)\n
Recheck Cond: (search_vector @@ to_tsquery('celen:*'::text))\n
-> Bitmap Index Scan on user_full_name_gin_idx (cost=0.00..13.01 rows=101 width=0)\n
Index Cond: (search_vector @@ to_tsquery('celen:*'::text))"
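The query-string preparation from the snippet above can be pulled into a small helper and checked without a database. This is a sketch using only the standard library. build_prefix_tsquery is a hypothetical name, and the character set it strips is the one from the answer (other characters with special meaning in tsquery syntax, such as a colon, are not handled here).

```python
import re


def build_prefix_tsquery(partial_name):
    """Turn user input into a raw tsquery string that prefix-matches the last term."""
    # Replace characters that are special in tsquery syntax with spaces.
    cleaned = re.sub(r"[!'()|&]", " ", partial_name).strip()
    if not cleaned:
        return ""
    # AND the terms together, then add the prefix-match marker to the last one.
    return re.sub(r"\s+", " & ", cleaned) + ":*"


single_term = build_prefix_tsquery("celen")
two_terms = build_prefix_tsquery("celentano adr")
```

The returned string would then be passed to SearchQuery(..., search_type='raw') as in the answer; an empty return value signals that there is nothing to search for.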
P.S. Regarding icontains and contains, I found in several sources that they always result in a sequential scan.
One more possibly useful article.

How do I handle %20 into spaces for a django filter for queries?

class SomeFilter(filters.FilterSet):
    class Meta:
        model = SomeModel
        fields = {
            'column1': '__all__',
            'column2': '__all__'
        }
So basically, let's say I have a GET request using this filter, like www.someAPI.com/?column2=something%20or%20Another
When I apply the filter above, it doesn't work because it's querying column2 with %20 instead of spaces (which is what is in the SQL database). How can I handle this so it queries correctly?
You can use urllib:
from urllib.parse import unquote  # Python 3; in Python 2 this was urllib.unquote
old_string = 'stringwith%20'
new_string = unquote(old_string)
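Applied to the URL from the question, decoding looks like the sketch below, which uses only the Python 3 standard library. As far as I know, Django already percent-decodes query parameters when it builds request.GET, so if %20 still reaches the filter it may be worth checking whether the value was encoded twice on the client side.

```python
from urllib.parse import parse_qs, unquote, urlsplit

url = "http://www.someAPI.com/?column2=something%20or%20Another"

# Decode a single percent-encoded value.
decoded_value = unquote("something%20or%20Another")

# Or parse the whole query string; parse_qs decodes the values as it goes.
query_params = parse_qs(urlsplit(url).query)
```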

In django how can i create a model instance from a string value?

All,
I have strings that represent my model and fields, like this
modelNameStr = 'MyModel'
fieldNameStr = 'modelField'
My model looks like this;
class MyModel(models.Model):
    modelField = ForeignKey( ForeignModel )
    ...
What I want to do is create an instance of MyModel using the string variables, something like
model_instance = modelNameStr.objects.filter(fieldNameStr=ForeignModelInstance)
How can I do this?
Gath
model_instance = ContentType.objects.get(app_label=u'someapp', model=modelNameStr).model_class()(**{fieldNameStr: ForeignModelInstance})
Phew! Try saying that five times fast! But make sure you use the appropriate value for app_label.
To retrieve the model class you can use Django's get_model function. You have to use a string like my_app.MyModel, where 'my_app' is the Django app that contains the model. Filtering on field values can be done via a dict. Here is an example:
from django.apps import apps  # get_model lived in django.db.models before Django 1.9
modelNameStr = 'my_app.MyModel'
fieldNameStr = 'modelField'
ModelClass = apps.get_model(*modelNameStr.split('.'))
filters = {fieldNameStr: ForeignModelInstance}
model_instance = ModelClass.objects.filter(**filters)

Is it possible to specify a QuerySet model dynamically as a string?

I am trying to build a query in Django dynamically. I have a lot of models that I would like to build a query for, but I don't want to code the name of the model, I want to pass it as a string.
from django.db.models.query import QuerySet
a_works = QuerySet(model_A)
a_doesnt_work = QuerySet("model_A") # I want this to work, too
a_works.filter(pk=23) # no error
a_doesnt_work.filter(pk=23) # error: AttributeError: 'str' object has no attribute '_meta'
# then I am dynamically filtering different fields, which works fine with a_works
kwargs = { "%s__%s" % (field, oper) : val }
results = a_works.filter( **kwargs )
Is there a way to make the dynamic model selection work?
Don't try and build querysets via the QuerySet class itself. You should always go via a model's Manager.
You can get the model via the get_model function, which takes the app name and the model name as parameters. It used to be defined in django.db.models; in current Django versions it is available as apps.get_model in django.apps.
from django.apps import apps
model = apps.get_model('myapp', 'modelA')
model.objects.filter(**kwargs)
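The kwargs dict passed to filter() above is built from strings, as in the question. A minimal sketch of that step on its own follows; build_lookups is a hypothetical helper name, and the field and operator names are illustrative only.

```python
def build_lookups(conditions):
    """Build filter kwargs such as {'title__icontains': 'django'}.

    `conditions` is an iterable of (field, operator, value) tuples.
    """
    return {"%s__%s" % (field, oper): val for field, oper, val in conditions}


kwargs = build_lookups([
    ("title", "icontains", "django"),
    ("pk", "gte", 23),
])
```

The resulting dict can then be unpacked into model.objects.filter(**kwargs), which combines all conditions with AND.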
Refer to this: https://stackoverflow.com/a/75168880/7212249
from django.apps import apps
def get_app_label_and_model_name(instance: object):
    """
    Uses get_model(), which takes two pieces of information, an "app label" and a "model name", and returns the model
    which matches them.
    @return: None / Model
    """
    app_label = instance._meta.app_label
    model_name = instance.__class__.__name__
    model = apps.get_model(app_label, model_name)
    return model
How to use?
model = get_app_label_and_model_name(pass_model_object_here)
The returned model class can then be used for dynamic queries:
query_set = model.objects.filter()  # or anything else