How to add distance from point as an annotation in GeoDjango - django

I have a geographic model with a single PointField. I'm looking to add an annotation for the distance of each model from a given point, which I can later filter on and do additional jiggery-pokery with.
There's the obvious queryset.distance(to_point) function, but this doesn't actually annotate the queryset, it just adds a distance attribute to each model in the queryset, meaning I can't then apply .filter(distance__lte=some_distance) to it later on.
I'm also aware of filtering by the field and distance itself like so:
queryset.filter(point__distance_lte=(to_point, D(mi=radius)))
but since I will want to do multiple filters (to get counts of models within different distance ranges), I don't really want to make the DB calculate the distance from the given point every time, since that could be expensive.
Any ideas? Specifically, is there a way to add this as a regular annotation rather than an inserted attribute of each model?

I couldn't find any built-in way of doing this, so in the end I just created my own Aggregate class.
This only works with PostGIS, but making one for another geo database shouldn't be too tricky:
from django.db.models import Aggregate, FloatField
from django.db.models.sql.aggregates import Aggregate as SQLAggregate
class Dist(Aggregate):
    def add_to_query(self, query, alias, col, source, is_summary):
        source = FloatField()
        aggregate = SQLDist(
            col, source=source, is_summary=is_summary, **self.extra)
        query.aggregates[alias] = aggregate
class SQLDist(SQLAggregate):
    sql_function = 'ST_Distance_Sphere'
    sql_template = "%(function)s(ST_GeomFromText('%(point)s'), %(field)s)"
This can be used as follows:
queryset.annotate(distance=Dist('longlat', point="POINT(1.022 -42.029)"))
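For what it's worth, the SQL fragment that template produces can be previewed with plain Python string formatting. This is only a sketch: the quoted column name is a made-up placeholder (Django supplies the real one), and nothing here touches the database.

```python
# Same template string as SQLDist.sql_template above.
sql_template = "%(function)s(ST_GeomFromText('%(point)s'), %(field)s)"

params = {
    "function": "ST_Distance_Sphere",
    "point": "POINT(1.022 -42.029)",
    # Hypothetical quoted column; the real value comes from Django.
    "field": '"myapp_place"."longlat"',
}

# Plain %-formatting fills the named placeholders.
rendered = sql_template % params
print(rendered)
```

Since the point appears to be interpolated straight into the SQL text, it is probably wise to build it only from trusted values.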
If anyone knows a better way of doing this, please let me know (or tell me why mine is stupid).

A more modern approach is to set the output_field argument, which avoids the «Improper geometry input type: » error. Without output_field, Django tries to convert the float result of ST_Distance_Sphere to a geometry field and fails.
queryset = self.objects.annotate(
    distance=Func(
        Func(
            F('addresses__location'),
            Func(
                Value('POINT(1.022 -42.029)'),
                function='ST_GeomFromText'
            ),
            function='ST_Distance_Sphere',
            output_field=models.FloatField()
        ),
        function='round'
    )
)

Doing it like this works for me, i.e. I can apply a filter on the annotation.
Broken up for readability:
from models import Address
from django.contrib.gis.geos import Point
from django.contrib.gis.measure import D
from django.contrib.gis.db.models.functions import Distance
intMiles = 200
destPoint = Point(5, 23)
queryset0 = Address.objects.all().order_by('-postcode')
queryset1 = queryset0.annotate(distance=Distance('myPointField', destPoint))
queryset2 = queryset1.filter(distance__lte=D(mi=intMiles))
Hope it helps somebody :)

You can use GeoQuerySet.distance:
cities = City.objects.distance(reference_pnt)
for city in cities:
    print(city.distance)
Link: GeoDjango distance documentation
Edit: Adding distance attribute along with distance filter queries
usr_pnt = fromstr('POINT(-92.69 19.20)', srid=4326)
City.objects.filter(point__distance_lte=(usr_pnt, D(km=700))).distance(usr_pnt).order_by('distance')
Supported distance lookups
distance_lt
distance_lte
distance_gt
distance_gte
dwithin

A way to annotate and sort without GeoDjango. This model contains a foreign key to a Coordinates record, which has lat and lng properties.
import math
from django.db import models
from django.db.models import F, Func, Value
def get_nearby_coords(lat, lng, max_distance=10):
    """
    Return objects sorted by distance to the specified coordinates
    whose distance is less than max_distance, given in kilometers
    """
    # Great circle distance formula
    R = 6371
    qs = Precinct.objects.all().annotate(
        distance=Value(R)*Func(
            Func(
                # Convert degrees to radians before taking the sine
                F("coordinates__lat")*Value(math.pi/180),
                function="sin",
                output_field=models.FloatField()
            ) * Value(
                math.sin(lat*math.pi/180)
            ) + Func(
                F("coordinates__lat")* Value(math.pi/180),
                function="cos",
                output_field=models.FloatField()
            ) * Value(
                math.cos(lat*math.pi/180)
            ) * Func(
                Value(lng*math.pi/180) - F("coordinates__lng") * Value(math.pi/180),
                function="cos",
                output_field=models.FloatField()
            ),
            function="acos"
        )
    ).order_by("distance")
    if max_distance is not None:
        qs = qs.filter(distance__lt=max_distance)
    return qs
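To sanity-check the annotation, the same great circle formula (spherical law of cosines) can be written in plain Python and compared against a few rows. This is a minimal sketch, not part of the answer above; the function name and the clamping step are my own additions.

```python
import math

EARTH_RADIUS_KM = 6371  # same R as in the annotation above


def great_circle_km(lat1, lng1, lat2, lng2):
    """Spherical law of cosines, mirroring the ORM expression term by term."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng1 - lng2)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlng))
    # Clamp against rounding error so acos never sees a value outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


# One degree of longitude along the equator, in kilometers.
one_degree = great_circle_km(0, 0, 0, 1)
```

The database expression has no such clamp, so for nearly identical points acos could in principle receive a value slightly above 1; that is worth keeping in mind if the query ever errors on very close coordinates.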

Related

How to aggregate sum of several previous aggregated values in django ORM

In use: django 3.2.10, postgresql 13.4
I have the following queryset with the aggregation function Count:
queryset = Model.objects.all().aggregate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
)
What I want:
queryset = Model.objects.all().aggregate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
    total=trues+falses,  # <-------------- THIS
)
How to do this?
There is little you can do after aggregation, since it returns a Python dict.
I understand that your example is not your real situation, since you could simply do
Model.objects.aggregate(
    total=(Count('id', filter=Q(criteria=True))
           + Count('id', filter=Q(criteria=False)))
)
What I want to say is that Django provides .values().annotate() to produce a GROUP BY clause, as in SQL.
Taking your example:
queryset = Model.objects.values('criteria').annotate(count=Count('id'))
queryset here is still a QuerySet object, and you can process it further, like
queryset = queryset.aggregate(
    total=Sum('count')
)
Hopefully it helps.
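To make the GROUP BY idea concrete, here is a rough sketch of the kind of SQL those two steps correspond to, run against an in-memory SQLite table. The table name and sample rows are made up for illustration, and the SQL Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, criteria INTEGER)")
# Hypothetical data: three rows with criteria true, two with criteria false.
conn.executemany("INSERT INTO model (criteria) VALUES (?)",
                 [(1,), (1,), (1,), (0,), (0,)])

# Roughly what .values('criteria').annotate(count=Count('id')) asks for.
per_group = dict(conn.execute(
    "SELECT criteria, COUNT(id) FROM model GROUP BY criteria"))

# Roughly what the follow-up .aggregate(total=Sum('count')) asks for.
(total,) = conn.execute(
    "SELECT SUM(cnt) FROM "
    "(SELECT COUNT(id) AS cnt FROM model GROUP BY criteria)").fetchone()

print(per_group, total)
```

The per-group counts stay available in per_group, while total is the sum over all groups.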
It seems you want the total number of rows whose criteria is either true or false, so you can simply do the following:
queryset = Model.objects.all().filter(
    Q(criteria=True) | Q(criteria=False)).count()
Or you can use this (not recommended unless you need the intermediate values):
from django.db.models import Count, F, Q, Sum
query = Model.objects.annotate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
).annotate(
    trues_false=F('trues') + F('falses')
).aggregate(total=Sum('trues_false'))

Django filter through computed value of 2 Foreignkeys

Consider the following:
class Fighter(models.Model):
    ...
    # a bunch of fields
class View(models.Model):
    fighter = models.ForeignKey(Fighter, on_delete=models.CASCADE, related_name="views")
    # User.viewed.all() returns all View objects of the Fighters the user viewed
    viewer = models.ForeignKey(User, on_delete=models.PROTECT, related_name="viewed")
class Clash(models.Model):
    # SET_NULL requires the column to be nullable
    win_fighter = models.ForeignKey(Fighter, null=True, on_delete=models.SET_NULL, related_name="wins")
    loss_fighter = models.ForeignKey(Fighter, null=True, on_delete=models.SET_NULL, related_name="losses")
The key here is fighter_quality = wins/views = Fighter.wins.all().count()/Fighter.views.all().count()
I need to be able to filter on this quality, for instance all Fighters where 50% < quality < 80%. I want my Postgres DB to do the work.
I feel like it should be possible via Aggregate but can't figure out how...
You can .annotate(..) the fighters with that quality metric, and then filter with the given range, like:
from django.db.models import Count, ExpressionWrapper, FloatField
Fighter.objects.annotate(
    quality=ExpressionWrapper(
        Count('wins', distinct=True)/Count('views', distinct=True),
        output_field=FloatField()
    )
).filter(
    quality__range=(0.5, 0.8)
)
The distinct=True is necessary because the query makes two JOINs: each win row is paired with each view row, so without it both counts equal the number of joined rows and the quality is always 1.
The quality__range=(0.5, 0.8) filters the quality annotation with the __range lookup, with 0.5 as the lower bound and 0.8 as the upper bound (both inclusive).
The ExpressionWrapper(..., output_field=FloatField()) is necessary so that Django understands that quality is a float; otherwise it will convert 0.5 and 0.8 to an int, and thus check for values between 0 and 0.
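The effect of the two JOINs can be imitated in plain Python to see why the counts collapse without distinct=True. The ids below are made up, and the cross product only approximates what the database does for a single fighter.

```python
from itertools import product

# Hypothetical ids for one fighter: 2 wins and 5 views.
win_ids = [101, 102]
view_ids = [1, 2, 3, 4, 5]

# Joining both relations from the same fighter row yields every (win, view) pair.
joined_rows = list(product(win_ids, view_ids))

plain_wins = len([w for w, _ in joined_rows])      # like COUNT(wins.id)
plain_views = len([v for _, v in joined_rows])     # like COUNT(views.id)
distinct_wins = len({w for w, _ in joined_rows})   # like COUNT(DISTINCT wins.id)
distinct_views = len({v for _, v in joined_rows})  # like COUNT(DISTINCT views.id)

print(plain_wins / plain_views, distinct_wins / distinct_views)
```

Separately, PostgreSQL divides two integers with integer division, so it may be worth checking whether one of the counts needs a cast to float (for example with Cast from django.db.models.functions) before dividing.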

Django foreign keys in extra() expression

I'm trying to use the Django extra() method to filter all the objects in a certain radius, just like in this answer: http://stackoverflow.com/questions/19703975/django-sort-by-distance/26219292 but I'm having some problems with the 'gcd' expression as I have to reach the latitude and longitude through two foreign key relationships, instead of using direct model fields.
In particular, I have one Experience class:
class Experience(models.Model):
    starting_place_geolocation = models.ForeignKey(GooglePlaceMixin, on_delete=models.CASCADE,
                                                   related_name='experience_starting')
    visiting_place_geolocation = models.ForeignKey(GooglePlaceMixin, on_delete=models.CASCADE,
                                                   related_name='experience_visiting')
with two foreign keys to the same GooglePlaceMixin class:
class GooglePlaceMixin(models.Model):
    latitude = models.DecimalField(max_digits=20, decimal_places=15)
    longitude = models.DecimalField(max_digits=20, decimal_places=15)
    ...
Here is my code to filter the Experience objects by starting place location:
def search_by_proximity(self, experiences, latitude, longitude, proximity):
    gcd = """
        6371 * acos(
            cos(radians(%s)) * cos(radians(starting_place_geolocation__latitude))
            * cos(radians(starting_place_geolocation__longitude) - radians(%s)) +
            sin(radians(%s)) * sin(radians(starting_place_geolocation__latitude))
        )
    """
    gcd_lt = "{} < %s".format(gcd)
    return experiences \
        .extra(select={'distance': gcd},
               select_params=[latitude, longitude, latitude],
               where=[gcd_lt],
               params=[latitude, longitude, latitude, proximity],
               order_by=['distance'])
but when I try to reference the foreign key field "starting_place_geolocation__latitude" it returns this error:
column "starting_place_geolocation__latitude" does not exist
What should I do to reach the foreign key value? Thank you in advance
When you use extra() (which should be avoided, as stated in the documentation), you are actually writing raw SQL. As you probably know, to get a value through a ForeignKey you have to perform a JOIN. The Django ORM translates those fancy double underscores into the correct JOIN clause, but raw SQL knows nothing about them. And you cannot add the JOIN manually either. The correct way here is to stick with the ORM and define some custom database functions for sin, cos, radians and so on. That's pretty easy.
class Sin(Func):
    function = 'SIN'
Then use it like this:
qs = experiences.annotate(distance=Cos(Radians(F('starting_place_geolocation__latitude'))) * (some other expressions))
Note that the fancy double underscores come back again and work as expected.
You get the idea.
Here is my full collection, if you like copy-pasting from SO :)
https://gist.github.com/tatarinov1997/3af95331ef94c6d93227ce49af2211eb
P.S. You may also run into an error asking you to set output_field. In that case, wrap your whole distance expression in ExpressionWrapper and give it an output_field=models.DecimalField() argument.

Annotations in GeoDjango on many-to-many tables

short
I am trying to query a model Foo which has a many-to-many relationship to Address where addresses will be within a specified distance from a given point and sort results by ascending distance. Seems like annotation would be able to do this however I can't figure out how to do that in GeoDjango since it does not support geo annotations.
longer
Here is the basic model structure I have:
# app name is bar
from django.contrib.gis.db import models
class Location(models.Model):
    latlon = models.PointField(spatial_index=True)
    # other fields omitted
    objects = models.GeoManager()
class Address(models.Model):
    latlon = models.PointField(spatial_index=True)
    # other fields omitted
    objects = models.GeoManager()
class Foo(models.Model):
    addresses = models.ManyToManyField(Address)
    # other fields omitted
    objects = models.GeoManager()
Using the above models I am able to construct a query which selects all Foo objects which have addresses within a specific distance from a specific point. For example:
from django.contrib.gis.geos import Point
from django.contrib.gis.measure import Distance
new_york = Point(-73.98497, 40.75813) # == Location.latlon
Foo.objects.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
That generates a query something like:
SELECT
"bar_foo"."id",
...
FROM "bar_foo"
INNER JOIN "bar_foo_address"
ON ("bar_foo"."id" = "bar_foo_address"."foo_id")
INNER JOIN "bar_address"
ON ("bar_foo_address"."address_id" = "bar_address"."id")
WHERE (ST_distance_sphere("bar_address"."latlon",ST_GeomFromEWKB(
'\x0101000020e6100000aaf1d24d628052c096218e75715b4440' :: BYTEA)) <= 32186.88)
That works very well, except that I run into trouble if I want to sort all the foos by their distance from the given point. I tried something like:
(Foo
.objects
.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
.distance(Location.latlon)
.order_by('distance'))
# produces
TypeError: ST_Distance output only available on GeometryFields.
After reading some source code I tried modifying the query, but I still get errors:
(Foo
.objects
.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
.distance(Location.latlon)
.order_by('distance', field_name='addresses_latlon'))
# produces
ValueError: <django.contrib.gis.db.models.fields.PointField: latlon> not in self.query.related_select_cols
I guess this is related to the fact that Address and Foo have a many-to-many relationship. Unfortunately regular annotations are not supported in GeoDjango, so I can't do something like:
# hypothetical syntax
(Foo
.objects
.annotate(distance=DistanceAnnotation('addresses__latlon', new_york, unit='mi'))
.filter(distance__lte=20)
.order_by('distance'))
# which would generate
SELECT
"bar_foo"."id",
ST_distance_sphere("bar_address"."latlon",ST_GeomFromEWKB(
'\x0101000020e6100000aaf1d24d628052c096218e75715b4440' :: BYTEA)) as distance,
...
FROM "bar_foo"
INNER JOIN "bar_foo_address"
ON ("bar_foo"."id" = "bar_foo_address"."foo_id")
INNER JOIN "bar_address"
ON ("bar_foo_address"."address_id" = "bar_address"."id")
WHERE distance <= 32186.88
ORDER BY distance ASC
So the question is: how can I do a regular annotation using the existing API? Or is there some other way I can accomplish the desired result?
Distance annotation was implemented in Django 1.9:
https://docs.djangoproject.com/en/dev/ref/contrib/gis/functions/#distance
Distance
class Distance(expr1, expr2, spheroid=None, **extra)
Availability: MySQL, PostGIS, Oracle, SpatiaLite
Accepts two geographic fields or expressions and returns the distance
between them, as a Distance object. On MySQL, a raw float value is
returned when the coordinates are geodetic.
(...)
In the following example, the distance from the city of Hobart to
every other PointField in the AustraliaCity queryset is calculated:
>>> from django.contrib.gis.db.models.functions import Distance
>>> pnt = AustraliaCity.objects.get(name='Hobart').point
>>> for city in AustraliaCity.objects.annotate(distance=Distance('point', pnt)):
...     print(city.name, city.distance)
Wollongong 990071.220408 m
Shellharbour 972804.613941 m
Thirroul 1002334.36351 m
...

Django Sort By Calculated Field

Using the distance logic from this SO post, I'm getting back a properly-filtered set of objects with this code:
class LocationManager(models.Manager):
    def nearby_locations(self, latitude, longitude, radius, max_results=100, use_miles=True):
        if use_miles:
            distance_unit = 3959
        else:
            distance_unit = 6371
        from django.db import connection, transaction
        cursor = connection.cursor()
        sql = """SELECT id, (%f * acos( cos( radians(%f) ) * cos( radians( latitude ) ) *
        cos( radians( longitude ) - radians(%f) ) + sin( radians(%f) ) * sin( radians( latitude ) ) ) )
        AS distance FROM locations_location HAVING distance < %d
        ORDER BY distance LIMIT 0 , %d;""" % (distance_unit, latitude, longitude, latitude, int(radius), max_results)
        cursor.execute(sql)
        ids = [row[0] for row in cursor.fetchall()]
        return self.filter(id__in=ids)
The problem is I can't figure out how to keep the list/ queryset sorted by the distance value. I don't want to do this as an extra() method call for performance reasons (one query versus one query on each potential location in my database). A couple of questions:
1. How can I sort my list by distance? Even taking off the native sort I've defined in my model and using "order_by()", it's still sorting by something else (id, I believe).
2. Am I wrong about the performance thing and Django will optimize the query, so I should use extra() instead?
3. Is this the totally wrong way to do this and I should use the geo library instead of hand-rolling this like a putz?
To take your questions in reverse order:
Re 3) Yes, you should definitely take advantage of PostGIS and GeoDjango if you're working with geospatial data. It's just silly not to.
Re 2) I don't think you could quite get Django to do this query for you using .extra() (barring acceptance of this ticket), but it is an excellent candidate for the new .raw() method in Django 1.2 (see below).
Re 1) You are getting a list of ids from your first query, and then using an "in" query to get a QuerySet of the objects corresponding to those ids. Your second query has no access to the calculated distance from the first query; it's just fetching a list of ids (and it doesn't care what order you provide those ids in, either).
Possible solutions (short of ditching all of this and using GeoDjango):
Upgrade to Django 1.2 beta and use the new .raw() method. This allows Django to intelligently interpret the results of a raw SQL query and turn it into a QuerySet of actual model objects, which would reduce your current two queries to one and preserve the ordering you specify in SQL. This is the best option if you are able to make the upgrade.
Don't bother constructing a Django queryset or Django model objects at all; just add all the fields you need to the raw SQL SELECT and then use those rows directly from the cursor. This may not be an option if you need model methods etc. later on.
Perform a third step in Python code, where you iterate over the queryset and construct a Python list of model objects in the same order as the ids list you got back from the first query. Return that list instead of a QuerySet. This won't work if you need to do further filtering down the line.
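The third option could look roughly like the sketch below. The helper name order_like is made up, and SimpleNamespace objects stand in for model instances so the snippet runs without a database; in the manager it would be applied to self.filter(id__in=ids).

```python
from types import SimpleNamespace


def order_like(ids, objects, key=lambda obj: obj.id):
    """Return objects in the order given by ids, skipping ids with no match."""
    by_id = {key(obj): obj for obj in objects}
    return [by_id[i] for i in ids if i in by_id]


# Stand-ins for model instances, in the arbitrary order the "in" query returned them.
rows = [SimpleNamespace(id=1), SimpleNamespace(id=2), SimpleNamespace(id=3)]

# ids as returned by the distance-sorted raw SQL query (nearest first).
ordered = order_like([3, 1, 2], rows)
print([obj.id for obj in ordered])
```

The result is a plain list, so as noted above it cannot be filtered further through the ORM.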