Given the following (simplified) models:
from django.contrib.gis.db import models

class City(models.Model):
    center = models.PointField(spatial_index=True, null=True)
    objects = models.GeoManager()

class Place(models.Model):
    city = models.ForeignKey(City, null=True)
    lat = models.FloatField(null=True)
    lng = models.FloatField(null=True)
    objects = models.GeoManager()
Forgetting for the moment that the lat/lng in Place should be moved to a PointField(), I am trying to look through all of the Places and find the closest city. Currently, I am doing:
from django.contrib.gis.geos import Point

places = Place.objects.filter(lat__isnull=False, lng__isnull=False)
for place in places:
    point = Point(place.lng, place.lat, srid=4326)  # setting srid just to be safe
    closest_city = City.objects.distance(point).order_by('distance')[0]
This results in the following error:
DatabaseError: geometry_distance_spheroid: Operation on two GEOMETRIES with different SRIDs
Assuming that the SRIDs were not defaulting to 4326, I included srid=4326 in the above code and verified that every City.center has an SRID of 4326:
In [6]: [c['center'].srid for c in City.objects.all().values('center')]
Out[6]: [4326, 4326, 4326, ...]
Any ideas on what could be causing this?
UPDATE:
There seems to be something in how the sql query is created that causes a problem. After the error is thrown, looking at the sql shows:
In [9]: from django.db import connection
In [10]: print connection.queries[-1]['sql']
SELECT (ST_distance_sphere("model_city"."center",
ST_GeomFromEWKB(E'\\001\\001...\\267C#'::bytea))) AS "distance",
"model_city"."id", "model_city"."name", "listing_city"."center"
FROM "model_city" ORDER BY "model_city"."name" ASC LIMIT 21
It looks like django is turning the point argument of distance() into Extended Well-Known Binary. If I then change ST_GeomFromEWKB to ST_GeomFromText everything works fine. Example:
# SELECT (ST_distance_sphere("listing_city"."center",
ST_GeomFromText('POINT(-118 38)',4326))) AS "distance",
"model_city"."name", "model_city"."center" FROM "model_city"
ORDER BY "listing_city"."name" ASC LIMIT 5;
distance | name | center
------------------+-------------+----------------------------------------------------
3124059.73265751 | Akron | 0101000020E6100000795DBF60376154C01CB62DCA6C8A4440
3742978.5514446 | Albany | 0101000020E6100000130CE71A667052C038876BB587534540
1063596.35270877 | Albuquerque | 0101000020E6100000CC0D863AACA95AC036E7E099D08A4140
I can't find anything in the documentation that speaks to how GeoQuerySet.distance() translates into SQL. I can certainly use raw SQL in the query to get things to work, but would prefer to keep everything nicely in the Django framework.
I think this error, "Operation on two GEOMETRIES with different SRIDs", means that the SRID recorded in your database's "geometry_columns" table for the table being queried differs from the SRID of the geometry you pass in. You would have to correct the SRID there yourself.
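One way to check which SRID a stored geometry actually carries is to decode the EWKB hex shown in the query output above. The following is a minimal sketch using only the standard library; the function name decode_ewkb_point is hypothetical, and it assumes a 2D point without Z or M coordinates, as in the values printed for Akron.

```python
import struct

# Flag bits that EWKB stores in the upper bits of the geometry type field.
EWKB_Z_FLAG = 0x80000000
EWKB_M_FLAG = 0x40000000
EWKB_SRID_FLAG = 0x20000000
WKB_POINT = 1


def decode_ewkb_point(hex_string):
    """Return (srid, x, y) for a 2D EWKB point given as a hex string.

    Hypothetical helper for inspection only; srid is None when the
    geometry carries no embedded SRID.
    """
    raw = bytes.fromhex(hex_string)
    # First byte is the byte-order marker: 1 = little endian, 0 = big endian.
    byte_order = "<" if raw[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(byte_order + "I", raw, 1)
    if geom_type & (EWKB_Z_FLAG | EWKB_M_FLAG):
        raise ValueError("only 2D points are handled in this sketch")
    if geom_type & 0xFF != WKB_POINT:
        raise ValueError("not a point geometry")
    offset = 5
    srid = None
    if geom_type & EWKB_SRID_FLAG:
        (srid,) = struct.unpack_from(byte_order + "I", raw, offset)
        offset += 4
    x, y = struct.unpack_from(byte_order + "dd", raw, offset)
    return srid, x, y


# The "center" value printed for Akron in the query output above.
akron_hex = "0101000020E6100000795DBF60376154C01CB62DCA6C8A4440"
srid, lng, lat = decode_ewkb_point(akron_hex)
```

If the SRID decoded from the stored value and the SRID of the point passed to distance() differ, that would be consistent with the error message.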
Context
There is a dataframe of customer invoices and their due dates (identified by customer code).
Week(s) need to be added depending on customer code
Model is created to persist the list of customers and week(s) to be added
What is done so far:
Models.py
class BpShift(models.Model):
    bp_name = models.CharField(max_length=50, default='')
    bp_code = models.CharField(max_length=15, primary_key=True, default='')
    weeks = models.IntegerField(default=0)
helper.py
from .models import BpShift
# used in views later
def week_shift(self, df):
    df['DueDateRange'] = df['DueDate'] + datetime.timedelta(
        weeks=BpShift.objects.get(pk=df['BpCode']).weeks)
I realised my understanding of Dataframes is seriously flawed.
df['A'] and df['B'] would return Series. Of course, timedelta wouldn't work like this (weeks=BpShift.objects.get(pk=df['BpCode']).weeks).
Dataframe
d = {'BpCode':['customer1','customer2'],'DueDate':['2020-05-30','2020-04-30']}
df = pd.DataFrame(data=d)
Customer List csv
BP Name,BP Code,Week(s)
Customer1,CA0023MY,1
Customer2,CA0064SG,1
Error
BpShift matching query does not exist.
Commentary
I used these methods in the hope that I would be able to change the dataframe at once, instead of
using df.iterrows(). I have recently been avoiding for loops like the plague and am wondering if this
is the "correct" mentality. Is there any recommended way of doing this? Thanks in advance for any guidance!
This question Python & Pandas: series to timedelta will help to take you from Series to timedelta. And although
pandas.Series(
BpShift.objects.filter(
pk__in=df['BpCode'].tolist()
).values_list('weeks', flat=True)
)
will give you a Series of integers, I doubt the order is the same as in df['BpCode'], because it depends on the Django model and the database backend.
So you might be better off explicitly creating not a Series but a DataFrame with pk and weeks columns, so that you can use df.join. Something like this
pandas.DataFrame(
BpShift.objects.filter(
pk__in=df['BpCode'].tolist()
).values_list('pk', 'weeks'),
columns=['BpCode', 'weeks'],
)
should give you a DataFrame that you can join with.
So combined this should be the gist of your code:
django_response = [('customer1', 1), ('customer2', '2')]
d = {'BpCode':['customer1','customer2'],'DueDate':['2020-05-30','2020-04-30']}
df = pd.DataFrame(data=d).set_index('BpCode').join(
pd.DataFrame(django_response, columns=['BpCode', 'weeks']).set_index('BpCode')
)
df['DueDate'] = pd.to_datetime(df['DueDate'])
df['weeks'] = pd.to_numeric(df['weeks'])
df['new_duedate'] = df['DueDate'] + df['weeks'] * pd.Timedelta('1W')
print(df)
             DueDate  weeks new_duedate
BpCode
customer1 2020-05-30      1  2020-06-06
customer2 2020-04-30      2  2020-05-14
You were right to want to avoid looping. This approach gets all the data from your Django model in one SQL query by using filter, then does a left join with the DataFrame you already have. It casts the dates and weeks to the right types and then computes the new due date on whole columns instead of looping over them.
NB the left join will give NaN and NaT for customers that don't exist in your Django database. You can either avoid those rows by passing how='inner' to df.join or handle them whatever way you like.
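The difference between the two join modes can be seen in a small sketch. The data here is made up: 'customer3' is a hypothetical code with no matching row, and the shifts DataFrame stands in for what values_list('pk', 'weeks') would return.

```python
import pandas as pd

# Invoices; 'customer3' is a hypothetical code with no matching BpShift row.
invoices = pd.DataFrame({
    "BpCode": ["customer1", "customer2", "customer3"],
    "DueDate": pd.to_datetime(["2020-05-30", "2020-04-30", "2020-03-31"]),
}).set_index("BpCode")

# Stand-in for the (pk, weeks) rows fetched from the Django model.
shifts = pd.DataFrame(
    [("customer1", 1), ("customer2", 2)], columns=["BpCode", "weeks"]
).set_index("BpCode")

# Default left join: customer3 is kept, its weeks value is NaN.
left = invoices.join(shifts)
left["new_duedate"] = left["DueDate"] + pd.to_timedelta(left["weeks"], unit="W")

# Inner join: customer3 is dropped because it has no match.
inner = invoices.join(shifts, how="inner")

# One possible way to handle the gaps: treat a missing shift as zero weeks.
filled_weeks = left["weeks"].fillna(0)
left["new_duedate_filled"] = left["DueDate"] + pd.to_timedelta(filled_weeks, unit="W")
```

Whether a missing customer should be dropped, left as NaT, or treated as a zero-week shift is a business decision; the sketch only shows the mechanics of each option.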
If two models both have JSONFields, is there a way to match one against the other? Say I have two models:
class Crabadoodle(Model):
    classification = CharField()
    metadata = JSONField()

class Glibotz(Model):
    rating = IntegerField()
    metadata = JSONField()
If I have a Crabadoodle and want to fetch all the Glibotz objects with identical metadata fields, how would I go about that? If I know specific contents, I can filter simple enough, but how do you go about matching on the whole field?
There is no implementation of this in Django, but it is possible with a raw query using the jsonb containment operators (@>, <@).
Something along the lines of the following:
select *
from someapp_crabadoodle crab
join someapp_glibotz glib
on crab.metadata @> glib.metadata and crab.metadata <@ glib.metadata
where crab.id = 1
Given these models:
class Listing(models.Model):
    features = models.ManyToManyField('Feature', related_name='listing_details')
    comments = models.TextField()

class Feature(models.Model):
    feature = models.CharField(max_length=100, unique=True)
How do I do a full-text search for Listings with text in either comments or one of the related Features?
I tried this:
In[28]: Listing.objects.annotate(search=SearchVector('comments', 'features__feature')).filter(search='something').count()
Out[28]:
1215
So, I know not all those records contain the text something.
However, the number is "right" in the sense that a regular non-full-text query comes up with the same number:
In[33]: Listing.objects.filter(Q(comments__icontains='something') | Q(features__feature__icontains='something')).count()
Out[33]:
1215
I can get down to just the Listing objects containing the text something in the comments field or in features__feature like so:
In[34]: Listing.objects.filter(Q(comments__icontains='something') | Q(features__feature__icontains='something')).distinct().count()
Out[34]:
25
The real question boils down to how do I get those same 25 records back with full text search?
I used StringAgg on the ManyToManyField inside SearchVector to avoid the strange duplication and get correct results.
In your example the correct query should be:
from django.contrib.postgres.aggregates import StringAgg
from django.contrib.postgres.search import SearchVector
Listing.objects.annotate(
search=SearchVector('comments') + SearchVector(StringAgg('features__feature', delimiter=' '))
).filter(search='something')
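Why the join inflates the count, and why aggregating the related values fixes it, can be reproduced outside Django. This is a rough sketch using an in-memory SQLite database: the table names and rows are made up, and GROUP_CONCAT is used as SQLite's stand-in for PostgreSQL's string_agg, so it illustrates the row multiplication only, not full-text search itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE listing (id INTEGER PRIMARY KEY, comments TEXT);
    CREATE TABLE feature (id INTEGER PRIMARY KEY, feature TEXT);
    CREATE TABLE listing_features (listing_id INTEGER, feature_id INTEGER);
    INSERT INTO listing VALUES (1, 'something nice'), (2, 'plain');
    INSERT INTO feature VALUES (1, 'pool'), (2, 'garage'), (3, 'garden');
    INSERT INTO listing_features VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

# Joining across the many-to-many table yields one row per (listing, feature)
# pair, so a listing that matches through its comments is counted once for
# every feature it has.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM listing l
    LEFT JOIN listing_features lf ON lf.listing_id = l.id
    LEFT JOIN feature f ON f.id = lf.feature_id
    WHERE l.comments LIKE '%something%' OR f.feature LIKE '%something%'
""").fetchone()[0]

# Aggregating the features into one string per listing collapses the join
# back to one row per listing before the text is searched.
aggregated_ids = [
    row[0]
    for row in conn.execute("""
        SELECT id FROM (
            SELECT l.id AS id,
                   l.comments || ' ' || COALESCE(GROUP_CONCAT(f.feature, ' '), '') AS doc
            FROM listing l
            LEFT JOIN listing_features lf ON lf.listing_id = l.id
            LEFT JOIN feature f ON f.id = lf.feature_id
            GROUP BY l.id
        )
        WHERE doc LIKE '%something%'
    """)
]
conn.close()
```

Here the single matching listing has three features, so the plain join counts it three times while the aggregated query returns it once.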
short
I am trying to query a model Foo which has a many-to-many relationship to Address where addresses will be within a specified distance from a given point and sort results by ascending distance. Seems like annotation would be able to do this however I can't figure out how to do that in GeoDjango since it does not support geo annotations.
longer
Here is the basic model structure I have:
# app name is bar
from django.contrib.gis.db import models

class Location(models.Model):
    latlon = models.PointField(spatial_index=True)
    # other fields omitted
    objects = models.GeoManager()

class Address(models.Model):
    latlon = models.PointField(spatial_index=True)
    # other fields omitted
    objects = models.GeoManager()

class Foo(models.Model):
    addresses = models.ManyToManyField(Address)
    # other fields omitted
    objects = models.GeoManager()
Using the above models I am able to construct a query which selects all Foo objects which have addresses within a specific distance from a specific point. For example:
from django.contrib.gis.geos import Point
from django.contrib.gis.measure import Distance
new_york = Point(-73.98497, 40.75813) # == Location.latlon
Foo.objects.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
That generates a query something like:
SELECT
"bar_foo"."id",
...
FROM "bar_foo"
INNER JOIN "bar_foo_address"
ON ("bar_foo"."id" = "bar_foo_address"."foo_id")
INNER JOIN "bar_address"
ON ("bar_foo_address"."address_id" = "bar_address"."id")
WHERE (ST_distance_sphere("bar_address"."latlon",ST_GeomFromEWKB(
'\x0101000020e6100000aaf1d24d628052c096218e75715b4440' :: BYTEA)) <= 32186.88)
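For intuition about what that WHERE clause computes, here is a rough standard-library sketch of a great-circle (haversine) distance filter with the same 20 mile threshold. The Earth radius constant is an assumption and may not match the value PostGIS uses internally, and the two sample addresses are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8   # assumed mean Earth radius in meters
METERS_PER_MILE = 1609.344


def sphere_distance_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


new_york = (-73.98497, 40.75813)  # (lon, lat), same point as in the query above

# Hypothetical address coordinates, given as (lon, lat).
addresses = {
    "newark": (-74.1724, 40.7357),
    "philadelphia": (-75.1652, 39.9526),
}

# Distance(mi=20) expressed in meters, the value that appears in the SQL.
limit_m = 20 * METERS_PER_MILE

distances = {name: sphere_distance_m(*new_york, *pt) for name, pt in addresses.items()}

# Keep addresses within the limit, nearest first.
nearby = sorted(
    (name for name, d in distances.items() if d <= limit_m),
    key=distances.get,
)
```

The sorted list at the end is the in-memory equivalent of the ordering the question asks the database to perform.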
That works very well, except that I run into trouble if I want to sort all the Foos by their distance from the given point. I tried something like:
(Foo
.objects
.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
.distance(Location.latlon)
.order_by('distance'))
# produces
TypeError: ST_Distance output only available on GeometryFields.
After reading some of the source code I tried modifying the query, but I still get errors:
(Foo
.objects
.filter(addresses__latlon__distance_lte=(new_york, Distance(mi=20)))
.distance(Location.latlon)
.order_by('distance', field_name='addresses_latlon'))
# produces
ValueError: <django.contrib.gis.db.models.fields.PointField: latlon> not in self.query.related_select_cols
I guess this is related to the fact that Address and Foo have a many-to-many relationship. Unfortunately, regular annotations are not supported in GeoDjango, so I can't do something like:
# hypothetical syntax
(Foo
.objects
.annotate(distance=DistanceAnnotation('addresses__latlon', new_york, unit='mi'))
.filter(distance__lte=20)
.order_by('distance'))
# which would generate
SELECT
"bar_foo"."id",
(ST_distance_sphere("bar_address"."latlon",ST_GeomFromEWKB(
'\x0101000020e6100000aaf1d24d628052c096218e75715b4440' :: BYTEA))) as distance,
...
FROM "bar_foo"
INNER JOIN "bar_foo_address"
ON ("bar_foo"."id" = "bar_foo_address"."foo_id")
INNER JOIN "bar_address"
ON ("bar_foo_address"."address_id" = "bar_address"."id")
WHERE distance <= 32186.88
ORDER BY distance ASC
So the question is: how can I do a regular annotation using the existing API? Or is there some other way I can accomplish the desired result?
Distance annotation was implemented in Django 1.9:
https://docs.djangoproject.com/en/dev/ref/contrib/gis/functions/#distance
Distance
class Distance(expr1, expr2, spheroid=None, **extra)
Availability: MySQL, PostGIS, Oracle, SpatiaLite
Accepts two geographic fields or expressions and returns the distance
between them, as a Distance object. On MySQL, a raw float value is
returned when the coordinates are geodetic.
(...)
In the following example, the distance from the city of Hobart to
every other PointField in the AustraliaCity queryset is calculated:
>>> from django.contrib.gis.db.models.functions import Distance
>>> pnt = AustraliaCity.objects.get(name='Hobart').point
>>> for city in AustraliaCity.objects.annotate(distance=Distance('point', pnt)):
...     print(city.name, city.distance)
Wollongong 990071.220408 m
Shellharbour 972804.613941 m
Thirroul 1002334.36351 m
...
Hi Stackoverflow people,
I am confused with m2m queries in Django. I have a model RadioStations which lists radio stations around a continent (simply name and the available country) and has the following declaration:
class Station(models.Model):
    name = models.CharField(_('Station Name'), max_length=255)
    reference = models.URLField(_('Link'), blank=True, verify_exists=True)
    country = models.ManyToManyField(WorldBorder)
The class WorldBorder follows the GeoDjango example here.
Now I would like to search for all stations in the US.
If I use:
s = Station.objects.filter(country__name__contains = "United States")
I get all stations in the US. However, if I now search with a user location, e.g.
pnt = fromstr('POINT(-96.876369 29.905320)', srid=4326)
s = Station.objects.filter(country__mpoly__contains = pnt)
the result of the query is empty (even though the point is located in the U.S.).
Is that related to the way of doing an m2m query? Why would the result of the query be empty? Is there a different way of addressing the m2m relationship?
Thank you for your suggestions!
I was not able to make any geospatial queries work using fromstr when I tried GeoDjango. To solve my issues I used Point.
from django.contrib.gis.geos import Point
pnt = Point(-96.876369, 29.905320)
Perhaps you could try using the Point class?
The solution to the question is as follows:
Instead of going from Station to WorldBorder, I ended up going the other way.
Django allows the reverse lookup through the attribute_set.all() method.
The solution is to look up which country contains the Point with
country = WorldBorder.objects.get(mpoly__contains = ref_point)
and then look up all Stations which are linked to that country with
station_list = country.station_set.all()
Note that station_set is only available on a single model instance, so it requires a get() query and not a filter() query.
More background on the set.all() method can be found here.