I am running a Django application and I am using the PostGIS extension for my database. I am trying to understand better what happens under the hood when I send coordinates, especially because I am working with different coordinate systems, which translate to different SRIDs. My question is threefold:
Is Django/PostGIS handling the transformation when creating a Point or Polygon in the DB?
Can I query it back using a different SRID?
Is it advisable to use the default SRID=4326?
Let's say I have a model like this (note I am setting the standard SRID=4326):
class MyModel(models.Model):
    name = models.CharField(
        max_length=120,
    )
    point = models.PointField(
        srid=4326,
    )
    polygon = models.PolygonField(
        srid=4326,
    )
Now I am sending different coordinates and polygons with different SRIDs.
I am reading here in the django docs that:
Moreover, if the GEOSGeometry is in a different coordinate system (has a different SRID value) than that of the field, then it will be implicitly transformed into the SRID of the model’s field, using the spatial database’s transform procedure
So if I understand this correctly, this means that when I send an API request like this:
data = {
    "name": "name",
    "point": "SRID=2345;POINT (12.223242267 280.123144553)",
    "polygon": "SRID=5432;POLYGON ((133.2345662 214.1429138285, 123.324244572 173.755820912250072))",
}
response = requests.request("post", url=url, data=data)
Will both the point and the polygon be correctly transformed into SRID=4326?
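For reference, this is roughly how I would check it in the Django shell (the coordinates below are arbitrary placeholder values in EPSG:25832, and the app name is a placeholder too):

from django.contrib.gis.geos import GEOSGeometry
from myapp.models import MyModel  # "myapp" is a placeholder app name

# Arbitrary EPSG:25832 coordinates, just to observe the behaviour.
pt = GEOSGeometry("SRID=25832;POINT (680000 5480000)")
poly = GEOSGeometry(
    "SRID=25832;POLYGON ((680000 5480000, 681000 5480000, "
    "681000 5481000, 680000 5480000))"
)
obj = MyModel.objects.create(name="name", point=pt, polygon=poly)
obj.refresh_from_db()
print(obj.point.ewkt)    # expecting SRID=4326 with transformed coordinates
print(obj.polygon.ewkt)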
EDIT:
When I send a point with 'SRID=25832;POINT (11.061859 49.460983)' I get 'SRID=4326;POINT (11.061859 49.460983)' back from the DB. When I send a polygon with 'SRID=25832;POLYGON ((123.2796155732267 284.1831980485285, 127.9249715130572 273.7782091450072, 142.2351651215613 280.3825718937042, 137.558146278483 290.279508688337, 123.2796155732267 284.1831980485285))' I get the polygon 'SRID=4326;POLYGON ((4.512360573651161 0.002563158966576373, 4.512402191765552 0.002469312460126783, 4.512530396754145 0.002528880231016955, 4.512488494972807 0.00261814442892858, 4.512360573651161 0.002563158966576373))' back from the DB.
Can I query it back using a different SRID?
Unfortunately, I haven't found a way to query the points back in their original SRID. Is this even possible?
And lastly, I am working mostly with coordinates in Europe, but I might also sporadically have to include coordinates from all over the world. Is SRID=4326 a good standard to use?
Thanks a lot for all the help in advance. Really appreciated.
Transforming the SRS of geometries is much more than just changing their SRID. So if, for some reason, the coordinates come back with exactly the same values after a transformation, there was most probably no transformation at all.
This example uses ST_Transform to transform a geometry from 25832 to 4326. See the results for yourself:
WITH j (geom) AS (
VALUES('SRID=25832;POINT (11.061 49.463)'::geometry))
SELECT ST_AsEWKT(geom),ST_AsEWKT(ST_Transform(geom,4326)) FROM j;
st_asewkt | st_asewkt
---------------------------------+------------------------------------------------------
SRID=25832;POINT(11.061 49.463) | SRID=4326;POINT(4.511355210946569 0.000446125446657)
(1 row)
By the way, the polygon transformation in your question is correct.
Make sure that Django is really storing the values you mentioned. Send a 25832 geometry and check the SRS directly in the database. If you're only checking using Django, it might be that it is transforming the coordinates back again in the requests, which might explain why you don't see any difference.
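If you want to run that check without leaving Python, here is a minimal sketch using Django's raw database cursor (the table name myapp_mymodel is an assumption based on the model in your question; adjust it to your app label):

from django.db import connection

# Run inside ./manage.py shell. Ask PostGIS directly what is stored,
# bypassing the ORM and any field-level conversion.
with connection.cursor() as cursor:
    cursor.execute("SELECT name, ST_AsEWKT(point) FROM myapp_mymodel;")
    for name, ewkt in cursor.fetchall():
        print(name, ewkt)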
To your question:
Is SRID=4326 a good standard to use?
WGS84 is the most widely used SRS worldwide, so I'd tend to say yes, but it all depends on your use case. If you're uncertain which SRS to use, it might indicate that your use case does not impose any constraint on it. So stick to WGS84, but make sure you don't mix different SRS in your application. By the way: if you try to store geometries with different SRS in the same geometry column, PostgreSQL will raise an exception ;)
Further reading: ST_AsEWKT, WGS84
First of all, I'm not a big expert at GIS (I have created just a few small things with Django and GIS), but...
There is documentation about this in the GeoDjango tutorial: https://docs.djangoproject.com/en/3.1/ref/contrib/gis/tutorial/#automatic-spatial-transformations. According to it:
When doing spatial queries, GeoDjango automatically transforms geometries if they’re in a different coordinate system. ...
Try it in the console (./manage.py shell):
from <yourapp>.models import MyModel
obj1 = MyModel.objects.all().first()
print(obj1)
print(obj1.point)
print(dir(obj1.point))
print(obj1.point.srid)
--edit--
You can manually test converting between SRIDs, similarly to this page: https://gis.stackexchange.com/questions/94640/geodjango-transform-not-working
obj1.point.transform(<new-srid>)
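If the goal is to get geometries back in a different SRID at query time (rather than transforming them in Python after fetching), here is a minimal sketch using Django's Transform database function; MyModel and the target SRID 25832 are taken from the question, while the app name is an assumption:

from django.contrib.gis.db.models.functions import Transform
from myapp.models import MyModel  # "myapp" is a placeholder app name

# Ask the database to return the point re-projected to another SRID,
# leaving the stored geometry untouched.
qs = MyModel.objects.annotate(point_25832=Transform("point", 25832))
obj = qs.first()
print(obj.point.srid)        # 4326, as stored
print(obj.point_25832.srid)  # 25832, transformed by PostGIS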
I'd like to use the pvlib library to calculate plane-of-array (POA) irradiance data for a single-axis tracker system.
From the documentation it appears that this is possible by creating a pvlib.tracking.SingleAxisTracker instance (with the appropriate metadata) and then calling its get_irradiance method.
I've done so like this:
from pvlib.tracking import SingleAxisTracker

HSAT = SingleAxisTracker(axis_tilt=0,
                         axis_azimuth=167.5,
                         max_angle=50,
                         backtrack=True,
                         gcr=0.387)
I then use the get_irradiance method of the HSAT instance of the SingleAxisTracker I just created, expecting it to use the metadata that I just entered to calculate POA data for this Horizontal single axis tracker system:
hsat_poa = HSAT.get_irradiance(surface_tilt=0,
                               surface_azimuth=167.5,
                               solar_zenith=sz,
                               solar_azimuth=sa,
                               dni=dni,
                               ghi=ghi,
                               dhi=dhi,
                               airmass=None,
                               model='haydavies')
When I go to plot hsat_poa, however, I get what looks like POA data for a fixed tilt system.
When I looked at the source code, I noticed that the SingleAxisTracker.get_irradiance method ultimately calls the location.total_irrad() method, which only returns POA data for a fixed-tilt system.
Do I need to provide my own surface_tilt data for the HSAT system? I had assumed that pvlib models an HSAT system and would generate the surface_tilt values for me, based on the arguments provided when instantiating the SingleAxisTracker class. But it appears that's not what happens.
So my question is: does pvlib require the tracker angle as an input in order to calculate POA data for single-axis tracker systems, or can it model the tracker angle itself, based on metadata like axis_tilt, max_angle, and backtrack?
Turns out pvlib.tracking.singleaxis() is the missing link.
This will determine the rotation angle of a single axis tracker system.
tracker_data = pvlib.tracking.singleaxis(solar_position['apparent_zenith'],
                                         solar_position['azimuth'],
                                         axis_tilt=MOUNTING_TILT,
                                         axis_azimuth=MOUNTING_AZIMUTH,
                                         max_angle=MAX_ANGLE,
                                         backtrack=True,
                                         gcr=MOUNTING_GCR)
and then using tracker_data like so:
hsat_poa_model_tracker = HSAT.get_irradiance(surface_tilt=tracker_data['surface_tilt'],
                                             surface_azimuth=tracker_data['surface_azimuth'],
                                             solar_zenith=solar_position['apparent_zenith'],
                                             solar_azimuth=solar_position['azimuth'],
                                             dni=dni,
                                             ghi=ghi,
                                             dhi=dhi,
                                             airmass=None,
                                             model='haydavies')
will calculate POA data for a single axis tracker.
Found the answer in this jupyter notebook:
http://nbviewer.jupyter.org/github/pvlib/pvlib-python/blob/master/docs/tutorials/tracking.ipynb
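For completeness, the solar_position frame used above can be computed with pvlib's solar position functions. A minimal sketch (latitude, longitude, and the time index below are placeholder values, not taken from the question):

import pandas as pd
import pvlib

# Placeholder site and times; swap in your own location and index.
times = pd.date_range('2021-06-01', periods=24, freq='1H', tz='Etc/GMT+7')
solar_position = pvlib.solarposition.get_solarposition(times,
                                                       latitude=40.0,
                                                       longitude=-105.0)
# Columns include 'apparent_zenith' and 'azimuth', as used above.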
Can it model the tracker angle itself, based on metadata like axis_tilt, max_angle, and backtrack?
pvlib's ModelChain will do this. See the PV Power Forecast documentation for an example of using a ModelChain with a SingleAxisTracker.
I built a pymc3 model using the DensityDist distribution. I have four parameters, of which three use Metropolis and one uses NUTS (this is chosen automatically by pymc3). However, I get two different UserWarnings:
1. Chain 0 contains number of diverging samples after tuning. If increasing target_accept does not help, try to reparameterize.
May I know what reparameterize means here?
2. The acceptance probability in chain 0 does not match the target. It is , but should be close to 0.8. Try to increase the number of tuning steps.
Digging through a few examples, I used 'random_seed', 'discard_tuned_samples', 'step = pm.NUTS(target_accept=0.95)' and so on, and got rid of these user warnings. But I couldn't find details of how these parameter values should be decided. I am sure this has been discussed in various contexts, but I am unable to find solid documentation for it. I was using a trial-and-error approach, as below.
with patten_study:
    # SEED = 61290425  # 51290425
    step = pm.NUTS(target_accept=0.95)
    trace = sample(step=step)  # sample(4000, tune=10000, step=step, discard_tuned_samples=False, random_seed=SEED)
I need to run these on different datasets, so I am struggling to fix these parameter values for each dataset I am using. Is there any way to set these values, or to check the outcome (whether there are any user warnings) and then try other values in a loop?
Pardon me if I am asking something stupid!
In this context, re-parametrization basically means finding a different but equivalent model that is easier to compute. There are many things you can do, depending on the details of your model:
Instead of using a Uniform distribution, you can use a Normal distribution with a large variance.
Changing from a centered hierarchical model to a non-centered one (see the sketch below).
Replacing a Gaussian with a Student-T.
Modeling a discrete variable as a continuous one.
Marginalizing variables, like in this example.
Whether these changes make sense or not is something you should decide based on your knowledge of the model and the problem.
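As an illustration of the centered vs. non-centered point, here is a minimal pymc3 sketch (the variable names, priors, and shape are made up for the example, not taken from your model):

import pymc3 as pm

# Centered parameterization: theta is sampled directly around mu with scale sigma.
with pm.Model() as centered:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    theta = pm.Normal("theta", mu=mu, sigma=sigma, shape=8)

# Non-centered parameterization: sample a standardized offset and rescale it.
# Mathematically equivalent, but often much easier for NUTS to explore,
# which tends to reduce divergences.
with pm.Model() as non_centered:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    theta_offset = pm.Normal("theta_offset", mu=0.0, sigma=1.0, shape=8)
    theta = pm.Deterministic("theta", mu + sigma * theta_offset)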
Initially, I created an interactive map of UK postcode areas where each area is colored based on its value (e.g. the population in that postcode area), as follows.
from bokeh.plotting import figure
from bokeh.palettes import Viridis256 as palette
from bokeh.models import LinearColorMapper
from bokeh.models import ColumnDataSource
import geopandas as gpd
import pandas as pd

shp = 'file_path_to_the_downloaded_shapefile'

# Read the shapefile into a dataframe using geopandas
df = gpd.read_file(shp)

def expandMultiPolygons(row, geometry):
    if row[geometry].type == 'MultiPolygon':
        row[geometry] = [p for p in row[geometry]]
    return row

# Some rows were MultiPolygons instead of Polygons.
# Expand MultiPolygons into multiple rows of Polygons.
df = df.apply(expandMultiPolygons, geometry='geometry', axis=1)
df = df.set_index('Area')['geometry'].apply(pd.Series).stack().reset_index()

# Visualize the polygons. To show different colors for different postcode areas,
# I added another column called 'value' which holds a random integer value.
p = figure()
color_mapper = LinearColorMapper(palette=palette)
source = ColumnDataSource(df)
p.patches('x', 'y', source=source,
          fill_color={'field': 'value', 'transform': color_mapper},
          fill_alpha=1.0, line_color="black", line_width=0.05)
where df is a dataframe with four columns: postcode area, x-coordinate, y-coordinate, and value (i.e. population).
The above code creates an interactive map in a web browser, which is great, but I noticed the interactivity is not very smooth. If I zoom in or pan the map, it renders slowly. The dataframe is only 1106 rows, so I'm quite confused about why it is so slow.
As a possible solution, I came across datashader (https://datashader.readthedocs.io/en/latest/), but I find the example scripts quite complicated, and most of them use the HoloViews package in Jupyter notebooks, whereas I want to create a dashboard using Bokeh.
Can anyone advise me on incorporating datashader into the above Bokeh script? Do I need a different function within datashader to create the shape map instead of using Bokeh's patches function?
Any suggestion would be highly appreciated!!!
Without the data file involved, I can't answer your question directly, but can offer some observations:
Datashader is unlikely to be of value for this purpose, because datashader does not currently have any support for rendering polygons. As a rule of thumb, Datashader is designed to aggregate your data, and if it's already aggregated, Datashader won't normally be of help. Here your data is aggregated by postcode, which datashader can't process, but if you had the original data per person it would be happy to render it.
If you prefer working with Bokeh directly rather than via the higher-level HoloViews/GeoViews interface, I'd recommend following Matt Rocklin's work on accelerating geopandas; his approach should be very fast for your purpose.
All that said, HoloViews and GeoViews should be a convenient way to work with Bokeh in general, whether or not you want to create a dashboard. E.g. the 2017 JupyterCon tutorial shows how to make a simple Bokeh dashboard using both libraries. It doesn't cover shapefiles, but those are covered in other GeoViews examples.
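If you do want to try that route, here is a minimal sketch of rendering the polygons with GeoViews on the Bokeh backend; the 'value' column name is taken from your question, and the styling options are assumptions about what you want:

import geopandas as gpd
import geoviews as gv

gv.extension('bokeh')

# Pass the GeoDataFrame straight to GeoViews; no manual MultiPolygon expansion needed.
gdf = gpd.read_file('file_path_to_the_downloaded_shapefile')
polys = gv.Polygons(gdf, vdims=['value']).opts(
    color='value', cmap='viridis', tools=['hover'], line_color='black')
polys  # displays directly in a notebook, and can be embedded in a Bokeh server app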
As mentioned in my comment, I believe that the complexity of your polygons might be causing your problem. The file you linked to contains several shapefiles of different sizes and complexities. You can simplify those, i.e. reduce the number of points in each polygon. This can change how they look: the effect ranges from almost no difference, over a bit more "edginess", to an outright angular appearance, depending on the level of simplification you choose. Depending on your needs, you can choose different levels of simplicity.
I know of three easy options to get this done (a geopandas-based sketch also follows the list):
GUI: Try QGIS. It is a great open-source tool for geospatial data processing. Load your shapefile as a new layer, then use the "Simplify Geometries" tool under the Vector menu.
Command-Line: GDAL is an open-source library. It comes with a useful command-line tool. You can use it like this: ogr2ogr outfile.shp infile.shp -simplify 0.000001
Online: Visit mapshaper. Import your file, select "Simplify", and choose your level. Then export the result. What I really like here is that your file is rendered instantly, so you can immediately see the result of your simplification.
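Since you are already loading the shapefile with geopandas, a fourth option is to simplify in Python before plotting. A minimal sketch (the tolerance is a placeholder and is expressed in the units of the layer's CRS, so tune it to your data):

import geopandas as gpd

df = gpd.read_file('file_path_to_the_downloaded_shapefile')
# Douglas-Peucker-style simplification; a larger tolerance keeps fewer points.
df['geometry'] = df['geometry'].simplify(tolerance=0.000001, preserve_topology=True)
df.to_file('simplified.shp')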
Other than that, you should also update your bokeh version. It gets updated regularly and there have been some performance improvements since.
Using HoloViews or GeoViews will not positively affect your performance, so it is not related to your issues. I guess @James A. Bednar was just giving some side advice there.
I found a way to speed up the interactive visualization of the UK map as I move the slider.
I first created an individual 2D image for each value of the slider, and then updated the map using those 2D images instead of Bokeh's patches function.
Since the images are in array format, it is much faster to update the image while changing the slider value. One downside of this method is that I can no longer use the hover tool on the UK map.
I referred to the following url to convert polygon information into arrays: https://gist.github.com/brendancol/db030013e981c46acb2886060dde607e#file-rasterio_datashader_polygons-py-L35
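For reference, the core of that polygon-to-array step can be sketched with rasterio's rasterize, assuming the original GeoDataFrame (before flattening) with 'geometry' and 'value' columns; the output size and the placeholder values are assumptions:

import geopandas as gpd
from rasterio import features
from rasterio.transform import from_bounds

gdf = gpd.read_file('file_path_to_the_downloaded_shapefile')
gdf['value'] = 1  # placeholder; in practice this comes from your data per slider step

# Burn each polygon's 'value' into a 2D array that Bokeh can show as an image glyph.
width, height = 800, 600
minx, miny, maxx, maxy = gdf.total_bounds
transform = from_bounds(minx, miny, maxx, maxy, width, height)
img = features.rasterize(
    ((geom, value) for geom, value in zip(gdf.geometry, gdf['value'])),
    out_shape=(height, width),
    transform=transform,
    fill=0,
)
# One such array per slider value can then be swapped into the image data source.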
I am learning how to do data mining and I am using this data set from UCI's website.
http://archive.ics.uci.edu/ml/datasets/Forest+Fires
The problem I am encountering is how to deal with the area class. My understanding from the description is that I need to apply ln(x+1) to area using AddExpression.
Am I going in the correct direction with this? Or are there other filters I should investigate? Thank you.
I'll try to answer your question based on the little information you provide. I haven't worked with the forest-fires data set, but by inspection I see that the class attribute "area" often has the value 0. You probably can't simply filter out these rows with area = 0; your dataset might become too small, or similar.
I think you are asked to regress some attribute(s) against "log(area)" in order to linearize it. However, when you try to calculate the log of the area, values such as log(0) are a problem. Values between 0 and 1 might also be problematic.
So a common fix is to add 1 to the value of "area". This introduces a small systematic error, but it removes all zero values, and you can still derive useful models from the log(x+1)-transformed dataset.
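For intuition, here is what the ln(x+1) transform does to a few illustrative area values, sketched in Python/numpy (the numbers are made up, not taken from the dataset):

import numpy as np

area = np.array([0.0, 0.36, 4.61, 1090.84])  # illustrative values only
print(np.log(area))    # log(0) yields -inf, which breaks a regression
print(np.log1p(area))  # ln(area + 1): log1p(0) == 0, so no -inf values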
And yes, in Weka you do this under "Preprocess" with the AddExpression filter, using an expression of the form log(aN+1), where aN refers to the area attribute. This creates a new attribute; then you might remove the old area attribute.
Of course, in interpreting your model, you should be aware of the transformation. If you just want to find out what the significant independent attributes are in your linear regression model, I'd say the transformation does not matter. The data points are just shifted a little bit.
I am having trouble understanding the documentation on GeoDjango. First of all, using the zipcode example:
class Zipcode(models.Model):
    code = models.CharField(max_length=5)
    poly = models.PolygonField()
    objects = models.GeoManager()
Is the PolygonField where I would store the actual longitude/latitude coordinates of the zipcode? The other question is: how would I actually translate the zipcode into those coordinates? That is the one step I cannot figure out.
I assume I'm going to need to convert the zipcode into coordinates and then compare those against other coordinates to determine 'nearest zipcodes to location x', which is what I'm trying to do.
On a side note, I found https://github.com/coderholic/django-cities, which looks like it would let me accomplish this WITHOUT converting zipcodes into coordinates, but there isn't really any documentation, so I have no idea.
GeoDjango does not handle converting zip codes into locations: that's simply not what it's for. You'll need a geocoding library; a Google search should reveal plenty.
The project you link to simply uses an existing set of geocoded data for cities and zip codes, and even tells you where to get it - see the relevant management command.
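To sketch the two steps once you pick a geocoder, here is a hedged example using the geopy library (one of many options; the library choice, the sample zipcode, and the country are assumptions) combined with a GeoDjango distance query against the Zipcode model above:

from geopy.geocoders import Nominatim
from django.contrib.gis.geos import Point
from django.contrib.gis.db.models.functions import Distance

# Step 1: geocode a zipcode to a longitude/latitude point.
geolocator = Nominatim(user_agent="zipcode-demo")
loc = geolocator.geocode({"postalcode": "90210", "country": "US"})
user_point = Point(loc.longitude, loc.latitude, srid=4326)

# Step 2: find the nearest zipcode polygons by distance to that point.
nearest = (Zipcode.objects
           .annotate(distance=Distance("poly", user_point))
           .order_by("distance")[:5])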