Reading in OSM buildings geojson data into Python via geopandas - python-2.7

I'm having problems reading an OpenStreetMap buildings (IMPOSM GEOJSON) file into a geopandas data frame object (Python 2.7). This is on MAC OS X 10.11.3. Here are the messages I'm getting:
>>> import geopandas as gpd
>>> df=gpd.read_file('san-francisco-bay_california_buildings.geojson')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/ewang/anaconda/lib/python2.7/site-packages/geopandas/io/file.py", line 28, in read_file
gdf = GeoDataFrame.from_features(f, crs=crs)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/geopandas/geodataframe.py", line 193, in from_features
d = {'geometry': shape(f['geometry']) if f['geometry'] else None}
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/geo.py", line 34, in shape
return Polygon(ob["coordinates"][0], ob["coordinates"][1:])
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 229, in __init__
self._geom, self._ndim = geos_polygon_from_py(shell, holes)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 508, in geos_polygon_from_py
geos_shell, ndim = geos_linearring_from_py(shell)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 450, in geos_linearring_from_py
n = len(ob[0])
IndexError: list index out of range
The odd thing is that I can load OSM roads data IMPOSM GEOJSON files with geopandas. Am I missing something obvious here? Thanks very much.
EDIT - link to the data below:
OSM data from mapzen

Related

Seaborn KDE visualisation, value error on dataset

I am attempting to visualise a KDE plot in Seaborn, but am encountering an error on entering data.
The data is a set of scores ranging from 1-13 and is in the form of a numpy array.
Below is the section of code I'm using.
query_CNM = 'SELECT SCORE from CNMATCH LIMIT 2000'
df = pd.read_sql(query_CNM, conn, index_col = None)
yy = np.array(df)
plot = sns.kdeplot(yy)
Below is the full error that I'm receiving.
Traceback (most recent call last):
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1758, in <module>
main()
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1752, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1147, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Users/uni/Desktop/Proof_Of_Concept/PYQTKDE.py", line 66, in <module>
plot = sns.kdeplot(yy)
File "/Users/uni/.conda/envs/fing.py/lib/python2.7/site-packages/seaborn/distributions.py", line 664, in kdeplot
x, y = data.T
ValueError: need more than 1 value to unpack
I can't seem to find exactly how the data needs to be formatted for sea-born in order to fit a KDE, if any insights can be provided on this it would be greatly appreciated.

TensorVariable to Array

I'm trying to evaluate a theano TensorValue expression:
import pymc3
import numpy as np
with pymc3.Model():
growth = pymc3.Normal('growth_%s' % 'some_name', 0, 10)
x = np.arange(4)
(x * growth).eval()
but get the error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/graph.py", line 522, in eval
self._fn_cache[inputs] = theano.function(inputs, self)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
name=name)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
accept_inplace)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
update_mapping=update_mapping)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 175, in __init__
self.__import_r__(output, reason="init")
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
self.__import__(variable.owner, reason=reason)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 391, in __import__
raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute InplaceDimShuffle{x}(growth_some_name), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
I tried
Can someone please help me see what the theano variables actually output?
Thank you!
I'm using Python 2.7 and theano 1.0.3
While PyMC3 distributions are TensorVariable objects, they don't technical have any values to be evaluated outside of sampling. If you want values, you have to at least run sampling on the model:
with pymc3.Model():
growth = pymc3.Normal('growth', 0, 10)
trace = pymc3.sample(10)
x = np.arange(4)
x[:, np.newaxis]*trace['growth']
If you want to view node values during sampling, you'd need to use theano.tensor.printing.Print objects. For more info, see the PyMC3 debugging tips.

Google Vision Python 2.7 TypeError: construct_settings() got an unexpected keyword argument 'metrics_headers'

After installing the required packages using pip, downloading a Json key and setting the enviroment variable in the cmd window with: set GOOGLE_APPLICATION_CREDENTIALS = 'C:\Users\ xxx .json' and following the instructions to use the Google Vision API on https://googlecloudplatform.github.io/google-cloud-python/stable/vision-usage.html#authentication-and-configuration
I tried the following and got the following error without any idea how to solve the error, so all suggestions are much appreciated
>>> from google.cloud import vision
>>> client =vision.Client()
>>> print client
<google.cloud.vision.client.Client object at 0x08D414F0>
>>> image = client.image(filename='test2.jpg')
>>> print image
<google.cloud.vision.image.Image object at 0x0CBF68F0>
>>> text = image.detect_text()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 289, in detect_text
annotations = self.detect(features)
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 143, in detect
return self._detect_annotation(images)
File "C:\Python27\lib\site-packages\google\cloud\vision\image.py", line 117, in _detect_annotation
return self.client._vision_api.annotate(images)
File "C:\Python27\lib\site-packages\google\cloud\vision\client.py", line 114, in _vision_api
self._vision_api_internal = _GAPICVisionAPI(self)
File "C:\Python27\lib\site-packages\google\cloud\vision\_gax.py", line 34, in __init__
lib_version=__version__)
File "C:\Python27\lib\site-packages\google\cloud\gapic\vision\v1\image_annotator_client.py", line 140, in __init__
metrics_headers=metrics_headers, )
TypeError: construct_settings() got an unexpected keyword argument 'metrics_headers'

Preparing data for TfidfVectorizer use (scikitlearn)

I am trying to use TfIdfVectorizer of sklearn. I am having trouble because my input is probably not matching TfIdfVectorizer needs. I have a bunch of JSONs I loaded and appended into a list, and I now want that to be the corpus for TfIdfVectorizer use.
The code:
import json
import pandas
from sklearn.feature_extraction.text import TfidfVectorizer
train=pandas.read_csv("train.tsv", sep='\t')
documents=[]
for i,row in train.iterrows():
data = json.loads(row['boilerplate'].lower())
documents.append(data['body'])
vectorizer=TfidfVectorizer(min_df=1)
X = vectorizer.fit_transform(documents)
idf = vectorizer.idf_
print dict(zip(vectorizer.get_feature_names(), idf))
I am getting the following error:
Traceback (most recent call last):
File "<ipython-input-56-94a6b95b0745>", line 1, in <module>
runfile('C:/Users/Guinea Pig/Downloads/try.py', wdir='C:/Users/Guinea Pig/Downloads')
File "D:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 585, in runfile
execfile(filename, namespace)
File "C:/Users/Guinea Pig/Downloads/try.py", line 19, in <module>
X = vectorizer.fit_transform(documents)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 1219, in fit_transform
X = super(TfidfVectorizer, self).fit_transform(raw_documents)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 780, in fit_transform
vocabulary, X = self._count_vocab(raw_documents, self.fixed_vocabulary)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 715, in _count_vocab
for feature in analyze(doc):
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 229, in <lambda>
tokenize(preprocess(self.decode(doc))), stop_words)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 195, in <lambda>
return lambda x: strip_accents(x.lower())
AttributeError: 'NoneType' object has no attribute 'lower'
I am getting that the documents array consists of Unicode objects, and not string objects, but I can't seem to solve this issue. ant ideas?
Eventually I used:
str_docs=[]
for item in documents:
str_docs.append(documents[i].encode('utf-8'))
As an addition

'Polygone' object does not support indexing

I am trying to render an SVG map using Kartograph.py. It throws me the TypeError. Here is the python code:
import kartograph
from kartograph import Kartograph
import sys
from kartograph.options import read_map_config
css = open("stylesheet.css").read()
K = Kartograph()
cfg = read_map_config(open("config.json"))
K.generate(cfg, outfile='dd.svg', format='svg', stylesheet=css)
Here is the error it throws
Traceback (most recent call last):
File "<pyshell#33>", line 1, in <module>
K.generate(cfg, outfile='dd.svg', format='svg', stylesheet=css)
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\kartograph.py", line 46, in generate
_map = Map(opts, self.layerCache, format=format)
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\map.py", line 61, in __init__
layer.get_features()
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\maplayer.py", line 81, in get_features
charset=layer.options['charset']
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\layersource\shplayer.py", line 121, in get_features
geom = shape2geometry(shp, ignore_holes=ignore_holes, min_area=min_area, bbox=bbox, proj=self.proj)
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\layersource\shplayer.py", line 153, in shape2geometry
geom = shape2polygon(shp, ignore_holes=ignore_holes, min_area=min_area, proj=proj)
File "C:\Python27\lib\site-packages\kartograph.py-0.6.8-py2.7.egg\kartograph\layersource\shplayer.py", line 217, in shape2polygon
poly = MultiPolygon(polygons)
File "C:\Python27\lib\site-packages\shapely\geometry\multipolygon.py", line 74, in __init__
self._geom, self._ndim = geos_multipolygon_from_polygons(polygons)
File "C:\Python27\lib\site-packages\shapely\geometry\multipolygon.py", line 30, in geos_multipolygon_from_polygons
N = len(ob[0][0][0])
TypeError: 'Polygon' object does not support indexing
I had a look at shapely and it seems like you are using an outdated version.
Update your current install:
pip install -U shapely