I'm trying to evaluate a theano TensorValue expression:
import pymc3
import numpy as np
with pymc3.Model():
growth = pymc3.Normal('growth_%s' % 'some_name', 0, 10)
x = np.arange(4)
(x * growth).eval()
but get the error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/graph.py", line 522, in eval
self._fn_cache[inputs] = theano.function(inputs, self)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 1839, in orig_function
name=name)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 1487, in __init__
accept_inplace)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/compile/function_module.py", line 181, in std_fgraph
update_mapping=update_mapping)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 175, in __init__
self.__import_r__(output, reason="init")
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 346, in __import_r__
self.__import__(variable.owner, reason=reason)
File "/home/danna/.virtualenvs/lib/python2.7/site-packages/theano/gof/fg.py", line 391, in __import__
raise MissingInputError(error_msg, variable=r)
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start from 0), used to compute InplaceDimShuffle{x}(growth_some_name), was not provided and not given a value. Use the Theano flag exception_verbosity='high', for more information on this error.
I tried
Can someone please help me see what the theano variables actually output?
Thank you!
I'm using Python 2.7 and theano 1.0.3
While PyMC3 distributions are TensorVariable objects, they don't technical have any values to be evaluated outside of sampling. If you want values, you have to at least run sampling on the model:
with pymc3.Model():
growth = pymc3.Normal('growth', 0, 10)
trace = pymc3.sample(10)
x = np.arange(4)
x[:, np.newaxis]*trace['growth']
If you want to view node values during sampling, you'd need to use theano.tensor.printing.Print objects. For more info, see the PyMC3 debugging tips.
Related
import glob
from bs4 import BeautifulSoup
f = open('csvfile.csv','w')
for file in glob.glob('*.htm'):
print 'Processing', file
for y in range(0,3):
for x in range(0, 6):
soup = BeautifulSoup(open(file).read())
all_string=soup.find_all("h2")[x].get_text()
#stack=[]
#acct.write(", ".join(stack) + '\n')
f.write(all_string)
f.write('\n')
print(all_string)
x=0
f.close()
Output-
Processing Alkali-Controlled C–H Cleavage or N–C Bond Formation by N2-Derived Iron Nitrides and Imides - Journal of the American Chemical Society (ACS Publications).htm
Abstract
Supporting Information
Vanadium-catalyzed Reduction of Molecular Dinitrogen into Silylamine under Ambient Reaction Conditions
Traceback (most recent call last):
File "", line 1, in
runfile('/Users/ROXX/Desktop/project/csv1.py', wdir='/Users/ROXX/Desktop/project')
File
"/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py",
line 880, in runfile
execfile(filename, namespace)
File
"/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py",
line 94, in execfile
builtins.execfile(filename, *where)
File "/Users/ROXX/Desktop/project/csv1.py", line 17, in
all_string=soup.find_all("h2")[x].get_text()
IndexError: list index out of range
The reason for the error probably is that there are less than 7 occurrences of h2 in the file you are processing (in the second loop). Repeating on "all_string" instead of a fixed interval would probably solve the issue.
I am trying to use TfIdfVectorizer of sklearn. I am having trouble because my input is probably not matching TfIdfVectorizer needs. I have a bunch of JSONs I loaded and appended into a list, and I now want that to be the corpus for TfIdfVectorizer use.
The code:
import json
import pandas
from sklearn.feature_extraction.text import TfidfVectorizer
train=pandas.read_csv("train.tsv", sep='\t')
documents=[]
for i,row in train.iterrows():
data = json.loads(row['boilerplate'].lower())
documents.append(data['body'])
vectorizer=TfidfVectorizer(min_df=1)
X = vectorizer.fit_transform(documents)
idf = vectorizer.idf_
print dict(zip(vectorizer.get_feature_names(), idf))
I am getting the following error:
Traceback (most recent call last):
File "<ipython-input-56-94a6b95b0745>", line 1, in <module>
runfile('C:/Users/Guinea Pig/Downloads/try.py', wdir='C:/Users/Guinea Pig/Downloads')
File "D:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 585, in runfile
execfile(filename, namespace)
File "C:/Users/Guinea Pig/Downloads/try.py", line 19, in <module>
X = vectorizer.fit_transform(documents)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 1219, in fit_transform
X = super(TfidfVectorizer, self).fit_transform(raw_documents)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 780, in fit_transform
vocabulary, X = self._count_vocab(raw_documents, self.fixed_vocabulary)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 715, in _count_vocab
for feature in analyze(doc):
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 229, in <lambda>
tokenize(preprocess(self.decode(doc))), stop_words)
File "D:\Anaconda\lib\site-packages\sklearn\feature_extraction\text.py", line 195, in <lambda>
return lambda x: strip_accents(x.lower())
AttributeError: 'NoneType' object has no attribute 'lower'
I am getting that the documents array consists of Unicode objects, and not string objects, but I can't seem to solve this issue. ant ideas?
Eventually I used:
str_docs=[]
for item in documents:
str_docs.append(documents[i].encode('utf-8'))
As an addition
I'm having problems reading an OpenStreetMap buildings (IMPOSM GEOJSON) file into a geopandas data frame object (Python 2.7). This is on MAC OS X 10.11.3. Here are the messages I'm getting:
>>> import geopandas as gpd
>>> df=gpd.read_file('san-francisco-bay_california_buildings.geojson')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/ewang/anaconda/lib/python2.7/site-packages/geopandas/io/file.py", line 28, in read_file
gdf = GeoDataFrame.from_features(f, crs=crs)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/geopandas/geodataframe.py", line 193, in from_features
d = {'geometry': shape(f['geometry']) if f['geometry'] else None}
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/geo.py", line 34, in shape
return Polygon(ob["coordinates"][0], ob["coordinates"][1:])
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 229, in __init__
self._geom, self._ndim = geos_polygon_from_py(shell, holes)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 508, in geos_polygon_from_py
geos_shell, ndim = geos_linearring_from_py(shell)
File "/Users/ewang/anaconda/lib/python2.7/site-packages/shapely/geometry/polygon.py", line 450, in geos_linearring_from_py
n = len(ob[0])
IndexError: list index out of range
The odd thing is that I can load OSM roads data IMPOSM GEOJSON files with geopandas. Am I missing something obvious here? Thanks very much.
EDIT - link to the data below:
OSM data from mapzen
def voigt_PD(x,y0,xc,A,wG,wL):
def integconvo(t,x1,xc1,wG1,wL1):
return exp(-t**2)/((sqrt(log(2))*wL1/wG1)**2+ ((sqrt(4*log(2))*(x1- xc1)/wG1)-t)**2)
return y0+(A*(2*log(2)/pi**1.5)*(wL/wG**2)* quad(integconvo,-inf,inf,args=(x,xc,wG,wL)))
I am defining this voigt model function to fit a curve but I am getting an error stating that "Supplied function does not return a valid float". The following is the fitting routine and followed by the error. Can anyone help to find the mistake ? Thanks in advance
import pylab
from lmfit import Model
from readdatafile import readdatafile
from voigt_PD import voigt_PD
X,Y = readdatafile('data.dat')
gmod = Model(voigt_PD)
result = gmod.fit(Y, x=X,y0=0,xc=0,A=0.1,wG=0.5,wL=0.5)
print(result.fit_report())
pylab.plot(X, Y, 'bo')
pylab.plot(X, result.best_fit, 'r-')
pylab.show()
Then it gives
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
execfile(filename, namespace)
File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "E:/programming/python scripts/test5.py", line 29, in <module>
result = gmod.fit(Y, x=X,y0=0,xc=0,A=0.1,wG=0.5,wL=0.5)
File "build\bdist.win32\egg\lmfit\model.py", line 542, in fit
File "build\bdist.win32\egg\lmfit\model.py", line 746, in fit
File "build\bdist.win32\egg\lmfit\model.py", line 408, in eval
File "voigt_PD.py", line 13, in voigt_PD
return y0+(A*(2*log(2)/pi**1.5)*(wL/wG**2)* (quad(integconvo,-inf,inf,args=(x,xc,wG,wL)))[0])
File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 311, in quad
points)
File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 378, in _quad
return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit)
quadpack.error: Supplied function does not return a valid float.
I am using pandas v0.14.1 with python 2.7
I have a groupby object and I am trying to pull out a group identified by particular key. The key is in fact in the group:
>>> key in key_groups.groups.keys()
True
but when I try to make the get_group call it fails with a memory error:
>>>> key_groups.get_group(key)
*** MemoryError:
The full stacktrace is:
Traceback (most recent call last):
File "main.py", line 141, in <module>
main(num_days=arguments.days, num_variants=arguments.variants)
File "main.py", line 76, in main
problem, solution = Solver.Solve(request, num_variants)
File "/srv/compunctuator/src/Solver.py", line 49, in Solve
solution = attempt_minimization(t)
File "/srv/compunctuator/src/Solver.py", line 41, in attempt_minimization
t.scruple()
File "/srv/compunctuator/src/Compunctuator.py", line 136, in scruple
self.__iterate__()
File "/srv/compunctuator/src/Compunctuator.py", line 95, in __iterate__
self.__maximize_impressions__()
File "/srv/compunctuator/src/Compunctuator.py", line 583, in __maximize_impressions__
df = key_groups.get_group(key)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 573, in get_group
inds = self._get_index(name)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 429, in _get_index
sample = next(iter(self.indices))
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 414, in indices
return self.grouper.indices
File "properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:36380)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1253, in indices
return _get_indices_dict(label_list, keys)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 3474, in _get_indices_dict
np.prod(shape))
File "algos.pyx", line 1997, in pandas.algos.groupsort_indexer (pandas/algos.c:37521) MemoryError
If I actually use the dictionary lookup I can get the indices out:
>>>> key_groups.groups[key]
[0, 2]
It seems like everything should work here.
I realize a similar question was asked here pandas get_group causes memory error
but it was never resolved and I thought I could give more details if necessary.