nltk lookup error in Stanford Neural Dependency Parser - python-2.7

I am trying to use the Stanford Neural Dependency Parser provided by nltk. The problem I'm having is that when I call st = nltk.parse.stanford.StanfordNeuralDependencyParser(), I get the following error:
>>> st = nltk.parse.stanford.StanfordNeuralDependencyParser()
Traceback (most recent call last):
File "C:\Users\<user>\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-5-ca2dec4f3c1f>", line 1, in <module>
st = nltk.parse.stanford.StanfordNeuralDependencyParser()
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\parse\stanford.py", line 378, in __init__
super(StanfordNeuralDependencyParser, self).__init__(*args, **kwargs)
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\parse\stanford.py", line 51, in __init__
key=lambda model_name: re.match(self._JAR, model_name)
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\internals.py", line 714, in find_jar_iter
raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError:
===========================================================================
NLTK was unable to find stanford-corenlp-(\d+)(\.(\d+))+\.jar! Set
the CLASSPATH environment variable.
For more information, on stanford-corenlp-(\d+)(\.(\d+))+\.jar, see:
<http://nlp.stanford.edu/software/lex-parser.shtml>
===========================================================================
But, when I run os.environ.get('CLASSPATH') I get the result
`C:\nltk_data\;C:\nltk_data\stanford\;C:\nltk_data\stanford\stanford-ner\`
I know that I have the corenlp jar file in C:\nltk_data\stanford\ so I run the following and end up with a slightly different error.
>>> st = nltk.parse.stanford.StanfordNeuralDependencyParser('C:\\nltk_data\\stanford\\')
Traceback (most recent call last):
File "C:\Users\<user>\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-22-28d797d702d9>", line 1, in <module>
st = StanfordNeuralDependencyParser('C:\\nltk_data\\stanford\\')
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\parse\stanford.py", line 378, in __init__
super(StanfordNeuralDependencyParser, self).__init__(*args, **kwargs)
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\parse\stanford.py", line 51, in __init__
key=lambda model_name: re.match(self._JAR, model_name)
File "C:\Users\<user>\Anaconda2\lib\site-packages\nltk\internals.py", line 635, in find_jar_iter
(name_pattern, path_to_jar))
LookupError: Could not find stanford-corenlp-(\d+)(\.(\d+))+\.jar jar file at C:\nltk_data\stanford\
I have downloaded the jar stanford-english-corenlp-2016-01-10-models.jar from the Stanford NLP website and also renamed it to stanford-corenlp-2016-01-10.jar to try to match the pattern, but I still end up with the same errors. I have also downloaded the Stanford Parser version 3.6.0, but it doesn't contain any corenlp files.
Is there any way to get this to work, or am I misunderstanding something?
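Edit: I suspect the rename didn't help because the jar-name pattern from the error message only accepts dot-separated digits after stanford-corenlp-, so a date-style name can never match. A quick check (the file names here are just my examples):

import re
pattern = r'stanford-corenlp-(\d+)(\.(\d+))+\.jar'
print(re.match(pattern, 'stanford-corenlp-2016-01-10.jar'))  # None: hyphens don't match the pattern
print(re.match(pattern, 'stanford-corenlp-3.6.0.jar'))       # matches: dot-separated version digits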

Related

Using a Tensorflow 2.X model with OpenCV

I have to use a Tensorflow 2.X model with the OpenCV framework (v.4.X with C++).
To do this, I need a single .pb file or a .pb and a .pbtxt file, instead of a Tensorflow Saved Model like the one I have.
So my question is: Is there a way to convert a SavedModel into a format that OpenCV can read? Like, maybe, a Caffe model?
I tried with MMdnn but it gives me a strange error:
Traceback (most recent call last):
File "/usr/local/bin/mmconvert", line 8, in <module>
sys.exit(_main())
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
ret = convertToIR._convert(ir_args)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convertToIR.py", line 62, in _convert
from mmdnn.conversion.tensorflow.tensorflow_parser import TensorflowParser
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 15, in <module>
from tensorflow.tools.graph_transforms import TransformGraph
ImportError: No module named 'tensorflow.tools.graph_transforms'
And I suppose it is because it was developed and tested with Tensorflow 1.X.
Edit: I also have the corresponding Keras model (now that Keras is integrated with Tensorflow 2), but it is incompatible with the OpenCV DNN framework too. Trying to convert it with MMdnn, I get this error:
Traceback (most recent call last):
File "/usr/local/bin/mmconvert", line 8, in <module>
sys.exit(_main())
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
ret = convertToIR._convert(ir_args)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/_script/convertToIR.py", line 46, in _convert
parser = Keras2Parser(model)
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/keras/keras2_parser.py", line 126, in __init__
model = self._load_model(model[0], model[1])
File "/usr/local/lib/python3.5/dist-packages/mmdnn/conversion/keras/keras2_parser.py", line 78, in _load_model
'DepthwiseConv2D': layers.DepthwiseConv2D})
File "/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py", line 664, in model_from_json
return deserialize(config, custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 147, in deserialize_keras_object
list(custom_objects.items())))
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1056, in from_config
process_layer(layer_data)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1042, in process_layer
custom_objects=custom_objects)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/__init__.py", line 168, in deserialize
printable_module_name='layer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 149, in deserialize_keras_object
return cls.from_config(config['config'])
File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 1179, in from_config
return cls(**config)
File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 484, in __init__
**kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 117, in __init__
self.kernel_initializer = initializers.get(kernel_initializer)
File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 515, in get
return deserialize(identifier)
File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 510, in deserialize
printable_module_name='initializer')
File "/usr/local/lib/python3.5/dist-packages/keras/utils/generic_utils.py", line 140, in deserialize_keras_object
': ' + class_name)
ValueError: Unknown initializer: GlorotUniform
Edit 04/2021: Now the ONNX converter mentioned in the comments works properly with OpenCV 4.5.1 (Version 4.5.0 has a bug with some ONNX networks).
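For reference, a minimal sketch of that ONNX route, assuming the tf2onnx package and placeholder file names of my choosing:

# Convert the SavedModel to ONNX first (shell command):
#   python -m tf2onnx.convert --saved-model ./saved_model --output model.onnx
import cv2

# OpenCV 4.5.1+ can load the resulting ONNX network directly
net = cv2.dnn.readNetFromONNX('model.onnx')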
If you have the .h5 file, you can try this approach instead of MMdnn, using TensorFlow directly. The idea is to convert the model into a static computation graph, capturing its current state, and then write that graph out in .pb format using tf.train.write_graph.
You can load the pretrained model with model = load_model('./model/keras_model.h5') before you freeze the graph. There is also a blog post for further explanation.
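A minimal sketch of that approach, assuming TensorFlow 2.x, a single-input model, and placeholder paths:

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from tensorflow.keras.models import load_model

# Load the pretrained Keras model
model = load_model('./model/keras_model.h5')

# Wrap the model in a tf.function and get a concrete function for its input signature
full_model = tf.function(lambda x: model(x))
concrete_func = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Freeze: convert all variables to constants in a static graph
frozen_func = convert_variables_to_constants_v2(concrete_func)

# Write the frozen graph to a single .pb file
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir='./model', name='frozen_graph.pb', as_text=False)

The resulting frozen_graph.pb should then be loadable with cv2.dnn.readNetFromTensorflow('./model/frozen_graph.pb').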

Attempting to debug terminal applications made with Python+Blessed using ipdb breaks IPython?

I am using the Blessed library to build a simple terminal application.
My application builds upon the following simple example for a dumb editor: https://github.com/jquast/blessed/blob/master/bin/editor.py
Warning: the following steps will break your IPython, and I don't know how to fix it!
For the purposes of this question, I'll just use editor.py. Let's make a couple of changes to allow debugging:
1) import ipdb
2) add ipdb.set_trace() at line 224
Run editor.py now: python editor.py. The following error should be produced:
Traceback (most recent call last):
File "editor.py", line 14, in <module>
from manager import Manager
File "/home/abcd/python_scripts/editor.py", line 25, in <module>
import ipdb
File "/usr/local/lib/python2.7/dist-packages/ipdb/__init__.py", line 7, in <module>
from ipdb.__main__ import set_trace, post_mortem, pm, run, runcall, runeval, launch_ipdb_on_exception
File "/usr/local/lib/python2.7/dist-packages/ipdb/__main__.py", line 47, in <module>
ipapp.initialize([])
File "<decorator-gen-110>", line 2, in initialize
File "/usr/lib/python2.7/dist-packages/IPython/config/application.py", line 92, in catch_config_error
return method(app, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/terminal/ipapp.py", line 332, in initialize
self.init_shell()
File "/usr/lib/python2.7/dist-packages/IPython/terminal/ipapp.py", line 348, in init_shell
ipython_dir=self.ipython_dir, user_ns=self.user_ns)
File "/usr/lib/python2.7/dist-packages/IPython/config/configurable.py", line 354, in instance
inst = cls(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/terminal/interactiveshell.py", line 328, in __init__
**kwargs
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 483, in __init__
self.init_readline()
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 1843, in init_readline
self.readline_startup_hook = readline.set_startup_hook
AttributeError: 'module' object has no attribute 'set_startup_hook'
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@scipy.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
c.Application.verbose_crash=True
Now, whenever one runs IPython by executing the ipython command, this error will be produced:
Traceback (most recent call last):
File "/usr/bin/ipython", line 5, in <module>
start_ipython()
File "/usr/lib/python2.7/dist-packages/IPython/__init__.py", line 120, in start_ipython
return launch_new_instance(argv=argv, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/config/application.py", line 564, in launch_instance
app.initialize(argv)
File "<decorator-gen-110>", line 2, in initialize
File "/usr/lib/python2.7/dist-packages/IPython/config/application.py", line 92, in catch_config_error
return method(app, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/terminal/ipapp.py", line 332, in initialize
self.init_shell()
File "/usr/lib/python2.7/dist-packages/IPython/terminal/ipapp.py", line 348, in init_shell
ipython_dir=self.ipython_dir, user_ns=self.user_ns)
File "/usr/lib/python2.7/dist-packages/IPython/config/configurable.py", line 354, in instance
inst = cls(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/terminal/interactiveshell.py", line 328, in __init__
**kwargs
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 483, in __init__
self.init_readline()
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 1843, in init_readline
self.readline_startup_hook = readline.set_startup_hook
AttributeError: 'module' object has no attribute 'set_startup_hook'
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@scipy.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
c.Application.verbose_crash=True
So, IPython seems to be globally broken. I have gotten this issue on both Cygwin and Ubuntu.
What's going wrong?
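Edit: since the crash happens inside init_readline, a quick check of which readline module is actually being imported might help narrow this down (a small diagnostic sketch):

import readline

# Show where the readline module was loaded from and whether the
# attribute IPython needs is actually present
print(readline.__file__)
print(hasattr(readline, 'set_startup_hook'))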

Celery raised unexpected LookupError

I am using Celery for my Django project. My task runs the following NLTK code:
java_path = "/your/Java/jdk/home/java.exe"
os.environ['JAVAHOME'] = java_path
st = POSTagger('/your/postagger/models/path/english-bidirectional-distsim.tagger', '/your/postagger/jar/file/path/stanford-postagger.jar')
tag = st.tag([key])
Here, key is a list of feature words.
I got following errors, when using celery to execute:
raised unexpected:
LookupError('\n\n===========================================================================\nNLTK was unable to find the java file!\nUse software specific configuration paramaters or set the JAVAHOME environment variable.\n===========================================================================',)
Traceback (most recent call last):
File "/Users/Envs/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/Users/Envs/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "/Users/Envs/src/evolvelearning/tasks.py", line 772, in RunProgram
tag = st.tag([key])
File "/Users/Envs/lib/python2.7/site-packages/nltk/tag/stanford.py", line 59, in tag
return self.tag_sents([tokens])[0]
File "/Users/Envs/lib/python2.7/site-packages/nltk/tag/stanford.py", line 64, in tag_sents
config_java(options=self.java_options, verbose=False)
File "/Users/Envs/lib/python2.7/site-packages/nltk/internals.py", line 82, in config_java
_java_bin = find_binary('java', bin, env_vars=['JAVAHOME', 'JAVA_HOME'], verbose=verbose, binary_names=['java.exe'])
File "/Users/Envs/lib/python2.7/site-packages/nltk/internals.py", line 544, in find_binary
binary_names, url, verbose))
File "/Users/Envs/lib/python2.7/site-packages/nltk/internals.py", line 538, in find_binary_iter
url, verbose):
File "/Users/Envs/lib/python2.7/site-packages/nltk/internals.py", line 517, in find_file_iter
raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
LookupError:
===========================================================================
NLTK was unable to find the java file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
===========================================================================
I have set java_path and JAVAHOME, so why do these errors still occur?
My environment is macOS.
The problem is the java_path; it should be:
java_path = "/your/Java/jdk/home"
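For completeness, a sketch of the corrected setup from the question, with JAVAHOME pointing at the Java home directory instead of the executable (all paths are placeholders, and the import path assumes an older NLTK version, matching the traceback):

import os
from nltk.tag.stanford import POSTagger  # as in NLTK versions of that era

# JAVAHOME should name the directory, not the java executable itself
java_path = "/your/Java/jdk/home"
os.environ['JAVAHOME'] = java_path

st = POSTagger('/your/postagger/models/path/english-bidirectional-distsim.tagger',
               '/your/postagger/jar/file/path/stanford-postagger.jar')
key = 'example'  # per the question, key is a feature word
tag = st.tag([key])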

How to use the envoy package in Python?

I have installed the envoy package. When I run the script below, a WindowsError occurs. If I comment out the envoy.run line, the full script runs; when I remove the comment, the error occurs again.
import envoy
# This data is checked-in to the repository and is a compressed
# version of the output from Example 3
F = 'resources/ch06-mailboxes/data/enron.mbox.json.bz2'
r = envoy.run("bunzip2 %s" % (F,))
print r.std_out
print r.std_err
Traceback of the script:
Exception in thread Thread-9:
Traceback (most recent call last):
File "C:\Users\sachin\Anaconda\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Users\sachin\Anaconda\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Users\sachin\Anaconda\lib\site-packages\envoy\core.py", line 40, in target
bufsize=0,
File "C:\Users\sachin\Anaconda\lib\subprocess.py", line 709, in __init__
errread, errwrite)
File "C:\Users\sachin\Anaconda\lib\subprocess.py", line 957, in _execute_child
startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
Please try this:
F = os.path.join(os.getcwd(), 'resources/ch06-mailboxes/data/enron.mbox.json.bz2')
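If the WindowsError persists, it may be because [Error 2] refers to the bunzip2 executable itself not being found on Windows. A sketch of a pure-Python alternative that avoids the external command entirely, using only the standard library:

import bz2
import os

F = os.path.join(os.getcwd(), 'resources/ch06-mailboxes/data/enron.mbox.json.bz2')

# Decompress F to the same name without the .bz2 suffix
with open(F[:-4], 'wb') as dst:
    dst.write(bz2.BZ2File(F).read())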

Scrapy cannot handle bad headers properly [ScrapyHTTPPageGetter,client] Unhandled Error

Environment:
Scrapy 0.16.2
Twisted-12.2.0
python 2.7
macosx-10.6
Okay, here is my problem:
I tried to run:
scrapy shell http://aaa.17domn.com/bt9/file.php/MERH77V.html
Error:
[ScrapyHTTPPageGetter,client] Unhandled Error
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/internet/selectreactor.py", line 150, in _doReadOrWrite
why = getattr(selectable, method)()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/internet/tcp.py", line 202, in doRead
return self._dataReceived(data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/internet/tcp.py", line 208, in _dataReceived
rval = self.protocol.dataReceived(data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/protocols/basic.py", line 564, in dataReceived
why = self.lineReceived(line)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.2-py2.7.egg/scrapy/core/downloader/webclient.py", line 50, in lineReceived
return HTTPClient.lineReceived(self, line.rstrip())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/web/http.py", line 450, in lineReceived
self.extractHeader(self._header)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/web/http.py", line 406, in extractHeader
key, val = header.split(':',1)
exceptions.ValueError: need more than 1 value to unpack
I found the solution at https://groups.google.com/forum/#!msg/scrapy-users/xFKo8ggzPxs/VXDl3CZ4V4cJ
They explain that this is caused by Twisted, so I patched the extractHeader function in /twisted/web/http.py with the fix from http://twistedmatrix.com/trac/ticket/2842. It WORKS!
But hold on, not so fast! I then ran another site:
scrapy shell http://www1.wkdown.info/fs3/file.php/M994ATR.html
Error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.2.0-py2.7-macosx-10.6-intel.egg/twisted/internet/defer.py", line 551, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.2-py2.7.egg/scrapy/core/downloader/webclient.py", line 122, in _build_response
status = int(self.status)
ValueError: invalid literal for int() with base 10: 'html'
I think something is wrong with the response headers and Scrapy cannot handle them well.
Any ideas?
Thank you!
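Edit: to check the header theory, here is a sketch that prints the raw status line the server sends back, using only the standard library (host and path taken from the failing URL):

import socket

s = socket.create_connection(('www1.wkdown.info', 80))
s.sendall(b'GET /fs3/file.php/M994ATR.html HTTP/1.0\r\n'
          b'Host: www1.wkdown.info\r\n\r\n')
# The first bytes include the (possibly malformed) status line,
# i.e. whatever the server sends instead of 'HTTP/1.1 200 OK'
print(s.recv(256))
s.close()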