Python 2.7 pickle won't recognize numpy multiarray

I need to load a set of pickled data from a collaborator. The problem is that loading it seems to require numpy's multiarray module. My code is below:
import pickle
f = open('data.p', 'rb')
a = pickle.load(f)
And here is the error message:
ImportError Traceback (most recent call last)
<ipython-input-3-17918c47ae2d> in <module>()
----> 1 a = pk.load(f)
/usr/lib/python2.7/pickle.pyc in load(file)
1382
1383 def load(file):
-> 1384 return Unpickler(file).load()
1385
1386 def loads(str):
/usr/lib/python2.7/pickle.pyc in load(self)
862 while 1:
863 key = read(1)
--> 864 dispatch[key](self)
865 except _Stop, stopinst:
866 return stopinst.value
/usr/lib/python2.7/pickle.pyc in load_global(self)
1094 module = self.readline()[:-1]
1095 name = self.readline()[:-1]
-> 1096 klass = self.find_class(module, name)
1097 self.append(klass)
1098 dispatch[GLOBAL] = load_global
/usr/lib/python2.7/pickle.pyc in find_class(self, module, name)
1128 def find_class(self, module, name):
1129 # Subclasses may override this
-> 1130 __import__(module)
1131 mod = sys.modules[module]
1132 klass = getattr(mod, name)
ImportError: No module named multiarray
I thought the problem was the compiled numpy on my computer, so I uninstalled the numpy package from the Arch Linux repo and installed numpy through
sudo -H pip2 install numpy
Yet the problem persists. I have checked the folder $PACKAGE-SITE/numpy/core, and multiarray.so is in it. I have no idea why pickle can't load the module.
How can I solve the problem? What else do I need to do?
PS1. I am using Arch Linux and have tried every version of Python 2.7 released since last October. None of them works.
PS2. Since the failure occurs in the loading step, I suspect an internal conflict in Python rather than a problem with the data file.

Thanks to @MikeMcKems, the problem is now solved.
The issue is caused by the different end-of-line conventions of MS Windows and Linux. My collaborator was using a Windows machine and saved the data in text mode with
pickle.dump(obj, open('filename', 'w'))
The data was saved as plain text (the default pickle protocol in Python 2 is ASCII), and Windows text mode wrote every line ending as \r\n. When I loaded the file on my Linux machine in binary mode, the extra \r stayed attached to the module name that pickle reads from the file, so the import of numpy's multiarray module failed.
The easiest way to solve it is to find a Windows machine, load the data with
a = pickle.load(open('filename_in', 'r'))
Then write it back out in binary mode:
pickle.dump(a, open('filename_out', 'wb'))
A file written in binary mode is not subject to line-ending translation, so filename_out loads correctly with Python on Linux as long as it is also opened in binary mode ('rb').
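If no Windows machine is at hand, the same repair can probably be done directly on Linux. The following is a minimal sketch, assuming the file is an ASCII (protocol 0) pickle whose only damage is that each \n was written as \r\n. The helper name load_crlf_pickle is made up for illustration. It is not safe for binary pickle protocols, where the byte pair \r\n may be legitimate data.

```python
import pickle


def load_crlf_pickle(path):
    """Load an ASCII (protocol 0) pickle whose line endings became CRLF."""
    with open(path, 'rb') as f:
        raw = f.read()
    # Undo the Windows text-mode translation before unpickling.
    return pickle.loads(raw.replace(b'\r\n', b'\n'))
```

With the file from the question, this would be called as a = load_crlf_pickle('data.p').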

Related

Error Import Parquet in Windows

I'm trying to install the Parquet file format to use it with Apache Spark. I learned that I had to install Thrift, ThriftPy, and python-snappy in order to fully install Parquet.
I installed Thrift using the command
pip install thrift
Then I installed python-snappy manually through a wheel file found here, because I was unable to install it automatically. In any case, python-snappy was installed successfully.
I also installed ThriftPy using a similar command
pip install ThriftPy
Finally, I used pip to install parquet, which succeeded. After installing, when I try to import parquet, it raises the following error:
---------------------------------------------------------------------------
ThriftParserError Traceback (most recent call last)
<ipython-input-55-942008defa53> in <module>()
----> 1 import parquet
C:\anaconda2\lib\site-packages\parquet\__init__.py in <module>()
17 from thriftpy.protocol.compact import TCompactProtocolFactory
18
---> 19 from . import encoding
20 from . import schema
21 from .converted_types import convert_column
C:\anaconda2\lib\site-packages\parquet\encoding.py in <module>()
17
18 THRIFT_FILE = os.path.join(os.path.dirname(__file__), "parquet.thrift")
---> 19 parquet_thrift = thriftpy.load(THRIFT_FILE, module_name=str("parquet_thrift")) # pylint: disable=invalid-name
20
21 logger = logging.getLogger("parquet") # pylint: disable=invalid-name
C:\anaconda2\lib\site-packages\thriftpy\parser\__init__.pyc in load(path, module_name, include_dirs, include_dir)
28 real_module = bool(module_name)
29 thrift = parse(path, module_name, include_dirs=include_dirs,
---> 30 include_dir=include_dir)
31
32 if real_module:
C:\anaconda2\lib\site-packages\thriftpy\parser\parser.pyc in parse(path, module_name, include_dirs, include_dir, lexer, parser, enable_cache)
494 raise ThriftParserError('ThriftPy does not support generating module '
495 'with path in protocol \'{}\''.format(
--> 496 url_scheme))
497
498 if module_name is not None and not module_name.endswith('_thrift'):
ThriftParserError: ThriftPy does not support generating module with path in protocol 'c'
Would someone tell me what I'm doing wrong?
For reference, I'm using Anaconda Python 2.7 in a Jupyter notebook. My OS is Windows 7, and I'm using Spark on a single cluster.

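Modify the ThriftPy parser code on the Windows machine. It is in Python's site-packages directory, for example:
C:\Anaconda\python27\Lib\site-packages\thriftpy\parser\parser.py
On line 488, replace the original check (commented out below) with a length check, so that a one-letter scheme such as the drive letter 'c' is treated as a local path:
#if url_scheme == '':
if len(url_scheme) <= 1:
Then try again.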
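The one-character scheme most likely comes from the drive letter of the Windows path. Here is a minimal sketch of why the length check helps, assuming the url_scheme in the traceback is produced by the standard library's urlparse. The helper name is_local_path and the example URL are made up for illustration.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def is_local_path(path):
    """Treat an empty or one-letter scheme (a Windows drive) as a local file."""
    return len(urlparse(path).scheme) <= 1


windows_path = r"C:\anaconda2\lib\site-packages\parquet\parquet.thrift"
print(urlparse(windows_path).scheme)  # the drive letter is parsed as the scheme
print(is_local_path(windows_path))
print(is_local_path("http://example.com/parquet.thrift"))
```

A real URL scheme such as http or https is always longer than one character, so the relaxed check should not let remote paths through.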

Error in reading html to data frame in Python “html5lib not found”

I've come across the following error about html5lib when trying to read an HTML table into a data frame.
Here is the code:
!pip install html5lib
!pip install lxml
!pip install beautifulSoup4
import html5lib
import lxml
from bs4 import BeautifulSoup
import pandas as pd
table_list = pd.read_html("http://www.psmsl.org/data/obtaining/")
This is the error:
ImportError Traceback (most recent call last)
<ipython-input-68-e24654a0a301> in <module>()
----> 1 table_list = pd.read_html("http://www.psmsl.org/data/obtaining/")
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na)
913 thousands=thousands, attrs=attrs, encoding=encoding,
914 decimal=decimal, converters=converters, na_values=na_values,
--> 915 keep_default_na=keep_default_na)
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in _parse(flavor, io, match, attrs, encoding, **kwargs)
737 retained = None
738 for flav in flavor:
--> 739 parser = _parser_dispatch(flav)
740 p = parser(io, compiled_match, attrs, encoding)
741
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in _parser_dispatch(flavor)
680 if flavor in ('bs4', 'html5lib'):
681 if not _HAS_HTML5LIB:
--> 682 raise ImportError("html5lib not found, please install it")
683 if not _HAS_BS4:
684 raise ImportError(
ImportError: html5lib not found, please install it
Any help would be much appreciated.
Thanks

The error message says that html5lib is not installed in the environment your code runs in. Run
pip install html5lib
in your terminal.
If you are installing from a Jupyter notebook (as you did with !), restart the kernel afterwards so that the newly installed packages are loaded.
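To confirm which interpreter the notebook kernel runs and whether it can import the parsers, a small check like the following may help. It is only a sketch, and the helper name check_modules is made up for illustration.

```python
import importlib
import sys


def check_modules(names):
    """Return {module name: True/False} for whether each module imports."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


# The interpreter that pip must install into for this kernel to see the packages.
print(sys.executable)
print(check_modules(["html5lib", "bs4", "lxml"]))
```

If a module reports False here even though pip says it is installed, pip is probably installing into a different Python than the one printed.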

I had this exact error show up while trying to read a saved .htm file in the Spyder IDE.
This code raised the html5lib error:
import pandas as pd
df = pd.read_html("F:\xxxx\xxxxx\xxxxx\aaaa.htm")
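I knew I had html5lib installed and working correctly because I had other scripts that worked.
The file path needed to be a raw string literal (an r in front of the opening quote). Without it, Python treats backslash sequences such as \a as escape characters, so the path that reaches pandas is not the one that was written.
This code works for me: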
import pandas as pd
df = pd.read_html(r"F:\xxxx\xxxxx\xxxxx\aaaa.htm")
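The difference is easy to see by comparing the two kinds of literal. This is a small sketch with a made-up path whose backslash sequences are all valid escapes:

```python
# Hypothetical path, chosen so that \t, \n and \a are all valid escapes.
plain = "F:\temp\new\aaaa.htm"
raw = r"F:\temp\new\aaaa.htm"

print(repr(plain))  # backslash sequences were replaced by control characters
print(repr(raw))    # backslashes are kept as written
print(len(plain), len(raw))
```

Writing the path with forward slashes or doubled backslashes avoids the problem as well.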

I ran into this error when I gave the wrong path to the local file I was trying to open. So also be sure that you're pointing to the right place!

"RuntimeError: Could not create write struct" with pyplot

UPDATE:
I get this message no matter what I attempt to plot. Even this
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.ylabel('some numbers')
plt.show()
returns the error RuntimeError: Could not create write struct
I am trying to plot a raw image inline. My Jupyter notebook is up on an AWS instance with port forwarding.
My code is as follows:
see above update
When I try this, I get the error message below, which culminates in the message RuntimeError: Could not create write struct.
The weird thing is, the exact same code runs fine locally. I can view images all day long.
So as an experiment I pulled the image down off AWS and ran it locally and I could see it displayed just fine.
I'm thinking, there must be some problem with either my Matplotlib or even jupyter notebook.
I've removed / reinstalled both multiple times, in multiple configurations. I made sure the local and AMI versions of the packages are the exact same.
I have no idea what is going on.
The error itself, naturally, isn't useful. And when googling the error, there are few exact string matches, which is always scary.
Other random stuff:
I'm using Python 2.7
Both libraries are managed within Conda
Jupyter: 4.4.0
Matplotlib: 2.1.2
<matplotlib.image.AxesImage at 0x7f261c1f2b50>
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/IPython/core/formatters.pyc in __call__(self, obj)
332 pass
333 else:
--> 334 return printer(obj)
335 # Finally look for special method names
336 method = get_real_method(obj, self.print_method)
/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in <lambda>(fig)
238
239 if 'png' in formats:
--> 240 png_formatter.for_type(Figure, lambda fig: print_figure(fig, 'png', **kwargs))
241 if 'retina' in formats or 'png2x' in formats:
242 png_formatter.for_type(Figure, lambda fig: retina_figure(fig, **kwargs))
/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/IPython/core/pylabtools.pyc in print_figure(fig, fmt, bbox_inches, **kwargs)
122
123 bytes_io = BytesIO()
--> 124 fig.canvas.print_figure(bytes_io, **kw)
125 data = bytes_io.getvalue()
126 if fmt == 'svg':
/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/matplotlib/backend_bases.pyc in print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, **kwargs)
2214 orientation=orientation,
2215 dryrun=True,
-> 2216 **kwargs)
2217 renderer = self.figure._cachedRenderer
2218 bbox_inches = self.figure.get_tightbbox(renderer)
/home/ubuntu/anaconda3/envs/pytorch_p27/lib/python2.7/site-packages/matplotlib/backends/backend_agg.pyc in print_png(self, filename_or_obj, *args, **kwargs)
524 try:
525 _png.write_png(renderer._renderer, filename_or_obj,
--> 526 self.figure.dpi, metadata=metadata)
527 finally:
528 if close:
RuntimeError: Could not create write struct
<matplotlib.figure.Figure at 0x7f2624b94950>

I have no idea why this works, but I removed the conda installation of matplotlib and then reinstalled matplotlib with pip.
Now everything works fine.
¯\_(ツ)_/¯

Python 2: Type error "only integer scalar arrays can be converted to a scalar index" using pd.read() with neo.Spike2IO

I have code to load in Spike2 .smr files and read them in Jupyter. My code was working fine 2 days ago and now, with absolutely no change on either the file that is loaded in or the code that loads it in, it is not working. The problem code is as follows...
Cell 1 Input (to show the versions of my packages):
import sys
print("Python version: {}\n\nPackages versions: ".format(sys.version))
# which package versions are installed?
import pip
all_packages = pip.get_installed_distributions()
used_packages = ["matplotlib", "neo", "numpy", "OpenElectrophy", "os", "pandas",
                 "pylab", "scipy"]
for entry in used_packages:
    for p in all_packages:
        if entry in str(p):
            print(str(p))
Cell 1 Output:
Python version: 2.7.13 |Anaconda custom (64-bit)| (default, Dec 20 2016, 23:09:15)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
Packages versions:
matplotlib 1.4.3
matplotlib-venn 0.11.3
neo 0.3.3
numpy 1.12.0
pycosat 0.6.1
nose 1.3.7
backports.ssl-match-hostname 3.5.0.1
pandas 0.19.2
scipy 0.15.1
Cell 2 Input (load in my modules):
import pylab
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats as st
import os
import tables
import neo
import scipy.signal as sg
from scipy import interpolate as inter
import h5py as h
import quantities as q
plt.style.use('ggplot')
pd.options.display.max_rows = 999
%matplotlib inline
Now, I load in the Spike2 .smr file with:
r = neo.Spike2IO("Rawdata/143-16/nerve.smr").read()[0]
and get the following type error:
TypeError Traceback (most recent call last)
<ipython-input-3-f81fd520a4c5> in <module>()
----> 1 r = neo.Spike2IO("Rawdata/143-16/nerve.smr").read()[0]
/home/wolverine/anaconda/lib/python2.7/site-packages/neo/io/baseio.pyc in read(self, lazy, cascade, **kargs)
107 if not cascade:
108 return bl
--> 109 seg = self.read_segment(lazy=lazy, cascade=cascade, **kargs)
110 bl.segments.append(seg)
111 create_many_to_one_relationship(bl)
/home/wolverine/anaconda/lib/python2.7/site-packages/neo/io/spike2io.pyc in read_segment(self, take_ideal_sampling_rate, lazy, cascade)
120 if channelHeader.kind in [1, 9]:
121 #~ print 'analogChanel'
--> 122 anaSigs = self.readOneChannelContinuous( fid, i, header, take_ideal_sampling_rate, lazy = lazy)
123 #~ print 'nb sigs', len(anaSigs) , ' sizes : ',
124 for anaSig in anaSigs :
/home/wolverine/anaconda/lib/python2.7/site-packages/neo/io/spike2io.pyc in readOneChannelContinuous(self, fid, channel_num, header, take_ideal_sampling_rate, lazy)
240
241 anaSigs = [ ]
--> 242 if channelHeader.unit in unit_convert:
243 unit = pq.Quantity(1, unit_convert[channelHeader.unit] )
244 else:
/home/wolverine/anaconda/lib/python2.7/site-packages/neo/io/spike2io.pyc in __getattr__(self, name)
444 else:
445 l = np.fromstring(self.array[name][0], 'u1')
--> 446 return self.array[name][1:l+1]
447 else:
448 return self.array[name]
TypeError: only integer scalar arrays can be converted to a scalar index
Calling neo.Spike2IO("filename.smr") on its own works fine, but as soon as I add .read()[0], I get the TypeError. I read up on this error, and the only answers I found suggested that the file could be corrupt. I deleted my local file and re-downloaded it, and also downloaded another similar file in case the master copy of the first one was corrupt. Both new files produced the same TypeError. As stated before, the code was working flawlessly just two days ago, and now it won't load any .smr file. I updated all of my modules, pip, and Anaconda, and none of this helped.
Here is a link to a short sample .smr file (only 3.1 MB) that I cut for sharing purposes. It also gives the Type Error. Any ideas? Thank you.

I solved this issue by further updating my modules and Anaconda itself (and all of its respective modules). Something must have reverted to an older version.
The code to update every package in Anaconda is:
conda update --all
Further help can be found here at the Conda homepage. Shutting down, then restarting your computer can also help to ensure that all of these updates are implemented.

Creating a single executable .exe from Python script that uses PuLP

I have been struggling with this for a while. I have used py2exe and cx_Freeze to package everything. I am using a 32-bit machine. Everything else works and the interface opens, but the entire PuLP package is not being copied correctly into the build. I know this because the solver does not work. Inside the library zips created by both py2exe and cx_Freeze there are only .pyc files, whereas PuLP also ships cbc.exe and other file types that the solver needs.
Is there any workaround for this? I have tried copying the actual PuLP package into library.zip as well as into the dist folder, and that didn't work.
Here is the setup script I used (this one is for cx_Freeze):
import sys
from cx_Freeze import setup, Executable
# Dependencies are automatically detected, but it might need fine tuning.
build_exe_options = {"packages": ["pulp"],
                     "icon": "icon.ico",
                     "include_files": ["icon.ico","cbc.exe","cbc-32","cbc-64","cbc-osx-64","CoinMP.dll"]}
# GUI applications require a different base on Windows (the default is for a
# console application).
base = None
if sys.platform == "win32":
    base = "Win32GUI"
setup( name = "my_app",
       version = "0.1",
       options = {"build_exe": build_exe_options},
       executables = [Executable("my_app.py", base=base)])
I received the following error:
Exception in Tkinter callback
Traceback (most recent call last):
File "Tkinter.pyc", line 1470, in __call__
File "my_app.py", line 796, in <lambda>
File "my_app.py", line 467, in branchAndBound
File "pulp\pulp.pyc", line 1619, in solve
AttributeError: 'NoneType' object has no attribute 'actualSolve'
EDIT
I tried to change the paths to cbc.exe and CoinMP.dll but that didn't really work either. I am probably missing something.
I changed the following inside solvers.py in the PuLP package:
try:
    coinMP_path = config.get("locations", "CoinMPPath").split(', ')
except configparser.Error:
    coinMP_path = ['/Users/al/Desktop/my_app/build/exe.win32-2.7']
try:
    cbc_path = config.get("locations", "CbcPath")
except configparser.Error:
    cbc_path = '/Users/al/Desktop/my_app/build/exe.win32-2.7'
try:
    pulp_cbc_path = config.get("locations", "PulpCbcPath")
except configparser.Error:
    pulp_cbc_path = '/Users/al/Desktop/my_app/build/exe.win32-2.7'
What am I missing or doing wrong?
