how to resolve lmdb.Error: mdb_txn_commit: Input/output error? - python-2.7

I was creating lmdb database for images with labels. My code is follows:
with in_db.begin(write=True) as in_txn:
for in_idx, img_path in enumerate(X):
img = cv2.imread(img_path, cv2.IMREAD_COLOR)
#print(Y_gender[in_idx])
label = int(Y_gender[in_idx])
datum = make_datum(img, label)
in_txn.put('{:0>8d}'.format(in_idx), datum.SerializeToString())
#print '{:0>8d}'.format(in_idx) + ':' + img_path
in_db.close()
and I am getting following error:
Traceback (most recent call last):
File "create_lmdb_faces.py", line 40, in <module>
in_txn.put('{:0>8d}'.format(in_idx), datum.SerializeToString())
lmdb.Error: mdb_txn_commit: Input/output error
How can I fix this error?

You could be running out of disk space. The .mdb file is huge. This is the closest answer I could get for my situation. Not sure about your case. Simply write to another disk.
Check this answer from Google groups

Related

Google Vision API 'TypeError: invalid file'

The following piece of code comes from Google's Vision API Documentation, the only modification I've made is adding the argument parser for the function at the bottom.
import argparse
import os
from google.cloud import vision
import io
def detect_text(path):
"""Detects text in the file."""
client = vision.ImageAnnotatorClient()
with io.open(path, 'rb') as image_file:
content = image_file.read()
image = vision.types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
print('Texts:')
for text in texts:
print('\n"{}"'.format(text.description))
vertices = (['({},{})'.format(vertex.x, vertex.y)
for vertex in text.bounding_poly.vertices])
print('bounds: {}'.format(','.join(vertices)))
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str,
help="path to input image")
args = vars(ap.parse_args())
detect_text(args)
If I run it from a terminal like below, I get this invalid file error:
PS C:\VisionTest> python visionTest.py --image C:\VisionTest\test.png
Traceback (most recent call last):
File "visionTest.py", line 31, in <module>
detect_text(args)
File "visionTest.py", line 10, in detect_text
with io.open(path, 'rb') as image_file:
TypeError: invalid file: {'image': 'C:\\VisionTest\\test.png'}
I've tried with various images and image types as well as running the code from different locations with no success.
Seems like either the file doesn't exist or is corrupt since it isn't even read. Can you try another image and validate it is in the location you expect?

Saving files in Python using "with" method

I am wanting to create a file and save it to json format. Every example I find specifies the 'open' method. I am using Python 2.7 on Windows. Please help me understand why the 'open' is necessary for a file I am saving for the first time.
I have read every tutorial I could find and researched this issue but with no luck still. I do not want to create the file outside of my program and then have my program overwrite it.
Here is my code:
def savefile():
filename = filedialog.asksaveasfilename(initialdir =
"./Documents/WorkingDirectory/",title = "Save file",filetypes = (("JSON
files","*.json"), ("All files", "*.")))
with open(filename, 'r+') as currentfile:
data = currentfile.read()
print (data)
Here is this error I get:
Exception in Tkinter callback Traceback (most recent call last):
File "C:\Python27\lib\lib-tk\Tkinter.py", line 1542, in call
return self.func(*args) File "C:\Users\CurrentUser\Desktop\newproject.py", line 174, in savefile
with open(filename, 'r+') as currentfile: IOError: [Errno 2] No such file or directory:
u'C:/Users/CurrentUser/Documents/WorkingDirectory/test.json'
Ok, I figured it out! The problem was the mode "r+". Since I am creating the file, there is no need for read and write, just write. So I changed the mode to 'w' and that fixed it. I also added the '.json' so it would be automatically added after the filename.
def savefile():
filename = filedialog.asksaveasfilename(initialdir =
"./Documents/WorkingDirectory/",title = "Save file",filetypes = (("JSON
files","*.json"), ("All files", "*.")))
with open(filename + ".json", 'w') as currentfile:
line1 = currentfile.write(stringone)
line2 = currentfile.write(stringtwo)
print (line1,line2)

Turi Create - Please use dropna() to drop rows

I am having issues with Apple Turi Create and image classifier. I have successfully created a model with 22 categories. I have recently added 5 more categories and console is giving me error warning
Please use dropna() to drop rows with missing target values.
The full console log looks like this:
[16:30:30] src/nnvm/legacy_json_util.cc:190: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[16:30:30] src/nnvm/legacy_json_util.cc:198: Symbol successfully upgraded!
Resizing images...
Performing feature extraction on resized images...
Premature end of JPEG file
Completed 512/1883
Completed 1024/1883
Completed 1536/1883
Completed 1883/1883
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
You can set ``validation_set=None`` to disable validation tracking.
[ERROR] turicreate.toolkits._main: Toolkit error: Target column has missing value.
Please use dropna() to drop rows with missing target values.
Traceback (most recent call last):
File "train.py", line 8, in <module>
model = tc.image_classifier.create(train_data, target='label', max_iterations=1000)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/turicreate/toolkits/image_classifier/image_classifier.py", line 132, in create
verbose=verbose)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/turicreate/toolkits/classifier/logistic_classifier.py", line 312, in create
seed=seed)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/turicreate/toolkits/_supervised_learning.py", line 397, in create
options, verbose)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/turicreate/toolkits/_main.py", line 75, in run
raise ToolkitError(str(message))
turicreate.toolkits._main.ToolkitError: Target column has missing value.
Please use dropna() to drop rows with missing target values.
I have upgraded turi and coremltools to the lates versions, but I don't know where I should implement the dropna() in the code. I only found this reference and followed the code.
It looks like this:
data.py
import turicreate as tc
image_data = tc.image_analysis.load_images('images', with_path=True)
labels = ['A', 'B', 'C', 'D']
def get_label(path, labels=labels):
for label in labels:
if label in path:
return label
image_data['label'] = image_data['path'].apply(get_label)
#import os
#image_data['label'] = image_data['path'].apply(lambda path: os.path.dirname(path).split('/')[-1])
image_data.save('boxes.sframe')
image_data.explore()
train.py
import turicreate as tc
data = tc.SFrame('boxes.sframe')
data.dropna()
train_data, test_data = data.random_split(0.8)
model = tc.image_classifier.create(train_data, target='label', max_iterations=1000)
predictions = model.classify(test_data)
results = model.evaluate(test_data)
print "Accuracy : %s" % results['accuracy']
print "Confusion Matrix : \n%s" % results['confusion_matrix']
model.save('boxes.model')
How do I drop all the empty columns and rows please? Does the max_iterations=1000 have also effect on the error?
Thank you for suggestions
data.dropna() isn't done in place, you need to write it: data = data.dropna()
See documentation here https://apple.github.io/turicreate/docs/api/generated/turicreate.SFrame.dropna.html

retriving data saved under HDF5 group as Carray

I am new to HDF5 file format and I have a data(images) saved in HDF5 format. The images are saved undere a group called 'data' which is under the root group as Carrays. what I want to do is to retrive a slice of the saved images. for example the first 400 or somthing like that. The following is what I did.
h5f = h5py.File('images.h5f', 'r')
image_grp= h5f['/data/'] #the image group (data) is opened
print(image_grp[0:400])
but I am getting the following error
Traceback (most recent call last):
File "fgf.py", line 32, in <module>
print(image_grp[0:40])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
(/feedstock_root/build_artefacts/h5py_1496410723014/work/h5py-2.7.0/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
(/feedstock_root/build_artefacts/h5py_1496410723014/work/h5py
2.7.0/h5py/_objects.c:2804)
File "/..../python2.7/site-packages/h5py/_hl/group.py", line 169, in
__getitem__oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "/..../python2.7/site-packages/h5py/_hl/base.py", line 133, in _e name = name.encode('ascii')
AttributeError: 'slice' object has no attribute 'encode'
I am not sure why I am getting this error but I am not even sure if I can slice the images which are saved as individual datasets.
I know this is an old question, but it is the first hit when searching for 'slice' object has no attribute 'encode' and it has no solution.
The error happens because the "group" is a group which does not have the encoding attribute. You are looking for the dataset element.
You need to find/know the key for the item that contains the dataset.
One suggestion is to list all keys in the group, and then guess which one it is:
print(list(image_grp.keys()))
This will give you the keys in the group.
A common case is that the first element is the image, so you can do this:
image_grp= h5f['/data/']
image= image_grp(image_grp.keys[0])
print(image[0:400])
yesterday I had a similar error and wrote this little piece of code to take my desired slice of h5py file.
import h5py
def h5py_slice(h5py_file, begin_index, end_index):
slice_list = []
with h5py.File(h5py_file, 'r') as f:
for i in range(begin_index, end_index):
slice_list.append(f[str(i)][...])
return slice_list
and it can be used like
the_desired_slice_list = h5py_slice('images.h5f', 0, 400)

PYPDF watermarking returns error

hi im trying to watermark a pdf fileusing pypdf2 though i get this error i cant figure out what goes wrong.
i get the following error:
Traceback (most recent call last): File "test.py", line 13, in <module>
page.mergePage(watermark.getPage(0)) File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1594, in mergePage
self._mergePage(page2) File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1651, in _mergePage
page2Content, rename, self.pdf) File "C:Python27\site-packages\PyPDF2\pdf.py", line 1547, in
_contentStreamRename
op = operands[i] KeyError: 0
using python 2.7.6 with pypdf2 1.19 on windows 32bit.
hopefully someone can tell me what i do wrong.
my python file:
from PyPDF2 import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
input = PdfFileReader(open("test.pdf", "rb"))
watermark = PdfFileReader(open("watermark.pdf", "rb"))
# print how many pages input1 has:
print("test.pdf has %d pages." % input.getNumPages())
print("watermark.pdf has %d pages." % watermark.getNumPages())
# add page 0 from input, but first add a watermark from another PDF:
page = input.getPage(0)
page.mergePage(watermark.getPage(0))
output.addPage(page)
# finally, write "output" to document-output.pdf
outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()
Try writing to a StringIO object instead of a disk file. So, replace this:
outputStream = file("outputs.pdf", "wb")
output.write(outputStream)
outputStream.close()
with this:
outputStream = StringIO.StringIO()
output.write(outputStream) #write merged output to the StringIO object
outputStream.close()
If above code works, then you might be having file writing permission issues. For reference, look at the PyPDF working example in my article.
I encountered this error when attempting to use PyPDF2 to merge in a page which had been generated by reportlab, which used an inline image canvas.drawInlineImage(...), which stores the image in the object stream of the PDF. Other PDFs that use a similar technique for images might be affected in the same way -- effectively, the content stream of the PDF has a data object thrown into it where PyPDF2 doesn't expect it.
If you're able to, a solution can be to re-generate the source pdf, but to not use inline content-stream-stored images -- e.g. generate with canvas.drawImage(...) in reportlab.
Here's an issue about this on PyPDF2.