How to fine-tune a network in Caffe2

There is very little information about how to fine-tune parameters, and I am quite confused about how to fine-tune a network in Caffe2. Could anybody show me some code for the fine-tuning part? Many thanks.
By the way, in the linked question Food101 SqueezeNet Caffe2 number of iterations, it seems that the author has successfully fine-tuned the network.
Edit: here is the code for my training part:
train_model = cnn.CNNModelHelper(order="NCHW", name="train")
train_model.param_init_net.AppendNet(core.Net(init_net))
train_model.net.AppendNet(core.Net(predict_net))
train_model.param_init_net.RunAllOnGPU(gpu_id=0)
train_model.net.RunAllOnGPU(gpu_id=0)
workspace.RunNetOnce(train_model.param_init_net)
AddTrainingOperators(train_model, 'softmaxout', 'label')
AddBookkeepingOperators(train_model)
workspace.RunNetOnce(train_model.param_init_net)
data, label = AddInput(train_model, batch_size=3,
                       db=os.path.join(data_folder, 'toy_train.lmdb'),
                       db_type='lmdb')
workspace.FeedBlob('data', data)
workspace.FeedBlob('label', label)
workspace.CreateNet(train_model.net)
However, when I run the code, the following error occurs:
Traceback (most recent call last):
File "lenetForChineseFinetune.py", line 62, in <module>
workspace.FeedBlob('data', data)
File "/opt/caffe2/caffe2/local/caffe2/python/workspace.py", line 262, in FeedBlob
return C.feed_blob(name, arr)
RuntimeError: [enforce fail at pybind_state.cc:825] . Unexpected type of argument - only numpy array or string are supported for feeding
How should I modify the code?
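For reference, a minimal sketch of what FeedBlob accepts (the shapes below are hypothetical): it takes only a numpy array or a string, whereas the data and label returned by AddInput are blob references (symbolic handles into the workspace), which is exactly what the enforce failure complains about:
import numpy as np
from caffe2.python import workspace
# FeedBlob only accepts a numpy array or a string; feeding a BlobReference
# raises the "Unexpected type of argument" error seen in the traceback
dummy_data = np.zeros((3, 1, 28, 28), dtype=np.float32)  # hypothetical NCHW batch
dummy_label = np.zeros(3, dtype=np.int32)
workspace.FeedBlob('data', dummy_data)
workspace.FeedBlob('label', dummy_label)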

Related

TensorBoard callback with Keras fit_generator: 'Function' has no attribute 'fetch_callbacks'

I am trying to run a model using Keras's fit_generator with a TensorBoard callback in order to profile a specific epoch. I am running the following code for the generator:
def gen(source):
    loopable = iter(source)
    for batch in loopable:
        yield (batch[0], batch[1])
In the main training script I am instantiating the generator and using the model with a tensorboard callback as follows:
train_gen = gen(train_datasource)
log_dir="logs/profile/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, profile_batch=3)
m.fit_generator(train_gen, epochs=5, steps_per_epoch=500, use_multiprocessing=True, workers=32, callbacks=[tensorboard_callback])
The main issue I am facing is that the training always halts with the error 'Function' has no attribute 'fetch_callbacks' and the following stack trace:
m.fit_generator(train_gen, epochs=5, steps_per_epoch=500, use_multiprocessing=True, workers=32, callbacks=[tensorboard_callback])
File "/usr/local/lib/python2.7/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training_generator.py", line 177, in fit_generator
callbacks.on_epoch_begin(epoch)
File "/usr/local/lib/python2.7/dist-packages/keras/callbacks.py", line 65, in on_epoch_begin
callback.on_epoch_begin(epoch, logs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/callbacks_v1.py", line 386, in on_epoch_begin
self.merged] = self._fetch_callback
AttributeError: 'Function' object has no attribute 'fetch_callbacks'
I am using TensorFlow 1.15; I also tried downgrading to 1.14, but still no success. I am trying to train with the TensorBoard callback to debug the performance of a specific epoch other than the first one, but so far my attempts to make the callback work correctly have failed. I made sure the GPU is running and is detected correctly, too.
Any help would be much appreciated.
I ended up using the tf.keras fit function instead of fit_generator, and it worked correctly, as expected:
m.fit(x=train_gen, epochs=5, steps_per_epoch=500, use_multiprocessing=True, workers=8, callbacks=[tensorboard_callback])

Retrieving data saved under an HDF5 group as CArray

I am new to the HDF5 file format, and I have data (images) saved in HDF5 format. The images are saved under a group called 'data', which is under the root group, as CArrays. What I want to do is retrieve a slice of the saved images, for example the first 400 or something like that. The following is what I did.
h5f = h5py.File('images.h5f', 'r')
image_grp = h5f['/data/']  # the image group (data) is opened
print(image_grp[0:400])
But I am getting the following error:
Traceback (most recent call last):
File "fgf.py", line 32, in <module>
print(image_grp[0:40])
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
(/feedstock_root/build_artefacts/h5py_1496410723014/work/h5py-2.7.0/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
(/feedstock_root/build_artefacts/h5py_1496410723014/work/h5py
2.7.0/h5py/_objects.c:2804)
File "/..../python2.7/site-packages/h5py/_hl/group.py", line 169, in
__getitem__oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "/..../python2.7/site-packages/h5py/_hl/base.py", line 133, in _e name = name.encode('ascii')
AttributeError: 'slice' object has no attribute 'encode'
I am not sure why I am getting this error, and I am not even sure whether I can slice the images, which are saved as individual datasets.
I know this is an old question, but it is the first hit when searching for 'slice' object has no attribute 'encode' and it has no solution.
The error happens because image_grp is a group: indexing a group looks up a member by name, so the slice object is treated as a name and fails to encode. You are looking for a dataset element inside the group.
You need to find/know the key for the item that contains the dataset.
One suggestion is to list all keys in the group, and then guess which one it is:
print(list(image_grp.keys()))
This will give you the keys in the group.
A common case is that the first key refers to the image dataset, so you can do this:
image_grp = h5f['/data/']
image = image_grp[list(image_grp.keys())[0]]
print(image[0:400])
Yesterday I had a similar error and wrote this little piece of code to take my desired slice of an h5py file.
import h5py
def h5py_slice(h5py_file, begin_index, end_index):
    # assumes each image is its own dataset, keyed by its integer index
    # as a string ('0', '1', ...)
    slice_list = []
    with h5py.File(h5py_file, 'r') as f:
        for i in range(begin_index, end_index):
            slice_list.append(f[str(i)][...])
    return slice_list
and it can be used like this:
the_desired_slice_list = h5py_slice('images.h5f', 0, 400)

py2neo 2.0 bind and push node errors

Using py2neo 2.0 and PyCharm Community Edition 4
I'm trying to update a node. First I get the node object, change a node property, bind to the database, and then push the node. I get a slew of errors. Here is the code.
user_node = Graph().find_one('USER',
                             property_key='email',
                             property_value='marnee#marnee.com')
user_properties['mission_statement'] = 'New mission statement'
user_node.bind(uri=Graph().uri)
user_node.push()
The node is found, and it does have a mission_statement property. The exception seems to happen on .push(). The Graph() uri is good, too.
Below are the errors.
I had been able to do this successfully about a week ago. I have not updated any packages recently.
The really weird part is that if I set a breakpoint and run this in debug mode, I do not get any errors and the node is updated successfully.
Traceback (most recent call last):
File "C:/Users/Marnee Dearman/PycharmProjects/AgoraDev/py2neo_2.0_tests/create_rel_int_loc.py", line 27, in <module>
user_node.push()
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\core.py", line 1519, in push
batch.push()
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\batch\push.py", line 73, in push
self.graph.batch.run(self)
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\batch\core.py", line 99, in run
response = self.post(batch)
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\batch\core.py", line 88, in post
data.append(dict(job, id=i))
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\batch\core.py", line 232, in __iter__
yield "to", self.target.uri_string
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\batch\core.py", line 180, in uri_string
uri_string = self.entity.ref
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\core.py", line 1421, in ref
return "node/%s" % self._id
File "C:\Users\Marnee Dearman\PycharmProjects\VirtualEnvs\AgoraDev\lib\site-packages\py2neo\core.py", line 1412, in _id
self.__id = int(self.uri.path.segments[-1])
ValueError: invalid literal for int() with base 10: ''
Using Nigel's advice below, I got this to work. It was a usage error on my part:
user_node = Graph().find_one('USER',
                             property_key='email',
                             property_value='marnee#email.com')
user_node.properties['mission_statement'] = 'New mission statement'
user_node.push()
There are a couple of problems with your code, so I will try to clarify the correct usage of these methods.
The bind method is used to connect a local entity (in this case, Node) to a corresponding remote equivalent. You should generally never need to use this method explicitly as entities are typically bound automatically on creation or retrieval. In your case, the find_one method does exactly this and constructs a client-side node that is bound to a corresponding server-side node; an explicit bind is not required.
The second issue is with your usage of bind. The URI taken by this method is that of a specific remote resource. You have passed the URI of the Graph itself (probably http://localhost:7474/db/data/) instead of that of the Node (such as http://localhost:7474/db/data/node/2345). The actual error that you see is caused by py2neo attempting to strip the ID from the URI and failing.
The simple solution is to remove the bind call.
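To make the URI distinction concrete, here is a minimal sketch (the node URI below is purely illustrative, not taken from the question):
# wrong: binds the node to the graph root, so py2neo later fails to strip
# a node ID from the URI (int('') -> the ValueError in the traceback)
user_node.bind(uri=Graph().uri)
# bind expects the URI of the node resource itself, e.g.:
user_node.bind(uri='http://localhost:7474/db/data/node/2345')
# but find_one already returns a bound node, so the simplest fix is no bind at all
user_node.push()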

error: unpack requires a string argument of length 8

I was running my script and I stumbled upon this error:
WARNING *** file size (24627) not 512 + multiple of sector size (512)
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
Traceback (most recent call last):
File "C:\Email Attachments\whatever.py", line 20, in <module>
main()
File "C:\Email Attachments\whatever.py", line 17, in main
csv_from_excel()
File "C:\Email Attachments\whatever.py", line 7, in csv_from_excel
sh = wb.sheet_by_name('B2B_REP_YLD_100_D_SQ.rpt')
File "C:\Python27\lib\site-packages\xlrd\book.py", line 442, in sheet_by_name
return self.sheet_by_index(sheetx)
File "C:\Python27\lib\site-packages\xlrd\book.py", line 432, in sheet_by_index
return self._sheet_list[sheetx] or self.get_sheet(sheetx)
File "C:\Python27\lib\site-packages\xlrd\book.py", line 696, in get_sheet
sh.read(self)
File "C:\Python27\lib\site-packages\xlrd\sheet.py", line 1055, in read
dim_tuple = local_unpack('<ixxH', data[4:12])
error: unpack requires a string argument of length 8
I was trying to process this Excel file:
https://drive.google.com/file/d/0B12NevhOGQGRMkRVdExuYjFveDQ/edit?usp=sharing
One solution I found is to manually open the spreadsheet, save it, then close it before I run my script that converts .xls to .csv. I find this solution a bit cumbersome and clunky.
This kind of spreadsheet is saved daily to my drive via an Outlook macro. Unprocessed data is piling up, which is why I turned to scripting to ease the job.
Who made the Outlook macro that's dumping this file? xlrd uses byte-level unpacking to read in the Excel file, and it is failing to read a field in this file. There are ways to trace where it's failing, but none to automatically recover from this type of error.
The erroneous data seems to be at data[4:12] of a specific frame (we'll see which one later), which should be a bytestring that's parsed as follows (see the sketch after this list):
a 4-byte integer (i)
2 pad bytes (xx)
an unsigned short, 2-byte integer (H)
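For illustration, here is a minimal sketch (not xlrd's own code) of why the message says length 8: the '<ixxH' format consumes exactly 8 bytes, so unpacking a shorter byte string reproduces the error (on Python 3 the wording is "unpack requires a buffer of 8 bytes"):
import struct
# little-endian: 4-byte int + 2 pad bytes + 2-byte unsigned short = 8 bytes
print(struct.calcsize('<ixxH'))        # 8
data = b'\x01\x00\x00\x00\x00\x00\x05\x00'
print(struct.unpack('<ixxH', data))    # (1, 5)
struct.unpack('<ixxH', data[:4])       # struct.error: unpack requires a string argument of length 8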
You can set xlrd to DEBUG mode, which will show you which bytes it's parsing and exactly where in the file the error occurs:
import xlrd
xlrd.DEBUG = 2
workbook = xlrd.open_workbook(u'/home/sparker/Downloads/20131117_040934_B2B_REP_YLD_100_D_LT.xls')
Here are the results, slightly trimmed down for the sake of SO:
parse_globals: record code is 0x0293
parse_globals: record code is 0x0293
parse_globals: record code is 0x0085
CODEPAGE: codepage 1200 -> encoding 'utf_16_le'
BOUNDSHEET: bv=80 data '\xfd\x04\x00\x00\x00\x00\x18\x00B2B_REP_YLD_100_D_SQ.rpt'
BOUNDSHEET: inx=0 vis=0 sheet_name=u'B2B_REP_YLD_100_D_SQ.rpt' abs_posn=1277 sheet_type=0x00
parse_globals: record code is 0x000a
GET_SHEETS: [u'B2B_REP_YLD_100_D_SQ.rpt'] [1277]
GET_SHEETS: sheetno = 0 [u'B2B_REP_YLD_100_D_SQ.rpt'] [1277]
reqd: 0x0010
getbof(): data='\x00\x06\x10\x00\xbb\r\xcc\x07\x00\x00\x00\x00\x06\x00\x00\x00'
getbof(): op=0x0809 version2=0x0600 streamtype=0x0010
getbof(): BOF found at offset 1277; savpos=1277
BOF: op=0x0809 vers=0x0600 stream=0x0010 buildid=3515 buildyr=1996 -> BIFF80
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "build/bdist.macosx-10.4-x86_64/egg/xlrd/__init__.py", line 457, in open_workbook
File "build/bdist.macosx-10.4-x86_64/egg/xlrd/__init__.py", line 1007, in get_sheets
File "build/bdist.macosx-10.4-x86_64/egg/xlrd/__init__.py", line 998, in get_sheet
File "build/bdist.macosx-10.4-x86_64/egg/xlrd/sheet.py", line 864, in read
struct.error: unpack requires a string argument of length 8
Specifically, you can see that it parses the name of the sheet, u'B2B_REP_YLD_100_D_SQ.rpt'.
Let's check the source code. The traceback throws an error here, where we can see from the parent loop that we're trying to parse the XL_DIMENSION and XL_DIMENSION2 values. These directly correspond to the shape of the Excel sheet.
And that's where there's a problem in your workbook: it's not being made correctly. So, back to my original question: who made the Excel macro? It needs to be fixed. But that's for another SO question, some other time.
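Since there is no automatic recovery, a pragmatic stopgap is to detect the bad files up front and set them aside for a manual open-and-save. A minimal sketch (the function name and flow are my own, not from the thread):
import xlrd
def is_readable_xls(path, sheet_name):
    # try to open the workbook and load the sheet; xlrd raises
    # struct.error on malformed files like the one above
    try:
        wb = xlrd.open_workbook(path)
        wb.sheet_by_name(sheet_name)
        return True
    except Exception as e:
        print('%s could not be parsed (%s); re-save it manually' % (path, e))
        return False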

Specifying select features to be categorical using OneHotEncoder in sklearn 0.14

I am using the sklearn 0.14 module in Python to create a decision tree. I was hoping to use the OneHotEncoder to convert some features into categorical features. According to the documentation, I should be able to provide an array of indices to indicate which features should be converted. However, trying the following code:
xs = [[64, 15230], [3, 67673], [16, 43678]]
encoder = preprocessing.OneHotEncoder(n_values='auto', categorical_features=[1], dtype=numpy.integer)
encoder.fit(xs)
I receive the following error:
Traceback (most recent call last):
File "C:\Users\sara\Documents\Shipping Project\PythonSandbox\CarrierDecisionTree.py", line 35, in <module>
encoder.fit(xs)
File "C:\Python27\lib\site-packages\sklearn\preprocessing\data.py", line 892, in fit
self.fit_transform(X)
File "C:\Python27\lib\site-packages\sklearn\preprocessing\data.py", line 944, in fit_transform
self.categorical_features, copy=True)
File "C:\Python27\lib\site-packages\sklearn\preprocessing\data.py", line 795, in _transform_selected
return sparse.hstack((X_sel, X_not_sel))
File "C:\Python27\lib\site-packages\scipy\sparse\construct.py", line 417, in hstack
return bmat([blocks], format=format, dtype=dtype)
File "C:\Python27\lib\site-packages\scipy\sparse\construct.py", line 532, in bmat
dtype = upcast( *tuple([A.dtype for A in blocks[block_mask]]) )
File "C:\Python27\lib\site-packages\scipy\sparse\sputils.py", line 53, in upcast
raise TypeError('no supported conversion for types: %r' % (args,))
TypeError: no supported conversion for types: (dtype('int32'), dtype('S6'))
If instead I provide the array [0, 1] to categorical_features, it works correctly and converts both features properly. The same correct behavior occurs when passing 'all' to categorical_features. However, I only want the second feature converted, not the first. I understand I could do this manually by converting one feature at a time, but I was hoping to use all the beauty of OneHotEncoder, as I will be using many more features later on.
Posting as an answer, for the record:
TypeError: no supported conversion for types: (dtype('int32'), dtype('S6'))
means something in the true xs (not the one shown in the code snippet) is a string: dtype('S6') is NumPy's length-six string type.
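A quick way to confirm this is to check the dtype NumPy infers for the array. A minimal sketch (the all-string xs below is a hypothetical reconstruction, e.g. data read straight from a file):
import numpy as np
xs = [['64', '15230'], ['3', '67673'], ['16', '43678']]  # hypothetical: raw fields still strings
arr = np.array(xs)
print(arr.dtype)              # a string dtype, e.g. '|S5' on Python 2 ('<U5' on Python 3)
print(arr.astype(int).dtype)  # cast numeric columns to integers before fitting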