I could use your support on this. Here is my issue:
I've got a 2D buffer of floats (in a data object) in a C++ program, which I write to a binary file using:
ptrToFile.write(reinterpret_cast<char *>(&data->array[0][0]), nbOfEltsInArray * sizeof(float));
The data contains 8192 floats, and I (correctly?) get a 32 kB (8192 * 4 bytes) file out of this line of code.
Now I want to read that binary file using MATLAB. The code is:
hdr_binaryfile = fopen(str_binaryfile_path,'r');
res2_raw = fread(hdr_binaryfile, 'float');
res2 = reshape(res2_raw, int_sizel, int_sizec);
But it doesn't behave as I expect. If I print the array of data in the C++ code using std::cout, I get:
pCarte_bin->m_size = 8192
pCarte_bin->m_sizel = 64
pCarte_bin->m_sizec = 128
pCarte_bin->m_p[0][0] = 1014.97
pCarte_bin->m_p[0][1] = 566946
pCarte_bin->m_p[0][2] = 423177
pCarte_bin->m_p[0][3] = 497375
pCarte_bin->m_p[0][4] = 624860
pCarte_bin->m_p[0][5] = 478834
pCarte_bin->m_p[1][0] = 2652.25
pCarte_bin->m_p[2][0] = 642077
pCarte_bin->m_p[3][0] = 5.33649e+006
pCarte_bin->m_p[4][0] = 3.80922e+006
pCarte_bin->m_p[5][0] = 568725
And on the MATLAB side, after I read the file using the little block of code above:
size(res2) = 64 128
res2(1,1) = 1014.9659
res2(1,2) = 323288.4063
res2(1,3) = 2652.2515
res2(1,4) = 457593.375
res2(1,5) = 642076.6875
res2(1,6) = 581674.625
res2(2,1) = 566946.1875
res2(3,1) = 423177.1563
res2(4,1) = 497374.6563
res2(5,1) = 624860.0625
res2(6,1) = 478833.7188
The size (rows, columns) is OK, as well as the very first item ([0][0] in C++ == [1][1] in MATLAB). But:
I'm reading the C++ row elements down the columns: [0][1] in C++ == [2][1] in MATLAB (remember that indexing starts at 1 in MATLAB), etc.
I'm reading one correct element out of two along the other dimension: [1][0] in C++ == [1][3] in MATLAB, [2][0] == [1][5], etc.
Any idea about this?
Thanks!
bye
Leaving aside the apparent precision difference (likely MATLAB's display settings), the issue here is almost certainly the difference between row-major and column-major ordering of data. Without more details it is hard to be certain, but MATLAB is column-major, meaning that contiguous memory on disk is interpreted as sequential elements of a column rather than a row.
The likely solution is to reverse the two sizes in your reshape and access the elements with the indices reversed. That is, swap int_sizel and int_sizec in the call to reshape, and then read elements expecting
pCarte_bin->m_p[0][0] = res2(1,1)
pCarte_bin->m_p[0][1] = res2(2,1)
pCarte_bin->m_p[0][2] = res2(3,1)
pCarte_bin->m_p[0][3] = res2(4,1)
pCarte_bin->m_p[1][0] = res2(1,2)
etc.
You could also transpose the array in MATLAB after reading it, but for a large array that could itself be costly.
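If it helps to see the index arithmetic concretely, here is a small standalone sketch in Python of the same situation: a row-major buffer streamed to bytes (as the C++ write does) and then indexed column-major (as MATLAB does after the swapped reshape). The 2x3 size and the values are made up for illustration:

```python
import struct

# Hypothetical small example: 2 rows x 3 columns, stored row-major,
# the way the C++ write() call streams the buffer to disk.
rows, cols = 2, 3
data = [[float(r * cols + c) for c in range(cols)] for r in range(rows)]

# Flatten row-major and pack as raw little-endian floats (the file contents).
raw = struct.pack('<%df' % (rows * cols), *(v for row in data for v in row))
flat = struct.unpack('<%df' % (rows * cols), raw)

def colmajor(flat_vals, nrows, ncols, j, i):
    """Element at (row j, col i), 0-based, of a column-major nrows x ncols array."""
    return flat_vals[i * nrows + j]

# After a swapped reshape (cols x rows), C++ element [i][j] lands at
# MATLAB element (j+1, i+1): the row index and column index trade places.
for i in range(rows):
    for j in range(cols):
        assert colmajor(flat, cols, rows, j, i) == data[i][j]
```

The loop at the end checks exactly the mapping listed above: the raw file bytes are unchanged, only the interpretation of which index varies fastest differs.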
I have data in Excel and want to write a subtotal for each group of the table.
So I want to loop through a range and write the string "Subtotal" in the first column, applying a formula '=SUM({}:{})' covering the rows from the start of the group down to the row just before the formula.
I know the start and end of the range.
How can I achieve that with a loop that writes the string and formula at the first blank row found?
Below is the code I'm trying, but it does not work.
row_start = number_rows_placement + number_rows_adsize + 20
row_end = number_rows_placement + number_rows_adsize + number_rows_daily + unqiue_final_day_wise * 5 + 15

for i in range(row_start, row_end):
    if i == " ":
        worksheet.write(i, 1, "Subtotal", format)
        i += 5
        worksheet.write_formula(i, 2, '=sum(:)', format)
I don't know where I'm wrong. Also, the sum range would vary: it should run from just after each header down to the row before the one where the formula is written.
The formula isn't valid in Excel: =sum(:) has no cell range in it, and it should be =SUM(), uppercase.
Also, you can generate the range for the formula with something like this:
from xlsxwriter.utility import xl_range
row_start = 60
row_end = 64
col = 1
cell_range = xl_range(row_start, col, row_end, col) # B61:B65
See the XlsxWriter Cell Utility Functions.
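To sketch how the pieces could fit together: the helper below builds the =SUM() string for each group. The group size, rows and column numbers here are made up, and a1() just mimics what xlsxwriter's xl_rowcol_to_cell / xl_range utilities already do for you:

```python
def a1(row, col):
    # 0-based (row, col) -> A1 notation, mimicking xlsxwriter's
    # xl_rowcol_to_cell utility.
    name = ''
    col += 1
    while col:
        col, rem = divmod(col - 1, 26)
        name = chr(ord('A') + rem) + name
    return '%s%d' % (name, row + 1)

def sum_formula(first_row, last_row, col):
    # Uppercase SUM over a single-column range, e.g. '=SUM(B61:B65)'.
    return '=SUM(%s:%s)' % (a1(first_row, col), a1(last_row, col))

print(sum_formula(60, 64, 1))  # =SUM(B61:B65)

# Hypothetical layout: groups of 5 data rows, each followed by a subtotal row.
group_size = 5
row_start = 60
for group in range(3):
    first = row_start + group * (group_size + 1)
    sub_row = first + group_size
    formula = sum_formula(first, sub_row - 1, 1)
    # With a real worksheet object you would then do something like:
    # worksheet.write(sub_row, 0, "Subtotal", fmt)
    # worksheet.write_formula(sub_row, 1, formula, fmt)
```

In your real code you would replace the hard-coded group layout with whatever marks the end of each group in your data.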
I have a function which is supposed to unpack an H5PY dataset, temp.hdf5, which only contains one example so it can be evaluated:
def getprob():
    test_set = H5PYDataset('temp.hdf5', which_sets=('test',))
    handle = test_set.open()
    test_data = test_set.get_data(handle, slice(0, 1))
    xx = test_data[0]
    YY = test_data[1]
    l, prob, rho_orig, rho_larger, rho_largest = f(xx)
    return prob[9][0]
Here test_data[0] is a 28x28 array of integers and test_data[1] is an integer between 0 and 9.
The problem is that, from within the function, test_data[0] is always a 28x28 array of zeros, even though that is not what is stored in 'temp.hdf5'. test_data[1] always loads properly, though.
When these lines of code are run outside of the function, everything works just fine.
What is going on here?
please help
I'm a beginner to python programming and my problem is this:
I have to make a program which first reads a text file like this one->
A a 1 2 (line one)
A b 3 5 (line two)
A c 9 1
B d 2 4
B e 9 2
C r 3 4
...
and find out: for each first value (A, B, C, ...), which second value (a, b, c, ...) has the maximum (third value) * (fourth value) (1*2, 3*5, ...).
that is, in this example the result should be b, e, r.
And I need to do it either 1) without using the dictionary class and without saving each piece of data,
or 2) by devising a class and object and doing the same thing.
(Actually I have to write this program twice, once with each method.)
What I am really confused about is this: I first made the program using a dictionary, but I have no idea how to do it with either of the two methods above.
I used a dictionary[dictionary[value]] structure (saving each line's data) and found which entry has the max value for each first value.
How can I do this some other way?
In particular, is it even possible with method 1), without using the dictionary class and saving each piece of data?
Thank you for reading my question.
I'm really just beginning to learn programming, so any advice would be really appreciated.
here is what I've done so far:
The code below works by storing the maximum values and comparing them against the values currently being read from the file. It is intentionally incomplete: it does not handle cases where two products are equal, and it misses an edge case that you should be able to find using your example input. I've left those for you to complete.
max_vals = []
with open('FILE.TXT', 'r') as f:
    max_first_val = None
    max_second_val = None
    max_prod = 0
    for line in f:
        vals = line.strip('\n').split(' ')
        curr_prod = int(vals[2]) * int(vals[3])
        if vals[0] != max_first_val and max_first_val is not None:
            max_vals.append(max_second_val)
            max_first_val = vals[0]
            max_prod = 0
        if curr_prod > max_prod:
            max_first_val = vals[0]
            max_second_val = vals[1]
            max_prod = curr_prod
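For the class-based variant (your method 2), a minimal sketch along the same lines might look like this. The class name and the list-instead-of-dict bookkeeping are my own choices, not the only way to do it:

```python
class GroupMax:
    """Tracks, for one group key, the second value with the largest product."""
    def __init__(self, key):
        self.key = key
        self.best_name = None
        self.best_prod = None

    def update(self, name, prod):
        # Keep the name whose product is largest so far for this key.
        if self.best_prod is None or prod > self.best_prod:
            self.best_name = name
            self.best_prod = prod

def winners(lines):
    groups = []  # a plain list, not a dict, to keep first-seen order
    for line in lines:
        key, name, a, b = line.split()
        prod = int(a) * int(b)
        current = next((g for g in groups if g.key == key), None)
        if current is None:
            current = GroupMax(key)
            groups.append(current)
        current.update(name, prod)
    return [g.best_name for g in groups]

sample = ["A a 1 2", "A b 3 5", "A c 9 1", "B d 2 4", "B e 9 2", "C r 3 4"]
print(winners(sample))  # ['b', 'e', 'r']
```

With a real file you would pass the open file object instead of the sample list, since iterating a file yields its lines.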
I am trying to train the system on some data; Sound_Fc is a 16x1 float array.
for i in range(0, 26983):
    Block_coo = X[0, i]
    Fc = Block_coo[4]
    Sound_Fc = Fc[:, 0]
    Vib_Fc = Fc[:, 1]
    y = np.matrix([[1.0], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]])

(trainX, testX, trainY, testY) = train_test_split(
    Sound_Fc, y, test_size=0.33, random_state=42)
dbn = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],
    input_shape=(None, trainX.shape[0]),
    hidden_num_units=8,
    output_num_units=4,
    output_nonlinearity=softmax,
    update=nesterov_momentum,
    update_learning_rate=0.3,
    update_momentum=0.9,
    regression=False,
    max_epochs=5,
    verbose=1,
)

dbn.fit(trainX, trainY)
But I'm getting this error:
Warning (from warnings module):
File "C:\Users\Essam Seddik\AppData\Roaming\Python\Python27\site-packages\sklearn\cross_validation.py", line 399
% (min_labels, self.n_folds)), Warning)
Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=5.
Traceback (most recent call last):
File "C:\Essam Seddik\Deep Learning Python Tutorial\DNV_DeepLearn.py", line 77, in <module>
dbn.fit(trainX,trainY)
File "C:\Python27\lib\site-packages\nolearn-0.6adev-py2.7.egg\nolearn\lasagne\base.py", line 293, in fit
self.train_loop(X, y)
File "C:\Python27\lib\site-packages\nolearn-0.6adev-py2.7.egg\nolearn\lasagne\base.py", line 300, in train_loop
X, y, self.eval_size)
File "C:\Python27\lib\site-packages\nolearn-0.6adev-py2.7.egg\nolearn\lasagne\base.py", line 401, in train_test_split
kf = StratifiedKFold(y, round(1. / eval_size))
File "C:\Users\Essam Seddik\AppData\Roaming\Python\Python27\site-packages\sklearn\cross_validation.py", line 416, in __init__
label_test_folds = test_folds[y == label]
IndexError: too many indices for array
I tried xrange instead of range, and y = list() instead of the defined y. I also tried small numbers in the for-loop range, like 5, 10 and 100, instead of 26983.
I tried np.array, np.ndarray and np.atleast_2d. Nothing works!
At every iteration of that loop you are overwriting Sound_Fc, so at the end of the loop its value is X[0,26982][4][:,0]. You are also overwriting y with the same value at every iteration; it is just a vector of the values 1 to 16. So your total data is 16 points, and the y value of each one is unique (some value between 1 and 16). You then split this into training and test data, making 5 of the 16 points your test set and 11 your training set. With only a single example for each observed y value, the stratified split is complaining that it cannot extract enough information to predict those y values in the future.
If I understand correctly, instead of overwriting Sound_Fc and y at each iteration, you want to append them to growing x and y vectors. You can do this with np.vstack, which stacks numpy arrays vertically. Replace that loop with the following:
Sound_Fc = np.vstack( [X[0,i][4][:,0] for i in range(26983)] )
y = np.vstack([np.matrix(range(1,17)).T for i in range(26983)])
Before, Sound_Fc had the shape (16, 1) when you used it as your feature vector. Now it will have the shape (431728, 1); that number is 26983 * 16, since you're stacking 26983 vectors of 16 elements each. Your y will also have the shape (431728, 1).
[X[0,i][4][:,0] for i in range(26983)] creates a list of 26983 elements, each element is a (16,1) shape numpy array. np.vstack stacks them vertically to get a single, tall, (431728, 1) array. This is your feature vector.
np.matrix(range(1,17)) creates a matrix with the elements 1 to 16, of shape (1, 16). Taking its transpose with .T makes it vertical, shape (16, 1). Again, we make a list of 26983 of these and vstack them to get a (431728, 1) vector that goes 1 to 16, then 1 to 16 again, repeating that pattern over and over. This is your output vector. Now, for each output value (say 8, for instance), you have 26983 data points to learn from (well, 17809 once .66 of the data is split off as your training set), and your model will not complain about not having enough examples for a specific y output.
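The shape bookkeeping can be sanity-checked at a small scale. In this sketch I use 4 blocks instead of 26983 and made-up feature values; everything else follows the description above:

```python
import numpy as np

# Stand-in for the real data: 4 "blocks" instead of 26983, each a (16, 1)
# feature vector with hypothetical values.
n_blocks, n_feats = 4, 16
blocks = [np.arange(n_feats, dtype=float).reshape(n_feats, 1)
          for _ in range(n_blocks)]

# vstack concatenates along the first axis: (4 * 16, 1) = (64, 1).
X_stacked = np.vstack(blocks)

# The 1..16 output pattern, repeated once per block.
y_stacked = np.vstack([np.arange(1, n_feats + 1).reshape(n_feats, 1)
                       for _ in range(n_blocks)])

assert X_stacked.shape == (n_blocks * n_feats, 1)
assert y_stacked.shape == (n_blocks * n_feats, 1)
assert y_stacked[0, 0] == 1 and y_stacked[n_feats, 0] == 1  # pattern repeats
```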
There might of course be other errors related to other things (I can't see your data, so I don't know what's in that big X).