I am planning to create an SVM with the OpenCV2 machine learning libraries to process some images. I have done some digging on this site and found that I need to convert the images into vectors and create a matrix out of these vectors. However, I have found no information on how to do that. Please help. Please also note that I am using Python.
Probably all OpenCV ML algorithms want the following inputs:
an N x M trainData (float) Mat, composed like this:
one row (of the N) per feature vector, where the feature size is M
an N x 1 array of labels (class ids), where item i is the label for the feature vector in row i
So, if you have a lot of 1d, flattened features and corresponding labels, you would:
# pseudocode
svm = cv2.SVM()  # getting the params right is a science of its own..
traindata, trainlabels = [], []
for feature, label in my_training_data:   # pseudocode: iterate over (feature, label) pairs
    traindata.append(feature)             # one flattened 1d array of numbers per row
    trainlabels.append(label)             # one class id for the feature above
# now train it (rows must be float32, labels int):
svm.train(np.array(traindata, dtype=np.float32), np.array(trainlabels))
# after that, we can go and predict labels from new test input;
# it will return the predicted label (same ones you fed to the training before...)
p = svm.predict(np.array(test_feature, dtype=np.float32))
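For newer OpenCV versions (3.x/4.x) the SVM lives under cv2.ml rather than cv2.SVM. Here is a minimal runnable sketch of the same idea; the random features standing in for real image vectors and the linear-kernel parameters are just assumptions:

import numpy as np
import cv2

# Made-up data: 20 flattened "image" feature vectors of length 64, two classes.
traindata = np.random.rand(20, 64).astype(np.float32)        # N x M float32 matrix
trainlabels = np.array([0] * 10 + [1] * 10, dtype=np.int32)  # N class ids

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.train(traindata, cv2.ml.ROW_SAMPLE, trainlabels)         # one sample per row

test_feature = np.random.rand(1, 64).astype(np.float32)
_, result = svm.predict(test_feature)                        # returns (retval, labels)
print(result)                                                # predicted class id(s)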
Also look here, please!
I'm currently trying to implement the YOLOv3 object detection model in C (only detection, not training).
I have tested my convolution method with arbitrary values and it seems to be working as I expected.
Before stacking up multiple method calls to do forward propagation, I thought it would be safe to test with the actual pretrained weight file data.
When I looked at Darknet's pre-trained weight file, it was a huge chunk of binary data. I tried converting it to hex and decimal, but it still isn't easy to pinpoint which values to use.
So my question is: what should I do to extract the decimal values of the weights and filter values so that I can use them in the same order as the forward propagation happening in YOLOv3?
* I'm currently trying to build my C version of YOLOv3 using the structure image shown at https://www.itread01.com/content/1541167345.html
* My C code will run on an FPGA board called MicroZed, along with other HDL code.
* I tried to plug some printf calls into various places in the Darknet code to see what kinds of data are moving around when YOLOv3 runs; however, when I ran it in a Linux terminal, it didn't show anything new and kept outputting the same results.
Any help or advice will be really appreciated. Thank you!
I am not sure whether there is a direct way to read Darknet weights, but you can convert them into .h5 format and obtain the weight values from that.
You can convert the Darknet YOLOv3 weights into .h5 format (used by Keras) by using the appropriate command from this repository.
You can choose the command based on your YOLO version from the list shown in the README of the linked repo. For the standard YOLOv3, the command for converting is
python tools/model_converter/convert.py cfg/yolov3.cfg weights/yolov3.weights weights/yolov3.h5
Once you have the .h5 weights, you can use the code snippet below to obtain the values from the weights (credit/source).
import h5py

path = "<path to weights>.h5"
weights = {}
keys = []
with h5py.File(path, 'r') as f:              # open file
    f.visit(keys.append)                     # append the name of every group/dataset to the list
    for key in keys:
        if ':' in key:                       # keys containing ':' hold actual parameter data
            param_name = f[key].name
            weights[param_name] = f[key][()] # use [()] instead of the deprecated .value
            print(param_name, weights[param_name])
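If you'd rather read the raw .weights file directly (which may be more convenient for a C/FPGA port), my understanding is that Darknet stores a small header followed by raw float32 values in the order the layers appear in the .cfg. Here is a minimal sketch under that assumption; the file name and layout details are assumptions, so cross-check against Darknet's parser.c:

import numpy as np

# Assumed layout: int32 major, minor, revision, then a 'seen' image counter
# (8 bytes when major*10+minor >= 2, which is the case for yolov3.weights),
# followed by all parameters as flat float32 values (per conv layer: biases,
# [batch-norm scales, means, variances], then the convolution weights).
with open("yolov3.weights", "rb") as f:          # path is an assumption
    major, minor, revision = np.fromfile(f, dtype=np.int32, count=3)
    seen_dtype = np.int64 if (major * 10 + minor) >= 2 else np.int32
    seen = np.fromfile(f, dtype=seen_dtype, count=1)[0]
    params = np.fromfile(f, dtype=np.float32)    # everything else is float32

print(major, minor, revision, seen, params.shape)
# Splitting `params` into per-layer tensors still requires walking the .cfg to
# know each layer's filter count and kernel size, the way Darknet's parser does.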
I am a beginner in machine learning. My academic project involves detecting human posture from acceleration and gyro data. I am stuck at the very beginning. My accelerometer data has x, y, z values and the gyro also has x, y, z values, stored in the files acc.csv and gyro.csv. I want to classify the 'standing', 'sitting', 'walking' and 'lying' positions. The idea is to train the machine using some (supervised) ML algorithm and then throw a new acc + gyro dataset at it to identify what this new dataset predicts (what the subject is doing at present). I am facing the following problems:
Constructing a training dataset -- I think my activities will be the dependent variable, and the acc & gyro axis readings will be independent. So if I want to combine them into a single matrix, where each element of the matrix again has its own set of acc and gyro values [something like a main matrix with sub-matrices], how can I do that? Or is there an alternative way to achieve the same?
How can I put the data of multiple activities with multiple readings into a single training matrix,
I mean 10 walking recordings each with its own acc (xyz) and gyro (xyz) + 10 standing recordings each with its own acc (xyz) and gyro (xyz) + 10 sitting recordings each with its own acc (xyz) and gyro (xyz), and so on?
Each data file has a different number of records and timestamps; how do I bring them onto a common footing?
I know I am asking very basic things, but these are the confusing parts nobody has clearly explained to me. I feel like I am standing in front of a big closed door; very interesting things are happening inside in which I cannot participate at this moment with my limited knowledge. My mathematical background is high-school level only. Please help.
I have gone through some projects on activity recognition on GitHub, but they are way too complicated for a beginner like me.
import pandas as pd
import os
import warnings
from sklearn.utils import shuffle

warnings.filterwarnings('ignore')
os.listdir('../input/testtraindata/')
base_train_dir = '../input/testtraindata/Train_Set/'

# Train Data
train_data = pd.DataFrame(columns = ['activity','ax','ay','az','gx','gy','gz'])
train_folders = os.listdir(base_train_dir)
for tf in train_folders:
    files = os.listdir(base_train_dir + tf)
    for f in files:
        df = pd.read_csv(base_train_dir + tf + '/' + f)
        train_data = pd.concat([train_data, df], axis=0)
train_data = shuffle(train_data)
train_data.reset_index(drop=True, inplace=True)
train_data.head()
The Data Set
Problem in Train_set
Surprisingly, if I remove the last 'gz' from
train_data = pd.DataFrame(columns = ['activity','ax','ay','az','gx','gy','gz'])
everything works fine.
Do you have the data labeled? --> i.e., does a set of x, y, z values correspond to a posture?
I have no clue about the values (as I have not seen the dataset, and have no clue about positions, acc or gyro), but I'm guessing you should have a dataset in a matrix with x, y, z as feature columns and a target column, "position".
If you need all 6 (3 from one csv and 3 from the other) to define the position, you can make 6 feature columns + the position.
Something like: x_1, y_1, z_1, x_2, y_2 and z_2 + a position label (the "position" column).
You can also make each position its own column with 0/1 as true/false:
"sitting", "walking" etc., with 0 and 1 as the values in those columns.
Is the timestamp of any importance for the position? If it is not an important feature, I would just drop it. If it is important in some way, you might want to bin the timestamps.
Here is a beginner's guide from Medium in which you can see a bit of how to preprocess your data. It also shows one-hot encoding :)
https://medium.com/hugo-ferreiras-blog/dealing-with-categorical-features-in-machine-learning-1bb70f07262d
Also try googling "preprocessing your data"; then you will probably find the right recipe.
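To make the layout concrete, here is a minimal sketch of merging the accelerometer and gyro readings into one feature matrix with a position label, plus optional one-hot encoding of the label; the file names, column names and timestamp alignment are all assumptions about your data:

import pandas as pd

# Hypothetical files for ONE recording of ONE activity (e.g. "walking").
acc = pd.read_csv("acc.csv")     # assumed columns: t, ax, ay, az
gyro = pd.read_csv("gyro.csv")   # assumed columns: t, gx, gy, gz

# Align the two sensors on their timestamps so each row has all 6 readings.
df = pd.merge_asof(acc.sort_values("t"), gyro.sort_values("t"), on="t")
df["activity"] = "walking"       # the label for this recording

# Repeat for every recording of every activity, then stack them into one table:
# train_data = pd.concat([df_walk1, df_sit1, ...], ignore_index=True)

# Features X and label y, with the label optionally one-hot encoded:
X = df[["ax", "ay", "az", "gx", "gy", "gz"]]
y = pd.get_dummies(df["activity"])   # one 0/1 column per activity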
I came across this snippet in the Tensorflow documentation, MNIST For ML Beginners.
eval_data = mnist.test.images # Returns np.array
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
Now I want to feed in my own test images, without labelling them, and would like the model to predict the labels. How do I achieve this?
Yes, you can, but then it would not be deep learning; instead it would be clustering (e.g. k-means clustering).
The basic idea is the following:
Create two placeholders, one for the input and one for the centroids
Decide on a distance metric
Build the graph
Feed only the (unlabelled) dataset to run the graph
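A minimal sketch of that idea with the TF 1.x API (tf.placeholder / tf.Session); the cluster count, iteration count and random stand-in data are arbitrary assumptions:

import numpy as np
import tensorflow as tf   # assumes TensorFlow 1.x

n_clusters, n_features = 10, 784     # e.g. 10 clusters of flattened 28x28 images

# Placeholders for the unlabelled data and the current centroids
points = tf.placeholder(tf.float32, [None, n_features])
centroids = tf.placeholder(tf.float32, [n_clusters, n_features])

# Distance metric: squared Euclidean distance from every point to every centroid
diff = tf.expand_dims(points, 1) - tf.expand_dims(centroids, 0)   # [N, k, d]
distances = tf.reduce_sum(tf.square(diff), axis=2)                # [N, k]
assignments = tf.argmin(distances, axis=1)                        # nearest centroid per point

data = np.random.rand(1000, n_features).astype(np.float32)        # stand-in for your images
cents = data[np.random.choice(len(data), n_clusters, replace=False)]

with tf.Session() as sess:
    for _ in range(20):                                           # a few k-means iterations
        labels = sess.run(assignments, {points: data, centroids: cents})
        # Recompute centroids in numpy from the current assignments
        cents = np.array([data[labels == k].mean(axis=0) if np.any(labels == k)
                          else cents[k] for k in range(n_clusters)])

print(labels[:10])   # cluster ids for the first few samples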
I'm trying to do binary LSTM classification using theano.
I have gone through the example code however I want to build my own.
I have a small set of "Hello" & "Goodbye" recordings that I am using. I preprocess these by extracting their MFCC features and saving those features in text files. I have 20 speech files (10 of each word) and I am generating one text file per recording, so 20 text files that contain the MFCC features. Each file is a 13x56 matrix.
My problem now is: How do I use this text file to train the LSTM?
I am relatively new to this. I have gone through some literature on it as well but haven't really gained a good understanding of the concepts.
Any simpler way of using LSTMs would also be welcome.
There are many existing implementations, for example a TensorFlow implementation and a Kaldi-focused implementation with all the scripts; it is better to check them first.
Theano is too low-level; you might try Keras instead, as described in the tutorial. You can run the tutorial "as is" to understand how things go.
Then, you need to prepare a dataset. You need to turn your data into sequences of data frames, and for every data frame in a sequence you need to assign an output label.
Keras supports two types of RNN layers: layers returning sequences and layers returning single values. You can experiment with both; in code you just use return_sequences=True or return_sequences=False.
To train with sequences you can assign a dummy label to all frames except the last one, where you assign the label of the word you want to recognize. You need to place the inputs and output labels in arrays. So it will be:
X = [[word1frame1, word1frame2, ..., word1framen],[word2frame1, word2frame2,...word2framen]]
Y = [[0,0,...,1], [0,0,....,2]]
In X every element is a vector of 13 floats. In Y every element is just a number: 0 for intermediate frames and the word ID for the final frame.
To train with just labels you place the inputs and output labels in arrays too, and the output array is simpler. So the data will be:
X = [[word1frame1, word1frame2, ..., word1framen],[word2frame1, word2frame2,...word2framen]]
Y = [[0,0,1], [0,1,0]]
Note that the output is vectorized (np_utils.to_categorical) to turn it into vectors instead of plain numbers.
Then you create the network architecture. You can have 13 floats as input and a vector as output. In the middle you might have one fully connected layer followed by one LSTM layer. Do not use layers that are too big; start with small ones.
Then you feed this dataset into model.fit and it trains the model. You can estimate model quality on a held-out set after training.
You will have a problem with convergence since you have just 20 examples. You need far more examples, preferably thousands, to train an LSTM; with so little data you will only be able to use very small models.
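As a concrete starting point, here is a minimal Keras sketch of the second (one label per recording) variant, with random arrays standing in for the 20 MFCC files; the layer sizes, epoch count and batch size are arbitrary assumptions:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.utils import to_categorical

n_frames, n_mfcc, n_classes = 56, 13, 2    # 13x56 MFCC matrices, transposed to (frames, coeffs)
X = np.random.rand(20, n_frames, n_mfcc).astype("float32")   # stand-in for the 20 recordings
y = to_categorical([0] * 10 + [1] * 10, n_classes)           # "Hello" = 0, "Goodbye" = 1

model = Sequential([
    Dense(32, activation="relu", input_shape=(n_frames, n_mfcc)),  # small dense layer per frame
    LSTM(32),                                # return_sequences=False: one vector per recording
    Dense(n_classes, activation="softmax"),  # word probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=4)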
The following example is written in Python and is taken from the book Mastering Machine Learning.
Overview of the task:
the training data is stored in the column vectors X_train (features) and y_train (response variables)
the data for testing purposes is stored in X_test and y_test, respectively
now fit a model to the training data using polynomial regression (in this case quadratic)
The author's approach (imports and data initialization excluded):
quad_featurizer = PolynomialFeatures(degree=2)
X_train_quad = quad_featurizer.fit_transform(X_train)
X_test_quad = quad_featurizer.transform(X_test)
regressor_quad = LinearRegression()
regressor_quad.fit(X_train_quad, y_train)
The author didn't comment the code or tell us anything more about the methods used. Since the scikit-learn API documentation couldn't give me a satisfying answer either, I'd like to ask you.
Why would I use fit_transform and not just transform for preprocessing the training data? I mean, the actual fitting is done with the regressor_quad object, so fit_transform is redundant, isn't it?
scikit-learn's featurizers must first be adjusted to your specific dataset; only afterwards can they transform it into new feature vectors. fit() performs that adjustment. Therefore you need to call fit() first and then transform(), or both at the same time via fit_transform().
In your example PolynomialFeatures is used to project your training data into a new, higher-dimensional feature space. So a vector (3, 6) becomes (1, 3, 6, 3*3, 3*6, 6*6). In fit() PolynomialFeatures learns the size of your training vectors, and in transform() it creates new training vectors from the old ones. So X_train_quad is a new matrix with a shape that is different from X_train.
Afterwards the same is done with X_test, but by then PolynomialFeatures already knows the size of your vectors, so it doesn't have to be fitted again. LinearRegression is then trained on your new training data (X_train_quad) via its fit() method, which is completely separate from PolynomialFeatures; its fit() therefore has nothing to do with the fit() of PolynomialFeatures.
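To see this concretely, here is a small sketch with made-up two-feature data showing that fit() only records the input dimensionality, which transform() then reuses on the test set:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_train = np.array([[3, 6], [1, 2]])  # hypothetical 2-feature training data
X_test = np.array([[2, 5]])

quad = PolynomialFeatures(degree=2)
X_train_quad = quad.fit_transform(X_train)  # fit() learns there are 2 features, then transforms
X_test_quad = quad.transform(X_test)        # reuse the fitted featurizer; no second fit

print(X_train_quad[0])  # [ 1.  3.  6.  9. 18. 36.] -> (1, x1, x2, x1^2, x1*x2, x2^2)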