Train skin pixels using OpenCV CvNormalBayesClassifier - C++

I'm very new to OpenCV. I am trying to use the CvNormalBayesClassifier to train my program to learn skin pixel colours.
Currently I have got around 20 human pictures (face/other body parts) under different light conditions and backgrounds. I have also got 20 corresponding responses in which the skin parts are marked red and everything else marked green.
I have trouble understanding how to use the function
bool CvNormalBayesClassifier::train(const CvMat* _train_data, const CvMat* _responses, const CvMat* _var_idx = 0, const CvMat* _sample_idx = 0, bool update = false);
How should I use the two picture sets I currently have to prepare the values that can be passed in as _train_data and _responses?
Many thanks.

You need to put in train_data the pixel values from your training images, and in responses an index corresponding to the class of each pixel (e.g. 1 for the skin class, 0 for the non-skin class). var_idx and sample_idx can be left as is; they are used to mask out some of the descriptors or samples in your training set. Set update to true or false depending on how you feed the data: if you gather all the descriptors (all the pixels of all your training images) at once, you can leave it at false; if you process your training images incrementally (which might be better for memory reasons), you need to update your model, so set it to true.
Let me clarify with some code (not checked, and using the C++ interface to OpenCV, which I strongly recommend over the old C one):
int main(int argc, char **argv)
{
    CvNormalBayesClassifier classifier;
    for (int i = 1; i < argc; ++i) {
        cv::Mat image = cv::imread(argv[i]); // read in your training image
        cv::Mat mask = ...; // read your mask image
        // Little trick: you said red pixels in your mask correspond to skin, so set
        // responses to 255 wherever the mask pixel is red (BGR order), 0 otherwise.
        cv::Mat responses;
        cv::inRange(mask, cv::Scalar(0, 0, 255), cv::Scalar(0, 0, 255), responses);
        cv::Mat responsesInt;
        responses.convertTo(responsesInt, CV_32S); // train expects a matrix of integers
        // Little trick number 2: convert your width x height, N-channel image into a
        // (width*height) x N matrix, as each pixel should be considered a training sample.
        image = image.reshape(1, image.rows * image.cols);
        image.convertTo(image, CV_32F); // train expects floating-point samples
        responsesInt = responsesInt.reshape(1, responsesInt.rows * responsesInt.cols);
        classifier.train(image, responsesInt, cv::Mat(), cv::Mat(), i > 1); // update the model after the first image
    }
    return 0;
}
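Once the model is trained, classifying the pixels of a new image follows the same layout; here is a rough, untested sketch (newImage is a made-up name for a BGR image loaded with cv::imread):
cv::Mat samples = newImage.reshape(1, newImage.rows * newImage.cols); // one pixel per row, 3 columns (B, G, R)
samples.convertTo(samples, CV_32F); // predict expects floating-point samples
cv::Mat results;
classifier.predict(samples, &results); // one class label per pixel
results = results.reshape(1, newImage.rows); // back to image shape: non-zero where skin was predicted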

I did a Google search on this class but didn't find much information, and actually even the official OpenCV documentation does not directly explain the parameters. But I did notice one thing in the OpenCV documentation:
The method trains the Normal Bayes classifier. It follows the
conventions of the generic CvStatModel::train() approach with the
following limitations:
which directed me to the CvStatModel class, and from there I found something useful. You can probably also take a look at the book from page 471, which gives more details on this class. The book is free on Google Books.

Related

Set colour limit axis in OpenCV 4 (c++) akin to Matlab's CAXIS

Matlab offers the ability to set colour limits for the current axis using CAXIS. OpenCV has applyColorMap, which can be used to highlight differences in pixel intensity in a greyscale image and which I believe maps pixels from 0 to 255.
I am new to Matlab/image processing and have been asked to port a simple program from Matlab which uses the CAXIS function to change the "brightness" of a colour map. I have no experience with Matlab, but it appears that they use this function to "lower" the intensity required for a pixel to be mapped to a more intense colour on the map:
i.e. Colour map using "JET"
When brightness = 1, red = 255
When brightness = 10, red >= 25
The Matlab program allows 16-bit images to be read in and displayed, which obviously gives higher pixel values, whereas everything I've read and done indicates that OpenCV only supports 8-bit images (for colour maps).
Therefore my question is, is it possible to provide similar functionality in OpenCV? How do you set the axis limit for a colourmap/how do you scale the colour map lookup table so that "less" intense pixels are scaled to the more intense regions?
A similar question was asked, with a reply stating the array needs to be "normalised", but unfortunately I don't quite know how to achieve this and can't reply to the answer as I don't have enough rep!
I have gone ahead and used cv::normalize to set the max value in the array to be maxPixelValue/brightness but that doesn't work at all.
I have also experimented and tried converting my 16-bit image into a CV_8UC1 with a scale factor, to no avail. Any help would be greatly appreciated!
In my opinion you can use cv::normalize to "crop" values in the source picture to the corresponding ones in the colour map region you are interested in. Say you want your image to be mapped to the blue-ish region of the Jet colormap; then you should do something like:
int minVal = 0, maxVal = 80;
cv::normalize(src,dst, minVal, maxVal, cv::NORM_MINMAX);
If you plan to apply some kind of custom map, it's fairly easy for a 1- or 3-channel 8-bit image: you only need to create a LUT with 256 values (with the proper number of channels) and apply it using cv::LUT. More about it in this blog; also see the docs about LUT.
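As an illustration of that LUT idea, here is a rough, untested sketch that squeezes the low end of an 8-bit image onto the intense part of the Jet map; the brightness factor and the applyBrightnessLut name are just placeholders for the question's terminology:
#include <opencv2/opencv.hpp>
cv::Mat applyBrightnessLut(const cv::Mat& src8u, double brightness)
{
    // Build a 256-entry LUT: values above 255/brightness saturate to 255,
    // so they land on the most intense end of the colormap.
    cv::Mat lut(1, 256, CV_8U);
    for (int i = 0; i < 256; ++i)
        lut.at<uchar>(i) = cv::saturate_cast<uchar>(i * brightness);
    cv::Mat remapped, colored;
    cv::LUT(src8u, lut, remapped);                          // rescale intensities
    cv::applyColorMap(remapped, colored, cv::COLORMAP_JET); // then map to Jet
    return colored;
}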
If the image you are working with is of a different depth, 16-bit or even floating point, I guess all you need to do is write a function like:
template<class T>
T customColorMapper(T input_pixel)
{
    T output_pixel = 0;
    // do something with output_pixel based on input_pixel
    return output_pixel;
}
and apply it to each source image pixel like:
cv::Mat dst_image = src_image.clone(); // copy data
dst_image.forEach<TYPE>([](TYPE& input_pixel, const int* pos_row_col) -> void {
    input_pixel = customColorMapper<TYPE>(input_pixel);
});
Of course TYPE needs to be a valid pixel type. Maybe a specialized version of this function taking cv::Scalar or cv::Vec3-something would be nice if you need to work with multiple channels.
Hope this helps!
I managed to replicate the Matlab behaviour, but had to resort to manually iterating over each pixel and either setting the value to the maximum for the image depth or scaling the value where needed.
My code looked something like this:
double min, max;
cv::minMaxLoc(dst, &min, &max);
double axisThreshold = floor(max / contrastLevel);
for (int i = 0; i < dst.rows; i++)
{
    for (int j = 0; j < dst.cols; j++)
    {
        // dst is 16-bit unsigned (CV_16U), so read it as ushort rather than short
        double pixel = dst.at<ushort>(i, j);
        if (pixel >= axisThreshold)
        {
            pixel = USHRT_MAX; // clip to the top of the 16-bit range
        }
        else
        {
            pixel *= (USHRT_MAX / axisThreshold); // rescale towards the top of the range
        }
        dst.at<ushort>(i, j) = cv::saturate_cast<ushort>(pixel);
    }
}
In my example I had a slider which adjusted the contrast/brightness (we called it contrast, the original implementation called it brightness).
When the contrast/brightness was changed, the program would retrieve the maximum pixel value in the image and then compute the axis limit as
calculatedThreshold = max pixel value in the image / contrast
Each pixel above the threshold gets set to the maximum value for the image depth (USHRT_MAX here); each pixel below the threshold gets multiplied by a scale factor calculated as
scale = maximum value for the image depth / calculatedThreshold
TBH I can't say I fully understand the maths behind it; I just used trial and error until it worked, and any help in that department would be appreciated. However, it seems to do what I want it to!
My understanding of the initial Matlab implementation and the terminology "brightness" is that it is in fact an attempt to scale the colourmap so that the "brighter" the image, the less intense each pixel has to be to map to a particular colour in the colourmap.
Since applyColorMap only works on 8-bit images, when the brightness increases and the colourmap axis values decrease, we need to ensure the pixel values scale accordingly so that they now match up with the "higher" intensity values in the map.
I have seen numerous OpenCV tutorials which use this approach to changing the contrast/brightness, but they often promote the use of the optimised convertTo (especially if you're trying to use the GPU). However, as far as I can see, convertTo applies the alpha/beta values uniformly and not on a pixel-by-pixel basis, so I can't use that approach.
I will update this question if I find more suitable OpenCV functions to achieve what I want.

Thresholding an image in OpenCV using another image matrix of pixel standard deviations as threshold values

I have two videos, one of a background and one of that same background with a person sitting in the frame. I generated two images from the video of just the background: the mean image of the background video (by accumulating the frames and dividing by the number of frames) and an image of standard deviations from the mean per pixel, taken over the frames. In other words, I have two images representing the Gaussian distribution of the background video. Now, I want to threshold an image, not using one fixed threshold value for all pixels, but using the standard deviations from the image (a different threshold per pixel). However, as far as I understand, OpenCV's threshold() function only allows for one fixed threshold. Are there functions I'm missing, or is there a workaround?
A cv::Mat provides methodology to accomplish this.
The setTo() method takes an optional InputArray as a mask.
Assuming the following:
mean and std are your per-pixel background mean and standard deviation cv::Mats (the two images described in the question), in is the cv::Mat you want to threshold, and thresh is the factor for your standard deviations.
Using these values the custom thresholding could be done like this:
// Compute per-pixel threshold values from the background model
cv::Mat mask = mean + (std * thresh); // or mean - (std * thresh), depending on which side you threshold
// Set in.at<>(x,y) to 0 if value is lower than mask.at<>(x,y)
in.setTo(0, in < mask);
The in < mask expression creates a new MatExpr object, a matrix which is 0 at every pixel where the predicate is false, 255 otherwise.
This is a handy way to implement custom thresholding.
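Putting this together with the background model from the question, a minimal untested sketch could look like the following (meanImg, stddevImg and thresholdPerPixel are made-up names, and single-channel CV_32F background images are assumed):
#include <opencv2/opencv.hpp>
// frame is the image to threshold, meanImg/stddevImg are the per-pixel background
// mean and standard deviation, thresh is the number of standard deviations to allow.
cv::Mat thresholdPerPixel(const cv::Mat& frame, const cv::Mat& meanImg,
                          const cv::Mat& stddevImg, double thresh)
{
    cv::Mat in;
    frame.convertTo(in, CV_32F);
    cv::Mat diff;
    cv::absdiff(in, meanImg, diff);     // per-pixel distance from the background mean
    cv::Mat limit = stddevImg * thresh; // per-pixel threshold
    cv::Mat out = in.clone();
    out.setTo(0, diff < limit);         // zero out the pixels that look like background
    return out;
}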

How to align 2 images based on their content with OpenCV

I am totally new to OpenCV and I have started to dive into it. But I'd need a little bit of help.
So I want to combine these 2 images:
I would like the 2 images to match along their edges (ignoring the very right part of the image for now)
Can anyone please point me into the right direction? I have tried using the findTransformECC function. Here's my implementation:
cv::Mat im1 = [imageArray[1] CVMat3];
cv::Mat im2 = [imageArray[0] CVMat3];
// Convert images to gray scale;
cv::Mat im1_gray, im2_gray;
cvtColor(im1, im1_gray, CV_BGR2GRAY);
cvtColor(im2, im2_gray, CV_BGR2GRAY);
// Define the motion model
const int warp_mode = cv::MOTION_AFFINE;
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
cv::Mat warp_matrix;
// Initialize the matrix to identity
if ( warp_mode == cv::MOTION_HOMOGRAPHY )
    warp_matrix = cv::Mat::eye(3, 3, CV_32F);
else
    warp_matrix = cv::Mat::eye(2, 3, CV_32F);
// Specify the number of iterations.
int number_of_iterations = 50;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-10;
// Define termination criteria
cv::TermCriteria criteria (cv::TermCriteria::COUNT+cv::TermCriteria::EPS, number_of_iterations, termination_eps);
// Run the ECC algorithm. The results are stored in warp_matrix.
findTransformECC(
    im1_gray,
    im2_gray,
    warp_matrix,
    warp_mode,
    criteria
);
// Storage for warped image.
cv::Mat im2_aligned;
if (warp_mode != cv::MOTION_HOMOGRAPHY)
    // Use warpAffine for Translation, Euclidean and Affine
    warpAffine(im2, im2_aligned, warp_matrix, im1.size(), cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
else
    // Use warpPerspective for Homography
    warpPerspective(im2, im2_aligned, warp_matrix, im1.size(), cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
UIImage* result = [UIImage imageWithCVMat:im2_aligned];
return result;
I have tried playing around with the termination_eps and number_of_iterations and increased/decreased those values, but they didn't really make a big difference.
So here's the result:
What can I do to improve my result?
EDIT: I have marked the problematic edges with red circles. The goal is to warp the bottom image and make it match with the lines from the image above:
I did a little bit of research and I'm afraid the findTransformECC function won't give me the result I'd like to have :-(
Something important to add:
I actually have an array of those image "stripes", 8 in this case, they all look similar to the images shown here and they all need to be processed to match the line. I have tried experimenting with the stitch function of OpenCV, but the results were horrible.
EDIT:
Here are the 3 source images:
The result should be something like this:
I transformed every image along the lines that should match. Lines that are too far away from each other can be ignored (the shadow and the piece of road on the right portion of the image)
From your images, it seems that they overlap. Since you said the stitch function didn't give you the desired results, implement your own stitching. I'm trying to do something close to that too. Here is a tutorial on how to implement it in C++: https://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
You can use the Hough algorithm with a high threshold on the two images and then compare the vertical lines in both of them - most of them should be shifted a bit, but keep the same angle.
This is what I've got from running this algorithm on one of the pictures:
Filtering out the horizontal lines should be easy (as they are represented as Vec4i), and then you can align the remaining lines with each other.
Here is the example of using it in OpenCV's documentation.
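As a rough illustration of that idea (untested; the Canny and Hough parameters are just guesses to tune for your stripes, and verticalSegments is a made-up name), keeping only the near-vertical segments could look like this:
#include <opencv2/opencv.hpp>
#include <vector>
#include <cmath>
std::vector<cv::Vec4i> verticalSegments(const cv::Mat& gray)
{
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);
    std::vector<cv::Vec4i> lines, vertical;
    cv::HoughLinesP(edges, lines, 1, CV_PI / 180, 100, 50, 10); // high threshold, tune as needed
    for (size_t k = 0; k < lines.size(); ++k)
    {
        const cv::Vec4i& l = lines[k];
        double angle = std::atan2(l[3] - l[1], l[2] - l[0]) * 180.0 / CV_PI;
        if (std::abs(std::abs(angle) - 90.0) < 10.0) // keep roughly vertical segments
            vertical.push_back(l);
    }
    return vertical;
}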
UPDATE: another thought. Aligning the lines can be done with a concept similar to how the cross-correlation function works. It doesn't matter if picture 1 has 10 lines and picture 2 has 100 lines: the shift position with the most lines aligned (which is, mostly, the maximum of the CCF) should be pretty close to the answer, though this might require some tweaking - for example giving a weight to every line based on its length, angle, etc. Computer vision never has a direct way, huh :)
UPDATE 2: I actually wonder if taking the bottom pixel row of the top image as array 1 and the top pixel row of the bottom image as array 2 and running a general CCF over them, then using its maximum as the shift, could work too... But I think it would be a well-known method if it worked well.

OpenCV Face Recognition strange result

I have been using OpenCV's SVM and RF for a multi-class face recognition problem with 11 classes and only 5 images per class. I used two kinds of features - initially a toy intensity-image feature (just each image resized to 32x32 grayscale), and then a second feature that was simply another toy feature using Tan-Triggs preprocessing (link). Here is the feature code:
void Feature::makeFeature(cv::Mat &image, cv::Mat &result)
{
    cv::resize(image, image, cv::Size(32, 32), 0, 0, cv::INTER_CUBIC);
    cv::equalizeHist(image, image);
    // Images must be aligned - Only pitch executed, yaw and roll assumed negligible
    algmt->getAlignedImage(image, image); // image alignment
    // tan triggs
    {
        tan_triggs_preprocessing(image, result);
        result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
    // if plain intensity
    {
        // image.copyTo(result);
        // result.convertTo(result, CV_32F, 1.0f/255.0f);
        // result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
    }
}
Here the tan_triggs_preprocessing function is the same as the Tan-Triggs preprocessing function given in the link. I added one step - I normalized the result between 0 and 1.
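That normalization step is a one-liner with cv::normalize (a minimal sketch, assuming a NORM_MINMAX rescale of the float result):
cv::normalize(result, result, 0.0, 1.0, cv::NORM_MINMAX, CV_32F); // rescale the Tan-Triggs output to [0, 1]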
The test results for both were not very good, as expected, but then I made a silly mistake and discovered something strange: when I accidentally gave the training directory as input for both training and testing, I got 100% accuracy with the plain intensity feature, but the Tan-Triggs feature gave the following result:
SVM Training Complete
Total number of correct: 51 and accuracy: 92.7273
RF Training Complete
Total number of correct: 53 and accuracy: 96.3636
I do know that however much you overfit, the result should be perfect when the training set is used as the test input. Everything else is standard; both SVM and RF are set up as in the OpenCV examples. Besides, I get 100% with the plain intensity feature, so of course I am mucking something up here when using Tan-Triggs. Does anyone have any idea what mistake I am making?
I have used other complex features like LTPs and LQPs without issue, but this preprocessing method is something I want to use. I use the Jain-Learned-Miller congealing algorithm for alignment, as I assume frontal faces for recognition, with no pose correction.

using OpenCV and SVM with images

I am having difficulty with reading an image, extracting features for training, and testing on new images in OpenCV using SVMs. Can someone please point me to a good link? I have looked at the OpenCV Introduction to Support Vector Machines, but it doesn't help with reading in images, and I am not sure how to incorporate it.
My goal is to classify pixels in an image. These pixels would belong to curves. I understand forming the training matrix (for instance,
image A
1,1 1,2 1,3 1,4 1,5
2,1 2,2 2,3 2,4 2,5
3,1 3,2 3,3 3,4 3,5
I would form my training matrix as a [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }
However, I am a little confused about the labels. From my understanding, I have to specify which row (image) in the training matrix corresponds to a curve or a non-curve. But how can I label a training matrix row (image) if some pixels belong to the curve and some do not? For example, my training matrix is [3][2]={ {1,1} {1,2} {1,3} {1,4} {1,5} {2,1} ..{} }, where pixels {1,1} and {1,4} belong to the curve but the rest do not.
I've had to deal with this recently, and here's what I ended up doing to get SVM to work for images.
To train your SVM on a set of images, first you have to construct the training matrix for the SVM. This matrix is specified as follows: each row of the matrix corresponds to one image, and each element in that row corresponds to one feature of the class -- in this case, the color of the pixel at a certain point. Since your images are 2D, you will need to convert them to a 1D matrix. The length of each row will be the area of the images (note that the images must be the same size).
Let's say you wanted to train the SVM on 5 different images, and each image was 4x3 pixels. First you would have to initialize the training matrix. The number of rows in the matrix would be 5, and the number of columns would be the area of the image, 4*3 = 12.
int num_files = 5;
int img_area = 4*3;
Mat training_mat(num_files,img_area,CV_32FC1);
Ideally, num_files and img_area wouldn't be hardcoded, but obtained from looping through a directory and counting the number of images and taking the actual area of an image.
The next step is to "fill in" the rows of training_mat with the data from each image. Below is an example of how this mapping would work for one row.
I've numbered each element of the image matrix with where it should go in the corresponding row in the training matrix. For example, if that were the third image, this would be the third row in the training matrix.
You would have to loop through each image and set the value in the output matrix accordingly. Here's an example for multiple images:
As for how you would do this in code, you could use reshape(), but I've had issues with that due to matrices not being continuous. In my experience I've done something like this:
Mat img_mat = imread(imgname, 0); // I used 0 for greyscale
int ii = 0; // Current column in training_mat
for (int i = 0; i < img_mat.rows; i++) {
    for (int j = 0; j < img_mat.cols; j++) {
        training_mat.at<float>(file_num, ii++) = img_mat.at<uchar>(i, j);
    }
}
Do this for every training image (remembering to increment file_num). After this, you should have your training matrix set up properly to pass into the SVM functions. The rest of the steps should be very similar to examples online.
Note that while doing this, you also have to set up labels for each training image. So for example if you were classifying eyes and non-eyes based on images, you would need to specify which row in the training matrix corresponds to an eye and a non-eye. This is specified as a 1D matrix, where each element in the 1D matrix corresponds to each row in the 2D matrix. Pick values for each class (e.g., -1 for non-eye and 1 for eye) and set them in the labels matrix.
Mat labels(num_files,1,CV_32FC1);
So if the 3rd element in this labels matrix were -1, it means the 3rd row in the training matrix is in the "non-eye" class. You can set these values in the loop where you evaluate each image. One thing you could do is to sort the training data into separate directories for each class, and loop through the images in each directory, and set the labels based on the directory.
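For illustration, a hypothetical loop that fills the labels alongside the training rows could look like this, assuming eye_files and non_eye_files are std::vector<std::string> lists of the images for each class and fill_row() wraps the pixel-copying loop shown above (all three names are made up here):
int file_num = 0;
for (size_t k = 0; k < eye_files.size(); ++k, ++file_num) {
    fill_row(training_mat, file_num, eye_files[k]);     // copy this image's pixels into row file_num
    labels.at<float>(file_num, 0) = 1.0f;               // class "eye"
}
for (size_t k = 0; k < non_eye_files.size(); ++k, ++file_num) {
    fill_row(training_mat, file_num, non_eye_files[k]);
    labels.at<float>(file_num, 0) = -1.0f;              // class "non-eye"
}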
The next thing to do is set up your SVM parameters. These values will vary based on your project, but basically you would declare a CvSVMParams object and set the values:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::POLY;
params.gamma = 3;
// ...etc
There are several examples online on how to set these parameters, like in the link you posted in the question.
Next, you create a CvSVM object and train it based on your data!
CvSVM svm;
svm.train(training_mat, labels, Mat(), Mat(), params);
Depending on how much data you have, this could take a long time. After it's done training, however, you can save the trained SVM so you don't have to retrain it every time.
svm.save("svm_filename"); // saving
svm.load("svm_filename"); // loading
To test your images using the trained SVM, simply read an image, convert it to a 1D matrix, and pass that in to svm.predict():
svm.predict(img_mat_1d);
It will return a value based on what you set as your labels (e.g., -1 or 1, based on my eye/non-eye example above). Alternatively, if you want to test more than one image at a time, you can create a matrix that has the same format as the training matrix defined earlier and pass that in as the argument. The return value will be different, though.
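For instance, one simple way to handle several test images is to predict row by row on a matrix built the same way as training_mat (test_mat here is a made-up name):
Mat predictions(test_mat.rows, 1, CV_32FC1);
for (int r = 0; r < test_mat.rows; ++r) {
    predictions.at<float>(r, 0) = svm.predict(test_mat.row(r)); // -1 or 1 per image in the eye example
}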
Good luck!