Problem:
Since I didn't find an implementation of LBP in the core OpenCV library, I added the face module (which does have an LBP implementation) to my OpenCV build.
I'm trying to do texture analysis using local binary patterns (LBP) to determine what is road, what are road markings, and what is neither. So I divided my problem into 3 labels: road / lines / the rest.
After that I took 3 random sample images and made sub-images for each of these labels. In other words, I selected the road, the lines, and everything else, and used these as input for training.
Training:
for (int i = 0; i < names.size(); i++) {
    Mat trainingData = imread(names[i]), output;
    if (!trainingData.empty()) {
        cvtColor(trainingData, output, CV_RGB2GRAY);
        // the label is the digit after the last '_' in the filename
        int pos = names[i].find_last_of('_');
        cv::String l = names[i].substr(pos + 1, 1);
        const char *sub = l.c_str();
        int label = atoi(sub);
        grayscaleImages.push_back(output);
        labels.push_back(label);
    }
}
Ptr<LBPHFaceRecognizer> lbp = createLBPHFaceRecognizer(2, 4, 10, 10); // radius = 2, neighbors = 4, 10x10 grid
lbp->train(grayscaleImages, labels);
Using LBP to predict:
for (int i = 0; i < dataset.images.size(); i++) {
    Mat image = (dataset.images[i]).clone();
    Mat original = image.clone();
    for (int i = 0; i < image.size().width; i += squareSize) {
        for (int j = image.size().height / 2; j < image.size().height; j += squareSize) {
            Mat subImage = image(Range(j, j + squareSize), Range(i, i + squareSize));
            subImage = applySobel(subImage);
            int predict = lbp->predict(subImage);
        }
    }
    imshow("Image", original);
    waitKey(0);
}
My question is: why isn't this working? I'm also puzzled about the prediction; it doesn't seem to be anywhere near what I want it to do.
Do I have to implement LBP myself? Does the LBP implementation of the face module do something special?
Do you have any suggestions for other texture analysis methods, specifically for the road / no-road case?
So here is a general overview of the files I'm using.
Input:
Selecting our training data:
Results of selecting: (every file has a _number to specify the label)
Final result after training on a random image of the same set. Purple represents road, green is line (or road marking).
Related
I am playing with OpenCV and SVM to build a classifier to predict facial expressions. I have no problem classifying the test dataset, but when I try to predict a new image, I get this:
OpenCV Error: Assertion failed (samples.cols == var_count && samples.type() == CV_32F) in cv::ml::SVMImpl::predict
The error is pretty clear: I have a different number of columns (but the same type).
I do not know how to fix that, because I have a matrix of dimensions 1 x number_of_features, but number_of_features is not the same as in the trained and tested samples. How can I extract the same number of features from another image? Am I missing something?
To train the classifier I did:
Detect face and save the ROI;
SIFT to extract features;
k-means to cluster them;
bag of words to get the same number of features for each image;
PCA to reduce dimensionality;
train on the train dataset;
predict on the test dataset;
On the new image I did the same thing.
I tried to resize the new image to the same size, but nothing changed: same error (and a different number of columns, i.e. features). The vectors are of the same type (CV_32F).
[EDIT 1] Let's try to be more specific.
After successfully training my classifier, I save the SVM model in this way:
svmClassifier->save(baseDatabasePath);
Then I load it when I need to do real-time prediction, in this way:
cv::Ptr<cv::ml::SVM> svmClassifier;
svmClassifier = cv::ml::StatModel::load<ml::SVM>(path);
Then, in a loop:
while (true)
{
    getOneImage();
    cv::Mat feature = extractFeaturesFromSingleImage();
    float labelPredicted = svmClassifier->predict(feature);
    cout << "Label predicted is: " << labelPredicted << endl;
}
But predict returns the error. The feature dimension is 1x66, for example. As you can see below, I need 140 features:
<?xml version="1.0"?>
<opencv_storage>
<opencv_ml_svm>
<format>3</format>
<svmType>C_SVC</svmType>
<kernel>
<type>RBF</type>
<gamma>5.0625000000000009e-01</gamma></kernel>
<C>1.2500000000000000e+01</C>
<term_criteria><epsilon>1.1920928955078125e-07</epsilon>
<iterations>1000</iterations></term_criteria>
<var_count>140</var_count>
<class_count>7</class_count>
<class_labels type_id="opencv-matrix">
<rows>7</rows>
<cols>1</cols>
<dt>i</dt>
<data>
0 1 2 3 4 5 6</data></class_labels>
<sv_total>172</sv_total>
I do not know how to achieve 140 features, when SIFT, FAST or SURF just give me around 60. What am I missing?
EDIT 2: I am going to try to be more formal: how can I put my real-time sample on the same dimension as the train and test datasets?
EDIT 3:
Extract features with SIFT and push them onto a vector of Mat:
std::vector<cv::Mat> featuresVector;
for (int i = 0; i < numberImages; ++i)
{
    cv::Mat face = cv::imread(facePath, CV_LOAD_IMAGE_GRAYSCALE);
    cv::Mat featuresExtracted = runExtractFeature(face, featuresExtractionAlgorithm);
    featuresVector.push_back(featuresExtracted);
}
Get the total number of features extracted from all images:
int numberFeatures = 0;
for (int i = 0; i < featuresVector.size(); ++i)
{
    numberFeatures += featuresVector[i].rows;
}
Prepare a Mat to cluster the features (I tried to follow this example):
cv::Mat featuresData = cv::Mat::zeros(numberFeatures, featuresVector[0].cols, CV_32FC1);
int currentIndex = 0;
for (int i = 0; i < featuresVector.size(); ++i)
{
    featuresVector[i].copyTo(featuresData.rowRange(currentIndex, currentIndex + featuresVector[i].rows));
    currentIndex += featuresVector[i].rows;
}
Perform clustering (I do not know whether these parameters suit my case, but I think they are OK for now):
cv::Mat labels;
cv::Mat centers;
int binSize = 1000;
kmeans(featuresData, binSize, labels, cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 100, 1.0), 3, KMEANS_PP_CENTERS, centers);
Prepare a Mat to perform BoW (bag of words):
cv::Mat featuresDataHist = cv::Mat::zeros(numberImages, binSize, CV_32FC1);
currentIndex = 0; // reset the running index into 'labels' before building the histograms
for (int i = 0; i < numberImages; ++i)
{
    cv::Mat feature = cv::Mat::zeros(1, binSize, CV_32FC1);
    int numberImageFeatures = featuresVector[i].rows;
    for (int j = 0; j < numberImageFeatures; ++j)
    {
        int bin = labels.at<int>(currentIndex + j);
        feature.at<float>(0, bin) += 1;
    }
    cv::normalize(feature, feature);
    feature.copyTo(featuresDataHist.row(i));
    currentIndex += featuresVector[i].rows;
}
PCA to try to reduce the dimensionality:
cv::PCA pca(featuresDataHist, cv::Mat(), CV_PCA_DATA_AS_ROW, 50/*0.90*/);
cv::Mat feature;
for (int i = 0; i < numberImages; ++i)
{
    feature = pca.project(featuresDataHist.row(i));
}
I am trying to copy certain rows from the src image to a new image called gaps. The gaps image will contain only a few rows. However, the program crashes at the line with copyTo. The Mat src image is correct; it contains my image, because I can view it with imshow().
Mat gaps;
int gap = 6;
for (int r = 0; r < src.rows; r++)
{
    if ( r % gap == 0 )
        src.row(r).copyTo(gaps.row(r));
}
imshow("gaps", gaps);
waitKey(0);
I am using OpenCV, Visual Studio 2010 C++ on Windows XP.
I tried adding this to specify the depth and dimensions, but it still crashes:
gaps.create(CV_8UC3, 2056, 2056);
Try this:
// if you want your background to be black --> Scalar(0,0,0)
Mat gaps = Mat(src.size(), src.type(), Scalar(0,0,0));
This is what you'll get; I don't know if that is what you expect/want.
Code
// set your input image
Mat src = imread("{path to input image}");
// your code with the change I proposed
Mat gaps = Mat(src.size(), src.type(), Scalar(0,0,0));
int gap = 6;
for (int r = 0; r < src.rows; r++) {
    if ( r % gap == 0 )
        src.row(r).copyTo(gaps.row(r));
}
imshow("gaps", gaps);
// create the result image (original and gaps side by side)
Mat result = Mat(Size(src.cols * 2, src.rows), src.type(), Scalar(0,0,0));
src.copyTo(result(Rect(0, 0, src.cols, src.rows)));
gaps.copyTo(result(Rect(src.cols, 0, src.cols, src.rows)));
imshow("result", result);
waitKey();
I am looking for a way to take an image and get masks of all objects in it by color. My goal is to be able to separate similarly colored objects into layers so I can further examine each layer. The plan is to use each mask against the original image to create a histogram of the colors in each object and determine the similarity with other objects in the image. If something is similar enough it will be combined with other objects to form a layer.
The problem is that I cannot find a function in OpenCV to find all objects in an image based on color contiguity. I am sure such an algorithm exists, but it seems to be evading me. Does anyone know of an algorithm or function like this?
The best method that I have found is k-means clustering. It separates the image into different layers by clustering pixel colors. With this I am able to effectively split the image into several layers of similar color.
#define numClusters 7
cv::Mat src = cv::imread("img0.png");
cv::Mat kMeansSrc(src.rows * src.cols, 3, CV_32F);
//resize the image to src.rows*src.cols x 3
//cv::kmeans expects an image that is in rows with 3 channel columns
//this rearranges the image into (rows * columns, numChannels)
for( int y = 0; y < src.rows; y++ )
{
    for( int x = 0; x < src.cols; x++ )
    {
        for( int z = 0; z < 3; z++)
            kMeansSrc.at<float>(y + x*src.rows, z) = src.at<Vec3b>(y,x)[z];
    }
}
cv::Mat labels;
cv::Mat centers;
int attempts = 2;
//perform kmeans on kMeansSrc where numClusters is defined previously as 7
//end either when desired accuracy is met or the maximum number of iterations is reached
cv::kmeans(kMeansSrc, numClusters, labels, cv::TermCriteria( CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 8, 1), attempts, KMEANS_PP_CENTERS, centers );
//create an array of numClusters colors
int colors[numClusters];
for(int i = 0; i < numClusters; i++) {
    colors[i] = 255/(i+1);
}
std::vector<cv::Mat> layers;
for(int i = 0; i < numClusters; i++)
{
    layers.push_back(cv::Mat::zeros(src.rows, src.cols, CV_32F));
}
//use the labels to draw the layers
//using the array of colors, draw the pixels onto each label image
for( int y = 0; y < src.rows; y++ )
{
    for( int x = 0; x < src.cols; x++ )
    {
        int cluster_idx = labels.at<int>(y + x*src.rows, 0);
        layers[cluster_idx].at<float>(y, x) = (float)(colors[cluster_idx]);
    }
}
std::vector<cv::Mat> srcLayers;
//each layer to mask a portion of the original image
//this leaves us with sections of similar color from the original image
for(int i = 0; i < numClusters; i++)
{
    layers[i].convertTo(layers[i], CV_8UC1);
    srcLayers.push_back(cv::Mat());
    src.copyTo(srcLayers[i], layers[i]);
}
I suggest you convert the image to HSV space (Hue-Saturation-Value). Then make a histogram based on the hue value to find thresholds online, or define them beforehand (depending on whether this is a general problem or a specific one).
Create one-channel images for each layer you want to form (set them to black).
Then use the HSV image and mark a layer based on the threshold values. You might want to add some constant thresholds for value and saturation too (to avoid dark and light areas).
Does this make sense to you?
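A minimal sketch of that idea. The hue intervals and the saturation/value limits below are made-up values you would tune for your own images; only the overall structure (HSV conversion, one mask per layer via inRange) is the point:

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat src = cv::imread("input.png");
    cv::Mat hsv;
    cv::cvtColor(src, hsv, CV_BGR2HSV);

    // hypothetical hue intervals, one per layer (OpenCV hue range is 0..180)
    const int hueRanges[][2] = { {0, 20}, {20, 60}, {60, 120}, {120, 180} };
    const int numLayers = 4;

    std::vector<cv::Mat> masks;
    for (int i = 0; i < numLayers; i++) {
        cv::Mat mask;
        // constant saturation/value thresholds to skip very dark or washed-out pixels
        cv::inRange(hsv,
                    cv::Scalar(hueRanges[i][0], 50, 50),
                    cv::Scalar(hueRanges[i][1], 255, 255),
                    mask);
        masks.push_back(mask);

        // apply the mask to the original image to get the layer
        cv::Mat layer;
        src.copyTo(layer, mask);
        cv::imshow(cv::format("layer %d", i), layer);
    }
    cv::waitKey(0);
    return 0;
}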
I think that you should proceed with the following process (a rough sketch follows the list):
Smooth your image if it has too much detail.
Find edges.
Find all contours.
Try to find the color of each contour. Let's say you want to keep all contours which are red; so keep only those contours that are red.
Once you have found the contours you want to keep, create a mask image based on those contours.
Using the mask image, extract the required objects from the original image.
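A minimal sketch of these steps. The blur size, Canny thresholds and the "red enough" test on each contour's mean color are made-up values chosen only to illustrate the flow:

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat src = cv::imread("input.png");

    // 1. smooth to suppress fine detail
    cv::Mat blurred;
    cv::GaussianBlur(src, blurred, cv::Size(5, 5), 0);

    // 2. find edges
    cv::Mat gray, edges;
    cv::cvtColor(blurred, gray, CV_BGR2GRAY);
    cv::Canny(gray, edges, 50, 150);

    // 3. find all contours
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // 4+5. keep contours whose mean color looks red and draw them into a mask
    cv::Mat mask = cv::Mat::zeros(src.size(), CV_8UC1);
    for (size_t i = 0; i < contours.size(); i++) {
        cv::Mat contourMask = cv::Mat::zeros(src.size(), CV_8UC1);
        cv::drawContours(contourMask, contours, (int)i, cv::Scalar(255), CV_FILLED);
        cv::Scalar mean = cv::mean(src, contourMask);   // BGR order
        if (mean[2] > 100 && mean[2] > mean[0] && mean[2] > mean[1])
            cv::drawContours(mask, contours, (int)i, cv::Scalar(255), CV_FILLED);
    }

    // 6. extract the objects from the original image
    cv::Mat extracted;
    src.copyTo(extracted, mask);
    cv::imshow("extracted", extracted);
    cv::waitKey(0);
    return 0;
}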
I have been using OpenCV for quite some time. Lately I decided to check its power for machine learning, so I ended up implementing a neural network for face recognition. To summarize my strategy for face recognition:
Read images from a CSV of some face database.
Roll the images into a Mat, row-wise.
Apply PCA for dimensionality reduction.
Use the PCA projections to train the network.
Predict the test data using the trained network.
So everything was OK until the prediction stage. I was using the maximally responding output unit to classify the face. Normally OpenCV's sigmoid implementation should give values in the range -1 to 1, as stated in the docs, where 1 means the closest match to the class. After I got nearly 0 accuracy, I checked the output responses for each class for each test sample. I was surprised by the values: 14.53, -1.7, #IND. If the sigmoid was applied, how could I get these values? Where am I going wrong?
To help you understand the matter, and for those wondering how to apply PCA and use it with an NN, I'm sharing my code.
Reading the CSV:
void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';')
{
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file)
    {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(1, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line))
    {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty())
        {
            Mat im = imread(path, 0);
            images.push_back(im);
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}
Rolling the images row by row:
Mat rollVectortoMat(const vector<Mat> &data)
{
    Mat dst(static_cast<int>(data.size()), data[0].rows*data[0].cols, CV_32FC1);
    for(unsigned int i = 0; i < data.size(); i++)
    {
        Mat image_row = data[i].clone().reshape(1,1);
        Mat row_i = dst.row(i);
        image_row.convertTo(row_i, CV_32FC1, 1/255.);
    }
    return dst;
}
Converting the vector of labels to a Mat of labels:
Mat getLabels(const vector<int> &data, int classes = 20)
{
    // start from all zeros, then set a 1 in the column of the true class
    Mat labels = Mat::zeros((int)data.size(), classes, CV_32FC1);
    for(int i = 0; i < data.size(); i++)
    {
        int cls = data[i] - 1;
        labels.at<float>(i, cls) = 1.0;
    }
    return labels;
}
MAIN
int main()
{
    PCA pca;

    vector<Mat> images_train;
    vector<Mat> images_test;
    vector<int> labels_train;
    vector<int> labels_test;

    read_csv("train1k.txt", images_train, labels_train);
    read_csv("test1k.txt", images_test, labels_test);

    Mat rawTrainData = rollVectortoMat(images_train);
    Mat rawTestData = rollVectortoMat(images_test);
    Mat trainLabels = getLabels(labels_train);
    Mat testLabels = getLabels(labels_test);

    int pca_size = 500;
    Mat trainData(rawTrainData.rows, pca_size, rawTrainData.type());
    Mat testData(rawTestData.rows, pca_size, rawTestData.type());

    pca(rawTrainData, Mat(), CV_PCA_DATA_AS_ROW, pca_size);

    for(int i = 0; i < rawTrainData.rows; i++)
        pca.project(rawTrainData.row(i), trainData.row(i));
    for(int i = 0; i < rawTestData.rows; i++)
        pca.project(rawTestData.row(i), testData.row(i));

    Mat layers = Mat(3, 1, CV_32SC1);
    int sz = trainData.cols;
    layers.row(0) = Scalar(sz);
    layers.row(1) = Scalar(1000);
    layers.row(2) = Scalar(20);

    CvANN_MLP mlp;
    CvANN_MLP_TrainParams params;
    CvTermCriteria criteria;

    criteria.max_iter = 1000;
    criteria.epsilon = 0.00001f;
    criteria.type = CV_TERMCRIT_ITER | CV_TERMCRIT_EPS;

    params.train_method = CvANN_MLP_TrainParams::BACKPROP;
    params.bp_dw_scale = 0.1f;
    params.bp_moment_scale = 0.1f;
    params.term_crit = criteria;

    mlp.create(layers, CvANN_MLP::SIGMOID_SYM);
    int i = mlp.train(trainData, trainLabels, Mat(), Mat(), params);

    int t = 0, f = 0;
    for(int i = 0; i < testData.rows; i++)
    {
        Mat response(1, 20, CV_32FC1);
        Mat sample = testData.row(i);
        mlp.predict(sample, response);

        float max = -1000000000000.0f;
        int cls = -1;
        for(int j = 0; j < 20; j++)
        {
            float value = response.at<float>(0, j);
            if(value > max)
            {
                max = value;
                cls = j + 1;
            }
        }
        if(cls == labels_test[i])
            t++;
        else
            f++;
    }
    return 0;
}
NOTE: I used the first 20 classes of the AT&T dataset.
Thanks to Canberk Baci's comment I managed to overcome the sigmoid output discrepancy. The problem seems to be the default parameters of mlp's create function, which takes alpha and beta as 0 by default. When both are given as 1, the sigmoid function works as stated in the docs and the neural network can predict something, with errors of course.
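In the MAIN above, this fix is a one-line change to the create call (the rest of the setup stays the same):

// pass alpha and beta explicitly so the symmetric sigmoid behaves as documented
mlp.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);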
And the results of the neural network:
By modifying some parameters (momentum etc.), and without any illumination correction algorithm, I got 72% accuracy on a dataset (randomly sampled 936 train, 262 test images) of the first 20 classes of CroppedYaleB from the OpenCV tutorials. As for other factors to increase accuracy: when I applied PCA, I directly gave the reduced dimension size as 500. This may also reduce accuracy, because the retained variance may be below 95% or worse. So when I have free time I will apply these to increase accuracy:
Tan Triggs illumination correction
Train PCA with 0.95 as the PCA size, to retain 95% of the variance (a sketch is below).
Modify the neural network parameters (I wish we had a less parametric NN in the OpenCV library).
I shared these for anyone wondering how to increase the classification accuracy of an NN. I hope it helps.
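For the second point, a minimal sketch of what that would look like, assuming the PCA constructor overload that takes a retained-variance fraction (available in later OpenCV 2.4 releases) instead of the fixed pca_size used above:

// keep enough principal components to retain ~95% of the variance
PCA pca(rawTrainData, Mat(), CV_PCA_DATA_AS_ROW, 0.95);
Mat trainData = pca.project(rawTrainData); // each row is a projected sample
Mat testData = pca.project(rawTestData);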
By the way you can track the issue about this here:
http://code.opencv.org/issues/3583
I am trying to use the OpenCV EM algorithm to do color extraction. I am using the following code, based on the example in the OpenCV documentation:
cv::Mat capturedFrame ( height, width, CV_8UC3 );
int i, j;
int nsamples = 1000;
cv::Mat samples ( nsamples, 2, CV_32FC1 );
cv::Mat labels;
cv::Mat img = cv::Mat::zeros ( height, height, CV_8UC3 );
img = capturedFrame;
cv::Mat sample ( 1, 2, CV_32FC1 );
CvEM em_model;
CvEMParams params;
samples = samples.reshape ( 2, 0 );
for ( i = 0; i < N; i++ )
{
    // from the training samples
    cv::Mat samples_part = samples.rowRange ( i*nsamples/N, (i+1)*nsamples/N );
    cv::Scalar mean (((i%N)+1)*img.rows/(N1+1), ((i/N1)+1)*img.rows/(N1+1));
    cv::Scalar sigma (30,30);
    cv::randn(samples_part, mean, sigma);
}
samples = samples.reshape ( 1, 0 );
//initialize model parameters
params.covs = NULL;
params.means = NULL;
params.weights = NULL;
params.probs = NULL;
params.nclusters = N;
params.cov_mat_type = CvEM::COV_MAT_SPHERICAL;
params.start_step = CvEM::START_AUTO_STEP;
params.term_crit.max_iter = 300;
params.term_crit.epsilon = 0.1;
params.term_crit.type = CV_TERMCRIT_ITER|CV_TERMCRIT_EPS;
//cluster the data
em_model.train ( samples, Mat(), params, &labels );
cv::Mat probs;
probs = em_model.getProbs();
cv::Mat weights;
weights = em_model.getWeights();
cv::Mat modelIndex = cv::Mat::zeros ( img.rows, img.cols, CV_8UC3 );
for ( i = 0; i < img.rows; i++ )
{
    for ( j = 0; j < img.cols; j++ )
    {
        sample.at<float>(0) = (float)j;
        sample.at<float>(1) = (float)i;
        int response = cvRound ( em_model.predict ( sample ) );
        modelIndex.data [ modelIndex.cols*i + j ] = response;
    }
}
My questions here are:
Firstly, I want to extract each model (five in total here), then store the corresponding pixel values in five different matrices. That way I would have five different colors separately. Here I only obtained their indexes; is there any way to obtain their corresponding colors? To make it easy, I could start by finding the dominant color based on these five GMM components.
Secondly, here my sample data points number only 100, and it takes nearly 3 seconds for them. But I want to do all of this in no more than 30 milliseconds. I know OpenCV's background extraction, which uses GMM, performs really fast (below 20 ms), so there must be a way for me to do all of this within 30 ms for all 600x800 = 480,000 pixels. I found that the predict function is the most time-consuming one.
First Question:
In order to do color extraction you first need to train the EM with your input pixels. After that you simply loop over all the input pixels again and use predict() to classify each of them. I've attached a small example that uses EM for foreground/background separation based on colors. It shows you how to extract the dominant color (mean) of each Gaussian and how to access the original pixel color.
#include <opencv2/opencv.hpp>
int main(int argc, char** argv) {
cv::Mat source = cv::imread("test.jpg");
//ouput images
cv::Mat meanImg(source.rows, source.cols, CV_32FC3);
cv::Mat fgImg(source.rows, source.cols, CV_8UC3);
cv::Mat bgImg(source.rows, source.cols, CV_8UC3);
//convert the input image to float
cv::Mat floatSource;
source.convertTo(floatSource, CV_32F);
//now convert the float image to column vector
cv::Mat samples(source.rows * source.cols, 3, CV_32FC1);
int idx = 0;
for (int y = 0; y < source.rows; y++) {
    cv::Vec3f* row = floatSource.ptr<cv::Vec3f > (y);
    for (int x = 0; x < source.cols; x++) {
        samples.at<cv::Vec3f > (idx++, 0) = row[x];
    }
}
//we need just 2 clusters
cv::EMParams params(2);
cv::ExpectationMaximization em(samples, cv::Mat(), params);
//the two dominating colors
cv::Mat means = em.getMeans();
//the weights of the two dominant colors
cv::Mat weights = em.getWeights();
//we define the foreground as the dominant color with the largest weight
const int fgId = weights.at<float>(0) > weights.at<float>(1) ? 0 : 1;
//now classify each of the source pixels
idx = 0;
for (int y = 0; y < source.rows; y++) {
    for (int x = 0; x < source.cols; x++) {
        //classify
        const int result = cvRound(em.predict(samples.row(idx++), NULL));
        //get the according mean (dominant color)
        const double* ps = means.ptr<double>(result, 0);
        //set the according mean value to the mean image
        float* pd = meanImg.ptr<float>(y, x);
        //float images need to be in [0..1] range
        pd[0] = ps[0] / 255.0;
        pd[1] = ps[1] / 255.0;
        pd[2] = ps[2] / 255.0;
        //set either foreground or background
        if (result == fgId) {
            fgImg.at<cv::Point3_<uchar> >(y, x, 0) = source.at<cv::Point3_<uchar> >(y, x, 0);
        } else {
            bgImg.at<cv::Point3_<uchar> >(y, x, 0) = source.at<cv::Point3_<uchar> >(y, x, 0);
        }
    }
}
cv::imshow("Means", meanImg);
cv::imshow("Foreground", fgImg);
cv::imshow("Background", bgImg);
cv::waitKey(0);
return 0;
}
I've tested the code with the following image and it performs quite well.
Second Question:
I've noticed that the maximum number of clusters has a huge impact on the performance. So it's better to set this to a very conservative value instead of leaving it empty or setting it to the number of samples, as in your example. Furthermore, the documentation mentions an iterative procedure to repeatedly optimize the model with less constrained parameters. Maybe this gives you some speed-up. To read more, please have a look at the docs inside the sample code that is provided for train() here.
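For reference, that iterative procedure looks roughly like this in the legacy CvEM API (a sketch adapted from the em sample in the OpenCV docs; it reuses the samples, params and labels variables from your code and hands the first model's estimates to a second, less constrained run):

// first pass: strongly constrained model (spherical covariances)
CvEM em_model;
params.cov_mat_type = CvEM::COV_MAT_SPHERICAL;
params.start_step = CvEM::START_AUTO_STEP;
em_model.train( samples, Mat(), params, &labels );

// second pass: reuse the estimates as the starting point of a
// less constrained (diagonal covariance) optimization
CvEM em_model2;
params.cov_mat_type = CvEM::COV_MAT_DIAGONAL;
params.start_step = CvEM::START_E_STEP;
params.means = em_model.get_means();
params.covs = (const CvMat**)em_model.get_covs();
params.weights = em_model.get_weights();
em_model2.train( samples, Mat(), params, &labels );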