I'm trying to create a simple OCR application with SVM, OpenCV, C++ and Visual Studio 2008 (MFC app).
My training samples are binary images of machine-printed digits (0-9). I want to use a DAGSVM for this multi-class problem, so I need 45 SVMs (10·9/2 pairwise classifiers), one for each pair of classes: SVM(0,1), SVM(0,2), ..., SVM(8,9).
Here's how things are going:
SVM's parameters:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
The data for the training images of class i is stored in the matrix trainData[i] (each row holds the pixels of a 28x28 image, so the matrix has 784 columns).
When training each SVM, I create two matrices called curTrainData and curTrainLabel.
for (int i = 0; i < 9; i++)
    for (int j = i+1; j < 10; j++)
    {
        curTrainData.create(trainData[i].rows + trainData[j].rows, 784, CV_32FC1);
        curTrainLabel.create(curTrainData.rows, 1, CV_32FC1);
        // merge the two matrices trainData[i] & trainData[j]
        for (int k = 0; k < trainData[i].rows; k++)
        {
            curTrainLabel.at<float>(k, 0) = 1.0; // class of digit i
            for (int l = 0; l < 784; l++)
                curTrainData.at<float>(k, l) = trainData[i].at<float>(k, l);
        }
        for (int k = 0; k < trainData[j].rows; k++)
        {
            curTrainLabel.at<float>(k + trainData[i].rows, 0) = -1.0; // class of digit j
            for (int l = 0; l < 784; l++)
                curTrainData.at<float>(k + trainData[i].rows, l) = trainData[j].at<float>(k, l);
        }
        svms[i][j].train(curTrainData, curTrainLabel, Mat(), Mat(), params);
    }
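For reference, the same merge can also be written with OpenCV block operations instead of element-by-element copies; an untested sketch with the same trainData layout:

curTrainData.create(trainData[i].rows + trainData[j].rows, 784, CV_32FC1);
curTrainLabel.create(curTrainData.rows, 1, CV_32FC1);
trainData[i].copyTo(curTrainData.rowRange(0, trainData[i].rows));                 // digit-i rows on top
trainData[j].copyTo(curTrainData.rowRange(trainData[i].rows, curTrainData.rows)); // digit-j rows below
curTrainLabel.rowRange(0, trainData[i].rows).setTo(1.0f);                         // label +1 for digit i
curTrainLabel.rowRange(trainData[i].rows, curTrainLabel.rows).setTo(-1.0f);       // label -1 for digit j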
I get an error at the call to svms[i][j].train(...). The full error is:
Unhandled exception at 0x75b5d36f in svm.exe: Microsoft C++ exception: cv::Exception at memory location 0x0022af8c.
To tell the truth, I don't fully understand the SVM implementation in OpenCV, and I can't find any examples of it working with image data.
I'd be really grateful if someone could tell me what is (are) wrong :(
Update 09/03:
I was mistaken. The error actually comes from:
str.Format(_T("Results\trained_%d_%d.xml"), i, j);
svms[i][j].save(CT2A(str));
str is a CString variable.
It remains even if I change it to:
svms[i][j].save("Results\trained.xml");
I've created the folder Results, and other files are written into it fine (files from fopen(), imwrite(), ...). I don't know why I can't include the folder in the path when it comes to this save method of SVM.
If you use a backslash "\", you have to put "\\" instead (or you can use a forward slash "/"). In the C++ string literal "Results\trained.xml", the \t is interpreted as a tab character, so the path passed to save() is not the one you intended.
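The corrected versions of the lines above would be:

str.Format(_T("Results\\trained_%d_%d.xml"), i, j); // escaped backslash
svms[i][j].save(CT2A(str));
// or, equivalently, with a forward slash:
svms[i][j].save("Results/trained.xml");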
I want to create a 1D plot from an image and then determine the maxima and their distances to each other in C++.
I am looking for some tips on how I could approach this.
I load the image as a cv::Mat. I have searched in OpenCV, but only found the histogram function, which is not what I want. I want to get a cross-section of the image, from left to right.
Does anyone have an idea?
Well, I have the following picture:
From this I want to create a 1D plot like in the following picture (I created the plot in ImageJ).
Here you can see the maxima (I could refine them with "smooth").
I want to determine the positions of these maxima and then the distances between them.
I have to get to the 1D plot somehow. I suppose I can find the maxima with a derivative?
++++++++++ UPDATE ++++++++++
Now I wrote this to get a 1D plot:
#include <cstdint>
#include <fstream>
#include <vector>
#include <opencv2/opencv.hpp>

cv::Mat img = cv::imread(imgFile.toStdString(), cv::IMREAD_ANYDEPTH | cv::IMREAD_COLOR);
cv::cvtColor(img, img, cv::COLOR_BGR2GRAY);

uint8_t* data = img.data;
int width  = img.cols;
int height = img.rows;
int stride = img.step;

// sum each column of the image into one bin of the profile
std::vector<double> vPlot(width, 0);
for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        uint8_t val = data[i * stride + j];
        vPlot[j] = vPlot[j] + val;
    }
}

std::ofstream file;
file.open("path\\plot.csv");
for (std::size_t i = 0; i < vPlot.size(); i++) {
    file << vPlot[i];
    file << ";";
}
file.close();
When I plot this in Excel I get this:
That doesn't look as smooth as in ImageJ. Did I do something wrong?
I need it like in the plot from ImageJ - smoother.
OK, I got it; the column sums have to be averaged over the image height:
for (std::size_t i = 0; i < vPlot.size(); i++) {
    vPlot[i] = vPlot[i] / height;
}
OK, but I don't know how to get the maxima and the distances.
Once I have the local maxima (I don't know how to find them), I can calculate the distances between them from the indices of the vector elements.
Does anybody have an idea how to get the local maxima out of the vector that I plotted above?
Now I wrote this to find the maxima (avg is the averaged profile from above):
// find maxima: flag tracks whether the profile was last rising (>0) or falling (<0)
std::vector<int> idxMax;
int flag = 0;
for (int i = 1; i < (int)avg.size(); i++) {
    double diff = avg[i] - avg[i-1];
    if (diff < 0) {
        if (flag > 0) {
            idxMax.push_back(i); // first falling sample after a rise, so the peak was at i-1
            flag = -1;
        }
    }
    if (diff >= 0) {
        if (flag <= 0) {
            flag = 1;
        }
    }
}
But more maxima are found than wanted. The length of the vector varies, and so does the number of peaks. The peaks can be close together or far apart, and they are not always the same height, as can be seen in the picture.
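A common fix for such spurious detections is to smooth the profile before peak picking and to require a minimum height. A minimal sketch (assuming the averaged profile is in avg; halfWin and minHeight are tuning parameters you would pick for your data):

#include <vector>

// Moving-average smoothing followed by naive local-maximum detection.
std::vector<int> findPeaks(const std::vector<double>& avg, int halfWin, double minHeight)
{
    const int n = (int)avg.size();
    std::vector<double> smooth(n, 0.0);
    for (int i = 0; i < n; i++) {            // box filter of width 2*halfWin+1
        double sum = 0.0;
        int cnt = 0;
        for (int k = -halfWin; k <= halfWin; k++) {
            int idx = i + k;
            if (idx >= 0 && idx < n) { sum += avg[idx]; cnt++; }
        }
        smooth[i] = sum / cnt;
    }
    std::vector<int> peaks;
    for (int i = 1; i + 1 < n; i++)          // strictly above left neighbour, at least as high as right
        if (smooth[i] > smooth[i-1] && smooth[i] >= smooth[i+1] && smooth[i] >= minHeight)
            peaks.push_back(i);
    return peaks;                            // distances follow as peaks[k+1] - peaks[k]
}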
I have a problem with OpenCV 3.0.
I applied 12 Gabor filters (12 different orientations) to one image and stored the 12 results.
Now I want to add up all those images and then divide each value by 12 to obtain the mean of the 12 filter responses.
Because the images are RGB, I have to work on each channel separately.
The problem is: when I add up all the values, I obtain values > 12, although all the individual values are between 0 and 1.
The part of the code that misbehaves:
for (std::size_t i = 0; i < gaborV.size(); ++i) { // gaborV contains the 12 Gabor filter results
    std::vector<cv::Mat> vec_split;               // split because of the 3 channels
    cv::split(gaborV[i], vec_split);
    for (int k = 0; k < imgCol.rows; ++k) {
        for (int j = 0; j < imgCol.cols; ++j) {
            if (k == 1 && j == 1)
                std::cout << mat_X.at<float>(k, j) << " " << vec_split[0].at<float>(k, j) << std::endl;
            mat_X.at<float>(k, j) += vec_split[0].at<float>(k, j);
            mat_Y.at<float>(k, j) += vec_split[1].at<float>(k, j);
            mat_Z.at<float>(k, j) += vec_split[2].at<float>(k, j);
        }
    }
}
mat_X, mat_Y and mat_Z are created as follows:
mat_X = mat_Y = mat_Z = cv::Mat(cvSize(imgColNormalize.cols, imgColNormalize.rows), CV_32FC1, cvScalar(0.));
As I said, all values in vec_split are between 0 and 1, but when I'm out of the loop, mat_X, mat_Y and mat_Z contain values > 12.
The output of the cout I used:
0 0.507358
1.54751 0.496143
3.00963 0.528832
4.53887 0.465426
... and at the end I have 15.9459
And I don't understand, since 0 + 0.507358 != 1.54751 and 1.54751 + 0.496143 != 3.00963 ...
Does anyone understand the problem?
Thanks!
I think the problem is here:
mat_X = mat_Y = mat_Z = cv::Mat(cvSize(imgColNormalize.cols,
imgColNormalize.rows), CV_32FC1, cvScalar(0.));
The way you initialise these arrays results in all three cv::Mat objects referencing the same data. Only one Mat is created, so every element receives three additions per filter (one each from the mat_X, mat_Y and mat_Z lines). That is also why consecutive cout lines do not differ by exactly the channel-0 value: between two prints, the channel-1 and channel-2 additions have landed in the same cell.
For info, OpenCV uses a reference counting mechanism with cv::Mat and the assignment operator simply creates a new reference to existing data. If you wanted to create a genuine deep-copy of a cv::Mat, you would need to use cv::Mat::clone().
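A tiny illustration of that sharing behaviour:

cv::Mat a = cv::Mat::zeros(2, 2, CV_32FC1);
cv::Mat b = a;            // b is a new header over a's data, not a copy
b.at<float>(0, 0) = 5.f;  // a.at<float>(0, 0) is now 5 as well
cv::Mat c = a.clone();    // c is a genuine deep copy
c.at<float>(0, 0) = 7.f;  // a is unaffected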
So, instead, initialise like so:
mat_X = cv::Mat(cvSize(imgColNormalize.cols, imgColNormalize.rows), CV_32FC1, cvScalar(0.));
mat_Y = cv::Mat(cvSize(imgColNormalize.cols, imgColNormalize.rows), CV_32FC1, cvScalar(0.));
mat_Z = cv::Mat(cvSize(imgColNormalize.cols, imgColNormalize.rows), CV_32FC1, cvScalar(0.));
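Equivalently, cv::Mat::zeros creates a fresh zero-initialised array on each call, so this is a slightly more concise fix:

mat_X = cv::Mat::zeros(imgColNormalize.rows, imgColNormalize.cols, CV_32FC1);
mat_Y = cv::Mat::zeros(imgColNormalize.rows, imgColNormalize.cols, CV_32FC1);
mat_Z = cv::Mat::zeros(imgColNormalize.rows, imgColNormalize.cols, CV_32FC1);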
An excerpt from the documentation of cv::Mat's assignment operator, copied below for posterity: "Matrix assignment is an O(1) operation. This means that no data is copied but the data is shared and the reference counter, if any, is incremented. Before assigning new data, the old data is de-referenced via Mat::release."
On Windows 10, running Visual Studio 2015, OpenCV 3.0.
I am using OpenCV to correlate two images and determine the translation between them using matchTemplate. I want a subpixel estimate, so I take an 11x11 window of values from the correlation output and fit a quadratic surface to those points.
void Sector1::ResampSector(cv::Mat In, cv::Mat R, cv::Mat Out, cv::Point Loc)
{
    // first get fractional offset
    int lsq = 5;
    // Ax^2 + Bxy + Cy^2 + Dx + Ey + F = R
    cv::setBreakOnError(true);
    cv::Mat A(121, 6, CV_32F);
    cv::Mat B(121, 1, CV_32F);
    cv::Mat C(6, 1, CV_32F);
    int L = 0;
    for (int i = Loc.y - lsq; i <= Loc.y + lsq; i++) {
        for (int j = Loc.x - lsq; j <= Loc.x + lsq; j++) {
            A.at<float>(L, 0) = float(i * i);
            A.at<float>(L, 1) = (float)i * j;
            A.at<float>(L, 2) = (float)j * j;
            A.at<float>(L, 3) = (float)i;
            A.at<float>(L, 4) = (float)j;
            A.at<float>(L, 5) = 1.f;
            B.at<float>(L) = R.at<float>(i, j); // since is 3 band stuff ?
            L++;
        } // for j
    } // for i
    bool rc = cv::solve(A, B, C);
The call to cv::solve returns false, and there are two cv::Exceptions at the same address, which is outside any of the image matrices or other variables. I have looked at the contents of A, B and C in the memory window, and they all appear correct; the A, B, C structures all look right. I have tried to step into solve, but I don't have the library built with symbol tables.
Any clue where I have gone wrong? Suggestions for tracking the problem further?
LAPACK complains that the default method will not work: cv::solve defaults to DECOMP_LU, which requires a square system, while A here is 121x6 (over-determined). The correction is to pass the flag DECOMP_QR as the 4th, optional, argument to solve(), which solves the system in the least-squares sense.
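In code, the last line of the function becomes:

bool rc = cv::solve(A, B, C, cv::DECOMP_QR);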
What is the fastest way of assigning a vector to a matrix row in a loop? I want to fill a data matrix along its rows with vectors that are computed in a loop. The loop runs until all the entries of the data matrix are filled.
Currently I am using the cv::Mat::at<>() method to access the elements of the matrix and fill them from the vector; however, this seems quite slow. I have tried another way, using X.row(index) = data_vector; it works fast, but it fills my matrix X with garbage values which I cannot explain.
I read that there is another way using pointers (the fastest way), but I cannot understand it. Can somebody explain how to use them, or other methods?
Here is a part of my code:
#define OFFSET 2

cv::Mat im = cv::imread("001.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat X = cv::Mat((im.rows - 2*OFFSET)*(im.cols - 2*OFFSET), 25, CV_64FC1); // holds the training data (image patches)
cv::Mat patch = cv::Mat(5, 5, im.type()); // holds a cropped image patch
typedef cv::Vec<double, 25> Vec25d;       // X is CV_64FC1, so the vector type must be double, not float

int ind = 0;
for (int row = 0; row < (im.rows - 2*OFFSET); row++){
    for (int col = 0; col < (im.cols - 2*OFFSET); col++){
        cv::Mat temp_patch = im(cv::Rect(col, row, 5, 5)); // crop an image patch (5x5) at each pixel
        patch = temp_patch.clone(); // needed because temp_patch is not continuous in memory
        patch.convertTo(patch, CV_64FC1);
        Vec25d data_vector = patch.reshape(0, 1); // make it a row vector (1x25)
        for (int i = 0; i < 25; i++)
        {
            X.at<double>(ind, i) = data_vector[i]; // currently I am using this way (quite slow)
        }
        //X.row(ind) = patch.reshape(0, 1); // tried this, but it assigns garbage values to the data matrix!
        ind += 1;
    }
}
To do it the regular OpenCV way, you could do:
ImageMat.row(RowIndex) = RowMat.clone();
or
RowMat.copyTo(ImageMat.row(RowIndex));
Haven't tested for correctness or speed. (Note that the first form may not actually copy anything: Mat's assignment operator re-points the temporary row header rather than writing into ImageMat, so copyTo is the reliable variant.)
Just a couple of edits to your code:
double* xBuffer = X.ptr<double>(0); // running pointer into X's (continuous) data
for (int row = 0; row < (im.rows - 2*OFFSET); row++){
    for (int col = 0; col < (im.cols - 2*OFFSET); col++){
        cv::Mat temp_patch = im(cv::Rect(col, row, 5, 5)); // crop an image patch (5x5) at each pixel
        patch = temp_patch.clone(); // needed because temp_patch is not continuous in memory
        patch.convertTo(patch, CV_64FC1);
        memcpy(xBuffer, patch.data, 25*sizeof(double)); // patch is continuous after clone()+convertTo()
        xBuffer += 25;
    }
}
Also, you don't seem to do any computation on patch, just extract grey-level values, so you could create X with the same type as im and convert it to double at the end. That way you could memcpy each row of your patch directly, the address in memory being unsigned char* buffer = im.ptr(row) + col;.
According to the docs:
if you need to process a whole row of matrix, the most efficient way is to get the pointer to the row first, and then just use plain C operator []:
// compute sum of positive matrix elements
// (assuming that M is a double-precision matrix)
double sum = 0;
for (int i = 0; i < M.rows; i++)
{
    const double* Mi = M.ptr<double>(i);
    for (int j = 0; j < M.cols; j++)
        sum += std::max(Mi[j], 0.);
}
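Applied to the matrix X and patch from the question, the same row-pointer pattern would look roughly like this (a sketch, not benchmarked):

double* xRow = X.ptr<double>(ind);      // pointer to row 'ind' of X
const double* p = patch.ptr<double>(0); // patch is continuous after clone()+convertTo()
for (int i = 0; i < 25; i++)
    xRow[i] = p[i];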
I am trying to use the vl_slic_segment function of the VLFeat library with an input image stored in an OpenCV Mat. My code compiles and runs, but the output superpixel values do not make sense. Here is my code so far:
Mat bgrUChar = imread("/pathtowherever/image.jpg");
Mat bgrFloat;
bgrUChar.convertTo(bgrFloat, CV_32FC3, 1.0/255);

cv::Mat labFloat;
cvtColor(bgrFloat, labFloat, CV_BGR2Lab);

Mat labels(labFloat.size(), CV_32SC1);
vl_slic_segment(labels.ptr<vl_uint32>(), labFloat.ptr<const float>(),
                labFloat.cols, labFloat.rows, labFloat.channels(),
                30, 0.1, 25);
I have tried not converting to the Lab colorspace and setting different regionSize/regularization values, but the output is always very glitchy. I am able to retrieve the label values correctly; the problem is that every label is usually scattered across small non-contiguous areas.
I think the format of my input data is wrong, but I can't figure out how to pass it properly to the vl_slic_segment function.
Thank you in advance!
EDIT
Thank you David; as you helped me understand, vl_slic_segment wants the data ordered planar, as [LLLLL...AAAAA...BBBBB], whereas OpenCV stores the Lab color space interleaved, as [LAB LAB LAB ...].
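For anyone hitting the same issue, a minimal repacking sketch (assuming labFloat and labels are the matrices from the question):

// Repack OpenCV's interleaved float data into the planar layout VLFeat expects.
std::vector<float> planar(labFloat.rows * labFloat.cols * 3);
for (int i = 0; i < labFloat.rows; ++i) {
    for (int j = 0; j < labFloat.cols; ++j) {
        cv::Vec3f px = labFloat.at<cv::Vec3f>(i, j);
        for (int c = 0; c < 3; ++c)
            planar[c * labFloat.rows * labFloat.cols + i * labFloat.cols + j] = px[c];
    }
}
vl_slic_segment(labels.ptr<vl_uint32>(), planar.data(),
                labFloat.cols, labFloat.rows, 3, 30, 0.1, 25);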
In the course of my bachelor thesis I have to use VLFeat's SLIC implementation as well. You can find a short example applying VLFeat's SLIC to Lenna.png on GitHub: https://github.com/davidstutz/vlfeat-slic-example.
Maybe a look at main.cpp will help you figure out how to convert the images obtained from OpenCV to the right format:
// OpenCV can be used to read images.
#include <opencv2/opencv.hpp>
// The VLFeat header files need to be declared external.
extern "C" {
    #include "vl/generic.h"
    #include "vl/slic.h"
}

int main() {
    // Read the Lenna image. The matrix 'mat' will have 3 8 bit channels
    // corresponding to BGR color space.
    cv::Mat mat = cv::imread("Lenna.png", CV_LOAD_IMAGE_COLOR);

    // Convert image to one-dimensional array.
    float* image = new float[mat.rows*mat.cols*mat.channels()];
    for (int i = 0; i < mat.rows; ++i) {
        for (int j = 0; j < mat.cols; ++j) {
            // Assuming three channels ...
            image[j + mat.cols*i + mat.cols*mat.rows*0] = mat.at<cv::Vec3b>(i, j)[0];
            image[j + mat.cols*i + mat.cols*mat.rows*1] = mat.at<cv::Vec3b>(i, j)[1];
            image[j + mat.cols*i + mat.cols*mat.rows*2] = mat.at<cv::Vec3b>(i, j)[2];
        }
    }

    // The algorithm will store the final segmentation in a one-dimensional array.
    vl_uint32* segmentation = new vl_uint32[mat.rows*mat.cols];
    vl_size height = mat.rows;
    vl_size width = mat.cols;
    vl_size channels = mat.channels();

    // The region size defines the number of superpixels obtained.
    // Regularization describes a trade-off between the color term and the
    // spatial term.
    vl_size region = 30;
    float regularization = 1000.;
    vl_size minRegion = 10;
    vl_slic_segment(segmentation, image, width, height, channels, region, regularization, minRegion);

    // Convert segmentation.
    int** labels = new int*[mat.rows];
    for (int i = 0; i < mat.rows; ++i) {
        labels[i] = new int[mat.cols];
        for (int j = 0; j < mat.cols; ++j) {
            labels[i][j] = (int) segmentation[j + mat.cols*i];
        }
    }

    // Compute a contour image: this actually colors every border pixel
    // red such that we get relatively thick contours.
    int label = 0;
    int labelTop = -1;
    int labelBottom = -1;
    int labelLeft = -1;
    int labelRight = -1;
    for (int i = 0; i < mat.rows; i++) {
        for (int j = 0; j < mat.cols; j++) {
            label = labels[i][j];

            labelTop = label;
            if (i > 0) {
                labelTop = labels[i - 1][j];
            }

            labelBottom = label;
            if (i < mat.rows - 1) {
                labelBottom = labels[i + 1][j];
            }

            labelLeft = label;
            if (j > 0) {
                labelLeft = labels[i][j - 1];
            }

            labelRight = label;
            if (j < mat.cols - 1) {
                labelRight = labels[i][j + 1];
            }

            if (label != labelTop || label != labelBottom || label != labelLeft || label != labelRight) {
                mat.at<cv::Vec3b>(i, j)[0] = 0;
                mat.at<cv::Vec3b>(i, j)[1] = 0;
                mat.at<cv::Vec3b>(i, j)[2] = 255;
            }
        }
    }

    // Save the contour image.
    cv::imwrite("Lenna_contours.png", mat);

    return 0;
}
In addition, have a look at README.md within the GitHub repository. The following figures show some example outputs, with the regularization set to 1, 100 and 1000 and the region size set to 20, 30 and 40.
Figure 1: Superpixel segmentation with region size set to 30 and regularization set to 1.
Figure 2: Superpixel segmentation with region size set to 30 and regularization set to 100.
Figure 3: Superpixel segmentation with region size set to 30 and regularization set to 1000.
Figure 4: Superpixel segmentation with region size set to 20 and regularization set to 1000.
Figure 5: Superpixel segmentation with region size set to 40 and regularization set to 1000.