OpenCV 1.1 K-Means Clustering in High Dimensional Spaces - c++

I am trying to write a bag-of-features image recognition system. One step in the algorithm is to take a large number of small image patches (say 7x7 or 11x11 pixels) and cluster them into groups that look similar. I get my patches from an image, turn them into grayscale floating-point patches, and then try to get cvKMeans2 to cluster them for me. I think I am having problems formatting the input data such that cvKMeans2 returns coherent results. I have used K-means for 2D and 3D clustering before, but 49D clustering seems to be a different beast.
I keep getting garbage values for the returned clusters vector, so obviously this is a garbage in / garbage out type problem. Additionally the algorithm runs way faster than I think it should for such a huge data set.
In the code below, the straight memcpy is only my latest attempt at getting the input data into the correct format. I spent a while using the built-in OpenCV functions, but this is difficult when your base type is CV_32FC(49).
Can OpenCV 1.1's KMeans algorithm support this sort of high dimensional analysis?
Does someone know the correct method of copying from images to the K-Means input matrix?
Can someone point me to a free, Non-GPL KMeans algorithm I can use instead?
This isn't the best code as I am just trying to get things to work right now:
std::vector<int> DoKMeans(std::vector<IplImage *>& chunks){
// the size of one image patch, CELL_SIZE = 7
int chunk_size = CELL_SIZE*CELL_SIZE*sizeof(float);
// create the input data, CV_32FC(49) is 7x7 float object (I think)
CvMat* data = cvCreateMat(chunks.size(),1,CV_32FC(49) );
// Create a temporary vector to hold our data
// we'll copy into the matrix for KMeans
int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;
float * rawdata = new float[rdsize];
// Go through each image chunk and copy the
// pixel values into the raw data array.
vector<IplImage*>::iterator iter;
int k = 0;
for( iter = chunks.begin(); iter != chunks.end(); ++iter )
{
for( int i =0; i < CELL_SIZE; i++)
{
for( int j=0; j < CELL_SIZE; j++)
{
CvScalar val;
val = cvGet2D(*iter,i,j);
rawdata[k] = (float)val.val[0];
k++;
}
}
}
// Copy the data into the CvMat for KMeans
// I have tried various methods, but this is just the latest.
memcpy( data->data.ptr,rawdata,rdsize*sizeof(float));
// Create the output array
CvMat* results = cvCreateMat(chunks.size(),1,CV_32SC1);
// Do KMeans
int r = cvKMeans2(data, 128,results, cvTermCriteria(CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 1000, 0.1));
// Copy the grouping information to our output vector
vector<int> retVal;
for( int y = 0; y < chunks.size(); y++ )
{
CvScalar cvs = cvGet1D(results, y);
int g = (int)cvs.val[0];
retVal.push_back(g);
}
return retVal;
}
Thanks in advance!

Though I'm not familiar with "bag of features", have you considered using feature points like corner detectors and SIFT?

You might like to check out http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/ for another open source clustering package.
Using memcpy like this seems suspect, because when you do:
int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;
If CELL_SIZE and chunks.size() are very large, this product can exceed the largest value an int can hold, in which case the overflow makes rdsize unreliable.
Are you wanting to change "chunks" in this function?
I'm guessing that you don't as this is a K-means problem.
So try passing by reference to const here. (And generally speaking this is what you will want to be doing)
so instead of:
std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)
it would be:
std::vector<int> DoKMeans(const std::vector<IplImage *>& chunks)
Also, in this case it is better to use static_cast than the old C-style casts (for example, static_cast<float>(variable) as opposed to (float)variable).
Also you may want to delete "rawdata":
float * rawdata = new float[rdsize];
can be deleted with:
delete[] rawdata;
otherwise you may be leaking memory here.
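To make the points above about the buffer size and the leak concrete, here is a minimal sketch in plain C++ (no OpenCV calls). It packs the patches into one row-major buffer with one row per patch and one column per pixel. As far as I recall, that one-row-per-sample layout is what K-means implementations such as cvKMeans2 expect, but treat that as an assumption and check the documentation of your version. The names Patch and flattenPatches are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one grayscale patch: dim = CELL_SIZE*CELL_SIZE floats.
using Patch = std::vector<float>;

// Packs N patches of `dim` values each into one row-major N x dim buffer.
// std::vector owns the memory, so there is no new[]/delete[] pair to forget.
std::vector<float> flattenPatches(const std::vector<Patch>& patches, std::size_t dim)
{
    std::vector<float> samples;
    // The product is computed in std::size_t, not int, to avoid int overflow.
    samples.reserve(patches.size() * dim);
    for (const Patch& p : patches) {
        assert(p.size() == dim);  // every sample must have the same dimension
        samples.insert(samples.end(), p.begin(), p.end());
    }
    return samples;
}
```

Sample i then occupies elements [i*dim, (i+1)*dim) of the returned buffer, which can be copied into (or wrapped by) whatever matrix type the clustering routine takes.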

Related

How to copy the structure of a vector of vectors to another one in C++ using OpenCV

I have got a vector containing contours of an image. The code looks like this:
cv::Mat img = cv::imread ("whatever");
cv::Mat edges;
double lowThresh = 100, highThresh = 2*lowThresh;
Canny(img, edges, lowThresh, highThresh);
std::vector<std::vector<cv::Point>> contourVec;
std::vector<cv::Vec4i> hierarchy;
int mode = CV_RETR_LIST;
int method = CV_CHAIN_APPROX_TC89_KCOS;
findContours(edges, contourVec, hierarchy, mode, method);
What I now would like to do is to transform these points. I therefore created another vector, which shall have the same structure as the other one, just with Point3d elements instead of Point. At the moment, I do it like this:
std::vector<std::vector<cv::Point3d>> contour3DVec(contourVec.size());
for (int i = 0; i < contourVec.size(); i++)
contour3DVec[i].resize(contourVec[i].size());
But I'm not sure whether that is really the best way to do it, as I don't know how resize() actually works (e.g. with regard to memory allocation).
Does anybody have an idea whether there is a faster and/or "smarter" way to solve this? Thanks in advance.
Given that you surely want to do something with the contours afterwards, the resizing probably won't be a performance hotspot in your program.
Vectors in C++ usually keep some spare capacity so that they can grow; depending on the implementation, the reserved memory may be up to about double the current size.
When you resize a vector, it first checks whether the new size fits in the memory it has already reserved.
If it does, no reallocation is needed and only the new elements have to be initialized.
Otherwise new memory is reserved (typically with some slack beyond the new size) and the current content is moved there.
As your vectors are empty in the beginning, they have to allocate memory anyway, so (given a sane compiler and standard library) it would be hard to beat your implementation speed-wise.
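To illustrate the point about resize(), here is a minimal sketch in plain C++ without OpenCV. The helper name sameShape is made up for this example; it builds a vector of vectors with the same inner sizes as the input, for any element types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a nested vector of Out with the same outer and inner sizes as `src`.
// Each inner resize() on an empty vector performs a single allocation and
// value-initializes the new elements.
template <typename Out, typename In>
std::vector<std::vector<Out>> sameShape(const std::vector<std::vector<In>>& src)
{
    std::vector<std::vector<Out>> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i].resize(src[i].size());
        assert(dst[i].capacity() >= dst[i].size());  // capacity may exceed size
    }
    return dst;
}
```

With this, something like `auto contour3DVec = sameShape<cv::Point3d>(contourVec);` would be the equivalent of the loop in the question.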
If you want 3D points, you'll have to create them manually, one by one:
std::vector<std::vector<cv::Point3d>> contour3DVec(contourVec.size());
for (size_t i = 0; i < contourVec.size(); i++)
{
contour3DVec[i].reserve(contourVec[i].size()); // one allocation per contour
for (size_t j = 0; j < contourVec[i].size(); j++)
{
cv::Point p = contourVec[i][j];
contour3DVec[i].push_back( cv::Point3d(p.x, p.y, 1) );
}
}
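The same conversion can be sketched without OpenCV types, which makes it easy to try out in isolation. Point2 and Point3 below are hypothetical stand-ins for cv::Point and cv::Point3d, and liftContours is a made-up name; the z value is passed in as a parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for cv::Point and cv::Point3d.
struct Point2 { int x; int y; };
struct Point3 { double x; double y; double z; };

// Converts every 2D point of every contour to a 3D point with the given z.
std::vector<std::vector<Point3>> liftContours(
    const std::vector<std::vector<Point2>>& contours, double z)
{
    std::vector<std::vector<Point3>> out;
    out.reserve(contours.size());            // one allocation for the outer vector
    for (const std::vector<Point2>& contour : contours) {
        std::vector<Point3> lifted;
        lifted.reserve(contour.size());      // one allocation per contour
        for (const Point2& p : contour) {
            lifted.push_back(Point3{static_cast<double>(p.x),
                                    static_cast<double>(p.y), z});
        }
        out.push_back(std::move(lifted));    // move, do not copy, the inner vector
    }
    assert(out.size() == contours.size());
    return out;
}
```

Reserving before push_back avoids repeated reallocation while still constructing each point exactly once.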

what is the fastest way to run a method on all pixels in opencv (c++)

I have several tasks to do on each pixel in opencv. I am using a construct like this:
for(int row = 0; row < inputImage.rows; ++row)
{
uchar* p = inputImage.ptr(row);
for(int col = 0; col < inputImage.cols*3; col+=3)
{
int blue=*(p+col); //points to each pixel B,G,R value in turn assuming a CV_8UC3 colour image
int green=*(p+col+1);
int red=*(p+col+2);
// process pixel
}
}
This is working, but I am wondering if there is any faster way to do this? This solution doesn't use any of OpenCV's SIMD or parallel processing.
What is the best way to run a method over all pixels of an image in opencv?
If the Mat is continuous, i.e. the matrix elements are stored contiguously without gaps at the end of each row (which you can check with Mat::isContinuous()), you can treat the whole image as one long row. Thus you can do something like this:
const uchar *ptr = inputImage.ptr<uchar>(0);
for (size_t i=0; i<inputImage.total(); ++i){ // total() is the number of pixels, as size_t
int blue = ptr[3*i];
int green = ptr[3*i+1];
int red = ptr[3*i+2];
// process pixel
}
As the documentation says, this approach, while very simple, can boost the performance of a simple element-wise operation by 10-20 percent, especially if the image is rather small and the operation is quite simple.
PS: If you need more speed than that, you will have to make full use of the GPU to process the pixels in parallel.
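For reference, the "one long row" idea can be tried without OpenCV on any contiguous interleaved buffer. The sketch below assumes 8-bit pixels stored as b,g,r triples with no gaps between rows; the function name channelSums is made up for this example, and the per-pixel work (summing each channel) is only a placeholder.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Walks a contiguous interleaved b,g,r buffer with a single flat loop
// and returns the per-channel sums as {blue, green, red}.
std::array<std::uint64_t, 3> channelSums(const std::vector<unsigned char>& bgr)
{
    assert(bgr.size() % 3 == 0);               // whole pixels only
    std::array<std::uint64_t, 3> sums = {{0, 0, 0}};
    const unsigned char* p = bgr.data();
    const std::size_t pixels = bgr.size() / 3;
    for (std::size_t i = 0; i < pixels; ++i) {
        sums[0] += p[3 * i];                   // blue
        sums[1] += p[3 * i + 1];               // green
        sums[2] += p[3 * i + 2];               // red
    }
    return sums;
}
```

If the buffer is not contiguous (rows padded), this flat loop is not valid and the row-by-row version from the question is the one to use.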

How to copy elements of 2D matrix to 1D array vertically using c++

I have a 2D matrix and I want to copy its values to a 1D array vertically in an efficient way as the following way.
Matrice(3x3)
[1 2 3;
4 5 6;
7 8 9]
myarray:
{1,4,7,2,5,8,3,6,9}
Brute force takes 0.25 sec for a 1000x750x3 image. I don't want to use a vector because I pass myarray as input to another function (which I didn't write). So, is there a C++ or OpenCV function that I can use? Note that I'm using the OpenCV library.
Copying matrix to array is also fine, I can first take the transpose of the Mat, then I will copy it to array.
cv::Mat transposed = myMat.t();
uchar* X = transposed.reshape(1,1).ptr<uchar>(0);
or
int* X = transposed.reshape(1,1).ptr<int>(0);
depending on your matrix type. It might copy data though.
You can optimize this to make it more cache friendly, i.e. you can copy blockwise, keeping track of the positions in myArray where the data should go. The point is that your brute-force approach will most likely make each access to the matrix miss the cache, which has a tremendous performance impact. Hence it is better to copy vertically/horizontally while taking the cache line size into account.
See the idea below (I didn't test it, so it most likely has bugs, but it should make the idea clear).
struct pixel
{
char r;
char g;
char b;
};
const size_t cachelinesize = 128/sizeof(pixel); // pixels per block, assuming a cache line size of 128 bytes
array<array<pixel, 1000>, 750> matrice; // 750 rows of 1000 pixels
vector<pixel> vec(1000*750);
const size_t rows = matrice.size();
const size_t cols = matrice[0].size();
for (size_t row = 0; row<rows; ++row)
{
for (size_t col = 0; col<cols; col+=cachelinesize)
{
// the last block of a row may be shorter than cachelinesize
for (size_t i = 0; i<cachelinesize && col+i<cols; ++i)
{
vec[(col+i)*rows+row]=matrice[row][col+i]; // column-major destination index
}
}
}
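A blockwise copy is usually written with tiles in both directions, so that the reads and the writes each stay within a small region of memory. The following is a minimal sketch of that idea on a plain row-major buffer of int, not a tuned implementation; the function name toColumnMajor and the default tile size of 32 are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies a row-major rows x cols matrix into column-major order,
// processing it in tile x tile blocks.
std::vector<int> toColumnMajor(const std::vector<int>& src,
                               std::size_t rows, std::size_t cols,
                               std::size_t tile = 32)
{
    assert(src.size() == rows * cols);
    assert(tile > 0);
    std::vector<int> dst(src.size());
    for (std::size_t r0 = 0; r0 < rows; r0 += tile) {
        const std::size_t r1 = std::min(rows, r0 + tile);      // clamp last tile
        for (std::size_t c0 = 0; c0 < cols; c0 += tile) {
            const std::size_t c1 = std::min(cols, c0 + tile);  // clamp last tile
            for (std::size_t r = r0; r < r1; ++r)
                for (std::size_t c = c0; c < c1; ++c)
                    dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

For the 3x3 example from the question, the input {1,2,3,4,5,6,7,8,9} comes out as {1,4,7,2,5,8,3,6,9}. Whether tiling pays off, and which tile size is best, depends on the element size and the machine, so it is worth measuring.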
If you are using the matrix before the vertical assignment/querying, then you can cache the necessary columns when you hit each one of the elements of columns.
//Multiplies and caches
doCalcButCacheVerticalsByTheWay(myMatrix,calcType,myMatrix2,cachedColumns);
instead of
doCalc(myMatrix,calcType,myMatrix2); //Multiplies
then use it like this:
...
tmpVariable=cachedColumns[i];
...
For example, the first function above multiplies the matrix with another one and, whenever the necessary columns are reached, caches them into a temporary array, so you can access their elements later in contiguous order.
I think Mat::reshape is what you want. It does not copy data.

Converting a row of cv::Mat to std::vector

I have a fairly simple question: how to take one row of cv::Mat and get all the data in std::vector? The cv::Mat contains doubles (it can be any simple datatype for the purpose of the question).
Going through the OpenCV documentation is just very confusing. Unless I bookmark a page, I cannot find it a second time by Googling; there's just too much of it and it is not easy to navigate.
I have found the cv::Mat::at(..) to access the Matrix element, but I remember from C OpenCV that there were at least 3 different ways to access elements, all of them used for different purposes... Can't remember what was used for which :/
So, while copying the Matrix element-by-element will surely work, I am looking for a way that is more efficient and, if possible, a bit more elegant than a for loop for each row.
It should be as simple as:
m.row(row_idx).copyTo(v);
Where m is cv::Mat having CV_64F depth and v is std::vector<double>
Data in OpenCV matrices is laid out in row-major order, so that each row is guaranteed to be contiguous. That means that you can interpret the data in a row as a plain C array. The following example comes directly from the documentation:
// compute sum of positive matrix elements
// (assuming that M is double-precision matrix)
double sum=0;
for(int i = 0; i < M.rows; i++)
{
const double* Mi = M.ptr<double>(i);
for(int j = 0; j < M.cols; j++)
sum += std::max(Mi[j], 0.);
}
Therefore the most efficient way is to pass the plain pointer to std::vector:
// Pointer to the i-th row
const double* p = mat.ptr<double>(i);
// Copy data to a vector. Note that (p + mat.cols) points to the
// end of the row.
std::vector<double> vec(p, p + mat.cols);
This is certainly faster than using the iterators returned by begin() and end(), since those involve extra computation to support gaps between rows.
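The pointer-range construction can be checked without OpenCV on a raw row-major buffer. In the sketch below, copyRow is a made-up helper and `step` is the distance between row starts in elements, which may be larger than `cols` when rows are padded (a stand-in for the role the row stride plays in a real matrix).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies one row of a row-major matrix of doubles into a std::vector.
// `step` is the number of elements between the starts of consecutive rows.
std::vector<double> copyRow(const double* data, std::size_t step,
                            std::size_t cols, std::size_t row)
{
    assert(step >= cols);
    const double* p = data + row * step;       // pointer to the start of the row
    return std::vector<double>(p, p + cols);   // p + cols is one past the row end
}
```

Because the elements within a row are contiguous, the iterator-range constructor can copy them without any per-element index arithmetic.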
According to the documentation, you can get a specific row through cv::Mat::row, which returns a new cv::Mat header for that row; you can iterate over it with cv::Mat::begin and cv::Mat::end. As such, the following should work:
cv::Mat m/*= initialize */;
// ... do whatever...
cv::Mat first_row(m.row(0));
std::vector<double> v(first_row.begin<double>(), first_row.end<double>());
Note that I don't know any OpenCV, but googling "OpenCV mat" led directly to the basic types documentation and according to that, this should work fine.
The matrix iterators are random-access iterators, so they can be passed to any STL algorithm, including std::sort() .
This is also from the documentation, so you could actually skip the intermediate row matrix:
cv::Mat m/*= initialize */;
// ... do whatever...
// first row begin end
std::vector<double> v(m.begin<double>(), m.begin<double>() + m.size().width);
To access more than the first row, I'd recommend the first snippet, since it will be a lot cleaner that way and there doesn't seem to be any heavy copying since the data types seem to be reference-counted.
You can also use cv::Rect:
m(cv::Rect(0, 0, m.cols, 1))
will give you the first row. In general,
matrix(cv::Rect(x0, y0, len_x, len_y));
returns the sub-matrix whose upper-left corner is at column x0, row y0 and whose size is len_x columns by len_y rows.
I think this works,
an example :
Mat Input(480, 720, CV_64F, Scalar(100));
cropping the 1st row of the matrix:
Rect roi(Point(0, 0), Size(720, 1));
then:
std::vector<std::vector<double> > vector_of_rows;
vector_of_rows.push_back(Input(roi));

Convert RGB IplImage to 3 arrays

I need some C++/pointer help. When I create an RGB IplImage and I want to access i,j I use the following C++ class taken from: http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/opencv-intro.html
template<class T> class Image
{
private:
IplImage* imgp;
public:
Image(IplImage* img=0) {imgp=img;}
~Image(){imgp=0;}
void operator=(IplImage* img) {imgp=img;}
inline T* operator[](const int rowIndx) {
return ((T *)(imgp->imageData + rowIndx*imgp->widthStep));}
};
typedef struct{
unsigned char b,g,r;
} RgbPixel;
typedef struct{
float b,g,r;
} RgbPixelFloat;
typedef Image<RgbPixel> RgbImage;
typedef Image<RgbPixelFloat> RgbImageFloat;
typedef Image<unsigned char> BwImage;
typedef Image<float> BwImageFloat;
I've been working with CUDA, so sometimes I have to put all the data into an array. I like to keep every channel in its own array, as it seems easier to handle the data that way. So I would usually do something like this:
IplImage *image = cvLoadImage("whatever.tif");
RgbImageFloat img(image);
for(int i = 0; i < exrIn->height; i++)
{
for(int j = 0; j < exrIn->width; j++)
{
hostr[j*data->height+i] = img[i][j].r;
hostg[j*data->height+i] = img[i][j].g;
hostb[j*data->height+i] = img[i][j].b;
}
}
I would then copy my data to the device, do some stuff with it, get it back to the host and then loop, yet again, through the array assigning the data back to the IplImage and saving my results.
It seems like I'm looping too much. There has to be a faster, more efficient way to do this with pointers, but I'm lost. Is there a way I can simply use a pointer for every channel? I tried doing something like this but it didn't work:
float *hostr = &img[0][0].r;
float *hostg = &img[0][0].g;
float *hostb = &img[0][0].b;
Any suggestions? Thanks!
EDIT:
Thanks everyone for answering. Maybe I wasn't very clear in my question. I am familiar with how to access channels and their data. What I am interested in is increasing the performance and efficiency of completely copying data off the IplImage to a standard array, more along the lines of what csl said so far. The problem I see is that the data in an IplImage is arranged as "rgbrgbrgbrgb".
Firstly, if you're comfortable with C++, you should consider using OpenCV 2.0 which does away with different data types for images and matrices (IplImage* and CvMat*) and uses one structure (Mat) to handle both. Apart from automatic memory management and a truckload of useful routines to handle channels, etc. and some MATLAB-esque ones as well, it's really fun to use.
For your specific problem, you access the channels of an IplImage* with Mat, like this:
IplImage *image = cvLoadImage("lena.bmp");
Mat Lena(image);
vector<Mat> Channels;
split(Lena,Channels);
namedWindow("LR",CV_WINDOW_AUTOSIZE);
imshow("LR",Channels[0]);
waitKey();
Now you have the copies of each channel in the vector Channels.
If you don't want to use OpenCV2.0 and extract channels, note the following. OpenCV orders multi-channel images in the following manner:
x(1,1,1) x(1,1,2) x(1,1,3) x(1,2,1) x(1,2,2) x(1,2,3) ...
where x(i,j,k) = an element in row i of column j in channel k
Also, OpenCV pads its images, so don't forget to jump rows with widthStep, which accounts for these padding gaps. And along the lines of what csl said, advance your row pointer in the outer loop (using widthStep) and increment a second pointer to access the elements in a row.
NOTE:
Since you're using 2.0 now, you can bypass IplImage* with Mat Lena = imread("Lena.bmp");.
There is room for a lot of improvement here. So much, that you should read up on how people access bitmaps.
First of all, increase memory locality as much as possible. This will increase cache hits, and performance. I.e., don't use three separate arrays for each color channel. Store each together, since you probably will be working mostly on pixels.
Secondly, don't do that y*width calculation for every pixel. When done in an inner loop, it consumes a lot of cycles.
Lastly, if you just want a complete copy of the image, then you could simply do a memcpy(), which is very fast. I couldn't deduce if you converted from floats to integers, but if not, use memcpy() for non-overlapping regions.
If you wonder how you can do this with pointers (kind of pseudo-code, not tested, and assuming the rows are stored without padding):
float *dst = &hostg[0];
RgbPixelFloat *src = &img[0][0];
RgbPixelFloat *end = &img[HEIGHT-1][WIDTH-1] + 1; // one past the last pixel
// copy green channel of whole image
while ( src != end ) {
*dst = src->g;
++dst;
++src;
}
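Putting the advice from both answers together (a row pointer that jumps by the row stride, an inner pointer that walks the pixels), a de-interleaving loop could be sketched as follows in plain C++. This is only an illustration under stated assumptions: pixels are floats in b,g,r order, `strideFloats` is a hypothetical stand-in for widthStep / sizeof(float), and the names Planes and splitChannels are made up. The output planes are row-major, not the column-major layout used in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Planar output: one contiguous row-major array per channel.
struct Planes {
    std::vector<float> b, g, r;
};

// De-interleaves a float image stored as b,g,r triples.
// `strideFloats` is the distance between row starts in floats; it may be
// larger than 3 * width when the rows are padded.
Planes splitChannels(const float* data, std::size_t height,
                     std::size_t width, std::size_t strideFloats)
{
    assert(strideFloats >= 3 * width);
    Planes out;
    out.b.resize(height * width);
    out.g.resize(height * width);
    out.r.resize(height * width);
    std::size_t k = 0;                               // index into the planes
    for (std::size_t i = 0; i < height; ++i) {
        const float* px = data + i * strideFloats;   // start of row i
        for (std::size_t j = 0; j < width; ++j, px += 3, ++k) {
            out.b[k] = px[0];
            out.g[k] = px[1];
            out.r[k] = px[2];
        }
    }
    return out;
}
```

Each plane's data() pointer can then be handed to a host-to-device copy as one contiguous block. A single pass like this still touches every pixel once; interleaved data cannot be exposed as three separate contiguous arrays without copying.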