Possibly a MatIterator_ error in a Region of Interest - c++

I'm working on some object detection using OpenCV 2.4.4 with C++. Unfortunately, the objects I must detect don't have a unique shape, but they do have a very specific color range in the HSV color space (right now I'm detecting some red objects).
Also, the object should only appear in one part of the image, so I'm not scanning the whole frame; moreover, inside this ROI I have sub-ROIs (small rectangles), and I need to know which of these rectangles the object is in for each frame.
So, first I tried to detect whether anything was found in the large ROI, and if it was, I set those pixels to a color that does not occur in my scene. I did it with the following code:
cvtColor(frame(Range(0, espv), Range::all()), framhsv, CV_BGR2HSV);
MatIterator_<Vec3b> it, ith, end, endh;
ith = framhsv.begin<Vec3b>();
endh = framhsv.end<Vec3b>();
it = (frame(Range(0, espv), Range::all())).begin<Vec3b>();
for (; ith != endh; ++it, ++ith){
    // Hue in [0,7] or above 170 with saturation >= 160 -> treat as "red"
    if ((((*ith)[0] <= 7 && (*ith)[0] >= 0) || ((*ith)[0] > 170)) && (*ith)[1] >= 160){
        (*it)[0] = 63;
        (*it)[1] = 0;
        (*it)[2] = 255;
    }
}
Everything worked and the object was "detected", so I moved on to the next step: detecting whether the object is in each sub-ROI. To do so I created the following function, which returns the number of "detected" pixels in each sub-ROI, so I can decide which sub-ROI is the correct one.
Here is the function:
Vector<int> CountRed(cv::Mat &frame, int hspa, int vspa, int nlines, int nblocks){
    Vector<int> count(nblocks * nlines, 0);
    for (int i = 1; i <= nlines; i++){
        for (int j = 1; j <= nblocks; j++){
            Mat framhsv;
            // Convert only the area of interest to HSV
            cvtColor(frame(Range((i-1)*vspa, i*vspa), Range((j-1)*hspa, j*hspa)), framhsv, CV_BGR2HSV);
            MatIterator_<Vec3b> it, ith, endh;
            ith = framhsv.begin<Vec3b>();
            endh = framhsv.end<Vec3b>();
            it = (frame(Range((i-1)*vspa, i*vspa), Range((j-1)*hspa, j*hspa))).begin<Vec3b>();
            for (; ith != endh; ++it, ++ith){
                if ((((*ith)[0] <= 7 && (*ith)[0] >= 0) || ((*ith)[0] > 170)) && (*ith)[1] >= 160){
                    (*it)[0] = 63;
                    (*it)[1] = 0;
                    (*it)[2] = 255;
                    count[j + nblocks*(i-1) - 1]++;
                }
            }
        }
    }
    return count;
}
The code compiles without warnings or errors and the video shows up, but if I place the object that I must detect in any part of the ROI the code stops working; this also happens if I remove the
count[(j)+nblocks*(i-1)-1]++;
line. I get the following error:
Access violation writing location 0xFFFFF970.
I really think the problem must be in accessing the iterator when not using Range::all().
To try to clarify what I'm doing, here is an image of a frame:
http://i.imgur.com/Irvr8bk.jpg
The purple region is the frame, the red region is the ROI, and each of the black ones is a sub-ROI.
I've also tried using frame(Rect(0,0,frame.cols,2*vspa)) to define the ROI and it also worked, but I got the same error when trying to work with the sub-ROIs using cv::Rect. So, I really think this must be a MatIterator_ error when not accessing the full row structure.
So, what should I do to work with those sub-ROIs?
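For what it's worth, the same per-sub-ROI count can also be done without MatIterator_ at all, by thresholding each rectangle with cv::inRange and counting with cv::countNonZero. A minimal sketch using the same thresholds as above (the function name and layout are mine, not from the original code; OpenCV 2.4-style constants as in the question):

#include <opencv2/opencv.hpp>
#include <vector>

// Count "red" pixels in each sub-ROI without iterating pixel by pixel.
std::vector<int> countRedPerBlock(const cv::Mat& frame, int hspa, int vspa,
                                  int nlines, int nblocks)
{
    std::vector<int> count(nlines * nblocks, 0);
    for (int i = 0; i < nlines; i++) {
        for (int j = 0; j < nblocks; j++) {
            cv::Rect roi(j * hspa, i * vspa, hspa, vspa);
            cv::Mat hsv, maskLow, maskHigh;
            cv::cvtColor(frame(roi), hsv, CV_BGR2HSV);
            // Hue 0..7 or 171..180, saturation >= 160 (same thresholds as above)
            cv::inRange(hsv, cv::Scalar(0, 160, 0),   cv::Scalar(7, 255, 255),   maskLow);
            cv::inRange(hsv, cv::Scalar(171, 160, 0), cv::Scalar(180, 255, 255), maskHigh);
            cv::Mat mask = maskLow | maskHigh;
            count[i * nblocks + j] = cv::countNonZero(mask);
        }
    }
    return count;
}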

Related

OpenCV 3.1 Stitch images in order they were taken

I am building an Android app to create panoramas. The user captures a set of images, and those images are sent to my native stitching function, which is based on https://github.com/opencv/opencv/blob/master/samples/cpp/stitching_detailed.cpp.
Since the images are in order, I would like to match each image only to the next image in the vector.
I found an Intel article that was doing just that with the following code:
vector<MatchesInfo> pairwise_matches;
BestOf2NearestMatcher matcher(try_gpu, match_conf);
Mat matchMask(features.size(), features.size(), CV_8U, Scalar(0));
for (int i = 0; i < num_images - 1; ++i)
{
    matchMask.at<char>(i, i + 1) = 1;
}
matcher(features, pairwise_matches, matchMask);
matcher.collectGarbage();
The problem is, this won't compile. I'm guessing it's because I'm using OpenCV 3.1.
Then I found somewhere that this code would do the same:
int range_width = 2;
BestOf2NearestRangeMatcher matcher(range_width, try_cuda, match_conf);
matcher(features, pairwise_matches);
matcher.collectGarbage();
And for most of my samples this works fine. However, sometimes, especially when I'm stitching a large set of images (around 15), some objects appear on top of each other and in places they shouldn't.
I've also noticed that the "beginning" (left side) of the end result is not the first image in the vector either, which is strange.
I am using "orb" as features_type and "ray" as ba_cost_func. It seems I can't use SURF on OpenCV 3.1.
The rest of my initial parameters look like this:
bool try_cuda = false;
double compose_megapix = -1; //keeps resolution for final panorama
float match_conf = 0.3f; //0.3 default for orb
string ba_refine_mask = "xxxxx";
bool do_wave_correct = true;
WaveCorrectKind wave_correct = detail::WAVE_CORRECT_HORIZ;
int blend_type = Blender::MULTI_BAND;
float blend_strength = 5;
double work_megapix = 0.6;
double seam_megapix = 0.08;
float conf_thresh = 0.5f;
int expos_comp_type = ExposureCompensator::GAIN_BLOCKS;
string seam_find_type = "dp_colorgrad";
string warp_type = "spherical";
So could anyone enlighten me as to why this is not working and how I should match my features? Any help or direction would be much appreciated!
TL;DR: I want to stitch images in the order they were taken, but the code above is not working for me. How can I do that?
So I found out that the issue here is not the order in which the images are stitched, but rather the rotation that is estimated for the camera parameters in the Homography Based Estimator and the Bundle Ray Adjuster.
Those rotation angles are estimated assuming a self-rotating camera, while my use case involves a user rotating the camera (which means there will be some translation too).
Because of that (I guess), the horizontal angles (around the Y axis) are highly overestimated, which means the algorithm considers the set of images to cover >= 360 degrees, resulting in overlapping areas that shouldn't overlap.
I still haven't found a solution for that problem, though.
matcher() takes a UMat as the mask instead of a Mat object, so try the following code:
vector<MatchesInfo> pairwise_matches;
BestOf2NearestMatcher matcher(try_gpu, match_conf);
Mat matchMask(features.size(), features.size(), CV_8U, Scalar(0));
for (int i = 0; i < num_images - 1; ++i)
{
    matchMask.at<char>(i, i + 1) = 1;
}
UMat umask = matchMask.getUMat(ACCESS_READ);
matcher(features, pairwise_matches, umask);
matcher.collectGarbage();

Template Matching with Mask

I want to perform template matching with a mask. In general, template matching can be made faster by converting the image from the spatial domain into the frequency domain. But is there any method I can apply if I want to do the same with a mask? I'm using OpenCV with C++. Is there a matching function already in OpenCV for this task?
My current approach:
Bitwise XOR image A and image B with the mask.
Count the non-zero pixels.
Fill the result matrix with this count.
Search for maxima.
A few parameters I'm considering now are:
Skip the tile position if the matches are less than 25%.
Skip the tile position if the previous tile's matches are less than 50%.
My question: is there any algorithm to do this matching already? Is there any mathematical operation which can speed up this process?
With binary images, you can directly use Hu moments and the Mahalanobis distance to determine whether image A is similar to image B. If the distance tends to 0, the images are the same.
Of course you can also use feature detectors to see what matches, but for pictures like these, Hu moments and feature detectors will give approximately the same results, and Hu moments are more efficient.
Using findContours, you can extract the black regions inside the white star and fill them, in order to have image A = image B.
Another approach: use findContours on your mask and apply the result to image A (extracting the region of interest); then you can extract what's inside the star and count how many black pixels you have (the mismatching ones).
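As a rough illustration of the Hu-moment idea (my own sketch, not the answerer's code, assuming OpenCV 3.x; cv::matchShapes compares two contours using their Hu moment invariants):

#include <opencv2/opencv.hpp>
#include <vector>

// Compare two binary images by the shape of their largest external contour.
// A distance near 0 means the shapes are (nearly) identical.
double shapeDistance(const cv::Mat& binA, const cv::Mat& binB)
{
    std::vector<std::vector<cv::Point> > cA, cB;
    cv::findContours(binA.clone(), cA, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    cv::findContours(binB.clone(), cB, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (cA.empty() || cB.empty())
        return -1.0; // nothing to compare
    // matchShapes internally uses the Hu moment invariants of the contours.
    return cv::matchShapes(cA[0], cB[0], cv::CONTOURS_MATCH_I1, 0);
}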
I have the same requirement and tried almost the same approach. As in the image, I want to match the castle. The castle has a different shield image, a variable-length clan name, and a grass background (the image comes from the game Clash of Clans). The normal OpenCV matchTemplate does not work, so I wrote my own.
I follow the pattern of matchTemplate and create a result image, but with a different algorithm.
The core idea is to count the matched pixels under the mask. The code follows; it is simple.
This works fine, but the time cost is high; it takes 457 ms.
Now I am working on the optimization.
The source and template images are both CV_8UC3; the mask image is CV_8U. Matching one channel is OK. It is faster, but the cost is still high.
// matSource, matTempl, matMask, imgResult, imgTemplate, resultRows, resultCols,
// matchingDegree, maxMatchs, rcOuts, maxIndex, SRect and Log_Warn come from the
// surrounding code, which is not shown here.
Mat tmp(matTempl.rows, matTempl.cols, matTempl.type()); // note: Mat(rows, cols, type)
int matchCount = 0;
float maxVal = 0;
double areaInvert = 1.0 / countNonZero(matMask);
for (int j = 0; j < resultRows; j++)
{
    float* data = imgResult.ptr<float>(j);
    for (int i = 0; i < resultCols; i++)
    {
        Mat matROI(matSource, Rect(i, j, matTempl.cols, matTempl.rows));
        tmp.setTo(Scalar(0));
        bitwise_xor(matROI, matTempl, tmp);                      // differing pixels become non-zero
        bitwise_and(tmp, matMask, tmp);                          // keep only the masked area
        data[i] = 1.0f - float(countNonZero(tmp) * areaInvert);  // matching score in [0, 1]
        if (data[i] > matchingDegree)
        {
            SRect rc;
            rc.left = i;
            rc.top = j;
            rc.right = i + imgTemplate.cols;
            rc.bottom = j + imgTemplate.rows;
            rcOuts.push_back(rc);
            if (data[i] > maxVal)
            {
                maxVal = data[i];
                maxIndex = rcOuts.size() - 1;
            }
            if (++matchCount == maxMatchs)
            {
                Log_Warn("Too many matches, stopped at: " << matchCount);
                return true;
            }
        }
    }
}
The example image is here: http://i.stack.imgur.com/mJrqU.png
Update:
I succeeded in optimizing the algorithm by using key points. Calculating all the points is costly, but calculating only several key points is much faster. As the picture shows, the cost decreases greatly; it is now about 7 ms.
The picture is here: http://i.stack.imgur.com/ePcD9.png
There is a formulation for template matching with a mask in the OpenCV documentation, and it works well. It can be used by calling cv::matchTemplate, and its source code is also available under the Intel license.
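For reference, a minimal sketch of that call (assuming OpenCV 3.x, where cv::matchTemplate accepts an optional mask argument; the file names here are placeholders of mine):

#include <opencv2/opencv.hpp>

int main()
{
    // Load everything as single-channel so image, template and mask share one type.
    cv::Mat image = cv::imread("scene.png", cv::IMREAD_GRAYSCALE);
    cv::Mat templ = cv::imread("castle.png", cv::IMREAD_GRAYSCALE);
    cv::Mat mask  = cv::imread("castle_mask.png", cv::IMREAD_GRAYSCALE);

    cv::Mat result;
    // TM_CCORR_NORMED is one of the methods that supports a mask.
    cv::matchTemplate(image, templ, result, cv::TM_CCORR_NORMED, mask);

    double maxVal;
    cv::Point maxLoc;
    cv::minMaxLoc(result, nullptr, &maxVal, nullptr, &maxLoc);
    cv::rectangle(image, maxLoc,
                  maxLoc + cv::Point(templ.cols, templ.rows),
                  cv::Scalar(255), 2);          // mark the best match
    cv::imwrite("match.png", image);
    return 0;
}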

Kinect for Windows v2 depth to color image misalignment

Currently I am developing a tool for the Kinect for Windows v2 (similar to the one in the Xbox One). I tried to follow some examples, and I have a working example that shows the camera image, the depth image, and an image that maps the depth to the RGB using OpenCV. But I see that it duplicates my hand when doing the mapping, and I think it is due to something wrong in the coordinate mapper part.
Here is an example of it:
And here is the code snippet that creates the image (rgbd image in the example)
void KinectViewer::create_rgbd(cv::Mat& depth_im, cv::Mat& rgb_im, cv::Mat& rgbd_im){
    HRESULT hr = m_pCoordinateMapper->MapDepthFrameToColorSpace(cDepthWidth * cDepthHeight, (UINT16*)depth_im.data, cDepthWidth * cDepthHeight, m_pColorCoordinates);
    rgbd_im = cv::Mat::zeros(depth_im.rows, depth_im.cols, CV_8UC3);
    double minVal, maxVal;
    cv::minMaxLoc(depth_im, &minVal, &maxVal);
    for (int i = 0; i < cDepthHeight; i++){
        for (int j = 0; j < cDepthWidth; j++){
            if (depth_im.at<UINT16>(i, j) > 0 && depth_im.at<UINT16>(i, j) < maxVal * (max_z / 100) && depth_im.at<UINT16>(i, j) > maxVal * min_z / 100){
                double a = i * cDepthWidth + j;
                ColorSpacePoint colorPoint = m_pColorCoordinates[i * cDepthWidth + j];
                int colorX = (int)(floor(colorPoint.X + 0.5));
                int colorY = (int)(floor(colorPoint.Y + 0.5));
                if ((colorX >= 0) && (colorX < cColorWidth) && (colorY >= 0) && (colorY < cColorHeight))
                {
                    rgbd_im.at<cv::Vec3b>(i, j) = rgb_im.at<cv::Vec3b>(colorY, colorX);
                }
            }
        }
    }
}
Does anyone have a clue how to solve this? How can I prevent this duplication?
Thanks in advance
UPDATE:
If I do a simple depth image thresholding I obtain the following image:
This is more or less what I expected to happen: no duplicate hand in the background. Is there a way to prevent this duplicate hand in the background?
I suggest you use the BodyIndexFrame to identify whether a specific value belongs to a player or not. This way, you can reject any RGB pixel that does not belong to a player and keep the rest of them. I do not think that CoordinateMapper is lying.
A few notes:
Include the BodyIndexFrame source to your frame reader
Use MapColorFrameToDepthSpace instead of MapDepthFrameToColorSpace; this way, you'll get the HD image for the foreground
Find the corresponding DepthSpacePoint and depthX, depthY, instead of ColorSpacePoint and colorX, colorY
Here is my approach when a frame arrives (it's in C#):
depthFrame.CopyFrameDataToArray(_depthData);
colorFrame.CopyConvertedFrameDataToArray(_colorData, ColorImageFormat.Bgra);
bodyIndexFrame.CopyFrameDataToArray(_bodyData);
_coordinateMapper.MapColorFrameToDepthSpace(_depthData, _depthPoints);
Array.Clear(_displayPixels, 0, _displayPixels.Length);
for (int colorIndex = 0; colorIndex < _depthPoints.Length; ++colorIndex)
{
    DepthSpacePoint depthPoint = _depthPoints[colorIndex];
    if (!float.IsNegativeInfinity(depthPoint.X) && !float.IsNegativeInfinity(depthPoint.Y))
    {
        int depthX = (int)(depthPoint.X + 0.5f);
        int depthY = (int)(depthPoint.Y + 0.5f);
        if ((depthX >= 0) && (depthX < _depthWidth) && (depthY >= 0) && (depthY < _depthHeight))
        {
            int depthIndex = (depthY * _depthWidth) + depthX;
            byte player = _bodyData[depthIndex];
            // Identify whether the point belongs to a player
            if (player != 0xff)
            {
                int sourceIndex = colorIndex * BYTES_PER_PIXEL;
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // B
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // G
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // R
                _displayPixels[sourceIndex] = 0xff;                      // A
            }
        }
    }
}
Here is the initialization of the arrays:
BYTES_PER_PIXEL = (PixelFormats.Bgr32.BitsPerPixel + 7) / 8;
_colorWidth = colorFrame.FrameDescription.Width;
_colorHeight = colorFrame.FrameDescription.Height;
_depthWidth = depthFrame.FrameDescription.Width;
_depthHeight = depthFrame.FrameDescription.Height;
_bodyIndexWidth = bodyIndexFrame.FrameDescription.Width;
_bodyIndexHeight = bodyIndexFrame.FrameDescription.Height;
_depthData = new ushort[_depthWidth * _depthHeight];
_bodyData = new byte[_depthWidth * _depthHeight];
_colorData = new byte[_colorWidth * _colorHeight * BYTES_PER_PIXEL];
_displayPixels = new byte[_colorWidth * _colorHeight * BYTES_PER_PIXEL];
_depthPoints = new DepthSpacePoint[_colorWidth * _colorHeight];
Notice that the _depthPoints array has a 1920x1080 size.
Once again, the most important thing is to use the BodyIndexFrame source.
Finally I got some time to write the long-awaited answer.
Let's start with some theory to understand what is really happening, and then a possible answer.
We should start by knowing how to go from a 3D point cloud, with the depth camera as the origin of the coordinate system, to an image in the image plane of the RGB camera. To do that, it is enough to use the pinhole camera model:
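(The original answer showed the equation as an image; a standard way to write the projection it describes, reconstructed here, is:)

s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} R & t \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}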
Here, u and v are the coordinates in the image plane of the RGB camera. The first matrix on the right side of the equation is the camera matrix, a.k.a. the intrinsics of the RGB camera. The following matrix holds the rotation and translation of the extrinsics, or better said, the transformation needed to go from the depth camera coordinate system to the RGB camera coordinate system. The last part is the 3D point.
Basically, something like this is what the Kinect SDK does. So, what could go wrong that makes the hand get duplicated? Well, actually, more than one point can project to the same pixel...
To put it in other words, in the context of the problem in the question:
The depth image is a representation of an ordered point cloud, and I am querying the u, v values of each of its pixels, which in reality can easily be converted to 3D points. The SDK gives you the projection, but several points can end up on the same pixel (usually, the larger the distance along the z axis between two neighboring points, the more easily this problem appears).
Now, the big question: how can you avoid this? Well, I am not sure using the Kinect SDK, since you do not know the Z value of the points after the extrinsics are applied, so it is not possible to use a technique like Z-buffering. However, you may assume the Z value will be quite similar and use the one from the original point cloud (at your own risk).
If you were doing it manually, and not with the SDK, you could apply the extrinsics to the points and then project them into the image plane, marking in another matrix which point is mapped to which pixel; if a point is already mapped there, compare the z values and always keep the point closest to the camera. Then you will have a valid mapping without any problems. This is a rather naive way; you can probably find better ones, since the problem is now clear :)
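A minimal sketch of that manual z-buffered projection (my own illustration, not Kinect SDK code; the intrinsics K and extrinsics R, t are assumed to be known):

#include <opencv2/opencv.hpp>
#include <vector>
#include <limits>

// Project 3D points (in depth-camera coordinates) into the RGB image plane,
// keeping only the point closest to the camera at each pixel (a z-buffer).
void projectWithZBuffer(const std::vector<cv::Point3d>& points,
                        const cv::Matx33d& K,   // RGB intrinsics (assumed known)
                        const cv::Matx33d& R,   // depth->RGB rotation (assumed known)
                        const cv::Vec3d& t,     // depth->RGB translation (assumed known)
                        cv::Mat& zbuffer,       // CV_64F, filled with the closest z per pixel
                        int width, int height)
{
    zbuffer = cv::Mat(height, width, CV_64F,
                      cv::Scalar(std::numeric_limits<double>::max()));
    for (const cv::Point3d& p : points)
    {
        cv::Vec3d pc = R * cv::Vec3d(p.x, p.y, p.z) + t; // apply extrinsics
        if (pc[2] <= 0) continue;                        // behind the camera
        cv::Vec3d uvw = K * pc;                          // apply intrinsics
        int u = (int)(uvw[0] / uvw[2] + 0.5);
        int v = (int)(uvw[1] / uvw[2] + 0.5);
        if (u < 0 || u >= width || v < 0 || v >= height) continue;
        double& z = zbuffer.at<double>(v, u);
        if (pc[2] < z) z = pc[2];                        // keep the closest point
    }
}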
I hope it is clear enough.
P.S.:
I do not have a Kinect 2 at the moment, so I can't check whether there is an update related to this issue or whether the same thing still happens. I used the first released version (not the pre-release) of the SDK, so a lot of changes may have happened since then. If someone knows whether this was solved, just leave a comment :)

Why can't I access the Object in my function?

I have a function that detects motion between two frames and stores a cropped image of only the moving object in the variable cv::Mat result_cropped. Now I want to add a function that checks result_cropped for black pixels. I wrote the code for that easily, but I'm completely stuck trying to implement it in my class.
For some reason my blackDetection(Mat & cropped) can't access the cropped image, which results in the program crashing.
Here's my simplified code:
void ActualRec::run(){
    while (isActive){
        // ...code to check for motion
        // if there was motion, a cropped image will be stored in result_cropped
        number_of_changes = detectMotion(motion, result, result_cropped, region, max_deviation, color);
        if (number_of_changes >= there_is_motion) {
            if (number_of_sequence > 0){
                // there was motion detected, store cropped image - this works
                saveImg(pathnameThresh, result_cropped);
                if (blackDetection(result_cropped) == true){
                    // the cropped image has black pixels
                }
                else {
                    // the cropped image has no black pixels
                }
                number_of_sequence++;
            }
            else
            {
                // no motion was detected
            }
        }
    }
}

bool ActualRec::blackDetection(Mat & result_cropped){
    // ...check for black pixels, program crashes since result_cropped is empty
    // if I add imshow("test", result_cropped) I keep getting an empty window
    if (blackPixelCounter > 0){
        return true;
    }
    else return false;
}
Again, the problem is that I can't manage to access result_cropped in blackDetection(Mat & result_cropped).
Edit: my complete code for this class is at http://pastebin.com/3i0WdLG0. Please, someone help me; this problem doesn't make any sense to me.
You don't have a cv::waitKey() in blackDetection(), so you will crash before you get to the cvWaitKey() in run(). You are jumping to the conclusion that result_cropped is "empty".
You have not allocated croppedBlack anywhere, so you will crash on croppedBlack.at<Vec3b>(y,x)[c] =.
Add this at the start of blackDetection() (e.g.):
croppedBlack.create(result_cropped.size(), result_cropped.type());
To make it faster see How to scan images ... with OpenCV : The efficient way
bool ActualRec::blackDetection(Mat& result_cropped)
{
    // Allocate croppedBlack to match the input before writing to it.
    croppedBlack.create(result_cropped.size(), result_cropped.type());
    int blackCounter = 0;
    for (int y = 0; y < result_cropped.rows; ++y)
    {
        Vec3b* croppedBlack_row = croppedBlack.ptr<Vec3b>(y);
        Vec3b* result_cropped_row = result_cropped.ptr<Vec3b>(y);
        for (int x = 0; x < result_cropped.cols; ++x)
        {
            for (int c = 0; c < 3; ++c)
            {
                // alpha and beta are the brightness/contrast members from the
                // original class (see the pastebin link above).
                croppedBlack_row[x][c] =
                    saturate_cast<uchar>(alpha * result_cropped_row[x][c] + beta);
            }
            // Count black pixels so the function can return a result
            // (this part was missing in the original snippet).
            if (result_cropped_row[x][0] == 0 &&
                result_cropped_row[x][1] == 0 &&
                result_cropped_row[x][2] == 0)
            {
                ++blackCounter;
            }
        }
    }
    return blackCounter > 0;
}

OpenCV cvSet2D... what does this do?

I was browsing some code on the OpenCV page about accessing pixel data:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
CvScalar s;
s=cvGet2D(img,i,j); // get the (i,j) pixel value
printf("B=%f, G=%f, R=%f\n",s.val[0],s.val[1],s.val[2]);
s.val[0]=111;
s.val[1]=111;
s.val[2]=111;
cvSet2D(img,i,j,s); // set the (i,j) pixel value
I had done something similar, but I used the provided template class to access pixel data... Anyway, I'm not sure I understand the part s.val[0] = 111, etc.
If s.val[0] contains the B value, what exactly is s.val[0] = 111 doing? Is it setting it to black? I don't understand exactly what it's supposed to be.
I'm used to CvScalar and such, but I don't understand this format. Specifically, what does 111 mean?
Thanks
The cvSet2D(img, i, j, s) function does not access the (i,j)-th pixel in (x,y) terms; it accesses the (j,i)-th pixel. That is because images are stored as a matrix: you need to specify the row first (the Y coordinate) and then the column (the X coordinate).
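For instance (an illustrative snippet of my own, not from the original post), to modify the pixel at x = 10, y = 20 you pass the row (y) first:

// img created as in the question (IPL_DEPTH_32F, 3 channels)
int x = 10, y = 20;                // x = column, y = row
CvScalar s = cvGet2D(img, y, x);   // row index first, then column
s.val[0] = 111;                    // blue channel set to the value 111 (not black; black would be 0)
s.val[1] = 111;                    // green
s.val[2] = 111;                    // red
cvSet2D(img, y, x, s);             // write the modified pixel back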
Instead of using the cvGet/Set functions, did you try using pointers to access data within an image?
If you want direct access to the pixels, after loading an image you could do something like:
// This example converts a color image to its grayscale version.
// Let's say that rgb_img is your previously loaded 8-bit, 3-channel image.
IplImage* gray_frame = 0;
gray_frame = cvCreateImage(cvSize(rgb_img->width, rgb_img->height), rgb_img->depth, rgb_img->nChannels);
if (!gray_frame)
{
    fprintf(stderr, "!!! cvCreateImage failed!\n");
    return NULL;
}
// imageData is a (signed) char*, so cast to unsigned char before averaging.
for (int i = 0; i < rgb_img->width * rgb_img->height * rgb_img->nChannels; i += rgb_img->nChannels)
{
    unsigned char gray = ((unsigned char)rgb_img->imageData[i] +
                          (unsigned char)rgb_img->imageData[i+1] +
                          (unsigned char)rgb_img->imageData[i+2]) / 3;
    gray_frame->imageData[i]   = gray; // B
    gray_frame->imageData[i+1] = gray; // G
    gray_frame->imageData[i+2] = gray; // R
}