Making the usual blob tracker with OpenCV and cvBlobsLib, I've come across this problem and it seems no one else had it, which makes me sad. I get the RGB/BGR frame, choose the color to isolate, treshold it into b/w, find the blobs and add the bounding rectangle on each blob, but when I display the final image, the box is stretched on the x-axis: when the object is on the left the box is close to it (although around 2.5 times larger), and as it moves to the right the box moves faster (= more and more far from the object) until it reaches the right end of the window when the object isn't even halfway. This doesn't happen on the y-axis, where everything is fine. It's not a problem with rectangles, it happens when I use fillBlob aswell, the blob shape comes out stretched and misaligned. Also, it's not a problem related to image capturing, since I've tried with kinect (OpenNI), webcam and even using a single image (imread()), and I verified that every ImageGenerator, Mat, IplImage used were 640x480, 8bit depth, for which I used AUTOSIZE for the namedWindow (enlarging to fullscreen window doesn't help either). Showing the BGR frame and the tresholded image gives no problems, they both fit into the window, but the detected blobs seem to belong to a different resolution space when I merge them with the original image. Here's the code, not much has changed from the usual examples found online everywhere:
//[...]
namedWindow("Color Image", CV_WINDOW_AUTOSIZE);
namedWindow("Color Tracking", CV_WINDOW_AUTOSIZE);
//[...] I already got the two cv::Mat I need, imgBGR and imgTresh
CBlobResult blobs;
CBlob *currentBlob;
Point pt1, pt2;
Rect rect;
//had to do Mat to IplImage conversion, since cvBlobsLib doesn't like mats
IplImage iplTresh = imgTresh;
IplImage iplBGR = imgBGR;
blobs = CBlobResult(&iplTresh, NULL, 0);
blobs.Filter(blobs, B_EXCLUDE, CBlobGetArea(), B_LESS, 100);
int nBlobs = blobs.GetNumBlobs();
for (int i = 0; i < nBlobs; i++)
{
currentBlob = blobs.GetBlob(i);
rect = currentBlob->GetBoundingBox();
pt1.x = rect.x;
pt1.y = rect.y;
pt2.x = rect.x + rect.width;
pt2.y = rect.y + rect.height;
cvRectangle(&iplBGR, pt1, pt2, cvScalar(255, 255, 255, 0), 3, 8, 0);
}
//[...]
imshow("Color Image", imgBGR);
imshow("Color Tracking", imgTresh);
The "[...]" is code that shouldn't have nothing to do with this issue, but if you need further info on how I handled the images, let me know and I'll post it.
Based on the fact that the way I capture the image doesn't change anything, that BGR frame and B/W image are well shown, and that after getting blobs any way of displaying them gives the same (wrong) result, the problem must be something between CBlobResult() and matrix2ipl conversion, but I don't really know how to find it out.
Oh god, I spent ages to write the whole problem and the next day I found the answer almost casually. As I created the B/W matrix for tresholding, I didn't make it single-channel; I copied the BGR matrix type, thus having a treshold image with 3 channels which resulted in a widthStep 3 times the frame width. Resolved creating cv::Mat imgTresh with CV_8UC1 as type.
Related
Objective and problem
I'm trying to process a video file on the fly using OpenCV 3.4.1 by grabbing each frame, converting to grayscale, then doing Canny edge detection on it. In order to display the images (on the fly as well), I created a Mat class with 3 additional headers that is three times as wide as the original frame. The 3 extra headers represent the images I would like to display in the composite, and are positioned to the 1st, 2nd and 3rd horizontal segment of the composite.
After image processing however, the display of the composite image is not as expected: the first segment (where the original frame should be) is completely black, while the other segments (of processed images) are displayed fine. If, on the other hand, I display the ROIs one by one in separate windows, all the images look fine.
These are the things I tried to overcome this issue:
use .copyTo to actually copy the data into the appropriate image segments. The result was the same.
I put the Canny image to the compOrigPart ROI, and it did display in the first segment, so it is not a problem with the definition of the ROIs.
Define the composite as three channel image
In the loop convert it to grayscale
put processed images into it
convert back to BGR
put the original in.
This time around the whole composite was black, nothing showed.
As per gameon67's suggestion, I tried to create a namedWindow as well, but that doesn't help either.
Code:
int main() {
cv::VideoCapture vid("./Vid.avi");
if (!vid.isOpened()) return -1;
int frameWidth = vid.get(cv::CAP_PROP_FRAME_WIDTH);
int frameHeight = vid.get(cv::CAP_PROP_FRAME_HEIGHT);
int frameFormat = vid.get(cv::CAP_PROP_FORMAT);
cv::Scalar fontColor(250, 250, 250);
cv::Point textPos(20, 20);
cv::Mat frame;
cv::Mat compositeFrame(frameHeight, frameWidth*3, frameFormat);
cv::Mat compOrigPart(compositeFrame, cv::Range(0, frameHeight), cv::Range(0, frameWidth));
cv::Mat compBwPart(compositeFrame, cv::Range(0, frameHeight), cv::Range(frameWidth, frameWidth*2));
cv::Mat compEdgePart(compositeFrame, cv::Range(0, frameHeight), cv::Range(frameWidth*2, frameWidth*3));
while (vid.read(frame)) {
if (frame.empty()) break;
cv::cvtColor(frame, compBwPart, cv::COLOR_BGR2GRAY);
cv::Canny(compBwPart, compEdgePart, 100, 150);
compOrigPart = frame;
cv::putText(compOrigPart, "Original", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::putText(compBwPart, "GrayScale", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::putText(compEdgePart, "Canny edge detection", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::imshow("Composite of Original, BW and Canny frames", compositeFrame);
cv::imshow("Original", compOrigPart);
cv::imshow("BW", compBwPart);
cv::imshow("Canny", compEdgePart);
cv::waitKey(33);
}
}
Questions
Why can't I display the entirety of the composite image in a single window, while displaying them separately is OK?
What is the difference between these displays? The data is obviously there, as evidenced by the separate windows.
Why only the original frame is misbehaving?
Your compBwPart and compEdgePart are grayscale images so the Mat type is CV8UC1 - single channel and therefore your compositeFrame is in grayscale too. If you want to combine these two images with a color image you have to convert it to BGR first and then fill the compOrigPart.
while (vid.read(frame)) {
if (frame.empty()) break;
cv::cvtColor(frame, compBwPart, cv::COLOR_BGR2GRAY);
cv::Canny(compBwPart, compEdgePart, 100, 150);
cv::cvtColor(compositeFrame, compositeFrame, cv::COLOR_GRAY2BGR);
frame.copyTo(compositeFrame(cv::Rect(0, 0, frameWidth, frameHeight)));
cv::putText(compOrigPart, "Original", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor); //the rest of your code
This is a combination of several issues.
The first problem is that you set the type of compositeFrame to the value returned by vid.get(cv::CAP_PROP_FORMAT). Unfortunately that property doesn't seem entirely reliable -- I've just had it return 0 (meaning CV_8UC1) after opening a color video, and then getting 3 channel (CV_8UC3) frames. Since you want to have the compositeFrame the same type as the input frame, this won't work.
To work around it, instead of using those properties, I'd lazy initialize compositeFrame and the 3 ROIs after receiving the first frame (based on it's dimensions and type).
The next set of problems lies in those two statements:
cv::cvtColor(frame, compBwPart, cv::COLOR_BGR2GRAY);
cv::Canny(compBwPart, compEdgePart, 100, 150);
In this case assumption is made that frame is BGR (since you're trying to convert), meaning compositeFrame and its ROIs are also BGR. Unfortunately, in both cases you're writing a grayscale image into the ROI. This will cause a reallocation, and the target Mat will cease to be a ROI.
To correct this, use temporary Mats for the grayscale data, and use cvtColor to turn it back to BGR to write into the ROIs.
Similar problem lies in the following statement:
compOrigPart = frame;
That's a shallow copy, meaning it will just make compOrigPart another reference to frame (and therefore it will cease to be a ROI of compositeFrame).
What you need is a deep copy, using copyTo (note that the data types still need to match, but that was fixed earlier).
Finally, even though you try to be flexible regarding the type of the input video (judging by the vid.get(cv::CAP_PROP_FORMAT)), the rest of the code really assumes that the input is 3 channel, and will break if it isn't.
At the least, there should be some assertion to cover this expectation.
Putting this all together:
#include <opencv2/opencv.hpp>
int main()
{
cv::VideoCapture vid("./Vid.avi");
if (!vid.isOpened()) return -1;
cv::Scalar fontColor(250, 250, 250);
cv::Point textPos(20, 20);
cv::Mat frame, frame_gray, edges_gray;
cv::Mat compositeFrame;
cv::Mat compOrigPart, compBwPart, compEdgePart; // ROIs
while (vid.read(frame)) {
if (frame.empty()) break;
if (compositeFrame.empty()) {
// The rest of code assumes video to be BGR (i.e. 3 channel)
CV_Assert(frame.type() == CV_8UC3);
// Lazy initialize once we have the first frame
compositeFrame = cv::Mat(frame.rows, frame.cols * 3, frame.type());
compOrigPart = compositeFrame(cv::Range::all(), cv::Range(0, frame.cols));
compBwPart = compositeFrame(cv::Range::all(), cv::Range(frame.cols, frame.cols * 2));
compEdgePart = compositeFrame(cv::Range::all(), cv::Range(frame.cols * 2, frame.cols * 3));
}
cv::cvtColor(frame, frame_gray, cv::COLOR_BGR2GRAY);
cv::Canny(frame_gray, edges_gray, 100, 150);
// Deep copy data to the ROI
frame.copyTo(compOrigPart);
// The ROI is BGR, so we need to convert back
cv::cvtColor(frame_gray, compBwPart, cv::COLOR_GRAY2BGR);
cv::cvtColor(edges_gray, compEdgePart, cv::COLOR_GRAY2BGR);
cv::putText(compOrigPart, "Original", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::putText(compBwPart, "GrayScale", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::putText(compEdgePart, "Canny edge detection", textPos, cv::FONT_HERSHEY_PLAIN, 1, fontColor);
cv::imshow("Composite of Original, BW and Canny frames", compositeFrame);
cv::imshow("Original", compOrigPart);
cv::imshow("BW", compBwPart);
cv::imshow("Canny", compEdgePart);
cv::waitKey(33);
}
}
Screenshot of the composite window (using some random test video off the web):
I am trying to draw an arrow with OpenCV 3.2:
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
int main()
{
Mat image(480, 640, CV_8UC3, Scalar(255, 255, 255)); //White background
Point from(320, 240); //Middle
Point to(639, 240); //Right border
arrowedLine(image, from, to, Vec3b(0, 0, 0), 1, LINE_AA, 0, 0.1);
imshow("Arrow", image);
waitKey(0);
return 0;
}
An arrow is drawn, but at the tip some pixels are missing:
To be more precise, two columns of pixels are not colored correctly (zoomed):
If I disable antialiasing, i.e., if I use
arrowedLine(image, from, to, Vec3b(0, 0, 0), 1, LINE_8, 0, 0.1);
instead (note the LINE_8 instead of LINE_AA), the pixels are there, albeit without antialiasing:
I am aware that antialiasing might rely on neighboring pixels, but it seems strange that pixels are not drawn at all at the borders instead of being drawn without antialiasing. Is there a workaround for this issue?
Increasing the X coordinate, e.g. to 640 or 641) makes the problem worse, i.e., more of the arrow head pixels disappear, while the tip still lacks nearly two complete pixel columns.
Extending and cropping the image would solve the neighboring pixels issue, but in my original use case, where the problem appeared, I cannot enlarge my image, i.e., its size must remain constant.
After a quick review, I've found that OpenCV draws AA lines using a Gaussian filter, which contracts the final image.
As I've suggested in comments, you can implement your own function for the AA mode (you can call the original one if AA is disabled) extending the points manually (see code below to have an idea).
Other option may be to increase the line width when using AA.
You may also simulate the AA effect of OpenCV but on the final image (may be slower but helpful if you have many arrows). I'm not an OpenCV expert so I'll write a general scheme:
// Filter radius, the higher the stronger
const int kRadius = 3;
// Image is extended to fit pixels that are not going to be blurred
Mat blurred(480 + kRadius * 2, 640 + kRadius * 2, CV_8UC3, Scalar(255, 255, 255));
// Points moved a according to filter radius (need testing, but the idea is that)
Point from(320, 240 + kRadius);
Point to(639 + kRadius * 2, 240 + kRadius);
// Extended non-AA arrow
arrowedLine(blurred, ..., LINE_8, ...);
// Simulate AA
GaussianBlur(blurred, blurred, Size(kRadius, kRadius), ...);
// Crop image (be careful, it doesn't copy data)
Mat image = blurred(Rect(kRadius, kRadius, 640, 480));
Another option may be to draw the arrow in an image twice as large and the scale it down with a good smoothing filter.
Obviously, last two options will work only if you don't have any previous data on the image. If so, then use a transparent image for temporal drawing and overlay it at the end.
I'm trying to get the people detector provided by the OpenCV library running. So far I get decent performance on my iPhone 6 but the detection is super bad and almost never correct and I'm not really sure why this is since you can find example videos using the same default HOG descriptor with way better detection.
Here is the code:
- (void)processImage:(Mat&)image {
cv::Mat cvImg, result;
cvtColor(image, cvImg, COLOR_BGR2HSV);
cv::vector<cv::Rect> found, found_filtered;
hog.detectMultiScale(cvImg, found, 0, cv::Size(4,4), cv::Size(8,8), 1.5, 0);
size_t i;
for (i=0; i < found.size(); i++) {
cv::Rect r = found[i];
rectangle(image, r.tl(), r.br(), Scalar(0,255,0), 2);
}
}
The video input comes from the iPhone camera itself and "processImage:" is called for every frame. For the HOGDescriptor I use the default people detector:
_hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
I appreciate any help. :)
I'm new to openCV, so take this with a grain of salt:
The line cvtColor(image, cvImg, COLOR_BGR2HSV); converts the image from the BGR color space to the HSV color space. Essentially, it changes each pixel from being represented by how much blue, green, and red it has, to being represented by the components hue (color), saturation (how much color) and value (how bright). Clearly, the hogDescriptor acts on a BGR image, not an HSV image. You need to pass it a type CV_8UC3 image: An image with 3 channels per pixel (C3), ex. BGR, and an 8bit unsigned number for each channel (8U), This part is less important. What are you passing into the method processImage()? It should be one of those types. If not, you need to know the type and convert it to CV_8UC3 using the cvtColor() method
I'm working in opencv 2.4.0 and C++
I'm trying to do an exercise that says I should load an RGB image, convert it to gray scale and save the new image. The next step is to make the grayscale image into a binary image and store that image. This much I have working.
My problem is in counting the amount of black pixels in the binary image.
So far I've searched the web and looked in the book. The method that I've found that seems the most useful is.
int TotalNumberOfPixels = width * height;
int ZeroPixels = TotalNumberOfPixels - cvCountNonZero(cv_image);
But I don't know how to store these values and use them in cvCountNonZero(). When I pass the the image I want counted from to this function I get an error.
int main()
{
Mat rgbImage, grayImage, resizedImage, bwImage, result;
rgbImage = imread("C:/MeBGR.jpg");
cvtColor(rgbImage, grayImage, CV_RGB2GRAY);
resize(grayImage, resizedImage, Size(grayImage.cols/3,grayImage.rows/4),
0, 0, INTER_LINEAR);
imwrite("C:/Jakob/Gray_Image.jpg", resizedImage);
bwImage = imread("C:/Jakob/Gray_Image.jpg");
threshold(bwImage, bwImage, 120, 255, CV_THRESH_BINARY);
imwrite("C:/Jakob/Binary_Image.jpg", bwImage);
imshow("Original", rgbImage);
imshow("Resized", resizedImage);
imshow("Resized Binary", bwImage);
waitKey(0);
return 0;
}
So far this code is very basic but it does what it's supposed to for now. Some adjustments will be made later to clean it up :)
You can use countNonZero to count the number of pixels that are not black (>0) in an image. If you want to count the number of black (==0) pixels, you need to subtract the number of pixels that are not black from the number of pixels in the image (the image width * height).
This code should work:
int TotalNumberOfPixels = bwImage.rows * bwImage.cols;
int ZeroPixels = TotalNumberOfPixels - countNonZero(bwImage);
cout<<"The number of pixels that are zero is "<<ZeroPixels<<endl;
I'm trying to make a program to detect an object in any shape using a video camera/webcam based on Canny filter and contour finding function. Here is my program:
int main( int argc, char** argv )
{
CvCapture *cam;
CvMoments moments;
CvMemStorage* storage = cvCreateMemStorage(0);
CvSeq* contours = NULL;
CvSeq* contours2 = NULL;
CvPoint2D32f center;
int i;
cam=cvCaptureFromCAM(0);
if(cam==NULL){
fprintf(stderr,"Cannot find any camera. \n");
return -1;
}
while(1){
IplImage *img=cvQueryFrame(cam);
if(img==NULL){return -1;}
IplImage *src_gray= cvCreateImage( cvSize(img->width,img->height), 8, 1);
cvCvtColor( img, src_gray, CV_BGR2GRAY );
cvSmooth( src_gray, src_gray, CV_GAUSSIAN, 5, 11);
cvCanny(src_gray, src_gray, 70, 200, 3);
cvFindContours( src_gray, storage, &contours, sizeof(CvContour), CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE, cvPoint(0,0));
if(contours==NULL){ contours=contours2;}
contours2=contours;
cvMoments(contours, &moments, 1);
double m_00 = cvGetSpatialMoment( &moments, 0, 0 );
double m_10 = cvGetSpatialMoment( &moments, 1, 0 );
double m_01 = cvGetSpatialMoment( &moments, 0, 1 );
float gravityX = (m_10 / m_00)-150;
float gravityY = (m_01 / m_00)-150;
if(gravityY>=0&&gravityX>=0){
printf("center point=(%.f, %.f) \n",gravityX,gravityY); }
for (; contours != 0; contours = contours->h_next){
CvScalar color = CV_RGB(250,0,0);
cvDrawContours(img,contours,color,color,-1,-1, 8, cvPoint(0,0));
}
cvShowImage( "Input", img );
cvShowImage( "Contours", src_gray );
cvClearMemStorage(storage);
if(cvWaitKey(33)>=0) break;
}
cvDestroyWindow("Contours");
cvDestroyWindow("Source");
cvReleaseCapture(&cam);
}
This program will detect all contours captured by the camera and the average coordinate of the contours will be printed. My question is how to filter out only one object/contour so I can get more precise (x,y) position of the object? If possible, can anyone show me how to mark the center of the object by using (x,y) coordinates?
Thanks in advance. Cheers
p/s:Sorry I couldn't upload a screenshot yet but if anything helps, here's the link.
Edit: To make my question more clear:
For example, if I only want to filter out only the square from my screenshot above, what should I do?
The object I want to filter out has the biggest contour area and most importantly has a shape(any shape), not a straight or a curve line
I'm still experimenting with the smooth and canny values so if anybody have the problem to detect the contours using my program please alter the values.
I think it can be solved fairly easy. I would suggest some morphological operations before contour detection. Also, I would suggest filtering "out" smaller elements, and getting the biggest element as the only one still in the image.
I suggest:
for filtering out lines (straight or curved): you have to decide what do you yourself consider a border between a "line" and a "shape". Let's say you consider all the objects of a thickness 5 pixel or more to be objects, while the ones that are less than 5 pixels across to be lines. An morphological opening that uses a 5x5 square or a 3-pixel sized diamond shape as a structuring element would take care of this.
for filtering out small objects in general: if objects are of arbitrary shapes, purely morphological opening won't do: you have to do an algebraic opening. A special type of algebraic openings is an area opening: an operation that removes all the connected components in the image that have (pixel) area smaller than a given threshold. If you have an upper bound on the size of uninteresting objects, or a lower bound on the size of interesting ones, that value should be used as a threshold. You can probably get a similar effect with a larger morphological opening, but it will not be so flexible.
for filtering out all the objects except the largest: it sounds like removing connected components from the smallest one to the largest one should work. Try labeling the connected components. On a binary (black & white image), this image transformation works by creating a greyscale image, labeling the background as 0 (black), and each component with a different, increasing grey value. In the end, pixels of each object are marked by a different value. You can now simply look at the gray level histogram, and find the grey value with the most pixels. Set all the other grey levels to 0 (black), and the only object left in the image is the biggest one.
The suggestions are written from the simplest to the most complex ones. Still, I think OpenCV can be of help with any of these. Morphological erosion, dilation, opening and closing are implemented in OpenCV. I think you might need to construct an algebraic opening operator on your own (or play with combining OpenCV basic morphology), but I'm sure OpenCV can help you with both labeling the connected components and examining the histogram of the resulting greyscale image.
In the end, when only pixels from one object are left, you do the Canny contour detection.
This is a blob processing problem that can not be solved (easily) by OpenCV itself. Have a look at cvBlobsLib. This library is extends OpenCV with functions/classes for connected component labeling.
http://opencv.willowgarage.com/wiki/cvBlobsLib