OpenCV: How to use HOGDescriptor::detect method? - c++

I have succeeded in tracking moving objects in a video.
However, I want to decide whether an object is a person or not. I have tried the HOGDescriptor in OpenCV. HOGDescriptor has two methods for detecting people: HOGDescriptor::detect and HOGDescriptor::detectMultiScale. The OpenCV sample "sources\samples\cpp\peopledetect.cpp" demonstrates how to use HOGDescriptor::detectMultiScale, which searches the image at different scales and is very slow.
In my case, I have already tracked each object in a rectangle. I think using HOGDescriptor::detect on the inside of the rectangle will be much faster. But the OpenCV documentation only covers gpu::HOGDescriptor::detect (I still can't work out how to use it) and omits HOGDescriptor::detect. I want to use HOGDescriptor::detect.
Could anyone provide a C++ code snippet demonstrating the usage of HOGDescriptor::detect?
Thanks.

Since you already have a list of objects, you can call the HOGDescriptor::detect method for each object and check the output foundLocations array. If it is not empty, the object was classified as a person. The only catch is that HOG works with 64x128 windows by default, so you need to rescale your objects:
std::vector<cv::Rect> movingObjects = ...;
cv::HOGDescriptor hog;
hog.setSVMDetector(cv::HOGDescriptor::getDefaultPeopleDetector());
std::vector<cv::Point> foundLocations;
for (size_t i = 0; i < movingObjects.size(); ++i)
{
    // Crop the tracked rectangle out of the current frame (no copy)
    cv::Mat roi = image(movingObjects[i]);
    // Rescale it to the default HOG window size
    cv::Mat window;
    cv::resize(roi, window, cv::Size(64, 128));
    hog.detect(window, foundLocations);
    if (!foundLocations.empty())
    {
        // movingObjects[i] is a person
    }
}
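One thing the resize above does not address is aspect ratio: a tracked rectangle that is not roughly 1:2 gets stretched when forced to 64x128. A minimal sketch of one way to handle that, assuming it is acceptable to grow the rectangle around its centre before cropping. `Box` and `fitToHogAspect` are hypothetical names (a plain stand-in for cv::Rect so the snippet compiles without OpenCV), and whether this actually improves detection on your footage is something to verify, not a given.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct Box { int x, y, width, height; };

// Grow `box` around its centre until width:height is 1:2 (the shape of the
// default 64x128 HOG window), then shrink/shift it so it stays in the frame.
Box fitToHogAspect(const Box& box, int frameWidth, int frameHeight)
{
    assert(box.width > 0 && box.height > 0);
    assert(frameWidth >= 1 && frameHeight >= 2);

    // Smallest 1:2 box that covers the original one.
    int w = std::max(box.width, (box.height + 1) / 2);
    int h = 2 * w;

    // Shrink (keeping the ratio) if the grown box no longer fits the frame.
    if (h > frameHeight) { w = frameHeight / 2; h = 2 * w; }
    if (w > frameWidth)  { w = frameWidth;      h = 2 * w; }

    // Re-centre on the original box and push it back inside the frame.
    const int cx = box.x + box.width / 2;
    const int cy = box.y + box.height / 2;
    Box out;
    out.width = w;
    out.height = h;
    out.x = std::min(std::max(cx - w / 2, 0), frameWidth - w);
    out.y = std::min(std::max(cy - h / 2, 0), frameHeight - h);
    return out;
}
```

The returned box would take the place of movingObjects[i] when cropping the ROI, before the resize to 64x128.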

If you don't build OpenCV with CUDA enabled, calling gpu::HOGDescriptor::detect is equivalent to calling HOGDescriptor::detect; no GPU is used.
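Also, for code, you can use (detect is a member function, so it needs a descriptor instance with a detector set):
GpuMat img;
vector<Point> found_locations;
gpu::HOGDescriptor hog;
hog.setSVMDetector(gpu::HOGDescriptor::getDefaultPeopleDetector());
hog.detect(img, found_locations);
if(!found_locations.empty())
{
    // img contains/is a real person
}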
Edit:
Regarding "However I want to decide if an object is person or not.":
I don't think you need this step. HOGDescriptor::detect is itself used to detect people, so you don't need to verify the detections; they are supposed to be people given your setup. On the other hand, you can set its hit threshold to control the detection quality.

Related

dlib-19.1: Initialize dlib::matrix from image (e.g. dlib::cv_image) for DNN training

I am currently trying to train a DNN with images I have on file (OCR context... input images per class are aggregate images of several thousand fixed size tiny images).
I have some code to open and properly segment the aggregate images into small OpenCV cv::Mat's. My problem is that there does not seem to be a way to either
(a) train the DNN on dlib::cv_image directly (which can be wrapped around cv::Mat; I'm getting 500+ lines of compiler errors), or
(b) easily convert/wrap cv::Mat to dlib::matrix without copying every element.
I'm pretty sure I'm missing something here; any pointers would be greatly appreciated.
Note: The only variant I got to compile was calling dlib::dnn_trainer::train() with a vector of dlib::matrix (size fixed at compile time) and a vector with unsigned long labels (unsigned labels did not compile), although train() is templated on both types. Any pointers?
You don't have to fix the size of dlib::matrix at compile time. Just call set_size() on it. See also http://dlib.net/faq.html#HowdoIsetthesizeofamatrixatruntime.
Also, if you want to use something other than a dlib::matrix as input you can do that. You just have to define your own input layer. The interface you must implement is fully documented here: http://dlib.net/dlib/dnn/input_abstract.h.html#EXAMPLE_INPUT_LAYER. You could also look at the existing input layers for examples. But be sure to read the documentation as it will answer questions you are likely to have.
Dlib has an amazing function for this task: http://dlib.net/imaging.html#assign_image, but it copies each element.
Here is sample code showing how it can be used:
// mat should be a greyscale image (8UC1)
void cv_to_dlib_float_matrix(const cv::Mat& mat, dlib::matrix<float>& res)
{
    cv::Mat tmp(mat.rows, mat.cols, CV_32FC1); // cv::Mat takes (rows, cols)
    cv::normalize(mat, tmp, 0.0, 1.0, cv::NORM_MINMAX, CV_32FC1);
    dlib::assign_image(res, dlib::cv_image<float>(tmp));
}
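It may help to see what the min-max normalization in that helper does to the pixel values, because it stretches every tile to its own darkest and brightest pixel. The sketch below is plain C++ with no OpenCV or dlib; `minMaxNormalize` is a hypothetical name, and mapping a constant image to 0 is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rescale 8-bit grey values so the smallest becomes 0.0f and the largest 1.0f.
std::vector<float> minMaxNormalize(const std::vector<std::uint8_t>& pixels)
{
    assert(!pixels.empty());
    const auto range = std::minmax_element(pixels.begin(), pixels.end());
    const float lo = *range.first;
    const float hi = *range.second;
    // A constant image has no range to stretch; map it to 0 in this sketch.
    const float scale = (hi > lo) ? 1.0f / (hi - lo) : 0.0f;

    std::vector<float> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i)
        out[i] = (pixels[i] - lo) * scale;
    return out;
}
```

If every tile should share one fixed scale instead, dividing each pixel by 255 is the alternative.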

OpenCV CascadeClassifier detectMultiScale resulting Rect outside input Mat bounds

I'm having a problem with detectMultiScale returning rectangles outside the bounds of the input Mat.
What I'm doing is an optimization where the first frame of a video feed is passed to detectMultiScale in its entirety.
If an object was detected, I create a temp Mat into which I clone, from the current full frame, the rect of the object detected in the previous frame.
Then I pass this temp Mat to detectMultiScale, so only the area around the rectangle where the previous frame detected an object is searched.
The problem I'm having is that the results from detectMultiScale when passing the temp Mat are rectangles that lie outside the bounds of the input temp Mat.
Mostly I would just like to know exactly what is going on here. I have two ideas of what could be happening, but I can't figure out for sure.
1. Either the clone operation, when cloning a rect from a full frame to the temp Mat, is somewhere inside the Mat object recording the cloned area at the rows and columns of the full frame. For example, I have a full frame of 100x100 and I'm trying to clone a 10x10 rectangle from it at position 80x80. The resulting Mat will then be of size 10x10, but maybe somewhere inside the Mat it says the Mat starts at 80x80?
2. Or CascadeClassifier is keeping state somewhere of the full frame I had passed to it previously?
I don't know what is happening here for sure, but was hoping someone could shed some light.
Here's a little code example of what I'm trying to do, with comments explaining the results I'm seeing:
std::vector<cv::Rect> DetectObjects(cv::Mat fullFrame, bool useFullFrame, cv::Rect detectionRect)
{
    // fullFrame is 100x100
    // detectionRect is 10x10 at position 80x80 eg. cv::Rect(80,80,10,10)
    // useFullFrame is False
    std::vector<cv::Rect> results;
    if(useFullFrame)
    {
        object_cascade.detectMultiScale(fullFrame,
                                        results,
                                        m_ScaleFactor,
                                        m_Neighbors,
                                        0 | cv::CASCADE_SCALE_IMAGE | cv::CASCADE_DO_ROUGH_SEARCH | cv::CASCADE_DO_CANNY_PRUNING,
                                        m_MinSize,
                                        m_MaxSize);
    }
    else
    {
        // useFullFrame is false, so we run this block
        cv::Mat tmpMat = fullFrame(detectionRect).clone();
        // tmpMat is size 10,10
        object_cascade.detectMultiScale(tmpMat,
                                        results,
                                        m_ScaleFactor,
                                        m_Neighbors,
                                        0 | cv::CASCADE_SCALE_IMAGE | cv::CASCADE_DO_ROUGH_SEARCH | cv::CASCADE_DO_CANNY_PRUNING,
                                        m_MinSize,
                                        m_MaxSize);
    }
    if(results.size() > 0)
    {
        // this is the weird part. When looking at the first element of
        // results, (results[0]), it is at position 80,80, size 10,10
        // so it is cv::Rect(80,80,10,10)
        // even though the second detectMultiScale was run with a Mat of 10x10
        // do stuff
    }
    return results;
}
This is pretty close to what I have in code, except that for the example values in the comments above I used easy numbers rather than real ones, such as a 1920x1080 full frame and an actual result of something like 367x711.
So why am I getting results from detectMultiScale that are outside the bounds of the input Mat?
EDIT:
I had originally written this program for an embedded Linux distribution, where this problem does not arise (I've always gotten the expected results). The problem is happening on a Windows release and build of OpenCV, so I'm currently going through the OpenCV code to see if anything stands out.
I believe this is a simple logic error. This:
if(fullFrame)
should be this:
if(useFullFrame)
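With the branch fixed, rectangles returned for the cropped Mat are relative to the crop's top-left corner, so they need to be shifted back before being compared with full-frame positions. A minimal sketch of that bookkeeping, plus padding the previous detection so the object has room to move between frames. `Box`, `searchWindow`, and `toFrameCoordinates` are hypothetical names (a plain stand-in for cv::Rect so the snippet compiles without OpenCV), and the margin is an assumed tuning parameter.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct Box { int x, y, width, height; };

// Pad the previous detection by `margin` pixels and clip it to the frame.
Box searchWindow(const Box& previous, int margin, int frameWidth, int frameHeight)
{
    assert(margin >= 0 && frameWidth > 0 && frameHeight > 0);
    const int left   = std::max(previous.x - margin, 0);
    const int top    = std::max(previous.y - margin, 0);
    const int right  = std::min(previous.x + previous.width + margin, frameWidth);
    const int bottom = std::min(previous.y + previous.height + margin, frameHeight);
    return Box{left, top, std::max(right - left, 0), std::max(bottom - top, 0)};
}

// Translate a detection found inside the cropped window back to
// full-frame coordinates by adding the window's origin.
Box toFrameCoordinates(const Box& detectionInWindow, const Box& window)
{
    return Box{detectionInWindow.x + window.x,
               detectionInWindow.y + window.y,
               detectionInWindow.width,
               detectionInWindow.height};
}
```

In the function above, searchWindow would produce the rectangle passed as detectionRect, and each element of results from the cropped branch would go through toFrameCoordinates before being returned.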

Taking a screenshot of a particular area

I'm looking for a way to take a screenshot of a particular area of the screen in C++ (so not the whole screen). It should then be saved as .png, .jpg, or whatever, so I can use it with another function afterwards.
Also, I am going to use it, somehow, with OpenCV. I thought I'd mention that in case it's a helpful detail.
OpenCV cannot take screenshots from your computer directly. You will need a different framework/method to do this. @Ben is correct; this link would be worth investigating.
Once you have read this image in, you will need to store it in a cv::Mat so that you can perform OpenCV operations on it.
To crop an image in OpenCV, the following code snippet would help:
CvMat* imagesource;
// Transform it into the C++ cv::Mat format
cv::Mat image(imagesource);
// Setup a rectangle to define your region of interest
cv::Rect myROI(10, 10, 100, 100);
// Crop the full image to that image contained by the rectangle myROI
// Note that this doesn't copy the data
cv::Mat croppedImage = image(myROI);

OpenCV, how to use arrays of points for smoothing and sampling contours?

I'm having trouble getting my head around smoothing and sampling contours in OpenCV (C++ API).
Let's say I have a sequence of points retrieved from cv::findContours (for instance, applied to this image):
Ultimately, I want
1. To smooth a sequence of points using different kernels.
2. To resize the sequence using different types of interpolation.
After smoothing, I hope to have a result like this:
I also considered drawing my contour in a cv::Mat, filtering the Mat (using blur or morphological operations) and re-finding the contours, but that is slow and suboptimal. So, ideally, I could do the job using the point sequence exclusively.
I read a few posts on it and naively thought that I could simply convert a std::vector(of cv::Point) to a cv::Mat and then OpenCV functions like blur/resize would do the job for me... but they did not.
Here is what I tried:
int main( int argc, char** argv ){
    cv::Mat conv,ori;
    ori=cv::imread(argv[1]);
    ori.copyTo(conv);
    cv::cvtColor(ori,ori,CV_BGR2GRAY);
    std::vector<std::vector<cv::Point> > contours;
    std::vector<cv::Vec4i > hierarchy;
    cv::findContours(ori, contours,hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_NONE);
    for(int k=0;k<100;k += 2){
        cv::Mat smoothCont;
        smoothCont = cv::Mat(contours[0]);
        std::cout<<smoothCont.rows<<"\t"<<smoothCont.cols<<std::endl;
        /* Try smoothing: no modification of the array*/
        // cv::GaussianBlur(smoothCont, smoothCont, cv::Size(k+1,1),k);
        /* Try sampling: "Assertion failed (func != 0) in resize"*/
        // cv::resize(smoothCont,smoothCont,cv::Size(0,0),1,1);
        std::vector<std::vector<cv::Point> > v(1);
        smoothCont.copyTo(v[0]);
        cv::drawContours(conv,v,0,cv::Scalar(255,0,0),2,CV_AA);
        std::cout<<k<<std::endl;
        cv::imshow("conv", conv);
        cv::waitKey();
    }
    return 0;
}
Could anyone explain how to do this?
In addition, since I am likely to work with much smaller contours, I was wondering how this approach would deal with border effects (e.g. when smoothing, since contours are circular, the last elements of a sequence must be used to calculate the new values of the first elements...)
Thank you very much for your advice.
Edit:
I also tried cv::approxPolyDP() but, as you can see, it tends to preserve extremal points (which I want to remove):
[Images: approxPolyDP results for Epsilon=0, Epsilon=6, Epsilon=12, Epsilon=24]
Edit 2:
As suggested by Ben, it seems that cv::GaussianBlur() is not supported but cv::blur() is. It looks very much closer to my expectation. Here are my results using it:
[Images: cv::blur results for k=13, k=53, k=103]
To get around the border effect, I did:
cv::copyMakeBorder(smoothCont,smoothCont, (k-1)/2,(k-1)/2 ,0, 0, cv::BORDER_WRAP);
cv::blur(smoothCont, result, cv::Size(1,k),cv::Point(-1,-1));
result.rowRange(cv::Range((k-1)/2,1+result.rows-(k-1)/2)).copyTo(v[0]);
I am still looking for solutions to interpolate/sample my contour.
Your Gaussian blurring doesn't work because you're blurring in column direction, but there is only one column. Using GaussianBlur() leads to a "feature not implemented" error in OpenCV when trying to copy the vector back to a cv::Mat (that's probably why you have this strange resize() in your code), but everything works fine using cv::blur(), no need to resize(). Try Size(0,41) for example. Using cv::BORDER_WRAP for the border issue doesn't seem to work either, but here is another thread of someone who found a workaround for that.
Oh... one more thing: you said that your contours are likely to be much smaller. Smoothing your contour that way will shrink it. The extreme case is k = size_of_contour, which results in a single point. So don't choose your k too big.
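Both points (the wrap-around at the ends and the shrinking for large k) are easy to see when the filter is written directly on the point sequence. The following is a minimal sketch of a circular box filter in plain C++, not the OpenCV code path. `Pt` and `smoothClosedContour` are hypothetical names, and k is assumed to be odd so the window is centred on each point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a floating-point cv::Point.
struct Pt { double x, y; };

// Box-filter a closed contour with an odd window of k points, wrapping around
// the ends so the first and last points are smoothed like any other.
std::vector<Pt> smoothClosedContour(const std::vector<Pt>& contour, std::size_t k)
{
    assert(k % 2 == 1);
    const std::size_t n = contour.size();
    std::vector<Pt> out(n);
    if (n == 0) return out;

    const std::size_t half = k / 2;
    for (std::size_t i = 0; i < n; ++i) {
        double sx = 0.0, sy = 0.0;
        for (std::size_t j = 0; j < k; ++j) {
            // Index (i - half + j) taken modulo n without going negative.
            const std::size_t idx = (i + n - half % n + j) % n;
            sx += contour[idx].x;
            sy += contour[idx].y;
        }
        out[i].x = sx / static_cast<double>(k);
        out[i].y = sy / static_cast<double>(k);
    }
    return out;
}
```

With k equal to the number of points, every output point is the centroid of the contour, which is the extreme case described above.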
Another possibility is to use the algorithm openFrameworks uses:
https://github.com/openframeworks/openFrameworks/blob/master/libs/openFrameworks/graphics/ofPolyline.cpp#L416-459
It traverses the contour and essentially applies a low-pass filter using the points around each point. It should do exactly what you want with low overhead (there's no reason to run a big filter on an image that's essentially just a contour).
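Working on the point sequence directly also covers the sampling half of the question. A minimal sketch, assuming linear interpolation between neighbouring vertices and even spacing by arc length; this is not the openFrameworks code, and `Pt` and `resampleClosedContour` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a floating-point cv::Point.
struct Pt { double x, y; };

// Return `count` points spaced evenly (by arc length) along a closed contour,
// linearly interpolating between the original vertices.
std::vector<Pt> resampleClosedContour(const std::vector<Pt>& contour, std::size_t count)
{
    assert(contour.size() >= 2 && count >= 1);
    const std::size_t n = contour.size();

    // cumulative[i] is the path length from vertex 0 to vertex i;
    // cumulative[n] includes the closing segment back to vertex 0.
    std::vector<double> cumulative(n + 1, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const Pt& a = contour[i];
        const Pt& b = contour[(i + 1) % n];
        cumulative[i + 1] = cumulative[i] + std::hypot(b.x - a.x, b.y - a.y);
    }
    const double total = cumulative[n];
    assert(total > 0.0);

    std::vector<Pt> out;
    out.reserve(count);
    std::size_t seg = 0;
    for (std::size_t s = 0; s < count; ++s) {
        const double target = total * static_cast<double>(s) / static_cast<double>(count);
        // Advance to the segment that contains the target distance.
        while (seg + 1 < n && cumulative[seg + 1] <= target) ++seg;
        const double segLen = cumulative[seg + 1] - cumulative[seg];
        const double t = segLen > 0.0 ? (target - cumulative[seg]) / segLen : 0.0;
        const Pt& a = contour[seg];
        const Pt& b = contour[(seg + 1) % n];
        out.push_back(Pt{a.x + t * (b.x - a.x), a.y + t * (b.y - a.y)});
    }
    return out;
}
```

Other interpolation schemes (for example a spline through the vertices) would replace only the last step, where the point between a and b is computed.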
How about approxPolyDP()?
It uses the Douglas-Peucker algorithm to 'smooth' a contour (basically getting rid of most of the contour's points and leaving the ones that represent a good approximation of your contour).
From 2.1 OpenCV doc section Basic Structures:
template<typename T>
explicit Mat::Mat(const vector<T>& vec, bool copyData=false)
You probably want to set 2nd param to true in:
smoothCont = cv::Mat(contours[0]);
and try again (this way cv::GaussianBlur should be able to modify the data).
I know this was written a long time ago, but did you try a big erode followed by a big dilate (opening), and then finding the contours? It looks like a simple and fast solution, and I think it could work, at least to some degree.
Basically, the sudden changes in the contour correspond to high-frequency content. An easy way to smooth your contour would be to treat each point's coordinates as a complex number x + iy, find the Fourier coefficients of that sequence, and then eliminate the high-frequency coefficients.
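A minimal sketch of that idea in plain C++, assuming a closed contour and using a naive O(n^2) DFT for brevity (an FFT would be the usual choice for long contours). `fourierSmooth` and the `keep` parameter are hypothetical names; `keep` is the highest frequency magnitude that survives.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Cplx = std::complex<double>;

// Low-pass a closed contour given as complex numbers x + iy: take the DFT,
// zero every coefficient whose frequency magnitude exceeds `keep`, and invert.
std::vector<Cplx> fourierSmooth(const std::vector<Cplx>& contour, std::size_t keep)
{
    const std::size_t n = contour.size();
    assert(n > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    const double dn = static_cast<double>(n);

    // Forward DFT.
    std::vector<Cplx> coeff(n);
    for (std::size_t f = 0; f < n; ++f) {
        Cplx sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t)
            sum += contour[t] * std::polar(1.0, -twoPi * static_cast<double>((f * t) % n) / dn);
        coeff[f] = sum;
    }

    // Index f holds frequency +f in the lower half and -(n - f) in the upper half.
    for (std::size_t f = 0; f < n; ++f)
        if (std::min(f, n - f) > keep)
            coeff[f] = Cplx(0.0, 0.0);

    // Inverse DFT.
    std::vector<Cplx> out(n);
    for (std::size_t t = 0; t < n; ++t) {
        Cplx sum(0.0, 0.0);
        for (std::size_t f = 0; f < n; ++f)
            sum += coeff[f] * std::polar(1.0, twoPi * static_cast<double>((f * t) % n) / dn);
        out[t] = sum / dn;
    }
    return out;
}
```

Each contour point (x, y) goes in as Cplx(x, y) and comes back the same way. With keep = 0 only the mean survives and the contour collapses to its centroid, so small values of keep give strong smoothing.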
My take ... many years later ...!
Maybe two easy ways to do it:
1. Loop a few times with dilate, blur, erode, and find the contours on the updated shape. I found that 6-7 iterations give good results.
2. Create a bounding box of the contour, and draw an ellipse inside the bounding rectangle.
Adding the visual results below:
This applies to me. The edges are smoother than before:
medianBlur(mat, mat, 7)
morphologyEx(mat, mat, MORPH_OPEN, getStructuringElement(MORPH_RECT, Size(12.0, 12.0)))
val contours = getContours(mat)
This is opencv4android code.

Extracting Depth images of Kinect using opencv

Does anyone know the simplest way to extract the gray-level depth images of Kinect using OpenCV and C++? Is there any source code for this?
If you use the OpenNI SDK, you can simply point to the buffer:
//on setup:
xn::DepthGenerator depthGenerator;
xn::DepthMetaData depthMD;
cv::Mat depthWrapper;
//on update loop,
//after context.WaitAnyUpdateAll();
depthGenerator.GetMetaData(depthMD);
depthWrapper = cv::Mat(depthMD.YRes(), depthMD.XRes(), CV_16UC1, (void*) depthMD.Data());
Note that depthWrapper wraps a const buffer, so you need to clone it in order to manipulate it.
The documentation has everything you need. Can't elaborate better than this.
You need to do two things (apart from reading about the context, depth generator and initialization of Kinect):
1. Create a Mat of type CV_16U:
a. context.WaitOneUpdateAll(depth_map);
b. Mdepth_original = Mat(h_depth, w_depth, CV_16U, (void*) depth_map.GetData());
c. Copy the Mat, since it will be destroyed during the next read: Mdepth_original.copyTo(depth);
2. Map depth to gray or color. Color seems like a good idea (256^3 levels), but the human eye is more sensitive to changes in luminance. Even with 256 levels you can map 10,000 Kinect levels reasonably well using the histogram equalization technique. The simplest way, though, is to lose precision and just do I(x, y) = 255.0*z(x, y)/z_range.
Here is how histogram equalization is implemented in OpenNI2:
https://github.com/OpenNI/OpenNI2/blob/master/Samples/Common/OniSampleUtilities.h
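The two mappings from step 2 can be sketched side by side in plain C++. This is a generic illustration of both ideas and not the OpenNI2 sample code. `depthToGrayLinear` and `depthToGrayEqualized` are hypothetical names, the depth buffer is assumed to be the 16-bit values copied out of the Mat, and a depth of 0 is assumed to mean "no reading".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear mapping I = 255 * z / zRange, clamped to 255 for depths beyond zRange.
std::vector<std::uint8_t> depthToGrayLinear(const std::vector<std::uint16_t>& depth,
                                            std::uint16_t zRange)
{
    assert(zRange > 0);
    std::vector<std::uint8_t> gray(depth.size());
    for (std::size_t i = 0; i < depth.size(); ++i) {
        const unsigned v = 255u * depth[i] / zRange;
        gray[i] = static_cast<std::uint8_t>(std::min(v, 255u));
    }
    return gray;
}

// Histogram equalization: each valid depth gets the grey level given by the
// fraction of valid pixels at or below it. Depth 0 (no reading) stays black.
std::vector<std::uint8_t> depthToGrayEqualized(const std::vector<std::uint16_t>& depth)
{
    std::vector<std::size_t> histogram(65536, 0);
    std::size_t valid = 0;
    for (std::uint16_t z : depth)
        if (z != 0) { ++histogram[z]; ++valid; }

    std::vector<std::uint8_t> gray(depth.size(), 0);
    if (valid == 0) return gray;

    // Lookup table of cumulative fractions scaled to 0..255.
    std::vector<std::uint8_t> lut(65536, 0);
    std::size_t cumulative = 0;
    for (std::size_t z = 1; z < histogram.size(); ++z) {
        cumulative += histogram[z];
        lut[z] = static_cast<std::uint8_t>(255 * cumulative / valid);
    }
    for (std::size_t i = 0; i < depth.size(); ++i)
        gray[i] = lut[depth[i]];
    return gray;
}
```

The linear version spends grey levels evenly over the whole range, while the equalized version spends them where the scene actually has depth values, which is why it holds up better with only 256 levels.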