I found a tutorial about VLFeat HOG
http://www.vlfeat.org/overview/hog.html
I am a little confused by the 16x16x31 matrix. Can anyone tell me how I can extract features that can be used for a classification task from the matrix that the function returns?
Thanks!
The entries in that matrix are the features! Depending on what you're trying to achieve, you might do some dimensionality reduction, augmentation or post-processing, but none of that is strictly necessary. Check out the original HOG paper.
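For illustration, a minimal sketch of the usual first step: treating the whole returned array as one flat descriptor. The dimensions (16x16 cells, 31 values per cell) come from your description; the row-major layout is an assumption the sketch does not verify.

#include <cstddef>
#include <vector>

// Flatten a HOG array (e.g. 16 x 16 cells x 31 values, as returned by VLFeat)
// into a single feature vector that any classifier (SVM, logistic regression, ...)
// can consume. 16 * 16 * 31 = 7936 numbers per image: that block IS the descriptor.
std::vector<float> hogToFeatureVector(const float* hog,
                                      std::size_t rows = 16,
                                      std::size_t cols = 16,
                                      std::size_t dims = 31)
{
    return std::vector<float>(hog, hog + rows * cols * dims);
}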
Is there a function in OpenCV (C++ API) to perform Wiener filtering? If so, which header file is it declared in?
I am looking for something like Matlab's Wiener filter. If there is none, has anyone tried to implement it with OpenCV? My goal is to reduce the noise in disparity maps.
I found C++ source code for a Wiener filter here:
http://gigadom.wordpress.com/2012/05/11/deblurring-with-opencv-weiner-filter-reloaded/
and here:
https://github.com/savsun/Filters
You can simply adapt it and call it as a function.
Bad news: there is none.
Good news: it's not difficult to implement one from the classical equations using OpenCV's FFT functions. There is even an API to multiply spectra (cv::mulSpectrums).
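For what it's worth, here is a minimal sketch of that approach, assuming a known blur kernel (PSF) and a constant noise-to-signal ratio nsr that you would tune or estimate yourself:

#include <opencv2/opencv.hpp>

// Frequency-domain Wiener deconvolution:
//   restored = IDFT( F * conj(H) / (|H|^2 + nsr) )
cv::Mat wienerDeconvolve(const cv::Mat& degraded, const cv::Mat& psf, double nsr)
{
    CV_Assert(degraded.type() == CV_32F && psf.type() == CV_32F);

    // Pad the PSF to the image size. (Strictly, the PSF should also be
    // circularly shifted so its center sits at (0,0); omitted for brevity,
    // which leaves a constant translation in the output.)
    cv::Mat psfPadded = cv::Mat::zeros(degraded.size(), CV_32F);
    psf.copyTo(psfPadded(cv::Rect(0, 0, psf.cols, psf.rows)));

    cv::Mat F, H;
    cv::dft(degraded, F, cv::DFT_COMPLEX_OUTPUT);
    cv::dft(psfPadded, H, cv::DFT_COMPLEX_OUTPUT);

    // Numerator: F * conj(H), via mulSpectrums with conjB = true.
    cv::Mat num;
    cv::mulSpectrums(F, H, num, 0, true);

    // Denominator: |H|^2 + nsr, computed per pixel from H's two planes.
    cv::Mat planes[2];
    cv::split(H, planes);
    cv::Mat denom = planes[0].mul(planes[0]) + planes[1].mul(planes[1]) + nsr;

    // Divide real and imaginary planes by the (real-valued) denominator.
    cv::split(num, planes);
    planes[0] /= denom;
    planes[1] /= denom;
    cv::merge(planes, 2, num);

    cv::Mat restored;
    cv::idft(num, restored, cv::DFT_REAL_OUTPUT | cv::DFT_SCALE);
    return restored;
}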
You may also try other algorithms, such as median filtering, or implement TV (total variation) denoising, which has been shown to work well with depth maps.
I know this is an old question, but I encountered the same need a few days ago. I wrote my own C++ implementation of the adaptive Wiener filter (similar to Matlab's wiener2 function) based on the OpenCV library and pushed it to GitHub. Hope this helps!
You can also try to implement the Wiener filter yourself. For example, the book
Petrou, Maria; Petrou, Costas. Image Processing: The Fundamentals. John Wiley & Sons, 2010.
has a full derivation of the Wiener filter formula, plus many suggestions and practical explanations for implementing the algorithm. For example, it explains how to estimate the power spectrum of the noise and the power spectrum of the original undegraded image/signal starting from just the degraded, noisy image/signal, and it lays out the reasonable assumptions that make this possible.
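For reference, the classical filter that derivation arrives at is

    W(u,v) = \frac{H^*(u,v)}{|H(u,v)|^2 + S_n(u,v) / S_f(u,v)}

where H is the transfer function of the degradation and S_n, S_f are the power spectra of the noise and of the undegraded signal; when those spectra are unknown, the ratio S_n/S_f is often approximated by a constant.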
I'm trying to align two images taken from a handheld camera.
At first, I tried the OpenCV warpPerspective method based on SIFT/SURF feature points. The problem is that feature extraction and matching can be extremely slow on high-resolution images (3000x4000). I tried scaling the images down before finding feature points, but the result was not as good as before. (The Mat returned by findHomography shouldn't be affected by scaling down the image, right?) And sometimes, due to a lack of good feature-point matches, the result is quite strange.
After searching on this topic, it seems that solving the problem in Fourier domain will speed up the registration process. And I've found this question which leads me to the code here.
The only problem is that the code is written in Python with NumPy (not even using OpenCV), which makes it quite hard to rewrite in C++ with OpenCV. (In OpenCV I can only find dft; there is no fftshift or the other FFT helpers. I'm not very familiar with NumPy, and I'm not brave enough to simply ignore the missing methods.) So I'm wondering why there is no such Fourier-domain image registration implementation in C++.
Can you give me some suggestions on how to implement one, a link to an already implemented C++ version, or help turning the Python code into C++?
Big thanks!
I'm fairly certain that the FFT method can only recover a similarity transform, that is, only a (2D) rotation, translation and scale. Your results might not be that great with a handheld camera.
This is not quite a direct answer to your question, but as a suggestion for a speed improvement: have you tried using a faster feature detector and descriptor? In OpenCV, SIFT/SURF are among the slowest methods for feature extraction and matching. You could try some of the other methods first; they all work quite well and are faster than SIFT/SURF, especially with the FLANN-based matcher.
I've had to do this in the past with similarly sized imagery, and using the binary descriptors OpenCV provides (e.g. ORB or BRISK) increases the speed significantly.
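A rough sketch of that pipeline with ORB (OpenCV 2.4 API; the file names and the keypoint budget are placeholders):

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat img1 = cv::imread("frame1.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat img2 = cv::imread("frame2.jpg", cv::IMREAD_GRAYSCALE);

    // Detect keypoints and compute binary descriptors in one call.
    cv::ORB orb(2000);
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb(img1, cv::Mat(), kp1, desc1);
    orb(img2, cv::Mat(), kp2, desc2);

    // Binary descriptors are matched with Hamming distance; cross-check
    // keeps only mutually best matches.
    cv::BFMatcher matcher(cv::NORM_HAMMING, true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);
    if (matches.size() < 4) return 1;  // too few matches for a homography

    // Robustly estimate the homography and warp img2 into img1's frame.
    std::vector<cv::Point2f> pts1, pts2;
    for (size_t i = 0; i < matches.size(); ++i) {
        pts1.push_back(kp1[matches[i].queryIdx].pt);
        pts2.push_back(kp2[matches[i].trainIdx].pt);
    }
    cv::Mat H = cv::findHomography(pts2, pts1, cv::RANSAC, 3.0);

    cv::Mat aligned;
    cv::warpPerspective(img2, aligned, H, img1.size());
    cv::imwrite("aligned.jpg", aligned);
    return 0;
}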
If you need only a shift, you can use OpenCV's cv::phaseCorrelate.
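A small sketch of its use (the inputs must be single-channel float images of the same size):

#include <opencv2/opencv.hpp>

// Estimate a pure translation between two 8-bit grayscale images.
cv::Point2d estimateShift(const cv::Mat& a8u, const cv::Mat& b8u)
{
    cv::Mat a, b;
    a8u.convertTo(a, CV_32F);
    b8u.convertTo(b, CV_32F);
    return cv::phaseCorrelate(a, b);  // (dx, dy) of b relative to a
}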
I am using OpenCV 2.4 (C++) for line finding in grayscale images. This involves some basic image processing steps like blurring, thresholding, the Canny edge detector, gradient filters and the Hough transform. I have to apply the line-finding algorithm to thousands of images.
Is there a way to speed up the calculation considering the large number of images?
Would one of the following help: Intel TBB, IPP or the OpenCV GPU module?
I have heard that the OpenCV GPU module can speed up calculations, but the data transfer is slow. So using the GPU might not be the right choice here?
Thank You!
EDIT:
Does it make sense to use parallel_for from TBB to speed up the image processing? If I use a for loop like this:
for (size_t i = 0; i < image_location.size(); ++i)
{
    Mat img = imread(image_location[i]);
    blur(img, ...);
    threshold(img, ...);
    ...
}
Can I improve performance by using parallel_for instead? Can anyone provide an example of how to use parallel_for together with some OpenCV operations?
The scope of your question is virtually unbounded.
First of all, have you profiled your application to find the actual bottleneck(s)? My guess would be the Hough transform, but who knows what else your code is doing. Now, if the Hough transform is the slow piece, and supposing OpenCV has a fast implementation of it, then this is why I say the question is problematic: swapping in a somewhat better implementation doesn't help much once you increase your already large number of images; the problem is in the approach itself.
Do you really need to use Hough? Maybe you could achieve something similar or better using morphological operators. Are the images from some common domain? Can you include examples of them? Etc.
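As for the parallel_for part of the question: a minimal sketch with cv::parallel_for_ (available in OpenCV 2.4's core module, backed by TBB when OpenCV is built with it). The processing steps and their parameters are placeholders for your actual pipeline:

#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

// Each worker processes a disjoint range of indices, so no locking is
// needed as long as every result goes into its own pre-allocated slot.
class LineFinderBody : public cv::ParallelLoopBody
{
public:
    LineFinderBody(const std::vector<std::string>& paths,
                   std::vector<cv::Mat>& results)
        : paths_(paths), results_(results) {}

    void operator()(const cv::Range& range) const
    {
        for (int i = range.start; i < range.end; ++i) {
            cv::Mat img = cv::imread(paths_[i], cv::IMREAD_GRAYSCALE);
            cv::blur(img, img, cv::Size(5, 5));                    // placeholder params
            cv::threshold(img, img, 128, 255, cv::THRESH_BINARY);  // placeholder params
            // ... Canny, Hough, etc. ...
            results_[i] = img;
        }
    }

private:
    const std::vector<std::string>& paths_;
    std::vector<cv::Mat>& results_;
};

// Usage:
//   std::vector<cv::Mat> results(image_location.size());
//   cv::parallel_for_(cv::Range(0, (int)image_location.size()),
//                     LineFinderBody(image_location, results));

Note that the loop is I/O-bound as well as CPU-bound: if reading the images from disk dominates, parallelizing the processing alone won't help much.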
I'm trying to build a lightweight object recognition system using ORB for feature extraction and LDA for classification, but I'm running into an issue due to the varying sizes of the extracted features.
These are my steps:
Extract keypoints using ORB.
Extract trainable features in the image by grouping the keypoints.
(example of what's being extracted: http://imgur.com/gaQWk)
Train the recognizer with the extracted features. (This is where problems arise)
Classify objects in an image from the wild.
If I attempt to create a generalized matrix using cv::gemm, I get an exception due to the varying sizes. My first thought was just to normalize all the images by resizing them, but this causes a lot of accuracy issues when objects have similar small features.
Is there any solution to this? Is LDA an appropriate method for this? I know it's commonly used with facial recognition algorithms such as fisherfaces.
LDA requires fixed length features, as do most optimization and machine learning methods. You could resize the image patches to be a fixed size, but that is probably not going to be a good feature. Normally people use a scale invariant feature such as SIFT. You also might try a color histogram, or some variation of edge detection and spatial histogram binning such as a GIST vector.
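As one concrete example of a fixed-length feature, here is a minimal sketch of a color histogram with OpenCV (the 8x8x8 binning is an arbitrary choice):

#include <opencv2/opencv.hpp>
#include <vector>

// Map any BGR image, whatever its size, to the same 8*8*8 = 512-dim
// vector, which is exactly the fixed-length input LDA/SVM expect.
std::vector<float> colorHistogram(const cv::Mat& bgr)
{
    int histSize[] = {8, 8, 8};
    float range[] = {0, 256};
    const float* ranges[] = {range, range, range};
    int channels[] = {0, 1, 2};

    cv::Mat hist;
    cv::calcHist(&bgr, 1, channels, cv::Mat(), hist, 3, histSize, ranges);
    cv::normalize(hist, hist, 1.0, 0.0, cv::NORM_L1);  // make it image-size invariant

    // Flatten the 3-D histogram into a plain vector.
    return std::vector<float>((const float*)hist.data,
                              (const float*)hist.data + hist.total());
}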
It's hard to say if LDA is an appropriate method for this without knowing what you hope to accomplish. You might also look into using SVM, some form of boosting, or just plain nearest neighbor with a large training set.
I have a simple question: what libraries are available, and give good results, for computing SIFT, HOG (Histogram of Oriented Gradients) and SURF features in C++ or OpenCV?
1. If you can, please give me a link to the code; I would really appreciate it.
2. If you know any of them, or any information that leads me to what I want, I would appreciate that as well.
Thanks
Check these:
SURF
- great article: http://people.csail.mit.edu/kapu/papers/mar_mir08.pdf
SIFT
- great source, I tried it on the iPhone: http://blogs.oregonstate.edu/hess/
FAST
- FAST corner detection library: http://svr-www.eng.cam.ac.uk/~er258/work/fast.html
Example of SURF code in OpenCV:
https://code.ros.org/trac/opencv/browser/trunk/opencv/samples/cpp/matching_to_many_images.cpp
Not sure if this is still relevant, but OpenCV also ships two implementations for computing HOG descriptors, i.e. both GPU and CPU versions of the HOG code.
For the CPU version you can check this blog post; note, however, that with the CPU version you need to write your own sliding-window logic.
The GPU version is fairly straightforward; you can read the documentation here.
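A minimal sketch of the CPU path (the image path is a placeholder; the image must be at least as large as the 64x128 detection window of the default layout):

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat img = cv::imread("image.png", cv::IMREAD_GRAYSCALE);

    // Default constructor: 64x128 window, 16x16 blocks, 8x8 cells, 9 bins,
    // i.e. 3780 floats per detection window.
    cv::HOGDescriptor hog;
    std::vector<float> descriptors;

    // The window stride below is the sliding-window step you control yourself.
    hog.compute(img, descriptors, cv::Size(8, 8));
    return 0;
}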
It might help you to know that SIFT and SURF implementations are already integrated into OpenCV.
http://opencv.willowgarage.com/documentation/cpp/features2d__feature_detection_and_descriptor_extraction.html
Be careful with the OpenCV implementations, because the latest versions of OpenCV have moved the SIFT and SURF implementations into the nonfree module: http://docs.opencv.org/modules/nonfree/doc/nonfree.html
You can still use them, but they are probably subject to patent licensing and cannot be used in commercial solutions.
This one uses descriptors based on HOG, Sobel and Lab channels for detection: Class-Specific Hough Forests for Object Detection (OpenCV/C source code).
Rather than performing detection at every possible location, this approach computes a vote for each descriptor; accumulated together, the votes produce a voting cloud whose maximum corresponds to the most probable location of the target. Combined with cvGoodFeaturesToTrack, it can produce very good results, even with a small training database.