OpenCV HOGDescriptor return value - C++

Why does the HOG descriptor return a vector of float and not int? It's supposed to return a histogram.

To complement the previous answers, which are correct in my opinion: according to this HoG note, which I found clearer than the original Dalal & Triggs paper, there are two normalization steps involved:
Block Normalization
Group the cells into overlapping blocks of 2 x 2 cells each, so that
each block has size 2C x 2C pixels. Two horizontally or vertically
consecutive blocks overlap by two cells, that is, the block stride is
C pixels. As a consequence, each internal cell is covered by four
blocks. Concatenate the four cell histograms in each block into a
single block feature b and normalize the block feature by its
Euclidean norm.
HOG Feature Normalization
The final normalization makes the HOG feature independent of overall
image contrast.
There should also be bilinear interpolation voting between two consecutive bins to prevent quantization artifacts.
Also, the values cannot be integers: you do not merely count the number of gradient vectors that fall into a bin, you also accumulate their gradient magnitudes.

I believe that @Micka is right: the histograms are probably normalized (maybe not to 1). On the Wikipedia page on HOG descriptors, it is written that:
For improved accuracy, the local histograms can be contrast-normalized by calculating a measure of the intensity across a larger region of the image, called a block, and then using this value to normalize all cells within the block. This normalization results in better invariance to changes in illumination and shadowing.
Hence the need for a vector<float> instead of a vector<int>.
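To see this concretely, here is a minimal sketch (the file name is a placeholder of mine) showing that HOGDescriptor::compute fills a std::vector<float> with normalized, hence fractional, values:

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        // Defaults: 64x128 window, 16x16 blocks, 8x8 block stride,
        // 8x8 cells, 9 orientation bins.
        cv::HOGDescriptor hog;

        cv::Mat img = cv::imread("person.png", cv::IMREAD_GRAYSCALE); // placeholder file
        cv::resize(img, img, hog.winSize);

        std::vector<float> descriptor;   // float, not int: magnitude-weighted and normalized
        hog.compute(img, descriptor);

        // 7x15 blocks x 4 cells x 9 bins = 3780 values for the default parameters
        std::cout << descriptor.size() << std::endl;
        return 0;
    }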

Related

OpenCV CLAHE parameters explanation

I would like a proper explanation of the CLAHE parameters, i.e. clipLimit and tileGridSize: how does the clipLimit value affect the contrast of the image, and what factors (like image resolution and object sizes) should be considered when selecting tileGridSize?
Thanks in advance.
This question is from a long time ago, but I searched for the answer and found this; then I found some links which may help. Obviously, most of the information below is from different sites.
AHE is a computer image processing technique used to improve contrast in images. It differs from ordinary histogram equalization in the respect that the adaptive method computes several histograms, each corresponding to a distinct section of the image, and uses them to redistribute the lightness values of the image. It is therefore suitable for improving the local contrast and enhancing the definitions of edges in each region of an image.
Also, AHE has a tendency to over-amplify noise in relatively homogeneous regions of an image. A variant of adaptive histogram equalization called contrast limited adaptive histogram equalization (CLAHE) prevents this by limiting the amplification.
CLAHE limits the amplification by clipping the histogram at a predefined value (called the clip limit).
tileGridSize refers to the size of the grid for histogram equalization. The input image will be divided into equally sized rectangular tiles; tileGridSize defines the number of tiles per row and column.
Here is the OpenCV documentation about the available functions:
https://docs.opencv.org/master/d6/db6/classcv_1_1CLAHE.html
And these links were good too:
https://en.wikipedia.org/wiki/Adaptive_histogram_equalization#Contrast_Limited_AHE
http://www.cs.utah.edu/~sujin/courses/reports/cs6640/project2/clahe.html
clipLimit is the threshold value for contrast limiting.
tileGridSize defines the number of tiles per row and column.
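To make the two parameters concrete, here is a minimal usage sketch (the values 2.0 and 8x8 are illustrative, not prescribed by anything above):

    #include <opencv2/opencv.hpp>

    int main() {
        cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE); // placeholder file

        // clipLimit: histogram bins above this threshold are clipped and the
        // excess is redistributed, which limits noise amplification.
        // tileGridSize: the image is split into 8x8 tiles, each equalized locally.
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(2.0, cv::Size(8, 8));

        cv::Mat enhanced;
        clahe->apply(gray, enhanced);

        cv::imwrite("output.png", enhanced);
        return 0;
    }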
More Information

Finding checkerboard points in OpenCV for any random chessboard (pattern size not known)

Well, OpenCV comes with the function findChessboardCorners() in C++, which goes like:

    bool findChessboardCorners(InputArray image, Size patternSize,
                               OutputArray corners,
                               int flags = CALIB_CB_ADAPTIVE_THRESH + CALIB_CB_NORMALIZE_IMAGE)
After using this function for a while, one thing I understood was that the pattern size must match the image very closely, or else the algorithm refuses to detect any chessboard altogether. I was wondering: given any random image of a chessboard, this function would fail, as it is impractical to enter the precise values of patternSize. Is there a way the patternSize for this function could be obtained from the image provided? Any help would be appreciated. Thanks.
Short answer: you cannot.
The OpenCV checkerboard detection code assumes that the pattern is uniform (all squares have the same size) and therefore, in order to uniquely locate its position in the image, the following two conditions must be true:
The pattern is entirely visible.
The pattern has a known number of rows and columns.
If either 1 or 2 is violated there is no way to know which corner is, say, the "top left" one.
For a more general case, and in particular if you anticipate that the pattern may be partially occluded, you must use a different algorithm and a non-uniform pattern, upon which corners can be uniquely identified.
There are various ways to do that. My favorite pattern is Matsunaga and Kanatani's "2D barcode" one, which uses sequences of square lengths with unique cross-ratios. See the paper here. In order to match it, once you have sorted the corners into a grid, you can use a simple majority-voting algorithm:
Precompute the cross-ratios of all the pattern's consecutive 4-tuples of corners, in both the horizontal and vertical directions.
Do the same for the detected corners in the grid.
For every possible horizontal shift,
for every row,
accumulate the number of cross-ratios that agree within a threshold.
Select the horizontal shift with the highest number of agreements.
Repeat the above for every possible vertical shift, counting cross-ratios along the columns.
Repeat the above two steps reversing the order of the cross-ratios in the horizontal and vertical directions, separately and jointly, to account for reflections and rotations.
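A minimal sketch of the per-row part of that vote (the function names and the tolerance handling are mine, not from the paper):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Cross-ratio of four consecutive collinear points with 1-D coordinates a, b, c, d.
    double crossRatio(double a, double b, double c, double d) {
        return ((c - a) * (d - b)) / ((c - b) * (d - a));
    }

    // Cross-ratios of all consecutive 4-tuples along one row of corner coordinates.
    std::vector<double> rowCrossRatios(const std::vector<double>& xs) {
        std::vector<double> crs;
        for (std::size_t i = 0; i + 3 < xs.size(); ++i)
            crs.push_back(crossRatio(xs[i], xs[i + 1], xs[i + 2], xs[i + 3]));
        return crs;
    }

    // Number of detected cross-ratios that agree with the pattern's at a given shift.
    int agreements(const std::vector<double>& pattern,
                   const std::vector<double>& detected,
                   std::size_t shift, double tol) {
        int count = 0;
        for (std::size_t i = 0; i + shift < pattern.size() && i < detected.size(); ++i)
            if (std::fabs(pattern[i + shift] - detected[i]) < tol)
                ++count;
        return count;
    }

The winning horizontal shift is the one that maximizes the agreements summed over all rows; the vertical vote is symmetric.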
Placing the detected corners in a grid can be achieved in various ways. There is an often-rediscovered algorithm that uses topological proximity. The idea is to first associate each corner to all the squares within a small window of it, thus building a corner->squares table, and then traverse it as a graph to build a global table of the offsets of each corner from one another.
The doc for findChessboardCorners says that
patternSize – Number of inner corners per a chessboard row and
column
So patternSize is not the size of the chessboard inside the image, but the number of inner corners. The number of inner corners does not depend on the size of the chessboard inside the image.
For example, for the following image: https://github.com/Itseez/opencv/blob/3.1.0/samples/data/chessboard.png
patternSize should be cv::Size(7,7).
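For completeness, a minimal call for that image might look like this (the sub-pixel refinement step is my addition, not something the answer requires):

    #include <opencv2/opencv.hpp>
    #include <vector>

    int main() {
        cv::Mat img = cv::imread("chessboard.png", cv::IMREAD_GRAYSCALE);

        // 7x7 inner corners for the 8x8-square board in the sample image.
        std::vector<cv::Point2f> corners;
        bool found = cv::findChessboardCorners(
            img, cv::Size(7, 7), corners,
            cv::CALIB_CB_ADAPTIVE_THRESH | cv::CALIB_CB_NORMALIZE_IMAGE);

        if (found) {
            // Optional: refine the detected corners to sub-pixel accuracy.
            cv::cornerSubPix(img, corners, cv::Size(11, 11), cv::Size(-1, -1),
                             cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.1));
        }
        return found ? 0 : 1;
    }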

How to find whether an image is more or less homogeneous w.r.t. color (hue)?

UPDATE:
I have segmented the image into different regions. For each region, I need to know whether it is more or less homogeneous in terms of color.
What could be the possible strategies to do so?
previous:
I want to check the color variance (preferably the hue variance) of an image to find the images made up of homogeneous colors (i.e. the images which have only one or two colors).
I understand that one strategy could be to create a hue histogram and count the occurrences of each color, but I have several images altogether, and creating a 180-bin hue histogram for each one would make the whole code computationally expensive.
Is there any built-in OpenCV method, or another simpler method, to find out whether an image consists of a homogeneous color only or of several colors?
Something which can calculate the variance of the hue image would also be fine; I could not find something like variance(image).
PS: I am writing the code in C++.
The variance can be computed without a histogram, as the average of the squared values minus the square of the average value. It takes a single pass over the image, with two accumulators. Choose an accumulator data type that will not overflow.
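A minimal sketch of that single pass over the hue channel (the function name is mine; note that hue is circular in OpenCV, so for hues wrapping around 0/180 this plain variance is only an approximation):

    #include <opencv2/opencv.hpp>

    // Single-pass variance of the hue channel: E[h^2] - (E[h])^2.
    double hueVariance(const cv::Mat& bgr) {
        cv::Mat hsv;
        cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

        double sum = 0.0, sumSq = 0.0;      // double accumulators will not overflow here
        for (int y = 0; y < hsv.rows; ++y) {
            const cv::Vec3b* row = hsv.ptr<cv::Vec3b>(y);
            for (int x = 0; x < hsv.cols; ++x) {
                double h = row[x][0];       // OpenCV hue is in [0, 180)
                sum += h;
                sumSq += h * h;
            }
        }
        const double n = static_cast<double>(hsv.rows) * hsv.cols;
        const double mean = sum / n;
        return sumSq / n - mean * mean;
    }

cv::meanStdDev on the hue channel gives the same numbers in one call, if you don't need the manual loop.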

What is the difference between dense SIFT and HoG?

I am new to computer vision. I am studying dense SIFT and HOG. For dense SIFT, the algorithm just considers every point as an interest point and computes its gradient vector. HOG is another way to describe an image with a gradient vector.
I think dense SIFT is a special case of HOG. In HOG, if we set the bin size to 8, then for each window there are 4 blocks, each block has 4 cells, and the block stride is the same as the block size; we still get a 128-dimensional vector for the window. And we can set any window stride to slide the window across the whole image. If the window stride of the two algorithms is the same, they can get identical results.
I am not sure whether I am correct. Can anyone help me?
The SIFT descriptor takes a 16x16 patch and divides it into a 4x4 grid of cells. Over each of these 16 cells it computes a histogram of oriented gradients, and while computing this histogram it also interpolates between neighboring angle bins. Once all the cell histograms are computed, it weights the values of the whole 16x16 descriptor with a Gaussian of half the window size, centered at the center of the 16x16 block.
HOG, on the other hand, only computes a simple histogram of oriented gradients, as the name says.
I feel that SIFT is better suited to describing the importance of a point, due to the Gaussian weighting involved, while HOG has no such bias. For this reason, (ideally) HOG should be better suited to classifying images than dense SIFT, if all feature vectors are concatenated into one huge vector (this is my opinion and may not be true).

How can I remove small parallel lines in an image?

I have a black and white image after binarization, like the one below:
How can I remove the small lines parallel to the long curves using OpenCV? I can remove them by removing all small objects, but I want to remove only the small parallel lines.
This looks like a Canny artifact (or some kind of ringing artifact) to me. There are several ways to remove them.
An empiric but not too compute-intensive method would be to locate all small features and superimpose them with the same image shifted by [+/-]X, [+/-]Y. If a feature is completely coincident with the shifted image, i.e., all pixels in the white feature are also white in the shifted image, then you are probably looking at an artifact.
To evaluate the "smallness" of a feature, you can use a basic flood fill. This method is cheap because you can simulate the shifting with pointers, without actually allocating four shifted images. It is prone to false positives wherever you really have small parallel lines, and to false negatives if the artifacts are very large.
Another method would be to posterize the original image twice with different thresholds. While the "real" lines will stay together, the ringing artifacts will have a different strength. You then evaluate the image difference and consider as "artifact" every feature that is farther than a given threshold from the image track. This is a bit more computationally intensive and yields better results, but it depends on what you have as an original image, i.e. on your workflow.
It is possible that reevaluating the workflow (altering the edge detection phase) could avoid the creation of the artifacts altogether.
Use the cvBlobsLib library to detect the white patches as blobs. cvBlobsLib provides functions to extract different features of the blobs, like area and ellipticity. So if you want only the smaller patches parallel to the long curve:
Get the long curve on the basis of the area covered by the blob, or its perimeter, i.e. the contour length of the blob.
Get the ellipticity, or the orientation of the major axis, of the long curve after fitting an ellipse (cvBlobsLib will do that for you!).
Filter out all those blobs which are below a threshold in terms of area or contour length and have the same orientation as the long curve.
Hope this might work.
If you know the orientation of your lines in advance, you can do a morphological closing with a custom structuring element adapted to your needs.
See mathematical morphology on Wikipedia.
See the OpenCV documentation.
Perhaps similar to what the others said, but in simpler words: since the small lines seem to have roughly half the thickness of the long ones, if you don't really care about preserving the long lines exactly as they are, you could repeatedly apply a simple algorithm that "makes the lines thinner" until the small ones disappear. What you need to do is scan the image pixel by pixel, and when you detect a white pixel above, below, to the left of, or to the right of a black pixel, store its coordinates in a vector. After you traverse the entire image, make all the pixels at the coordinates stored in the vector black. You could determine the number of iterations of this algorithm empirically.
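A minimal sketch of one such thinning pass, assuming white = 255 on black (the two-pass structure mirrors the description above; a 3x3 cv::erode would do the same job in one idiomatic call):

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Peel one layer of white pixels that touch a black 4-neighbor.
    void thinOnce(cv::Mat& bin) {  // bin: 8-bit, white = 255, black = 0
        std::vector<cv::Point> boundary;
        for (int y = 1; y < bin.rows - 1; ++y)
            for (int x = 1; x < bin.cols - 1; ++x)
                if (bin.at<uchar>(y, x) == 255 &&
                    (bin.at<uchar>(y - 1, x) == 0 || bin.at<uchar>(y + 1, x) == 0 ||
                     bin.at<uchar>(y, x - 1) == 0 || bin.at<uchar>(y, x + 1) == 0))
                    boundary.push_back(cv::Point(x, y));  // collect coordinates first...
        for (const cv::Point& p : boundary)
            bin.at<uchar>(p.y, p.x) = 0;                  // ...then blacken, as described
    }

Calling thinOnce in a loop for an empirically chosen number of iterations removes the thin lines first.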
Here are some steps exploiting the fact that the parallel lines increase the local edge density.
1) Apply an adaptive threshold to the gray image to get many edges.
2) Apply a 3x3 (or experiment, but keep it small) morphological erosion.
3) Take the logical NOT to get the edge density.
4) Apply a dilation of about 3x3 or 5x5. It will dilate the edges so that they merge into a region.
5) Now apply a 7x7 erosion (or experiment with something larger than the last dilation). It will remove most of the unwanted regions, the long lines, and small stray areas.
The output is a MASK of the region to remove. You can apply contour detection to the original image and remove each contour object at the matching position in the mask, for high-precision removal.
Or, if you don't need a high-precision result, simply AND the image with the NOT of the mask.
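A minimal sketch of that pipeline (the adaptiveThreshold block size and the kernel sizes are illustrative; tune them to your image):

    #include <opencv2/opencv.hpp>

    cv::Mat removalMask(const cv::Mat& gray) {
        cv::Mat edges, mask;
        // 1) adaptive threshold -> many edges
        cv::adaptiveThreshold(gray, edges, 255, cv::ADAPTIVE_THRESH_MEAN_C,
                              cv::THRESH_BINARY, 11, 2);
        // 2) small erosion
        cv::erode(edges, edges, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3)));
        // 3) logical NOT -> edge density
        cv::bitwise_not(edges, edges);
        // 4) dilation merges nearby edges into regions
        cv::dilate(edges, mask, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5)));
        // 5) larger erosion drops long lines and small stray areas
        cv::erode(mask, mask, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(7, 7)));
        return mask;
    }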
Why not do something like:
Find the long curves (using findContours and filtering by size).
Find the small curves.
For each long curve, calculate the minimal distance between each point of every small curve and the long curve.
Calculate the mean and the standard deviation of these minimal distances.
Reject small curves for which either the mean minimal distance to the long curve is too large, or the standard deviation of the minimal distances is large.
The result will probably be better (and faster) if you skeletonize the image first.
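A minimal sketch of that filter using cv::pointPolygonTest for the minimal distances (the thresholds are placeholders; I read "reject" as discarding the non-parallel candidates, so the small curves that survive, i.e. stay at a small and near-constant distance from a long curve, are the parallel artifacts to delete):

    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Flag small curves that run at a small, near-constant distance from a long curve.
    std::vector<bool> flagParallelArtifacts(const cv::Mat& bin,
                                            double longLen  = 200.0,  // placeholder
                                            double maxMean  = 15.0,   // placeholder
                                            double maxStdev = 3.0) {  // placeholder
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(bin.clone(), contours, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);

        std::vector<std::vector<cv::Point>> longCurves, smallCurves;
        for (const auto& c : contours)
            (cv::arcLength(c, false) > longLen ? longCurves : smallCurves).push_back(c);

        std::vector<bool> artifact(smallCurves.size(), false);
        for (std::size_t i = 0; i < smallCurves.size(); ++i) {
            for (const auto& lc : longCurves) {
                double sum = 0.0, sumSq = 0.0;
                for (const cv::Point& p : smallCurves[i]) {
                    // Unsigned minimal distance from point p to the long curve.
                    double d = std::fabs(cv::pointPolygonTest(lc, cv::Point2f(p), true));
                    sum += d;
                    sumSq += d * d;
                }
                const double n = static_cast<double>(smallCurves[i].size());
                const double mean = sum / n;
                const double stdev = std::sqrt(std::max(0.0, sumSq / n - mean * mean));
                if (mean < maxMean && stdev < maxStdev)
                    artifact[i] = true;  // close and parallel: flag for removal
            }
        }
        return artifact;
    }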
Good luck with it,