I want to calculate the histogram for Variance local binary pattern of a gray scale image using OpenCV C++.
Can someone explain to me how exactly to compute the histogram of the variance LBP in OpenCV C++, and what it actually means?
Also please provide some links that are useful in this case.
VAR is a rotation invariant measure of local variance (have a look at this paper for a more in-depth explanation) defined as:
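(Here $g_p$ denotes the intensity of the $p$-th pixel in the neighbourhood; the formula below is the standard definition from the LBP literature.)

$$\mathrm{VAR}_{P,R} = \frac{1}{P}\sum_{p=0}^{P-1}\left(g_p - \mu\right)^2, \qquad \mu = \frac{1}{P}\sum_{p=0}^{P-1} g_p$$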
where P is the number of pixels in the local neighbourhood and μ is the average intensity computed across the local neighbourhood.
LBP variance (LBPV) is a texture descriptor that uses VAR as an adaptive weight to adjust the contribution of the LBP code in histogram calculation (see this paper for details). The value of the kth bin of the LBPV histogram can be expressed as:
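(Following the notation of the LBPV paper:)

$$\mathrm{LBPV}_{P,R}(k) = \sum_{i=1}^{N}\sum_{j=1}^{M} w\big(\mathrm{LBP}_{P,R}(i,j),\,k\big)$$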
where N and M are the number of rows and columns of the image, respectively, and w is given by:
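$$w\big(\mathrm{LBP}_{P,R}(i,j),\,k\big) = \begin{cases}\mathrm{VAR}_{P,R}(i,j) & \text{if } \mathrm{LBP}_{P,R}(i,j) = k\\ 0 & \text{otherwise}\end{cases}$$

In other words, each LBP code contributes its local variance, rather than a unit count, to its bin.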
According to this answer the code for calculating LBP using OpenCV is not available for public use, but here you can find a workaround to make that function accessible.
I want to know about non redundant local binary pattern for texture description. What is the difference between original LBP and non-redundant LBP in texture description?
Can someone clarify the above-mentioned topic with a good example?
The Non-Redundant Local Binary Pattern (NRLBP) descriptor treats an LBP code and its complement as the same pattern, so the number of bins in the LBP histogram is halved (see this paper for further details).
The following toy example might help you figure out how NRLBP works. Consider an image of just 3 rows and 4 columns with the intensity levels shown below:
There are only two LBP codes in this image, namely 10101010₂ = 170 and 01010101₂ = 85.
Thus, the LBP representation of the image is a feature vector of 256 components. The bins corresponding to those two patterns take the value 0.5 and the remaining bins are zero (I'm assuming that the histogram is normalized).
The NRLBP representation of the image turns out to be a feature vector of 128 components. As both patterns are 1's complement of each other, they are actually the same pattern in this texture model and thus the only nonzero bin corresponds to the pattern code 85 and takes the value 1.
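To make the folding step concrete, here is a rough sketch of how one might build a 128-bin NRLBP histogram with OpenCV (plain 8-neighbour LBP with deliberately naive border handling; the function name is mine):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// 128-bin NRLBP histogram: an LBP code and its bitwise complement (255 - code)
// are folded into the same bin.
std::vector<float> nrlbpHistogram(const cv::Mat& gray)   // gray: CV_8UC1
{
    static const int dy[8] = {-1,-1,-1, 0, 0, 1, 1, 1};
    static const int dx[8] = {-1, 0, 1,-1, 1,-1, 0, 1};

    std::vector<float> hist(128, 0.f);
    for (int i = 1; i < gray.rows - 1; ++i)
        for (int j = 1; j < gray.cols - 1; ++j)
        {
            const uchar c = gray.at<uchar>(i, j);
            int code = 0;
            for (int n = 0; n < 8; ++n)
                code |= (gray.at<uchar>(i + dy[n], j + dx[n]) >= c) << n;
            hist[std::min(code, 255 - code)] += 1.f;     // fold complementary codes
        }

    const float total = float((gray.rows - 2) * (gray.cols - 2));
    for (float& h : hist) h /= total;                    // normalize the histogram
    return hist;
}
```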
I would like to know proper explanation of the clahe parameters
i.e clipLimit and tileGridSize.
and how does clipLimit value effects the contrast of the image and what factors(like image resolution, object sizes) to be considered to select tileGridSize.
Thanks in advance
This question was asked a long time ago, but I was searching for an answer myself and ended up here, so here are some links and notes that may help; most of the information below is collected from different sites.
AHE is a computer image processing technique used to improve contrast in images. It differs from ordinary histogram equalization in the respect that the adaptive method computes several histograms, each corresponding to a distinct section of the image, and uses them to redistribute the lightness values of the image. It is therefore suitable for improving the local contrast and enhancing the definitions of edges in each region of an image.
However, AHE has a tendency to over-amplify noise in relatively homogeneous regions of an image. A variant of adaptive histogram equalization called Contrast Limited Adaptive Histogram Equalization (CLAHE) prevents this by limiting the amplification.
CLAHE limits the amplification by clipping the histogram at a predefined value (called the clip limit).
tileGridSize refers to the size of the grid for histogram equalization: the input image is divided into equally sized rectangular tiles, and tileGridSize defines the number of tiles per row and column.
This is the OpenCV documentation for the CLAHE class and its available functions:
https://docs.opencv.org/master/d6/db6/classcv_1_1CLAHE.html
These links are also helpful:
https://en.wikipedia.org/wiki/Adaptive_histogram_equalization#Contrast_Limited_AHE
http://www.cs.utah.edu/~sujin/courses/reports/cs6640/project2/clahe.html
In short: clipLimit is the threshold used for contrast limiting, and tileGridSize defines the number of tiles per row and column.
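For reference, a minimal usage sketch with OpenCV's CLAHE class (the file name and parameter values are just examples):

```cpp
#include <opencv2/opencv.hpp>

int main()
{
    // CLAHE operates on a single-channel image (e.g. grayscale or the L channel of Lab)
    cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE);

    // clipLimit = 2.0  -> threshold used for contrast limiting
    // tileGridSize 8x8 -> the image is split into 8x8 tiles, each equalized locally
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(2.0, cv::Size(8, 8));

    cv::Mat enhanced;
    clahe->apply(gray, enhanced);

    cv::imwrite("clahe.png", enhanced);
    return 0;
}
```

Increasing clipLimit gives stronger local contrast (and more noise amplification), while a larger tileGridSize produces smaller tiles and therefore more local adaptation.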
I have an OpenCV application and I have to implement a correspondence search using varying support weights between a pair of images. This work is very similar to "Adaptive Support-Weight Approach for Correspondence Search" by Kuk-Jin Yoon and In So Kweon. The support weights are defined over a given support window.
I calculate the dissimilarity between pixels using the support weights in the two images. The dissimilarity between pixel p and p_d is given by:
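(Reconstructed from the Yoon–Kweon paper; $e(q, q_d)$ denotes the raw pixel-based matching cost between $q$ and $q_d$.)

$$E(p, p_d) = \frac{\displaystyle\sum_{q \in N_p,\; q_d \in N_{p_d}} w(p,q)\, w(p_d, q_d)\, e(q, q_d)}{\displaystyle\sum_{q \in N_p,\; q_d \in N_{p_d}} w(p,q)\, w(p_d, q_d)}$$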
where p_d and q_d are the pixels in the target image when pixels p and q in the reference image have a disparity value d, and N_p and N_{p_d} are the corresponding support windows.
After this, the disparity of each pixel is selected by the WTA (Winner-Takes-All) method as:
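$$d_p = \arg\min_{d} E(p, p_d)$$

i.e. for each pixel p the candidate disparity with the lowest aggregated dissimilarity is kept.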
What I would like to know is how to proceed starting from the formula in fig. 1 (I have already written the functions that compute the dissimilarity and the weights): which pixels should I consider, and where do I start? Any suggestions?
The final result of the work should be similar to:
What could be a good way to do it?
UPDATE1
Should I start by creating a new image, then consider the cost between pixel (0,0) and all the other pixels, find the minimum value, and set that value as the value of the new image at pixel (0,0)? And so on for the other pixels?
I am trying to classify MRI images of brain tumors into benign and malignant using C++ and OpenCV. I am planning on using the bag-of-words (BoW) method after clustering SIFT descriptors with k-means. Meaning, I will represent each image as a histogram, with the whole "codebook"/dictionary for the x-axis and the occurrence count of each word in the image for the y-axis. These histograms will then be the input for my SVM (with RBF kernel) classifier.
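As an aside, OpenCV already provides helper classes for that BoW pipeline. A rough sketch is shown below (SIFT is cv::xfeatures2d::SIFT in OpenCV 3.x and cv::SIFT in the main module since 4.4, so adjust to your version; the file names and dictionary size are placeholders):

```cpp
#include <opencv2/opencv.hpp>

int main()
{
    cv::Ptr<cv::Feature2D> sift = cv::SIFT::create();    // cv::xfeatures2d::SIFT::create() on 3.x

    // 1) pool SIFT descriptors from the training images and cluster them with k-means
    cv::BOWKMeansTrainer bowTrainer(200);                 // 200-word codebook (placeholder)
    for (const std::string& file : {std::string("train1.png"), std::string("train2.png")})
    {
        cv::Mat img = cv::imread(file, cv::IMREAD_GRAYSCALE);
        std::vector<cv::KeyPoint> kps;
        cv::Mat desc;
        sift->detectAndCompute(img, cv::noArray(), kps, desc);
        bowTrainer.add(desc);
    }
    cv::Mat vocabulary = bowTrainer.cluster();

    // 2) represent an image as a histogram over the vocabulary
    cv::Ptr<cv::DescriptorMatcher> matcher = cv::DescriptorMatcher::create("FlannBased");
    cv::BOWImgDescriptorExtractor bowExtractor(sift, matcher);
    bowExtractor.setVocabulary(vocabulary);

    cv::Mat query = cv::imread("query.png", cv::IMREAD_GRAYSCALE);
    std::vector<cv::KeyPoint> kps;
    sift->detect(query, kps);
    cv::Mat bowHist;                                      // this vector is the SVM input
    bowExtractor.compute(query, kps, bowHist);
    return 0;
}
```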
However, the disadvantage of using BoW is that it ignores the spatial information of the descriptors in the image. Someone suggested to use SPM instead. I read about it and came across this link giving the following steps:
1. Compute K visual words from the training set and map each local feature to its visual word.
2. For each image, initialize K multi-resolution coordinate histograms to zero. Each coordinate histogram consists of L levels, and level i has 4^i cells that evenly partition the current image.
3. For each local feature in this image (say its visual word ID is k), pick out the k-th coordinate histogram and accumulate one count into each of the L corresponding cells of this histogram, according to the coordinate of the local feature. The L cells are the cells the local feature falls into at the L different resolutions.
4. Concatenate the K multi-resolution coordinate histograms to form a final "long" histogram of the image. When concatenating, the k-th histogram is weighted by the probability of the k-th visual word.
5. To compute the kernel value over two images, sum up all the cells of the intersection of their "long" histograms.
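To give an idea of what steps 2–4 amount to in code, here is a rough sketch (the function and variable names are mine, and the per-word weighting of step 4 is left as a comment):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// Concatenated multi-resolution coordinate histograms:
// for each visual word k there are L levels, and level l has a 2^l x 2^l grid of cells.
std::vector<float> spatialPyramidHistogram(const std::vector<cv::Point2f>& pts,
                                           const std::vector<int>& wordId,
                                           int K, int L, cv::Size imgSize)
{
    std::vector<int> levelOffset(L);
    int cellsPerWord = 0;
    for (int l = 0; l < L; ++l) { levelOffset[l] = cellsPerWord; cellsPerWord += (1 << l) * (1 << l); }

    std::vector<float> hist(K * cellsPerWord, 0.f);
    for (size_t i = 0; i < pts.size(); ++i)
        for (int l = 0; l < L; ++l)
        {
            const int g  = 1 << l;                                        // g x g grid at level l
            const int cx = std::min(g - 1, int(pts[i].x * g / imgSize.width));
            const int cy = std::min(g - 1, int(pts[i].y * g / imgSize.height));
            hist[wordId[i] * cellsPerWord + levelOffset[l] + cy * g + cx] += 1.f;
        }
    // step 4: optionally weight each word's block (e.g. by that word's overall frequency)
    return hist;
}
```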
Now, I have the following questions:
What is a coordinate histogram? Doesn't a histogram just show the counts for each grouping on the x-axis? How will it provide information on the coordinates of a point?
How would I compute the probability of the k-th visual word?
What will be the use of the "kernel value" that I will get? How will I use it as input to the SVM? If I understand it right, is the kernel value used in the testing phase and not in the training phase? If yes, then how will I train my SVM?
Or do you think I don't need to burden myself with the spatial info and should just stick with normal BoW for my situation (benign vs. malignant tumors)?
Someone please help this poor little undergraduate. You'll have my forever gratefulness if you do. If you have any clarifications, please don't hesitate to ask.
Here is the link to the actual paper, http://www.csd.uwo.ca/~olga/Courses/Fall2014/CS9840/Papers/lazebnikcvpr06b.pdf
MATLAB code is provided here http://web.engr.illinois.edu/~slazebni/research/SpatialPyramid.zip
The coordinate histogram (mentioned in your post) is simply a histogram computed over a sub-region of the image. These slides explain it visually: http://web.engr.illinois.edu/~slazebni/slides/ima_poster.pdf.
So you have multiple histograms here, one for each region of the image. The probability (or count) of the k-th visual word depends on the SIFT points that fall in that sub-region.
I think you need to define your pyramid kernel as mentioned in the slides.
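For the kernel itself, a minimal version of the histogram intersection over two such "long" histograms could look like the sketch below (the function name is mine):

```cpp
#include <vector>
#include <algorithm>
#include <cassert>

// Histogram intersection: sum of element-wise minima of the two pyramid histograms.
double intersectionKernel(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        s += std::min(a[i], b[i]);
    return s;
}
```

As far as I know, OpenCV's cv::ml::SVM does not accept a precomputed kernel matrix directly; libsvm does (its "precomputed" kernel type), which is one common way to plug such a kernel into an SVM.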
A Convolutional Neural Network may be better suited for your task if you have enough training samples. You can probably have a look at Torch or Caffe.
I am trying to find a way to parametrize the precision of my homography calculation. I would like to obtain a value that describes the precision of the homography calculation for a measurement taken at a certain position.
I currently have successfully calculated the homography (with cv::findHomography) and I can use it to map a point in my camera image onto a 2D map (using cv::perspectiveTransform). Now I want to track these objects on my 2D map, and to do this I want to take into account that objects in the back of my camera image have a less precise position on my 2D map than objects all the way in the front.
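For context, a minimal sketch of that mapping step (the correspondences are dummy values, just to keep the snippet self-contained):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    // correspondences: pixel in the camera image -> point on the 2D map (dummy values;
    // in practice use more correspondences and cv::RANSAC for robustness)
    std::vector<cv::Point2f> imgPts = {{100, 400}, {540, 410}, {420, 120}, {180, 130}};
    std::vector<cv::Point2f> mapPts = {{0, 0}, {10, 0}, {10, 20}, {0, 20}};

    cv::Mat H = cv::findHomography(imgPts, mapPts);

    // map a detection from the camera image onto the 2D map
    std::vector<cv::Point2f> detections = {{300, 250}}, onMap;
    cv::perspectiveTransform(detections, onMap, H);
    return 0;
}
```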
I have looked at the following example on this website that mentions plane fitting but I don't really understand how to fill the matrices correctly using this method. The visualisation of the result does seem to fit my needs. Is there any way to do this with standard OpenCV functions?
EDIT:
Thanks Francesco for your recommendations, but I think I am looking for something different from your answer. I am not looking to test the precision of the homography itself, but rather the relation between the density of measurements in the real camera view and the actual size on the map I create. I want to know, when I am 1 pixel off in my detection in the camera image, how many meters that corresponds to on my map at that point.
I can of course calculate this by taking some pixels around my measurement in the camera image and then using the homography to see how many meters on the map they represent, but I don't want to compute this every time. What I would like is a formula that tells me the relation between pixels in my image and pixels on my map, so I can take this into account for my tracking on the map.
What you are looking for is called "predictive error bars" or "prediction uncertainty". You should definitely consult a good introductory book on estimation theory for details (e.g. this one). But briefly, the predictive uncertainty is the probability that...
A certain pixel p in image 1 is the mapping H(p') of a pixel p' in image 2 under the homography H, ...
Given the uncertainty in H which is due to the errors in the matched pairs (q0, q0'), (q1, q1'), ..., that have been used to estimate H, ...
But assuming the model is correct, that is, that the true map between images 1 and 2 is, in fact, a homography (although the estimated parameters of the homography itself may be affected by errors).
In order to estimate this probability distribution you'll need a model for the errors in the measurements, and a model for how they propagate through the (homography) model.
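As a simplified illustration of that propagation step (and of the "pixels-to-meters" relation asked about in the edit): if you already trust an estimated H and only want to know how a pixel error at a given image location scales on the map, a first-order approach is to evaluate the Jacobian of the perspective mapping at that point and push a pixel covariance through it. The sketch below does this with a numerical Jacobian; the function names and the eps step are my own choices.

```cpp
#include <opencv2/opencv.hpp>

// Map one image point through the homography H (3x3, CV_64F).
static cv::Point2d mapPoint(const cv::Mat& H, const cv::Point2d& p)
{
    const double x = H.at<double>(0,0)*p.x + H.at<double>(0,1)*p.y + H.at<double>(0,2);
    const double y = H.at<double>(1,0)*p.x + H.at<double>(1,1)*p.y + H.at<double>(1,2);
    const double w = H.at<double>(2,0)*p.x + H.at<double>(2,1)*p.y + H.at<double>(2,2);
    return cv::Point2d(x / w, y / w);
}

// First-order propagation: J = d(map point)/d(image pixel), evaluated numerically at p;
// the returned matrix is J * Cin * J^T, where Cin is the pixel-measurement covariance.
static cv::Matx22d propagatePixelCovariance(const cv::Mat& H, const cv::Point2d& p,
                                             const cv::Matx22d& Cin)
{
    const double eps = 1e-3;   // finite-difference step, in pixels
    const cv::Point2d dx = (mapPoint(H, {p.x + eps, p.y}) - mapPoint(H, {p.x - eps, p.y})) * (1.0 / (2 * eps));
    const cv::Point2d dy = (mapPoint(H, {p.x, p.y + eps}) - mapPoint(H, {p.x, p.y - eps})) * (1.0 / (2 * eps));
    const cv::Matx22d J(dx.x, dy.x,
                        dx.y, dy.y);
    return J * Cin * J.t();
}
```

The diagonal of the result gives the map-coordinate variance corresponding to a given pixel uncertainty at that location, so the effective "meters per pixel" factor varies across the image as expected; it does not, however, account for the uncertainty in H itself, which is what the error-propagation model above is about.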