Is it ok to have negative coefficients in approximation band of the wavelet transform of an image? - wavelet

I am trying to decompose an image using various wavelets (Daubechies, Coif, Symlet, ortho) of all orders. Except for db1 (Haar), they all produce some negative coefficients in the approximation band. My understanding is that the approximation band contains the average values of the original image and hence should contain only positive values. Does it also depend on the filter coefficients used for decomposition? I implemented the decomposition using the dwt2 command as well as circular convolution with the filter coefficients. Both produce the same results for higher-order wavelet filters.
I want to extract features from the wavelet coefficients, and negative coefficients may result in wrong feature values, hence I want to clarify.
Thanks.

Yes, the approximation band also depends on the filter coefficients used for decomposition. More precisely, negative approximation coefficients are completely valid when the low-pass decomposition filter has negative coefficients: the approximation band is then a weighted average with some negative weights, so it can go negative even for a nonnegative image. If you require only positive coefficients in the approximation band, use one of these wavelets in MATLAB: rbio1.x, rbio2.x, rbio3.x.
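Here is a small numerical check of this, as a sketch rather than a reproduction of dwt2. The db2 low-pass taps are the standard closed-form values; the helper `approx_band` (a hypothetical name), the periodic boundary handling and the test row are assumptions chosen for illustration. Filter ordering conventions differ between libraries, which does not change the sign argument.

```python
import numpy as np

# Haar (db1) low-pass filter: both taps are positive.
haar_lo = np.array([1.0, 1.0]) / np.sqrt(2.0)

# db2 low-pass filter in closed form: the last tap is negative.
s3 = np.sqrt(3.0)
db2_lo = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def approx_band(x, lo):
    """One-level approximation coefficients with periodic extension.

    Hypothetical helper: a[k] = sum_n lo[n] * x[(2k + n) mod N].
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    idx = (2 * np.arange(n // 2)[:, None] + np.arange(len(lo))[None, :]) % n
    return x[idx] @ lo

# A nonnegative "image row" with a single bright pixel (made-up data).
row = np.zeros(8)
row[3] = 255.0

a_haar = approx_band(row, haar_lo)
a_db2 = approx_band(row, db2_lo)
print("Haar:", np.round(a_haar, 3))
print("db2 :", np.round(a_db2, 3))
print("db2 has a negative approximation coefficient:", a_db2.min() < 0)
```

The negative value comes purely from the negative filter tap meeting the bright pixel, so it is a property of the filter and not an error in the decomposition.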

Related

Why do edge detection filters sum to 0 whereas blur filters sum to 1?

I am now learning about filters in computer vision. I can see that the elements of the kernel for edge detection sum to 0, whereas for blurring sum to 1.
I am wondering, does it have to do with the fact that the one is a high-pass and the other is a low-pass filter? Is there some kind of rule or explanation?
Thanks in advance!
Blur filters must preserve the mean image intensity. This is why their kernels sum to 1. If you look at their frequency response, you’ll see that the zero-frequency component (DC component) is 1. This component is the sum over the kernel. And it being 1 means that the DC component of the image is not modified when applying the convolution. Yes, this is a property of any low-pass filter. Modifying the zero frequency means you don’t let low frequencies pass unaltered.
What you call edge detection filters are really estimators of the derivative. They add to zero because of the definition of the derivative: the slope at any one point does not depend on how high up that point is. Adding or subtracting a constant from the function (or image) will not change the derivative; the derivatives of I and I+1 are the same. Therefore the derivative filter cannot preserve the mean image intensity: if it did, you’d get a different result for dI/dx and for d(I+1)/dx, which would not make sense.
The Laplace filter (not an edge detector) is a generalized second-order derivative; the same reasoning as above applies.
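Both properties are easy to check numerically. The sketch below uses a 3x3 box blur and a Sobel kernel as examples; the helper `filter_valid` (a hypothetical name) and the test images are assumptions made for illustration.

```python
import numpy as np

box_blur = np.full((3, 3), 1.0 / 9.0)          # low-pass: taps sum to 1
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])          # derivative estimate: taps sum to 0

def filter_valid(image, kernel):
    """Plain 2-D correlation over the 'valid' region (no padding)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

flat = np.full((6, 6), 10.0)               # constant image: only a DC component
ramp = np.tile(np.arange(6.0), (6, 1))     # intensity grows by 1 per column

# The blur keeps the constant level, the derivative filter removes it.
print(np.allclose(filter_valid(flat, box_blur), 10.0))
print(np.allclose(filter_valid(flat, sobel_x), 0.0))

# Adding a constant to the image does not change the derivative estimate.
print(np.allclose(filter_valid(ramp, sobel_x),
                  filter_valid(ramp + 1.0, sobel_x)))
```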

Principal component analysis on proportional data

Is it valid to run a PCA on data that is comprised of proportions? For example, I have data on the proportion of various food items in the diet of different species. Can I run a PCA on this type of data or should I transform the data or do something else beforehand?
I had a similar question. You should search for "compositional data analysis". There are transformations to apply to proportions in order to analyze them with multivariate techniques such as PCA. You can also find "robust" PCA algorithms to run your analysis in R. Let us know if you find an appropriate solution to your specific problem.
I don't think so.
PCA will give you "impossible" answers. You might get principal components with values that proportions can't have, like negative values or values greater than 1. How would you interpret this component?
In technical terms, the support of your data is a subset of the support of PCA. Say you have $k$ classes. Then:
the support for PCA vectors is $\mathbb{R}^k$
the support for your proportion vectors is the simplex in $\mathbb{R}^k$. By simplex I mean the set of vectors $p$ of length $k$ such that:
$0 \le p_i \le 1$ where $i = 1, \dots, k$
$\sum_{i=1}^k p_i = 1$
One way around this is if there's a one-to-one mapping from the simplex to all of $\mathbb{R}^k$. If so, you could map from your proportions to $\mathbb{R}^k$, do PCA there, then map the PCA vectors back to the simplex.
But I'm not sure the simplex is a self-contained linear space. If you add two elements of the simplex, you don't get an element of the simplex :/
A better approach, I think, is clustering, e.g. with Gaussian mixtures or spectral clustering. This is related to PCA. But a nice property of clustering is that you can express any element of your data as a "convex combination" of the clusters. If you analyze your proportion data and find clusters, they (unlike PCA vectors) will be within the simplex, and any mixture of them will be, too.
I also recommend looking into nonnegative matrix factorization (NMF). This is like PCA but, as the name suggests, it constrains both factor matrices to be nonnegative. It's very useful for inferring structure in nonnegative data, like proportions. But NMF does not give you a basis for the simplex.
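To make the "map the proportions, then do PCA" idea from the answers above concrete, here is a minimal sketch using the centered log-ratio (clr) transform from compositional data analysis. It is one possible choice and not necessarily what either answer had in mind. The diet data is made up, the function names are hypothetical, and clr needs strictly positive proportions, so zeros would have to be handled before taking logs.

```python
import numpy as np

def clr(proportions):
    """Centered log-ratio transform; every entry must be strictly positive."""
    logp = np.log(proportions)
    return logp - logp.mean(axis=1, keepdims=True)

def pca(data):
    """PCA via SVD of the column-centered data."""
    centered = data - data.mean(axis=0)
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    explained = sing**2 / np.sum(sing**2)
    scores = centered @ vt.T
    return scores, vt, explained

# Made-up diet proportions: 5 species (rows) over 4 food items (columns).
rng = np.random.default_rng(0)
diet = rng.dirichlet(alpha=[2.0, 3.0, 1.0, 4.0], size=5)

z = clr(diet)                        # each row of z sums to zero
scores, components, explained = pca(z)
print(np.round(explained, 3))        # share of variance per component
```

Because each clr row sums to zero, the transformed data has at most k-1 informative directions, so the last component carries essentially no variance.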

How to calculate efficiently and accurately the Fourier transform of a radial function in Fortran

As my question states, I want to calculate the Fourier transform F(q) of a radial function f(r) (defined on [0,infinity[ and which decays like an exponential exp(-Ar +b) at large r) as accurately as possible in Fortran. The function values come from a data file (which I can easily interpolate through cubic interpolation for example and extrapolate since the behaviour at large r is known).
I'm using the "physics" definition of the Fourier transform in 3D, which gives (because f is radial) :
I first tried to calculate this integral for some chosen values of q using Gauss-Legendre quadrature, generating some 60 or 100 abscissas and weights via the NAG routine D01BCF. With Gauss-Legendre quadrature, the problem is choosing the interval [0,B] on which to integrate. Although the function f loses 4 to 5 orders of magnitude from r=10 to r=20 (for example), the choice of B has a strong influence on the result of the calculation... When I compared my result to a "nearly exact" calculation (made with MATLAB, but with a very long computation time), I saw that it was in fact only valid for small values of q (of the order of 5, whereas I have to deal with values as large as 150). Gauss-Laguerre quadrature does not give any better result, probably because of the oscillatory part of the integrand.
I then tried to compute this Fourier transform for some given values of q with the routine D01ASF. It is a "one-dimensional quadrature, adaptive, semi-infinite interval, weight function cos(ωx) or sin(ωx)", which is exactly what I need. The results are quite convincing for q up to 80 or 100 if I input absolute error tolerances of 10E-5. The problems are that I would need to go to larger q, and that the Fourier transform F(q) oscillates with a magnitude of ~10E-6 at such q's. Lowering the tolerance to 10E-5 already takes some time and even makes the subroutine output an error message, so I don't know if 10E-6 would be feasible.
I'm thus currently wondering whether calculating this Fourier transform with an FFT would be a good idea. The problems I face are that I don't know how to calculate radial wave functions with an FFT, and that I don't know how to use an FFT properly either, since the definition of the transform is not even the same (exponent sign and argument) and I have never used one before.
Would you have any ideas? :)
EDIT 2: I tried the FFT (using the routine C06FAF from the NAG library). It works quite well up to some large values of q. The problem I face is that there is always some constant normalising factor to account for, and I don't get why. This normalising factor evolves with the number N of points used in the mesh. It has the form of a power law: normalising factor F = N^(-0.5) x exp(9.9) approximately (see figure, where the black line is the "exact" Fourier transform and the green, magenta, blue, red and yellow lines are the FFT calculated for different values of N).
EDIT3 : I found the factor to be A*N^(-0.5) where A is the length of the integration mesh
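One possible reading of the A*N^(-0.5) factor, sketched in Python rather than Fortran: if the FFT routine scales its sum by 1/sqrt(N) (as NumPy's `norm="ortho"` does, and which is assumed here to match the routine used), then turning the sum back into an integral needs a factor sqrt(N) times the mesh spacing A/N, i.e. A/sqrt(N). The transform convention F(q) = (4π/q) ∫ r f(r) sin(qr) dr with no extra prefactor is also an assumption, since the formula in the question is not shown. The helper name and the test function exp(-r) are made up for the check.

```python
import numpy as np

def radial_ft_fft(f, A, N):
    """Approximate F(q) = (4*pi/q) * integral_0^A r f(r) sin(q r) dr.

    Hypothetical helper. The mesh is r_j = j*A/N for j = 0..N-1 and the
    result is only available on the FFT grid q_k = 2*pi*k/A.
    """
    dr = A / N
    r = dr * np.arange(N)
    g = r * f(r)
    # norm="ortho" scales the discrete sum by 1/sqrt(N).
    G = np.fft.fft(g, norm="ortho")
    # sum_j g_j sin(q_k r_j) = -sqrt(N) * Im(G_k); multiplying by dr = A/N
    # gives the integral, hence the overall factor A / sqrt(N).
    integral = -(A / np.sqrt(N)) * G.imag
    k = np.arange(1, N // 2)
    q = 2.0 * np.pi * k / A
    return q, 4.0 * np.pi * integral[k] / q

# Check against f(r) = exp(-r), whose transform is 8*pi / (1 + q^2)^2
# in the convention assumed above.
q, F = radial_ft_fft(lambda r: np.exp(-r), A=40.0, N=2**14)
exact = 8.0 * np.pi / (1.0 + q**2) ** 2
low = q < 20.0
print("max relative error for q < 20:", np.max(np.abs(F[low] / exact[low] - 1.0)))
```

With an unnormalised FFT (NumPy's default) the factor would be A/N instead. Note that the q values are fixed by the mesh length A, so a finer q grid means a longer mesh.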

Measure entropy of a binary matrix

I have a 100*100 binary matrix; the probability of each pixel is given by this relation:
I want to know how to calculate the entropy of this image.
According to the given conditions, the probability for each entry in the matrix can be calculated accordingly.
For example, because s(1,1)=0, s(1,2)=0, s(2,1)=0, you can calculate P(s(2,2)=0) and P(s(2,2)=1) using the theorem.
After you have calculated the probabilities for all entries, you can calculate the entropy of this image as an expected value.
Entropy is given by the formula: $H = -\sum P \log_2 P$
where $\log_2$ is the base-2 logarithm and $P$ is the probability of each outcome.
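The formula itself can be sketched as follows. The relation between neighbouring pixels from the question is not shown, so it is not modelled here; the function names are hypothetical and the probabilities are example values only.

```python
import numpy as np

def shannon_entropy(probs):
    """Entropy in bits of a discrete distribution; 0*log(0) is taken as 0."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def binary_entropy(p_one):
    """Entropy in bits of one binary pixel with P(pixel = 1) = p_one."""
    return shannon_entropy([p_one, 1.0 - p_one])

print(binary_entropy(0.5))                          # 1.0
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0
print(binary_entropy(0.9) < binary_entropy(0.5))    # True: a biased pixel carries less information
```

If the pixels were independent, the image entropy would be the sum of the per-pixel entropies. When a pixel depends on its neighbours, as in the question, the per-pixel terms become conditional entropies averaged over the neighbour states.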

Deciding about dimensionality reduction with PCA

I have 2D data (zero-mean normalized). I know its covariance matrix, eigenvalues and eigenvectors. I want to decide whether or not to reduce the dimension to 1 (I use principal component analysis, PCA). How can I decide? Is there any methodology for it?
I am looking for something like: if you look at this ratio and it is high, then it is logical to go on with dimensionality reduction.
PS 1: Does PoV (proportion of variation) stand for it?
PS 2: Here is an answer: https://stats.stackexchange.com/questions/22569/pca-and-proportion-of-variance-explained Is it a criterion to test it?
PoV (proportion of variation) represents how much of the information in the data is retained relative to using all components. It may be used for that purpose. If PoV is high, then less information will be lost.
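For the 2D case in the question, PoV is just the largest eigenvalue divided by the sum of both. Here is a minimal sketch on made-up data; the 0.95 threshold is an assumption and depends on the problem.

```python
import numpy as np

# Made-up zero-mean 2-D data with strongly correlated coordinates.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
data = np.column_stack([x, 0.9 * x + 0.2 * rng.normal(size=500)])
data -= data.mean(axis=0)

cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pov = eigvals / eigvals.sum()                # proportion of variance per component
print(np.round(pov, 3))

threshold = 0.95                             # assumed cut-off, problem dependent
keep_one_dimension = pov[0] >= threshold
print(keep_one_dimension)
```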
You want to sort your eigenvalues by magnitude, then pick the highest 1 or 2 values. Eigenvalues with a very small relative value can be considered for exclusion. You can then project the data onto only the top 1 or 2 eigenvectors to get dimensions for plotting results. This will give a visual representation of the PCA split. Also check out scikit-learn for more on PCA. If the reduced data is then used for classification, precision, recall and F1-score will tell you how well it works.
from http://sebastianraschka.com/Articles/2014_pca_step_by_step.html...
Step 1: 3D Example
"For our simple example, where we are reducing a 3-dimensional feature space to a 2-dimensional feature subspace, we are combining the two eigenvectors with the highest eigenvalues to construct our d×k-dimensional eigenvector matrix W.
matrix_w = np.hstack((eig_pairs[0][1].reshape(3,1),
                      eig_pairs[1][1].reshape(3,1)))
print('Matrix W:\n', matrix_w)
>>>Matrix W:
[[-0.49210223 -0.64670286]
 [-0.47927902 -0.35756937]
 [-0.72672348  0.67373552]]"
Step 2: 3D Example
"In the last step, we use the 2×3-dimensional matrix W that we just computed to transform our samples onto the new subspace via the equation
y=W^T×x
transformed = matrix_w.T.dot(all_samples)
assert transformed.shape == (2,40), "The matrix is not 2x40 dimensional.""
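The quoted snippets rely on `eig_pairs` and `all_samples` defined earlier in that tutorial. A self-contained version might look like the sketch below; the random data is a made-up stand-in for the tutorial's samples, and `cov_mat`, `eig_vals` and `eig_vecs` are names assumed here for illustration.

```python
import numpy as np

# Made-up stand-in for the tutorial's data: 40 samples stored as columns (3 x 40).
rng = np.random.default_rng(42)
all_samples = rng.normal(size=(3, 40))

# Covariance of the three features (rows are variables) and its eigen-decomposition.
cov_mat = np.cov(all_samples)
eig_vals, eig_vecs = np.linalg.eigh(cov_mat)

# Pair each eigenvalue with its eigenvector and sort by decreasing eigenvalue.
eig_pairs = [(eig_vals[i], eig_vecs[:, i]) for i in range(len(eig_vals))]
eig_pairs.sort(key=lambda pair: pair[0], reverse=True)

# Stack the two leading eigenvectors as columns: W has shape (3, 2).
matrix_w = np.hstack((eig_pairs[0][1].reshape(3, 1),
                      eig_pairs[1][1].reshape(3, 1)))

# Project the samples: y = W^T x gives a 2 x 40 array.
transformed = matrix_w.T.dot(all_samples)
assert transformed.shape == (2, 40), "The matrix is not 2x40 dimensional."

# The variance along each new axis equals the corresponding eigenvalue.
print(np.round(np.var(transformed, axis=1, ddof=1), 4))
print(np.round([eig_pairs[0][0], eig_pairs[1][0]], 4))
```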