I would like to apply PCA on heatmaps of 18 dimensions.
dim(heatmaps)=(224,224,18)
Since PCA takes only data of dim <= 2. I reshape my heatmaps as follow :
heatmaps=heatmaps.reshape(-1,18)
heatmaps.shape
(50176, 18)
Now, l would to apply PCA and take the first components that preserve 95% of variance.
from sklearn.decomposition import PCA
pca = PCA(n_components=18)
reduced_heatmaps=pca.transform(heatmaps)
However the dimension of reduced_heatmaps remains the same as the original heatmaps (50176, 18).
My question is as follow :
How to reduce the dimensionality of my heatmaps while preserving 95% of variance ?
Strange thing
pca.explained_variance_ratio_.cumsum()
array([ 0.05744624, 0.11482341, 0.17167621, 0.22837643, 0.284996 ,
0.34127299, 0.39716828, 0.45296374, 0.50849681, 0.56382308,
0.61910508, 0.67425335, 0.72897448, 0.78361028, 0.83813329,
0.89247688, 0.94636864, 1. ])
It means, I need to keep 17 components to reduce the dimensionality of my data such that l have 18 dimensions.
What is wrong ?
EDIT : following the suggestions of Eric Yang
heatmaps=heatmaps.reshape(18,-1)
heatmaps.shape
(18,50176)
Then applying PCA as follow :
pca = PCA(n_components=11)
reduced_heatmaps=pca.fit_transform(heatmaps)
pca.explained_variance_ratio_.cumsum()
results the following :
array([ 0.21121199, 0.33070526, 0.44827572, 0.55748779, 0.64454442,
0.72588593, 0.7933346 , 0.85083687, 0.89990991, 0.9306283 ,
0.9596194 ], dtype=float32)
11 components is needed to explain 95% variance of my data.
reduced_heatmaps.shape
(18, 11)
Hence we go from (18,50176) to (18, 11)
Thank you for your help
The ability to reduce your variance is a function of your data. If you have an N dimensional gaussian with each dimension N(0,1), each dimension will explain 1/N of your variance, and so your ability to reduce dimensions via PCA would be minimal. So the results of PCA does not seem to be incorrect.
Now based on a superficial understanding of your problem, you have 18 images that are 224x224 correct? If that is correct, then your dimensionality is 224x224 not 18. So you'd want to ask what is the minimum number of pixels in my image that explain the difference between my 18 images. (However, I could be wrong if that is not the assumption, and what you have is 18 channels for 1 image)
There is one other possibility in which you have a series of similar images (and so your dimensionality is going to be 18), and you're looking for the Eigen image. If the images are too different, you will have a minimal reduction in the dimensionality.
Related
I have some raw images to debayer then apply colour corrections/transforms to. I use OpenCV and C++, and for the image sensor used the linear matrix coefficients are:
1.32 -0.46 0.14
-0.36 1.25 0.11
0.08 -1.96 1.88
I am not sure how to apply these to the image. It's not clear to me what I am supposed to do with them and why.
Can anyone explain what these colour reproduction or colour matrix values are, and how to use them to process an image?
Thank you!
Your question is not clear because it seems you also don't know what to do.
"what I am supposed to do with them"
First thing coming to my mind, you can convolve image with that matrix by using filter2D. According to documentation filter2D:
Convolves an image with the kernel.
The function applies an arbitrary linear filter to an image. In-place
operation is supported. When the aperture is partially outside the
image, the function interpolates outlier pixel values according to the
specified border mode.
Here is the example code snippet hpw tp use it:
Mat output;
Mat kernelMatrix = (Mat_<double>(3, 3) << 1.32, -0.46, 0.14,
-0.36, 1.25, 0.11,
0.08, -1.96, 1.88);
filter2D(rawImage, output, -1, kernelMatrix);
Before debayering you have an array B (-ayer) of MxN filtered "graylevel" values. They are physically filtered in the sense that the the number of photons measured by each one of them is affected by the color filter on top of each sensor site.
After debayering you have an array C (-olor) of MxNx3 BGR values, obtained by (essentially) reindexing the B array. However, each of the 3 values at a (row, col) image location represents 3 physical measurements. This is not the final image because we still need to "convert" the physical measurements to numbers that are representative of color channels as perceived by a human (or, more generally, by the intended user, which could also be some kind of image processing software). That is, the physical values need to be mapped to a color space.
The 3x3 "color correction" matrix you have represents one possible mapping - a simple linear one. You need to apply it in turn to each BGR triple at all (row, col) pixel locations. For example (in python/numpy/cv2):
import numpy as np
def colorCorrect(img, M):
"""Applies a color correction M to a BGR image img"""
rows, cols, depth = img.shape
assert depth == 3
assert M.shape == (3, 3)
img_corr = np.zeros((rows, cols, 3), dtype=img.dtype)
for r in range(rows):
for c in range(cols):
img_corr[r, c, :] = M.dot(img[r, c, :])
return img_corr
I have a data set with multiple discrete labels, say 4,5,6. On this I run the ExtraTreesClassifier (I will also run Multinomial logit afterword on the same data, this is just a short example) as below. :
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import roc_curve, auc
clf = ExtraTreesClassifier(n_estimators=200,random_state=0,criterion='gini',bootstrap=True,oob_score=1,compute_importances=True)
# Also tried entropy for the information gain
clf.fit(x_train, y_train)
#y_test is test data and y_predict is trained using ExtraTreesClassifier
y_predicted=clf.predict(x_test)
fpr, tpr, thresholds = roc_curve(y_test, y_predicted,pos_label=4) # recall my labels are 4,5 and 6
roc_auc = auc(fpr, tpr)
print("Area under the ROC curve : %f" % roc_auc)
The question is - is there something like a average ROC curve - basically I could add up all the tpr & fpr seperately for EACH label value and then take means (will that make sense by the way?) - and then just call
# Would this be statistically correct, and would mean something worth interpreting?
roc_auc_avearge = auc(fpr_average, tpr_average)
print("Area under the ROC curve : %f" % roc_auc)
I am assuming, I will get something similar to this afterword - but how do I interpret thresholds in this case ?
How to plot a ROC curve for a knn model
Hence, please also mention if I can/should get individual thresholds in this case and why would one approach be better(statistically) over the other?
What I tried so far (besides averaging):
On changing pos_label = 4 , then 5 & 6 and plotting the roc curves, I see very poor performance, even lesser than the y=x (perfectly random and tpr=fpr case) How should I approach this problem ?
ROC curve averaging has been proposed by Hand & Till in 2001. They basically compute the ROC curves for all comparison pairs (4 vs. 5, 4 vs. 6 and 5 vs. 6) and average the result.
When you compute the ROC curve with pos_label=4, you implicitly say that the other labels are the negatives (5 and 6). Note that this is slightly different from what was proposed by Hand & Till.
A few notes:
You should make sure that your classifier was trained in a way that makes sense with your ROC analysis. If you say pos_label=5 in the roc_curve, and your classifier was train to recognize 5 as intermediate between 4 and 6, you will for sure get nothing useful here
If you get AUC < 0.5, it means you are looking at it in the wrong way (and you should reverse your predictions)
In general ROC analysis is useful for a binary classification. Whether it makes sense for multiclass problems is case-dependant, and it might not be the case for you.
I'm dealing with a small project with OpenCV & C++, maybe the following questions are naive, but I'll be very grateful if anyone could offer help.
Being new here, I don't have enough reputation to post images, so I'll try to make it clear.
I'm trying to denoise an image (MxN = 200x200 Mat) in frequency domain,
and say I have a UxV=3x3 Gaussian kernel {{0, -1, 0},{-1, 4, -1},{0, -1, 0}}, and the expected steps are:
zero-pad the kernel up to (M+U-1) x (N+V-1)
take the 2-D fft of the kernel
zero-pad the image up to (M+U-1) x (N+V-1)
take the 2-D FFT of the image
multiply FFT of kernel by FFT of
image take inverse 2-D FFT of result
Both The result of step 2 (fft of the kernel) and the final result (filted image) seems right, but then I found this answer:
you do need to make K as big as I by padding it with zeros. Also, after padding, but before you take the FFT of the kernel, you need to translate it with wraparound, such that the center of the kernel (the peak of the Gaussian) is at (0,0). Otherwise, your filtered image will be translated.
That's what I didn't do. But how could the result seems acceptable?
So I'm now wondering what's the difference between whether moving the kernel to make its center is at (0, 0) before fft?
Here comes my 2nd question. If we got an shifted fft of an image (with the 0 frequency in the middle), can I just do this to obtain a 'low-pass' effet:
For the pixels whose distance from the center is bigger than a threshold, set their value 0.
(I think it's straight forward, but haven't found some similar methods widely used.)
Thank you VERY much for offering any help :-)
I am using PCA on binary attributes to reduce the dimensions (attributes) of my problem. The initial dimensions were 592 and after PCA the dimensions are 497. I used PCA before, on numeric attributes in an other problem and it managed to reduce the dimensions in a greater extent (the half of the initial dimensions). I believe that binary attributes decrease the power of PCA, but i do not know why. Could you please explain me why PCA does not work as good as in numeric data.
Thank you.
The principal components of 0/1 data can fall off slowly or rapidly,
and the PCs of continuous data too —
it depends on the data. Can you describe your data ?
The following picture is intended to compare the PCs of continuous image data
vs. the PCs of the same data quantized to 0/1: in this case, inconclusive.
Look at PCA as a way of getting an approximation to a big matrix,
first with one term: approximate A ~ c U VT, c [Ui Vj].
Consider this a bit, with A say 10k x 500: U 10k long, V 500 long.
The top row is c U1 V, the second row is c U2 V ...
all the rows are proportional to V.
Similarly the leftmost column is c U V1 ...
all the columns are proportional to U.
But if all rows are similar (proportional to each other),
they can't get near an A matix with rows or columns 0100010101 ...
With more terms, A ~ c1 U1 V1T + c2 U2 V2T + ...,
we can get nearer to A: the smaller the higher ci, the faster..
(Of course, all 500 terms recreate A exactly, to within roundoff error.)
The top row is "lena", a well-known 512 x 512 matrix,
with 1-term and 10-term SVD approximations.
The bottom row is lena discretized to 0/1, again with 1 term and 10 terms.
I thought that the 0/1 lena would be much worse -- comments, anyone ?
(U VT is also written U ⊗ V, called a "dyad" or "outer product".)
(The wikipedia articles
Singular value decomposition
and Low-rank approximation
are a bit math-heavy.
An AMS column by
David Austin,
We Recommend a Singular Value Decomposition
gives some intuition on SVD / PCA -- highly recommended.)
Some details about my problem:
I'm trying to realize corner detector in openCV (another algorithm, that are built-in: Canny, Harris, etc).
I've got a matrix filled with the response values. The biggest response value is - the biggest probability of corner detected is.
I have a problem, that in neighborhood of a point there are few corners detected (but there is only one). I need to reduce number of false-detected corners.
Exact problem:
I need to walk through the matrix with a kernel, calculate maximum value of every kernel, leave max value, but others values in kernel make equal zero.
Are there build-in openCV functions to do this?
This is how I would do it:
Create a kernel, it defines a pixels neighbourhood.
Create a new image by dilating your image using this kernel. This dilated image contains the maximum neighbourhood value for every point.
Do an equality comparison between these two arrays. Wherever they are equal is a valid neighbourhood maximum, and is set to 255 in the comparison array.
Multiply the comparison array, and the original array together (scaling appropriately).
This is your final array, containing only neighbourhood maxima.
This is illustrated by these zoomed in images:
9 pixel by 9 pixel original image:
After processing with a 5 by 5 pixel kernel, only the local neighbourhood maxima remain (ie. maxima seperated by more than 2 pixels from a pixel with a greater value):
There is one caveat. If two nearby maxima have the same value then they will both be present in the final image.
Here is some Python code that does it, it should be very easy to convert to c++:
import cv
im = cv.LoadImage('fish2.png',cv.CV_LOAD_IMAGE_GRAYSCALE)
maxed = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
comp = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
#Create a 5*5 kernel anchored at 2,2
kernel = cv.CreateStructuringElementEx(5, 5, 2, 2, cv.CV_SHAPE_RECT)
cv.Dilate(im, maxed, element=kernel, iterations=1)
cv.Cmp(im, maxed, comp, cv.CV_CMP_EQ)
cv.Mul(im, comp, im, 1/255.0)
cv.ShowImage("local max only", im)
cv.WaitKey(0)
I didn't realise until now, but this is what #sansuiso suggested in his/her answer.
This is possibly better illustrated with this image, before:
after processing with a 5 by 5 kernel:
solid regions are due to the shared local maxima values.
I would suggest an original 2-step procedure (there may exist more efficient approaches), that uses opencv built-in functions :
Step 1 : morphological dilation with a square kernel (corresponding to your neighborhood). This step gives you another image, after replacing each pixel value by the maximum value inside the kernel.
Step 2 : test if the cornerness value of each pixel of the original response image is equal to the max value given by the dilation step. If not, then obviously there exists a better corner in the neighborhood.
If you are looking for some built-in functionality, FilterEngine will help you make a custom filter (kernel).
http://docs.opencv.org/modules/imgproc/doc/filtering.html#filterengine
Also, I would recommend some kind of noise reduction, usually blur, before all processing. That is unless you really want the image raw.