How to do this specific tensor transformation in Eigen? - c++

I am looking for an idiomatic and efficient solution for this problem:
Let's say I have a 3D tensor with which I want to represent an image of 100*100 pixels on 3 color channels,
Eigen::Tensor<int, 3> input(3,100,100);
The output I would like to get could be stored in
Eigen::Tensor<int, 4> output(3,3,100,100);
I would like to project the 3D input into the 4D output so that each color channel of the original tensor gets its own individual 3D tensor in the output, and within each of those every channel contains the same values, that is
output(0,0,42,42) = output(0,1,42,42) = output(0,2,42,42)
output(0,0,12,12) = output(0,1,12,12) = output(0,2,12,12)
Illustrated on a picture:
Originally I wanted to solve it with this method:
Chip out the individual color channels.
Broadcast each individual color channel to the size I need.
Reshape the broadcast result into the desired format (this is still just a 3D tensor at this point).
Concatenate the individual 3D tensors into one big 4D tensor.
I have two problems with this approach.
Firstly, I just cannot get the reshaping right: it always gives back a reshaped tensor with the dimensionality I want, but the coefficients get shuffled. I started experimenting with the layout of the tensors, but that did not seem to help.
Secondly, this seems very tedious; I feel like there should be a more convenient way to achieve this, but I could not find any hint about it in the documentation.
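For reference, a minimal sketch of a broadcast-based alternative (a suggestion, not the chip/concatenate route described above; assuming the unsupported Eigen Tensor module and the default layout): insert a singleton dimension and broadcast along it.
#include <unsupported/Eigen/CXX11/Tensor>

Eigen::Tensor<int, 3> input(3, 100, 100);
input.setRandom();  // placeholder data

// Insert a singleton dimension after the channel dimension...
Eigen::array<Eigen::Index, 4> reshaped_dims{{3, 1, 100, 100}};
// ...and replicate the tensor 3 times along it.
Eigen::array<Eigen::Index, 4> bcast{{1, 3, 1, 1}};

// output(c, j, x, y) == input(c, x, y) for every j in 0..2
Eigen::Tensor<int, 4> output = input.reshape(reshaped_dims).broadcast(bcast);
Both reshape and broadcast are lazy expressions, so the data is only copied once, when the result is assigned to output.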

Related

Pass variable sized input to Linear layer in Pytorch

I have a Linear() layer in Pytorch after a few Conv() layers. All the images in my dataset are black and white. However most of the images in my test set are of a different dimension than the images in my training set. Apart from resizing the images themselves, is there any way to define the Linear() layer in such a way that it takes a variable input dimension? For example something similar to view(-1)
Well, it doesn't make sense to have a Linear() layer with a variable input size, because it is in fact a learnable matrix of shape [n_in, n_out], and matrix multiplication is not defined for inputs whose feature dimension != n_in.
What you can do is apply pooling from the functional API. You'll need to specify kernel_size and stride such that the resulting output has feature dimension size = n_in.

Clip Unstructured grid and keep arrays data

I'm trying to clip a vtkUnstructuredGrid using vtkClipDataSet. The problem is that after I clip, the resulting vtkUnstructuredGrid doesn't have the point/cell data (the arrays).
This is my code:
vtkSmartPointer<vtkUnstructuredGrid> model = reader->GetOutput();
// this shows that model has one point data array called "Displacements" (vectorial of 3 components)
model->Print(std::cout);
// Plane to cut it
vtkSmartPointer<vtkPlane> plane = vtkSmartPointer<vtkPlane>::New();
plane->SetOrigin(0.0,0.0,0.0); plane->SetNormal(1,0,0);
// Clip data
vtkSmartPointer<vtkClipDataSet> clipDataSet = vtkSmartPointer<vtkClipDataSet>::New();
clipDataSet->SetClipFunction(plane);
clipDataSet->SetInputConnection(model->GetProducerPort());
clipDataSet->InsideOutOn();
clipDataSet->GenerateClippedOutputOn();
//PROBLEM HERE. The print shows that there aren't any arrays on the output data
clipDataSet->GetOutput()->Print(std::cout);
I need the output grid to have the arrays, because I would like to display the values on the resulting grid.
For example, if the data are scalars, I would like to display isovalues on the clipped mesh. If the data are vectorial, I would like to deform the mesh (warp) in the direction of the data vectors.
Here I have an example on ParaView of what I would like to do. The solid is the original mesh and the wireframe mesh is the deformed one.
I'm using VTK 5.10 under C++ (Windows 8.1 64 bits, if that helps).
Thank you!
PS: I tried asking this on the VTKusers list, but I got no answer.
OK, I found the error after the comment of user lib: I was missing the call to Update() after I set the input connection.
Thank you all.
// Clip data
vtkSmartPointer<vtkClipDataSet> clipDataSet = vtkSmartPointer<vtkClipDataSet>::New();
clipDataSet->SetClipFunction(plane);
clipDataSet->SetInputConnection(model->GetProducerPort());
clipDataSet->InsideOutOn();
clipDataSet->GenerateClippedOutputOn();
clipDataSet->Update(); // THIS is the solution

Armadillo porting imagesc to save image bitmap from matrix

I have this Matlab code to display the image after computing the spectrogram (STFT, a couple of PLCA passes, ...):
t = z2 *stft_options.hop/stft_options.sr;
f = stft_options.sr*[0:size(spec_t,1)-1]/stft_options.N/1000;
max_val = max(max(db(abs(spec_t))));
imagesc(t, f, db(abs(spec_t)),[max_val-60 max_val]);
And get this result:
I ported this to C++ successfully using the Armadillo library and got the mat results:
mat f,t,spec_t;
The problem is that I have no idea how to convert the matrix into a bitmap the way imagesc does in Matlab.
I searched and found this answer, but it seems it doesn't work in my case because:
I use a double matrix instead of an integer matrix, so the values can't be used directly as bitmap colors.
The imagesc call takes 4 parameters, including the axis bounds given by the vectors x and y.
imagesc also supports scaling (I actually don't know how that works).
Does anyone have any suggestion?
Update: Here is the result of the save method in Armadillo. It doesn't look like the spectrogram image above. Am I missing something?
spec_t.save("spec_t.png", pgm_binary);
Update 2: saving the spectrogram with db and abs applied:
mat mag_spec_t = db(abs(spec_t)); // where the db method is: m = 10 * log10(m);
mag_spec_t.save("mag_spec_t.png", pgm_binary);
And the result:
Armadillo is a linear algebra package; AFAIK it does not provide graphics routines. If you use something like OpenCV for those, then it is really simple.
See this link about opencv's imshow(), and this link on how to use it in a program.
Note that opencv (like most other libraries) uses row-major indexing (x,y) and Armadillo uses column-major (row,column) indexing, as explained here.
For scaling, it's safest to convert to unsigned char yourself. In Armadillo that would be something like:
arma::Mat<unsigned char> mat2 = arma::conv_to<arma::Mat<unsigned char>>::from(255*(mat-mat.min())/(mat.max()-mat.min())); // conv_to is needed because the element types differ
The t and f variables are for setting the axes, they are not part of the bitmap.
For just writing an image you can use Armadillo. Here is a description of how to write portable grey map (PGM) and portable pixel map (PPM) images. PGM export is only possible for 2D matrices, PPM export only for 3D matrices, where the 3rd dimension (of size 3) holds the channels for red, green and blue.
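As a concrete (hedged) sketch, reusing mat2 from the scaling snippet above and an illustrative file name:
// Write the rescaled 2D matrix as a binary portable grey map (PGM).
// The extension is chosen to match the pgm_binary format flag.
mat2.save("spectrogram.pgm", arma::pgm_binary);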
The reason your matlab figure looks prettier is because it has a colour map: a mapping of every value 0..255 to a vector [R, G, B] specifying the relative intensity of red, green and blue. A photo has an RGB value at every point:
colormap(gray);
x=imread('onion.png');
imagesc(x);
size(x)
That's the 3rd dimension of the image.
Your matrix is a 2d image, so the most natural way to show it is as grey levels (as happened for your spectrum).
x=mean(x,3);
imagesc(x);
This means that the R, G and B intensities jointly increase with the values in mat. You can put a colour map of different R,G,B combinations in a variable and use that instead, i.e. y=colormap('hot');colormap(y);. The variable y shows the R,G,B combinations for the (rescaled) image values.
It's also possible to make your own colour map (in matlab you can specify 64 R, G, and B combinations with values between 0 and 1):
z = [63:-1:0; 1:2:63 63:-2:0; 0:63]'/63;
colormap(z);
Now for increasing image values, red intensities decrease (starting from the maximum level), green intensities quickly increase then decrease, and blue values increase from minimum to maximum.
Because PPM appears (I don't know the format well) not to support colour maps, you need to specify the R,G,B values in a 3D array. For a colour order similar to z you would need to make a Cube<unsigned char> c(ysize, xsize, 3) and then, for every pixel y, x in mat2, do:
c(y,x,0) = 255-mat2(y,x);
c(y,x,1) = 255-abs(255-2*mat2(y,x));
c(y,x,2) = mat2(y,x);
or something very similar.
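A hedged sketch of that loop, using Armadillo's PPM export (assuming mat2 is the rescaled Mat<unsigned char> from above; the names are illustrative):
#include <armadillo>
#include <cstdlib>

// Build an RGB cube from the rescaled grey-level matrix, colour-mapped like z above.
arma::Cube<unsigned char> colourize(const arma::Mat<unsigned char>& mat2)
{
    arma::Cube<unsigned char> c(mat2.n_rows, mat2.n_cols, 3);
    for (arma::uword y = 0; y < mat2.n_rows; ++y)
    {
        for (arma::uword x = 0; x < mat2.n_cols; ++x)
        {
            c(y, x, 0) = 255 - mat2(y, x);                      // red falls with the value
            c(y, x, 1) = 255 - std::abs(255 - 2 * mat2(y, x));  // green peaks mid-range
            c(y, x, 2) = mat2(y, x);                            // blue rises with the value
        }
    }
    return c;
}
// usage: colourize(mat2).save("spec_colour.ppm", arma::ppm_binary);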
You may use SigPack, a signal processing library on top of Armadillo. It has spectrogram support and you may save the plot to a lot of different formats (png, ps, eps, tex, pdf, svg, emf, gif). SigPack uses Gnuplot for the plotting.

How to implement 1D convolution in OpenCV?

Is there any way to implement convolution of 1D signal in OpenCV?
As I can see there is only filter2D, but I'm looking for something like Matlab's convn.
For 1-D convolutions, you might want to look at np.convolve.
See here: https://docs.scipy.org/doc/numpy/reference/generated/numpy.convolve.html
Python OpenCV programs that need a 1-D convolution can use it readily.
You can always view a 1D vector as a 2D mat, and thus simply calling the OpenCV built-in functions resolves the problem.
Below is a snippet that I use to smooth an image histogram.
# Inputs:
# gray = a gray scale image
# smoothing_nbins = int, the width of 1D filter
# Outputs:
# hist_sm = the smoothed image histogram
hist = cv2.calcHist([gray],[0],None,[256],[0,256])
hist_sm = cv2.blur(hist, (1, smoothing_nbins))
As you can see, the only trick you need here is to set one filter dimension to 1.
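In C++, a minimal sketch of the same trick (assumptions: a 1xN CV_32F input, a 'same'-sized output, and replicated borders; note that filter2D computes correlation, so the kernel is flipped to get a true convolution) might look like:
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// 1-D convolution by treating the signal as a 1xN matrix.
cv::Mat conv1d(const cv::Mat& signal, const cv::Mat& kernel)
{
    cv::Mat flipped;
    cv::flip(kernel, flipped, -1);  // flip both axes -> correlation becomes convolution
    cv::Mat result;
    cv::filter2D(signal, result, -1, flipped,
                 cv::Point(-1, -1), 0, cv::BORDER_REPLICATE);
    return result;  // same length as the input ('same' convolution)
}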
As far as I know, if you use convolve2D with just a 1D matrix, it still works, but depending on your specific task it may not be what you need.

ITK - Calculate texture features for segmented 3D brain MRI

I'm trying to calculate texture features for a segmented 3D brain MRI using the ITK library with C++, so I followed this example. The example takes a 3D image and extracts 3 different features for all 13 possible spatial directions. In my program, for a given 3D image, I just want to get:
Energy
Correlation
Inertia
Haralick Correlation
Inverse Difference Moment
Cluster Prominence
Cluster Shade
Here is what I have so far:
//definitions of used types
typedef itk::Image<float, 3> InternalImageType;
typedef itk::Image<unsigned char, 3> VisualizingImageType;
typedef itk::Neighborhood<float, 3> NeighborhoodType;
typedef itk::Statistics::ScalarImageToCooccurrenceMatrixFilter<InternalImageType>
Image2CoOccuranceType;
typedef Image2CoOccuranceType::HistogramType HistogramType;
typedef itk::Statistics::HistogramToTextureFeaturesFilter<HistogramType> Hist2FeaturesType;
typedef InternalImageType::OffsetType OffsetType;
typedef itk::AddImageFilter <InternalImageType> AddImageFilterType;
typedef itk::MultiplyImageFilter<InternalImageType> MultiplyImageFilterType;
void calcTextureFeatureImage (OffsetType offset, InternalImageType::Pointer inputImage)
{
// principal variables
//Gray Level Co-occurance Matrix Generator
Image2CoOccuranceType::Pointer glcmGenerator=Image2CoOccuranceType::New();
glcmGenerator->SetOffset(offset);
glcmGenerator->SetNumberOfBinsPerAxis(16); //reasonable number of bins
glcmGenerator->SetPixelValueMinMax(0, 255); //for input UCHAR pixel type
Hist2FeaturesType::Pointer featureCalc=Hist2FeaturesType::New();
//Region Of Interest
typedef itk::RegionOfInterestImageFilter<InternalImageType,InternalImageType> roiType;
roiType::Pointer roi=roiType::New();
roi->SetInput(inputImage);
InternalImageType::RegionType window;
InternalImageType::RegionType::SizeType size;
size.Fill(50);
window.SetSize(size);
window.SetIndex(0,0);
window.SetIndex(1,0);
window.SetIndex(2,0);
roi->SetRegionOfInterest(window);
roi->Update();
glcmGenerator->SetInput(roi->GetOutput());
glcmGenerator->Update();
featureCalc->SetInput(glcmGenerator->GetOutput());
featureCalc->Update();
std::cout<<"\n Entropy : ";
std::cout<<featureCalc->GetEntropy()<<"\n Energy";
std::cout<<featureCalc->GetEnergy()<<"\n Correlation";
std::cout<<featureCalc->GetCorrelation()<<"\n Inertia";
std::cout<<featureCalc->GetInertia()<<"\n HaralickCorrelation";
std::cout<<featureCalc->GetHaralickCorrelation()<<"\n InverseDifferenceMoment";
std::cout<<featureCalc->GetInverseDifferenceMoment()<<"\nClusterProminence";
std::cout<<featureCalc->GetClusterProminence()<<"\nClusterShade";
std::cout<<featureCalc->GetClusterShade();
}
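For reference, a hedged usage sketch for a single spatial direction (assuming image is an already-loaded InternalImageType::Pointer, e.g. the output of an itk::ImageFileReader<InternalImageType>):
OffsetType offset;
offset[0] = 1; // one voxel step along x
offset[1] = 0;
offset[2] = 0;
calcTextureFeatureImage(offset, image);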
The program works. However, I have this problem: it gives the same results for different 3D images, even when I change the window size.
Has anyone used ITK to do this? If there is any other method to achieve it, could anyone point me to a solution, please?
Any help will be much appreciated.
I think that your images have only one gray scale level. For example, if you segment your images using the itk-snap tool, when you save the result of the segmentation, itk-snap saves it with one gray scale level. So, if you try to calculate texture features for images segmented with itk-snap, you'll always get the same results even if you change the images or the window size, because there is only one gray scale level in the co-occurrence matrix. Try to run your program on unsegmented images; you'll certainly get different results.
EDIT :
To calculate texture features for segmented images, try another segmentation method which saves the original gray scale levels of the unsegmented image.
Something strange in your code is size.Fill(50), while in the example it holds the window dimensions:
size.Fill(3); // window size = 3x3x3