How can I turn a three channel Mat into a summed up one channel Mat? - c++

I want to add up all channels of a Mat image to a Mat image with only one sum-channel. I've tried it this way:
// sum up the channels of the image:

// 1. store initial number of rows/columns
int initialRows = frameVid1.rows;
int initialCols = frameVid1.cols;

// 2. check if matrix is continuous
if (!frameVid1.isContinuous())
{
    frameVid1 = frameVid1.clone();
}

// 3. reshape matrix to 3 color vectors
frameVid1 = frameVid1.reshape(3, initialRows * initialCols);

// 4. convert matrix to store bigger values than 255
frameVid1.convertTo(frameVid1, CV_32F);

// 5. sum up the three color vectors
reduce(frameVid1, frameVid1, 1, CV_REDUCE_SUM);

// 6. reshape to initial size
frameVid1 = frameVid1.reshape(1, initialRows);

// 7. convert back to CV_8UC1
frameVid1.convertTo(frameVid1, CV_8U);
But somehow reduce does not treat the color channels as a matrix dimension. Is there another function that can sum them up?
Also, why does using CV_16U in step 4 not work? (I had to use CV_32F there.)
Thanks in advance!

You can sum the RGB channels with a single line:
cv::transform(frameVid1, frameVidSum, cv::Matx13f(1, 1, 1));
You may need one more line: before applying the transform, you should convert the image to an appropriate type to avoid saturation (I assumed CV_32FC3). The output array has the same size and depth as the source.
Some explanation:
cv::transform operates on the per-pixel channel values.
With the third argument cv::Matx13f(a, b, c), for each pixel [u,v] it computes the following:
frameVidSum[u,v] = frameVid1[u,v].B * a + frameVid1[u,v].G * b + frameVid1[u,v].R * c
Using cv::Matx13f(1, 0, 1) as the third argument sums only the blue and red channels.
cv::transform is flexible enough that you can even use cv::Matx14f, in which case the fourth value is added as an offset to each pixel of frameVidSum.
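For completeness, a minimal sketch of how the conversion and the transform fit together (assuming frameVid1 is the CV_8UC3 frame from the question):
cv::Mat frameVid1F, frameVidSum;
frameVid1.convertTo(frameVid1F, CV_32FC3);                     // avoid saturation at 255
cv::transform(frameVid1F, frameVidSum, cv::Matx13f(1, 1, 1));  // per-pixel B + G + R, result is CV_32FC1
// Optionally scale back to 8 bit, e.g. dividing by 3 to stay in range:
frameVidSum.convertTo(frameVidSum, CV_8U, 1.0 / 3.0);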

In an interleaved image, every third element belongs to the same colour channel. It should work if you grab each group of three elements (R, G and B), sum them up, and store the result in another 1-channel matrix. Before storing you should use saturate_cast to avoid unexpected results. So I think the better way is to use saturate_cast instead of adapting your matrix; a sketch of this per-pixel approach follows.
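A rough sketch of that idea (this is not the question's original code; frameVid1 is assumed to be CV_8UC3):
cv::Mat sum(frameVid1.rows, frameVid1.cols, CV_8UC1);
for (int y = 0; y < frameVid1.rows; ++y)
{
    for (int x = 0; x < frameVid1.cols; ++x)
    {
        const cv::Vec3b& px = frameVid1.at<cv::Vec3b>(y, x);
        // saturate_cast clamps sums above 255 instead of letting them wrap around
        sum.at<uchar>(y, x) = cv::saturate_cast<uchar>(px[0] + px[1] + px[2]);
    }
}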

Have a look at the cv::split() and cv::add() functions.
You can use the split function to split the image into separate channels and then the add function to add them up. But be careful when using add, because adding may lead to saturation of values; you may have to convert the types first and then add. Have a look here: http://answers.opencv.org/question/13769/adding-matrices-without-saturation/
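A minimal sketch of that approach, assuming frameVid1 is CV_8UC3 (the dtype argument of cv::add avoids the 8-bit saturation):
std::vector<cv::Mat> channels;
cv::split(frameVid1, channels);                                 // channels[0..2] are CV_8UC1
cv::Mat sum;
cv::add(channels[0], channels[1], sum, cv::noArray(), CV_32S);  // widen while adding
cv::add(sum, channels[2], sum, cv::noArray(), CV_32S);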

Related

How can I take the average of 100 images using OpenCV?

I have 100 images, each one 598 * 598 pixels, and I want to remove the noise by taking the average of the pixels. But if I add them up pixel by pixel and then divide, I would have to write a loop with 598*598 iterations for one image, and 598*598*100 for the hundred images.
Is there a method to help me with this operation?
You need to loop over each image and accumulate the results. Since this is likely to cause overflow, you can convert each image to CV_64FC3 and accumulate on a CV_64FC3 image. You can also use CV_32FC3 or CV_32SC3 for this, i.e. float or integer instead of double.
Once you have accumulated all values, you can use convertTo to both:
make the image a CV_8UC3
divide each value by the number of images, to get the actual mean.
This is a sample code that creates 100 random images, and computes and shows the mean:
#include <opencv2/opencv.hpp>
#include <vector>

using namespace cv;
using namespace std;

Mat3b getMean(const vector<Mat3b>& images)
{
    if (images.empty()) return Mat3b();

    // Create a zero-initialized image to use as the accumulator
    Mat m(images[0].rows, images[0].cols, CV_64FC3);
    m.setTo(Scalar(0, 0, 0, 0));

    // Use a temp image to hold the conversion of each input image to CV_64FC3.
    // It is allocated only the first time, since all your images have the same size.
    Mat temp;
    for (size_t i = 0; i < images.size(); ++i)
    {
        // Convert the input image to CV_64FC3 ...
        images[i].convertTo(temp, CV_64FC3);

        // ... so you can accumulate
        m += temp;
    }

    // Convert back to CV_8UC3, applying the division to get the actual mean
    m.convertTo(m, CV_8U, 1. / images.size());
    return m;
}

int main()
{
    // Create a vector of 100 random images
    vector<Mat3b> images;
    for (int i = 0; i < 100; ++i)
    {
        Mat3b img(598, 598);
        randu(img, Scalar(0), Scalar(256));
        images.push_back(img);
    }

    // Compute the mean
    Mat3b meanImage = getMean(images);

    // Show the result
    imshow("Mean image", meanImage);
    waitKey();

    return 0;
}
Suppose that the images will not need to undergo transformations (gamma, color space, or alignment). The numpy package lets you do this quickly and succinctly.
import numpy as np

# List of images; all must be the same size and data type.
images = [img0, img1, ...]
avg_img = np.mean(images, axis=0)
This will auto-promote the elements to float. If you want the result as BGR888, then:
avg_img = avg_img.astype(np.uint8)
You could also use uint16 for 16 bits per channel. If you are dealing with 8 bits per channel, you almost certainly won't need 100 images.
First, convert the images to floats. You have N = 100 images. Think of a single image as an array of average pixel values over 1 image; you need to calculate an array of average pixel values over N images.
Let A be an array of average pixel values over X images and B an array of average pixel values over Y images. Then C = (A * X + B * Y) / (X + Y) is an array of average pixel values over X + Y images. To get better accuracy in the floating-point operations, X and Y should be approximately equal.
You can merge all your images like subarrays in merge sort, where the merge operation is C = (A * X + B * Y) / (X + Y), with A and B the arrays of average pixel values over X and Y images. A sketch of this pairwise scheme follows.
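A rough sketch of that merge-sort style averaging, assuming the images have already been converted to CV_32FC3 (the function and variable names here are made up for illustration):
cv::Mat pairwiseMean(const std::vector<cv::Mat>& imgs, size_t lo, size_t hi) // range [lo, hi)
{
    if (hi - lo == 1) return imgs[lo];
    size_t mid = lo + (hi - lo) / 2;
    cv::Mat A = pairwiseMean(imgs, lo, mid);   // mean of X = mid - lo images
    cv::Mat B = pairwiseMean(imgs, mid, hi);   // mean of Y = hi - mid images
    double X = static_cast<double>(mid - lo), Y = static_cast<double>(hi - mid);
    return (A * X + B * Y) / (X + Y);          // C = (A*X + B*Y) / (X+Y)
}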

Coordinates of a pixel between two images

I am looking for a solution to easily compute the pixel coordinate from two images.
Question: given the following code, how could I compute the coordinates of the pixels that changed from the "QVector difference"? Is it possible to take an (x,y) coordinate and find which pixel it represents in currentImage?
char *previousImage;
char *currentImage;
QVector<long> difference;

for (int i = 0; i < CurrentImageSize; i++)
{
    // Check whether the pixels differ (we could also do this with RGB values, this is just for the example)
    if (previousImagePixel != currentImagePixel)
    {
        difference.push_back(currentImage - previousImage);
    }
    currentImage++;
}
EDIT:
More information about this topic:
The image is in RGB format
The width, the height and the bpp of both images are known
I have a pointer to the bytes representing the image
The main objective here is to know the new value of a pixel that changed between the two images, and which pixel it is (its coordinates).
There is not enough information to answer, but I will try to give you some idea.
You have declared char *previousImage;, which implies to me that you have a pointer to the bytes representing an image. You need more than that to interpret the image.
You need to know the pixel format. You mention RGB, so for the time being let's assume that the image uses 3 bytes per pixel, in RGB order.
You need to know the width of the image.
Given the above two, you can calculate the "row stride", which is the number of bytes that a row takes up. This is usually "bytes per pixel" * "image width", but it is typically padded out to be divisible by 4. So 3 bpp and a width of 15 would be 45 bytes, plus 3 bytes of padding to make the row stride 48.
Given that, if you have an index into the image data, you integer-divide it by the row stride to get the row (Y coordinate).
The X coordinate is the remainder (index mod row stride) integer-divided by the bytes per pixel. The sketch below puts these two steps together.
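A minimal sketch of that index -> (x, y) mapping (width, bytesPerPixel and the pad-to-4 rule are assumptions about your format):
int bytesPerPixel = 3;                                  // RGB, 1 byte per channel
int rowStride = ((width * bytesPerPixel + 3) / 4) * 4;  // pad each row up to a multiple of 4
int y = byteIndex / rowStride;
int x = (byteIndex % rowStride) / bytesPerPixel;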
From what I understand, you want to compute the displacement or motion that occurred between two images, e.g. for each pixel I(x, y, t=previous) in previousImage, you want to know where it went in currentImage, i.e. its new coordinate I(x, y, t=current).
If that is the case, this is called motion estimation, or measuring the optical flow. There are many algorithms for it, relying on more or less complex hypotheses depending on the objects you observe in the image sequence.
The simplest hypothesis is that if you follow a moving pixel I(x, y, t) in the scene you observe, its luminance remains constant over time. In other words, dI(x, y, t) / dt = 0.
Since I(x, y, t) is a function of three parameters (space and time) with two unknowns, and there is only one equation, this is an ill-posed problem with no easy solution. Many of the algorithms add an additional hypothesis so that the problem has a unique solution.
You can use existing libraries which will do that for you; one of the most popular is OpenCV.
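For example, a minimal sketch with OpenCV's dense optical flow (prevBgr/currBgr are assumed input frames, and the parameter values are just the common defaults from the documentation):
cv::Mat prevGray, currGray, flow;
cv::cvtColor(prevBgr, prevGray, cv::COLOR_BGR2GRAY);
cv::cvtColor(currBgr, currGray, cv::COLOR_BGR2GRAY);
cv::calcOpticalFlowFarneback(prevGray, currGray, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
// flow is CV_32FC2: flow.at<cv::Point2f>(y, x) is the displacement of pixel (x, y)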

Compare intensity pixel value Vec3b in OpenCV

I have a 3-channel Mat image of type CV_8UC3.
I want to compare, in a loop, the intensity value of a pixel with that of its neighbours, and then set 0 or 1 depending on whether the neighbour is greater or not.
I can get the intensity by calling Img.at<Vec3b>(x, y).
But my question is: how can I compare two Vec3b?
Should I compare the pixel values for every channel (BGR, i.e. Vec3b[0], Vec3b[1] and Vec3b[2]) and then merge the three channel results into a single Mat object?
Me again :)
If you want to compare (greater or less) two RGB values you need to project the 3-dimensional RGB space onto a plane or axis.
Of course, there are many possibilities to do this, but an easy way would be to use the HSV color space. The hue (H), however, is not appropriate as a linear order function because it is circular (i.e. the value 1.0 is identical to 0.0, so you cannot decide whether 0.5 > 0.0 or 0.5 < 0.0). The saturation (S) or the value (V), however, are appropriate projection functions for your purpose:
If you want to have colored pixels "larger" than monochrome pixels, you will prefer S.
If you want to have lighter pixels larger than darker pixels, you will probably prefer V.
Any combination of S and V would also be a valid projection function, e.g. S + V. A sketch using V is shown below.
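A minimal sketch of the projection onto V, assuming img is the CV_8UC3 (BGR) image and (x1,y1), (x2,y2) are the two pixels to compare:
cv::Mat hsv;
cv::cvtColor(img, hsv, cv::COLOR_BGR2HSV);
cv::Vec3b p1 = hsv.at<cv::Vec3b>(y1, x1);
cv::Vec3b p2 = hsv.at<cv::Vec3b>(y2, x2);
bool firstIsLarger = p1[2] > p2[2];  // channel 2 of HSV is V (value/brightness)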
As far as I understand, you want a measure of the distance/similarity between two Vec3b pixels. This maps to the general problem of finding the distance between two vectors in an n-dimensional space.
One of the most common measures (and I think this is what you're asking for) is the Euclidean distance.
If you are using OpenCV then you can simply use:
cv::Vec3b a(1, 1, 1);
cv::Vec3b b(5, 5, 5);
double dist = cv::norm(a, b, CV_L2);
You can refer to this for reading about cv::norm and its options.
Edit: If you are doing this to measure color similarity, it's recommended to use the Lab color space, as the Euclidean distance in Lab space is a good approximation of human color perception. A rough sketch of that comparison is below.
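The sketch converts single pixels through 1x1 Mats (the pixel values here are only placeholders):
cv::Mat3b p1(1, 1, cv::Vec3b(20, 40, 200)), p2(1, 1, cv::Vec3b(30, 50, 180));  // BGR pixels
cv::Mat3b lab1, lab2;
cv::cvtColor(p1, lab1, cv::COLOR_BGR2Lab);
cv::cvtColor(p2, lab2, cv::COLOR_BGR2Lab);
double labDist = cv::norm(lab1, lab2, cv::NORM_L2);  // Euclidean distance in Lab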
Edit 2: I see what you mean; for this you can get the magnitude of each vector and then compare them, something like this:
double a_magnitude = cv::norm(a, CV_L2);
double b_magnitude = cv::norm(b, CV_L2);
if (a_magnitude > b_magnitude)
{
    // do something
}
else
{
    // do something else
}

Image Gaussian convolution in the Fourier domain: works, while it shouldn't

The problem is that I can't fully understand the principles of convolution in the frequency domain.
I have an image of size 256x256, which I want to convolve with a 3x3 Gaussian kernel. Its coefficients are (1/16, 1/8, 1/4):
PlainImage<float> FourierRunner::getGaussMask(int sz)
{
    PlainImage<float> G(3, 3);
    *G.at(0, 0) = 1.0/16; *G.at(0, 1) = 1.0/8; *G.at(0, 2) = 1.0/16;
    *G.at(1, 0) = 1.0/8;  *G.at(1, 1) = 1.0/4; *G.at(1, 2) = 1.0/8;
    *G.at(2, 0) = 1.0/16; *G.at(2, 1) = 1.0/8; *G.at(2, 2) = 1.0/16;
    return G;
}
To get the FFT of both the image and the filter kernel, I zero-pad them. sz_common stands for the extended size. The image and the kernel are moved to the center of the h and g ComplexImages respectively, so they are zero-padded on the right, left, bottom and top.
I've read that the size should be sz_common >= sz + gsz - 1 because of the circular convolution property: the filter can change undesired image values at the boundaries.
But it doesn't work: I only get adequate results when sz_common = sz; when sz_common = sz + gsz - 1 or sz_common = 2*sz, after the IFFT I get a 2-3 times smaller convolved image! Why?
Also, I'm confused that the filter matrix values have to be multiplied by 256, like the pixel values: other questions on SO contain Matlab code without such normalization. As in the previous case, without that multiplication it works badly: I get a black image. Why?
// fft_in is the shifted Fourier image with center at [sz/2; sz/2]
void FourierRunner::convolveImage(ComplexImage& fft_in)
{
    int sz = 256; // equal to fft_in.width()

    // Get the original complex image (backward fft_in)
    ComplexImage original_complex = fft_in;
    fft2d_backward(fft_in, original_complex);

    int gsz = 3;
    PlainImage<float> filter = getGaussMask(gsz);
    ComplexImage filter_complex = ComplexImage::fromFloat(filter);

    int sz_common = pow2ceil(sz); // should be sz+gsz-1 ???
    ComplexImage h = ComplexImage::zeros(sz_common, sz_common);
    ComplexImage g = ComplexImage::zeros(sz_common, sz_common);
    copyImageToCenter(h, original_complex);
    copyImageToCenter(g, filter_complex);

    LOOP_2D(sz_common, sz_common) g.setPoint(x, y, g.at(x, y) * 256);

    fft2d_forward(g, g);
    fft2d_forward(h, h);
    fft2d_fft_shift(g);

    // CONVOLVE
    LOOP_2D(sz_common, sz_common) h.setPoint(x, y, h.at(x, y) * g.at(x, y));

    copyImageToCenter(fft_in, h);
    fft2d_backward(fft_in, fft_in);
    fft2d_fft_shift(fft_in);

    // TEST DIFFERENCE BETWEEN DOMAINS
    PlainImage<float> frequency_res(sz, sz);
    writeComplexToPlainImage(fft_in, frequency_res);
    fft2d_forward(fft_in, fft_in);
}
I tried to zero-pad the image at the right and bottom, so that the smaller image is copied to the start of the bigger one, but that doesn't work either.
I wrote the convolution in the spatial domain to compare results; the frequency-domain blur results are almost the same as in the spatial domain (average error between pixels is 5), but only when sz_common = sz.
So, could you explain the phenomena of zero-padding and normalization for this case? Thanks in advance.
Convolution in the spatial domain is equivalent to multiplication in the Fourier domain.
This is true for continuous functions, which are defined everywhere.
In practice, however, we have discrete signals and convolution kernels, which require more careful handling.
If you have an image of size M x N and a kernel of size MM x NN, and you apply the DFT (the FFT is an efficient way to calculate the DFT) to them, you get functions of size M x N and MM x NN respectively.
Moreover, the multiplication-equivalence theorem above requires multiplying the same frequencies with each other.
Since in practice the kernel is much smaller than the image, it is usually zero-padded to the size of the image.
Now, by applying the DFT you get two matrices of the same M x N size and can multiply them.
Yet this is equivalent to the circular convolution between the image and the kernel.
To get the linear convolution, you should make them both of size (M + MM - 1) x (N + NN - 1).
Usually this is done by applying a "replicate" boundary condition on the image and zero-padding the kernel. A sketch of this padding with OpenCV primitives is below.
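A rough sketch of linear convolution via the DFT, with the sizes following the rule above (img and kernel are assumed to be single-channel CV_32F Mats):
int M = img.rows, N = img.cols, MM = kernel.rows, NN = kernel.cols;
cv::Mat imgPad, kerPad, IMG, KER, result;
cv::copyMakeBorder(img, imgPad, 0, MM - 1, 0, NN - 1, cv::BORDER_REPLICATE);
cv::copyMakeBorder(kernel, kerPad, 0, M - 1, 0, N - 1, cv::BORDER_CONSTANT, cv::Scalar::all(0));
cv::dft(imgPad, IMG, cv::DFT_COMPLEX_OUTPUT);
cv::dft(kerPad, KER, cv::DFT_COMPLEX_OUTPUT);
cv::mulSpectrums(IMG, KER, result, 0);                       // element-wise multiply in the frequency domain
cv::dft(result, result, cv::DFT_INVERSE | cv::DFT_SCALE | cv::DFT_REAL_OUTPUT);
// result is (M + MM - 1) x (N + NN - 1); crop it back to the region you need.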
Enjoy...

matrix multiplication resulting in values greater than 255

If I am performing matrix multiplication on two 8UC1 images, or per-element multiplication, what happens if one of the resulting pixel values is greater than 255? For example, if in image A a certain pixel has value 100, and in image B that same pixel has value 150 (for the per-element multiplication case), then clearly 100*150 > 255 - so does that pixel simply get truncated to a value of 255? And if so, is there some transformation I can make to preserve that information without having it truncated?
OpenCV will saturate the result for a uchar image.
To avoid that, use e.g. the dtype flag in multiply and specify a type larger than your input:
Mat a, b; // input, CV_8U
Mat c;    // output, yet unspecified
multiply(a, b, c, 1, CV_32S); // c will be of int type, untruncated results
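If you later need an 8-bit image again, one possible follow-up (a sketch, assuming you want the product rescaled rather than clipped) is:
// Rescale the untruncated CV_32S product back into the 8-bit range.
// The factor 1.0/255 maps the maximum possible product 255*255 back to 255; adjust to your needs.
Mat c8;
c.convertTo(c8, CV_8U, 1.0 / 255.0);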