Make a mosaic image (bitmap format) - c++

I want to make a mosaic photo with a window size chosen by the user. This is just a first draft of the code, but I have problems getting the pixels and calculating the averages, then putting the average value into each pixel of the window and continuing to the end. I also get errors converting between different types. (The other part produces a grayscale image.)
P.S.: Sorry, I am at the very first steps of learning image processing.
'''
void CImageProcessingDoc::OnProcessMosaic()
{
    if (m_pImage) {
        DlgMosaicOption dlg;
        if (dlg.DoModal() == IDOK) {
            DWORD dwWindowSize = dlg.m_dwWindowSize;
            DWORD width = m_pImage->GetWidth();
            DWORD height = m_pImage->GetHeight();
            RGBQUAD color;
            RGBQUAD newcolor;
            float X_step = width / dwWindowSize;
            float Y_step = height / dwWindowSize;
            int avg, pixel;
            for (DWORD y = 0; y < dwWindowSize; y++) {
                for (DWORD x = 0; x < dwWindowSize; x++) {
                    color = m_pImage->GetPixelColor(x, y);
                    (RGBQUAD) pixel = m_pImage->GetPixelColor(x, y);
                    avR += (int)(color.red(pixel);
                    avG += (int)(color.green(pixel);
                    avB += (int)(color.blue(pixel);
                    newcolor.rgbBlue = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    newcolor.rgbGreen = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    newcolor.rgbRed = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    m_pImage->SetPixelColor(x, y, newcolor);
                }
            }
        }
    }
}
'''
Could anyone please help me to understand the problem?

I think you are mixing up spatial, spectral and temporal average here.
Spatial average
This is the operation of computing the average of pixels over an area.
You have to compute eR = 1/N * (P0.R + P1.R + P2.R + P3.R + ...), eG = 1/N * (P0.G + P1.G + ...), eB = 1/N * (P0.B + P1.B + ...)
You'll get a pixel with as many colors as in the input picture, but with limited spatial frequency; a picture computed like this will appear blurred, with no fine detail.
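For the mosaic effect in your question, this spatial average is computed once per window and then written back to every pixel of that window. A minimal, untested sketch of that loop structure, assuming the same GetPixelColor/SetPixelColor API as in your code:

// Sketch only: average each dwWindowSize x dwWindowSize block, then write the
// block average back to every pixel of that block (mosaic / pixelate effect).
for (DWORD by = 0; by < height; by += dwWindowSize) {
    for (DWORD bx = 0; bx < width; bx += dwWindowSize) {
        DWORD endX = (bx + dwWindowSize < width)  ? bx + dwWindowSize : width;
        DWORD endY = (by + dwWindowSize < height) ? by + dwWindowSize : height;
        DWORD sumR = 0, sumG = 0, sumB = 0, count = 0;
        for (DWORD y = by; y < endY; y++) {
            for (DWORD x = bx; x < endX; x++) {
                RGBQUAD c = m_pImage->GetPixelColor(x, y);
                sumR += c.rgbRed; sumG += c.rgbGreen; sumB += c.rgbBlue;
                count++;
            }
        }
        // RGBQUAD field order is rgbBlue, rgbGreen, rgbRed, rgbReserved
        RGBQUAD avg = { (BYTE)(sumB / count), (BYTE)(sumG / count), (BYTE)(sumR / count), 0 };
        for (DWORD y = by; y < endY; y++)
            for (DWORD x = bx; x < endX; x++)
                m_pImage->SetPixelColor(x, y, avg);
    }
}

Note that the sums must be accumulated in something wider than BYTE (here DWORD) before dividing by the pixel count.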
Spectral average
This is the operation of computing the average of the components (spectrum) of each pixel.
You have to compute e = 1/3 * (P0.R + P0.G + P0.B)
You'll get a monochrome picture with the exact same spatial frequencies as the initial picture.
Temporal average
You haven't asked about this, but it is included for reference. The idea is to compute the average of each pixel and each component over N pictures in a temporal sequence.
This gives a kind of motion-blurred picture.
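For reference only, a minimal sketch of that idea, assuming the frames are already loaded as equally sized 8-bit grayscale buffers (the container and types here are illustrative, not taken from your code):

#include <vector>
#include <cstdint>

// Temporal average: accumulate each pixel over N frames, then divide by N.
std::vector<uint8_t> temporalAverage(const std::vector<std::vector<uint8_t>>& frames)
{
    const size_t numPixels = frames.front().size();
    std::vector<uint32_t> acc(numPixels, 0);
    for (const auto& frame : frames)
        for (size_t i = 0; i < numPixels; ++i)
            acc[i] += frame[i];

    std::vector<uint8_t> avg(numPixels);
    for (size_t i = 0; i < numPixels; ++i)
        avg[i] = (uint8_t)(acc[i] / frames.size());
    return avg;
}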
Answer
If I understand your question correctly, you want a spectral average, converting an RGB pixel to the average grey value, taking grey = (R + G + B) / 3.
Thus, your pixel loop should look like this:
for (DWORD y = 0; y < dwWindowSize; y++) {
    for (DWORD x = 0; x < dwWindowSize; x++) {
        color = m_pImage->GetPixelColor(x, y);
        BYTE avg = (BYTE)((color.rgbRed + color.rgbGreen + color.rgbBlue) / 3);
        RGBQUAD grey = { avg, avg, avg, 0 }; // rgbBlue, rgbGreen, rgbRed, rgbReserved
        m_pImage->SetPixelColor(x, y, grey);
    }
}
Please note that converting non-linear RGB (usually sRGB) to luminance using a plain average is a poor formula for RGB-to-grayscale conversion. You should read about RGB to L*a*b* conversion (you are interested in the L* part only) or at least RGB to YUV (you are interested in the Y part only).
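For example, the Y (luma) of the common Rec. 601 RGB-to-YUV conversion weights the channels rather than averaging them equally; as a small standalone helper (not part of your code) it could look like:

// Rec. 601 luma: a weighted sum instead of a plain (R + G + B) / 3 average.
inline BYTE RgbToLuma601(BYTE r, BYTE g, BYTE b)
{
    return (BYTE)(0.299 * r + 0.587 * g + 0.114 * b + 0.5); // + 0.5 rounds to nearest
}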
If your question is about resizing the input picture, then you are not using the appropriate algorithm; what you want is called resampling.

Related

Asus Zenfone AR correlate depth and colour image

Hello, I am using the depth and colour images from Google Tango, so that I can load the image into Meshlab. There is a related question, where the goal is to find the colour of each point in the Tango Point Cloud. However, I would like to go the other way. For each pixel of the colour image, how do I find the corresponding depth?
I have upsampled the depth image and saved the result in the TangoDepthBuffer. I have used the OpenGL readPixels() method to get the colour image and store the RGB values in an array called pixels[]. I then correlate the x, y, z values with the RGB values using the following code:
index_rgb = 0;
index_pixels = 0;
for (int i = 0; i < color_camera_width; i++)
{
    for (int j = 0; j < color_camera_height; j++)
    {
        red[index_rgb]   = pixels[color_camera_width * color_camera_height * 3 - 3 - index_pixels];
        green[index_rgb] = pixels[color_camera_width * color_camera_height * 3 - 2 - index_pixels];
        blue[index_rgb]  = pixels[color_camera_width * color_camera_height * 3 - 1 - index_pixels];
        z[index_rgb] = render_point_cloud_buffer->depths[j * color_camera_width + i];
        x[index_rgb] = (double) (i - color_camera_width/2);
        y[index_rgb] = (double) (j - color_camera_height/2);
        x[index_rgb] = (x[index_rgb] / color_camera_width) * depth_camera_horizontal_fov;
        y[index_rgb] = (y[index_rgb] / color_camera_height) * depth_camera_vertical_fov;
        x[index_rgb] = z[index_rgb] * tan(x[index_rgb]);
        y[index_rgb] = z[index_rgb] * tan(y[index_rgb]);
        index_rgb++;
        index_pixels += 3;
    }
}
I would expect the result to align the depth and colour images. However, when I load the result into Meshlab, the depth pixels are shifted down and to the left of the corresponding colour pixels. The manner in which this shift occurs varies based on the depth. However, I cannot find a depth where there is no shift.
How do you find the transformation required to fix this? Will it work for any depth? Alternatively, how do you find the depth at each specific colour pixel?

Implementing FFT low-pass filter in C with FFTW

I am trying to create a very simple C++ program that, given an argument in the range [0-100], applies a low-pass filter to a grayscale image, "compressing" it proportionally to the value of the given argument.
I am using the FFTW library.
I have some doubts about how I define the frequency threshold, cut. Is there a more effective way to define such a value?
// fftw_complex *fft
// double[] magnitude
// . . .
int percent = 100;
if (percent < 0 || percent > 100) {
    cerr << "Compression rate must be a value between 0 and 100." << endl;
    return -1;
}
double cut = (double)(w * h) * ((double)percent / (double)100);
for (i = 0; i < (w * h); i++) {
    magnitude[i] = sqrt(pow(fft[i][0], 2.0) + pow(fft[i][1], 2.0));
    if (magnitude[i] < cut) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
Update1:
I've changed my code to this, but again I'm not sure this is a proper way to filter frequencies. The image is surely compressed, but non-square images are messed up and setting compression to 100% isn't the real maximum compression available (I can go up to ~140%).
Here you can find an image of what I see now.
int cX = w/2;
int cY = h/2;
cout << "TEST " << ((double)percent/(double)100)*h << endl;
for (i = 0; i < (w*h); i++) {
    int row = i / s;
    int col = i % s;
    int distance = sqrt((col-cX)*(col-cX) + (row-cY)*(row-cY));
    if (distance < ((double)percent/(double)100)*min(cX, cY)) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
This is not a low-pass filter at all. A low-pass filter passes low frequencies, i.e. it removes fine details (blurring). You obviously need a 2D FFT for that.
This code just removes random bits, essentially.
[edit]
The new code looks a lot more like a low-pass filter. The 141% setting is expected: the diagonal of a square is sqrt(2)=1.41 times its side. Converting an index into a row/column pair should use the image width, not some random unexplained s.
I don't know where your zero frequency is located. That should be easy to spot (largest value), but it might be at (0,0) instead of (w/2,h/2).
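For reference, here is a sketch of a radial low-pass mask for the case where the zero frequency is at bin (0, 0), which is the layout a plain fftw_plan_dft_2d transform produces; the index wrapping maps large row/column indices to negative frequencies. This is only a sketch under that assumption, not a drop-in replacement for your loop:

#include <fftw3.h>
#include <cmath>

// Zero every frequency bin whose distance from DC exceeds cutoffRadius.
// Assumes a full w*h complex spectrum in row-major order with DC at (0, 0).
void lowPass(fftw_complex* fft, int w, int h, double cutoffRadius)
{
    for (int row = 0; row < h; ++row) {
        for (int col = 0; col < w; ++col) {
            // Wrap indices so that, e.g., col = w - 1 is treated as frequency -1.
            int fx = (col <= w / 2) ? col : col - w;
            int fy = (row <= h / 2) ? row : row - h;
            double dist = std::sqrt(double(fx) * fx + double(fy) * fy);
            if (dist > cutoffRadius) {   // keep low frequencies, drop high ones
                fft[row * w + col][0] = 0.0;
                fft[row * w + col][1] = 0.0;
            }
        }
    }
}

One possible mapping from the percent argument could be cutoffRadius = (1.0 - percent / 100.0) * 0.5 * std::min(w, h), so that 0% keeps everything and 100% keeps only the DC term.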

C++AMP Computing gradient using texture on a 16 bit image

I am working with depth images retrieved from a Kinect, which are 16 bits. I've had some difficulties making my own filters because of the indexing or the size of the images.
I am working with textures because they allow working with images of any bit depth.
So, I am trying to compute a simple gradient to understand what is wrong, or why it doesn't work as I expected.
You can see that there is something wrong when I use the y direction.
For x: (result screenshot)
For y: (result screenshot)
That's my code:
typedef concurrency::graphics::texture<unsigned int, 2> TextureData;
typedef concurrency::graphics::texture_view<unsigned int, 2> Texture;

cv::Mat image = cv::imread("Depth247.tiff", CV_LOAD_IMAGE_ANYDEPTH);
// just a copy from another image
cv::Mat image2(image.clone());

concurrency::extent<2> imageSize(640, 480);
int bits = 16;
const unsigned int nBytes = imageSize.size() * 2; // 614400

{
    uchar* data = image.data;
    // Result data
    TextureData texDataD(imageSize, bits);
    Texture texR(texDataD);

    parallel_for_each(
        imageSize,
        [=](concurrency::index<2> idx) restrict(amp)
        {
            int x = idx[0];
            int y = idx[1];
            // 65535 is the maximum value a 16-bit pixel can take (2^16 - 1)
            int valX = (x / (float)imageSize[0]) * 65535;
            int valY = (y / (float)imageSize[1]) * 65535;
            texR.set(idx, valX);
        });

    //concurrency::graphics::copy(texR, image2.data, imageSize.size() * (bits / 8u));
    concurrency::graphics::copy_async(texR, image2.data, imageSize.size() * (bits));

    cv::imshow("result", image2);
    cv::waitKey(50);
}
Any help would be very much appreciated.
Your indexes are swapped in two places.
int x = idx[0];
int y = idx[1];
Remember that C++AMP uses row-major indices for arrays, so idx[0] refers to the row (the y axis). This is why the picture you have for "For x" looks like what I would expect for texR.set(idx, valY).
Similarly, the extent of the image also uses swapped values.
int valX = (x / (float)imageSize[0]) * 65535;
int valY = (y / (float)imageSize[1]) * 65535;
Here imageSize[0] has been given the number of columns (640), when in C++AMP's convention extent[0] is the y extent, i.e. the number of rows.
I'm not familiar with OpenCV, but I'm assuming that it also uses a row-major format for cv::Mat. It might invert the y axis, with (0, 0) at the top-left instead of the bottom-left. The Kinect data may do similar things but, again, it's row major.
There may be other places in your code that have the same issue but I think if you double check how you are using index and extent you should be able to fix this.
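To make that concrete, here is a sketch of the corrected extent and kernel, assuming the image is 640 wide by 480 tall and texR is the same writable texture_view as in your code (only the relevant portion is shown):

// Extent is row-major: (number of rows, number of columns) = (height, width).
concurrency::extent<2> imageSize(480, 640);

parallel_for_each(
    imageSize,
    [=](concurrency::index<2> idx) restrict(amp)
    {
        int y = idx[0];  // idx[0] is the row (y axis)
        int x = idx[1];  // idx[1] is the column (x axis)
        int valX = (x / (float)imageSize[1]) * 65535;  // normalize by number of columns
        int valY = (y / (float)imageSize[0]) * 65535;  // normalize by number of rows
        texR.set(idx, valX);  // horizontal ramp along x, as intended
    });

The backing texture (texDataD in your code) would of course need to be created with the same (480, 640) extent.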

Irregular sampling of an image using OpenGL

I'm looking for some pointers on how to sample an image using OpenGL at a list of specified locations. Any links to tutorials or examples similar to the problem below?
At the moment we have code that calculates the 'output intensity' at a list of specified locations x1,y1, x2,y2, ..., xn,yn by applying a Lanczos2 filter to an input image. The number of locations is currently 20 (which is actually the list of phosphene locations in a visual prosthesis), but it will eventually increase up to 256, and GPU processing will certainly accelerate things. The list of locations can't be hardcoded.
So far I have seen how to implement a median filter and the like, but in my case there is no need to compute the convolution with the filter kernel at every image pixel, just at the locations specified.
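To make the per-location computation concrete, this is roughly what gets evaluated at each requested (x, y) on the CPU right now (a simplified sketch, not the actual code; getPixel stands in for however the input image is read):

#include <cmath>

double getPixel(int x, int y); // hypothetical accessor into the input image

// Lanczos2 kernel: sinc(x) * sinc(x / 2) for |x| < 2, zero otherwise.
static double lanczos2(double x)
{
    const double PI = 3.14159265358979323846;
    if (x == 0.0) return 1.0;
    if (std::fabs(x) >= 2.0) return 0.0;
    double px = PI * x;
    return (std::sin(px) / px) * (std::sin(px / 2.0) / (px / 2.0));
}

// Output intensity at one (possibly fractional) location (sx, sy).
double sampleAt(double sx, double sy)
{
    double sum = 0.0, weightSum = 0.0;
    int x0 = (int)std::floor(sx), y0 = (int)std::floor(sy);
    for (int iy = y0 - 1; iy <= y0 + 2; ++iy) {
        for (int ix = x0 - 1; ix <= x0 + 2; ++ix) {
            double w = lanczos2(sx - ix) * lanczos2(sy - iy);
            sum += w * getPixel(ix, iy);
            weightSum += w;
        }
    }
    return sum / weightSum;
}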
Handle those values (intensity) in a second texture that has a sample-or-not bit.
If you use OpenGL, you'll be able to define the ROI (region of interest), the portion of an image to which you want to apply edits or processing, as you describe.
If you go that route, this is how you calculate the median in a pixel neighborhood radius of your choosing using OpenGL ES 2.0/3.0:
kernel vec4 medianUnsharpKernel(sampler u) {
    vec4 pixel = unpremultiply(sample(u, samplerCoord(u)));
    vec2 xy = destCoord();
    int radius = 3;
    int bounds = (radius - 1) / 2;
    vec4 sum = vec4(0.0);
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            sum += unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
        }
    }
    vec4 mean = vec4(sum / vec4(pow(float(radius), 2.0)));
    float mean_avg = float(mean);
    float comp_avg = 0.0;
    vec4 comp = vec4(0.0);
    vec4 median = mean;
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            comp = unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
            comp_avg = float(comp);
            median = (comp_avg < mean_avg) ? max(median, comp) : median;
        }
    }
    return premultiply(vec4(vec3(abs(pixel.rgb - median.rgb)), 1.0));
}
A brief description of the steps
1. Calculate the mean of the values of the pixels surrounding the source pixel in a 3x3 neighborhood;
2. Find the maximum pixel value of all pixels in the same neighborhood that are less than the mean.
3. [OPTIONAL] Subtract the median pixel value from the source pixel value for edge detection.
If you're using the median value for edge detection, there are a couple of ways to modify the above code for better results, namely hybrid median filtering and truncated median filtering (a substitute for, and better approximation of, 'mode' filtering). If you're interested, please ask.

How to map optical flow field (float) to pixel data (char) for image warping?

I've been playing with the optical flow functions in OpenCV and am stuck. I've successfully generated X and Y optical flow fields/maps using the Farneback method, but I don't know how to apply this to the input image coordinates to warp the images. The resulting X and Y fields are of 32bit float type (0-1.0), but how does this translate to the coordinates of the input and output images? For example, 1.0 of what? The width of the image? The difference between the two?
Plus, I'm not sure what my loop would look like to apply the transform/warp. I've done plenty of loops to change color, but the pixels always remain in the same location. Moving pixels around is new territory for me!
Update: I got this to work, but the resulting image is messy:
// make a float copy of 8 bit grayscale source image
IplImage *src_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);
cvConvertScale(input_img, src_img, 1/255.0); // convert 8 bit to float

// create destination image
IplImage *dst_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);

for (y = 0; y < flow->height; y++) {
    // grab flow maps for X and Y
    float* vx = (float*)(velx->imageData + velx->widthStep*y);
    float* vy = (float*)(vely->imageData + vely->widthStep*y);
    // coords for source and dest image
    const float *srcpx = (const float*)(src_img->imageData + (src_img->widthStep*y));
    float *dstpx = (float*)(dst_img->imageData + (dst_img->widthStep*y));
    for (x = 0; x < flow->width; x++)
    {
        int newx = x + (vx[x]);
        int newy = (int)(vy[x]) * flow->width;
        dstpx[newx + newy] = srcpx[x];
    }
}
I could not get this to work. The output was just garbled noise:
cvRemap(src_img,dst_img,velx,vely,CV_INTER_CUBIC,cvScalarAll(0));
The flow vectors are velocity values. If the pixel in image 1 at position (x, y) has the flow vector (vx, vy), it is estimated to be at position (x+vx, y+vy) (so the values aren't really in the [0, 1] range; they can be bigger, and negative too). The easiest way to do the warping is to create floating-point map images with those values (x+vx for the x direction, similarly for y), and then use cv::remap.
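In C++ terms that amounts to something like the sketch below, assuming the flow is a dense CV_32FC2 cv::Mat with one (vx, vy) vector per pixel; whether the flow has to be negated first depends on which direction it was estimated in (the Python sample further down negates it):

#include <opencv2/opencv.hpp>

// Minimal sketch: warp `img` by a dense flow field using cv::remap.
cv::Mat warpByFlow(const cv::Mat& img, const cv::Mat& flow)
{
    cv::Mat mapX(flow.size(), CV_32FC1);
    cv::Mat mapY(flow.size(), CV_32FC1);
    for (int y = 0; y < flow.rows; ++y) {
        for (int x = 0; x < flow.cols; ++x) {
            const cv::Point2f& f = flow.at<cv::Point2f>(y, x);
            mapX.at<float>(y, x) = x + f.x;   // sample the source at x + vx
            mapY.at<float>(y, x) = y + f.y;   // sample the source at y + vy
        }
    }
    cv::Mat dst;
    cv::remap(img, dst, mapX, mapY, cv::INTER_LINEAR);
    return dst;
}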
Using OpenCV
https://github.com/opencv/opencv/blob/master/samples/python/opt_flow.py
def warp_flow(img, flow):
    h, w = flow.shape[:2]
    flow = -flow
    flow[:,:,0] += np.arange(w)
    flow[:,:,1] += np.arange(h)[:,np.newaxis]
    res = cv2.remap(img, flow, None, cv2.INTER_LINEAR)
    return res