Asus Zenfone AR: correlate depth and colour images - C++

Hello, I am using the depth and colour images from Google Tango, so that I can load the image into Meshlab. There is a related question, where the goal is to find the colour of each point in the Tango Point Cloud. However, I would like to go the other way. For each pixel of the colour image, how do I find the corresponding depth?
I have upsampled the depth image and saved the result in the TangoDepthBuffer. I have used the OpenGL readPixels() method to get the colour image and store the RGB values in an array called pixels[]. I then correlate the x, y, z values with the RGB values using the following code:
index_rgb = 0;
index_pixels = 0;
for (int i = 0; i < color_camera_width; i++)
{
    for (int j = 0; j < color_camera_height; j++)
    {
        red[index_rgb]   = pixels[color_camera_width * color_camera_height * 3 - 3 - index_pixels];
        green[index_rgb] = pixels[color_camera_width * color_camera_height * 3 - 2 - index_pixels];
        blue[index_rgb]  = pixels[color_camera_width * color_camera_height * 3 - 1 - index_pixels];
        z[index_rgb] = render_point_cloud_buffer->depths[j * color_camera_width + i];
        x[index_rgb] = (double) (i - color_camera_width / 2);
        y[index_rgb] = (double) (j - color_camera_height / 2);
        x[index_rgb] = (x[index_rgb] / color_camera_width) * depth_camera_horizontal_fov;
        y[index_rgb] = (y[index_rgb] / color_camera_height) * depth_camera_vertical_fov;
        x[index_rgb] = z[index_rgb] * tan(x[index_rgb]);
        y[index_rgb] = z[index_rgb] * tan(y[index_rgb]);
        index_rgb++;
        index_pixels += 3;
    }
}
I would expect the result to align the depth and colour images. However, when I load the result into Meshlab, the depth pixels are shifted down and to the left of the corresponding colour pixels. The size of the shift varies with depth, but I cannot find a depth at which there is no shift.
How do you find the transformation required to fix this? Will it work for any depth? Alternatively, how do you find the depth at each specific colour pixel?
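For reference, here is a minimal sketch of the standard pinhole back-projection that the FOV/tan code above approximates; fx, fy, cx and cy are camera intrinsics in pixels, and the values below are placeholders rather than real Zenfone AR numbers:

// Sketch only: pinhole back-projection of pixel (i, j) with depth z to a
// metric 3D point. The intrinsics here are placeholder values.
const double fx = 1040.0, fy = 1040.0;       // focal lengths in pixels (placeholder)
const double cx = color_camera_width * 0.5;  // principal point (placeholder)
const double cy = color_camera_height * 0.5;

double z3d = render_point_cloud_buffer->depths[j * color_camera_width + i];
double x3d = z3d * ((double)i - cx) / fx;
double y3d = z3d * ((double)j - cy) / fy;

Note that the depth and colour cameras are physically separate sensors, so exact alignment also needs the extrinsic transform between them; the sketch above ignores that.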

Related

Why does my bitmap image have a color overlay after converting from 32-bit to 8-bit?

I'm working on resizing a bitmap image and converting it to 8-bit (grayscale). The problem is that when I convert a 32-bit image to 8-bit, the result has a color overlay, while the same code works perfectly on 24-bit. I guess the cause is the alpha channel, but I don't know where exactly the problem is.
This is my code to generate the 8-bit palette colors and write them after the DIB part:
char* palette = new char[1024];
for (int i = 0; i < 256; i++) {
    palette[i * 4] = palette[i * 4 + 1] = palette[i * 4 + 2] = (char)i;
    palette[i * 4 + 3] = 255;
}
fout.write(palette, 1024);
delete[] palette;
As I said, my code works perfectly on 24-bit. For 32-bit the color is still kept after resizing, but when converting to 8-bit it looks like this:
expected image (when converted from 24-bit)
unexpected image (when converted from 32-bit)
This is how I get the colors and save them to srcPixel[]:
int i = 0;
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        int index = getIndex(width, x, y);
        srcPixel[index].A = srcBMP.pImageData[i];
        i += alpha;
        srcPixel[index].B = srcBMP.pImageData[i++];
        srcPixel[index].G = srcBMP.pImageData[i++];
        srcPixel[index].R = srcBMP.pImageData[i++];
    }
    i += padding;
}
And this is how I convert it, by taking the average of the four channels A, B, G and R from srcPixel[]:
int i = 0;
for (int y = 0; y < dstHeight; y++) {
    for (int x = 0; x < dstWidth; x++) {
        int index = getIndex(dstWidth, x, y);
        dstBMP.pImageData[i++] = (srcPixel[index].A + srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 4;
    }
    i += dstPadding;
}
If I remove and skip all the alpha bytes in my code, the converted image still looks like that, and I get another problem: when resizing, the image gets a color overlay similar to the 8-bit conversion problem (resizing without the alpha channel).
If I skip the alpha channel while taking the average (changing the line to dstBMP.pImageData[i++] = (srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 3;), there is almost no difference; the overlay still exists.
If I remove palette[i * 4 + 3] = 255; or change it in any way, the result is not affected.
Thank you very much.
You add the alpha channel to the color, and that's why it becomes brighter. From here I found that opaque is 255 and transparent is 0, so you are adding another channel that is set to 'white' to your result.
Remove the alpha channel from your equation and see if I'm right.
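To make the arithmetic concrete: for a fully opaque pixel (A = 255), the question's formula (A + B + G + R) / 4 evaluates to roughly 64 plus three quarters of the plain grey value (B + G + R) / 3, so every output pixel is pushed toward white. A minimal sketch of the alpha-free average, reusing the names from the question:

// Sketch only: grey value from the three color channels, ignoring alpha.
dstBMP.pImageData[i++] =
    (srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 3;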

Generate high-quality textures in realtime (C++)

I have a procedural terrain generation application, and now I want to generate textures for the terrain based on height.
Say I have 5 textures for different height levels. For every pixel I calculate its position on the mesh, get its height, and then decide which texture to sample from.
Note: the textures are always square.
In code it will be something like:
for (int i = 0; i < resolution; i++) {
    for (int j = 0; j < resolution; j++) {
        tex[i * resolution * 3 + j * 3 + 0] = SampleTextureR(i, j);
        tex[i * resolution * 3 + j * 3 + 1] = SampleTextureG(i, j);
        tex[i * resolution * 3 + j * 3 + 2] = SampleTextureB(i, j);
    }
}
Now SampleTextureR(i, j) is just like:
for (TextureData* t : txtures) {
    // the texture applies when the mesh height lies between its min and max
    if (t->heightl <= GetMeshElevation(i, j) && t->heightg >= GetMeshElevation(i, j))
        return t->sampleR(i, j);
}
return 0;
GetMeshElevation returns the height of the mesh at a point. t->sampleR() returns the unsigned char value of the texture's red channel at (i, j).
heightl is the minimum height of the texture.
heightg is the maximum height of the texture.
The problem is that this is a very slow method. How can I make it fast enough to run in realtime, so that changes to heightl or heightg are reflected immediately? heightl and heightg are per texture.
The textures can be up to 4K (4096x4096).
Use a varying variable between your vertex and fragment shader. A single float value should suffice, since you're only interested in the height coordinate.
Other than that, introduce 5 uniform variables for your textures in the fragment shader and do the calculations on the GPU.
In more detail:
For each fragment, the fragment shader receives the interpolated height value of the mesh. Depending on that height, you simply select the sample from the desired texture and output that color.
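For illustration, here is a minimal GLSL fragment shader along these lines, embedded as a C++ string constant; all uniform and varying names (vUV, vHeight, uTex0..uTex4, uMin0..uMax4) are made up for the sketch, with the per-layer ranges standing in for heightl/heightg:

// Sketch only: height-based texture selection done per fragment on the GPU.
static const char* kHeightSelectFrag = R"(
#version 330 core
in vec2 vUV;        // interpolated texture coordinates
in float vHeight;   // interpolated mesh height from the vertex shader
out vec4 fragColor;

// One sampler and height range per terrain layer (heightl/heightg).
uniform sampler2D uTex0; uniform float uMin0; uniform float uMax0;
uniform sampler2D uTex1; uniform float uMin1; uniform float uMax1;
uniform sampler2D uTex2; uniform float uMin2; uniform float uMax2;
uniform sampler2D uTex3; uniform float uMin3; uniform float uMax3;
uniform sampler2D uTex4; uniform float uMin4; uniform float uMax4;

void main() {
    if      (vHeight >= uMin0 && vHeight <= uMax0) fragColor = texture(uTex0, vUV);
    else if (vHeight >= uMin1 && vHeight <= uMax1) fragColor = texture(uTex1, vUV);
    else if (vHeight >= uMin2 && vHeight <= uMax2) fragColor = texture(uTex2, vUV);
    else if (vHeight >= uMin3 && vHeight <= uMax3) fragColor = texture(uTex3, vUV);
    else if (vHeight >= uMin4 && vHeight <= uMax4) fragColor = texture(uTex4, vUV);
    else fragColor = vec4(0.0, 0.0, 0.0, 1.0);
}
)";

With this approach the five 4K textures are uploaded once, and changing heightl/heightg only means updating a few float uniforms, so edits are reflected immediately without regenerating a 4096x4096 image on the CPU.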

Make a mosaic image (bitmap format)

I want to make a mosaic photo with a window size chosen by the user. This is just a first draft of the code, but I have problems getting the pixels and calculating the averages, then putting the average value in each pixel and continuing to the end. I also get errors converting between different types. (The other part produces a grayscale image.)
P.S.: sorry, I am at the very first steps of learning image processing.
void CImageProcessingDoc::OnProcessMosaic()
{
    if (m_pImage) {
        DlgMosaicOption dlg;
        if (dlg.DoModal() == IDOK) {
            DWORD dwWindowSize = dlg.m_dwWindowSize;
            DWORD width = m_pImage->GetWidth();
            DWORD height = m_pImage->GetHeight();
            RGBQUAD color;
            RGBQUAD newcolor;
            float X_step = width / dwWindowSize;
            float Y_step = height / dwWindowSize;
            int avg, pixel;
            for (DWORD y = 0; y < dwWindowSize; y++) {
                for (DWORD x = 0; x < dwWindowSize; x++) {
                    color = m_pImage->GetPixelColor(x, y);
                    (RGBQUAD) pixel = m_pImage->GetPixelColor(x, y);
                    avR += (int)(color.red(pixel);
                    avG += (int)(color.green(pixel);
                    avB += (int)(color.blue(pixel);
                    newcolor.rgbBlue = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    newcolor.rgbGreen = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    newcolor.rgbRed = (BYTE)RGB2GRAY(color.rgbRed, color.rgbGreen, color.rgbBlue);
                    m_pImage->SetPixelColor(x, y, newcolor);
                }
            }
        }
    }
}
Could anyone please help me to understand the problem?
I think you are mixing up spatial, spectral and temporal average here.
Spatial average
This is the operation of computing the average of pixels over an area.
You compute eR = 1/N * (P0.R + P1.R + P2.R + P3.R + ...), eG = 1/N * (P0.G + P1.G + ...), eB = 1/N * (P0.B + P1.B + ...).
You get a picture with as many colors as the input picture, but with limited spatial frequency; a picture computed like this appears blurred, with no details.
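For illustration, a minimal sketch of this block (spatial) average using the CImage-style accessors from the question; bx and by mark the top-left corner of one window and are my own placeholder names:

// Sketch only: average one dwWindowSize x dwWindowSize block starting at
// (bx, by), then write that average back to every pixel of the block.
const DWORD bx = 0, by = 0;   // top-left corner of the current window (placeholder)
int sumR = 0, sumG = 0, sumB = 0, n = 0;
for (DWORD y = by; y < by + dwWindowSize && y < height; y++) {
    for (DWORD x = bx; x < bx + dwWindowSize && x < width; x++) {
        RGBQUAD c = m_pImage->GetPixelColor(x, y);
        sumR += c.rgbRed; sumG += c.rgbGreen; sumB += c.rgbBlue;
        n++;
    }
}
if (n > 0) {
    RGBQUAD avgColor = { (BYTE)(sumB / n), (BYTE)(sumG / n), (BYTE)(sumR / n), 0 };
    for (DWORD y = by; y < by + dwWindowSize && y < height; y++)
        for (DWORD x = bx; x < bx + dwWindowSize && x < width; x++)
            m_pImage->SetPixelColor(x, y, avgColor);
}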
Spectral average
This is the operation of computing the average of the components (spectrum) of each pixel.
You compute e = 1/3 * (P0.R + P0.G + P0.B).
You get a monochrome picture with exactly the same spatial frequency as the initial picture.
Temporal average
While you haven't asked about it, this is here for reference. The idea is to compute the average of each pixel and each component over N pictures in a temporal sequence.
This gives a kind of motion-blurred picture.
Answer
If I understand your question correctly, you want a spectral average, converting each RGB pixel to the average grey value, taking grey = (R + G + B) / 3.
Thus, your pixel loop should look like this:
for (DWORD y = 0; y < dwWindowSize; y++) {
    for (DWORD x = 0; x < dwWindowSize; x++) {
        color = m_pImage->GetPixelColor(x, y);
        BYTE avg = (BYTE)((color.rgbRed + color.rgbGreen + color.rgbBlue) / 3);
        RGBQUAD grey = { avg, avg, avg, 0 };  // field order: rgbBlue, rgbGreen, rgbRed, rgbReserved
        m_pImage->SetPixelColor(x, y, grey);
    }
}
Please note that converting non-linear RGB (usually sRGB) to luminance using the plain average is a poor formula for RGB-to-grayscale conversion. You should read about RGB to L*a*b* conversion (you are interested in the L* part only) or at least RGB to YUV (you are interested in the Y part only).
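For reference, a common luma approximation (the Rec. 601 weights) that could replace the plain average, sketched with the same RGBQUAD fields:

// Sketch only: Rec. 601 luma weights as a better grey approximation than
// the plain average (still applied directly to the sRGB-encoded values).
BYTE grey = (BYTE)(0.299f * color.rgbRed +
                   0.587f * color.rgbGreen +
                   0.114f * color.rgbBlue);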
If your question is about resizing the input picture, then you are not using the appropriate algorithm; what you want is called resampling.

Object detection with per-pixel information - OpenCV

The input and output images of my code are here.
I want the output to be a complete object detection covering every pixel. Here I get some shadows as well as other background pixels, and some object points are missing.
Does anybody have an idea how I can get complete object detection (foreground detection) with these input images (an object image and a background image)?
Below is the code I have tried.
cv::Mat ImgObject, ImgBck;
ImgObject = imread("Object.jpg");
ImgBck = imread("Background.jpg");
imwrite("ImgObject.jpg", ImgObject);
imwrite("ImgBck.jpg", ImgBck);
cv::Mat diffImage;
// Decrease the brightness of the background, because the brightness
// changes after putting the object in the scene.
ImgBck = ImgBck + Scalar(-20, -20, -20);
cv::absdiff(ImgObject, ImgBck, diffImage);
float threshold = (float)50;
float dist = 0.0f;
for (int j = 0; j < diffImage.rows; ++j)
{
    for (int i = 0; i < diffImage.cols; ++i)
    {
        cv::Vec3b pix = diffImage.at<cv::Vec3b>(j, i);
        dist = (pix[0] * pix[0] + pix[1] * pix[1] + pix[2] * pix[2]);
        dist = sqrt(dist);
        cv::Point3_<uchar>* pFinal = ImgObject.ptr<Point3_<uchar> >(j, i);
        if (dist <= threshold)
        {
            pFinal->x = 255; // fill blue as background
            pFinal->y = 0;
            pFinal->z = 0;
        }
    }
}
imwrite("Obj.jpg", ImgObject);
ImgObject.release();
ImgBck.release();
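For reference, the same difference-and-threshold step can be expressed with OpenCV matrix operations instead of the per-pixel loop; this is only a sketch reusing the names and the 50-pixel threshold from the code above:

// Sketch only: per-pixel Euclidean distance of the difference image,
// thresholded into a background mask, then filled with blue.
cv::Mat diffF;
diffImage.convertTo(diffF, CV_32FC3);
std::vector<cv::Mat> ch;
cv::split(diffF, ch);
cv::Mat distSq = ch[0].mul(ch[0]) + ch[1].mul(ch[1]) + ch[2].mul(ch[2]);
cv::Mat distMat;
cv::sqrt(distSq, distMat);
cv::Mat backgroundMask = distMat <= 50.0f;               // same threshold as above
ImgObject.setTo(cv::Scalar(255, 0, 0), backgroundMask);  // fill blue as background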
Do not use direct light on the object (to reduce shadow and reflection).
Firstly, I need to say that this is not an object detection task, but a saliency detection or segmentation task.
Second, as #Kartik Maheshwari said, you are facing a lighting issue, which is not a solved problem in computer vision.
As an alternative answer, take a look at this.
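As one concrete example of the segmentation route mentioned above, here is a rough GrabCut sketch in OpenCV; the initial rectangle is a made-up rough bound around the object, not something derived from the question's images:

// Sketch only: GrabCut foreground extraction initialised with a rough
// rectangle around the object (the rectangle here is arbitrary).
cv::Mat mask, bgdModel, fgdModel;
cv::Rect roughRect(50, 50, ImgObject.cols - 100, ImgObject.rows - 100);
cv::grabCut(ImgObject, mask, roughRect, bgdModel, fgdModel, 5, cv::GC_INIT_WITH_RECT);
// Keep pixels labelled definite or probable foreground.
cv::Mat fgMask = (mask == cv::GC_FGD) | (mask == cv::GC_PR_FGD);
cv::Mat foreground(ImgObject.size(), ImgObject.type(), cv::Scalar(255, 0, 0));
ImgObject.copyTo(foreground, fgMask);
imwrite("ObjGrabCut.jpg", foreground);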

Irregular sampling of an image using OpenGL

I'm looking for some pointers on how to sample an image using OpenGL at a list of specified locations. Any links to tutorials or examples similar to the problem below?
At the moment we have code that calculates the 'output intensity' at a list of specified locations (x1,y1), (x2,y2), ..., (xn,yn) by applying a Lanczos2 filter to an input image. The number of locations is currently 20 (it is actually the list of phosphene locations in a visual prosthesis), but it will eventually increase to 256, and GPU processing will certainly accelerate things. The list of locations can't be hardcoded.
So far I have seen how to implement a median filter and the like, but in my case there is no need to compute the convolution with the filter kernel at every image pixel, just at the specified locations.
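To make the per-location work concrete, here is a minimal CPU sketch of one such sample (a 4x4 Lanczos2-weighted sum around a non-integer location); a GPU version would do the same arithmetic per location in a shader. All names, and the single-channel row-major image layout, are assumptions for the example:

#include <cmath>
#include <vector>

// Sketch only: Lanczos2 kernel, L(t) = sinc(t) * sinc(t / 2) for |t| < 2.
static double lanczos2(double t) {
    if (t == 0.0) return 1.0;
    if (std::fabs(t) >= 2.0) return 0.0;
    const double kPi = 3.14159265358979323846;
    const double pt = kPi * t;
    return (std::sin(pt) / pt) * (std::sin(pt / 2.0) / (pt / 2.0));
}

// Filtered intensity at a non-integer location (px, py) of a grayscale
// image stored row-major in img (width x height).
double sampleLanczos2(const std::vector<float>& img, int width, int height,
                      double px, double py) {
    const int x0 = (int)std::floor(px);
    const int y0 = (int)std::floor(py);
    double sum = 0.0, wsum = 0.0;
    for (int y = y0 - 1; y <= y0 + 2; ++y) {        // 4x4 support for a = 2
        for (int x = x0 - 1; x <= x0 + 2; ++x) {
            if (x < 0 || y < 0 || x >= width || y >= height) continue;
            const double w = lanczos2(px - x) * lanczos2(py - y);
            sum  += w * img[y * width + x];
            wsum += w;
        }
    }
    return wsum != 0.0 ? sum / wsum : 0.0;
}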
Handle those values (intensity) in a second texture that has a sample-or-not bit.
If you use OpenGL, you'll be able to define the ROI (region of interest), the portion of an image to which you want to apply edits or processing, as you describe.
If you go that route, this is how you calculate the median in a pixel neighborhood radius of your choosing using OpenGL ES 2.0/3.0:
kernel vec4 medianUnsharpKernel(sampler u) {
    vec4 pixel = unpremultiply(sample(u, samplerCoord(u)));
    vec2 xy = destCoord();
    int radius = 3;
    int bounds = (radius - 1) / 2;
    vec4 sum = vec4(0.0);
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            sum += unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
        }
    }
    vec4 mean = vec4(sum / vec4(pow(float(radius), 2.0)));
    float mean_avg = float(mean);
    float comp_avg = 0.0;
    vec4 comp = vec4(0.0);
    vec4 median = mean;
    for (int i = (0 - bounds); i <= bounds; i++)
    {
        for (int j = (0 - bounds); j <= bounds; j++)
        {
            comp = unpremultiply(sample(u, samplerTransform(u, vec2(xy + vec2(i, j)))));
            comp_avg = float(comp);
            median = (comp_avg < mean_avg) ? max(median, comp) : median;
        }
    }
    return premultiply(vec4(vec3(abs(pixel.rgb - median.rgb)), 1.0));
}
A brief description of the steps
1. Calculate the mean of the values of the pixels surrounding the source pixel in a 3x3 neighborhood;
2. Find the maximum pixel value of all pixels in the same neighborhood that are less than the mean.
3. [OPTIONAL] Subtract the median pixel value from the source pixel value for edge detection.
If you're using the median value for edge detection, there are a couple of ways to modify the above code for better results, namely hybrid median filtering and truncated median filtering (a substitute for, and a better form of, 'mode' filtering). If you're interested, please ask.