Depth Values Don't Make Sense: R200 Camera - C++

I am running the tutorial found here: https://software.intel.com/en-us/articles/using-librealsense-and-opencv-to-stream-rgb-and-depth-data
It gets the depth values from the R200 using the following lines:
cv::Mat depth16( _depth_intrin.height, _depth_intrin.width, CV_16U,(uchar *)_rs_camera.get_frame_data( rs::stream::depth ) );
cv::Mat depth8u = depth16;
depth8u.convertTo( depth8u, CV_8UC1, 255.0/1000 );
imshow( WINDOW_DEPTH, depth8u );
And the output image stream is:
https://imgur.com/EmdhFNk
You can see the color image as well. I've also put a tape measure across the bottom that goes as far as 3.5m (the range for the R200 is supposed to be up to 3.5m).
Why on earth is the output binary? I've tried different color mappings, but the result doesn't seem to contain depth values at all. Also, it makes no sense that the floor is consistently black even though it spans from 1m to 5m away. Why are all objects white? The table and couch are obviously different distances away.
How can I improve this? I know you can get good depth values from the R200, as I get them in the examples (see http://docs.ros.org/kinetic/api/librealsense/html/cpp-capture_8cpp_source.html), but those use GLFW as opposed to OpenCV. I'm wondering why the depth values are so odd once they've been converted.
Ideally I would like to generate depth values and filter out anything outside the range of 1m to 2m away. Thanks!

Edit: As @MSalters pointed out, the first half of my answer was erroneous and due to my misreading of the OP's code. The second half contains the right answer.
If your depth range is 1-3.5m, measured in millimetres (1000mm-3500mm), dividing the result by 1000 will give you data in the range 1.0-3.5. However, your source data is a 16-bit unsigned type, which can't represent decimal or floating point values, only integers, so your values would get truncated to one of {0, 1, 2, 3}. You might get away with this in convertTo, as it may marshal the types internally, but it's a potential source of error.
There is a second problem though: CV_8U is an 8-bit unsigned char, which can also only represent integer values, this time in the range 0-255. Since your data can be in the range 0...3500, multiplying by 0.255 as you do in your example means anything beyond 1000mm of depth produces a value over 255 and saturates to 255 there. That is why everything past a metre looks uniformly white.
Instead of converting the raw depth image as you are above, you could use the cv::normalize function, with the NORM_MINMAX normalisation-type to normalise your data down to the 0...255 range. You can set the destination image format to CV_8U too.
This is probably only suitable for visualisation though, as it'll be affected by the source data input range. Instead, if you know your max value is 3500, and your min is 0, divide the source image by 3500 and multiply by 255. That said, where possible, it's probably best to keep it in the 16-bit format for the sake of depth resolution.
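A minimal sketch of both options, reusing the depth16 matrix from the question (raw R200 depth in millimetres, CV_16U). The 3500mm maximum is an assumption based on the camera's stated range, and the mask at the end is one way to get the 1m-2m filtering the question asks for:

cv::Mat depth8u;

// Option 1: stretch whatever range is present down to 0...255
// (visualisation only, since the mapping changes with scene content)
cv::normalize( depth16, depth8u, 0, 255, cv::NORM_MINMAX, CV_8U );

// Option 2: fixed scale, assuming a known 0...3500mm range
depth16.convertTo( depth8u, CV_8UC1, 255.0 / 3500.0 );

// Filter to the 1m-2m band by masking the raw 16-bit depth
cv::Mat mask;
cv::inRange( depth16, cv::Scalar( 1000 ), cv::Scalar( 2000 ), mask ); // 255 where in range
cv::Mat filtered;
depth16.copyTo( filtered, mask ); // freshly allocated dst is zeroed, so out-of-range pixels stay 0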

Related

CImg pixel value - numerical

Is there a way to get the int value of a pixel returned with CImg? I'm in the process of building a basic ASCII art program that converts JPGs to character arrays, and I have the entire utility built out, but I cannot find a way to get the unsigned chars converted into the range of ints I need (0-255, although the specifics don't matter so long as it's a predictable interval).
Does anyone have any idea how to get a numerical pixel value from a JPG? (library suggestions or anything else are completely welcome)
Here is the pixel output:
\�_b��}�HaX�gNzԴ�����p��-�u�����lqu��Lߐ_"T������{�y�sricX[[TXgZ]`a~�t91960d�BpvJ0kY#uR!BpMWb\W?j"#���dCy2+4?ڽ�TT<Tght%P%y;mhͬ�����8#1�H��)����:4lu���CY|��u&<_��ī��������������ȿF�����LP:����N���-�Q�+�2;E3(�SdRO6��NI16j{#�0((
It's already been converted to black and white, so even accessing the numerical value of one color channel off the CImg would be fine. I just can't seem to get any kind of intelligible/manipulable output from the image, even though the image itself is exactly what I'm looking for.
Cast it as an int using (int)img(x,y) and ignore the extra channels.
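For example, a minimal sketch of that cast inside an ASCII-art loop (the file name and character ramp are placeholders):

#include "CImg.h"
#include <cstdio>
using namespace cimg_library;

int main()
{
    CImg<unsigned char> img("input.jpg"); // placeholder path
    const char* ramp = " .:-=+*#%@";      // 10 brightness levels
    for (int y = 0; y < img.height(); ++y) {
        for (int x = 0; x < img.width(); ++x) {
            int value = (int)img(x, y);   // channel 0; use img(x, y, 0, c) for channel c
            putchar(ramp[value * 9 / 255]);
        }
        putchar('\n');
    }
}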

Converting 12 bit color values to 8 bit color values C++

I'm attempting to convert 12-bit RGGB color values into 8-bit RGGB color values, but my current method gives strange results.
Logically, I thought that simply dividing the 12-bit values down into the 8-bit range would work and be pretty simple:
// raw_color_array contains R, G1, G2, B in a Bayer pattern, with each element
// ranging from 0 to 4095 (the 12-bit maximum)
for(int i = 0; i < array_size; i++)
{
    raw_color_array[i] /= 16; // 4095 becomes 255, and so on
}
However, in practice this actually does not work. Given, for example, a small image with water and a piece of ice in it, you can see what actually happens in the conversion (rightmost image).
Why does this happen, and how can I get the same (or close to the same) image as the one on the left, but with 8-bit values instead? Thanks!
EDIT: Going off of @MSalters' answer, I get a better quality image, but the colors are still drastically skewed. What resources can I look into for converting 12-bit data to 8-bit data without a steep loss in quality?
It appears that your raw 12-bit data isn't on a linear scale. That is quite common for images. For a non-linear scale, you can't use a linear transformation like dividing by 16.
A non-linear transform like sqrt(x*16) would also give you an 8-bit value. So would std::pow(x, 8.0/12.0).
A known problem with low-gradient images is that you get banding. If your image has an area where the original value varies from, say, 100 to 200, the 12-to-8-bit reduction will shrink that to fewer than 100 different values. You get rounding, and with naive (local) rounding you get bands. Linear or non-linear, there will then be some inputs x that all map to y, and some that map to y+1. This can be mitigated by doing the transformation in floating point and then adding a random value between -1.0 and +1.0 before rounding. This effectively breaks up the band structure.
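A small sketch of that float-plus-dither idea, assuming the 12-bit data is linear (the helper name is mine):

#include <algorithm>
#include <cmath>
#include <cstdlib>

// Map a 12-bit value (0...4095) to 8 bits with random dithering to break up bands
unsigned char dither12to8(int value12)
{
    double scaled = value12 / 16.0;                              // 0.0 ... 255.94
    double noise  = std::rand() / (double)RAND_MAX * 2.0 - 1.0;  // -1.0 ... +1.0
    long rounded  = std::lround(scaled + noise);
    return (unsigned char)std::clamp(rounded, 0L, 255L);         // clamp the dithered edges
}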
After you clarified that this 12-bit data is only for one color, here is my simple answer:
Since you want to convert its value to its 8-bit equivalent, it obviously means you lose some of the data (4 bits). This is the reason why you are not getting the same output.
After clarification:
If you want to retain the actual colour values, apply de-mosaicking on the 12-bit image and then scale the resultant data down to 8-bit. That way the colour loss due to de-mosaicking will be less than with the previous approach (see the sketch below).
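A hedged sketch of that order of operations with OpenCV. The COLOR_BayerRG choice for an RGGB layout is an assumption to verify against your sensor, and raw16 stands in for the question's data repacked as 16-bit unsigned values:

#include <opencv2/imgproc.hpp>

// raw16: hypothetical CV_16UC1 matrix holding the 12-bit Bayer data
cv::Mat bgr12, bgr8;
cv::cvtColor(raw16, bgr12, cv::COLOR_BayerRG2BGR); // de-mosaic at full 12-bit depth
bgr12.convertTo(bgr8, CV_8UC3, 255.0 / 4095.0);    // only then reduce to 8-bit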
You say that your 12 bits represent 2^12 values of one colour. That is incorrect. There are reds, greens and blues in your image. Look at the histogram. I made this with ImageMagick at the command line:
convert cells.jpg histogram:png:h.png
If you want 8 bits per pixel, rather than trying to blindly/statically apportion 3 bits to green, 2 bits to red and 3 bits to blue, you would probably be better off going with an 8-bit palette so you can have 250+ colours of all variations, rather than restricting yourself to just 8 blue shades, 4 reds and 8 greens. So, like this:
convert cells.jpg -colors 254 PNG8:result.png
Here is the result of that beside the original:
The process above is called "quantisation" and if you want to implement it in C/C++, there is a writeup here.

OpenCV convertTo()

I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but it's probably CV_8UC3 or CV_32FC3, i.e. a 3-channel image (since convertTo doesn't change the number of channels), where each channel has depth 8 bit (unsigned char, CV_8UC3) or 32 bit (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel of depth 16 bit (short). Specifically it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [-32768, 32767] for signed short.
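A tiny sketch of that clamping behaviour (values chosen arbitrarily to overflow a signed short):

#include <opencv2/core.hpp>

cv::Mat image(2, 2, CV_32FC3, cv::Scalar::all(100000.0f)); // beyond the short range
cv::Mat temp_image;
image.convertTo(temp_image, CV_16SC3);
// temp_image is CV_16SC3 and every channel holds 32767: saturate_cast clamped it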
Why would you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to contain a different range of values. E.g. if you need to sum (or subtract) some CV_8UC3 images (typically BGR images), you'd better store the result in a CV_16SC3, or you'll probably get wrong results due to saturation, since the range for CV_8U images is [0,255] (see the sketch after this list).
You read images with 16-bit depth with imread, or want to store them with imwrite. These are usually used (AFAIK) in medical or graphics applications to allow a wider range of colors. However, most monitors do not support 16-bit image visualization.
There may be other cases; let me know if I missed the one important to you.
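For the summation point above, a small illustration (a sketch with arbitrary values):

cv::Mat a(2, 2, CV_8UC3, cv::Scalar::all(200));
cv::Mat b(2, 2, CV_8UC3, cv::Scalar::all(100));

cv::Mat sum8u = a + b; // saturates: every channel is 255, not 300

cv::Mat a16, b16;
a.convertTo(a16, CV_16SC3);
b.convertTo(b16, CV_16SC3);
cv::Mat sum16s = a16 + b16; // correct: every channel is 300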
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains RGB values for that pixel). All you are doing is reformatting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based off of image formatted into CV_16SC3.
The first one is the source, the second one the destination. So, it takes image, converts it into type CV_16SC3 and stores it in temp_image.

C++: How to interpret a byte array representation of an image?

I'm trying to work with this camera SDK, and let's say the camera has this function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, modifies it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments), so I'm just guesstimating here. Here's a code snippet of what I think works:
BYTE* data = new BYTE[10000000]; // an array of an arbitrary large size, I'm not
// sure what the exact size needs to be so I
// made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code with breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is: is there a standard way for cameras to output data? How should I start using this data, and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million, not 10 million if you've got the memory, at least initially. A 10 megapixel camera using 24 bits per pixel is going to use 30 million bytes, bigger than your array. If it does something crazy like store 16 bits per colour it could take up to 60 million or 80 million bytes.
You could fill this big array with data before passing it. For example fill it with '01234567' repeated. Then it's really obvious what bytes have been written and what bytes haven't, so you can work out the real size of what's returned.
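A sketch of that probing idea (CameraGetImageData is the undocumented SDK call from the question; the pattern check is only an approximation, since real image bytes can coincide with the fill pattern):

const size_t kBufSize = 100000000; // 100 MB, generous for a large sensor
BYTE* data = new BYTE[kBufSize];
for (size_t i = 0; i < kBufSize; ++i)
    data[i] = (BYTE)('0' + i % 8); // "01234567" repeated

CameraGetImageData(data);

// Walk back from the end over bytes that still match the fill pattern
size_t written = kBufSize;
while (written > 0 && data[written - 1] == (BYTE)('0' + (written - 1) % 8))
    --written;
// 'written' now approximates the real size of the returned frame
delete[] data;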
I don't think there is a standard, but you can try to identify which values are what by putting some solid color images in front of the camera, so all pixels would be approximately the same color. Having an idea of what color should be stored in each pixel, you may understand how the color is represented in your array. I would go with black, white, red, green, and blue images.
But also consider finding a better SDK which has documentation, because just making a big array is really bad design.
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adheres to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu, using this library to interface with the camera.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth resolution of 12 bits. Therefore, each pixel intensity was stored as a 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels and the factor 2 is for 16 bits per pixel. The width and height were known a priori from the chosen camera mode.
If you have the dimensions of the result image, try to dump your byte array into a file and load the result in either Python or Matlab and just try to visualize the content. Another possibility is to load this raw file with an image editor such as ImageJ and hope to get anything out of it.
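A minimal sketch of the dump step (the dimensions and bytes-per-pixel are assumptions you must know a priori, as described above; data is the buffer filled by the camera):

#include <cstdio>

const int width = 1024, height = 768, bytesPerPixel = 2; // hypothetical values
FILE* f = std::fopen("frame.raw", "wb");
std::fwrite(data, 1, (size_t)width * height * bytesPerPixel, f);
std::fclose(f);
// Load frame.raw in Python/Matlab/ImageJ as width x height, 16-bit unsigned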
Good luck!
I hope this question's solution helps you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if your camera captures in 8-bit). What you need is just to determine the width and height. After that you can try to restore a bitmap image from your byte array.

Using ImageMagick++ to modify image contrast/brightness

I'm trying to apply contrast and brightness to a bitmap in memory, and I'm completely lost. Currently I'm trying to use Magick++ to do it, but if one of the other APIs would work better I'm all ears. I managed to find Magick::Image::sigmoidalContrast() for applying the contrast, but I can't figure out how to get it to work. I'm creating an image, passing it the buffer pointer, then calling that function, but it doesn't seem like it's changing anything, so my first thought was that it's making a copy and modifying that. Even so, I have no idea how to get the data out of the Magick::Image object.
Here's what I got so far.
Magick::Image image(fBitmapData->mGetTextureWidth(), fBitmapData->mGetTextureHeight(), "RGBA", MagickCore::CharPixel, pixels);
image.sigmoidalContrast(1, 20.0);
The documentation is useless, and after searching I could only find hints that the first parameter is actually a boolean, even though it takes a size_t, that specifies whether to add or subtract the contrast. I have no idea what to pass for the second value, so I'm just using 20.0 to test.
So does anyone know if this will work for contrast, and if not, then how do you apply contrast? And likewise I still have no idea how to apply brightness either and can't find any functions that look like they would work.
Figured it out: the function for contrast I was using was correct, and for brightness I ended up using image.modulate(brightness, 100.0, 100.0);. To get the data out of the image object you can grab the pixels of the entire image by doing
const MagickCore::PixelPacket * magickPixels = image.getConstPixels(0, 0, image.columns(), image.rows());
And then copy the magickPixels data back into the original pixels that were passed into the image constructor. An important thing to note is that the member MagickCore::PixelPacket::opacity is not what you would think it would be. If the pixel is completely transparent, you'd think the value would be 0, right? Well, for some reason ImageMagick does it the opposite way: for full transparency the value is 255. This means you need to do 255 - opacity to get the correct value.
Also be careful of the MAGICKCORE_QUANTUM_DEPTH that ImageMagick was compiled with, as this will change the values drastically. For my build, MAGICKCORE_QUANTUM_DEPTH happened to be defined as 16, so all of the values were in the range 0 to 65535, which I fixed by doing realValue = magickValue >> 8 when copying the data back over, since the texture data is unsigned char values.
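Putting those two gotchas together, a sketch of the copy-back (this assumes the pixels buffer passed to the constructor is 8-bit RGBA and MAGICKCORE_QUANTUM_DEPTH is 16, as in my build):

const MagickCore::PixelPacket* magickPixels =
    image.getConstPixels(0, 0, image.columns(), image.rows());

for (size_t i = 0; i < image.columns() * image.rows(); ++i)
{
    pixels[i * 4 + 0] = (unsigned char)(magickPixels[i].red   >> 8);
    pixels[i * 4 + 1] = (unsigned char)(magickPixels[i].green >> 8);
    pixels[i * 4 + 2] = (unsigned char)(magickPixels[i].blue  >> 8);
    // opacity is inverted: 0 means opaque, so flip it to get a normal alpha
    pixels[i * 4 + 3] = (unsigned char)(255 - (magickPixels[i].opacity >> 8));
}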
Just for clarification on how to use these functions, since the documentation is horrible and completely wrong: the first parameter to sigmoidalContrast() is actually a boolean, even though the type is a size_t, that specifies whether to increase the contrast (true) or reduce it (false), and the second is a range from 0.00001 to 20.0. I say 0.00001 because 0.0 is an invalid value, so it just needs to be some decimal that is close to, but not exactly, 0.0.
For modulate() the documentation says that each value should be specified as 1.0 for no change, which is completely wrong. The values are actually a percentage so for no change you would specify 100.0.
I hope that helps someone because it took me all damn day to figure this stuff out.
According to the ImageMagick website (this is for the command line, but it may be the same?):
-sigmoidal-contrast contrastxmid-point (e.g. 3x50%)
increase the contrast without saturating highlights or shadows.
Increase the contrast of the image using a sigmoidal transfer function without saturating highlights or shadows. Contrast indicates how much to increase the contrast. For example, near 0 is none, 3 is typical and 20 is a lot. Note that exactly zero is invalid, but 0.0001 is negligibly different from no change in contrast. mid-point indicates where midtones fall in the resultant image (0 is white; 50% is middle-gray; 100% is black). By default the image contrast is increased, use +sigmoidal-contrast to decrease the contrast.
To achieve the equivalent of a sigmoidal brightness change, use -sigmoidal-contrast brightnessx0% to increase brightness and +sigmoidal-contrast brightnessx0% to decrease brightness.
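As a command-line illustration of that syntax (file names are placeholders):

convert input.png -sigmoidal-contrast 3x50% output.png   # increase contrast
convert input.png +sigmoidal-contrast 3x50% output.png   # decrease contrast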
On the command line there is a newer brightness-contrast setting that may be in later versions of Magick++:
-brightness-contrast brightness{xcontrast}{%}
Adjust the brightness and/or contrast of the image.
Brightness and Contrast values apply changes to the input image. They are not absolute settings. A brightness or contrast value of zero means no change. The range of values is -100 to +100 on each. Positive values increase the brightness or contrast and negative values decrease the brightness or contrast. To control only contrast, set the brightness=0. To control only brightness, set contrast=0 or just leave it off.
You may also use -channel to control which channels to apply the brightness and/or contrast change. The default is to apply the same transformation to all channels.
Brightness and Contrast arguments are converted to offset and slope of a linear transform and applied using -function polynomial "slope,offset".
The slope varies from 0 at contrast=-100 to almost vertical at contrast=+100. For brightness=0 and contrast=-100, the result is totally midgray. For brightness=0 and contrast=+100, the result will approach but not quite reach a threshold at midgray; that is, the linear transformation is a very steep, nearly vertical line at midgray.
Negative slopes, i.e. negating the image, are not possible with this function. All achievable slopes are zero or positive.
The offset varies from -0.5 at brightness=-100 to 0 at brightness=0 to +0.5 at brightness=+100. Thus, when contrast=0 and brightness=100, the result is totally white. Similarly, when contrast=0 and brightness=-100, the result is totally black.
As the range of values for the arguments are -100 to +100, adding the '%' symbol is no different than leaving it off.
If Magick++ is like Imagick, it may be lagging a long way behind the ImageMagick command-line options.