I am trying to develop a dicom image viewer. I have successfully decoded the image buffer. I store all the image pixel values in an unsigned char buffer in C++.
Now, when I display the image it is working fine for images with pixel representation (0028,0103) =0. Could someone show me how to apply this signed conversion into these decoded buffer. I don't know how to convert this signed bits into unsigned bit (I think the usual conversion using typecast doesn't work well). Please post the replay for 16 bit image, that is what I really need now.
I am trying to create a viewer from scratch, which simply puts the image in screen. I have successfully completed the decoding and displaying of the dicom image. But when I try to open an image which has a pixel representation (tag 0028,0103) =1, the image is not showing correctly. The conversion from 16 bit to 8 bit is done along with applying window level and width (value found inside the dicom image), the conversion is simply linear.
Make sure to correctly read the pixel data into a signed short array by taking the TransferSyntax (endianess) into account. Then apply the windowing equation from the DICOM Standard. In setting ymin=0, ymax = 255 rescaling to 8 bit is achieved.
In general, there is more to take into account about handling DICOM pixel data:
Photometric Interpretation
Bits Stored, High Bit
Modality LUT (Rescale Slope/Intercept or a lookup table stored in the DICOM header)
I am assuming that Photometric Interpretation is MONOCRHOME2, High Bit = Bits Stored - 1, Modality LUT is Identity transformation (Slope = 1, Intercept = 0).
Other SO posts dealing with this topic:
Converting Pixel Data to 8 bit
Pixel Data Interpretation
Related
I am trying to convert some bitmap files into custom images (exr, pfm, whatever), and after that, back to bitmap:
CImg<float> image(_T("D:\\Temp\\test.bmp"));
image.normalize(0.0, 1.0);
image.save_exr(_T("D:\\Temp\\test.exr"));
and goes fine (same for .pfm file), I mean the exr file is ok, same for pfm file.
But when this exr, or pfm file I trying to convert back to bitmap:
CImg<float> image;
image.load_exr(_T("D:\\Temp\\test.exr")); // image.load_pfm(_T("D:\\Tempx\\test.pfm"));
image.save_bmp(_T("D:\\Temp\\test2.bmp"));
the result, test2.bmp is black. Complete. Why ? What I am doing wrong ?
Some image formats support saving as float, but most formats save as unsigned 8 bit integer (or uint8), meaning normal image values are from 0 to 255. If you try to save an array that is made up of floats from 0 to 1 into a format that does not support floats, your values will most likely be converted to integers. When you display your image with most image-viewing software, it'll appear entirely black since 0 is black and 1 is almost black.
Most likely when you save your image to bitmap it is trying to convert the values to uint8 but not scaling properly. You can fix this by multiplying normalized values between 0 and 1 by 255. img = int(img*255) or using numpy img = (img*255).astype(np.uint8).
It is also possible that somehow your save function is able to preserve floating point values in the bitmap format. However your image viewing software might not know how to view/display a float image. Perhaps use some imshow function (matplotlib.pyplot can easily display floating point grayscale arrays) between each line of code to check if the arrays are consistent with what you expect them to be.
I have a small doubt regarding the MaskRCNN images for training purpose. Is MRCNN is taking only 8 bit images for training? if its taking any 16 bit or 32 bit images, How it will help us by training?
Usually the visualization happens for 8 bit images. I had a dilemma if its processing 16bit how it will help in classification and mapping.
As long as you keep the data type the same and the image intensity range to be "consistent" for all input images, then it should be fine. For example if we prefer 8 bit image, you should rescale 16 and 32 bit images to 8 bit, i.e. input images should be of type uint8 - values between [0,255]. This type of "preprocessing" is required when training and doing inference with most machine learning models.
In one of the examples by matterport/Mask_RCNN, the input images are of type uint8.
Alternatively, why not just cast the images to be of type float and range [0,1], thereby preserving the pixel resolution for 16 and 32bit images? Hope this helps.
I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but probably is CV_8UC3 or CV_32FC3, i.e. a 3 channel image (since convertTo doesn't change the number of channels), where each channel has depth 8 bit (unsigned char, CV_8UC3) or 32 bit (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel of depth 16 bit (short). Specifically it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [–32768, 32767] for signed short.
Why you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to contain a different range of values. E.g. if you need to sum (or subtract) some images CV_8UC3 (tipically BGR images), you'd better store the result in a CV_16SC3 or you'll probably get wrong results due to saturations, since the range for CV_8U images is in [0,255]
You read with imread, or want to store with imwrite images with 16bit depth. This are usually used (AFAIK) in medical or graphics application to allow a wider range of colors. However, most monitors do not support 16bit image visualization.
There may be other cases, let me know if I miss the one important to you.
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains rbg values for that pixel). All you are doing is reformatting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based off of image formatted into CV_16SC3.
The first one is a source, the second one - destination. So, it takes image, converts it into type CV_16SC3 and stores in temp_image.
I was wondering how to determine the equivalent of RGB values for a grayscale image. The original image is grayscale and everything I have found online is converting an RGB image pixel values to the grayscale pixel values. I already can read in the image. Ideally, this would be for xCode.
I was wondering if there was a class which would do this for me. If so, and you could point me to it, that would be great. I will read on it.
Any help is greatly appreciated.
NOTE: I am a beginner in C++ and do not have time to learn everything formally; I have to learn all of my programming on the fly.
You need more information to transform from a simple Greyscale to RGB, when you do reverse operation, the color information is "lost", as the three channels are set to same value(depending on the algorithm each channel will have a different/same weight in the final color computation).
Digital cameras, usually store more information per pixel, 12 bits per channel in 35mm and 14 bits per channel in medium format (those bits number are the average, some products offer less or even more quality).
Thanks to those additional bits per channel, the camera can compute the "real" color, or what it thinks is the real color based on some parameters.
TL;DR: You can't without more data from your source, in this case the image.
You can convert a gray value to RGB by setting each component of the RGB value to the gray value:
ColorRGB myColorRGB = ColorRGBMake(myGrayValue, myGrayValue, myGrayValue);
I'm trying to work with this camera SDK, and let's say the camera has this function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, modifies it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments) so I'm just guestimating here. Here's a code snippet on what I think works
BYTE* data = new BYTE[10000000]; // an array of an arbitrary large size, I'm not
// sure what the exact size needs to be so I
// made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code w/ breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is, is there a standard way for cameras to output data? How should I start using this data and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million, not 10 million if you've got the memory, at least initially. A 10 megapixel camera using 24 bits per pixel is going to use 30 million bytes, bigger than your array. If it does something crazy like store 16 bits per colour it could take up to 60 million or 80 million bytes.
You could fill this big array with data before passing it. For example fill it with '01234567' repeated. Then it's really obvious what bytes have been written and what bytes haven't, so you can work out the real size of what's returned.
I don't think there is a standard but you can try to identify which values are what by putting some solid color images in front of the camera. So all pixels would be approximately the same color. Having an idea of what color should be stored in each pixel you may understand how the color is represented in your array. I would go with black, white, reg, green, blue images.
But also consider finding a better SDK which has the documentation, because making just a big array is really bad design
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adhers to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu using this library to interface with the camera.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth-resolution of 12 bit. Therefore, each pixel intensity was stored as 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels the factor 2 is for 16-bit per pixel. The width and height were known a-priori from the chosen camera mode.
If you have the dimensions of the result image, try to dump your byte array into a file and load the result either in Python or Matlab and just try to visualize the content. Another possibility is to load this raw file with an image editor such as ImageJ and hope to get anything out from it.
Good luck!
I hope this question's solution will helps you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if you camera captires in 8-bit). What you need - is just determine width and height. after that you can try to restore bitmap image from you byte array.