I'm developing an application for editing raster graphics. In this application I have to create a scanline function which does the same thing as the scanLine function in the QImage class.
But I'm a little confused about how the scanLine function works, and about scanlines in general.
For example, when I call bytesPerLine() for an image whose height is 177px, I was expecting the value to be 531 (3 bytes for each pixel), but this function returns 520?
Also, when I use
uchar data = image->scanLine(y)[x]
for a pixel with R=249 G=249 B=249, the value in the variable data is 255.
I really don't understand this value.
Thanks in advance :)
For reliable behavior you should check the return value of QImage::format() to see what underlying format is used before accessing the raw image data.
Qt seems to prefer the RGB32/ARGB32 formats for true-color images, where each pixel takes 4 bytes whether an alpha channel exists or not (for the RGB32 format the alpha byte is simply filled with 0xff). If you load a true-color image, it's probably in one of these two formats.
Besides, the byte order can differ across platforms; use QRgb to access 32-bit pixels whenever possible.
BTW, shouldn't a scanline be horizontal? I think you should use width() instead of height() to calculate the length of a scanline.
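For example, a minimal sketch of that kind of access (assuming a true-color QImage named image that is converted to RGB32 first) might look like this:

#include <QImage>
#include <QColor>

// Sketch: read pixels through scanLine() as whole QRgb values instead of
// indexing single bytes, after making sure the image is in a 32-bit format.
void readPixels(QImage image)
{
    if (image.format() != QImage::Format_RGB32 &&
        image.format() != QImage::Format_ARGB32)
        image = image.convertToFormat(QImage::Format_RGB32);

    for (int y = 0; y < image.height(); ++y) {
        const QRgb *line = reinterpret_cast<const QRgb *>(image.constScanLine(y));
        for (int x = 0; x < image.width(); ++x) {
            int r = qRed(line[x]);    // e.g. 249 for the pixel in the question
            int g = qGreen(line[x]);
            int b = qBlue(line[x]);
            // ... use r, g, b ...
        }
    }
}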
I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but it's probably CV_8UC3 or CV_32FC3, i.e. a 3-channel image (since convertTo doesn't change the number of channels), where each channel has a depth of 8 bits (unsigned char, CV_8UC3) or 32 bits (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel with a depth of 16 bits (short). Specifically, it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [-32768, 32767] for signed short.
Why would you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to contain a different range of values. E.g. if you need to sum (or subtract) some CV_8UC3 images (typically BGR images), you'd better store the result in a CV_16SC3, or you'll probably get wrong results due to saturation, since the range for CV_8U images is [0, 255] (see the sketch after this list).
You read with imread, or want to store with imwrite, images with 16-bit depth. These are usually used (AFAIK) in medical or graphics applications to allow a wider range of colors. However, most monitors do not support 16-bit image visualization.
There may be other cases; let me know if I missed the one important to you.
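As a rough sketch of the saturation point above (with hypothetical file names, assuming two 8-bit BGR images):

#include <opencv2/opencv.hpp>

// Sketch: convert two 8-bit BGR images to 16-bit signed before subtracting,
// so negative differences are not clipped to 0 as they would be with CV_8UC3.
int main()
{
    cv::Mat a = cv::imread("a.png");   // CV_8UC3, hypothetical file
    cv::Mat b = cv::imread("b.png");   // CV_8UC3, hypothetical file

    cv::Mat a16, b16;
    a.convertTo(a16, CV_16SC3);        // depth changes from 8U to 16S, still 3 channels
    b.convertTo(b16, CV_16SC3);

    cv::Mat diff = a16 - b16;          // CV_16SC3, can hold values in [-255, 255]
    return 0;
}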
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains RGB values for that pixel). All you are doing is reformatting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based off of image, formatted as CV_16SC3.
The first one is the source, the second one the destination. So, it takes image, converts it into type CV_16SC3, and stores the result in temp_image.
A C++ application running in another process passes a char[] array of three-byte pixels (red, green, blue) to a Go program. I've reconstructed this in Go as a []byte slice using cgo, but I'm unsure how to convert it to an image. I can pass the width or height as well, if that is needed (I would imagine it would be).
I'm aware of the image.RGBA type, but the documentation seems to imply that those aren't just single-byte-per-color, and that it assumes there is an alpha channel, which my very simplistic bitmap does not have. Would converting the 3-byte values I have into something that works with image.RGBA be a solution? If so, how should I do that?
Alternatively, I could do the conversion in C/C++ before sending the values, into a format that Go recognizes (jpeg, gif, png). Either way works for my uses, but I don't know how to approach either.
The image package is based on interfaces. Just define a new type with those methods.
Your type's ColorModel would return color.RGBAModel, Bounds would return your image rectangle, and At would return the color at (x, y), which you can compute if you know the image's dimensions.
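A minimal sketch of such a type (assuming the []byte slice holds width*height packed R, G, B triples; the names and output file are illustrative):

package main

import (
	"image"
	"image/color"
	"image/png"
	"os"
)

// rgbImage wraps a raw buffer of packed 3-byte R, G, B pixels so it satisfies
// the image.Image interface.
type rgbImage struct {
	pix           []byte
	width, height int
}

func (m *rgbImage) ColorModel() color.Model { return color.RGBAModel }

func (m *rgbImage) Bounds() image.Rectangle {
	return image.Rect(0, 0, m.width, m.height)
}

func (m *rgbImage) At(x, y int) color.Color {
	i := (y*m.width + x) * 3
	return color.RGBA{R: m.pix[i], G: m.pix[i+1], B: m.pix[i+2], A: 0xff}
}

func main() {
	// In the real program pix would come from the C++ side via cgo;
	// here it is just a dummy buffer.
	w, h := 4, 4
	pix := make([]byte, w*h*3)

	f, _ := os.Create("out.png")
	defer f.Close()
	png.Encode(f, &rgbImage{pix: pix, width: w, height: h})
}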
I'm trying to work with this camera SDK, and let's say the camera has a function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, modifies it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments), so I'm just guesstimating here. Here's a code snippet of what I think works:
BYTE* data = new BYTE[10000000]; // an array of an arbitrarily large size; I'm not
                                 // sure what the exact size needs to be, so I
                                 // made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code with breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is: is there a standard way for cameras to output data? How should I start using this data, and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million, not 10 million if you've got the memory, at least initially. A 10 megapixel camera using 24 bits per pixel is going to use 30 million bytes, bigger than your array. If it does something crazy like store 16 bits per colour it could take up to 60 million or 80 million bytes.
You could fill this big array with data before passing it. For example fill it with '01234567' repeated. Then it's really obvious what bytes have been written and what bytes haven't, so you can work out the real size of what's returned.
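A minimal sketch of that pattern-fill trick (BYTE and CameraGetImageData(BYTE*) are assumed to come from the SDK headers in the question):

#include <cstddef>
#include <iostream>

// Sketch: pre-fill the buffer with a repeating '0'..'7' pattern, call the SDK,
// then scan back from the end to estimate how many bytes were overwritten.
size_t estimateBytesWritten()
{
    const size_t bufSize = 100000000;            // generous upper bound
    BYTE *data = new BYTE[bufSize];
    for (size_t i = 0; i < bufSize; ++i)
        data[i] = static_cast<BYTE>('0' + (i % 8));

    CameraGetImageData(data);

    size_t written = bufSize;
    while (written > 0 &&
           data[written - 1] == static_cast<BYTE>('0' + ((written - 1) % 8)))
        --written;                               // trailing bytes still matching the pattern were untouched

    std::cout << "approximately " << written << " bytes written\n";
    delete[] data;
    return written;
}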
I don't think there is a standard, but you can try to identify which values are what by putting some solid-color images in front of the camera, so that all pixels are approximately the same color. Having an idea of what color should be stored in each pixel, you may understand how the color is represented in your array. I would go with black, white, red, green and blue images.
But also consider finding a better SDK which has documentation, because just allocating a huge array and guessing is really bad design.
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adheres to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu, using this library to interface with the camera.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth resolution of 12 bits. Therefore, each pixel intensity was stored as a 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels and the factor 2 is for 16 bits per pixel. The width and height were known a priori from the chosen camera mode.
If you have the dimensions of the result image, try to dump your byte array into a file and load the result in either Python or Matlab and just try to visualize the content. Another possibility is to load the raw file with an image editor such as ImageJ and hope to get something out of it.
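For example, a minimal sketch of the dump step (assuming the filled buffer and its dimensions from above, with 2 bytes per pixel; the file name is illustrative):

#include <fstream>

// Sketch: write the raw 16-bit pixel buffer to a file so it can be opened in
// ImageJ, Python or Matlab as a width x height, 16-bit greyscale image.
void dumpRaw(const unsigned char *data, int width, int height)
{
    std::ofstream out("frame.raw", std::ios::binary);
    out.write(reinterpret_cast<const char *>(data),
              static_cast<std::streamsize>(width) * height * 2);  // 2 bytes per pixel
}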
Good luck!
I hope this question's solution will help you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if your camera captures in 8-bit). What you need is just to determine the width and height; after that you can try to reconstruct a bitmap image from your byte array.
In an application I handle images where each pixel is either an unsigned int or a float, and each value represents a grey level. I have the source available, so I can access the data of the images freely.
I need to display/save and load these pictures using the Qt framework. Currently the only way I handle the conversion is to get and set each pixel, which is proving to be a bit slow.
Are there any other ways one could convert these images?
Instead of using QImage::setPixel you should access the image buffer directly.
After you create the image with the desired format, width and height, you can use QImage::bits() to access the memory buffer, or QImage::scanLine() to retrieve a pointer to the beginning of each line in the image, and set the pixels directly in memory: this is much faster than calling setPixel() for each pixel.
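As an illustration, a minimal sketch of that per-scanline approach (assuming the source is a buffer of float grey levels in [0, 1] and that an 8-bit greyscale output is acceptable):

#include <QImage>

// Sketch: convert a width*height buffer of float grey levels in [0, 1] into an
// 8-bit greyscale QImage by writing whole scanlines instead of using setPixel().
// Note: QImage::Format_Grayscale8 requires Qt 5.5 or later.
QImage floatToQImage(const float *src, int width, int height)
{
    QImage img(width, height, QImage::Format_Grayscale8);
    for (int y = 0; y < height; ++y) {
        uchar *line = img.scanLine(y);
        for (int x = 0; x < width; ++x)
            line[x] = static_cast<uchar>(src[y * width + x] * 255.0f);
    }
    return img;
}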
QImage has a constructor that takes a pointer to an existing buffer/image:
QImage ( uchar * data, int width, int height, Format format )
It does not take ownership of the buffer nor does it copy the contents, so you are responsible that the buffer is valid throughout the lifetime of the QImage.
Note: QImage requires 32-bit aligned image rows, so you might need to copy the image row by row into a new buffer with appropriate padding. You have only unsigned or float pixels, so this doesn't apply to you (they are already 32-bit values), but remember it should you have different pixel types in the future.
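For example, a rough sketch of wrapping an existing 32-bit-per-pixel buffer (assuming it is already in a layout QImage understands, e.g. one RGB32 value per pixel):

#include <QImage>

// Sketch: wrap an existing buffer of 32-bit pixels in a QImage without copying.
// The buffer must outlive the returned QImage, since QImage does not take
// ownership of it.
QImage wrapBuffer(quint32 *buffer, int width, int height)
{
    return QImage(reinterpret_cast<uchar *>(buffer),
                  width, height, QImage::Format_RGB32);
}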
So I have an x8r8g8b8-formatted IDirect3DSurface9 that contains the contents of the back buffer. When I call LockRect on it I get access to a struct containing pBits, a pointer to the pixels I assume, and an integer Pitch (which I am very unclear about).
How to read the individual pixels?
Visual Studio 2008 C++
The locked area is described by a D3DLOCKED_RECT. I haven't ever used Pitch, but the documentation says it is the "number of bytes in one row of the surface". Actually, people would normally call this "stride" (some terms are explained in the MSDN).
For example, if one pixel has 4 bytes (8 bits for each component of XRGB) and the texture width is 7, the image is usually stored as 8*4 bytes per row instead of 7*4 bytes, because the memory can be accessed faster if the data is DWORD-aligned.
So, in order to read pixel [x, y] you would have to read
uint8_t *pixels = (uint8_t*)rect.pBits;
uint32_t *mypixel = (uint32_t*)&pixels[rect.Pitch*y + 4*x];
where 4 is the size of a pixel. *mypixel would be the content of the pixel in my example.
Yep, you would access the individual RGB components of the pixel like that.
The first byte of the pixel is not used, but it is more efficient to use 4 bytes per pixel, so that each pixel is aligned on a 32-bit boundary (that's also why there's the pitch).
In your example, the x is not used, but note that there are also other pixel formats, for example ARGB, which stores the alpha value (transparency) in the first byte. Sometimes the colors are also reversed (BGR instead of RGB). If you're unsure which byte corresponds to which color, a good trick is to create a texture which is entirely red, green or blue and then check which of the 4 bytes has the value 255.
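Continuing the earlier snippet, a small sketch of pulling the individual channels out of the locked surface (assuming the usual X8R8G8B8 layout, where red sits in bits 16-23 of the 32-bit value):

uint8_t *pixels = (uint8_t*)rect.pBits;
uint32_t pixel  = *(uint32_t*)&pixels[rect.Pitch * y + 4 * x];

uint8_t r = (pixel >> 16) & 0xff;  // red:   bits 16-23
uint8_t g = (pixel >>  8) & 0xff;  // green: bits 8-15
uint8_t b =  pixel        & 0xff;  // blue:  bits 0-7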