I have an 8-bit image (stored in an array) containing black (0) and white (255) pixels. Say I want to change all black pixels in the image to grey (say 120). What is the fastest way I can change black to grey?
I thought of two approaches-
Start checking every single pixel in the image. As soon as a black pixel is found change it to grey. Continue till end of image. (Slower but easier)
Start checking pixels. When a black pixel is found, start a counter and keep incrementing it until the next white pixel. Then go back to where the run started and use a fast function like memset to change the whole group of black pixels to grey at once. (Not sure, but I think this may be faster.)
I have a huge 1GB image therefore approach 1 is pretty slow. Is there a better(faster) way to change/edit pixels?
Probably quicker to do it a word at a time (using word-aligned accesses).
You can just bitwise-OR each word with 0x78787878 (assuming 32-bit words; 120 is 0x78). This will not affect white pixels but will set black pixels to the required value.
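A minimal sketch of that word-at-a-time idea, assuming the image is one contiguous byte-per-pixel buffer (the 64-bit chunking, the memcpy for alignment safety, and the function name are my choices, not from the answer above):

#include <cstdint>
#include <cstring>
#include <cstddef>

// Set every 0x00 (black) byte to 0x78 (120, grey) while leaving 0xFF (white) untouched,
// eight pixels at a time. Works because 0x00 | 0x78 == 0x78 and 0xFF | 0x78 == 0xFF.
void blackToGrey(uint8_t* pixels, size_t count)
{
    const uint64_t mask = 0x7878787878787878ULL; // 0x78 replicated into every byte
    size_t i = 0;
    for (; i + sizeof(uint64_t) <= count; i += sizeof(uint64_t)) {
        uint64_t chunk;
        std::memcpy(&chunk, pixels + i, sizeof(chunk)); // memcpy sidesteps alignment issues
        chunk |= mask;
        std::memcpy(pixels + i, &chunk, sizeof(chunk));
    }
    for (; i < count; ++i)  // remaining tail bytes, one pixel at a time
        pixels[i] |= 0x78;
}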
I think the problem with the first approach is that you read and write the same 32/64/x bits (depending on memory architecture/bus width) multiple times. It should be faster if you read and write the bits corresponding to the bus width only once.
In the following snippet, getPixelsSizeOfLong returns the bytes corresponding to the width of the bus (say 4 bytes) and thereby reduces the traffic between cache and CPU.
// Forward declarations: helpers that read/write one bus-width word of pixels at a time
unsigned long getPixelsSizeOfLong(unsigned int byteInImage);
void setPixelsSizeOfLong(unsigned int byteInImage, unsigned long newBitValue);

unsigned long l;
unsigned int a;
for (a = 0; a < nof_pixels; a += sizeof(l)) {
    l = getPixelsSizeOfLong(a);
    l |= 0x78787878UL;   // 0x78 (120) in every byte; extend the constant if unsigned long is 64-bit
    setPixelsSizeOfLong(a, l);
}
Related
I'm trying to work with this camera SDK, and let's say the camera has this function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, modifies it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments) so I'm just guesstimating here. Here's a code snippet of what I think works:
BYTE* data = new BYTE[10000000]; // an array of an arbitrary large size, I'm not
// sure what the exact size needs to be so I
// made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code w/ breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is, is there a standard way for cameras to output data? How should I start using this data and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million, not 10 million if you've got the memory, at least initially. A 10 megapixel camera using 24 bits per pixel is going to use 30 million bytes, bigger than your array. If it does something crazy like store 16 bits per colour it could take up to 60 million or 80 million bytes.
You could fill this big array with data before passing it. For example fill it with '01234567' repeated. Then it's really obvious what bytes have been written and what bytes haven't, so you can work out the real size of what's returned.
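A hedged sketch of that trick; BYTE and CameraGetImageData are taken from the question, and the probeWrittenSize helper and sentinel pattern are just an illustration:

#include <cstddef>

// Hypothetical helper: estimate how many bytes CameraGetImageData actually writes.
size_t probeWrittenSize()
{
    const size_t bufSize = 100000000;      // generously oversized, per the suggestion above
    BYTE* data = new BYTE[bufSize];

    // Pre-fill with a repeating '0'..'7' sentinel so untouched bytes are easy to spot.
    for (size_t i = 0; i < bufSize; ++i)
        data[i] = static_cast<BYTE>('0' + (i % 8));

    CameraGetImageData(data);              // SDK call from the question

    // Scan backwards for the last byte that no longer matches the sentinel.
    size_t written = 0;
    for (size_t i = bufSize; i > 0; --i) {
        if (data[i - 1] != static_cast<BYTE>('0' + ((i - 1) % 8))) {
            written = i;                   // estimate only: trailing image bytes that happen
            break;                         // to match the sentinel would be missed
        }
    }
    delete[] data;
    return written;
}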
I don't think there is a standard, but you can try to identify which values are which by putting solid-colour images in front of the camera, so that all pixels are approximately the same colour. Knowing what colour should be stored in each pixel, you can work out how colour is represented in your array. I would go with black, white, red, green and blue images.
But also consider finding a better SDK that has documentation, because simply allocating an arbitrarily large array is really bad design.
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adheres to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu, using this library to interface with the camera.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth resolution of 12 bits. Therefore, each pixel intensity was stored as a 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels and the factor 2 accounts for the 16 bits (2 bytes) per pixel. The width and height were known a priori from the chosen camera mode.
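As a hedged illustration of that layout (the function name is mine, and little-endian byte order within each 16-bit pixel is an assumption; the width * height * 2 sizing is the case described above):

#include <cstdint>
#include <cstddef>

// Raw monochrome frame: one 16-bit value per pixel, row-major, width*height*2 bytes total.
uint16_t pixelAt(const uint8_t* raw, size_t width, size_t x, size_t y)
{
    size_t offset = (y * width + x) * 2;                  // 2 bytes per pixel
    return static_cast<uint16_t>(raw[offset] | (raw[offset + 1] << 8)); // low byte, high byte (little-endian)
}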
If you have the dimensions of the result image, try dumping your byte array into a file and loading the result in Python or Matlab to visualize the content. Another possibility is to load the raw file with an image editor such as ImageJ and hope to get something out of it.
Good luck!
I hope this question's solution helps you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if your camera captures in 8-bit). All you need to do is determine the width and height; after that you can try to reconstruct a bitmap image from your byte array.
I'm looking to filter a 1 bit per pixel image using a 3x3 filter: for each input pixel, the corresponding output pixel is set to 1 if the weighted sum of the pixels surrounding it (with weights determined by the filter) exceeds some threshold.
I was hoping that this would be more efficient than converting to 8 bpp and then filtering that, but I can't think of a good way to do it. A naive method is to keep track of nine pointers to bytes (three consecutive rows and also pointers to either side of the current byte in each row, for calculating the output for the first and last bits in these bytes) and for each input pixel compute
sum = filter[0] * ((*lastRowPtr & aMask) > 0) + filter[1] * ((*lastRowPtr & bMask) > 0) + ... + filter[8] * ((*nextRowPtr & hMask) > 0),
with extra faff for bits at the edge of a byte. However, this is slow and seems really ugly. You're not gaining any parallelism from the fact that you've got eight pixels in each byte and instead are having to do tonnes of extra work masking things.
Are there any good sources for how to best do this sort of thing? A solution to this particular problem would be amazing, but I'd be happy being pointed to any examples of efficient image processing on 1bpp images in C/C++. I'd like to replace some more 8 bpp stuff with 1 bpp algorithms in future to avoid image conversions and copying, so any general resources on this would be appreciated.
I found a number of years ago that unpacking the bits to bytes, doing the filter, then packing the bytes back to bits was faster than working with the bits directly. It seems counter-intuitive because it's 3 loops instead of 1, but the simplicity of each loop more than made up for it.
I can't guarantee that it's still the fastest; compilers and especially processors are prone to change. However simplifying each loop not only makes it easier to optimize, it makes it easier to read. That's got to be worth something.
A further advantage to unpacking to a separate buffer is that it gives you flexibility for what you do at the edges. By making the buffer 2 bytes larger than the input, you unpack starting at byte 1 then set byte 0 and n to whatever you like and the filtering loop doesn't have to worry about boundary conditions at all.
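For what it's worth, a minimal sketch of the unpack/pack halves of that approach, assuming rows are packed MSB-first within each byte (the helper names and that bit order are my assumptions); the filter loop in the middle then works on plain 0/1 bytes:

#include <cstdint>

// Unpack a 1 bpp row (MSB-first within each byte) into one 0/1 byte per pixel.
void unpackRow(const uint8_t* packed, int width, uint8_t* out)
{
    for (int x = 0; x < width; ++x)
        out[x] = (packed[x >> 3] >> (7 - (x & 7))) & 1;
}

// Pack a row of 0/1 bytes back into 1 bpp, MSB-first.
void packRow(const uint8_t* pixels, int width, uint8_t* packed)
{
    for (int x = 0; x < width; ++x) {
        if ((x & 7) == 0) packed[x >> 3] = 0;   // clear each output byte before setting bits
        packed[x >> 3] |= static_cast<uint8_t>(pixels[x] << (7 - (x & 7)));
    }
}

Making the unpacked buffer a couple of bytes wider than the row, as suggested above, means the filter loop never has to special-case the edges.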
Look into separable filters. Among other things, they allow massive parallelism in the cases where they work.
For example, in your 3x3 sample-weight-and-filter case:
Sample 1x3 (horizontal) pixels into a buffer. This can be done in isolation for each pixel, so a 1024x1024 image can run 1024^2 simultaneous tasks, all of which perform 3 samples.
Sample 3x1 (vertical) pixels from the buffer. Again, this can be done on every pixel simultaneously.
Use the contents of the buffer to cull pixels from the original texture.
The advantage to this approach, mathematically, is that it cuts the number of sample operations from n^2 to 2n, although it requires a buffer of equal size to the source (if you're already performing a copy, that can be used as the buffer; you just can't modify the original source for step 2). In order to keep memory use at 2n, you can perform steps 2 and 3 together (this is a bit tricky and not entirely pleasant); if memory isn't an issue, you can spend 3n on two buffers (source, hblur, vblur).
Because each operation is working in complete isolation from an immutable source, you can perform the filter on every pixel simultaneously if you have enough cores. Or, in a more realistic scenario, you can take advantage of paging and caching to load and process a single column or row. This is convenient when working with odd strides, padding at the end of a row, etc. The second round of samples (vertical) may screw with your cache, but at the very worst, one round will be cache-friendly and you've cut the per-pixel work from quadratic in the filter width to linear.
Now, I've yet to touch on the case of storing data in bits specifically. That does make things slightly more complicated, but not terribly much so. Assuming you can use a rolling window, something like:
d = s[x-1] + s[x] + s[x+1]
works. Interestingly, if you were to rotate the image 90 degrees during the output of step 1 (trivial, sample from (y,x) when reading), you can get away with loading at most two horizontally adjacent bytes for any sample, and only a single byte something like 75% of the time. This plays a little less friendly with cache during the read, but greatly simplifies the algorithm (enough that it may regain the loss).
Pseudo-code:
buffer source, dest, vbuf, hbuf;

for_each (y, x) // Loop over each row, then each column. Generally works better wrt paging
{
    hbuf(x, y) = (source(y, x-1) + source(y, x) + source(y, x+1)) / 3 // swap x and y to spin 90 degrees
}

for_each (y, x)
{
    vbuf(x, 1-y) = (hbuf(y, x-1) + hbuf(y, x) + hbuf(y, x+1)) / 3 // 1-y to reverse the 90 degree spin
}

for_each (y, x)
{
    dest(x, y) = threshold(vbuf(x, y)) // threshold the fully blurred value from the second pass
}
Accessing bits within the bytes (source(x, y) indicates access/sample) is relatively simple to do, but kind of a pain to write out here, so is left to the reader. The principle, particularly implemented in this fashion (with the 90 degree rotation), only requires 2 passes of n samples each, and always samples from immediately adjacent bits/bytes (never requiring you to calculate the position of the bit in the next row). All in all, it's massively faster and simpler than any alternative.
Rather than expanding the entire image to 1 bit/byte (or 8bpp, essentially, as you noted), you can simply expand the current window - read the first byte of the first row, shift and mask, then read out the three bits you need; do the same for the other two rows. Then, for the next window, you simply discard the left column and fetch one more bit from each row. The logic and code to do this right isn't as easy as simply expanding the entire image, but it'll take a lot less memory.
As a middle ground, you could just expand the three rows you're currently working on. Probably easier to code that way.
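As a rough sketch of the per-window bit access a rolling-window version needs (this uses a simple getBit helper rather than the shift-and-mask register described above; the names and MSB-first bit order are my assumptions):

#include <cstdint>

// Read one pixel from a packed 1 bpp row (MSB-first within each byte).
inline int getBit(const uint8_t* row, int x)
{
    return (row[x >> 3] >> (7 - (x & 7))) & 1;
}

// Weighted 3x3 sum around column x, given pointers to the three packed rows;
// the caller must keep x away from the image edges (or pad the rows).
int windowSum(const uint8_t* prev, const uint8_t* cur, const uint8_t* next,
              int x, const int filter[9])
{
    return filter[0] * getBit(prev, x - 1) + filter[1] * getBit(prev, x) + filter[2] * getBit(prev, x + 1)
         + filter[3] * getBit(cur,  x - 1) + filter[4] * getBit(cur,  x) + filter[5] * getBit(cur,  x + 1)
         + filter[6] * getBit(next, x - 1) + filter[7] * getBit(next, x) + filter[8] * getBit(next, x + 1);
}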
I am fairly new to image processing using Visual C++. I am in search of a way to read black-and-white TIFF files and write the image as an array of hex values representing 0 or 1, then get the location information of either 0 (black) or 1 (white).
After a bit of research on Google and an article here https://www.ibm.com/developerworks/linux/library/l-libtiff/#resources I have the following trivial questions; please do point me to relevant articles. There are so many that I couldn't wrap my head around them; meanwhile I will keep on searching and reading.
Is it possible to extract pixel location information from TIFF using LIBTIFF?
Again, being new to all image formats, I couldn't help thinking that an image is made up of a 2D array of hex values. Is the bitmap format more appropriate? Thinking that way makes me wonder if I can just write the location of "white" or "black", mapped from its height, onto a 2D array. I want the location of either "black" or "white"; is that possible? I don't really want an array of 0s and 1s that represents the image, and I'm not sure if I even need one.
TIFF Example of a 4 by 2 BW image:
black white
white black
black black
white white
Output to something like ptLocArray[imageHeight][imageWidth] under a function named mappingNonPrint()
2 1
4 4
or under a function named mappingPrint()
1 2
3 3
With LIBTIFF, in what direction/ways does it read TIFF files?
Ideally I would like the TIFF file to be read vertically from top to bottom, then shift to the next column, then start again from top to bottom. But my gut tells me that's a fairy tale; being very new to image processing, I'd like to know how TIFF is read, for example single-strip ones.
Additional info
I think I should add why I want the pixel locations. I am trying to make a cylindrical roll printer, using an optical rotary encoder to provide location feedback. The height of the document represents the circumference of the roll surface, and the width of the document represents the number of revolutions.
The following is the grossly untested logic of my ENC_reg(), pretty much unrelated to the image-processing part, but it may help readers see what I am trying to do with the processed array of a TIFF image. i is indexed from 0 to imageHeight; however, each element may be an arbitrary number ranging from 0 to imageHeight (it could be 112, 354, etc.) that corresponds to whichever location contains black in the TIFF image. j is indexed from 0 to imageWidth. For example, ptLocArray[1][2] means the first location that has a black pixel in column 2; the value stored there could be 231, i.e. the 231st pixel counting from the top of column 2.
I realized that array[imageHeight][] should instead be array[maxnum_blackPixelPerColumn][], but I don't know how to count the number of 1s or 0s per column, because I don't know how to extract the 1s and 0s in the right order, as I mentioned earlier.
void ENC_reg()
{
    if (inputPort_ENC_reg == true) // true if received an encoder signal
    {
        ENC_regTemp++; // increment ENC register temp value by one each time the port registers a signal
        if (ENC_regTemp == ptLocArray[i][j])
        {
            Rollprint(); // print it
            i++; // move to the next height location that indicates a "black", meaning print
            if (ENC_regTemp == imageHeight)
            {
                // End of column reached, end of a revolution
                j++;             // jump to next column of the image (starting next revolution)
                i = 0;           // reset i to 0, the top of the next column
                ENC_regTemp = 0; // reset encoder register temp to 0 for the new revolution
            }
        }
        ENC_reg(); // recall itself
    }
}
As I read your example, you want two arrays for each column of the image, one to contain the indices of rows with a black pixel in that column, the other the indices of rows with a white pixel. Right so far?
This is certainly possible, but it would require huge amounts of memory. Even uncompressed, a single pixel of a grayscale image takes one byte of memory, and you can even use bit packing and cram 8 pixels into a byte. On the other hand, unless you restrict yourself to images with no more than 255 rows, you'll need multiple bytes to represent each row index, and you'd need extra logic to mark the end of each column.
My point is: try working with the "array of 0s and 1s", as this should be easier to accomplish, less demanding in terms of memory, and more efficient to run.
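For the LIBTIFF side, here is a hedged sketch of reading a strip-based bilevel TIFF scanline by scanline (i.e. row by row, top to bottom) and collecting, per column, the row indices of black pixels. The nested-vector layout is my own substitute for a fixed-size ptLocArray, and the assumption that a 0 bit means black depends on the file's photometric interpretation (PHOTOMETRIC_MINISWHITE vs. MINISBLACK), so check that tag on your files:

#include <tiffio.h>
#include <cstdint>
#include <vector>

// For each column, collect the row indices that hold a "black" pixel.
// Assumes a 1 bit-per-sample, strip-based TIFF where a 0 bit means black.
std::vector<std::vector<uint32_t>> blackRowsPerColumn(const char* path)
{
    TIFF* tif = TIFFOpen(path, "r");
    if (!tif) return {};

    uint32_t width = 0, height = 0;
    TIFFGetField(tif, TIFFTAG_IMAGEWIDTH, &width);
    TIFFGetField(tif, TIFFTAG_IMAGELENGTH, &height);

    std::vector<uint8_t> scanline(TIFFScanlineSize(tif));
    std::vector<std::vector<uint32_t>> columns(width);

    for (uint32_t row = 0; row < height; ++row) {
        TIFFReadScanline(tif, scanline.data(), row, 0);        // TIFF data comes out one scanline (row) at a time
        for (uint32_t x = 0; x < width; ++x) {
            int bit = (scanline[x >> 3] >> (7 - (x & 7))) & 1; // bits are packed MSB-first within each byte
            if (bit == 0)                                      // 0 == black under the assumption above
                columns[x].push_back(row);
        }
    }
    TIFFClose(tif);
    return columns; // columns[j] lists, top to bottom, the rows that are black in column j
}

A side effect of the vector layout is that columns[j].size() also gives the per-column black-pixel count you were missing.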
I'm trying to understand building a bmp based on raw data in c++ and I have a few questions.
My bmp can be black and white, so I figured that in the bits-per-pixel field I should go with 1. However, in a lot of guides I see that padding is added to keep each row 32-bit aligned, which makes me think my bmp will be the same file size as a 24-bit-per-pixel bmp.
Is this understanding correct, or is the 1-bit-per-pixel file in some way smaller than 24, 32, etc.?
Thanks
Monochrome bitmaps are aligned too, but they will not take as much space as 24/32-bpp ones.
A row of a 5-pixel-wide 24-bit bitmap will take 16 bytes: 5*3=15 for pixels, and 1 byte of padding.
A row of a 5-pixel-wide 32-bit bitmap will take 20 bytes: 5*4=20 for pixels, no need for padding.
A row of a 5-pixel-wide monochrome bitmap will take 4 bytes: 1 byte for pixels (it is not possible to use less than a byte, so the whole byte is taken but 3 of its 8 bits are not used), and 3 bytes of padding.
So a monochrome bitmap will of course be smaller than a 24-bit one.
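The row size can be computed directly; a small sketch of the usual stride formula, from which the numbers above fall out:

#include <cstdio>

// Bytes per BMP row: pixel bits rounded up to a whole number of 32-bit DWORDs.
int bmpStride(int widthPixels, int bitsPerPixel)
{
    return ((widthPixels * bitsPerPixel + 31) / 32) * 4;
}

int main()
{
    printf("%d\n", bmpStride(5, 24)); // 16 bytes (15 data + 1 padding)
    printf("%d\n", bmpStride(5, 32)); // 20 bytes (no padding needed)
    printf("%d\n", bmpStride(5, 1));  // 4 bytes  (1 data byte + 3 padding)
    return 0;
}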
The answer is already given above (that bitmap rows are aligned/padded to a 32-bit boundary); however, if you want more information, you might want to read DIBs and Their Uses, the "DIB Header" section - it explains the details.
Every scanline is DWORD-aligned. The scanline is buffered to alignment; the buffering is not necessarily 0.
The scanlines are stored upside down, with the first scan (scan 0) in memory being the bottommost scan in the image. (See Figure 1.) This is another artifact of Presentation Manager compatibility. GDI automatically inverts the image during the Set and Get operations. (Figure 1 in the linked article shows the memory layout versus the on-screen image.)
So I have an x8r8g8b8-formatted IDirect3DSurface9 that contains the contents of the back buffer. When I call LockRect on it I get access to a struct containing pBits, a pointer to the pixels I assume, and an integer Pitch (whose purpose I am very unclear about).
How to read the individual pixels?
Visual Studio 2008 C++
The locked area is described by a D3DLOCKED_RECT. I haven't ever used this, but the documentation says Pitch is the "Number of bytes in one row of the surface". People would normally call this the "stride" (some terms are explained on MSDN).
For example, if one pixel takes 4 bytes (8 bits for each component of XRGB) and the texture width is 7, each row of the image is usually stored as 8*4 bytes instead of 7*4 bytes, because the memory can be accessed faster if the data is DWORD-aligned.
So, in order to read pixel [x, y] you would have to read
uint8_t *pixels = (uint8_t*)rect.pBits;                         // pBits is a void*, so cast to a byte pointer first
uint32_t *myPixel = (uint32_t*)&pixels[rect.Pitch * y + 4 * x]; // Pitch bytes per row, 4 bytes per pixel
where 4 is the size of a pixel. *myPixel would be the content of the pixel in my example.
Yep, you would access the individual RGB components of the pixel like that.
The first byte of the pixel is not used, but it is more efficient to use 4 bytes per pixel so that each pixel is aligned on a 32-bit boundary (that's also why there's the pitch).
In your example, the x is not used, but note that there are also other pixel formats, for example ARGB, which stores the alpha value (transparency) in the first byte. Sometimes the colors are also reversed (BGR instead of RGB). If you're unsure what byte corresponds to what color, a good trick is to create a texture which is entirely red, green or blue and then check which of the 4 bytes has the value 255.
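A small hedged sketch of pulling the components out of one such pixel (this assumes the usual D3DFMT_X8R8G8B8 layout; verify it with the solid-colour test just described):

#include <cstdint>

// X8R8G8B8: top byte unused, then red, green, blue in the lower 24 bits.
void splitXRGB(uint32_t pixel, uint8_t& r, uint8_t& g, uint8_t& b)
{
    r = (pixel >> 16) & 0xFF; // red
    g = (pixel >> 8)  & 0xFF; // green
    b =  pixel        & 0xFF; // blue
}

Combined with the *myPixel access above, splitXRGB(*myPixel, r, g, b) gives the three colour channels.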