Is there any way to determine the width and height of an RGB values array? - C++

I have an array of RGB values whose size is different each time. I'm trying to determine which width and height would suit it best.
The idea is that I receive raw files and want to display the file data as a BMP image (e.g. Hex Workshop has a feature for this called Data Visualizer).
Any suggestions?
Regards.

Find the divisors of the pixel array size.
For instance, if your array contains 243 pixels, the divisors are 1, 3, 9, 27, 81 and 243. That means your image is either 1x243, 3x81, 9x27, 27x9, 81x3 or 243x1.
You can only guess which one is right by analyzing the image content: vertical or horizontal features, recurring patterns, common aspect ratios, etc.
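As a rough sketch of the divisor idea, the function below lists every width/height pair whose product equals the pixel count. The name candidate_dimensions is made up for this illustration, and it assumes you have already converted the byte count to a pixel count (for 24-bit RGB, bytes divided by 3).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns every (width, height) pair with width * height == pixel_count,
// ordered by increasing width. Hypothetical helper, not from any library.
std::vector<std::pair<std::size_t, std::size_t>>
candidate_dimensions(std::size_t pixel_count) {
    assert(pixel_count > 0);
    std::vector<std::pair<std::size_t, std::size_t>> small, large;
    // w <= pixel_count / w is the overflow-safe form of w * w <= pixel_count.
    for (std::size_t w = 1; w <= pixel_count / w; ++w) {
        if (pixel_count % w == 0) {
            const std::size_t h = pixel_count / w;
            small.emplace_back(w, h);
            if (w != h) large.emplace_back(h, w);  // mirrored pair
        }
    }
    // Append the mirrored pairs in reverse so widths keep increasing.
    small.insert(small.end(), large.rbegin(), large.rend());
    return small;
}
```

Picking among the candidates still needs a heuristic, for example preferring the pair closest to a common aspect ratio, or the one for which neighbouring rows look most alike.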

Related

Compression of an image

I have been calculating the uncompressed and compressed file sizes of an image. So far the compressed image has always been smaller than the uncompressed one, as I would expect. If an image contains a large number of different colours, storing the palette takes up a significant amount of space, and more bits are needed to store each code. My question is: could this compression method result in a larger file than the uncompressed RGB image? What is the size (in pixels) of the smallest square RGB image, containing a total of k different colours, for which this compression method is still useful? In other words, for a given value of k, we want to find the smallest integer n for which an image of size n×n takes up less storage space after compression than the original RGB image.
Let's begin by making a small simplification -- the size of the encoded output depends on the number of pixels (the actual proportion of width vs. height doesn't really matter). Hence, let's generalize the problem to number of pixels N, from which we can always calculate n by taking a square root.
To further simplify the problem, we will also ignore the overhead of any image headers/metadata, such as width, height, size of the palette, etc. In practice, this would generally be some relatively small constant.
Problem Statement
Given that we have
N representing the number of pixels in an image
k representing the number of distinct colours in an image
24 bits per pixel RGB encoding
$L_{RGB}$ representing the length of an RGB image
$L_P$ representing the length of a palette image
our goal is to solve the following inequality
$$L_P < L_{RGB}$$
in terms of N.
Size of RGB Image
An RGB image is just an array of N pixels, each pixel taking up a fixed number of bits given by the RGB encoding. Hence,
$$L_{RGB} = 24N$$
Size of Palette Image
A palette image consists of two parts: a palette, and the pixels.
A palette is an array of k colours, each colour taking up a fixed number of bits given by the RGB encoding. Therefore,
$$L_{palette} = 24k$$
In this case, each pixel holds an index to a palette entry, rather than an actual RGB colour. The number of bits required to represent k values is
$$\log_2 k$$
However, unless we can encode fractional bits (which I consider outside the scope of this question), we need to round this up. Therefore, the number of bits required to encode a palette index is
$$\lceil \log_2 k \rceil$$
Since there are N such palette indices, the size of the pixel data is
$$L_{pixels} = N \lceil \log_2 k \rceil$$
and the total size of the palette image is
$$L_P = L_{palette} + L_{pixels} = 24k + N \lceil \log_2 k \rceil$$
Solving the Inequality
$$24k + N \lceil \log_2 k \rceil < 24N$$
$$24k < N \left(24 - \lceil \log_2 k \rceil\right)$$
And finally (assuming $\lceil \log_2 k \rceil < 24$, so that the divisor is positive)
$$N > \frac{24k}{24 - \lceil \log_2 k \rceil}$$
In Python, we could express this in the following way:
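import math

def limit_size(k):
    # Pixel count N above which the palette image beats the RGB image.
    # math.log2 is used because math.log(k, 2) can be slightly off for powers of two.
    return (k * 24.) / (24. - math.ceil(math.log2(k)))

def size_rgb(N):
    return (N * 24.)

def size_pal(N, k):
    return (N * math.ceil(math.log2(k))) + (k * 24.)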
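To get from the pixel count N back to the side length n that the question asks for, a small C++ sketch can search for the smallest n with n×n above the limit. It uses integer arithmetic only, so no rounding issues arise. The names index_bits and smallest_square_side are invented for this illustration, and header overhead is ignored as in the derivation above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Bits needed for one palette index: ceil(log2(k)), computed with integers.
unsigned index_bits(std::uint64_t k) {
    assert(k >= 1);
    unsigned bits = 0;
    while (bits < 64 && (std::uint64_t{1} << bits) < k) ++bits;
    return bits;
}

// Smallest n such that an n x n palette image with k colours is smaller
// than the 24-bit RGB image, or nullopt if the palette never wins.
std::optional<std::uint64_t> smallest_square_side(std::uint64_t k) {
    const unsigned bits = index_bits(k);
    if (bits >= 24) return std::nullopt;  // an index is no shorter than a pixel
    std::uint64_t n = 1;
    // Palette wins when 24k + N*bits < 24N, i.e. N * (24 - bits) > 24k.
    while (n * n * (24 - bits) <= 24 * k) ++n;
    return n;
}
```

For k = 256 this gives n = 20: at 20×20 the palette image needs 9344 bits against 9600 for RGB, while at 19×19 it needs 9032 bits against 8664.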
In general no, but your question is not precise.
If we compress ordinary files, the result can be larger. E.g. if you compress a randomly generated sequence of bytes, there is not much to compress, so you still pay for the header of the compression program, which records the compression method used and some versioning, and possibly for some escaping. This enlarges the file. A good compression program will see that compression does not shrink the data, so it should simply not compress and note in the header that the content is stored as is. Possibly this is decided per region of the file.
But your question is about images. Compression is done inside the file, and often not on the whole file but only on the image bits. In this case the program will see that there is no benefit in compressing and will keep the data uncompressed. Because the image headers are always present, this changes only a flag, so the size does not increase.
This can also depend on the file format. You wrote about a "palette", but palettes are not used much nowadays: compression is done by finding similar patterns in the file. But again, this depends on the image format. If you look up a particular file format in Wikipedia, you may see a table of header parameters (e.g. bit depth or number of colours (palette), colour definitions, and the methods used to compress).
Then, for palette-like images, the answer of Dan Mašek (https://stackoverflow.com/a/58683948/2758823) has a nice mathematical explanation, but one should not forget that compression relies heavily on heuristics and on testing with real examples: real images have patterns.

Cannot read correct PGM pixel values

I have a really weird error.
I'm trying to read a PGM image by loading its pixel values into an array. I was able to correctly read its version, height, width, and maximum possible pixel value. However, when I start reading the pixel values, I always get 0. (I know they are not zero because I can read the file using imread in MATLAB, but I have to implement this in C++, and I can't use the OpenCV library.)
Besides, when I open the PGM file in something like Notepad++, the first few lines correctly show the information about the image; however, the actual pixel values are not readable. I'm wondering if I need some sort of parsing to read a PGM image? Its version is P5.
Thanks!
You must have an assignment to solve, as there is no sane reason to implement a PGM reader otherwise.
There are two different PGM formats: ASCII and binary. You seem to expect an ASCII PGM but the one you have is binary.
Have a look at the specs: http://netpbm.sourceforge.net/doc/pgm.html
It says:
1. A "magic number" for identifying the file type. A pgm image's
magic number is the two characters "P5".
[…]
9. A raster of Height rows, in order from top to bottom. Each row
consists of Width gray values, in order from left to right. Each gray
value is a number from 0 through Maxval, with 0 being black and Maxval
being white. Each gray value is represented in pure binary by either
1 or 2 bytes. If the Maxval is less than 256, it is 1 byte.
Otherwise, it is 2 bytes. The most significant byte is first.
The format you are expecting is described further down below as the Plain PGM format. Its magic number is "P2".
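A minimal sketch of a binary (P5) reader following the quoted rules might look like the code below. The type PgmImage and the functions read_header_int and read_binary_pgm are names made up for this example. It assumes comments appear only between header tokens and that the stream was opened in binary mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying the reader out
#include <stdexcept>
#include <string>
#include <vector>

struct PgmImage {
    int width = 0;
    int height = 0;
    int maxval = 0;
    std::vector<std::uint16_t> pixels;  // row-major, top row first
};

// Reads the next header integer, skipping whitespace and '#' comment lines.
int read_header_int(std::istream& in) {
    in >> std::ws;
    while (in.peek() == '#') {
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        in >> std::ws;
    }
    int value = 0;
    if (!(in >> value) || value <= 0) throw std::runtime_error("bad PGM header");
    return value;
}

PgmImage read_binary_pgm(std::istream& in) {
    std::string magic;
    in >> magic;
    if (magic != "P5") throw std::runtime_error("not a binary (P5) PGM");

    PgmImage img;
    img.width = read_header_int(in);
    img.height = read_header_int(in);
    img.maxval = read_header_int(in);
    if (img.maxval > 65535) throw std::runtime_error("Maxval out of range");
    in.get();  // exactly one whitespace byte separates header and raster

    const std::size_t count = static_cast<std::size_t>(img.width) * img.height;
    const std::size_t bytes_per_sample = img.maxval < 256 ? 1 : 2;
    std::vector<char> raw(count * bytes_per_sample);
    in.read(raw.data(), static_cast<std::streamsize>(raw.size()));
    if (static_cast<std::size_t>(in.gcount()) != raw.size())
        throw std::runtime_error("truncated raster");

    img.pixels.resize(count);
    for (std::size_t i = 0; i < count; ++i) {
        if (bytes_per_sample == 1) {
            img.pixels[i] = static_cast<unsigned char>(raw[i]);
        } else {  // two bytes per sample, most significant byte first
            const unsigned hi = static_cast<unsigned char>(raw[2 * i]);
            const unsigned lo = static_cast<unsigned char>(raw[2 * i + 1]);
            img.pixels[i] = static_cast<std::uint16_t>((hi << 8) | lo);
        }
    }
    assert(img.pixels.size() == count);
    return img;
}
```

With a file you would pass a std::ifstream opened with std::ios::binary. The raster is read with read() rather than operator>>, because formatted extraction expects ASCII digits and fails on raw bytes, which may be why the values in the question came out as 0.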

multidimensional discrete wavelet transform

Can anyone tell me the correct way to use the getOutputValue function from the following link? Also, how does the author get the 2nd and 3rd images from the code?
http://www.codeproject.com/Articles/385658/Multidimensional-Discrete-Wavelet-Transform
Thanks
Okay, usage:
I haven't tried it yet, but from what I get you simply call getOutputValue() to get one result. The parameter is a vector containing the "coordinates" (based on the number of dimensions in your input).
Images:
In this example, the author obviously used the image data as the discrete values, e.g. a black pixel would be 0 and a white pixel would be 255, with all other shades of grey in between (a default 8-bit grayscale image).
He then used the output signal/result to recreate an image (i.e. interpreted the values as pixels once again).

C++: How to interpret a byte array representation of an image?

I'm trying to work with this camera SDK, and let's say the camera has this function called CameraGetImageData(BYTE* data), which I assume takes in a byte array, modifies it with the image data, and then returns a status code based on success/failure. The SDK provides no documentation whatsoever (not even code comments), so I'm just guesstimating here. Here's a code snippet of what I think works:
BYTE* data = new BYTE[10000000]; // an array of an arbitrary large size; I'm not
                                 // sure what the exact size needs to be, so I
                                 // made it large
CameraGetImageData(data);
// Do stuff here to process/output image data
I've run the code w/ breakpoints in Visual Studio and can confirm that the CameraGetImageData function does indeed modify the array. Now my question is, is there a standard way for cameras to output data? How should I start using this data and what does each byte represent? The camera captures in 8-bit color.
Take pictures of pure red, pure green and pure blue. See what comes out.
Also, I'd make the array 100 million bytes, not 10 million, if you've got the memory, at least initially. A 10 megapixel camera using 24 bits per pixel is going to use 30 million bytes, which is bigger than your array. If it does something crazy like store 16 bits per colour channel, it could take up to 60 million or 80 million bytes.
You could fill this big array with data before passing it. For example fill it with '01234567' repeated. Then it's really obvious what bytes have been written and what bytes haven't, so you can work out the real size of what's returned.
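Here is a small sketch of that trick. The helper names fill_with_pattern and written_extent are invented for the example. It assumes the buffer is a std::vector<unsigned char> whose data() pointer you hand to the SDK call (on Windows, BYTE is an unsigned char).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills the buffer with the repeating text "01234567".
void fill_with_pattern(std::vector<unsigned char>& buffer) {
    static const char pattern[] = "01234567";
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        buffer[i] = static_cast<unsigned char>(pattern[i % 8]);
    }
}

// Returns one past the last byte that no longer matches the pattern,
// i.e. a lower bound on how many bytes were overwritten.
std::size_t written_extent(const std::vector<unsigned char>& buffer) {
    static const char pattern[] = "01234567";
    for (std::size_t i = buffer.size(); i > 0; --i) {
        if (buffer[i - 1] != static_cast<unsigned char>(pattern[(i - 1) % 8])) {
            return i;
        }
    }
    return 0;  // nothing was changed
}
```

You would call fill_with_pattern(buffer), then CameraGetImageData(buffer.data()), then written_extent(buffer). The result can be slightly too small if the last bytes of the image happen to match the pattern, so repeating the capture a few times gives a more reliable size.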
I don't think there is a standard, but you can try to identify which values are which by putting some solid colour images in front of the camera, so that all pixels are approximately the same colour. Knowing what colour should be stored in each pixel, you may be able to work out how the colour is represented in your array. I would go with black, white, red, green and blue images.
But also consider finding a better SDK that has documentation, because just making a big array is really bad design.
You should check the documentation on your camera SDK, since there's no "standard" or "common" way for data output. It can be raw data, it can be RGB data, it can even be already compressed. If the camera vendor doesn't provide any information, you could try to find some libraries that handle most common formats, and try to pass the data you have to see what happens.
Without even knowing the type of the camera, this question is nearly impossible to answer.
If it is a scientific camera, chances are good that it adheres to the IEEE 1394 (aka IIDC or DCAM) standard. I have personally worked with such a camera made by Hamamatsu using this library to interface with the camera.
In my case the camera output was just raw data. The camera itself was monochrome and each pixel had a depth resolution of 12 bits. Therefore, each pixel intensity was stored as a 16-bit unsigned value in the result array. The size of the array was simply width * height * 2 bytes, where width and height are the image dimensions in pixels and the factor 2 is for 16 bits per pixel. The width and height were known a priori from the chosen camera mode.
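For a layout like that, turning the byte array into pixel values could be sketched as follows. The function name unpack_16bit_pixels is made up here, and the byte order is an assumption: little-endian is the default because that is what most PC hosts use, but the camera may differ, so it is a parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interprets a raw byte buffer as width * height unsigned 16-bit samples.
// Hypothetical helper; the byte order of a real camera must be checked.
std::vector<std::uint16_t> unpack_16bit_pixels(const std::vector<unsigned char>& raw,
                                               std::size_t width, std::size_t height,
                                               bool little_endian = true) {
    const std::size_t count = width * height;
    assert(raw.size() >= count * 2);  // 2 bytes per pixel
    std::vector<std::uint16_t> pixels(count);
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned first = raw[2 * i];
        const unsigned second = raw[2 * i + 1];
        pixels[i] = static_cast<std::uint16_t>(
            little_endian ? (second << 8) | first : (first << 8) | second);
    }
    return pixels;
}
```

If the values of a 12-bit camera come out larger than 4095, the byte order is probably the other one, or the samples are left-aligned within the 16 bits.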
If you have the dimensions of the result image, try to dump your byte array into a file and load the result either in Python or Matlab and just try to visualize the content. Another possibility is to load this raw file with an image editor such as ImageJ and hope to get anything out from it.
Good luck!
I hope the solution to this question helps you: https://stackoverflow.com/a/3340944/291372
Actually you've got an array of pixels (assume 1 byte per pixel if your camera captures in 8-bit). What you need is just to determine the width and height. After that you can try to restore a bitmap image from your byte array.
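Once you have a guess for the width and height, a quick way to look at the data is to dump it as a binary PGM, which most image viewers open. This is a swapped-in shortcut rather than a BMP writer, and it assumes 1 byte per pixel grayscale. The function name write_binary_pgm is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>   // std::ostringstream, handy for trying the writer out
#include <vector>

// Writes width * height 8-bit samples as a binary (P5) PGM image.
void write_binary_pgm(std::ostream& out, const std::vector<unsigned char>& pixels,
                      std::size_t width, std::size_t height) {
    assert(pixels.size() >= width * height);
    out << "P5\n" << width << ' ' << height << "\n255\n";
    out.write(reinterpret_cast<const char*>(pixels.data()),
              static_cast<std::streamsize>(width * height));
}
```

When writing to a file, open the std::ofstream with std::ios::binary so that no byte gets translated. If the picture looks sheared or striped, the guessed width is likely wrong.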

C/C++ LIBTIFF: Need to read pixel location of white and black pixels from BW TIFF files

I am fairly new to image processing using Visual C++. I am looking for a way to read black and white TIFF files and write the image as an array of hex values representing 0 or 1, then get the location information of either 0 (black) or 1 (white).
After a bit of research on Google and an article here https://www.ibm.com/developerworks/linux/library/l-libtiff/#resources I have the following basic questions. Please do point me to relevant articles; there are so many that I couldn't wrap my head around them. Meanwhile I will keep on searching and reading.
Is it possible to extract pixel location information from TIFF using LIBTIFF?
Again, being new to all image formats, I can't help thinking that an image is made up of a 2D array of hex values. Is the bitmap format more appropriate? Thinking that way makes me wonder if I can just write the location of “white” or “black”, mapped from its height, into a 2D array. I want the location of either “black” or “white”; is that possible? I don't really want an array of 0s and 1s that represents the image, and I'm not sure I even need one.
TIFF Example of a 4 by 2 BW image:
black white
white black
black black
white white
Output to something like ptLocArray[imageHeight][imageWidth] under a function named mappingNonPrint()
2 1
4 4
or under a function named mappingPrint()
1 2
3 3
With LIBTIFF, in what direction/ways does it read TIFF files?
Ideally I would like the TIFF file to be read vertically from top to bottom, then shift to the next column and start again from top to bottom. But my gut tells me that's a fairy tale. Being very new to image processing, I'd like to know how a TIFF is read, for example a single-strip one.
Additional info
I think I should add why I want the pixel location. I am trying to make a cylindrical roll printer, using an optical rotary encoder to provide location feedback. The height of the document represents the circumference of the roll surface, and the width of the document represents the number of revolutions.
The following is the grossly untested logic of my ENC_reg(). It is pretty much unrelated to the image processing part, but it may help readers see what I am trying to do with the processed array of a TIFF image. i is indexed from 0 to imageHeight; however, each element may be an arbitrary number ranging from 0 to imageHeight (it could be 112, 354, etc.) that corresponds to a location containing black in the TIFF image. j is indexed from 0 to imageWidth, and each element there also ranges from 0 to imageWidth. For example, ptLocArray[1][2] means the first location that has a black pixel in column 2; the value stored there could be 231, i.e. the 231st pixel counting from the top of column 2.
I realized that array[imageHeight][] should instead be array[maxnum_blackPixelPerColumn], but I don't know how to count the number of 1s or 0s per column, because I don't know how to convert the 1s and 0s in the right order, as I mentioned earlier.
void ENC_reg()
{
    if (inputPort_ENC_reg == true) // true if received an encoder signal
    {
        ENC_regTemp++; // increment ENC register temp value by one each time the port registers a signal
        if (ENC_regTemp == ptLocArray[i][j])
        {
            Rollprint(); // print it
            i++; // increment by one to the next height location that indicates a "black", meaning print
            if (ENC_regTemp == imageHeight)
            {
                // Check if end of column reached, end of a revolution
                j++; // jump to next column of the image (starting next revolution)
                i = 0; // reset i index to 0, to the top of the next column
                ENC_regTemp = 0; // reset encoder register temp to 0 for the new revolution
            }
        }
        ENC_reg(); // recall itself
    }
}
As I read your example, you want two arrays for each column of the image, one to contain the numbers of rows with a black pixel in that column, the other with indices of rows with a white pixel. Right so far?
This is certainly possible, but would require huge amounts of memory. Uncompressed, a single pixel of a grayscale image takes one byte of memory, and with bit packing you can even cram 8 black-and-white pixels into a byte. On the other hand, unless you restrict yourself to images with no more than 255 rows, you'll need multiple bytes to represent each row index. And you'd need extra logic to mark the end of each column.
My point is: try working with the “array of 0s and 1s”, as this should be easier to accomplish, less demanding in terms of memory, and more efficient to run.
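If the per-column row lists are still needed at some point, they can be derived from the 0/1 array on demand. The sketch below assumes the image has already been decoded into a row-major array with one byte per pixel (0 for black, 1 for white, as in the question). The function name rows_matching_per_column is made up, and no LIBTIFF calls are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each column, lists the 1-based row numbers whose pixel equals target.
// The bitmap is row-major: pixel (row, col) is bitmap[row * width + col].
std::vector<std::vector<std::size_t>>
rows_matching_per_column(const std::vector<std::uint8_t>& bitmap,
                         std::size_t width, std::size_t height,
                         std::uint8_t target) {
    assert(bitmap.size() == width * height);
    std::vector<std::vector<std::size_t>> result(width);
    for (std::size_t col = 0; col < width; ++col) {
        // Walk each column from top to bottom, as the printer would.
        for (std::size_t row = 0; row < height; ++row) {
            if (bitmap[row * width + col] == target) {
                result[col].push_back(row + 1);
            }
        }
    }
    return result;
}
```

For the 4 by 2 example in the question, asking for black (0) yields rows 1 and 3 for the first column and rows 2 and 3 for the second, which matches the mappingPrint() output. Each inner vector has its own length, so no end-of-column marker or fixed maximum is needed.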