I have a really weird error.
I'm trying to read a PGM image by loading its pixel values into an array. I can correctly read its version, height, width, and maximum possible pixel value. However, when I start reading the pixel values, I always get 0. (I know the values are not zero because I can read the file using imread in MATLAB, but I have to implement this in C++, and I can't use the OpenCV library.)
Besides, when I open the PGM file in an editor like Notepad++, the first few lines are readable and show the information about the image; however, the actual pixel values are not readable. Do I need some sort of parsing to read a PGM image? Its version is P5.
Thanks!
You must have an assignment to solve, as there is no sane reason to implement a PGM reader otherwise.
There are two different PGM formats: ASCII and binary. You seem to expect an ASCII PGM but the one you have is binary.
Have a look at the specs: http://netpbm.sourceforge.net/doc/pgm.html
It says:
1. A "magic number" for identifying the file type. A pgm image's
magic number is the two characters "P5".
[…]
9. A raster of Height rows, in order from top to bottom. Each row
consists of Width gray values, in order from left to right. Each gray
value is a number from 0 through Maxval, with 0 being black and Maxval
being white. Each gray value is represented in pure binary by either
1 or 2 bytes. If the Maxval is less than 256, it is 1 byte.
Otherwise, it is 2 bytes. The most significant byte is first.
The format you are expecting is described further down below as the Plain PGM format. Its magic number is "P2".
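As a rough illustration of the binary layout quoted above, here is a minimal sketch of a P5 parser, written in Python for brevity; the same steps carry over to C++ as long as the file is opened in binary mode. The function name read_pgm_p5 is hypothetical, and the sketch assumes that header comments only appear between tokens.

```python
def read_pgm_p5(data: bytes):
    """Parse a binary (P5) PGM held in memory.

    Returns (width, height, maxval, rows), where rows is a list of lists
    of integer gray values. Simplifying assumption: '#' comments only
    appear between header tokens, never inside one.
    """
    pos = 0
    tokens = []
    # The header has four whitespace-separated tokens:
    # magic number, width, height, maxval.
    while len(tokens) < 4:
        # Skip whitespace and '#' comments before the next token.
        while True:
            if pos >= len(data):
                raise ValueError("truncated header")
            c = data[pos:pos + 1]
            if c.isspace():
                pos += 1
            elif c == b"#":
                while pos < len(data) and data[pos:pos + 1] not in (b"\n", b"\r"):
                    pos += 1
            else:
                break
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])

    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM (expected magic number P5)")
    width, height, maxval = (int(t) for t in tokens[1:])

    # Exactly one whitespace byte separates maxval from the raster.
    pos += 1
    bytes_per_value = 1 if maxval < 256 else 2
    expected = width * height * bytes_per_value
    raster = data[pos:pos + expected]
    if len(raster) != expected:
        raise ValueError("truncated raster")

    if bytes_per_value == 1:
        values = list(raster)  # each byte is already a gray value
    else:
        # Two bytes per value, most significant byte first.
        values = [int.from_bytes(raster[i:i + 2], "big")
                  for i in range(0, expected, 2)]

    rows = [values[r * width:(r + 1) * width] for r in range(height)]
    return width, height, maxval, rows
```

To use it on a file, read the whole file with open(path, "rb").read() and pass the bytes in. The key point is that the raster is read as raw bytes rather than as text numbers.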
Is there a way to get the int value of a pixel returned by CImg? I'm building a basic ASCII art program that converts JPGs to character arrays. I have the entire utility built out, but I cannot find a way to convert the unsigned chars into the range of ints I need (0-255, although the specifics don't matter so long as it's a predictable interval).
Does anyone have any idea how to get a numerical pixel value from a JPG? (Library suggestions or anything else are completely welcome.)
Here is the pixel output:
\�_b��}�HaX�gNzԴ�����p��-�u�����lqu��Lߐ_"T������{�y�sricX[[TXgZ]`a~�t91960d�BpvJ0kY#uR!BpMWb\W?j"#���dCy2+4?ڽ�TT<Tght%P%y;mhͬ�����8#1�H��)����:4lu���CY|��u&<_��ī��������������ȿF�����LP:����N���-�Q�+�2;E3(�SdRO6��NI16j{#�0((
: pixel data
It's already been converted to black and white, so even accessing the numerical value of one color channel of the CImg image would be fine. I just can't seem to get any kind of intelligible or manipulable output from the image, even though the image itself is exactly what I'm looking for.
Cast it to an int using (int)img(x,y) and ignore the extra channels. The values are already in the range 0-255; they only look like garbage because an unsigned char is printed as a character rather than as a number.
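To make the idea concrete, here is a small sketch in Python (not CImg code) that treats raw bytes as integers and maps each gray value onto a character ramp. The names RAMP and to_ascii and the particular ramp string are assumptions chosen for illustration.

```python
# Hypothetical character ramp, ordered from dark to light.
RAMP = "@%#*+=-:. "

def to_ascii(gray_rows, ramp=RAMP):
    """Map rows of gray values in 0..255 to lines of ramp characters."""
    lines = []
    for row in gray_rows:
        # Integer arithmetic keeps the index within 0 .. len(ramp) - 1.
        lines.append("".join(ramp[int(v) * len(ramp) // 256] for v in row))
    return lines

# Raw bytes look like garbage when printed as text, but each byte
# is simply a number between 0 and 255.
raw = bytes([92, 95, 98, 200])
values = list(raw)
```

Here values holds plain integers, which is the same thing the (int) cast gives you per pixel in C++.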
I am trying to convert some bitmap files into custom images (exr, pfm, whatever), and after that, back to bitmap:
CImg<float> image(_T("D:\\Temp\\test.bmp"));
image.normalize(0.0, 1.0);
image.save_exr(_T("D:\\Temp\\test.exr"));
This works fine: both the .exr and the .pfm files are OK.
But when I try to convert the .exr or .pfm file back to a bitmap:
CImg<float> image;
image.load_exr(_T("D:\\Temp\\test.exr")); // image.load_pfm(_T("D:\\Tempx\\test.pfm"));
image.save_bmp(_T("D:\\Temp\\test2.bmp"));
the result, test2.bmp, is completely black. Why? What am I doing wrong?
Some image formats support saving as float, but most formats save as unsigned 8 bit integer (or uint8), meaning normal image values are from 0 to 255. If you try to save an array that is made up of floats from 0 to 1 into a format that does not support floats, your values will most likely be converted to integers. When you display your image with most image-viewing software, it'll appear entirely black since 0 is black and 1 is almost black.
Most likely, when you save your image to a bitmap, the values are converted to uint8 without being rescaled. You can fix this by multiplying the normalized values (between 0 and 1) by 255 before saving: per pixel, v = int(v * 255), or for a NumPy array, img = (img * 255).astype(np.uint8). In CImg, rescaling with image.normalize(0, 255) before save_bmp should have the same effect.
It is also possible that somehow your save function is able to preserve floating point values in the bitmap format. However your image viewing software might not know how to view/display a float image. Perhaps use some imshow function (matplotlib.pyplot can easily display floating point grayscale arrays) between each line of code to check if the arrays are consistent with what you expect them to be.
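The difference between truncating and rescaling can be sketched in a few lines of plain Python. The helper name to_uint8 is hypothetical, and the rounding and clamping policy is an assumption; a given library may round differently.

```python
def to_uint8(values, lo=0.0, hi=1.0):
    """Rescale floats from [lo, hi] to integers in 0..255, with clamping."""
    scale = 255.0 / (hi - lo)
    out = []
    for v in values:
        q = int(round((v - lo) * scale))
        out.append(min(255, max(0, q)))  # clamp out-of-range inputs
    return out

normalized = [0.0, 0.25, 1.0]

# Plain truncation: everything collapses to 0 or 1, i.e. a black image.
truncated = [int(v) for v in normalized]

# Rescaling first spreads the values over the full 8-bit range.
rescaled = to_uint8(normalized)
```

With these inputs, truncated contains only 0 and 1, while rescaled covers the range from 0 to 255.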
I have been calculating the uncompressed and compressed file sizes of an image. The compressed image has always come out smaller than the uncompressed one, as I would expect. However, if an image contains a large number of different colours, storing the palette takes up a significant amount of space, and more bits are needed to store each code. So my question is: could this compression method result in a larger file than the uncompressed RGB image? What is the size (in pixels) of the smallest square RGB image, containing a total of k different colours, for which this compression method is still useful? That is, for a given value of k, we want to find the smallest integer n for which an image of size n×n takes up less storage space after compression than the original RGB image.
Let's begin by making a small simplification -- the size of the encoded output depends on the number of pixels (the actual proportion of width vs. height doesn't really matter). Hence, let's generalize the problem to number of pixels N, from which we can always calculate n by taking a square root.
To further simplify the problem, we will also ignore the overhead of any image headers/metadata, such as width, height, size of the palette, etc. In practice, this would generally be some relatively small constant.
Problem Statement
Given that we have
N representing the number of pixels in an image
k representing the number of distinct colours in an image
24 bits per pixel RGB encoding
$L_{RGB}$ representing the length of an RGB image, in bits
$L_P$ representing the length of a palette image, in bits
our goal is to solve the following inequality
$$L_P < L_{RGB}$$
in terms of N.
Size of RGB Image
An RGB image is just an array of N pixels, each pixel taking up a fixed number of bits given by the RGB encoding. Hence,
$$L_{RGB} = 24N$$
Size of Palette Image
A palette image consists of two parts: a palette, and the pixels.
A palette is an array of k colours, each colour taking up a fixed number of bits given by the RGB encoding. Therefore, the size of the palette in bits is
$$24k$$
In this case, each pixel holds an index to a palette entry, rather than an actual RGB colour. The number of bits required to represent k values is
$$\log_2 k$$
However, unless we can encode fractional bits (which I consider outside the scope of this question), we need to round this up. Therefore, the number of bits required to encode a palette index is
$$\lceil \log_2 k \rceil$$
Since there are N such palette indices, the size of the pixel data is
$$N \lceil \log_2 k \rceil$$
and the total size of the palette image is
$$L_P = 24k + N \lceil \log_2 k \rceil$$
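Solving the Inequality
$$24k + N \lceil \log_2 k \rceil < 24N$$
$$24k < N \left(24 - \lceil \log_2 k \rceil\right)$$
And finally, provided that $\lceil \log_2 k \rceil < 24$,
$$N > \frac{24k}{24 - \lceil \log_2 k \rceil}$$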
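In Python, we could express this in the following way:
import math

def limit_size(k):
    # Pixel count above which the palette image is smaller; needs ceil(log2 k) < 24.
    # math.log2 is usually more accurate than math.log(k, 2).
    return (k * 24.) / (24. - math.ceil(math.log2(k)))

def size_rgb(N):
    # Size of the RGB image in bits.
    return (N * 24.)

def size_pal(N, k):
    # Size of the palette image in bits: pixel indices plus the palette itself.
    return (N * math.ceil(math.log2(k))) + (k * 24.)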
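To get from the pixel count N back to the side length n that the question asks for, one possible sketch is shown below. The function names index_bits and smallest_square_side are hypothetical. Like the derivation, it ignores header overhead, and it uses integer arithmetic to avoid floating-point rounding.

```python
import math

def index_bits(k):
    """Whole bits needed for a palette index with k colours (ceil(log2 k))."""
    return (k - 1).bit_length()

def smallest_square_side(k):
    """Smallest n such that an n x n palette image beats 24-bit RGB."""
    bits = index_bits(k)
    if bits >= 24:
        raise ValueError("palette indices would be as large as RGB pixels")
    # Smallest integer N with N > 24k / (24 - bits).
    n_pixels = (24 * k) // (24 - bits) + 1
    # Smallest n with n * n >= n_pixels.
    return math.isqrt(n_pixels - 1) + 1

def size_rgb_bits(n_pixels):
    return 24 * n_pixels

def size_pal_bits(n_pixels, k):
    return n_pixels * index_bits(k) + 24 * k
```

For example, with k = 256 the bound is N > 384, so a 19×19 image (361 pixels) is still larger as a palette image, while a 20×20 image (400 pixels) is smaller. Since the bound is never below k, an image of that size can always hold k distinct colours.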
In general no, but your question is not precise.
If we compress ordinary files, the result can be larger. E.g. if you compress a randomly generated sequence of bytes, there is not much to compress, so you still pay for the header of the compression program, which says which compression method is used along with some versioning, and possibly for some escaping. This enlarges the file. A good compression program will see that compression does not shrink the data, so it should just not compress and say in the header that the content is stored as-is. Possibly this is done region by region.
But your question is about images. There, compression is done inside the file, and often not on the whole file but just on the image bits. In this case the program will see that there is nothing to gain from compressing and will keep the data uncompressed. Because the image headers are always present anyway, this only changes a flag, so the size does not increase.
But this can also depend on the file format. You wrote about a "palette", but palettes are not used much nowadays: compression is done by finding similar patterns in the file. Again, this depends on the image format. If you look up a particular file format on Wikipedia, you may see a table with the header parameters (e.g. bit depth or number of colours (palette), definitions of colours, and the methods used to compress).
Then, for palette-like images, the answer of Dan Mašek (https://stackoverflow.com/a/58683948/2758823) has a nice mathematical explanation, but one should not forget that compression relies heavily on heuristics and on tests with real examples: real images have patterns.
I came across this code:
image.convertTo(temp_image,CV_16SC3);
I saw the description of the convertTo() function from here, but what confuses me is image. How can we read the above code? What would be the relation between image and temp_image?
Thanks.
The other answers here are correct, but lack some details. Let me try.
image.convertTo(temp_image,CV_16SC3);
You have a source image image, and a destination image temp_image. You didn't specify the type of image, but it is probably CV_8UC3 or CV_32FC3, i.e. a 3-channel image (since convertTo doesn't change the number of channels), where each channel has a depth of 8 bits (unsigned char, CV_8UC3) or 32 bits (float, CV_32FC3).
This line of code will change the depth of each channel, so that temp_image has each channel of depth 16 bit (short). Specifically it's a signed short, since the type specifier has the S: CV_16SC3.
Note that if you are narrowing down the depth, as in the case from float to signed short, then saturate_cast will make sure that all the values in temp_image will be in the correct range, i.e. in [–32768, 32767] for signed short.
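What saturation means for a single value can be sketched in plain Python. This is not OpenCV code: the helper saturate_int16 is hypothetical, and its rounding behaviour is an assumption that may differ in detail from saturate_cast.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def saturate_int16(x):
    """Round to the nearest integer, then clamp into the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, round(x)))

# A float image may contain values outside the range of a signed short.
float_pixels = [12.4, 40000.7, -1e6]
short_pixels = [saturate_int16(v) for v in float_pixels]
```

Values inside the range are only rounded, while values outside it are pinned to the nearest end of the range instead of wrapping around.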
Why would you need to change the depth of an image?
Some OpenCV functions require input images with a specific depth.
You need a matrix to contain a different range of values. E.g. if you need to sum (or subtract) some CV_8UC3 images (typically BGR images), you'd better store the result in a CV_16SC3 matrix, or you'll probably get wrong results due to saturation, since the range of CV_8U values is [0, 255].
You read images with imread, or want to store images with imwrite, at 16-bit depth. These are usually used (AFAIK) in medical or graphics applications to allow a wider range of colors. However, most monitors do not support 16-bit image visualization.
There may be other cases, let me know if I miss the one important to you.
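The second case in the list above can be illustrated with a short sketch in plain Python (not OpenCV): subtracting two 8-bit images and storing the result with 8-bit saturation loses the sign, while a signed 16-bit result keeps it. The helper clamp and the sample values are made up for illustration.

```python
def clamp(v, lo, hi):
    """Pin v into the closed range [lo, hi]."""
    return max(lo, min(hi, v))

# Two rows of 8-bit gray values (hypothetical sample data).
a = [10, 200, 255]
b = [200, 10, 255]

# Result stored as unsigned 8-bit: negative differences saturate to 0.
diff_u8 = [clamp(x - y, 0, 255) for x, y in zip(a, b)]

# Result stored as signed 16-bit: the true differences fit in the range.
diff_s16 = [clamp(x - y, -32768, 32767) for x, y in zip(a, b)]
```

The first pixel shows the problem: 10 - 200 becomes 0 in the 8-bit result but stays -190 in the 16-bit one.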
An image is a matrix of pixel information (i.e. a 1080p image will be a 1,920 × 1,080 matrix where each entry contains the RGB values for that pixel). All you are doing is reformatting that matrix (each pixel entry, iteratively) into a new type (CV_16SC3) so it can be read by different programs.
The temp_image is a new matrix of pixel information based off of image formatted into CV_16SC3.
The first one is a source, the second one - destination. So, it takes image, converts it into type CV_16SC3 and stores in temp_image.
I have a pixel array containing values from 0 to 255.
I have passed it to my C++ function.
I want to save this pixel array to a JPEG image file.
How do I do it with the correct encoding?
I have converted the array to a binary string
and saved it to the file with the code below, but it just saves an empty image of 4 bytes:
FILE *file = fopen("/media/internal/wallpapers/04.jpeg", "w+");
fwrite(binaryStr , 1 , sizeof(binaryStr) ,file );
fclose(file);
Thanks.
Use libjpeg. Don't try to reimplement jpeg encoding yourself, there are too many ways it can go wrong.
I think you need a JPEG library, like libjpeg.
Independent JPEG Group: http://www.ijg.org/
Info: http://en.wikipedia.org/wiki/Libjpeg
From your description it looks like you have YUV-data that you need to convert to jpeg. Correct? Imagemagick is a very powerful tool that can handle this.
From Wikipedia's entry on YUV:
Y' values are conventionally shifted and scaled to the range [16, 235] rather than using the full range of [0, 255]. This confusing practice derives from the MPEG standards and explains why 16 is added to Y' and why the Y' coefficients in the basic transform sum to 220 instead of 255. U and V values, which may be positive or negative, are summed with 128 to make them always positive
I.e. 0-255 is not a valid range for YUV-data
It seems that sizeof(binaryStr) is 4, i.e. the size of a pointer. You need to pass the length of the data that binaryStr points to, not sizeof(pointer). It is also simpler to use something ready-made like libjpeg.