Is there a way to calculate the maximum size that any image compressed with PNG could take?
I need to know that (for example) a PNG with a resolution of 350x350 px can't be larger than "X" KB (for a constant compression quality, like 90).
The "X" value is the one I'm looking for. Or, as a math expression:
350px * 350px * (90q) < X KB
I'm not quite familiar with the PNG compression algorithm, but is there a maximum value for a specific resolution?
P.S.: the PNG has no alpha in this case.
From the PNG format specification:
The maximum case happens when the data is incompressible
(for example, if the image resolution is 1x1,
or if the image is larger but contains random incompressible data).
That would make the maximum size:
8 // PNG signature bytes
+ 25 // IHDR chunk (Image Header)
+ 12 // IDAT chunk (assuming only one IDAT chunk)
+ height // in pixels
* (
1 // filter byte for each row
+ (
width // in pixels
* 3 // Red, green, and blue color samples
* 2 // 16 bits per color sample
)
)
+ 6 // zlib compression overhead
+ 2 // deflate overhead
+ 12 // IEND chunk
Compression "quality" doesn't enter into this.
Most applications will split the IDAT data into smaller chunks, typically 8 kbytes each. For the 350x350 16-bit RGB case above the filtered data comes to 350 * (1 + 350*3*2) = 735,350 bytes, i.e. about 90 IDAT chunks, so add 89*12 bytes for the extra IDAT chunk overhead.
As a check, a 1x1 16-bit RGB image can be written as a 72-byte PNG, and a 1x1 8-bit grayscale image is 67 bytes.
If the image is interlaced, or has any ancillary chunks, or has an alpha channel, it will naturally be bigger.
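A minimal sketch of that arithmetic in C++ (the function name and the fixed 8192-byte IDAT chunk size are my own assumptions; the per-row and per-chunk constants come from the breakdown above):

#include <cstdint>

// Worst-case PNG size for a non-interlaced 16-bit RGB image with no
// ancillary chunks, assuming the IDAT data is split into 8192-byte chunks.
uint64_t maxPngSizeRgb16(uint64_t width, uint64_t height)
{
    const uint64_t rowBytes  = 1 + width * 3 * 2;          // filter byte + 3 samples * 2 bytes
    const uint64_t idatData  = height * rowBytes + 6 + 2;  // plus zlib and deflate overhead
    const uint64_t idatCount = (idatData + 8191) / 8192;   // number of 8 KB IDAT chunks
    return 8                  // PNG signature
         + 25                 // IHDR chunk
         + idatCount * 12     // per-IDAT chunk overhead (length, type, CRC)
         + idatData           // filtered pixel data + zlib/deflate overhead
         + 12;                // IEND chunk
}

// maxPngSizeRgb16(350, 350) == 736,483 bytes, i.e. roughly 720 KB.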
Related
I am implementing clipboard, and I want to allocate memory for png image one time. Is there some way to predict maximum size of png file?
A PNG image includes several things:
Signature and basic metadata (image size and type)
Palette (only if the image is indexed)
Raw pixel data
Optional metadata (ancillary chunks)
End of image chunk
Size of item 1 is fixed: 8 + 12 + 13 = 33 bytes (the signature plus the IHDR chunk, whose data part is 13 bytes)
Size of item 2 (if required) is at most 12 + 3 * 256 = 780 bytes
Size of item 5 is fixed: 12 bytes
Item 3, the raw pixel data, is usually the most important one. The filtered, uncompressed data amounts to
FUD = (W * C * BPC/8 + 1) * H bytes
where W = width in pixels, H = height in pixels, C = channels (3 if RGB, 1 if palette or grayscale, 4 if RGBA, 2 if grayscale+alpha), and BPC = bits per channel (normally 8).
That is compressed with zlib. Bounding the worst-case expansion precisely is tricky, but in practice one can assume that in the worst case the compressed stream will only be a few bytes larger than the original (zlib can fall back to stored blocks, which add only a small per-block overhead).
Then the item 3 size would be approximately bounded (again assuming a fairly small IDAT chunk size of 8192 bytes) by
(FUD + 6) * (1 + 12/8192) ≈ FUD
Item 4 (ancillary chunk data) is practically impossible to bound.
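As a rough worked example (the dimensions here are mine, chosen only for illustration): for a 1024x1024 RGBA image with 8 bits per channel, FUD = (1024 * 4 * 8/8 + 1) * 1024 = 4,196,352 bytes, so a safe allocation for the whole file would be about 33 + (4,196,352 + 6) * (1 + 12/8192) + 12 ≈ 4,202,550 bytes, i.e. a little over 4 MB.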
I was trying to load a BMP picture into memory and save the RGB array into a file (my own 3D model format with texture data). I wrote a program to convert an OBJ and its texture data into an m2d file. But when I actually loaded the file in my m2d loader, it showed green continuous lines across the picture.
I opened the BMP file in a hex editor and found pairs of 00 bytes as the culprit (they occurred many times).
Any hint on how I should take these 00s out of my RGB array?
Any hint or tip will be appreciated.
Each horizontal row in a BMP must be a multiple of 4 bytes long.
If the pixel data does not take up a multiple of 4 bytes, then 0x00 bytes are added at the end of the row. For a 24-bpp image, the number of bytes per row is (imageWidth*3 + 3) & ~3. The number of padding bytes is ((imageWidth*3 + 3) & ~3) - (imageWidth*3).
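As a minimal sketch (assuming an uncompressed 24-bpp BMP whose pixel data has already been located in the file; the function and variable names are mine), the padding bytes can be skipped like this:

#include <cstdio>
#include <cstdint>
#include <vector>

// Read the pixel data of a 24-bpp BMP row by row, dropping the 0x00
// padding bytes at the end of each row so only pixel bytes are kept.
std::vector<uint8_t> readRgbRows(std::FILE* f, int width, int height)
{
    const int rowBytes = (width * 3 + 3) & ~3;   // padded row length, multiple of 4
    const int padding  = rowBytes - width * 3;   // 0..3 bytes of 0x00 per row

    std::vector<uint8_t> rgb(static_cast<size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y) {
        std::fread(&rgb[static_cast<size_t>(y) * width * 3], 1,
                   static_cast<size_t>(width) * 3, f);    // pixel bytes only
        std::fseek(f, padding, SEEK_CUR);                 // skip the padding bytes
    }
    // Note: BMP rows are normally stored bottom-up and in BGR order;
    // that is left untouched here for brevity.
    return rgb;
}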
I have a float matrix of 1024x1024 and I want to keep the sign of this matrix inside a file. For this purpose, I want to keep the sign matrix as a matrix of booleans, which I fail to do.
Assume my matrix is:
2.312, 0.232, -2.132
5.754, -4.34, -3.23
-4.34, -1.23, 7.9453
My output should be
1,1,0
1,0,0
0,0,1
Since a float is 4 bytes and my matrix has 2^20 (1M) elements, the size is 4MB; a boolean is 1 bit, so for 1M elements I expect the bool matrix to be around 1Mbit = 128KB. However, when I use the threshold method in OpenCV my output file is 1MB, which means it is saved as uchar (8 bits per element).
I tried to use imwrite but it didn't work.
EDIT: I realized that I didn't mention that speed is also an important factor for my tests. I'm loading approximately 10 million 1K*1K matrices from disk.
Thanks in advance
In OpenCV you can write
Mat input;            // your float matrix (CV_32F)
Mat A = (input >= 0); // 8-bit mask: 255 where the value is non-negative, 0 otherwise
Now the problem is that OpenCV has no bitmap (1-bit) data type. So the best you can get is Mat1b (unsigned char).
If you want to save space in your storage, you need to do it on your own. For example, you can use libpng to write out a PNG file of bit depth 1. Unfortunately, imwrite does not support setting that bit depth (it can write PNGs with bit depths 8 and 16).
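If the goal is only to store the signs compactly on disk (rather than as a viewable image), here is a minimal sketch of doing the packing yourself, assuming the 8-bit mask A from above (the function name is mine):

#include <opencv2/core.hpp>
#include <vector>
#include <cstdint>

// Pack an 8-bit 0/255 mask into one bit per element (8 elements per byte).
// A 1024x1024 mask packs into 131,072 bytes (128 KB), before any compression.
std::vector<uint8_t> packMask(const cv::Mat1b& mask)
{
    std::vector<uint8_t> bits((mask.total() + 7) / 8, 0);
    size_t i = 0;
    for (int y = 0; y < mask.rows; ++y)
        for (int x = 0; x < mask.cols; ++x, ++i)
            if (mask(y, x))
                bits[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    return bits;
}

The packed buffer can then be written with ordinary file I/O and compressed afterwards if needed.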
If you want to write a compressed PNG with bit depth 8, you can use imwrite:
std::vector<int> flags;
flags.push_back(CV_IMWRITE_PNG_COMPRESSION);
flags.push_back(9); // [0-9] 9 being max compression, default is 3
cv::imwrite("output.png", A, flags);
This will result in the best compression effort. Now you can use ImageMagick to compare the file size against the same image stored with bit depth 1:
convert output.png -type Bilevel -define "png:bit-depth=1" -define "png:compression-level=9" output-1b.png
I tested with a random example image (see below).
8 bit, compressed PNG: 24,732 bytes
1 bit, compressed PNG: 20,529 bytes
8 bit, uncompressed PGM: 270,015 bytes
1 bit, uncompressed PBM: 34,211 bytes
As you can see, compressed 8-bit storage still beats uncompressed 1-bit storage in this example.
I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1-dimensional" whereas the 32-bit RGBA is "4-dimensional".
To see what I mean; lets assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B, all the others are fixed.
So I am only getting 8bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed into 2^16 colours, but using your knowledge of this special image you could design the mapping so that indices that are near each other have similar colours and are therefore easier to compress.
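A minimal sketch of the 16-bit greyscale approach with Qt (this assumes Qt 5.13 or later for QImage::Format_Grayscale16 and that the PNG plugin preserves the 16-bit depth; the linear mapping and the function name are mine):

#include <QImage>

// Map each float in [-range, +range] to a 16-bit grey level and save as PNG.
// The red-white-black colour table is then applied at display time, not here.
void saveAsGray16(const float values[1024][1024], float range)
{
    QImage img(1024, 1024, QImage::Format_Grayscale16);
    for (int y = 0; y < 1024; ++y) {
        quint16* row = reinterpret_cast<quint16*>(img.scanLine(y));
        for (int x = 0; x < 1024; ++x) {
            float t = (values[y][x] + range) / (2.0f * range);  // 0.0 .. 1.0 (clamping omitted)
            row[x] = static_cast<quint16>(t * 65535.0f + 0.5f);
        }
    }
    img.save("values.png");  // PNG keeps the full 16-bit precision
}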
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24-bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8 then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So, independent of file format, if you can use a 256-entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black]), here is a 256-entry palette that covers the red-white-black range you wanted:
float     entry#     color   rgb
------    -------    -----   --------
-range    0    00    red     FF,00,00
          1    01            FF,02,02
          2    02            FF,04,04
          ...  ...
          127  7F            FF,FD,FD
0.0       128  80    white   FF,FF,FF
          129  81            FD,FD,FD
          ...  ...
          253  FD            04,04,04
          254  FE            02,02,02
+range    255  FF    black   00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
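A minimal sketch of building a palette along those lines and mapping a float value to an index (this uses simple linear interpolation rather than the exact hand-written steps in the table above; the type and function names are mine):

#include <cstdint>
#include <algorithm>

struct Rgb { uint8_t r, g, b; };

// Entries 0..128 interpolate red -> white, entries 129..255 interpolate
// white -> black (a clean linear version of the table above).
void buildRedWhiteBlackPalette(Rgb palette[256])
{
    for (int i = 0; i <= 128; ++i) {              // red (FF,00,00) up to white (FF,FF,FF)
        uint8_t v = uint8_t((i * 255) / 128);
        palette[i] = { 0xFF, v, v };
    }
    for (int i = 129; i < 256; ++i) {             // white down to black (00,00,00)
        uint8_t v = uint8_t(((255 - i) * 255) / 127);
        palette[i] = { v, v, v };
    }
}

// Map a float in [-range, +range] to a palette index 0..255.
int toIndex(float value, float range)
{
    float t = (value + range) / (2.0f * range);   // 0.0 .. 1.0
    t = std::min(std::max(t, 0.0f), 1.0f);        // clamp out-of-range values
    return int(t * 255.0f + 0.5f);
}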
The PNG format supports paletted images up to 8 bits, but it also supports grayscale images up to 16 bits. However, 16-bit modes are less used and software support may be lacking, so you should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce better results than BMP in any case.
I have a big binary file with lots of files stored inside it. I'm trying to copy the data of a PCX image from the file and write it to a new file which I can then open in an image editor.
After obtaining the specs for the header of a PCX file, I think that I've located the image in the big binary file. My problem is that I cannot figure out how many bytes I'm supposed to read after the header. I've read about decoding PCX files, but I don't want to decode anything. I want to read the encoded image data and write it to a separate file so an image editor can open it.
Here is the header. I've included the values of the image as I guess they can be used to determine the "end-of-file" for the image data.
struct PcxHeader
{
BYTE Identifier; // PCX Id Number (Always 0x0A) // 10
BYTE Version; // Version Number // 5
BYTE Encoding; // Encoding Format // 1
BYTE BitsPerPixel; // Bits per Pixel // 8
WORD XStart; // Left of image // 0
WORD YStart; // Top of Image // 0
WORD XEnd; // Right of Image // 319
WORD YEnd; // Bottom of image // 199
WORD HorzRes; // Horizontal Resolution // 320
WORD VertRes; // Vertical Resolution // 200
BYTE Palette[48]; // 16-Color EGA Palette
BYTE Reserved1; // Reserved (Always 0)
BYTE NumBitPlanes; // Number of Bit Planes // 1
WORD BytesPerLine; // Bytes per Scan-line // 320
WORD PaletteType; // Palette Type // 0
WORD HorzScreenSize; // Horizontal Screen Size // 0
WORD VertScreenSize; // Vertical Screen Size // 0
BYTE Reserved2[54]; // Reserved (Always 0)
};
There are three components to the PCX file format:
128-byte header (not all of it is actually used, but it is always 128 bytes long)
variable-length image data
optional 256-color palette (though improper PCX files exist with palette sizes other than 256 colors)
From the Wikipedia article:
Due to the PCX compression scheme the only way to find the actual length of the image data is to read and process it. This effort is made difficult because the format allows for the compressed data to run beyond the image dimensions, often padding it to the next 8 or 16 line boundary.
In general, then, it sounds like you'll have to do a "deep process" of the image data to find the end of the complete PCX file embedded within your larger binary file.
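A minimal sketch of that scan, assuming 8-bit RLE-encoded PCX data (the standard PCX run-length scheme: a byte with the top two bits set is a repeat count, the next byte is the value; any other byte is a literal). The function name and buffer handling are mine:

#include <cstdint>
#include <cstddef>

// Walk the RLE stream starting right after the 128-byte header and return
// how many encoded bytes it occupies, by decoding lengths only (no pixels
// are stored). Expected decoded size per the header:
//   BytesPerLine * NumBitPlanes * (YEnd - YStart + 1)
size_t encodedPcxLength(const uint8_t* data, size_t available, size_t expectedDecoded)
{
    size_t in = 0, out = 0;
    while (out < expectedDecoded && in < available) {
        uint8_t b = data[in++];
        if ((b & 0xC0) == 0xC0) {       // run: low 6 bits are the repeat count
            out += (b & 0x3F);
            ++in;                       // the following byte is the pixel value
        } else {
            ++out;                      // literal byte
        }
    }
    return in;                          // bytes of encoded image data consumed
}

If a 256-color palette is present, it follows the encoded data as a single 0x0C marker byte plus 768 bytes of RGB values.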
Without knowing much about the PCX file format, I can take a best guess at this:
bytesAfterHeader = header.BytesPerLine * header.VertRes;