I am implementing clipboard, and I want to allocate memory for png image one time. Is there some way to predict maximum size of png file?
A PNG image includes several things:
Signature and basic metadata (image size and type)
Palette (only if the image is indexed)
Raw pixel data
Optional metadata (ancillary chunks)
End of image chunk
Size of item 1 is fixed: 8 + 12 + 11 = 31 bytes
Size of item 2 (if required) is at most 12 + 3 * 256 = 780 bytes
Size of item 5 is fixed: 12 bytes
Item 3, raw pixels data, is usually the most important one. The filtered-uncompressed data amounts to
FUD=(W*C*8/BPC+1)*H bytes
Where W=width in pixels, H=height in pixels, C=channels (3 if RGB, 1 if palette or grayscale, 4 if RGBA, 2 if GA), BPC=bits per channel (normally 8)
That is compressed with ZLIB. It's practically impossible to bound precisely the worst case compression rate. In practice, one might assume that in the worst case the compressed stream will have a few bytes more than the original.
Then the item 3 size would be approximately bound by (again assuming a fairly small IDAT chunk size of 8192 bytes) by
(FUD + 6)(1 + 12/8192) ~ FUD
Item 4 (ancillary chunk data) is practically impossible to bound.
Related
Is there a way to calculate the Maximum size that could take any image compressed with PNG ?
I need to know, that (for example) a PNG of a resolution of 350x350 (px), can't be larger than "X" KB. (and for a constant quality compression, like 90)
the "X" value is the one I'm looking for. Or in math expression
350px * 350px * (90q) < X KB
I'm not quite familiar with the PNG compression algorithm, but there is probably a max value for a specific resolution ?
P.S. : the PNG has no alpha is this case.
From the PNG format here:
The maximum case happens when the data is incompressible
(for example, if the image resolution is 1x1,
or if the image is larger but contains random incompressible data).
That would make the maximum size:
8 // PNG signature bytes
+ 25 // IHDR chunk (Image Header)
+ 12 // IDAT chunk (assuming only one IDAT chunk)
+ height // in pixels
* (
1 // filter byte for each row
+ (
width // in pixels
* 3 // Red, blue, green color samples
* 2 // 16 bits per color sample
)
)
+ 6 // zlib compression overhead
+ 2 // deflate overhead
+ 12 // IEND chunk
Compression "quality" doesn't enter into this.
Most applications will probably separate the IDAT chunk into smaller chunks, typically 8 kbytes each, so in the case of a 350x350 image there would be 44 IDAT chunks, so add 43*12 for IDAT chunk overhead.
As a check, a 1x1 16-bit RGB image can be written as a 72-byte PNG, and a 1x1 8-bit grayscale image is 67 bytes.
If the image is interlaced, or has any ancillary chunks, or has an alpha channel, it will naturally be bigger.
I was trying to load BMP picture into memory and save the RGB array into a file(my own format 3d model with texture data).I made the programming to convert OBJ and its texture data into a m2d file. But when I loaded the file in actual in my m2d loader it showed me green continuous lines on the picture.
I opened the BMP file in hex editor, found two 00s as culprit(occurred many times).
Any hint how should I take these 00s out of my RGB array?
Any hint or tip will be appreciated.
Each horizontal row in a BMP must be a multiple of 4 bytes long.
If the pixel data does not take up a multiple of 4 bytes, then 0x00 bytes are added at the end of the row. For a 24-bpp image, the number of bytes per row is (imageWidth*3 + 3) & ~3. The number of padding bytes is ((imageWidth*3 + 3) & ~3) - (imageWidth*3).
What are the disadvantages of always using alginment of 1?
glPixelStorei(GL_UNPACK_ALIGNMENT, 1)
glPixelStorei(GL_PACK_ALIGNMENT, 1)
Will it impact performance on modern gpus?
How can data not be 1-byte aligned?
This strongly suggests a lack of understanding of what the row alignment in pixel transfer operations means.
Image data that you pass to OpenGL is expected to be grouped into rows. Each row contains width number of pixels, with each pixel being the size as defined by the format and type parameters. So a format of GL_RGB with a type of GL_UNSIGNED_BYTE will result in a pixel that is 24-bits in size. Pixels are otherwise expected to be packed, so a row of 16 of these pixels will take up 48 bytes.
Each row is expected to be aligned on a specific value, as defined by the GL_PACK/UNPACK_ALIGNMENT. This means that the value you add to the pointer to get to the next row is: align(pixel_size * width, GL_*_ALIGNMENT). If the pixel size is 3-bytes, the width is 2, and the alignment is 1, the row byte size is 6. If the alignment is 4, the row byte size is eight.
See the problem?
Image data, which may come from some image file format as loaded with some image loader, has a row alignment. Sometimes this is 1-byte aligned, and sometimes it isn't. DDS images have an alignment specified as part of the format. In many cases, images have 4-byte row alignments; pixel sizes less than 32-bits will therefore have padding at the end of rows with certain widths. If the alignment you give OpenGL doesn't match that, then you get a malformed texture.
You set the alignment to match the image format's alignment. If you know or otherwise can ensure that your row alignment is always 1 (and that's unlikely unless you've written your own image format or DDS writer), you need to set the row alignment to be exactly what your image format uses.
Will it impact performance on modern gpus?
No, because the pixel store settings are only relevent for the transfer of data from or to the GPU, namely the alignment of your data. Once on the GPU memory it's aligned in whatever way the GPU and driver desire.
There will be no impact on performance. Setting higher alignment (in openGL) doesn't improve anything, or speeds anything up.
All alignment does is to tell openGL where to expect the next row of pixels. You should always use an alignment of 1, if your image pixels are tightly packed, i.e. if there are no gaps between where a row of bytes ends and where a new row starts.
The default alignment is 4 (i.e. openGL expects the next row of pixels to be after a jump in memory which is divisible by 4), which may cause problems in cases where you load R, RG or RGB textures which are not 4-bytes floats, or the width is not divisible by 4. If your image pixels are tightly packed you have to change alignment to 1 in order for the unpacking to work.
You could (I personally haven’t encountered them) have an image of, say, 3x3 RGB ubyte, whose rows are 4th-aligned with 3 extra bytes used as padding in the end. Which rows might look like this:
R - G - B - R - G - B - R - G - B - X - X - X (16 bytes in total)
The reason for it is that aligned data improves the performance of the processor (not sure how much it's true/justified on todays processors). IF you have any control over how the original image is composed, then maybe aligning it one way or another will improve the handling of it. But this is done PRIOR to openGL. OpenGL has no way of changing anything about this, it only cares about where to find the pixels.
So, back to the 3x3 image row above - setting the alignment to 4 would be good (and necessary) to jump over the last padding. If you set it to 1 then, it will mess your result, so you need to keep/restore it to 4. (Note that you could also use ROW_LENGTH to jump over it, as this is the parameter used when dealing with subsets of the image, in which case you sometime have to jump much more then 3 or 7 bytes (which is the max the alignment parameter of 8 can give you). In our example if you supply a row length of 4 and an alignment of 1 will also work).
Same goes for packing. You can tell openGL to align the pixels row to 1, 2, 4 and 8. If you're saving a 3x3 RGB ubyte, you should set the alignment to 1. Technically, if you want the resulting rows to be tightly packed, you should always give 1. If you want (for whatever reason) to create some padding you can give another value. Giving (in our example) a PACK_ALIGNMENT of 4, would result in creating rows that look like the row above (with the 3 extra padding in the end). Note that in that case your containing object (openCV mat, bitmap, etc.) should be able to receive that extra padding.
I'm trying to understand building a bmp based on raw data in c++ and I have a few questions.
My bmp can be black and white so I figured that the in the bit per pixel field I should go with 1. However in a lot of guides I see the padding field adds the number of bits to keep 32 bit alignment, meaning my bmp will be the same file size as a 24 bit per pixel bmp.
Is this understanding correct or in some way is the 1 bit per pixel smaller than 24, 32 etc?
Thanks
Monochrome bitmaps are aligned too, but they will not take as much space as 24/32-bpp ones.
A row of 5-pixel wide 24-bit bitmap will take 16 bytes: 5*3=15 for pixels, and 1 byte of padding.
A row of 5-pixel wide 32-bit bitmap will take 20 bytes: 5*4=20 for pixels, no need for padding.
A row of 5-pixel wide monochrome bitmap will take 4 bytes: 1 byte for pixels (it is not possible to use less than a byte, so whole byte is taken but 3 of its 8 bits are not used), and 3 bytes of padding.
So, monochrome bitmap will of course be smaller than 24-bit one.
The answer is already given above (that bitmap rows are aligned/padded to 32-bit boundary), however if you want more information, you might want to read DIBs and Their Uses, the "DIB Header" section - it explains in detail.
Every scanline is DWORD-aligned. The scanline is buffered to alignment; the buffering is not necessarily 0.
The scanlines are stored upside down, with the first scan (scan 0) in memory being the bottommost scan in the image. (See Figure 1.) This is another artifact of Presentation Manager compatibility. GDI automatically inverts the image during the Set and Get operations. Figure 1. (Embedded image showing memory and screen representations.)
If i load image such as 98x*** which is 3 bytes per pixel, it will create 2 bytes padding there to make it fit in 4 bytes sequences.
Is it possible to use IMG_Load() without generating the padded bytes in the ->pixels raw data?
At the moment i use this to detect how many bytes it has been padded:
int pad = img->pitch - (img->w * img->format->BytesPerPixel);
And if > 0 Then i rebuild new image without the padded bytes... but this is inefficient, so im hoping if theres better fix?