Work with binary (1 bit per pixel) images in C++

The problem is as follows. I need to work with very large binary images (100000x100000 pixels). Initially I did this with Qt's QImage class, which supports the Format_Mono format that stores an image at 1 bit per pixel. In general everything was fine, until it turned out that QPainter's rasterizer is limited: it cannot draw on images larger than 32767x32767 pixels, it simply cuts them off.
So I was not able to combine images larger than 32767x32767. Then I began looking at other libraries. OpenCV, as I understand it, does not support this format. As for ImageMagick, it supports constructing an image as one bit per pixel and saving it in that format; however, while you work with the image it is still stored as 8 bits per pixel, and hence I run out of RAM. Then I decided to try CImg, but as I understand it, it doesn't support a 1 bpp format either:
"the overall size of the used memory for one instance image (in bytes) is then 'width x height x depth x dim x sizeof(T)'"
where sizeof(T), of course, cannot be less than sizeof(char)...
I was curious how QImage implements its Format_Mono format internally, but honestly, I got tangled up in the source code.
So, my question is: is there a library for creating and working with binary images where the images really are stored at 1 bit per pixel in RAM?

libvips can process huge 1 bit images. It unpacks them to one pixel per byte for processing, but it only keeps the part of the image currently being processed in memory, so you should be OK.
For example, this tiny program makes a 100,000 x 100,000 pixel black image, then pastes in all of the images from the command-line at random positions:
#!/usr/bin/env python

import sys
import random

import gi
gi.require_version('Vips', '8.0')
from gi.repository import Vips

# this makes an 8-bit, mono image of 100,000 x 100,000 pixels, each pixel zero
im = Vips.Image.black(100000, 100000)

for filename in sys.argv[2:]:
    tile = Vips.Image.new_from_file(filename, access=Vips.Access.SEQUENTIAL)
    im = im.insert(tile,
                   random.randint(0, im.width - tile.width),
                   random.randint(0, im.height - tile.height))

im.write_to_file(sys.argv[1])
I can run the program like this:
$ vipsheader wtc.tif
wtc.tif: 9372x9372 uchar, 1 band, b-w, tiffload
$ mkdir test
$ for i in {1..1000}; do cp wtc.tif test/$i.tif; done
$ time ./insert.py x.tif[bigtiff,squash,compression=ccittfax4] test/*.tif
real 1m31.034s
user 3m24.320s
sys 0m7.440s
peak mem: 6.7gb
The [] on the output filename sets image write options. Here I'm enabling fax compression and setting the squash option, which means 8-bit, one-band images are squashed down to 1 bit on write.
The peak mem result is from watching RES in top. 6.7gb is rather large unfortunately, as it's having to keep input buffers for each of the 1,000 input images.
If you use tiled 1-bit tiff, you can remove the access = option and use operators that need random access, like rotate. If you try to rotate a strip tiff, vips will have to decompress the whole image to a temporary disk file, which you probably don't want.
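A rough sketch of that (using the same gi-based Vips binding as in the example above; the exact save options, in particular "tile", are assumptions to check against your libvips version): save a tiled, fax-compressed TIFF, then reload it without the sequential hint so random-access operators such as rot are available.
import gi
gi.require_version('Vips', '8.0')
from gi.repository import Vips

# write a tiled (rather than strip) 1-bit tiff; "tile" is assumed to be
# accepted in the option string alongside squash/compression/bigtiff
im = Vips.Image.black(10000, 10000)
im.write_to_file("x.tif[tile,squash,compression=ccittfax4,bigtiff]")

# reload without access=SEQUENTIAL, so operators needing random access work
im2 = Vips.Image.new_from_file("x.tif")
im2 = im2.rot(Vips.Angle.D90)
im2.write_to_file("rotated.tif[tile,squash,compression=ccittfax4,bigtiff]")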
vips has a reasonable range of standard image processing operators, so you may be able to do what you want just glueing them together. You can add new operators in C or C++ if you want.
That example is in Python for brevity, but you can use Ruby, PHP, C, C++, Go, JavaScript or the command-line if you prefer. It comes with all Linuxes and BSDs, it's on homebrew, MacPorts and fink, and there's a Windows binary.

Related

Writing image data to the disk as fast as possible

I am writing an application that generates a huge amount of images. Each frame is 1280x800 pixels large and has 1 byte per pixel for color information (greyscale). Each of the frames must be written to disk.
Currently I simply dump the raw pixel data to a binary file on the disk. The file can then be viewed with a special viewer I also created.
This is a very unsatisfactory solution, since the images can't be viewed/processed directly. They always have to run through my custom viewer/converter.
Is there an image format I could use to write my images to disk that:
Is fast to write (no compression etc.)
Does not increase the final file's size much
Supports dumping my raw pixel buffer in there (no alignment changes etc.)
Can be read by common applications (Windows Explorer, Paint, Photoshop etc.)
I already tried to use .png, but the file generation takes much too long due to the compression.
Have a look at the binary Portable GrayMap (P5) format. It consists of an extremely simple header followed by raw image data (without any alignment requirements), and is widely supported by image viewers.
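For example, a minimal sketch of dumping one 1280x800 greyscale frame as binary PGM (the helper name and file name are placeholders): the header is just the magic number, the dimensions and the maximum grey value, followed by the raw pixel bytes.
def write_pgm(path, pixels, width, height):
    # P5 header: magic number, dimensions, max grey value, then the raw bytes
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, height))
        f.write(pixels)

# e.g. one black frame written straight from the raw pixel buffer
write_pgm("frame0001.pgm", b"\x00" * (1280 * 800), 1280, 800)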
Both BMP and TIFF can be used to save raw data. BMP has the oddity of storing the image upside down, unless the height is negative. And TIFF has plenty of encoding options. It should anyway be feasible to reverse engineer the format and use an existing file as a template into which the image data is copied. So there is no need for a library: just a header, the image data and an optional footer concatenated.
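And a sketch of the BMP route (the helper name and field choices are mine, not from the answer above): an 8-bit greyscale BMP needs a 14-byte file header, a 40-byte info header, a 256-entry greyscale palette, rows padded to a multiple of 4 bytes, and a negative height so the raw buffer can be written top-down as-is.
import struct

def write_gray8_bmp(path, pixels, width, height):
    # pixels: width*height bytes of 8-bit greyscale, top row first
    row_padded = (width + 3) & ~3                  # BMP rows are 4-byte aligned
    palette = b"".join(struct.pack("<BBBB", i, i, i, 0) for i in range(256))
    pixel_offset = 14 + 40 + len(palette)
    file_size = pixel_offset + row_padded * height
    with open(path, "wb") as f:
        # BITMAPFILEHEADER
        f.write(struct.pack("<2sIHHI", b"BM", file_size, 0, 0, pixel_offset))
        # BITMAPINFOHEADER: negative height = top-down row order, so the raw
        # buffer can be dumped without flipping it
        f.write(struct.pack("<IiiHHIIiiII", 40, width, -height, 1, 8, 0,
                            row_padded * height, 2835, 2835, 256, 0))
        f.write(palette)
        for y in range(height):
            row = pixels[y * width:(y + 1) * width]
            f.write(row + b"\x00" * (row_padded - width))

# e.g. one black 1280x800 frame
write_gray8_bmp("frame0001.bmp", b"\x00" * (1280 * 800), 1280, 800)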

How to compress sprite sheets?

I am making a game with a large number of sprite sheets in cocos2d-x. There are too many characters and effects, and each of them uses a sequence of frames. The apk file is larger than 400mb, so I have to compress those images.
In fact, each frame in a sequence differs only a little from the others. So I wonder if there is a tool that can compress a sequence of frames instead of just putting them into a sprite sheet? (Armature animation could help, but the effects cannot be treated as an armature.)
For example, suppose an effect consists of 10 png files of 1mb each. If I use TexturePacker to make them into a sprite sheet, I get a big png file of 8mb and a plist file of 100kb, so 8.1mb in total. But if I could compress them using the differences between frames, maybe I would get one 1mb png file plus 9 files of 100kb each for reconstructing the other 9 pngs during loading. That method would only require 1.9mb on disk. And if I could also convert them to pvrtc format, the memory required at runtime would be reduced as well.
By the way, I am now trying to convert .bmp to .pvr during game loading. Is there any lib for converting to pvr?
Thanks! :)
If you have lots of textures to convert to pvr, I suggest you get the PowerVR tools from www.imgtec.com. They come in GUI and CLI variants. PVRTexToolCLI did the job for me; I scripted a massive conversion job. Free to download, free to use; you must register on their site.
I just tested it, it converts many formats to pvr (bmp and png included).
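A batch job can be as simple as looping over a folder and invoking the CLI once per file. A sketch follows; the flag names (-i, -o, -f) and the "PVRTC1_4" format string are assumptions from my own setup, so check PVRTexToolCLI's help output for your version before running it over everything.
import os
import subprocess

src_dir = "textures"  # placeholder folder of source images
for name in os.listdir(src_dir):
    if name.lower().endswith((".png", ".bmp")):
        src = os.path.join(src_dir, name)
        dst = os.path.splitext(src)[0] + ".pvr"
        # -i/-o/-f are assumed flag names; verify against your PVRTexToolCLI
        subprocess.check_call(["PVRTexToolCLI", "-i", src, "-o", dst,
                               "-f", "PVRTC1_4"])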
Before you go there (the massive batch job), I suggest you experiment with some variants. PVR is (generally) fat on disk, fast to load, and equivalent to other formats in RAM: RAM requirements are essentially dictated by the number of pixels and the number of bits you encode for each pixel. You can get some interesting disk sizes with pvr, depending on the output format and the number of bits you use, but it may be lossy, and you could get visible artefacts. So experiment with a limited sample before deciding to go full bore.
The first place I would look, even before any conversion, is your animations. Since you are using TexturePacker, it can detect duplicate frames and alias N frames to a single frame on the texture. For example, my design team provides all 'walk/stance' animations with 5 pictures but 8 frames! The plist contains frame aliases for the missing textures. In all my stances, frame 8 is the same as frame 2, so the texture only contains frame 2, but the plist artificially produces a frame 8 that crops the image of frame 2.
The other place I would look is using 16-bit textures. This favours bundle size, memory requirements at runtime, and load speed. Use RGB565 for textures with no transparency, or RGBA5551 for animations, for example. Once again, try a few to make certain you get acceptable rendering.
have fun :)

Can't find logic behind png file sizes

I'm saving a large number of small png files for use in a game on a phone, so space is at a premium.
I'm trying to figure out the logic behind the file sizes so I can save things most efficiently, but even after using pngcrush the sizes are totally inconsistent.
I saved a 1x1 image and it takes 3kb. I have another 23x21 image which takes only 2kb. I have two images which are almost the same size, but one takes 6kb and the other takes 13kb. I doubled the image height and copied one image into the empty space of the other and saved that. The combined image is only 11kb!
Why is a 1x1 image larger than a 23x21 image? Why can I combine a 13kb image and a 6kb image and get an 11kb image?
Here are the images I'm talking about (there's a 1x1-pixel image between the first and second images; it's difficult to see, so I'll just give its URL: http://g42.org/temp/png/1x1.png):
http://g42.org/temp/png/hat.png
http://g42.org/temp/png/1x1.png
http://g42.org/temp/png/helmet1.png
http://g42.org/temp/png/helmet2.png
http://g42.org/temp/png/helmet1_2.png
It's not a compression thing; the problem with the 1x1 image is that it has metadata (added by Photoshop, it seems), a color profile (an iCCP chunk). If you look inside the binary, it's the data between the strings "iCCP" and "IDAT"; it can be removed, leaving a 69-byte file.
If you reopen and save the file with most image viewers (XnView, for example), or use pngcrush, you can strip that chunk. See it here: http://i.stack.imgur.com/fmOdA.png
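For illustration, a minimal sketch of stripping the iCCP chunk by hand (file names are placeholders; in practice pngcrush or a re-save is easier). It walks the PNG layout: an 8-byte signature, then chunks made of a 4-byte length, a 4-byte type, the data and a 4-byte CRC.
import struct

with open("in.png", "rb") as f:
    data = f.read()

out = data[:8]                 # keep the 8-byte PNG signature
pos = 8
while pos < len(data):
    length, = struct.unpack(">I", data[pos:pos + 4])
    ctype = data[pos + 4:pos + 8]
    chunk = data[pos:pos + 12 + length]    # length + type + data + CRC
    if ctype != b"iCCP":       # drop only the embedded colour profile
        out += chunk
    pos += 12 + length

with open("out.png", "wb") as f:
    f.write(out)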
And regarding the helmet images: besides other informational chunks (ImageReady adds some informational text, as you can see), the difference is due to different formats: the two-helmet image is a paletted image (8 bits per pixel), while the single helmet is RGB with alpha (32 bits per pixel).
PNG compression is based on the same algorithm as zlib and is highly sensitive to the data that is being compressed so you won't see a consistent relationship between image size and file size. In the case of the combined image, it is still bigger than the smaller image and given the similarity of the two halves of the image, the compressor was probably able to reuse a lot of the Huffman tree. I don't know enough about the algorithm to say for certain how it ended up smaller than the other half.
As long as you are not seeing oddities like the 1x1 image, which you seem to have figured out in the comments, I don't think this will make a lot of sense without extensive study of image compression.
There is a great utility called pngcrush
http://pmt.sourceforge.net/pngcrush/
Compressing to PNG is a rather difficult task - there are lots of assumptions and strategies to try - do we create a palette, or are we better off without it?
PNGcrush essentially bruteforces 100+ different compression strategies, while at the same time trimming useless tags and sections.
PNG has several sub-formats: 24-bit with or without alpha, 8-bit (including alpha), grayscale, etc., which use different amounts of bytes per pixel and have different "compressibility".
Plus PNG supports several compression tricks (filters and gzip settings) which affect how well image data is compressed.
On top of that PNG can contain metadata, which sometimes can be pretty large, like some embedded color profiles.
ImageAlpha converts images to the most space-efficient PNG8+alpha variant.
ImageOptim removes junk metadata and finds best compression parameters.
With a combination of those two your images can be reduced by 30-50%.

Large JPEG/PNG Image Sequence Looping

I have been working on a project involving remotely sensed image processing and image sequence looping. Each resulting image (in JPEG or PNG format) is approximately 8000 * 4000 pixels. Our users usually want to loop an image sequence (more than 50 images) over a region of interest. So I have to extract the required viewing area from each image according to the user's client size. For example, if the user's current client view is 640 * 480, I have to find a 640 * 480 block of data in each original image based on the current x (column) and y (row) coordinates, and remap it to the client view. When the user pans to another viewing area by dragging the mouse, the program must re-load the corresponding region from each original image as quickly as possible.
I know that neither the JPEG library nor the PNG library has built-in routines for reading a rectangular block of data, such as:
long ReadRectangle (long x0, long y0, long x1, long y1, char* RectData);
long ReadInaRectangle (long x0, long y0, short width, short height, char* RectData);
The built-in JPEG decompressor lacks this kind of functionality. I know that the JPEG2000 format has provisions for decompressing a specific region of the image, but I'm not entirely sure about plain JPEG.
Someone suggested that I use CreateFileMapping, MapViewOfFile, and CreateDIBSection to map a view of part of the file into memory. Unlike simple flat binary image formats such as *.raw, *.img, and *.bmp, a JPEG blob contains not only the image data but also the complicated JPEG header, so it's not easy to map a block of data out of a JPEG file as a view.
Someone recommended that I use image tiling or image pyramid techniques to generate sub-images, just like many popular image visualization applications (Google Earth, etc.) and GIS applications (WebGIS, etc.) do.
How can I solve this problem?
Thanks for your help.
Golden Lee
If you're OK with region co-ordinates being multiples of 8, the JPEG library from ijg may be able to help you load partial JPEG images.
You'd want to:
Get all DCT coefficients for the entire image. Here's an example of how to do this. Yes, this will involve entropy decoding of the entire image, but this is the less expensive step of JPEG decoding (IDCT is the most expensive one, and we're avoiding it).
Throw away the blocks that you don't need (each block consists of 8x8 coefficients). You'll have to do this by hand, but since the layout is quite simple (the blocks are in scanline order) it shouldn't be that hard.
Apply the block inverse DCT to each of the remaining blocks. You can probably get IJG to do that for you. If you can't, then you'll have to do your own IDCT and shift the values back to [0, 255], because intensities are in [-128, 127] in the world of JPEG.
If all goes well, you'll get your decoded JPEG image. Because of chroma subsampling, the luma and chroma channels may be of different dimensions, and you will have to compensate for this yourself by scaling.
The first two steps are pretty much covered by the links. The fourth one is quite trivial (you can get the type of chroma subsampling using the IJG interface, and scaling -- essentially upsampling -- is easily achieved by using something like OpenCV or rolling your own code). The third one is something I haven't tried yet, but it sounds like it would be possible.
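For the scaling in step four, a small sketch with OpenCV's Python bindings (the plane shapes below are made-up placeholders for a 4:2:0 image; the real ones come from the IJG interface):
import numpy as np
import cv2

y = np.zeros((480, 640), dtype=np.uint8)    # full-resolution luma
cb = np.zeros((240, 320), dtype=np.uint8)   # half-resolution chroma
cr = np.zeros((240, 320), dtype=np.uint8)

# cv2.resize takes dsize as (width, height); upsample chroma to the luma size
h, w = y.shape
cb_full = cv2.resize(cb, (w, h), interpolation=cv2.INTER_LINEAR)
cr_full = cv2.resize(cr, (w, h), interpolation=cv2.INTER_LINEAR)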
It's easy with the gd library. LibGD is an open source code library for the dynamic creation of images on the fly by programmers.

Low memory image resizing

I am looking for some advice on how to construct a very low memory image resizing program that will be run as a child process of my nodejs application in linux.
The solution I am looking for is a Linux executable that takes a base64-encoded image (uploaded from a client) on stdin, resizes the photo to a specified size, and then pumps the resulting image data back out through stdout.
I've looked into image magick and it might be what I end up using, but I figured I would ask and see if anyone had a suggestion.
Suggestions of libraries or examples of precompiled executables in C/C++ would be greatly appreciated. A helpful answer would also include general strategies for low memory image resizing.
Thank you
Depending on the image formats you want to support, it's almost surely possible to perform incremental decoding and scaling by decoding only a few lines at a time and discarding the data once you write the output. However it may require writing your own code or adapting an existing decoder library to support this kind of operation.
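To make the idea concrete, here is a minimal sketch of that incremental approach for the trivially simple case of an uncompressed 8-bit greyscale dump (the function name, file names and integer reduction factor are placeholders); a real JPEG or PNG version would do the same thing around the decoder's scanline API, reading a few rows, reducing them, and discarding them before reading more.
def shrink_raw(src_path, dst_path, width, height, factor):
    # read 'factor' input rows at a time, box-average each factor x factor
    # block down to one output pixel, write the reduced row, and never hold
    # more than one strip of the image in memory
    out_w = width // factor
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for _ in range(height // factor):
            strip = src.read(width * factor)
            row = bytearray(out_w)
            for ox in range(out_w):
                s = 0
                for dy in range(factor):
                    base = dy * width + ox * factor
                    s += sum(strip[base:base + factor])
                row[ox] = s // (factor * factor)
            dst.write(bytes(row))

# e.g. reduce an 8000x4000 raw greyscale dump by 4 in each dimension
shrink_raw("in.raw", "out.raw", 8000, 4000, 4)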
It's also worth noting that downsizing giant jpegs can be performed efficiently by simply skipping the high-frequency coefficients and using a smaller IDCT. For example, to decode at half width and half height, discard all but the upper-left quadrant of the coefficients (horizontal and vertical frequency < 4) and use a 4x4 IDCT on them instead of the usual 8x8. Both the libjpeg decoder and the libavcodec decoder support this operation for power-of-2 scalings (1/2, 1/4, or 1/8). This type of approach might make incremental decoding/scaling unnecessary.
You can try it out with djpeg -scale 1/4 < src.jpg | cjpeg > dest.jpg. If you want a fixed output size, you'll probably first scale by whichever of 1/2, 1/4, or 1/8 puts you closest to the desired size without going too low, then perform interpolation for the final step, e.g. djpeg -scale 1/4 < src.jpg | convert pnm:- -scale 640x480 dest.jpg.
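The same trick is reachable from PIL, if that's more convenient than shelling out to djpeg: Image.draft() asks the JPEG decoder for a power-of-2 reduction before any pixels are decoded, staying at or above the requested size, and resize() then interpolates the final step (a sketch; file names are placeholders).
from PIL import Image

im = Image.open("src.jpg")      # reads the header only, nothing decoded yet
im.draft("RGB", (640, 480))     # decoder picks e.g. 1/4 scale if that fits
im = im.resize((640, 480))
im.save("dest.jpg")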
When working on very large images, such as 0.25 GPix and larger, ImageMagick uses ~2 GB ram, even when using djpeg to decode the JPEG image first.
This command chain will resize JPEG images of about any size using only ~3 MB ram:
djpeg my-large.jpg | pnmscale -xysize 16000 16000 | cjpeg > scaled-large.jpg
GraphicsMagick is generally a better version of ImageMagick; I'd take a look at that. If you really need something fast, you probably want to drop down to something like libjpeg - while you say you want non-blocking IO, the operation you want to do is relatively CPU-bound (i.e. decoding the image, then resizing it).
If anything, this is just a sample following what he described:
import sys
import binascii
import cStringIO
from PIL import Image

# first line of stdin: target width and height; the rest: base64 image data
x, y = sys.stdin.readline().strip().split(' ')
x, y = int(x), int(y)

img = Image.open(cStringIO.StringIO(binascii.a2b_base64(sys.stdin.read())))
img = img.resize((x, y))
img.save(sys.stdout, format="PNG")
Since it has to read the input, decode it, resize it, re-encode it, and write it out, there is no way to reduce the memory used to less than the size of the input image.
Nothing can beat Intel Integrated Performance Primitives in terms of performance. If you can afford it, I strongly recommend using it.
Otherwise just implement your own resizing routine. Lanczos gives quite good results, although it won't be tremendously fast.
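If you do roll your own, a minimal sketch of 1-D Lanczos-3 resampling could look like the following (run it over every row, then over every column of the result, for a separable 2-D resize; the kernel support is widened by the scale factor when shrinking, as most resizers do):
import math

def lanczos_kernel(x, a=3):
    # windowed sinc: sinc(x) * sinc(x / a) for |x| < a, else 0
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def resample_row(row, new_len, a=3):
    scale = len(row) / float(new_len)
    filt = max(scale, 1.0)             # widen the kernel when shrinking
    support = a * filt
    out = []
    for i in range(new_len):
        center = (i + 0.5) * scale - 0.5
        lo = int(math.floor(center - support)) + 1
        hi = int(math.ceil(center + support)) - 1
        acc = 0.0
        norm = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((center - j) / filt, a)
            acc += w * row[min(max(j, 0), len(row) - 1)]
            norm += w
        out.append(acc / norm)
    return out

# e.g. shrink a 10-sample ramp down to 4 samples
print(resample_row([float(v) for v in range(10)], 4))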
Edit: I strongly suggest that you NOT use ImageMagick or GraphicsMagick. They are both great libraries, but they are designed for a completely different purpose - handling many file formats, depths, pixel formats, etc. They sacrifice performance and memory effectiveness for the things I've mentioned.
You might need this: https://github.com/zhangyuanwei/node-images
Cross-platform image decoder(png/jpeg/gif) and encoder(png/jpeg) for Nodejs