Low memory image resizing - c++

I am looking for some advice on how to construct a very low memory image resizing program that will be run as a child process of my Node.js application on Linux.
The solution I am looking for is a Linux executable that will take a base64-encoded image (uploaded from a client) on stdin, resize the photo to a specified size, and then pump the resulting image data back out through stdout.
I've looked into ImageMagick and it might be what I end up using, but I figured I would ask and see if anyone had a suggestion.
Suggestions of libraries or examples of precompiled executables in C/C++ would be greatly appreciated. A helpful answer would also include general strategies for low memory image resizing.
Thank you

Depending on the image formats you want to support, it's almost surely possible to perform incremental decoding and scaling by decoding only a few lines at a time and discarding the data once you write the output. However it may require writing your own code or adapting an existing decoder library to support this kind of operation.
It's also worth noting that downsizing giant jpegs can be performed efficiently by simply skipping the high-frequency coefficients and using a smaller IDCT. For example, to decode at half width and half height, discard all but the upper-left quadrant of the coefficients (horizontal and vertical frequency < 4) and use a 4x4 IDCT on them instead of the usual 8x8. Both the libjpeg decoder and the libavcodec decoder support this operation for power-of-2 scalings (1/2, 1/4, or 1/8). This type of approach might make incremental decoding/scaling unnecessary.
You can try it out with djpeg -scale 1/4 < src.jpg | cjpeg > dest.jpg. If you want a fixed output size, you'll probably first scale by whichever of 1/2, 1/4, or 1/8 puts you closest to the desired size without going too low, then perform interpolation to go the final step, e.g. djpeg -scale 1/4 < src.jpg | convert pnm:- -scale 640x480 dest.jpg.
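If you'd rather do the same thing in-process instead of shelling out to djpeg, here is a minimal sketch of requesting the scaled decode through the libjpeg C API (error handling omitted, 3-component RGB output assumed, raw PPM written to stdout just to keep it short):

#include <stdio.h>
#include <stdlib.h>
#include <jpeglib.h>

/* Sketch: decode a JPEG from stdin at 1/4 scale and write raw PPM to stdout.
   A real program needs a jpeg_error_mgr with a longjmp-based error_exit. */
int main(void)
{
    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;

    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);
    jpeg_stdio_src(&cinfo, stdin);
    jpeg_read_header(&cinfo, TRUE);

    /* Ask the decoder itself for a 1/4-size image: only the low-frequency
       DCT coefficients are used, so CPU and memory drop accordingly. */
    cinfo.scale_num = 1;
    cinfo.scale_denom = 4;
    jpeg_start_decompress(&cinfo);

    printf("P6\n%u %u\n255\n", cinfo.output_width, cinfo.output_height);

    JSAMPROW row = malloc(cinfo.output_width * cinfo.output_components);
    while (cinfo.output_scanline < cinfo.output_height) {
        jpeg_read_scanlines(&cinfo, &row, 1);   /* one scanline in memory at a time */
        fwrite(row, 1, cinfo.output_width * cinfo.output_components, stdout);
    }
    free(row);

    jpeg_finish_decompress(&cinfo);
    jpeg_destroy_decompress(&cinfo);
    return 0;
}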

When working on very large images, such as 0.25 GPix and larger, ImageMagick uses ~2 GB of RAM, even when using djpeg to decode the JPEG image first.
This command chain will resize JPEG images of about any size using only ~3 MB of RAM:
djpeg my-large.jpg | pnmscale -xysize 16000 16000 | cjpeg > scaled-large.jpg

GraphicsMagick is generally a better version of ImageMagick; I'd take a look at that. If you really need something fast, you probably want to drop down to something like libjpeg - while you say you want non-blocking IO, the operation you want to do is relatively CPU-bound (i.e. decoding the image, then resizing it).

If anything, this is just a sample following what the OP described:
import sys
import binascii
import cStringIO
from PIL import Image

# first stdin line: target width and height; the rest: the base64-encoded image
x, y = sys.stdin.readline().strip().split(' ')
x, y = int(x), int(y)
data = binascii.a2b_base64(sys.stdin.read())  # decode (not encode) the base64 payload
img = Image.open(cStringIO.StringIO(data))
img = img.resize((x, y))  # resize() takes a (width, height) tuple
img.save(sys.stdout, format="PNG")
Since it has to read the input, decode it, resize, encode, and write it out, there is no way to reduce the memory used to less than the size of the input image.

Nothing can beat Intel Integrated Performance Primitives in terms of performance. If you can afford it, I strongly recommend using it.
Otherwise just implement your own resizing routine. Lanczos gives quite good results, although it won't be tremendously fast.
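For reference, the Lanczos kernel itself is tiny; a sketch in C (a = 3 is the usual quality/speed compromise, and the function name is mine):

#include <math.h>

/* Lanczos-a kernel: sinc(x) * sinc(x/a) for |x| < a, 0 elsewhere. */
static double lanczos(double x, int a)
{
    if (x == 0.0)
        return 1.0;
    if (fabs(x) >= a)
        return 0.0;
    double px = M_PI * x;
    return a * sin(px) * sin(px / a) / (px * px);
}

Each output pixel is then a weighted sum of the 2a nearest source samples in each dimension (weights normalized to sum to 1), usually done as two separable 1-D passes so you only need a few source rows in memory at a time.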
Edit: I strongly suggest you NOT use ImageMagick or GraphicsMagick. They are both great libraries, but they are designed for a completely different purpose - handling many file formats, depths, pixel formats, etc. They sacrifice performance and memory efficiency for that flexibility.

You might need this: https://github.com/zhangyuanwei/node-images
Cross-platform image decoder (png/jpeg/gif) and encoder (png/jpeg) for Node.js

Related

work with binary (1 bit per pixel) images c++

The problem is as follows. I need to work with very large binary images (100000x100000 pixels). Initially I did this with Qt's QImage class, which supports the Format_Mono format storing an image as 1 bit per pixel. In general everything was fine, until it turned out that QPainter has a limited rasterizer and cannot draw on images larger than a short (32767x32767); it simply cuts them off.
I was not able to combine images larger than 32767x32767, so I began to look more closely at individual libraries. OpenCV, as I understand it, does not support this format. ImageMagick supports constructing an image as one bit per pixel and saving it in the same format; however, while you are working with it the image is still stored as 8 bits per pixel, and hence I run short of RAM. Then I decided to try CImg, but as I understand it, it doesn't support a 1 bpp format:
the overall size of the used memory for one instance image (in bytes)
is then 'width x height x depth x dim x sizeof (T)
where sizeof(T), of course, cannot be less than sizeof(char)...
I was curious how QImage manages to work with its Format_Mono format internally, but honestly, I got tangled up in the source code.
So, my question is: is there a library that can create and work with binary images where they really are stored as 1 bit per pixel in RAM?
libvips can process huge 1 bit images. It unpacks them to one pixel per byte for processing, but it only keeps the part of the image currently being processed in memory, so you should be OK.
For example, this tiny program makes a 100,000 x 100,000 pixel black image, then pastes in all of the images from the command-line at random positions:
#!/usr/bin/env python
import sys
import random
import gi
gi.require_version('Vips', '8.0')
from gi.repository import Vips
# this makes an 8-bit, mono image of 100,000 x 100,000 pixels, each pixel zero
im = Vips.Image.black(100000, 100000)
for filename in sys.argv[2:]:
    tile = Vips.Image.new_from_file(filename, access = Vips.Access.SEQUENTIAL)
    im = im.insert(tile,
                   random.randint(0, im.width - tile.width),
                   random.randint(0, im.height - tile.height))

im.write_to_file(sys.argv[1])
I can run the program like this:
$ vipsheader wtc.tif
wtc.tif: 9372x9372 uchar, 1 band, b-w, tiffload
$ mkdir test
$ for i in {1..1000}; do cp wtc.tif test/$i.tif; done
$ time ./insert.py x.tif[bigtiff,squash,compression=ccittfax4] test/*.tif
real 1m31.034s
user 3m24.320s
sys 0m7.440s
peak mem: 6.7gb
The [] on the output filename sets image write options. Here I'm enabling fax compression and setting the squash option, which means 8-bit one-band images will be squashed down to 1 bit for write.
The peak mem result is from watching RES in top. 6.7 GB is rather large unfortunately, as it has to keep input buffers for each of the 1,000 input images.
If you use tiled 1-bit tiff, you can remove the access = option and use operators that need random access, like rotate. If you try to rotate a strip tiff, vips will have to decompress the whole image to a temporary disk file, which you probably don't want.
vips has a reasonable range of standard image processing operators, so you may be able to do what you want just glueing them together. You can add new operators in C or C++ if you want.
That example is in Python for brevity, but you can use Ruby, PHP, C, C++, Go, JavaScript or the command-line if you prefer. It comes with all Linuxes and BSDs, it's on homebrew, MacPorts and fink, and there's a Windows binary.

How can I further compress a JPEG to send over a TCP Stream in C/C++ on Windows?

I am making my own version of TeamViewer, where you can show your screen to other people so that they can help you with issues and provide support. However, I have run into a slight issue.
Firstly, when I take a screenshot the .BMP is about 1 MB in memory. After using EncodingParams and converting it to a JPEG with ulong quality = 20, it drops to 92 KB. However, I still feel like this might be a little too big to constantly send over a stream.
Is there any way I can reduce the size of the image further, or any way I can make it less network intensive? Every single byte I remove would help speed it up for slower connections and use less bandwidth.
I would appreciate if someone could give me some advice.
Thanks
Either use a lower quality than 20, or reduce the size of the image. Both will of course degrade the quality. But there aren't any obvious compression techniques that you can use on a JPEG-compressed image - it's about as small as it will get (even high quality JPEGs will compress rather poorly, because JPEG does a good job at compressing the actual data in the image, as well as "destroying" pixels when compression levels are quite high). You can prove this by taking a number of JPEG images and compressing them with your favourite compression program (ZIP, GZIP, BZIP, etc.).
You can also reduce the number of colours in the image - a 256 (8-bit) or 64K (16-bit) image will compress much better than a 16M (24/32-bit) image.
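As an illustration of that colour-reduction idea, packing 24-bit pixels down to 16-bit 5-6-5 before encoding might look like this (a sketch; the helper name is mine, not from any library):

#include <stdint.h>

/* Pack 8-bit R, G, B into one 16-bit 5-6-5 value: quantizes to 64K colours
   and halves the raw frame size before any encoder even runs. */
static uint16_t rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}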
The other obvious answer is to only send the DIFFERENCE between one frame and the next - this can give you pretty decent benefits as long as the picture isn't changing very much. If someone is playing a shoot-em-up type game, where things explode and the scene changes at 50 fps, it's probably not going to help much, but for regular office-type applications the number of pixels that change each frame is probably minimal.

Cropping large jpeg

The task is to write a program that will crop JPEG files. The problem is that some JPEG files are very large - hundreds of megabytes. So the question: is it possible to crop a JPEG file without loading the whole file into RAM, using something like fseek() and decoding only the parts that are needed?
Is that possible? If yes, maybe there are libraries that already do this.
Update: all of this will be used for deep zoom technology, so when deep zoom asks for a file, this program will serve it, and this has to happen in real time.
There are two ways to accomplish this.
The first is lossless cropping, where you don't decode the file all the way but work with the 8x8 DCT blocks. You'll need to use a library that has this capability, and it places some restrictions on the cropping ability. You can't crop to a boundary that isn't on the DCT square, which limits you to multiples of 8 or 16 depending on the subsampling in the file.
The second way is to use a library that allows you to read and write one line at a time. I know that the IJG library can do this, and probably others as well. This is the easy way, but the downside is that the image goes through a decompression/recompression pass and will lose quality and/or be larger.
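A rough sketch of that scanline-at-a-time approach with the plain IJG API (a fragment only: cinfo_in (decompressor) and cinfo_out (compressor, with image_width set to crop_w) and the crop_x/crop_y/crop_w/crop_h rectangle are assumed to have been set up elsewhere; error handling omitted):

/* Copy only the rows and columns inside the crop rectangle. */
JSAMPROW in_row = malloc(cinfo_in.output_width * cinfo_in.output_components);
JSAMPROW out_row;

while (cinfo_in.output_scanline < cinfo_in.output_height) {
    jpeg_read_scanlines(&cinfo_in, &in_row, 1);
    unsigned int y = cinfo_in.output_scanline - 1;        /* index of the row just read */
    if (y < crop_y || y >= crop_y + crop_h)
        continue;                                         /* outside the crop: discard */
    out_row = in_row + (size_t)crop_x * cinfo_in.output_components;
    jpeg_write_scanlines(&cinfo_out, &out_row, 1);        /* keep only the wanted columns */
}
free(in_row);

Only one scanline is ever held in memory, which is what keeps the footprint small even for very large JPEGs.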

Appropriate image file format for losslessly compressing series of screenshots

I am building an application which takes a great many screenshots while "recording" the operations performed by the user on the Windows desktop.
For obvious reasons I'd like to store this data in as efficient a manner as possible.
At first I thought about using the PNG format to get this done. But I stumbled upon this: http://www.olegkikin.com/png_optimizers/
The best algorithms only managed a 3 to 5 percent improvement on an image of GUI icons. This is highly discouraging and reveals that I'm going to need to do better because just using PNG will not allow me to use previous frames to help the compression ratio. The filesize will continue to grow linearly with time.
I thought about solving this with a bit of a hack: Just save the frames in groups of some number, side by side. For example I could just store the content of 10 1280x1024 captures in a single 1280x10240 image, then the compression should be able to take advantage of repetitions across adjacent images.
But the problem with this is that the algorithms used to compress PNG are not designed for this. I am arbitrarily placing images at 1024 pixel intervals from each other, and only 10 of them can be grouped together at a time. From what I have gathered after a few minutes scanning the PNG spec, the compression operates on individual scanlines (which are filtered) and then chunked together, so there is actually no way that info from 1024 pixels above could be referenced from down below.
So I've found the MNG format which extends PNG to allow animations. This is much more appropriate for what I am doing.
One thing that I am worried about is how much support there is for "extending" an image/animation with new frames. The nature of the data generation in my application is that new frames get added to a list periodically. But I do have a simple semi-solution to this problem, which is to cache a chunk of recently generated data and incrementally produce an "animation", say, every 10 frames. This will allow me to tie up only 10 frames' worth of uncompressed image data in RAM, not as good as offloading it to the filesystem immediately, but it's not terrible. After the entire process is complete (or even using free cycles in a free thread, during execution) I can easily go back and stitch the groups of 10 together, if it's even worth the effort to do it.
Here is my actual question that everything has been leading up to: is MNG the best format for my requirements? Those requirements are:
1. C/C++ implementation available with a permissive license
2. 24/32-bit color, 4+ megapixel resolution (some folks run 30-inch monitors)
3. lossless or near-lossless (retains text clarity) compression, with provisions to reference previous frames to aid that compression
For example, here is another option that I have thought about: video codecs. I'd like to have lossless quality, but I have seen examples of h.264/x264 reproducing remarkably sharp stills, and its performance is such that I can capture at a much faster interval. I suspect that I will just need to implement both of these and do my own benchmarking to adequately satisfy my curiosity.
If you have access to a PNG compression implementation, you could easily improve the compression without having to use the MNG format by just preprocessing the "next" image as a difference with the previous one. This is naive but effective if the screenshots don't change much, and compression of "almost empty" PNGs will greatly reduce the storage space required.
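A sketch of that preprocessing step over raw frame buffers (plain C; prev, curr, out and nbytes are whatever your capture code provides):

#include <stddef.h>
#include <stdint.h>

/* XOR the new frame against the previous one: unchanged pixels become 0,
   which PNG's filters plus deflate compress extremely well.  Reverse it on
   playback by XOR-ing against the previously reconstructed frame. */
void diff_frame(const uint8_t *prev, const uint8_t *curr,
                uint8_t *out, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        out[i] = prev[i] ^ curr[i];
}

XOR rather than subtraction keeps the transform trivially invertible without worrying about wraparound.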

efficient TIFF tile extraction C++

I am working with 1 GB TIFF images of around 20000 x 20000 pixels. I need to extract several tiles (of about 300x300 pixels) from the images, at random positions.
I tried the following solutions:
Libtiff (the only low-level library I could find) offers TIFFReadScanline(), but that means reading in around 19700 unnecessary pixels.
I implemented my own TIFF reader which extracts a tile out of the image without reading in unnecessary pixels. I expected it to be faster, but doing a seekg for every line of the tile makes it very slow. I also tried reading all the lines of the file that include my tile into a buffer and then extracting the tile from the buffer, but the results are more or less the same.
I'd like to receive suggestions that would improve my tile extraction tool!
Everything is welcome, maybe you can propose a more efficient library I could use, some tips about C/C++ I/O, some higher level strategy for my needs, etc.
Regards,
Juan
[Major edit 14 Jan 10]
I was a bit confused by your mention of tiles, when the tiff is not tiled.
I do use tiled/pyramidal TIFF images. I've created those with VIPS:
vips im_vips2tiff source_image output_image.tif:none,tile:256x256,pyramid
I think you can do this with :
vips im_vips2tiff source_image output_image.tif:none,tile:256x256,flat
You may want to experiment with tile size. Then you can read using TIFFReadEncodedTile.
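For reference, reading one tile with libtiff looks roughly like this (error checking omitted; x and y are just an example pixel position, and the function name is mine):

#include <stdint.h>
#include <tiffio.h>

/* Sketch: read the decoded tile containing pixel (x, y) from a tiled TIFF. */
int read_one_tile(const char *path, uint32_t x, uint32_t y)
{
    TIFF *tif = TIFFOpen(path, "r");
    uint32_t tile_w = 0, tile_h = 0;
    TIFFGetField(tif, TIFFTAG_TILEWIDTH, &tile_w);
    TIFFGetField(tif, TIFFTAG_TILELENGTH, &tile_h);

    tdata_t buf = _TIFFmalloc(TIFFTileSize(tif));         /* room for one decoded tile */
    ttile_t tile = TIFFComputeTile(tif, x, y, 0, 0);      /* which tile covers (x, y) */
    TIFFReadEncodedTile(tif, tile, buf, (tsize_t)-1);     /* decode just that tile */

    /* ... use the tile_w x tile_h pixels in buf ... */

    _TIFFfree(buf);
    TIFFClose(tif);
    return 0;
}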
Multi-resolution storage using pyramidal TIFFs is much faster if you need to zoom in/out. You may also want to use it to show a coarse image almost immediately, followed by the detailed picture.
After switching to (appropriately sized) tiled storage (which will bring you MASSIVE performance improvements for random access!), your bottleneck will be disk I/O. File reads are much faster if done in sequence. Here mmapping may be the solution.
Some useful links:
VIPS
IIPImage
LibTiff.NET stackoverflow
VIPS is an image handling library which can do much more than just read/write. It has its own, very efficient internal format. It has good documentation on the algorithms. For one, it decouples processing from the filesystem, thereby allowing tiles to be cached.
IIPImage is a multi-zoom webserver/browser library. I found the documentation a very good source of information on multi-resolution imaging (like google maps)
The other solution on this page, using mmap, is efficient only for 'small' files. I've hit the 32-bit boundaries often. Generally, allocating a 1 GB chunk of memory will fail on a 32-bit OS (even with 4 GB of RAM installed) because virtual memory gets fragmented after one or two application runs. Still, there is sufficient memory to cache parts or the whole of the image. More memory = more performance.
Just mmap your file.
http://www.kernel.org/doc/man-pages/online/pages/man2/mmap.2.html
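A minimal sketch (error handling omitted; on a 32-bit build you'd want to map smaller windows of the file rather than the whole thing, and the file name is just an example):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("huge.tif", O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    /* The whole file appears as one read-only byte array; pages are faulted
       in on demand, so only the parts you actually touch use physical memory. */
    const unsigned char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    /* ... compute offsets into data[] and copy out the tile rows you need ... */

    munmap((void *)data, st.st_size);
    close(fd);
    return 0;
}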
Thanks everyone for the replies.
Actually, a change in the way tiles were requested allowed me to extract the tiles from the files on disk sequentially instead of randomly. This allowed me to load part of the file into RAM and extract the tiles from there.
The efficiency gain was huge. Otherwise, if you need random access to a file, mmap is a good deal.
Regards,
Juan
I did something similar to this to handle an arbitrarily large TARGA (TGA) format file.
The thing that made it simple for that kind of file is that the image is not compressed. You can calculate the position of any arbitrary pixel within the image and find it with a simple seek. You might consider the Targa format if you have the option to specify the image encoding.
If not, there are many varieties of TIFF formats. You probably want to use a library, since they've already gone through the pain of supporting all the different formats.
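For an uncompressed, non-paletted TGA, that seek offset is a one-line calculation; a sketch (field sizes from the TGA spec, the function name is mine):

/* Sketch: byte offset of pixel (x, y) in an uncompressed, non-paletted TGA.
   18 is the fixed header size and id_length is byte 0 of that header; rows
   are stored bottom-up unless the origin bit in the image descriptor is set. */
long tga_pixel_offset(long x, long y, long width, long height,
                      int bytes_per_pixel, int id_length)
{
    long row = height - 1 - y;   /* convert top-down y to bottom-up storage order */
    return 18 + id_length + (row * width + x) * bytes_per_pixel;
}

fseek() to that offset and fread() bytes_per_pixel * tile_width bytes per row to pull out a tile.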
Did you get a specific error message? Depending on how you used that command line, you could have been stepping on your own file.
If that wasn't the issue, try using ImageMagick instead of vips, if it's an option.