Does LZ4 compression eliminate the need for manual bit packing? - compression

I'm considering using LZ4 compression for a high-bandwidth browser game I'm developing. I'm currently pre-compressing thousands of float32 values down to 16 bits. I'm wondering, if I pre-compressed down to 12 bits but saved those values as uint16s, whether LZ4 would remove the empty bits and save me the work of manually bit-packing those values.
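For reference, here's roughly how I'd measure it offline rather than guess - a minimal sketch assuming Python with numpy and the lz4 package (the browser side would use a JS LZ4 port), comparing LZ4's output for 12-bit values stored as uint16 against the same values manually packed two-per-three-bytes. My understanding is that LZ4 only finds byte-aligned matches and has no entropy coder, so it can't literally strip the four zero bits out of each uint16; any savings would have to come from whatever extra repetition those zero nibbles create.

    # Sketch: compare LZ4 size of 12-bit values stored as uint16 vs. manually bit-packed.
    # Assumes the 'lz4' and 'numpy' PyPI packages; the actual sizes depend on your data.
    import numpy as np
    import lz4.frame

    rng = np.random.default_rng(0)
    floats = rng.random(10000, dtype=np.float32)           # stand-in for game state values

    # Quantize to 12 bits (0..4095) and store in uint16 (top 4 bits are always zero).
    q = np.round(floats * 4095).astype(np.uint16)

    # Manual packing: two 12-bit values -> three bytes.
    def pack12(values: np.ndarray) -> bytes:
        if len(values) % 2:                                 # pad to an even count
            values = np.append(values, np.uint16(0))
        a = values[0::2].astype(np.uint32)
        b = values[1::2].astype(np.uint32)
        out = np.empty(len(a) * 3, dtype=np.uint8)
        out[0::3] = a >> 4                                  # high 8 bits of a
        out[1::3] = ((a & 0xF) << 4) | (b >> 8)             # low 4 bits of a + high 4 bits of b
        out[2::3] = b & 0xFF                                # low 8 bits of b
        return out.tobytes()

    raw_u16 = q.tobytes()
    raw_packed = pack12(q)

    print("uint16 + LZ4 :", len(lz4.frame.compress(raw_u16)))
    print("packed + LZ4 :", len(lz4.frame.compress(raw_packed)))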

Related

How to efficiently compress thousands of slightly differing JPEG files?

I captured a timelapse sequence from a camera, and saved it as a collection of separate .jpg files.
The files now number in the tens of thousands, and most of them differ only by a slight amount - is there a compression method that would exploit this fact?
Since video codecs are more or less tailored towards compressing "sequences of slightly differing images", they seem like a good choice. However, as the files are already compressed, I would prefer not to lose any more information by further encoding them into a lossy format. So I experimented with video formats that offer lossless compression, like H.264 or FFV1, but the resulting file size was several times larger than a simple gzip of the jpg files. I assume that's because in the encoding step the jpgs are converted to bitmaps and then losslessly compressed - a better file size than a folder full of uncompressed bitmaps, but falling short of gzipping the original jpgs.
Right now I'm simply storing them gzipped, but I wonder - is there a better method, one that might exploit the fact that the files are perceptually very similar? Or, since the files are already compressed jpgs, is the best way to go about this to consider them no different from binary files - and use general purpose compression methods like gzip, bzip, etc.?
(Also, apologies for asking on StackOverflow - there might be a better StackExchange site, but I couldn't find any.)
You'll need to define "slightly differing". It seems that you are demanding lossless compression of the JPEGs, even though each of the JPEGs was compressed with loss. Anyway, depending on how slight the differences are, it might be effective to send the first JPEG, and then PNGs (which are lossless) of the difference between successive images, pixel-by-pixel. If at some point the next PNG is bigger than the JPEG being differenced, then just send the JPEG. That way your stream at least won't get bigger.
If the sequence of images effectively includes panning or zooming as part of the difference, then this won't work so well, since pixels at the same locations in the images are differenced. For panning and zooming, you would want a video format. Accepting a little loss goes a long way.
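To make that concrete, here is a minimal sketch assuming Pillow and numpy (the frames/*.jpg layout is illustrative). It keeps each frame either as the original JPEG or as a PNG of the wraparound (mod 256) difference from the previous decoded frame, whichever is smaller; the mod-256 difference is used because PNG channels are unsigned 8-bit, and it is exactly reversible.

    # Sketch of the "first JPEG, then PNG diffs, fall back to JPEG when the diff is bigger"
    # scheme. Assumes Pillow and numpy; file names are illustrative.
    import glob
    import io
    import os
    import numpy as np
    from PIL import Image

    def encode_sequence(jpeg_paths):
        prev = None
        for path in jpeg_paths:
            cur = np.asarray(Image.open(path).convert("RGB"))
            jpeg_size = os.path.getsize(path)
            if prev is None:
                yield path, "jpeg", jpeg_size           # first frame: keep the JPEG as-is
            else:
                # Wraparound difference is exactly reversible: cur = (prev + diff) mod 256.
                diff = (cur.astype(np.int16) - prev.astype(np.int16)).astype(np.uint8)
                buf = io.BytesIO()
                Image.fromarray(diff).save(buf, format="PNG")
                if buf.tell() < jpeg_size:
                    yield path, "png-diff", buf.tell()
                else:
                    yield path, "jpeg", jpeg_size       # diff got bigger, fall back to the JPEG
            prev = cur

    for name, kind, size in encode_sequence(sorted(glob.glob("frames/*.jpg"))):
        print(name, kind, size)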

What is the best way to compress my data in lmdb

I have a large dataset which makes my lmdb huge. For 16,000 samples my database is already 20 GB, but in total I have 800,000 images, which would end up being a huge amount of data. Is there any way to compress an lmdb? Or is it better to use HDF5 files? I would like to know if anyone has a good solution for this problem.
If you look inside the ReadImageToDatum function in io.cpp, you'll see it can keep the image in either compressed (jpg/png) or raw format. To use the compressed format, compress the loaded image using cv::imencode, set the datum's data to the compressed bytes, set its encoded flag, and then store the datum in lmdb.
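In Python the same idea looks roughly like this - a sketch assuming cv2 (OpenCV), the lmdb package, and Caffe's protobuf bindings (caffe.proto.caffe_pb2); double-check the Datum field names against your Caffe version:

    # Sketch: store JPEG-encoded images in LMDB via Caffe's Datum, instead of raw pixels.
    # Assumes cv2 (OpenCV), lmdb, and the caffe Python bindings are installed.
    import cv2
    import lmdb
    from caffe.proto import caffe_pb2

    def write_encoded(db_path, image_paths, labels):
        env = lmdb.open(db_path, map_size=1 << 40)      # generous map size (1 TB address space)
        with env.begin(write=True) as txn:
            for i, (path, label) in enumerate(zip(image_paths, labels)):
                img = cv2.imread(path)
                ok, encoded = cv2.imencode(".jpg", img)  # re-encode as JPEG (lossy)
                if not ok:
                    continue
                datum = caffe_pb2.Datum()
                datum.channels = img.shape[2]
                datum.height = img.shape[0]
                datum.width = img.shape[1]
                datum.data = encoded.tobytes()           # compressed bytes, not raw pixels
                datum.encoded = True                     # tell Caffe to decode on load
                datum.label = int(label)
                txn.put(f"{i:08d}".encode(), datum.SerializeToString())
        env.close()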
There are various techniques to reduce input size, but much of that depends on your application. For instance, the ILSVRC-2012 data set images can be resized to about 256x256 pixels without nasty effects on training time or model accuracy. This reduces the data set from 240 GB to 40 GB. Can your data set tolerate that kind of loss of fidelity from simple "physical" compression? How small does the data set need to be?
I'm afraid that I haven't worked with HDF5 files enough to have an informed opinion.

Image steganography that could survive jpeg compression

I am trying to implement a steganographic algorithm where the hidden message can survive JPEG compression.
The typical scenario is the following:
Hide data in image
Compress image using jpeg
The hidden data is not destroyed by JPEG compression and can be restored
I have tried various algorithms described in the literature, but with no success.
For example, I tried a simple repetition code, but the JPEG compression destroyed the hidden data. I also tried to implement the algorithms described in the following articles:
http://nas.takming.edu.tw/chkao/lncs2001.pdf
http://www.securiteinfo.com/ebooks/palm/irvine-stega-jpg.pdf
Do you know about any algorithm that actually can survive jpeg compression?
You can hide the data in the frequency domain. JPEG stores information using the DCT (Discrete Cosine Transform) of every 8x8 pixel block. The information that best survives compression lives in the low-frequency coefficients; the lossy step happens when the high-frequency coefficients are rounded to 0 after quantization of the block. Those zeroes end up in the lower-right part of the matrix, which is why the compression works and why that information is lost.
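To make the idea concrete, here is a small sketch (numpy + scipy, not any particular published algorithm) that embeds one bit per 8x8 block by snapping a single mid-frequency DCT coefficient to an even or odd multiple of a step size. Whether it survives a real JPEG pass depends on choosing a step at least as coarse as JPEG's quantization for that coefficient, and you would still want error correction on top.

    # Sketch: embed one bit per 8x8 block in a mid-frequency DCT coefficient
    # (quantization index modulation). numpy + scipy; expects a grayscale 2-D array.
    import numpy as np
    from scipy.fft import dctn, idctn

    COEFF = (2, 1)     # which DCT coefficient to use (mid frequency) - illustrative choice
    STEP = 24.0        # quantization step; must be >= JPEG's step for this coefficient to survive

    def embed(img: np.ndarray, bits) -> np.ndarray:
        out = img.astype(np.float64).copy()
        it = iter(bits)
        for y in range(0, img.shape[0] - 7, 8):
            for x in range(0, img.shape[1] - 7, 8):
                try:
                    bit = next(it)
                except StopIteration:
                    return np.clip(out, 0, 255).astype(np.uint8)
                block = dctn(out[y:y+8, x:x+8], norm="ortho")
                # Force the coefficient to an even multiple of STEP for 0, an odd one for 1.
                k = np.round(block[COEFF] / STEP)
                if int(k) % 2 != bit:
                    k += 1
                block[COEFF] = k * STEP
                out[y:y+8, x:x+8] = idctn(block, norm="ortho")
        return np.clip(out, 0, 255).astype(np.uint8)

    def extract(img: np.ndarray, nbits: int):
        bits = []
        for y in range(0, img.shape[0] - 7, 8):
            for x in range(0, img.shape[1] - 7, 8):
                if len(bits) == nbits:
                    return bits
                block = dctn(img[y:y+8, x:x+8].astype(np.float64), norm="ortho")
                bits.append(int(np.round(block[COEFF] / STEP)) % 2)
        return bits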
Quite a few applications seem to implement Steganography on JPEG, so it's feasible:
http://www.jjtc.com/Steganography/toolmatrix.htm
Here's an article regarding a relevant algorithm (PM1) to get you started:
http://link.springer.com/article/10.1007%2Fs00500-008-0327-7#page-1
Perhaps this answer is late, but...
You can do it with compressed-domain steganography. Read the image as a binary file and parse it with a library such as a JPEG parser. Based on your chosen algorithm, find the embedding locations, compute their new values, and replace the corresponding bits in the file data. Finally, write the file back with the same extension as the input.
I hope this helps.
What you're looking for is called watermarking.
A little warning: Watermarking algorithms use insane amounts of redundancy to ensure high robustness of the information being embedded. That means the amount of data you'll be able to hide in an image will be orders of magnitude lower compared to standard steganographic algorithms.

Appropriate image file format for losslessly compressing series of screenshots

I am building an application which takes a great many screenshots while "recording" the operations performed by the user on the Windows desktop.
For obvious reasons I'd like to store this data in as efficient a manner as possible.
At first I thought about using the PNG format to get this done. But I stumbled upon this: http://www.olegkikin.com/png_optimizers/
The best algorithms only managed a 3 to 5 percent improvement on an image of GUI icons. This is highly discouraging and reveals that I'm going to need to do better because just using PNG will not allow me to use previous frames to help the compression ratio. The filesize will continue to grow linearly with time.
I thought about solving this with a bit of a hack: Just save the frames in groups of some number, side by side. For example I could just store the content of 10 1280x1024 captures in a single 1280x10240 image, then the compression should be able to take advantage of repetitions across adjacent images.
But the problem with this is that the algorithms used to compress PNG are not designed for this. I am arbitrarily placing images at 1024-pixel intervals from each other, and only 10 of them can be grouped together at a time. From what I have gathered after a few minutes scanning the PNG spec, the scanlines are filtered individually and the whole filtered stream is then DEFLATE-compressed with only a 32 KB back-reference window - a handful of rows at this width - so there is no way that data from 1024 rows above could be referenced from down below.
So I've found the MNG format which extends PNG to allow animations. This is much more appropriate for what I am doing.
One thing that I am worried about is how much support there is for "extending" an image/animation with new frames. The nature of the data generation in my application is that new frames get added to a list periodically. But I do have a simple semi-solution to this problem, which is to cache a chunk of recently generated data and incrementally produce an "animation", say, every 10 frames. This will allow me to tie up only 10 frames' worth of uncompressed image data in RAM, not as good as offloading it to the filesystem immediately, but it's not terrible. After the entire process is complete (or even using free cycles in a free thread, during execution) I can easily go back and stitch the groups of 10 together, if it's even worth the effort to do it.
Here is my actual question that everything has been leading up to. Is MNG the best format for my requirements? Those requirements are: 1. a C/C++ implementation available under a permissive license; 2. 24/32-bit color at 4+ megapixel resolution (some folks run 30-inch monitors); 3. lossless or near-lossless (retains text clarity) compression with provisions to reference previous frames to aid that compression.
For example, here is another option that I have thought about: video codecs. I'd like to have lossless quality, but I have seen examples of h.264/x264 reproducing remarkably sharp stills, and its performance is such that I can capture at a much faster interval. I suspect that I will just need to implement both of these and do my own benchmarking to adequately satisfy my curiosity.
If you have access to a PNG compression implementation, you could easily improve the compression without having to use the MNG format by just preprocessing the "next" image as a difference with the previous one. This is naive but effective if the screenshots don't change much, and compressing "almost empty" PNGs will greatly reduce the storage space required.
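For example, a tiny sketch of that preprocessing with Pillow and numpy (paths are illustrative); the wraparound (mod 256) difference keeps the transformation exactly reversible:

    # Sketch: store screenshot N as a PNG of its mod-256 difference from screenshot N-1,
    # plus the matching reconstruction. Assumes Pillow + numpy; paths are illustrative.
    import numpy as np
    from PIL import Image

    def diff_frame(prev_png, cur_png, out_png):
        prev = np.asarray(Image.open(prev_png).convert("RGB"))
        cur = np.asarray(Image.open(cur_png).convert("RGB"))
        delta = cur - prev                     # uint8 arithmetic wraps mod 256 on purpose
        Image.fromarray(delta).save(out_png)   # mostly-zero image -> very small PNG

    def restore_frame(prev_png, diff_png):
        prev = np.asarray(Image.open(prev_png).convert("RGB"))
        delta = np.asarray(Image.open(diff_png).convert("RGB"))
        return Image.fromarray(prev + delta)   # exact reconstruction, again mod 256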

If I take a lossy-compressed file and save it again (e.g. JPEG), will there be loss of quality?

I've often wondered: if I load a compressed image file, edit it and then save it again, will it lose some quality? What if I use the same quality grade when saving - will the algorithms somehow detect that the file has already been compressed as a JPEG and therefore that there is no point trying to compress the displayed representation again?
Would it be a better idea to always keep the original (say, a PSD) and always make changes to it and then save it as a JPEG or whatever I need?
Yes, you will lose further image information. If making multiple changes, work from the original uncompressed file.
When it comes to lossy image formats such as JPEG, successive compression will lead to perceptible quality loss. The loss shows up as compression artifacts and blurring of the image.
Even if one uses the same quality settings to save an image, there will still be quality loss. The only way to "preserve quality", or rather to lose as little quality as possible, is to use the highest quality setting available. Even then, there is no guarantee that there won't be quality loss.
Yes, it would be a good idea to keep a copy of the original if one is going to make an image using a lossy compression scheme such as JPEG. The original could be saved with a compression scheme which is lossless such as PNG, which will preserve the quality of the file at the cost of (generally) larger file size.
(Note: There is a lossless version of JPEG, however, the most common one uses techniques such as DCT to process the image and is lossy.)
In general, yes. However, depending on the compression format there are usually certain operations (mainly rotation and mirroring) that can be performed without any loss of quality by software designed to work with the properties of the file format.
Theoretically, since JPEG compresses each 8x8 block of pixels independently, it should be possible to keep all unchanged blocks of an image if it is saved with the same compression settings, but I'm not aware of any software that implements this.
Of course. The compression level used initially will probably be different from the level used in your subsequent saves. You can easily check this with image manipulation software (e.g. Photoshop): save your file several times, changing the compression level by just a slight bit each time, and you'll see the image degrade.
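If you want numbers instead of eyeballing it in Photoshop, the same experiment can be scripted - a sketch assuming Pillow and numpy, with input.jpg as a placeholder for your own image:

    # Sketch: measure generation loss from repeatedly re-saving a JPEG at the same quality.
    # Assumes Pillow + numpy; "input.jpg" is a placeholder.
    import io
    import numpy as np
    from PIL import Image

    img = Image.open("input.jpg").convert("RGB")
    reference = np.asarray(img, dtype=np.float64)

    current = img
    for generation in range(1, 21):
        buf = io.BytesIO()
        current.save(buf, format="JPEG", quality=85)   # same quality every time
        buf.seek(0)
        current = Image.open(buf).convert("RGB")
        rmse = np.sqrt(np.mean((np.asarray(current, dtype=np.float64) - reference) ** 2))
        print(f"generation {generation:2d}: RMSE vs. original = {rmse:.2f}")

This also lines up with the other answers here: the first re-save tends to cost the most, and later saves at the same setting usually add much less.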
If the changes are local (fixing a few pixels, rather than reshading a region) and you use the original editing tool with the same settings, you may avoid degradation in the areas that you do not affect. Still, expect some additional quality loss around the area of change as the compressed blocks are affected, and cannot be recovered.
The real answer remains to carry out editing on the source image, captured without compression where possible, and to apply the desired degree of compression only when the image is exported for its final use.
Yes, you will always lose a bit of information when you re-save an image as JPEG. How much you lose depends on what you have done to the image after loading it.
If you keep the image the same size and only make minor changes, you will not lose that much data. When the image is loaded, an approximation of the original image is recreated from the compressed data. If you resave the image using the same compression, most of the data that you lose will be data that was recreated when loading.
If you resize the image, or edit large areas of it, you will lose more data when resaving it. Any edited part of the image will lose about the same amount of information as when you first compressed it.
If you want to get the best possible quality, you should always keep the original.