In a multiplayer game setting, I was going to use zlib to compress larger strings before sending them. I placed the resulting data back into strings, which are to be sent as byte streams using TCP.
My problem is that I need to place control characters into the string as well. For example, I need to add the original string length (in plain text) to the front of the compressed string and separate it from the compressed data with some symbol, like "|".
But I can't find a way of knowing which bytes are actual content and which bytes are control characters. Are there any characters that a zlib-compressed string will never contain (besides 0, which I can't use since it marks the end of a C string) which I could use to separate the "metadata" from the "compressed data"?
No, there are no byte values that cannot be contained in a zlib stream. However a zlib stream is self-terminating. By simply using inflate() to decompress the stream, you will find the end when it returns Z_STREAM_END. The bytes not consumed by inflate() are the next bytes immediately after the zlib stream. Upon completion of decompression you know both how many compressed data bytes there are in the stream, as well as how many uncompressed bytes were generated.
If you are simply processing your entire stream sequentially, with the zlib stream embedded in it, then there is no need to store either the compressed or the uncompressed length anywhere. That information is inherent in the zlib data. You would only need to store such lengths if you had to process your stream non-sequentially, or wanted to access other data in the stream after the zlib data without having to decompress the zlib stream.
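For illustration, a minimal sketch of that (error handling trimmed, and it assumes the output buffer is large enough to hold the whole uncompressed result):

#include <zlib.h>
#include <string.h>

/* Decompress the zlib stream found at the start of `in` (in_len bytes total).
 * On success, *consumed is the number of compressed bytes the stream used,
 * so in + *consumed points at whatever data follows the zlib stream. */
int inflate_embedded(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap, size_t *consumed)
{
    z_stream strm;
    memset(&strm, 0, sizeof(strm));
    if (inflateInit(&strm) != Z_OK)
        return -1;

    strm.next_in   = (unsigned char *)in;
    strm.avail_in  = (uInt)in_len;
    strm.next_out  = out;
    strm.avail_out = (uInt)out_cap;

    int ret = inflate(&strm, Z_FINISH);   /* Z_STREAM_END marks the end */
    *consumed = strm.total_in;            /* compressed bytes actually used */
    /* strm.total_out is the number of uncompressed bytes produced */
    inflateEnd(&strm);
    return ret == Z_STREAM_END ? 0 : -1;
}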
If you control both the creation of the compressed blobs and their decompression, you can prepend the size to the compressed data (probably by reserving some bytes at the beginning of the buffer), and then, when decompressing, skip the size and pass the pointer to the compressed data to the decompression routine. This way you don't have to worry about the size bytes corrupting your compressed data: the decompression code never sees the bytes that carry the size information.
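A small sketch of that layout, assuming a 4-byte little-endian length prefix (the prefix format is an arbitrary choice for this example):

#include <zlib.h>
#include <stdlib.h>

/* Build [4-byte original length][zlib data] in one heap-allocated buffer. */
unsigned char *pack_with_length(const unsigned char *src, uLong src_len,
                                uLong *packed_len)
{
    uLong bound = compressBound(src_len);
    unsigned char *buf = (unsigned char *)malloc(4 + bound);
    if (buf == NULL)
        return NULL;

    /* store the uncompressed length in the first 4 bytes (little-endian) */
    buf[0] = (unsigned char)(src_len);
    buf[1] = (unsigned char)(src_len >> 8);
    buf[2] = (unsigned char)(src_len >> 16);
    buf[3] = (unsigned char)(src_len >> 24);

    uLongf dest_len = bound;
    if (compress(buf + 4, &dest_len, src, src_len) != Z_OK) {
        free(buf);
        return NULL;
    }
    *packed_len = 4 + dest_len;
    return buf;   /* the receiver reads the length, then uncompress()es buf + 4 */
}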
How are you doing the compression? If you're piping the output through an external program, there's not much you can do, but if you're using something internal, like a compressing streambuf, you should be able to output to the original streambuf when necessary. With filtering streams from boost::iostreams, for example, you can write the length, then push the compression stream, write the data to be compressed, then pop the compression stream, and continue writing plain text. (You may have to insert a judicious flush here and there, before changing the filter stack.) Or you should be able to compress the parts you want to compress into a buffer, and use std::ostream::write to output them.
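A rough sketch of the boost::iostreams variant (the file handling and length formatting are placeholders; here the compressor is torn down by letting the filtering stream go out of scope rather than by an explicit pop):

#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <fstream>
#include <string>

void writeRecord(std::ofstream &file, const std::string &payload)
{
    // plain-text metadata first, written directly to the underlying stream
    file << payload.size() << '|';

    {
        // push a zlib compressor on top of the file and write through it
        boost::iostreams::filtering_ostream out;
        out.push(boost::iostreams::zlib_compressor());
        out.push(file);
        out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
        // destroying the filtering stream flushes and closes the compressor
    }

    // back to writing uncompressed data directly
    file << "more plain text\n";
}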
I have a file written using gzwrite. Now I want to edit this file and insert some data in the middle by seeking. Is this possible with gzseek/gzwrite in C++?
No, it isn't possible. You have to create a new file by successively writing the pieces.
So it is not much different from inserting data in the middle of an uncompressed file, except for one thing: with the uncompressed file, you could leave a hole of the right size (a series of spaces, for example) and later overwrite it with the data to be inserted; that is not possible with the compressed file, because you cannot predict its compressed length.
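A sketch of that rewrite with the gz* functions (error handling omitted; the chunk size and parameter names are just illustrative):

#include <zlib.h>

/* Copy src_path to dst_path, inserting ins_len bytes of new (uncompressed)
 * data at uncompressed offset pos. */
int gz_insert(const char *src_path, const char *dst_path,
              const char *ins, unsigned ins_len, z_off_t pos)
{
    char buf[16384];
    gzFile in  = gzopen(src_path, "rb");
    gzFile out = gzopen(dst_path, "wb");
    if (in == NULL || out == NULL)
        return -1;

    /* copy everything before the insertion point */
    z_off_t remaining = pos;
    while (remaining > 0) {
        int want = remaining < (z_off_t)sizeof(buf) ? (int)remaining : (int)sizeof(buf);
        int got = gzread(in, buf, (unsigned)want);
        if (got <= 0)
            break;
        gzwrite(out, buf, (unsigned)got);
        remaining -= got;
    }

    /* write the new piece, then copy the rest of the original file */
    gzwrite(out, ins, ins_len);
    for (;;) {
        int got = gzread(in, buf, (unsigned)sizeof(buf));
        if (got <= 0)
            break;
        gzwrite(out, buf, (unsigned)got);
    }

    gzclose(in);
    gzclose(out);
    return 0;
}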
I'm an intern doing research into whether using Brotli compression in a piece of software provides a performance boost over the current release, which uses GZip.
My task is to change anything using GZip to use Brotli compression instead. One function I need to replace does a check to test if a buffer contains data that was compressed using GZip. It does this by checking the stream identifier at the beginning and end:
bool isGzipped() const
{
    // Gzip file signature (0x1f8b)
    return
        (_bufferEnd >= _bufferStart + 2) &&
        (static_cast<unsigned char>(_bufferStart[0]) == 0x1f) &&
        (static_cast<unsigned char>(_bufferStart[1]) == 0x8b);
}
I want to create a similar function, bool isBrotliEncoded(). I was wondering if there is a similar quick check that can be done with Brotli-encoded buffers? I've had a look at the byte values for some of the compressed files that brotli produces, but I can't find a rule that holds for all of them. Some start with 0x5B, some with 0x1B, compression of empty files results in 0x06, and files that have been compressed multiple times start with a range of different values. The end of each file is also inconsistent.
The only way I know of to test if it is in the correct format is to attempt decompression and wait for an error, which defeats the purpose of doing this test.
So my question is: Does anyone know how to check if a buffer has been compressed with Brotli without attempting decompression and waiting for failure?
Unfortunately, the raw brotli format is not well suited to such detection, even when simply trying to decompress and waiting for an error.
I ran a trial of one million brotli decompressions of random data. About 5% of them checked out as good brotli streams. So you've already got a problem right there. 3.5% of the million were valid streams only a single byte long, since there are nine one-byte values that are each a valid brotli stream. The mean length of the random valid streams was almost a megabyte.
For those in which an error was detected (about 95% of the million cases), 3.5% went more than a megabyte before the error was detected. 1.4% went more than ten megabytes. The mean number of random bytes before finding an error was 309 KB. Another problem.
In short, the probability of a false positive is relatively high, and the number of bytes to process to find a negative can be quite large.
If you are writing this software, then you should put your own header before the brotli data to aid in detection. Or you can use the brotli framing format that I developed at their request, which has a unique four-byte header before the brotli compressed stream. That would reduce the probability of a false positive dramatically.
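With your own header in place, the check becomes as cheap as the gzip one. A sketch, where the 4-byte magic "BRTL" is a made-up value for this example, not any standard Brotli signature:

// Hypothetical 4-byte prefix that our own encoder writes before the raw
// brotli data; the value is arbitrary and not part of the Brotli format.
static const unsigned char kBrotliMagic[4] = { 'B', 'R', 'T', 'L' };

bool isBrotliEncoded(const char *bufferStart, const char *bufferEnd)
{
    return
        (bufferEnd >= bufferStart + 4) &&
        (static_cast<unsigned char>(bufferStart[0]) == kBrotliMagic[0]) &&
        (static_cast<unsigned char>(bufferStart[1]) == kBrotliMagic[1]) &&
        (static_cast<unsigned char>(bufferStart[2]) == kBrotliMagic[2]) &&
        (static_cast<unsigned char>(bufferStart[3]) == kBrotliMagic[3]);
}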
Brotli is formally defined in RFC 7932. The format of the data stream is covered in Section 2: Compressed Representation Overview and Section 9: Compressed Data Format. Brotli does not employ leading/trailing identifiers like gzip does, but it does consist of a sequence of uncompressed headers and commands that describe the compressed data. They are not all aligned on byte boundaries; you have to parse them at the bit level instead (Brotli is processed as a stream of bits and bytes). Refer to Section 10: Decoding Algorithm for how to read these headers. If you parse out a few headers that follow the Brotli format without error, then it is a good bet that you are dealing with a Brotli-compressed buffer.
My objective is to convert a 32-bit bitmap (BGRA) buffer into a PNG image in real time using C/C++. To achieve it, I used the libpng library to convert the bitmap buffer and then write it into a PNG file. However, it seemed to take a huge amount of time (~5 secs) to execute on the target ARM board (quad-core processor) in a single thread. On profiling, I found that the libpng compression process (the deflate algorithm) takes more than 90% of the time. So I tried to reduce it by using parallelization in some way. The end goal here is to get it done in less than 0.5 secs at least.
Now, since a PNG can have multiple IDAT chunks, I thought of writing the PNG with multiple IDATs in parallel. To write a custom PNG file with multiple IDATs, the following methodology is adopted:
1. Write PNG IHDR chunk
2. Write IDAT chunks in parallel
i. Split input buffer in 4 parts.
ii. compress each part in parallel using zlib "compress" function.
iii. compute CRC of chunk { "IDAT"+zlib compressed data }.
iv. create IDAT chunk i.e. { "IDAT"+zlib compressed data+ CRC}.
v. Write length of IDAT chunk created.
vi. Write complete chunk in sequence.
3. write IEND chunk
Now the problem is that the PNG file created by this method is invalid or corrupted. Can somebody point out:
What am I doing wrong?
Is there any fast implementation of zlib compress or multi-threaded png creation, preferably in C/C++?
Any other alternate way to achieve target goal?
Note: The PNG specification was followed in creating the chunks
Update:
This method works for creating IDATs in parallel:
1. Add one filter byte before each row of the input image.
2. Split the image into four equal parts. <-- may not be required; passing pointers into the buffer with offsets is enough
3. Compress the image parts in parallel:
   (A) For the first image part:
       -- deflateInit(zstrm, Z_BEST_SPEED)
       -- deflate(zstrm, Z_FULL_FLUSH)
       -- deflateEnd(zstrm)
       -- store the compressed buffer and its length
       -- store the adler32 for the current chunk, {a1 = zstrm->adler} <-- adler is of the uncompressed data
   (B) For the second and third image parts:
       -- deflateInit(zstrm, Z_BEST_SPEED)
       -- deflate(zstrm, Z_FULL_FLUSH)
       -- deflateEnd(zstrm)
       -- store the compressed buffer and its length
       -- strip the first 2 bytes, reduce the length by 2
       -- store the adler32 for the current chunk, {a2, a3 similar to A} <-- adler is of the uncompressed data
   (C) For the last image part:
       -- deflateInit(zstrm, Z_BEST_SPEED)
       -- deflate(zstrm, Z_FINISH)
       -- deflateEnd(zstrm)
       -- store the compressed buffer and its length
       -- strip the first 2 bytes and the last 4 bytes of the buffer, reduce the length by 6
       -- here the last 4 bytes should be equal to zstrm->adler, {a4 = zstrm->adler} <-- adler is of the uncompressed data
4. adler32_combine() all four parts, i.e. a1, a2, a3 & a4 <-- the last argument is the length of the uncompressed data used to calculate the adler32 of the 2nd argument
5. Store the total length of the compressed buffers <-- used in calculating the CRC of the complete IDAT & written before the IDAT in the file
6. Append "IDAT" to the final chunk
7. Append all four compressed parts in sequence to the final chunk
8. Append the adler32 checksum computed in step 4 to the final chunk
9. Append the CRC of the final chunk, i.e. {"IDAT" + data + adler}
To be written to the PNG file in this manner: [PNG_HEADER][PNG_DATA][PNG_END]
where [PNG_DATA] -> Length(4 bytes) + {"IDAT"(4 bytes) + data + adler(4 bytes)} + CRC(4 bytes)
Even when there are multiple IDAT chunks in a PNG datastream, they still contain a single zlib compressed datastream. The first two bytes of the first IDAT are the zlib header, and the final four bytes of the final IDAT are the zlib "adler32" check value of the uncompressed data, i.e. computed before it was compressed.
There is a parallel gzip (pigz) under development at zlib.net/pigz. It will generate zlib datastreams instead of gzip datastreams when invoked as "pigz -z".
For that you won't need to split up your input file because the parallel compression happens internally to pigz.
In your step ii, you need to use deflate(), not compress(). Use Z_FULL_FLUSH on the first three parts, and Z_FINISH on the last part. Then you can concatenate them into a single stream, after pulling off the two-byte header from the last three (keep the header on the first one), and pulling the four-byte check value off the last one. For all of them, you can get the check value from strm->adler. Save those.
Use adler32_combine() to combine the four check values you saved into a single check value for the complete input. You can then tack that on to the end of the stream.
And there you have it.
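A condensed sketch of those per-part deflate() calls and the final assembly (no threading or error handling; the caller is assumed to size each output buffer with deflateBound()):

#include <zlib.h>
#include <string.h>

/* Compress one image part. flush is Z_FULL_FLUSH for every part except the
 * last, which uses Z_FINISH. Returns the compressed length and stores the
 * adler32 of this part's uncompressed data in *part_adler. */
size_t compress_part(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap,
                     int flush, unsigned long *part_adler)
{
    z_stream strm;
    memset(&strm, 0, sizeof(strm));
    deflateInit(&strm, Z_BEST_SPEED);
    strm.next_in   = (unsigned char *)in;
    strm.avail_in  = (uInt)in_len;
    strm.next_out  = out;
    strm.avail_out = (uInt)out_cap;       /* assumed >= deflateBound(&strm, in_len) */
    deflate(&strm, flush);
    *part_adler = strm.adler;             /* adler32 of the uncompressed part */
    size_t out_len = strm.total_out;
    deflateEnd(&strm);
    return out_len;
}

/* Assembly, as described above: keep part 0 whole (it carries the 2-byte zlib
 * header), drop the first 2 bytes of every later part, drop the last 4 bytes
 * (its own adler32 trailer) of the final part, concatenate, and append the
 * combined check value computed like this: */
unsigned long combined_adler(const unsigned long *adlers,
                             const size_t *uncompressed_lens, int nparts)
{
    unsigned long a = adlers[0];
    for (int i = 1; i < nparts; i++)
        a = adler32_combine(a, adlers[i], (z_off_t)uncompressed_lens[i]);
    return a;
}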
I'm trying to use zlib in an iPhone app to compress a text file into a gzip file as a test. I am using the following code:
const char *s = [[Path stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@".%@", [Path pathExtension]] withString:@".gz"] UTF8String];
gzFile fi = gzopen(s, "wb");
const char *c = readFile(Path.UTF8String);
gzwrite(fi, c, strlen(c));
gzclose(fi);
where readFile() returns a const char* that was obtained from the file using the fgets() function. The problem is, when I use this to compress a file, it doesn't compress it; instead the gzip file is larger than the original file. For example, I have a text file that is 90 bytes, and after using this method the size of the gzip is 98 bytes. Why isn't the gzip smaller than the original file?
The gzip format includes fixed-size header information. Because you are compressing so little data, the header overhead is larger than the space you save: the gzip header alone is 10 bytes and the trailer (CRC-32 plus length) another 8, so a 90-byte input has to shrink by more than 18 bytes just to break even.
90 bytes is generally not worth compressing.
http://www.gzip.org/zlib/rfc-gzip.html#header-trailer
Regardless of the compression algorithm used, there's always a chance that the generated data will be larger than the input; otherwise it wouldn't be possible to encode any combination of input bit patterns.
As already stated, in your particular case the very small file size compared to the header overhead seems to be the problem.
Nevertheless, it is good to keep in mind that there is never a guarantee the "compressed" file will be smaller.
The data you are trying to compress is too small and doesn't contain much redundancy, so there is little for the compressor to work with. Compression algorithms work, to put it very simply, by eliminating repeating sequences in data. In 90 bytes, you probably don't have much redundancy, unless it's text like "aaaaaaa....".
Fixed header overhead, as already mentioned.
Try a bigger data file.
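If you want to see the overhead effect directly, here is a small sketch comparing a 90-byte buffer with a larger one. It uses zlib's in-memory compress(), whose wrapper (2-byte header plus 4-byte trailer) is even smaller than gzip's, so the exact numbers differ from gzwrite() output, but the trend is the same:

#include <zlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void report(const char *label, const unsigned char *src, uLong src_len)
{
    unsigned char dest[64 * 1024];
    uLongf dest_len = sizeof(dest);
    if (compress(dest, &dest_len, src, src_len) == Z_OK)
        printf("%-12s %5lu -> %5lu bytes\n", label,
               (unsigned long)src_len, (unsigned long)dest_len);
}

int main(void)
{
    unsigned char noise[90], text[9000];

    /* 90 bytes with little redundancy: expect the output to be larger */
    for (size_t i = 0; i < sizeof(noise); i++)
        noise[i] = (unsigned char)rand();

    /* 9000 bytes of repetitive text: the fixed overhead becomes negligible */
    for (size_t i = 0; i < sizeof(text); i += 30)
        memcpy(text + i, "the quick brown fox jumps over", 30);

    report("random 90B", noise, sizeof(noise));
    report("text 9000B", text, sizeof(text));
    return 0;
}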
I am currently trying to implement a PNG encoder in C++ based on libpng that uses OpenMP to speed up the compression process.
The tool is already able to generate PNG files from various image formats.
I uploaded the complete source code to pastebin.com so you can see what I have done so far: http://pastebin.com/8wiFzcgV
So far, so good! Now, my problem is to find a way how to parallelize the generation of the IDAT chunks containing the compressed image data. Usually, the libpng function png_write_row gets called in a for-loop with a pointer to the struct that contains all the information about the PNG file and a row pointer with the pixel data of a single image row.
(Line 114-117 in the Pastebin file)
//Loop through image
for (i = 0, rp = info_ptr->row_pointers; i < png_ptr->height; i++, rp++) {
    png_write_row(png_ptr, *rp);
}
Libpng then compresses one row after another and fills an internal buffer with the compressed data. As soon as the buffer is full, the compressed data gets flushed into an IDAT chunk in the image file.
My approach was to split the image into multiple parts and let one thread compress rows 1 to 10, another thread rows 11 to 20, and so on. But as libpng uses an internal buffer, it is not as easy as I first thought :) I somehow have to make libpng write the compressed data to a separate buffer for every thread. Afterwards I need a way to concatenate the buffers in the right order so I can write them all together to the output image file.
So, does someone have an idea how I can do this with OpenMP and some tweaking to libpng? Thank you very much!
This is too long for a comment but is not really an answer either--
I'm not sure you can do this without modifying libpng (or writing your own encoder). In any case, it will help if you understand how PNG compression is implemented:
At the high level, the image is a set of rows of pixels (generally 32-bit values representing RGBA tuples).
Each row can independently have a filter applied to it -- the filter's sole purpose is to make the row more "compressible". For example, the "sub" filter makes each pixel's value the difference between it and the one to its left. This delta encoding might seem silly at first glance, but if the colours between adjacent pixels are similar (which tends to be the case) then the resulting values are very small regardless of the actual colours they represent. It's easier to compress such data because it's much more repetitive.
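For example, a minimal sketch of the "sub" filter applied to one row (bpp is bytes per pixel, e.g. 4 for RGBA; bytes to the left of the row are treated as zero):

#include <cstddef>

// PNG "Sub" filter: filt[i] = orig[i] - orig[i - bpp], modulo 256.
void subFilterRow(const unsigned char *orig, unsigned char *filt,
                  std::size_t rowBytes, std::size_t bpp)
{
    for (std::size_t i = 0; i < rowBytes; ++i) {
        unsigned char left = (i >= bpp) ? orig[i - bpp] : 0;
        filt[i] = static_cast<unsigned char>(orig[i] - left);
    }
}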
Going down a level, the image data can be seen as a stream of bytes (rows are no longer distinguished from each other). These bytes are compressed, yielding another stream of bytes. The compressed data is arbitrarily broken up into segments (anywhere you want!) written to one IDAT chunk each (along with a little bookkeeping overhead per chunk, including a CRC checksum).
The lowest level brings us to the interesting part, which is the compression step itself. The PNG format uses the zlib compressed data format. zlib itself is just a wrapper (with more bookkeeping, including an Adler-32 checksum) around the real compressed data format, deflate (zip files use this too). deflate supports two compression techniques: Huffman coding (which reduces the number of bits required to represent some byte-string to the optimal number given the frequency that each different byte occurs in the string), and LZ77 encoding (which lets duplicate strings that have already occurred be referenced instead of written to the output twice).
The tricky part about parallelizing deflate compression is that in general, compressing one part of the input stream requires that the previous part also be available in case it needs to be referenced. But, just like PNGs can have multiple IDAT chunks, deflate is broken up into multiple "blocks". Data in one block can reference previously encoded data in another block, but it doesn't have to (of course, it may affect the compression ratio if it doesn't).
So, a general strategy for parallelizing deflate would be to break the input into multiple large sections (so that the compression ratio stays high), compress each section into a series of blocks, then glue the blocks together (this is actually tricky since blocks don't always end on a byte boundary -- but you can put an empty non-compressed block (type 00), which will align to a byte boundary, in-between sections). This isn't trivial, however, and requires control over the very lowest level of compression (creating deflate blocks manually), creating the proper zlib wrapper spanning all the blocks, and stuffing all this into IDAT chunks.
If you want to go with your own implementation, I'd suggest reading my own zlib/deflate implementation (and how I use it) which I expressly created for compressing PNGs (it's written in Haxe for Flash but should be comparatively easy to port to C++). Since Flash is single-threaded, I don't do any parallelization, but I do split the encoding up into virtually independent sections ("virtually" because there's the fractional-byte state preserved between sections) over multiple frames, which amounts to largely the same thing.
Good luck!
I finally got it to parallelize the compression process.
As mentioned by Cameron in the comment to his answer, I had to strip the zlib header from the zstreams to combine them. Stripping the footer was not required, as zlib offers the flush mode Z_SYNC_FLUSH, which can be used for all chunks (except the last one, which has to be written with Z_FINISH) to end the output on a byte boundary. So you can simply concatenate the stream outputs afterwards. Finally, the adler32 checksum has to be calculated over all threads and appended to the end of the combined zstreams.
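A rough sketch of that structure (slicing, buffer handling and flush choices are simplified; this is an outline of the idea, not the code from the repository below):

#include <zlib.h>
#include <cstring>
#include <vector>

// One slice of the (already filtered) image data, compressed by one thread.
struct Part {
    std::vector<unsigned char> out;  // deflate output for this slice
    unsigned long adler;             // adler32 of the slice's raw bytes
    size_t rawLen;                   // uncompressed length of the slice
};

// flush = Z_SYNC_FLUSH for all slices except the last, which uses Z_FINISH.
static void compressSlice(const unsigned char *raw, size_t len, int flush, Part &part)
{
    z_stream strm;
    std::memset(&strm, 0, sizeof(strm));
    deflateInit(&strm, Z_BEST_SPEED);
    part.out.resize(deflateBound(&strm, (uLong)len) + 16);  // small margin for the flush bytes
    strm.next_in   = const_cast<unsigned char *>(raw);
    strm.avail_in  = (uInt)len;
    strm.next_out  = part.out.data();
    strm.avail_out = (uInt)part.out.size();
    deflate(&strm, flush);
    part.out.resize(strm.total_out);
    part.adler  = strm.adler;        // adler32 of the uncompressed slice
    part.rawLen = len;
    deflateEnd(&strm);
}

// Split the filtered image bytes into nThreads slices and compress them in parallel.
void compressImage(const unsigned char *filtered, size_t totalLen, int nThreads,
                   std::vector<Part> &parts)
{
    parts.resize(nThreads);
    size_t sliceLen = totalLen / nThreads;

    #pragma omp parallel for
    for (int t = 0; t < nThreads; ++t) {
        size_t off = (size_t)t * sliceLen;
        size_t len = (t == nThreads - 1) ? totalLen - off : sliceLen;
        int flush  = (t == nThreads - 1) ? Z_FINISH : Z_SYNC_FLUSH;
        compressSlice(filtered + off, len, flush, parts[t]);
    }
    // Afterwards, single-threaded: keep part 0 whole, strip the 2-byte zlib
    // header from the other parts, drop the 4-byte adler32 from the last one,
    // adler32_combine() the per-part check values, and emit the IDAT chunk(s).
}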
If you are interested in the result you can find the complete proof of concept at https://github.com/anvio/png-parallel