Simple compression in C++

We have a C++ MFC application and a C# web service. They communicate over HTTP, and since they exchange text data, compression would help a lot. For various reasons, however, we can't use an external library.
We basically need to compress a byte array on one side and decompress it on the other side.
What should we use to compress our data? The best scenario would be something already in the MFC/Win32 API. Or is there some simple code, with at most an LGPL license, that we could integrate into our project?

As has already been said, zlib is probably what you are looking for.
There are a few options to consider:
the raw deflate/inflate pair
the zlib wrapper format itself
LZO (a separate library)
The simpler is probably LZO (I advise passing the uncompressed size along on the side), but zlib isn't very complicated either, and its compression level can be tuned (a speed/size trade-off), which can be a plus depending on your constraints.
For XML data (since you were speaking of web services), LZO gave me a ~4x compression factor.
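For reference, here is a minimal sketch of the round trip with zlib's one-shot compress2/uncompress helpers (the helper names are just illustrative); note how the original size travels alongside the compressed buffer, as suggested above:

    #include <zlib.h>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    // One-shot zlib compression; the caller sends originalSize next to the
    // compressed bytes so the receiver can size its output buffer.
    std::vector<unsigned char> zlibCompress(const unsigned char* data, uLong size, int level)
    {
        uLongf bound = compressBound(size);
        std::vector<unsigned char> out(bound);
        if (compress2(out.data(), &bound, data, size, level) != Z_OK)
            out.clear();
        else
            out.resize(bound);               // bound now holds the actual compressed size
        return out;
    }

    std::vector<unsigned char> zlibDecompress(const unsigned char* data, uLong size, uLong originalSize)
    {
        std::vector<unsigned char> out(originalSize);
        uLongf destLen = originalSize;
        if (uncompress(out.data(), &destLen, data, size) != Z_OK)
            out.clear();
        return out;
    }

    int main()
    {
        const char text[] = "some repetitive XML payload ... some repetitive XML payload ...";
        std::vector<unsigned char> packed =
            zlibCompress(reinterpret_cast<const unsigned char*>(text), sizeof(text), Z_BEST_SPEED);
        std::vector<unsigned char> restored =
            zlibDecompress(packed.data(), (uLong)packed.size(), sizeof(text));
        std::printf("%lu -> %lu bytes\n", (unsigned long)sizeof(text), (unsigned long)packed.size());
        if (restored.size() != sizeof(text)) return 1;
        return std::memcmp(text, restored.data(), sizeof(text)) == 0 ? 0 : 1;
    }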

Can't you just switch on HTTP compression? http://en.wikipedia.org/wiki/HTTP_compression
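On the C++/MFC side, the client half of that can be as little as advertising Accept-Encoding and letting WinInet decode the response. A rough sketch (error handling omitted; INTERNET_OPTION_HTTP_DECODING needs a reasonably recent WinInet, so treat its availability as an assumption to verify, and the server/object names are placeholders):

    #include <afxinet.h>   // MFC WinInet wrappers (CInternetSession, CHttpFile, ...)

    // Ask the server for gzip/deflate and let WinInet decode it transparently.
    // If INTERNET_OPTION_HTTP_DECODING is unavailable on the target system,
    // read the raw bytes and inflate them with zlib yourself.
    CString FetchCompressed(LPCTSTR server, LPCTSTR object)
    {
        CInternetSession session(_T("MyMfcClient"));
        CHttpConnection* conn = session.GetHttpConnection(server);
        CHttpFile* file = conn->OpenRequest(CHttpConnection::HTTP_VERB_GET, object);

        BOOL enableDecoding = TRUE;
        ::InternetSetOption((HINTERNET)*file, INTERNET_OPTION_HTTP_DECODING,
                            &enableDecoding, sizeof(enableDecoding));
        file->AddRequestHeaders(_T("Accept-Encoding: gzip, deflate\r\n"));
        file->SendRequest();

        CString body, line;
        while (file->ReadString(line))
            body += line + _T("\n");

        file->Close();
        conn->Close();
        delete file;
        delete conn;
        return body;
    }

The C# side then only needs response compression enabled in IIS/ASP.NET; no custom code is required at either end.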

zlib has a very liberal license.

Related

Trying to identify exact Lempel-Ziv variant of compression algorithm in firmware

I'm currently reverse engineering a firmware image that seems to be compressed, but I'm really having a hard time identifying which algorithm it uses.
I have the original uncompressed data dumped from the flash chip; below is some of the human-readable data, uncompressed vs. (supposedly) compressed:
You can get the binary portion here, should it help: Link
From what I can tell, it might be using a Lempel-Ziv variant of compression algorithm such as LZO, LZF or LZ4.
gzip and zlib can be ruled out because they leave very little to no human-readable data after compression.
I did try to compress the dumped data with the Lempel-Ziv variants mentioned above using their respective Linux CLI tools, but none of them produced exactly the same output as the "compressed" data.
Another idea I have for now is to try to decompress the data with each algorithm and see what it gives. But this is very difficult due to the lack of headers in the compressed firmware. (Binwalk and signsrch both detected nothing.)
Any suggestions on how I can proceed?
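For the "try each decompressor" route, here is a crude sketch of the kind of brute-force scan you could run over the dump, shown with LZ4 block decoding; the offsets, thresholds and buffer sizes are arbitrary assumptions, and the same loop can be repeated with the other candidates (e.g. LZO's lzo1x_decompress_safe):

    #include <lz4.h>       // liblz4; LZO and LZF expose similar "safe" decoders
    #include <cstdio>
    #include <vector>

    // Attempt an LZ4 block decode at every offset of the dump and report offsets
    // that yield a decent amount of output. This is only a heuristic: a false
    // positive just means the bytes happened to decode without error.
    void scanForLz4(const std::vector<char>& dump)
    {
        std::vector<char> out(1 << 20);                      // 1 MiB scratch buffer
        for (size_t off = 0; off + 16 < dump.size(); ++off) {
            int n = LZ4_decompress_safe_partial(dump.data() + off, out.data(),
                                                (int)(dump.size() - off),
                                                4096,                // decode at most this much
                                                (int)out.size());
            if (n >= 4096)                                   // produced a full window: worth a look
                std::printf("candidate LZ4 block at offset 0x%zx (%d bytes decoded)\n", off, n);
        }
    }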

Compression library for Cross Platform

We need to compress and send data from each endpoint client (iOS/Android/Windows classic) and decompress it on the server end using .NET.
Is there any common open-source library for compression/decompression that can be used in this scenario (cross-platform)?
Please advise.
Just about any programming platform developed in the past 20 years supports zlib out of the box. Since they generally all incorporate the same free library, the data they generate is interoperable.
Look through the API documentation for keywords like "zlib", "gzip", or "deflate". For example, on Android, check out Deflater and DeflaterOutputStream, which implement zlib.
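To illustrate that interoperability from the C++ client side, here is a sketch that wraps the payload in the gzip container via zlib's deflateInit2 (windowBits = 15 + 16), which .NET's GZipStream and Android's GZIPInputStream can both read. The helper name is illustrative, and the single deflate call assumes the whole payload fits in memory:

    #include <zlib.h>
    #include <cstring>
    #include <vector>

    // Produce a gzip-framed stream (not the bare zlib wrapper) so the server can
    // decompress it with standard gzip readers.
    std::vector<unsigned char> gzipCompress(const unsigned char* data, size_t size)
    {
        z_stream strm;
        std::memset(&strm, 0, sizeof(strm));
        if (deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                         15 + 16, 8, Z_DEFAULT_STRATEGY) != Z_OK)   // 15+16 => gzip container
            return std::vector<unsigned char>();

        std::vector<unsigned char> out(deflateBound(&strm, (uLong)size));
        strm.next_in   = const_cast<unsigned char*>(data);
        strm.avail_in  = (uInt)size;
        strm.next_out  = out.data();
        strm.avail_out = (uInt)out.size();

        int rc = deflate(&strm, Z_FINISH);        // single-shot: all input fits in one call
        out.resize(strm.total_out);
        deflateEnd(&strm);
        return rc == Z_STREAM_END ? out : std::vector<unsigned char>();
    }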

Software that applies LZ77 and LZW dictionary-based compression algorithms

Are there good applications (software) that perform dictionary-based compression algorithms (LZ77 and LZW)? It would be better if the application showed the compression ratio and the compression and decompression times.
I want to apply the compression to a text file and see how the file's content changes after compressing.
Thanks
Probably the most widespread compression/decompression library is zlib, which uses LZ77. It is incredibly portable and runs on Linux and Windows. It also has a license with very few restrictions.
Starting with Windows XP, Windows supports LZ compression natively (see related functions).
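If what you mainly want is the ratio and timing numbers rather than a polished tool, a small harness around zlib's one-shot API is enough. A sketch (it reads the whole file into memory, so adjust for very large inputs):

    #include <zlib.h>
    #include <chrono>
    #include <cstdio>
    #include <fstream>
    #include <iterator>
    #include <vector>

    // Read a text file, compress and decompress it with zlib, and report the
    // compression ratio plus both timings.
    int main(int argc, char** argv)
    {
        if (argc < 2) { std::printf("usage: %s <textfile>\n", argv[0]); return 1; }

        std::ifstream in(argv[1], std::ios::binary);
        std::vector<unsigned char> data((std::istreambuf_iterator<char>(in)),
                                         std::istreambuf_iterator<char>());

        uLongf packedLen = compressBound((uLong)data.size());
        std::vector<unsigned char> packed(packedLen);

        auto t0 = std::chrono::steady_clock::now();
        compress2(packed.data(), &packedLen, data.data(), (uLong)data.size(), Z_BEST_COMPRESSION);
        auto t1 = std::chrono::steady_clock::now();

        uLongf restoredLen = (uLongf)data.size();
        std::vector<unsigned char> restored(restoredLen);
        uncompress(restored.data(), &restoredLen, packed.data(), packedLen);
        auto t2 = std::chrono::steady_clock::now();

        using ms = std::chrono::duration<double, std::milli>;
        std::printf("original %lu, compressed %lu, ratio %.2f\n",
                    (unsigned long)data.size(), (unsigned long)packedLen,
                    (double)data.size() / packedLen);
        std::printf("compress %.1f ms, decompress %.1f ms\n",
                    ms(t1 - t0).count(), ms(t2 - t1).count());
        return 0;
    }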

Transferring large files with web services

What is the best way to transfer large files with web services? Presently we use the straightforward option: converting the binary data to Base64 and embedding the Base64 encoding into the SOAP envelope itself. But it slows down application performance considerably. Please suggest something for performance improvement.
In my opinion, the best way to do this is to not do this!
Web services are not designed to transfer large files. You should instead transfer a URL to the file and let the receiver of the message pull the file itself.
IMHO that would be a better way to do this than encoding and sending it.
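On a Windows client, the "pull the file out of band" part can be as small as the sketch below; the URL and target path are placeholders, and the SOAP response only carries the URL:

    #include <windows.h>
    #include <urlmon.h>
    #pragma comment(lib, "urlmon.lib")

    // Download the referenced file directly instead of embedding it in SOAP.
    bool PullFile(LPCTSTR url, LPCTSTR localPath)
    {
        return SUCCEEDED(URLDownloadToFile(NULL, url, localPath, 0, NULL));
    }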
Check out MTOM, a W3C standard designed to transfer binary files through SOAP.
From Wikipedia:
MTOM provides a way to send the binary data in its original binary form, avoiding any increase in size due to encoding it in text.
Related resources:
SOAP Message Transmission Optimization Mechanism
Message Transmission Optimization Mechanism (Wikipedia)

What compression/archive formats support inter-file compression?

This question on archiving PDFs got me wondering: if I wanted to compress (for archival purposes) lots of files which are essentially small changes made on top of a master template (a letterhead), it seems like huge compression gains could be had with inter-file compression.
Do any of the standard compression/archiving formats support this? AFAIK, all the popular formats focus on compressing each single file.
Several formats do inter-file compression.
The oldest example is .tar.gz; a .tar has no compression but concatenates all the files together, with headers before each file, and a .gz can compress only one file. Both are applied in sequence, and it's a traditional format in the Unix world. .tar.bz2 is the same, only with bzip2 instead of gzip.
More recent examples are formats with optional "solid" compression (for instance, RAR and 7-Zip), which can internally concatenate all the files before compressing, if enabled by a command-line flag or GUI option.
Take a look at Google's open-vcdiff.
http://code.google.com/p/open-vcdiff/
It is designed for calculating small compressed deltas and implements RFC 3284.
http://www.ietf.org/rfc/rfc3284.txt
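A sketch of what using open-vcdiff looks like, treating the shared letterhead/template as the dictionary; the encoder/decoder class names follow the project's README, so verify them against the headers of the version you actually build:

    #include <string>
    #include <google/vcencoder.h>   // from open-vcdiff
    #include <google/vcdecoder.h>

    // Delta-compress a document against the shared template (RFC 3284 / VCDIFF).
    std::string EncodeAgainstTemplate(const std::string& templ, const std::string& doc)
    {
        open_vcdiff::VCDiffEncoder encoder(templ.data(), templ.size());
        std::string delta;
        encoder.Encode(doc.data(), doc.size(), &delta);
        return delta;                      // store/ship only the (small) delta
    }

    std::string DecodeAgainstTemplate(const std::string& templ, const std::string& delta)
    {
        open_vcdiff::VCDiffDecoder decoder;
        std::string doc;
        decoder.Decode(templ.data(), templ.size(), delta, &doc);
        return doc;
    }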
Microsoft has an API for doing something similar, sans any semblance of a standard.
In general the algorithms you are looking for are ones based on Bentley/McIlroy:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.8470
In particular these algorithms will be a win if the size of the template is larger than the window size (~32k) used by gzip or the block size (100-900k) used by bzip2.
They are used by Google internally inside of their BIGTABLE implementation to store compressed web pages for much the same reason you are seeking them.
Since LZ-style compression (which pretty much all of them use) involves building a dictionary of repeated sequences as you go along, a scheme such as the one you desire would limit you to having to decompress the entire archive at once.
If this is acceptable in your situation, it may be simpler to implement a method which just joins your files into one big file before compression.
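If decompressing everything at once is acceptable, the join-then-compress idea can be sketched with zlib's gzFile API, as below. Note that gzip only exploits repetition that recurs within its ~32 KB window, so very large templates still favour the delta approaches above; this sketch also records no file names or boundaries (a real tool would write tar headers or an index first):

    #include <zlib.h>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Stream every input file into a single .gz so repeated template text can be
    // matched across file boundaries (within gzip's window).
    bool PackFiles(const std::vector<std::string>& paths, const char* outPath)
    {
        gzFile out = gzopen(outPath, "wb9");       // "9" = best compression
        if (!out) return false;

        char buf[64 * 1024];
        for (const std::string& p : paths) {
            std::FILE* in = std::fopen(p.c_str(), "rb");
            if (!in) { gzclose(out); return false; }
            size_t n;
            while ((n = std::fread(buf, 1, sizeof(buf), in)) > 0)
                gzwrite(out, buf, (unsigned)n);
            std::fclose(in);
        }
        return gzclose(out) == Z_OK;
    }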