How do you Deflate data and put it into a vector? - c++

With zstr, a header-only C++ zlib wrapper library, I'm trying to deflate a std::string and put the result into a std::vector<unsigned char>.
zstr::ostream deflating_stream(std::cout);
deflating_stream.write(content.data(), content.size());
The above code works: it prints the deflated data. The problem is that I'm not familiar with C++ streams and cannot get the output into a std::vector. I tried several times in vain with std::ostringstream, std::ostream, std::istringstream, std::istreambuf_iterator, std::streambuf, .rdbuf(), et cetera, and all I ever got was empty output (.tellp() == 0).
How do I deflate a std::string and put it into a std::vector<unsigned char>?
The following are some of my attempts. I have no idea how to access the deflated data.
// Attempt 1: a std::ostream sharing the buffer of a std::istringstream
std::istringstream is;
std::ostream ss(is.rdbuf());
zstr::ostream deflating_stream(ss);
deflating_stream.write(
    uncompressed_string.data(),
    uncompressed_string.size()
);
the_vector.insert(
    the_vector.cend(),
    std::istreambuf_iterator<char>(is),
    std::istreambuf_iterator<char>()
);
// Attempt 2: std::ostringstream, then copy its string into the vector
std::ostringstream oss;
zstr::ostream deflating_stream(oss);
deflating_stream.write(
    uncompressed_string.data(),
    uncompressed_string.size()
);
const std::string deflated = oss.str();
the_vector.insert(
    the_vector.cend(),
    deflated.cbegin(),
    deflated.cend()
);
// Attempt 3: std::stringstream
std::stringstream ss;
zstr::ostream deflating_stream(ss);
deflating_stream.write(
    uncompressed_string.data(),
    uncompressed_string.size()
);
std::string deflated = ss.str();
std::cout << deflated.size(); // Says 0.

Something like this works:
#include <iostream>
#include <sstream>
#include <vector>
#include <string>
#include <algorithm>
#include "zstr.hpp"
int main() {
std::string text{"some text\n"};
std::stringbuf buffer;
zstr::ostream compressor{&buffer};
// Must flush to get complete gzip data in buffer
compressor << text << std::flush;
// It's probably easier to use just the string...
auto compstr = buffer.str();
std::vector<unsigned char> deflated;
deflated.resize(compstr.size());
std::copy(compstr.begin(), compstr.end(), deflated.begin());
std::cout.write(reinterpret_cast<char *>(deflated.data()), deflated.size());
return 0;
}
After compiling:
$ ./a.out | zcat
some text
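The resize-and-copy step in the answer can also be written with the vector's iterator-range constructor. The following is a minimal sketch of that idea only; `to_byte_vector` is a hypothetical helper name, not part of zstr, and it assumes the compressed bytes are already in a std::string (for example the result of `buffer.str()` after the flush).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copy the raw bytes of a std::string into a byte vector.
// The iterator-range constructor copies every char, including embedded '\0'
// bytes, and converts each char to unsigned char.
std::vector<unsigned char> to_byte_vector(const std::string& bytes) {
    return std::vector<unsigned char>(bytes.begin(), bytes.end());
}
```

With the names from the answer above, this would be called as `auto deflated = to_byte_vector(buffer.str());`.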

Related

Boost gzip how to output compressed string as text

I'm using boost gzip example code here.
I am attempting to compress the simple string test and am expecting the compressed string H4sIAAAAAAAACitJLS4BAAx+f9gEAAAA, as shown in this online compressor:
static std::string compress(const std::string& data)
{
namespace bio = boost::iostreams;
std::stringstream compressed;
std::stringstream origin(data);
bio::filtering_streambuf<bio::input> out;
out.push(bio::gzip_compressor(bio::gzip_params(bio::gzip::best_compression)));
out.push(origin);
bio::copy(out, compressed);
return compressed.str();
}
int main(int argc, char* argv[]){
std::cout << compress("text") << std::endl;
// prints out garbage
return 0;
}
However when I print out the result of the conversion I get garbage values like +I-. ~
I know that it's a valid conversion because the decompression value returns the correct string. However I need the format of the string to be human readable i.e. H4sIAAAAAAAACitJLS4BAAx+f9gEAAAA.
How can I modify the code to output human readable text?
Thanks
Motivation
The garbage format is not compatible with my JSON library where I will send the compressed text through.
The example site completely fails to mention they also base64 encode the result:
base64 -d <<< 'H4sIAAAAAAAACitJLS4BAAx+f9gEAAAA' | gunzip -
Prints:
test
In short, you need to also do that:
Live On Coliru
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <iostream>
#include <sstream>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/transform_width.hpp>
std::string decode64(std::string const& val)
{
using namespace boost::archive::iterators;
return {
transform_width<binary_from_base64<std::string::const_iterator>, 8, 6>{
std::begin(val)},
{std::end(val)},
};
}
std::string encode64(std::string const& val)
{
using namespace boost::archive::iterators;
std::string r{
base64_from_binary<transform_width<std::string::const_iterator, 6, 8>>{
std::begin(val)},
{std::end(val)},
};
return r.append((3 - val.size() % 3) % 3, '=');
}
static std::string compress(const std::string& data)
{
namespace bio = boost::iostreams;
std::istringstream origin(data);
bio::filtering_istreambuf in;
in.push(
bio::gzip_compressor(bio::gzip_params(bio::gzip::best_compression)));
in.push(origin);
std::ostringstream compressed;
bio::copy(in, compressed);
return compressed.str();
}
static std::string decompress(const std::string& data)
{
namespace bio = boost::iostreams;
std::istringstream compressed(data);
bio::filtering_istreambuf in;
in.push(bio::gzip_decompressor());
in.push(compressed);
std::ostringstream origin;
bio::copy(in, origin);
return origin.str();
}
int main() {
auto msg = encode64(compress("test"));
std::cout << msg << std::endl;
std::cout << decompress(decode64(msg)) << std::endl;
}
Prints
H4sIAAAAAAAC/ytJLS4BAAx+f9gEAAAA
test

Simple Zlib C++ String Compression and Decompression

I need a simple compression and decompression of a std::string in C++. I looked at this site and the code is for Character array. What I want to implement are the two functions:
std::string original = "This is to be compressed!!!!";
std::string compressed = string_compress(original);
std::cout << compressed << std::endl;
std::string decompressed = string_decompress(compressed);
std::cout << decompressed << std::endl;
I had tried the boost compression as:
std::string CompressData(const std::string &data)
{
std::stringstream compressed;
std::stringstream decompressed;
decompressed << data;
boost::iostreams::filtering_streambuf<boost::iostreams::input> out;
out.push(boost::iostreams::zlib_compressor());
out.push(decompressed);
boost::iostreams::copy(out, compressed);
return compressed.str();
}
std::string DecompressData(const std::string &data)
{
std::stringstream compressed;
std::stringstream decompressed;
compressed << data;
boost::iostreams::filtering_streambuf<boost::iostreams::input> in;
in.push(boost::iostreams::zlib_decompressor());
in.push(compressed);
boost::iostreams::copy(in, decompressed);
return decompressed.str();
}
but the code sometimes produces null characters (\u0000) in the string. How do I handle compressed data that contains these null characters? Is the std::string return type correct? How can I implement string_compress and string_decompress using zlib?
You can do as @LawfulEvil suggested. Here is a code snippet that works :)
std::string original = "This is to be compressed!!!!";
std::string compressed_encoded = string_compress_encode(original);
std::cout << compressed_encoded << std::endl;
std::string decompressed_decoded = string_decompress_decode(compressed_encoded);
std::cout << decompressed_decoded << std::endl;
Using this as the base64 encode/decode library.
#include <sstream>
#include <string>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <cpp-base64/base64.h>
std::string string_compress_encode(const std::string &data)
{
    std::stringstream compressed;
    std::stringstream original;
    original << data;
    boost::iostreams::filtering_streambuf<boost::iostreams::input> out;
    out.push(boost::iostreams::zlib_compressor());
    out.push(original);
    boost::iostreams::copy(out, compressed);
    // Base64-encode the compressed bytes so the result is printable.
    // A stringstream has no c_str()/length(), so take its string first.
    const std::string compressed_bytes = compressed.str();
    return base64_encode(
        reinterpret_cast<const unsigned char*>(compressed_bytes.data()),
        compressed_bytes.size());
}
std::string string_decompress_decode(const std::string &data)
{
    // First decode, then decompress. The filter chain needs a stream as its
    // source, so wrap the decoded bytes in a stringstream.
    std::stringstream compressed(base64_decode(data));
    std::stringstream decompressed;
    boost::iostreams::filtering_streambuf<boost::iostreams::input> in;
    in.push(boost::iostreams::zlib_decompressor());
    in.push(compressed);
    boost::iostreams::copy(in, decompressed);
    return decompressed.str();
}
Compression makes use of all the values available for each byte, so the output will look like 'garbage' or 'weird' characters when viewed as ASCII. That's expected. You'll need to encode the data for transmission or JSON packing to avoid nulls. I suggest Base64. Code to do that is available at the link below (which I didn't author, so I won't copy it here).
http://www.adp-gmbh.ch/cpp/common/base64.html
Binary data JSONCPP
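To illustrate what the Base64 step does to the compressed bytes, here is a minimal standard-library-only sketch of an encoder using the usual alphabet with '=' padding. `base64_encode_sketch` is a hypothetical name chosen to avoid clashing with the linked library; it is not that library's implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: standard Base64 alphabet with '=' padding.
// Takes arbitrary bytes (including '\0') and returns printable ASCII.
std::string base64_encode_sketch(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((in.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Each full 3-byte group becomes 4 output characters.
    for (; i + 2 < in.size(); i += 3) {
        const unsigned n = (static_cast<unsigned char>(in[i]) << 16) |
                           (static_cast<unsigned char>(in[i + 1]) << 8) |
                           static_cast<unsigned char>(in[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
    }
    // One or two leftover bytes are encoded and padded with '='.
    if (i < in.size()) {
        unsigned n = static_cast<unsigned char>(in[i]) << 16;
        const bool two = (i + 1 < in.size());
        if (two) n |= static_cast<unsigned char>(in[i + 1]) << 8;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += two ? table[(n >> 6) & 63] : '=';
        out += '=';
    }
    return out;
}
```

Because the output contains only letters, digits, '+', '/' and '=', it can be embedded in a JSON string without escaping problems, at the cost of roughly a third more bytes.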

Uncompress data in memory using Boost gzip_decompressor

I'm trying to decompress binary data in memory using Boost gzip_decompressor. From this answer, I adapted the following code:
vector<char> unzip(const vector<char> compressed)
{
vector<char> decompressed = vector<char>();
boost::iostreams::filtering_ostream os;
os.push(boost::iostreams::gzip_decompressor());
os.push(boost::iostreams::back_inserter(decompressed));
boost::iostreams::write(os, &compressed[0], compressed.size());
return decompressed;
}
However, the returned vector has zero length. What am I doing wrong? I tried calling flush() on the os stream, but it did not make a difference.
Your code works for me with this simple test program:
#include <iostream>
#include <vector>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
std::vector<char> unzip(const std::vector<char> compressed)
{
std::vector<char> decompressed = std::vector<char>();
boost::iostreams::filtering_ostream os;
os.push(boost::iostreams::gzip_decompressor());
os.push(boost::iostreams::back_inserter(decompressed));
boost::iostreams::write(os, &compressed[0], compressed.size());
return decompressed;
}
int main() {
std::vector<char> compressed;
{
boost::iostreams::filtering_ostream os;
os.push(boost::iostreams::gzip_compressor());
os.push(boost::iostreams::back_inserter(compressed));
os << "hello\n";
os.reset();
}
std::cout << "Compressed size: " << compressed.size() << '\n';
const std::vector<char> decompressed = unzip(compressed);
std::cout << std::string(decompressed.begin(), decompressed.end());
return 0;
}
Are you sure your input was compressed with gzip and not some other method (e.g. raw deflate)? gzip compressed data begins with bytes 1f 8b.
I generally use reset() or put the stream and filters in their own block to make sure that output is complete. I did both for compression above, just as an example.
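The check for the 1f 8b signature mentioned above can be sketched as a small guard to run before calling unzip. `looks_like_gzip` is a hypothetical helper name, and it only inspects the first two bytes; it does not validate the rest of the gzip header.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true if the buffer starts with the gzip magic bytes
// 1f 8b. The casts avoid sign problems on platforms where char is signed.
bool looks_like_gzip(const std::vector<char>& data) {
    return data.size() >= 2 &&
           static_cast<unsigned char>(data[0]) == 0x1f &&
           static_cast<unsigned char>(data[1]) == 0x8b;
}
```

If this returns false for your input, the data was probably produced by a different method (for example zlib-wrapped or raw deflate), and gzip_decompressor is not the right filter for it.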

boost json serialization and message_queue segfault

I'm running some tests with Boost.Interprocess and the ptree structure, and I get a segfault when I try to read the message that was sent (or when I try to parse it as JSON).
I'm using Boost 1.49 on Debian Linux.
I'm serializing to JSON for later use, and because I didn't find any good documentation for direct serialization of Boost property trees.
This is the code I'm using to test (the comment says where the segfault is):
recv.cc
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>
#include <boost/interprocess/ipc/message_queue.hpp>
#include <sstream>
struct test_data{
std::string action;
std::string name;
int faceID;
uint32_t Flags;
uint32_t freshness;
};
test_data recvData()
{
boost::interprocess::message_queue::remove("queue");
boost::property_tree::ptree pt;
test_data data;
std::istringstream buffer;
boost::interprocess::message_queue mq(boost::interprocess::open_or_create,"queue", 1, sizeof(buffer));
boost::interprocess::message_queue::size_type recvd_size;
unsigned int pri;
mq.receive(&buffer,sizeof(buffer),recvd_size,pri);
std::cout << buffer.str() << std::endl; //the segfault is there
boost::property_tree::read_json(buffer,pt);
data.action = pt.get<std::string>("action");
data.name = pt.get<std::string>("name");
data.faceID = pt.get<int>("face");
data.Flags = pt.get<uint32_t>("flags");
data.freshness = pt.get<uint32_t>("freshness");
boost::interprocess::message_queue::remove("queue");
return data;
}
int main()
{
test_data test;
test = recvData();
std::cout << test.action << test.name << test.faceID << test.Flags << test.freshness << std::endl;
}
sender.cc
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>
#include <boost/interprocess/ipc/message_queue.hpp>
#include <sstream>
struct test_data{
std::string action;
std::string name;
int faceID;
uint32_t Flags;
uint32_t freshness;
};
int sendData(test_data data)
{
boost::property_tree::ptree pt;
pt.put("action",data.action);
pt.put("name",data.name);
pt.put("face",data.faceID);
pt.put("flags",data.Flags);
pt.put("freshness",data.freshness);
std::ostringstream buffer;
boost::property_tree::write_json(buffer,pt,false);
boost::interprocess::message_queue mq(boost::interprocess::open_only,"chiappen");
std::cout << sizeof(buffer) << std::endl;
mq.send(&buffer,sizeof(buffer),0);
return 0;
}
int main ()
{
test_data prova;
prova.action = "registration";
prova.name = "prefix";
prova.Flags = 0;
prova.freshness = 52;
sendData(prova);
}
I know it's a bit late for an answer, but anyway...
You can't pass an istringstream as the buffer for receive. Boost message queues only handle raw bytes; they don't handle objects such as standard streams.
To make it work, you must use a char array or a buffer previously allocated with malloc.
Example:
char buffer [1024];
mq.receive(buffer, sizeof(buffer), recvd_size, pri);
For sending it's the same, you can only send raw bytes, so you can't use ostringstream.
Hope it helps.
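Once the bytes are in a plain char array, they still have to be turned back into something read_json can consume. A minimal sketch of that step follows; `message_from_bytes` is a hypothetical helper, and it assumes `received` is the count that the receive call reported in recvd_size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: build a std::string from the bytes a receive call
// wrote into a raw buffer. Only the first `received` bytes are meaningful;
// the rest of the array is ignored.
std::string message_from_bytes(const char* buffer, std::size_t received) {
    return std::string(buffer, received);
}
```

On the receiving side, the returned string can be used to construct a std::istringstream that is then passed to read_json. On the sending side, the equivalent would be to take `buffer.str()` from the ostringstream and pass its `data()` and `size()` to send, assuming the queue was created with a maximum message size at least that large.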

boost::iostreams::zlib::default_noheader seems to be ignored

I'm having trouble getting boost::iostreams's zlib filter to ignore gzip headers ... It seems that setting zlib_param's default_noheader to true and then calling zlib_decompressor() produces the 'data_error' error (incorrect header check). This tells me zlib is still expecting to find headers.
Has anyone gotten boost::iostreams::zlib to decompress data without headers? I need to be able to read and decompress files/streams that do not have the two-byte header. Any assistance will be greatly appreciated.
Here's a modified version of the sample program provided by the boost::iostreams::zlib documentation:
#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/zlib.hpp>
int main(int argc, char** argv)
{
using namespace std;
using namespace boost::iostreams;
ifstream ifs(argv[1]);
ofstream ofs("out");
boost::iostreams::filtering_istreambuf in;
zlib_params p(
zlib::default_compression,
zlib::deflated,
zlib::default_window_bits,
zlib::default_mem_level,
zlib::default_strategy,
true
);
try
{
in.push(zlib_decompressor(p));
in.push(ifs);
boost::iostreams::copy(in, ofs);
ofs.close();
ifs.close();
}
catch(zlib_error& e)
{
cout << "zlib_error num: " << e.error() << endl;
}
return 0;
}
I know my test data is not bad; I wrote a small program to call gzread() on the test file; it is successfully decompressed ... so I'm confused as to why this does not work.
Thanks in advance.
-Ice
I think what you want to do is something that's described here which is to adjust the window bits parameter.
e.g
zlib_params p;
p.window_bits = 16 + MAX_WBITS;
in.push(zlib_decompressor(p));
in.push(ifs);
MAX_WBITS is defined in zconf.h, which zlib.h includes.
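The window bits value encodes which wrapper zlib expects around the deflate data. As a rough sketch of the convention documented for zlib's inflateInit2 (plain value for a zlib header, 16 added for gzip, negated for raw deflate with no header), the helper below maps each format to a value. `Container`, `kMaxWbits` and `window_bits_for` are hypothetical names, not part of zlib or Boost, and `kMaxWbits` assumes the default MAX_WBITS of 15.

```cpp
// Hypothetical names for illustration; not part of zlib or Boost.
enum class Container { RawDeflate, Zlib, Gzip };

constexpr int kMaxWbits = 15;  // assumed default value of zlib's MAX_WBITS

// Window bits value for each container format, following the convention
// documented for zlib's inflateInit2.
constexpr int window_bits_for(Container c) {
    return c == Container::Gzip ? 16 + kMaxWbits   // gzip header and trailer
         : c == Container::Zlib ? kMaxWbits        // two-byte zlib header
                                : -kMaxWbits;      // raw deflate, no header
}
```

In the snippet above this would correspond to `p.window_bits = window_bits_for(Container::Gzip);`, which evaluates to the same 16 + MAX_WBITS.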
Much simpler: try this.
FILE* fp = fopen("abc.gz", "w+");
int dupfd = dup(fileno(fp));          // duplicate the descriptor: gzclose closes the one it was given
gzFile zfp = gzdopen(dupfd, "ab");    // gzdopen returns a gzFile, not an int
gzwrite(zfp, YOUR_DATA, YOUR_DATA_LEN);
gzclose(zfp);
fclose(fp);
Link with zlib and include zlib.h.
You can write to stdout instead of a file by using fileno(stdout).
Just use the boost::iostreams::gzip_decompressor for decompressing gzip files.
For example:
#include <fstream>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/filtering_stream.hpp>
// ...
boost::iostreams::filtering_istream stream;
stream.push(boost::iostreams::gzip_decompressor());
// Open in binary mode so the compressed bytes are read unmodified.
std::ifstream file(filename, std::ios_base::in | std::ios_base::binary);
stream.push(file);