compressed length of a string by boost::iostreams - c++

I have a string (of some fixed length) that I need to compress so that I can compare compressed lengths (as a proxy for redundancy in the data, or as a rough approximation to the Kolmogorov complexity). Currently I am using boost::iostreams for compression, which seems to work well. However, I don't know how to obtain the size of the compressed data. Can someone help, please?
The code snippet is:
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/filesystem.hpp>
#include <string>
#include <iostream>
#include <sstream>
namespace io = boost::iostreams;
int main() {
std::string memblock;
std::cout << "Input the string to be compressed:";
std::cin >> memblock;
std::cout << memblock << std::endl;
io::filtering_ostream out;
out.push(io::gzip_compressor());
out.push(io::file_descriptor_sink("test.gz"));
out.write (memblock.c_str(), memblock.size());
std::cout << out.size() << std::endl;
return 0;
}

You can try adding boost::iostreams::counter to your chain between the compressor and the sink, and then calling its characters() member to get the number of bytes that went through it.
This works for me:
#include <boost/iostreams/filter/counter.hpp>
...
io::filtering_ostream out;
out.push(io::counter());
out.push(io::gzip_compressor());
out.push(io::counter());
out.push(io::file_descriptor_sink("test.gz"));
out.write (memblock.c_str(), memblock.size());
io::close(out); // Needed for flushing the data from compressor
std::cout << "Wrote " << out.component<io::counter>(0)->characters() << " bytes to compressor, "
<< "got " << out.component<io::counter>(2)->characters() << " bytes out of it." << std::endl;

I figured out yet another (and slightly slicker) way to obtain the compressed length of a string. I thought I'd share it here. Basically, it passes the uncompressed string to a filtering stream buffer and copies the output back to a string:
template<typename T>
inline std::string compressIt(std::vector<T> s){
std::stringstream uncompressed, compressed;
for (typename std::vector<T>::iterator it = s.begin();
it != s.end(); it++)
uncompressed << *it;
io::filtering_streambuf<io::input> o;
o.push(io::gzip_compressor());
o.push(uncompressed);
io::copy(o, compressed);
return compressed.str();
}
Later, one can easily get the size of the compressed string as
compressIt(uncompressedString).size()
I feel it is better because it does not require me to create an output file as before.
cheers,
Nikhil

Another way would be:
stream<array_source> input_stream(input_data, input_data_size);
stream<array_sink> compressed_stream(compressed_data, alloc_compressed_size);
filtering_istreambuf out;
out.push(gzip_compressor());
out.push(input_stream);
int compressed_size = copy(out, compressed_stream);
cout << "size of compressed_stream: " << compressed_size << endl;

Related

Stringstream Doesn't Include Initial Data

My problem is really petty, but nevertheless I have not found any answer by asking google or by asking peers. The problem can be shown by the following code:
std::ostringstream oss("I am a ");
oss << "donkey";
std::cout << oss.str();
Expected Output: "I am a donkey"
Actual Output: "donkey"
What happens here? Has the initial string that the stringstream started with been discarded?
You have to pass std::ios_base::ate to the constructor. Otherwise the write position starts at the beginning of the buffer and the new output overwrites the initial string:
#include <iostream>
#include <sstream>
using namespace std;
int main() {
std::ostringstream oss("I am a ", std::ios_base::ate);
oss << "donkey";
std::cout << oss.str();
return 0;
}
https://ideone.com/IIfOkB
More information: https://en.cppreference.com/w/cpp/io/basic_ostringstream
Example: https://en.cppreference.com/w/cpp/io/basic_ostringstream/str
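To make the difference easier to see, here is a minimal sketch that runs the same write with and without std::ios_base::ate. The helper name seed_and_write and the sample strings are made up for illustration. A short initial string is used so the overwrite is visible, because the characters that are not overwritten stay in the buffer.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: seed an ostringstream with `initial`, write `extra`,
// and return the resulting buffer contents.
std::string seed_and_write(const std::string& initial,
                           const std::string& extra,
                           bool append) {
    // With ate, the write position starts at the end of the initial string.
    // Without it, the write position starts at offset 0.
    const std::ios_base::openmode mode =
        append ? (std::ios_base::out | std::ios_base::ate)
               : std::ios_base::out;
    std::ostringstream oss(initial, mode);
    oss << extra;
    return oss.str();
}

// Documents the behaviour described in the answer above.
void self_check() {
    assert(seed_and_write("abc", "X", false) == "Xbc");   // overwrites 'a'
    assert(seed_and_write("abc", "X", true) == "abcX");   // appends
}
```

Calling seed_and_write("I am a ", "donkey", true) gives the string the question expected.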

C++ cereal de-serialization trouble with large size vector

I want to serialize a large vector with cereal, the C++ serialization library.
But when I try to do that, the exception "Failed to read " + std::to_string(size) + " bytes from input stream! Read " + std::to_string(readSize)" is thrown.
Does anyone know a good solution for this?
I'm using Visual Studio 2017.
The source code is shown below.
#include <iostream>
#include <fstream>
#include "include\cereal\cereal.hpp"
#include "include\cereal\archives\binary.hpp"
#include "include\cereal\types\vector.hpp"
#include "include\cereal\types\string.hpp"
void show(std::vector<int> v) {
for (auto i : v)std::cout << i << ",";
std::cout << std::endl;
}
int main(void) {
const std::string file_name = "out.cereal";
{
std::vector<int> src;
// const int STOP = 10; //OK
const int STOP = 1000; // NG
for (int i = 0; i < STOP; i++)src.push_back(i);
std::cout << "src:" << std::endl;
show(src);
std::ofstream ofs(file_name, std::ios::binary);
cereal::BinaryOutputArchive archive(ofs);
archive(src);
}
{
std::vector<int> dst;
std::fstream fs(file_name);
cereal::BinaryInputArchive iarchive(fs);
iarchive(dst);
std::cout << "dst:" << std::endl;
show(dst);
}
#ifdef _MSC_VER
system("pause");
#endif
return 0;
}
Your code works fine for me on Linux, so I think it has to do with the difference between text and binary handling on Windows. Check that you pass std::ios::binary when constructing the input stream. Also construct it as std::ifstream rather than just std::fstream.
I think the cause is text-mode translation: without std::ios::binary, a Windows stream translates CR LF pairs and treats the byte 0x1A (Ctrl+Z) as end of file on input, which confuses the serializer. That would also explain why STOP = 10 works and STOP = 1000 does not, since the value 26 (0x1A) only appears in the larger vector.
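As a library-free sketch of the same point (no cereal involved, and the helper names write_ints and read_ints are made up for illustration), the round trip below opens both the output and the input stream with std::ios::binary. It only shows the stream setup; I am assuming the cereal archives behave the same way once the underlying std::ifstream is opened like this.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Write the raw bytes of `values` to `path`. std::ios::binary turns off
// any platform-specific text translation on output.
bool write_ints(const std::string& path, const std::vector<int>& values) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(int)));
    out.flush();
    return out.good();
}

// Read `count` ints back. The input stream is an std::ifstream opened with
// std::ios::binary, matching how the file was written.
// Returns an empty vector if the read fails.
std::vector<int> read_ints(const std::string& path, std::size_t count) {
    std::vector<int> values(count);
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(count * sizeof(int)));
    if (!in) values.clear();
    return values;
}

// Round-trips values whose bytes include 0x0A, 0x0D and 0x1A.
void self_check() {
    const std::vector<int> src = {10, 13, 26, 1000};
    assert(write_ints("roundtrip_demo.bin", src));
    assert(read_ints("roundtrip_demo.bin", src.size()) == src);
}
```

In the question's code, the equivalent change would be to replace std::fstream fs(file_name); with std::ifstream fs(file_name, std::ios::binary);.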

Need help to understand where i am going wrong with strings

In C++ we have std::to_string, which converts int/float/double to strings. So, just to test my understanding of templates, I tried the code below:
#include "iostream"
#include "sstream"
#include "string"
using std::cout;
template <typename T>
std::string getString(const T& data){
std::stringstream ss;
cout << '\n' << data << '\n';
ss << data;
std::string s;
ss >> s;
return s;
}
int main(int argc , char** argv){
cout << getString(1.0000011);
cout <<' '<<std::to_string(1.0000011);
return 0;
}
However, the output doesn't make sense: to_string gives me 1.0000011, whereas getString gets 1 and gives me 1. Since I am using templates, shouldn't getString get 1.0000011 and give me 1.0000011 too?
You can use std::setprecision in the <iomanip> header to set the precision that std::stringstream will use when formatting numeric data.
For example:
std::stringstream ss;
ss << std::setprecision(9) << data;
cout << ss.str();
Will print:
1.0000011
Here's a quick demo online: cpp.sh/9v7xf
As a side note, you don't have to create a string and output from the stringstream - you can replace the last 3 lines in getString() with:
return ss.str();
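With the default formatting, a stream prints only six significant digits, which is why 1.0000011 comes out as 1. You can use the std::fixed manipulator (declared in <ios>, which <iostream> includes) to print a fixed number of decimal places instead. With the default precision of 6 this gives 1.000001, and adding std::setprecision from <iomanip> gives more digits.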
#include "iomanip" // <- Add this header
#include "iostream"
#include "sstream"
#include "string"
using std::cout;
template <typename T>
std::string getString(const T& data)
{
std::stringstream ss;
cout << '\n' << data << '\n';
ss << std::fixed << data;
// ^^^^^^^^^^^^^ Add this
std::string s;
ss >> s;
return s;
}
int main(int argc, char** argv)
{
cout << getString(1.0000011);
cout << ' ' << std::to_string(1.0000011);
return 0;
}
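Here is a small sketch of how the default formatting, std::fixed alone, and std::fixed combined with std::setprecision differ for this value. The helper names to_default and to_fixed are made up for illustration.

```cpp
#include <cassert>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Format with the stream defaults: six significant digits,
// trailing zeros dropped.
std::string to_default(double value) {
    std::ostringstream ss;
    ss << value;
    return ss.str();
}

// Format with exactly `decimals` digits after the decimal point.
std::string to_fixed(double value, int decimals) {
    std::ostringstream ss;
    ss << std::fixed << std::setprecision(decimals) << value;
    return ss.str();
}

// Documents what each variant produces for the value from the question.
void self_check() {
    assert(to_default(1.0000011) == "1");
    assert(to_fixed(1.0000011, 6) == "1.000001");
    assert(to_fixed(1.0000011, 7) == "1.0000011");
}
```

So std::fixed on its own already avoids the bare 1, but it takes a precision of 7 to keep the last digit.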
<iomanip> needs to be included, and std::setprecision should be used when outputting floating-point values to streams. Using your example, this looks like:
#include <iostream>
#include <iomanip> //include this.
#include <sstream>
#include <string>
template <typename T>
std::string getString(const T& data){
std::ostringstream ss;
ss << std::setprecision(8);
std::cout << std::setprecision(8);
std::cout << data << '\n';
ss << data;
return ss.str();
}
int main(int argc , char** argv){
std::cout << getString(1.0000011) << "\n";
std::cout << std::to_string(1.0000011) << std::endl;
return 0;
}
Which prints:
1.0000011
1.0000011
1.000001
Program ended with exit code: 0
Note that std::to_string on its own rounds the floating-point number to six decimal places (it formats the value the way printf does with %f). I suspect this is undesired behavior, but std::to_string cannot be given a precision, so...
If desired, you can fix this with the solution found here.
Otherwise, just use std::setprecision() when inserting into streams for precisely converted strings.

Writing to a .PGM file using C++

For some reason, I can't write anything to a .PGM file. The following code compiles without errors but nothing is written to the .PGM file it creates. I'm fairly new to C++ and pretty unfamiliar with working with strings in this syntax.
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>
#include <sstream>
int main(){
// Initialize variables.
const int ncols = 30;
const int nrows = 20;
const int maxval = 255;
std::string filename;
// Prompt user for filename.
std::cout << "What would you like to name the file of the PGM image? Please include .PGM at the end of your name." << std::endl;
// Uses getline() function to retrieve input from user into a string.
std::getline(std::cin, filename);
// Creates output stream object to use with managing the file.
std::ofstream fileOut(filename.c_str(),std::ios_base::out
|std::ios_base::binary
|std::ios_base::trunc
);
fileOut.open(filename.c_str());
fileOut << "P2" << " " << ncols << " " << nrows << " " << maxval << "\n";
fileOut.close();
}
I know there is another SO question similar to this one, but I used that answer to get here. I can't even get it to write the header part and that's not even the point of the assignment. Can anyone help?
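One likely cause, as far as I can tell: the std::ofstream constructor already opens the file, and calling open() again on a stream that is already open fails and sets failbit, after which every << is silently ignored. Below is a minimal sketch under that assumption. The helper names second_open_fails and write_pgm_header are made up for illustration, and the header layout is the one from the question.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Reproduces the problem: the constructor opens the file, so the second
// open() fails and leaves the stream in a failed state.
bool second_open_fails(const std::string& path) {
    std::ofstream out(path);
    out.open(path);
    return out.fail();
}

// Opens the file exactly once and writes the plain-text (P2) PGM header.
// Returns false if the file could not be opened or written.
bool write_pgm_header(const std::string& path,
                      int ncols, int nrows, int maxval) {
    assert(ncols > 0 && nrows > 0 && maxval > 0);
    std::ofstream out(path);  // default mode already truncates the file
    if (!out) return false;
    out << "P2" << " " << ncols << " " << nrows << " " << maxval << "\n";
    out.flush();
    return out.good();
}
```

In the question's code, that would mean either dropping the fileOut.open(filename.c_str()); line, or default-constructing fileOut and calling open() once.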

How can I obtain the length of a const stringstream's buffer without copying or seeking?

I have a const std::stringstream and a desire to find out how many bytes there are in its underlying string buffer.
I cannot seekg to the end, tellg then seekg to the start again, because none of these operations are available constly.
I do not want to get the str().size() because str() returns a copy and this may not be a trivial amount of data.
Do I have any good options?
(The stream itself is presented to me as const, only because it is a member of another type, and I receive a const reference to an object of that type. The stream represents the contents of a "document", its encapsulating object represents a CGI response and I am trying to generate an accurate Content-Length HTTP header line from within operator<<(std::ostream&, const cgi_response&).)
I've never been very comfortable with stream buffers, but this seems to work for me:
#include <iostream>
#include <sstream>
std::stringstream::pos_type size_of_stream(const std::stringstream& ss)
{
std::streambuf* buf = ss.rdbuf();
// Get the current position so we can restore it later
std::stringstream::pos_type original = buf->pubseekoff(0, ss.cur, ss.out);
// Seek to end and get the position
std::stringstream::pos_type end = buf->pubseekoff(0, ss.end, ss.out);
// Restore the position
buf->pubseekpos(original, ss.out);
return end;
}
int main()
{
std::stringstream ss;
ss << "Hello";
ss << ' ';
ss << "World";
ss << 42;
std::cout << size_of_stream(ss) << std::endl;
// Make sure the output string is still the same
ss << "\nnew line";
std::cout << ss.str() << std::endl;
std::string str;
ss >> str;
std::cout << str << std::endl;
}
The key is that rdbuf() is const but returns a non-const buffer, which can then be used to seek.
If you want to know the remaining available input size:
#include <iostream>
#include <sstream>
std::size_t input_available(const std::stringstream& s)
{
std::streambuf* buf = s.rdbuf();
std::streampos pos = buf->pubseekoff(0, std::ios_base::cur, std::ios_base::in);
std::streampos end = buf->pubseekoff(0, std::ios_base::end, std::ios_base::in);
buf->pubseekpos(pos, std::ios_base::in);
return end - pos;
}
int main()
{
std::stringstream stream;
// Output
std::cout << input_available(stream) << std::endl; // 0
stream << "123 ";
std::cout << input_available(stream) << std::endl; // 4
stream << "567";
std::cout << input_available(stream) << std::endl; // 7
// Input
std::string s;
stream >> s;
std::cout << input_available(stream) << std::endl; // 4
stream >> s;
std::cout << input_available(stream) << std::endl; // 0
}
This is similar to @Cornstalks' solution, but positions the input sequence correctly.
This should work, although note that str() on a const stream still returns a copy of the buffer:
#include <iostream>
#include <sstream>
#include <boost/move/move.hpp>
int main()
{
const std::stringstream ss("hello");
std::cout << boost::move(ss).str().size();
}