istream for char buffer - c++

The Qt library has the QByteArray and QDataStream classes, which allow me to read and write variables to a memory buffer with very easy-to-use syntax:
QByteArray data = getData();
QDataStream stream( data );
double d = 0;
int i = 0;
stream >> d >> i;
How can I implement similar behavior using only the standard stream classes?
For example, I have a const char* data and its size, so I want to construct a std::istream and read variables from that data:
const char* data = getData();
size_t size = getSize();
membuffer buf( data, data + size );
std::istream str( buf );
double d = 0;
str >> d;
Note that data should not be copied!

Assuming you have a fixed-size data buffer and know its size, this is trivial to implement by creating a suitable stream buffer:
struct membuf: std::streambuf {
    membuf(char* begin, char* end) {
        this->setg(begin, begin, end);
    }
};
This simple stream buffer just sets up the stream buffer's "get area" to be the range [begin, end) (begin is used twice because it is possible to set up a "put-back area", which in this case is empty). When there are no more characters to read, the stream will call underflow(), whose default implementation indicates failure. If you want to provide more characters, you'd override underflow() to supply them in a newly set up buffer.
With that you can create a stream using this memory backed stream buffer:
membuf sbuf(begin, end);
std::istream in(&sbuf);
double d = 0;
if (in >> d) { // always check conversions from string to a value...
    // ...
}
For a bit of extra convenience, the creation of the stream buffer and the stream can also be bundled into a class. There is a small trick in that the stream buffer has to be set up before the stream uses it, but it is doable:
class imemstream: private virtual membuf, public std::istream {
public:
    imemstream(char* begin, char* end)
        : membuf(begin, end)
        , std::ios(static_cast<std::streambuf*>(this))
        , std::istream(static_cast<std::streambuf*>(this)) {
    }
};
Just a warning, though: creating a stream is more expensive than copying quite a bit of data. That is, if you want to use that stream in a loop, you probably want to provide functionality to reset the buffer (probably combined with clearing the state flags).

Related

use iostream or alternative for managing stream

I want to write a function which (simplified) takes as a parameter an input buffer of variable size, processes it (sequentially), and returns a buffer of a fixed size. The remaining part of the buffer has to stay in the "pipeline" for the next call of the function.
Question 1:
From my research it looks like iostream is the way to go, but apparently no one is using it. Is this the best way to go?
Question 2:
How can I declare the iostream object globally? Actually, as I have several streams I will need to write the iostream Object in a struct-vector. How do I do this?
At the moment my code looks like that:
struct membuf : std::streambuf
{
    membuf(char* begin, char* end) {
        this->setg(begin, begin, end);
    }
};
void read_stream(char* bufferIn, char* BufferOut, int lengthBufferIn)
{
    char* buffer = (char*) malloc(300);           //How do I do this globally??
    membuf sbuf(buffer, buffer + sizeof(buffer)); //How do I do this globally??
    std::iostream s(&sbuf);                       //How do I do this globally??
    s.write(bufferIn, lengthBufferIn);
    s.read(BufferOut, 100);
    process(BufferOut);
}
I see no need for iostreams here. You can create an object that holds a reference to the buffer (so no copies are involved) and tracks the position it has reached.
So something along this:
class Transformer {
private:
    char const *input_buf_;
public:
    Transformer(char const *buf) : input_buf_(buf) {
    }
    bool has_next() const { return input_buf_ != nullptr; } // or your own condition
    std::array<char, 300> read_next() {
        // read from input_buf_ as much as you need
        // advance input_buf_ to the remaining part
        // make sure to set input_buf_ accordingly after the last part
        // e.g. input_buf_ = nullptr; for how has_next is written here
        return /*the processed fixed size buffer*/;
    }
};
usage:
char *str = /*...*/;
Transformer t(str);
while (t.has_next()) {
    std::array<char, 300> arr = t.read_next();
    // use arr
}
Question 1: From my research it looks like iostream is the way to go, but apparently no one is using it. Is this the best way to go?
Yes (the std::istream class and its specializations are there to manage streams, and they fit the problem well).
Your code could look similar to this:
struct fixed_size_buffer
{
    static const std::size_t size = 300;
    std::vector<char> value;
    fixed_size_buffer() : value(fixed_size_buffer::size, ' ') {}
};
std::istream& operator>>(std::istream& in, fixed_size_buffer& data)
{
    std::noskipws(in); // read spaces as well as other characters
    std::copy_n(std::istream_iterator<char>{ in },
                fixed_size_buffer::size,
                std::begin(data.value)); // this leaves in in an invalid state
                                         // if there is not enough data in the
                                         // input stream
    return in;
}
Consuming the data:
fixed_size_buffer buffer;
std::ifstream fin{ "c:\\temp\\your_data.txt" };
while (fin >> buffer) // read from a file
{
    // do something with buffer here
}
while (std::cin >> buffer) // read from standard input
{
    // do something with buffer here
}
std::istringstream sin{ "long-serialized-string-here" };
while (sin >> buffer) // read from a string stream
{
    // do something with buffer here
}
Question 2: How can I declare the iostream object globally? Actually, as I have several streams I will need to write the iostream Object in a struct-vector. How do I do this?
iostreams do not support copy construction; because of this, you will need to keep them in a sequence of pointers/references to the base class:
auto fin = std::make_unique<std::ifstream>("path_to_input_file");
std::vector<std::istream*> streams;
streams.push_back(&std::cin);
streams.push_back(fin.get());
fixed_size_buffer buffer;
for (auto in_ptr : streams)
{
    std::istream& in = *in_ptr;
    while (in >> buffer)
    {
        // do something with buffer here
    }
}

Saving a game state using serialization C++

I have a class called Game which contains the following:
vector<shared_ptr<A>> attr; // attributes
D diff; // differences
vector<shared_ptr<C>> change; // change
My question is, how can I write these (save) to a file and read/load it up later?
I thought about using a struct with these in it, and simply saving the struct but I have no idea where to start.
This is my attempt so far, just trying to save change. I've read up a lot on the issue, and one of my problems seems to be that I am storing pointers, which would be invalid after the program closes (compounded by the fact that I also free them before exiting).
/* Saves state to file */
void Game::saveGame(string toFile) {
    ofstream ofs(toFile, ios::binary);
    ofs.write((char *)&this->change, sizeof(C));
    /* Free memory code here */
    ....
    exit(0);
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
    ifstream ifs(fromFile, ios::binary);
    ifs.read((char *)&this->change, sizeof(C));
    this->change.toString(); // display load results
};
Can anyone guide me in the right direction for serializing this data? I'd like to use only standard packages, so no boost.
Thanks.
I have no idea how classes A, C, or D are implemented, but that is the first question: how to serialize an object of each class. For the C case, you need to implement something like this:
std::ostream& operator <<(std::ostream& os, const C& c) {
    // ... code to serialize c to an output stream
    return os;
}
std::istream& operator >>(std::istream& is, C& c) {
    // ... code to populate c's contents from the input stream
    return is;
}
or, if you prefer, create a write() and read() function for that class.
Well, if you want to serialize a vector<shared_ptr<C>>, it seems obvious you don't want to serialize the pointers, but the contents. So you need to dereference each of those pointers and serialize what they point to. If the size of the vector is not known before loading it (i.e., it is not always the same), you'll need to store that information as well. Then you can create a pair of functions to serialize the complete vector:
std::ostream& operator <<(std::ostream& os, const std::vector<std::shared_ptr<C>>& vc) {
    // serialize the size of the vector using the << operator
    // for each element of the vector, let it be called 'pc'
        os << *pc << std::endl; // store the element pointed to by the pointer, not the pointer
    return os;
}
std::istream& operator >>(std::istream& is, std::vector<std::shared_ptr<C>>& vc) {
    // read the size of the vector using the >> operator
    // set the size of the vector
    // for each i < size of the vector, let 'auto &pc = vc[i]' be a reference to the i-th element
        C c;     // temporary object
        is >> c; // read the object stored in the stream
        pc = std::make_shared<C>(c); // construct the shared pointer, assuming class C has a copy constructor
    return is;
}
And then,
/* Saves state to file */
void Game::saveGame(string toFile) {
    ofstream ofs(toFile);
    ofs << change;
    ....
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
    ifstream ifs(fromFile);
    ifs >> change;
};
I know there are a lot of things you still need to resolve. I suggest you investigate them yourself so you understand well how to solve your problem.
Not only are you saving pointers, you're trying to save a shared_ptr but using the wrong size.
You need to write serialization functions for all your classes, taking care to never just write the raw bits of a non-POD type. It's safest to always implement member-by-member serialization for everything, because you never know what the future will bring.
Then handling collections of them is just a matter of also storing how many there are.
Example for the Cs:
void Game::save(ofstream& stream, const C& data)
{
    // Save data as appropriate...
}
void Game::saveGame(string toFile) {
    ofstream ofs(toFile, ios::binary);
    size_t count = change.size();
    ofs.write((char *)&count, sizeof(count));
    for (vector<shared_ptr<C>>::const_iterator c = change.begin(); c != change.end(); ++c)
    {
        save(ofs, **c);
    }
};
shared_ptr<C> Game::loadC(ifstream& stream)
{
    shared_ptr<C> data(new C);
    // load the object...
    return data;
}
void Game::loadGame(string fromFile) {
    change.clear();
    size_t count = 0;
    ifstream ifs(fromFile, ios::binary);
    ifs.read((char *)&count, sizeof(count));
    change.reserve(count);
    for (size_t i = 0; i < count; ++i)
    {
        change.push_back(loadC(ifs));
    }
};
All the error handling is missing of course - you would need to add that.
It's actually a good idea to at least start with text storage (using << and >>) instead of binary. It's easier to find bugs, or mess around with the saved state, when you can just edit it in a text editor.
Writing your own serialization is quite a challenge. Even if you do not use boost serialization, I would recommend you learn how to use it and understand how it works rather than discovering it all yourself.
When serializing, you end up with a buffer of data of whose content you have only a very vague idea. You have to save everything you need to be able to restore it, and you read it back chunk by chunk. Example (not compiled, not tested, and not stylish):
void save(ostream& out, const string& s)
{
    out << s.size() << ' '; // length, then a separator
    out.write(s.c_str(), s.size());
}
void load(istream& in, string& s)
{
    unsigned len;
    in >> len;
    in.get(); // skip the separator
    s.resize(len);
    in.read(&s[0], len);
}
struct Game
{
    void save(ostream& out)
    {
        player.save(out);
    };
    void load(istream& in)
    {
        player.load(in);
    }
};
struct Player
{
    void save(ostream& out)
    {
        // save in the same order as loading, serializing everything you need to read it back
        save(out, name);
        save(out, experience);
    }
    void load(istream& in)
    {
        load(in, name);
        load(in, experience);
    }
};
I do not know why you would do this to yourself instead of using boost, but here are some of the cases you should consider:
- type - you must figure out a way to know what "type of change" you actually have there.
- a string (vector, whatever) - size + data (then the first thing you read back from the string is the length, you resize it and copy the "length" number of characters)
- a pointer - save the data pointed by pointer, then upon deserialization you have to allocate it, construct it (usually default construct) and read back the data and reset the members to their respective values. Note: you have to avoid memory leakage.
- polymorphic pointer - ouch you have to know what type the pointer actually points to, you have to construct the derived type, save the values of the derived type... so you have to save type information
- null pointer... you have to distinguish null pointer so you know that you do not need to further read data from the stream.
- versioning - you have to be able to read a data after you added/removed a field
There is too much of it for you to get a complete answer.

How to write custom input stream in C++

I'm currently learning C++ (Coming from Java) and I'm trying to understand how to use IO streams properly in C++.
Let's say I have an Image class which contains the pixels of an image and I overloaded the extraction operator to read the image from a stream:
istream& operator>>(istream& stream, Image& image)
{
// Read the image data from the stream into the image
return stream;
}
So now I'm able to read an image like this:
Image image;
ifstream file("somepic.img");
file >> image;
But now I want to use the same extraction operator to read the image data from a custom stream. Let's say I have a file which contains the image in compressed form. So instead of using ifstream I might want to implement my own input stream. At least that's how I would do it in Java. In Java I would write a custom class extending the InputStream class and implementing the int read() method. So that's pretty easy. And usage would look like this:
InputStream stream = new CompressedInputStream(new FileInputStream("somepic.imgz"));
image.read(stream);
So using the same pattern maybe I want to do this in C++:
Image image;
ifstream file("somepic.imgz");
compressed_stream stream(file);
stream >> image;
But maybe that's the wrong way, don't know. Extending the istream class looks pretty complicated and after some searching I found some hints about extending streambuf instead. But this example looks terribly complicated for such a simple task.
So what's the best way to implement custom input/output streams (or streambufs?) in C++?
Solution
Some people suggested not using iostreams at all and to use iterators, boost or a custom IO interface instead. These may be valid alternatives but my question was about iostreams. The accepted answer resulted in the example code below. For easier reading there is no header/code separation and the whole std namespace is imported (I know that this is a bad thing in real code).
This example is about reading and writing vertical-xor-encoded images. The format is pretty easy. Each byte represents two pixels (4 bits per pixel). Each line is xor'd with the previous line. This kind of encoding prepares the image for compression (it usually results in a lot of 0-bytes, which are easier to compress).
#include <cstring>
#include <fstream>
using namespace std;

/*** vxor_streambuf class ******************************************/
class vxor_streambuf: public streambuf
{
public:
    vxor_streambuf(streambuf *buffer, const int width) :
        buffer(buffer),
        size(width / 2)
    {
        previous_line = new char[size];
        memset(previous_line, 0, size);
        current_line = new char[size];
        setg(0, 0, 0);
        setp(current_line, current_line + size);
    }

    virtual ~vxor_streambuf()
    {
        sync();
        delete[] previous_line;
        delete[] current_line;
    }

    virtual streambuf::int_type underflow()
    {
        // Read line from original buffer
        streamsize read = buffer->sgetn(current_line, size);
        if (!read) return traits_type::eof();

        // Do vertical XOR decoding
        for (int i = 0; i < size; i += 1)
        {
            current_line[i] ^= previous_line[i];
            previous_line[i] = current_line[i];
        }

        setg(current_line, current_line, current_line + read);
        return traits_type::to_int_type(*gptr());
    }

    virtual streambuf::int_type overflow(streambuf::int_type value)
    {
        int write = pptr() - pbase();
        if (write)
        {
            // Do vertical XOR encoding
            for (int i = 0; i < size; i += 1)
            {
                char tmp = current_line[i];
                current_line[i] ^= previous_line[i];
                previous_line[i] = tmp;
            }

            // Write line to original buffer
            streamsize written = buffer->sputn(current_line, write);
            if (written != write) return traits_type::eof();
        }

        setp(current_line, current_line + size);
        if (!traits_type::eq_int_type(value, traits_type::eof())) sputc(value);
        return traits_type::not_eof(value);
    };

    virtual int sync()
    {
        streambuf::int_type result = this->overflow(traits_type::eof());
        buffer->pubsync();
        return traits_type::eq_int_type(result, traits_type::eof()) ? -1 : 0;
    }

private:
    streambuf *buffer;
    int size;
    char *previous_line;
    char *current_line;
};

/*** vxor_istream class ********************************************/
class vxor_istream: public istream
{
public:
    vxor_istream(istream &stream, const int width) :
        istream(new vxor_streambuf(stream.rdbuf(), width)) {}

    virtual ~vxor_istream()
    {
        delete rdbuf();
    }
};

/*** vxor_ostream class ********************************************/
class vxor_ostream: public ostream
{
public:
    vxor_ostream(ostream &stream, const int width) :
        ostream(new vxor_streambuf(stream.rdbuf(), width)) {}

    virtual ~vxor_ostream()
    {
        delete rdbuf();
    }
};

/*** Test main method **********************************************/
int main()
{
    // Read data
    ifstream infile("test.img");
    vxor_istream in(infile, 288);
    char data[144 * 128];
    in.read(data, 144 * 128);
    infile.close();

    // Write data
    ofstream outfile("test2.img");
    vxor_ostream out(outfile, 288);
    out.write(data, 144 * 128);
    out.flush();
    outfile.close();

    return 0;
}
The proper way to create a new stream in C++ is to derive from std::streambuf and to override the underflow() operation for reading and the overflow() and sync() operations for writing. For your purpose you'd create a filtering stream buffer which takes another stream buffer (and possibly a stream from which the stream buffer can be extracted using rdbuf()) as argument and implements its own operations in terms of this stream buffer.
The basic outline of a stream buffer would be something like this:
class compressbuf
    : public std::streambuf {
    std::streambuf* sbuf_;
    char*           buffer_;
    // context for the compression
public:
    compressbuf(std::streambuf* sbuf)
        : sbuf_(sbuf), buffer_(new char[1024]) {
        // initialize compression context
    }
    ~compressbuf() { delete[] this->buffer_; }
    int underflow() {
        if (this->gptr() == this->egptr()) {
            // decompress data into buffer_, obtaining its own input from
            // this->sbuf_; if necessary, resize the buffer
            // the next statement assumes "size" characters were produced (if
            // no more characters are available, size == 0)
            this->setg(this->buffer_, this->buffer_, this->buffer_ + size);
        }
        return this->gptr() == this->egptr()
            ? std::char_traits<char>::eof()
            : std::char_traits<char>::to_int_type(*this->gptr());
    }
};
How underflow() looks exactly depends on the compression library being used. Most libraries I have used keep an internal buffer which needs to be filled and which retains the bytes which are not yet consumed. Typically, it is fairly easy to hook the decompression into underflow().
Once the stream buffer is created, you can just initialize an std::istream object with the stream buffer:
std::ifstream fin("some.file");
compressbuf sbuf(fin.rdbuf());
std::istream in(&sbuf);
If you are going to use the stream buffer frequently, you might want to encapsulate the object construction into a class, e.g., icompressstream. Doing so is a bit tricky because the base class std::ios is a virtual base and is the actual location where the stream buffer is stored. Constructing the stream buffer before passing a pointer to std::ios thus requires jumping through a few hoops: it requires the use of a separate, virtually inherited base class. Here is roughly how this could look:
struct compressstream_base {
    compressbuf sbuf_;
    compressstream_base(std::streambuf* sbuf): sbuf_(sbuf) {}
};
class icompressstream
    : virtual compressstream_base
    , public std::istream {
public:
    icompressstream(std::streambuf* sbuf)
        : compressstream_base(sbuf)
        , std::ios(&this->sbuf_)
        , std::istream(&this->sbuf_) {
    }
};
(I just typed this code without a simple way to test that it is reasonably correct; please expect typos but the overall approach should work as described)
boost (which you should have already if you're serious about C++), has a whole library dedicated to extending and customizing IO streams: boost.iostreams
In particular, it already has decompressing streams for a few popular formats (bzip2, gzlib, and zlib)
As you saw, extending streambuf may be an involved job, but the library makes it fairly easy to write your own filtering streambuf if you need one.
Don't, unless you want to die a terrible death of hideous design. IOstreams are the worst component of the Standard library - even worse than locales. The iterator model is much more useful, and you can convert from stream to iterator with istream_iterator.
I agree with #DeadMG and wouldn't recommend using iostreams. Apart from poor design the performance is often worse than that of plain old C-style I/O. I wouldn't stick to a particular I/O library though, instead, I'd create an interface (abstract class) that has all required operations, for example:
class Input {
public:
    virtual void read(char *buffer, size_t size) = 0;
    // ...
};
Then you can implement this interface for C I/O, iostreams, mmap or whatever.
It is probably possible to do this, but I feel that it's not the "right" usage of this feature in C++. The iostream >> and << operators are meant for fairly simple operations, such as writing the "name, street, town, postal code" of a class Person, not for parsing and loading images. That's much better done using stream::read() - e.g. Image(astream); - and you may implement a stream for compression, as described by Dietmar.

Reading Binary Data with streambuf in C++ through a Class

I am a C programmer trying to begin a new phase of my life in C++ (I know I am still using printf below, but that's because formatting is so easy). I am looking to print out the first byte of a data file from a member function of an object. I think my stream buffer is being destroyed before I can read its data, but I'm lost as to what to do.
My class looks like the following
class MyParser {
    MyParser(string filepath);
    void readHeader();
    streambuf *pbuf;
    long size;
};
My constructor opens the file, gets its buffer, outputs the first byte, and returns. (I think pbuf is dying at the end of this code.) This code outputs First Byte (in constructor): 0x8C
MyParser::MyParser(string filepath) {
    ifstream file(filepath.c_str(), ios::in | ios::binary);
    pbuf = file.rdbuf();
    size = pbuf->pubseekoff(0, ios::end, ios::in);
    pbuf->pubseekpos(0, ios::in);
    unsigned char byte = pbuf->sgetc();
    printf("First Byte (in constructor): 0x%02X\n", byte);
    return;
}
My readHeader is supposed to dump the first byte, but based on the output all I see is First Byte (in readHeader): 0xFF
void MyParser::readHeader() {
    unsigned char byte = pbuf->sgetc();
    printf("First Byte (in readHeader): 0x%02X\n", byte);
}
My main simply creates a parser and tries to readHeader
int main() {
    MyParser parser("../data/data.bin");
    parser.readHeader();
}
I think the solution to my problem is to create a new streambuffer but new streambuf(file.rdbuf()) isn't working for me. Any advice?
Your program has undefined behavior: the stream buffer you keep is owned by the std::ifstream you open in the body of your constructor. When this object dies, the stream buffer is released as well. The easiest way to avoid this problem is to make the std::ifstream a member of your class: this binds the life-time of the stream to your object. It may also be easier to use the std::istream interface for your parsing rather than the somewhat awkward std::streambuf interface.
If you really want to use just a stream buffer, you can allocate a std::filebuf using new filebuf and then open() the file directly. To keep a pointer to it, you would probably use std::unique_ptr<std::filebuf> (or std::auto_ptr<std::filebuf> if you are not using C++ 2011). Using the pointer class arranges for automatic release of the object. Of course, the pointer would still be a member of your class to get the life-times right.
Your attempt to copy a stream buffer didn't work because stream buffers are not copyable. You'd need to create the file buffer directly:
MyParser::MyParser(std::string const& filename)
    : pbuf(new std::filebuf)
{
    this->pbuf->open(filename, std::ios_base::in);
    ...
}
You need some new C++ teaching material, because (sorry) but this is just so wrong. You need to declare the filestream as a member, there's no need for any new anywhere in this program, and pretty much nobody, ever, needs to deal with streambuf.
class MyParser {
    std::ifstream file;
public:
    MyParser(string filepath) {
        file.open(filepath, std::ios::in | std::ios::binary);
        char byte;
        file.read(&byte, sizeof(byte));
        printf("First Byte (in constructor): 0x%02X\n", (unsigned char)byte);
    }
    void readHeader() {
        char byte;
        file.read(&byte, sizeof(byte));
        printf("First Byte (in readHeader): 0x%02X\n", (unsigned char)byte);
    }
};

Can boost iostreams read and compress gzipped files on the fly?

I am reading a gzipped file using boost iostreams:
The following works fine:
namespace io = boost::iostreams;
io::filtering_istream in;
in.push(boost::iostreams::basic_gzip_decompressor<>());
in.push(io::file_source("test.gz"));
stringstream ss;
copy(in, ss);
However, I don't want to take the memory hit of reading an entire gzipped file
into memory. I want to be able to read the file incrementally.
For example, if I have a data structure X that initializes itself from istream,
X x;
x.read(in);
fails. Presumably this is because we may have to put back characters into the stream
if we are doing partial streams. Any ideas whether boost iostreams supports this?
According to the iostreams documentation, the type boost::io::filtering_istream derives from std::istream. That is, it should be possible to pass it everywhere a std::istream& is expected. If you get errors at run time because you need to unget() or putback() characters, you should have a look at the pback_size parameter, which specifies how many characters can be put back at most. I haven't seen in the documentation what the default value for this parameter is.
If this doesn't solve your problem, can you describe what your problem is exactly? From the looks of it, this should work.
I think you need to write your own filter. For instance, to read a .tar.gz and output the files contained, I wrote something like
//using namespace std;
namespace io = boost::iostreams;

struct tar_expander
{
    tar_expander() : out(0), status(header)
    {
    }
    ~tar_expander()
    {
        delete out;
    }

    /* qualify filter */
    typedef char char_type;
    struct category :
        io::input_filter_tag,
        io::multichar_tag
    { };

    template<typename Source>
    void fetch_n(Source& src, std::streamsize n = block_size)
    {
        /* my utility */
        ....
    }

    // Read up to n filtered characters into the buffer s,
    // returning the number of characters read or -1 for EOF.
    // Use src to access the unfiltered character sequence
    template<typename Source>
    std::streamsize read(Source& src, char* s, std::streamsize n)
    {
        fetch_n(src);
        const tar_header &h = cast_buf<tar_header>();
        int r;
        if (status == header)
        {
            ...
        }
    }

    std::ofstream *out;
    size_t fsize, stored;
    static const size_t block_size = 512;
    std::vector<char> buf;
    enum { header, store_file, archive_end } status;
};
When called, my read(Source&, ...) function receives the unzipped text.
To use the filter:
ifstream file("/home/..../resample-1.8.1.tar.gz", ios_base::in | ios_base::binary);
io::filtering_streambuf<io::input> in;
in.push(tar_expander());
in.push(io::gzip_decompressor());
in.push(file);
io::copy(in, cout);