Are ifstream/ofstream really used for serialization? - c++

I am using ifstream and ofstream to serialize my data, but I was surprised to discover that the `<<` operator can't separate two adjacent strings, and separating them would be quite complicated.
class Name
{
    string first_name;
    string last_name;
    friend std::ostream& operator<< (std::ostream& os, const Name& _name)
    {
        os << _name.first_name << _name.last_name;
        return os;
    }
    friend std::istream& operator>> (std::istream& is, Name& _name)
    {
        is >> _name.first_name >> _name.last_name;
        return is;
    }
};
This doesn't work because << and >> don't write null terminator characters, and ifstream reads the whole string into one variable (first_name), which is kind of disappointing. How can I store the two strings separately so I can read them separately as well? I don't understand what is the motivation of << concatenating all the strings in ostream so we can't read them back separately!?

I don't understand what is the motivation of << concatenating all the strings in ostream so we can't read them back separately!?
This assumes that the only reason to write them separately is to read them as individual strings. Consider the case where someone has a pair of strings that they want to write to a stream without separators. Or a string followed by a float that they don't want separators for.
If ostreams automatically inserted separators for every << output, then it would be much harder for someone to write text without separators. They'd have to manually concatenate these strings and/or values into a single string, then output that.
And what would they use for this concatenation? They can't use ostringstream like you normally might, because it uses the same facilities as ofstream. So every << would put a separator character in the stream.
In short, the IO streams API writes what you told it to write, not what you may or may not "want" to write. It's not a serialization API; C++ isn't C# or Java. If you want serious serialization features, use Boost.Serialization.

Often times you want to concatenate strings with ostream (commonly stringstream). If you specifically don't want them concatenated it's easy enough to do:
os << _name.first_name << '\n' << _name.last_name;
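To make the separator idea concrete, here is a minimal sketch of a full round trip. The `Name` struct is a simplified stand-in for the class in the question (public members, no friends), and `round_trip` and `demo_round_trip` are hypothetical helpers added only for illustration. It assumes the fields never contain a newline themselves.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for the Name class from the question.
struct Name {
    std::string first_name;
    std::string last_name;
};

// Write each field on its own line so the reader can find the boundary.
std::ostream& operator<<(std::ostream& os, const Name& n) {
    // Assumption of this sketch: a field never contains '\n'.
    assert(n.first_name.find('\n') == std::string::npos);
    assert(n.last_name.find('\n') == std::string::npos);
    return os << n.first_name << '\n' << n.last_name << '\n';
}

// Read one line per field; std::getline keeps embedded spaces.
std::istream& operator>>(std::istream& is, Name& n) {
    std::getline(is, n.first_name);
    std::getline(is, n.last_name);
    return is;
}

// Hypothetical helper: serialize to memory and read the value back.
Name round_trip(const Name& in) {
    std::stringstream ss;
    ss << in;
    Name out;
    ss >> out;
    return out;
}

// Small self-check showing that both fields survive, spaces included.
void demo_round_trip() {
    const Name original = {"Mary Ann", "Smith"};
    const Name copy = round_trip(original);
    assert(copy.first_name == "Mary Ann");
    assert(copy.last_name == "Smith");
}
```

Using `std::getline` on the reading side rather than `>>` means a first name with a space in it is not split in two.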

ifstream and ofstream are just streams, so nothing in them marks where one piece of data ends and the next begins. Think of them as a river: any data can be read from or written to them. That is the true nature of files, so if you need them for serialization you must implement your own serialization mechanism or use a library designed for this purpose, such as boost::serialization. In C++ everything is implemented as is, and because of this you can get maximum performance!! :)

Related

How can I keep track of the number of bytes written to an std::ostream object?

I want to write a bunch of data to an ostream object and return the number of bytes written. For example:
using namespace std;
size_t writeStuffToStream(ostream &stream)
{
stream << some_string << some_integer << some_other_arbitrary_object << endl;
return number_of_bytes_written;
}
There is the obvious workaround of writing everything to a stringstream and getting the byte count out of that, and then writing the stringstream to the stream, but that takes extra time and memory.
I also realize that if all the data I wanted to write were preexisting strings, then there would be no problem. It's some_integer and some_other_arbitrary_object that are the problem.
Use the ostream tellp() method.
Note that this might fail if the provided ostream does not support positions. In that case you can create a temporary ostringstream to format your data, then extract the string, get its length and send it to the provided ostream.
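A rough sketch of that approach follows. `write_counted` and `demo_write_counted` are hypothetical names, and the string-plus-integer payload only stands in for whatever you actually write. The function compares `tellp()` before and after the write, and falls back to a temporary `ostringstream` when the stream reports no position. On a file opened in text mode the position difference may not equal the number of characters you inserted, so treat this as an illustration rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Writes a string and an integer, and reports how many characters were added.
std::size_t write_counted(std::ostream& os, const std::string& text, int number) {
    typedef std::ostream::pos_type pos_type;
    const pos_type before = os.tellp();
    if (before != pos_type(-1)) {
        // The stream supports positions: measure the difference.
        os << text << number << '\n';
        const pos_type after = os.tellp();
        return static_cast<std::size_t>(after - before);
    }
    // No position available: format into memory first, then forward.
    std::ostringstream tmp;
    tmp << text << number << '\n';
    const std::string formatted = tmp.str();
    os << formatted;
    return formatted.size();
}

// Small self-check using an in-memory stream, which supports tellp().
void demo_write_counted() {
    std::ostringstream os;
    os << "xx";                                 // earlier content is not counted
    assert(write_counted(os, "abc", 42) == 6);  // "abc42\n" is six characters
    assert(os.str() == "xxabc42\n");
}
```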
You can probably also write a custom ostream that sends to another ostream and counts the emitted characters. I expected to find a virtual method to override in ostream to write just characters, but I did not find it :( You can reuse the stringstream code and replace the buffer writes with writes to another ostream. string-stream.cc is about 500 lines long, so that's not that bad.
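The virtual hooks for "write just characters" live on `std::streambuf` rather than on `ostream` (`overflow`, `xsputn` and `sync`), so a counting forwarder can be fairly small. The sketch below is one possible shape, not a tested library component; `CountingBuf`, `write_and_count` and `demo_counting_buf` are hypothetical names. Note that the temporary `ostream` does not inherit formatting flags from the target stream.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Forwards every character to another stream buffer and counts what was accepted.
class CountingBuf : public std::streambuf {
public:
    explicit CountingBuf(std::streambuf* target) : target_(target), count_(0) {
        assert(target_ != 0);
    }
    std::size_t count() const { return count_; }

protected:
    // Called per character, because this buffer sets up no put area.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const int_type r = target_->sputc(traits_type::to_char_type(ch));
        if (traits_type::eq_int_type(r, traits_type::eof()))
            return traits_type::eof();
        ++count_;
        return ch;
    }
    // Bulk writes are passed through and counted by the amount accepted.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        const std::streamsize written = target_->sputn(s, n);
        count_ += static_cast<std::size_t>(written);
        return written;
    }
    // Propagate flush requests (for example from std::endl) to the target.
    int sync() override { return target_->pubsync(); }

private:
    std::streambuf* target_;
    std::size_t count_;
};

// Hypothetical helper: write through a counting wrapper and return the count.
std::size_t write_and_count(std::ostream& out, const std::string& text, int number) {
    CountingBuf buf(out.rdbuf());
    std::ostream counted(&buf);
    counted << text << number << std::endl;
    return buf.count();
}

void demo_counting_buf() {
    std::ostringstream os;
    assert(write_and_count(os, "abc", 42) == 6);  // "abc42\n"
    assert(os.str() == "abc42\n");
}
```

Unlike the `tellp()` approach, this works on streams that cannot report a position, and it needs no intermediate copy of the formatted text.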

Defining the structure of a binary file in C++ 11

Since I have to work with binary files a lot, I would like a more abstract way to do that. I have to perform the same loop over and over:
write a header
write different kinds of chunks (with different sets of values) in a given order
write an optional closing header
Now I would like to break this problem down into small building blocks. Imagine being able to write something like what the DTD is for XML: a definition of what can appear after a given chunk or inside a given semantic. Then I could think about my files in terms of building blocks instead of hex values, and the code would be much more "idiomatic" and less cryptic.
In the end, is there something in the language that can help me with binary files from this perspective?
I'm not sure about C++11 specific features, but for C++ in general, streams make file I/O much easier to work with. You can overload the stream insertion (<<) and stream extraction (>>) operators to accomplish your goals. If you're not very familiar with operator overloading, chapter 9 of this site explains it well, with numerous examples. Here's the particular page for overloading the << and >> operators in the context of streams.
Allow me to illustrate what I mean. Suppose we define a few classes:
BinaryFileStream - which represents the file you are trying to write to and (possibly) read from.
BinaryFileStreamHeader - which represents the file header.
BinaryFileStreamChunk - which represents one chunk.
BinaryFileStreamClosingHeader - which represents the closing header.
Then, you can overload the stream insertion and extraction operators in your BinaryFileStream to write and read the file (or any other istream or ostream).
...
#include <iostream> // I/O stream definitions, you can specify your overloads for
                    // ifstream and ofstream, but doing so for istream and ostream is
                    // more general
#include <vector>   // For holding the chunks
class BinaryFileStream
{
public:
    ...
    // Write binary stream (returns a non-const reference so calls can be chained)
    friend std::ostream& operator<<( std::ostream& os, const BinaryFileStream& bfs )
    {
        // Write header
        os << bfs.mHeader;
        // Write chunks (const_iterator, because bfs is const)
        std::vector<BinaryFileStreamChunk>::const_iterator it;
        for( it = bfs.mChunks.begin(); it != bfs.mChunks.end(); ++it )
        {
            os << (*it);
        }
        // Write closing header
        os << bfs.mClosingHeader;
        return os;
    }
    ...
private:
    BinaryFileStreamHeader mHeader;
    std::vector<BinaryFileStreamChunk> mChunks;
    BinaryFileStreamClosingHeader mClosingHeader;
};
All you must do then, is have operator overloads for your BinaryFileStreamHeader, BinaryFileStreamChunk and BinaryFileStreamClosingHeader classes that convert their data into the appropriate binary representation.
You can overload the stream extraction operator (>>) in an analogous way, though some extra work may be required for parsing.
Hope this helps.
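As an illustration of what one of those per-block overloads might look like, here is a sketch for a header. The layout (four magic bytes plus a 32-bit little-endian chunk count) and the names `FileHeader`, `write_u32_le`, `read_u32_le` and `demo_file_header` are all assumptions made up for this example, not anything taken from the question. The overloads use unformatted `write`/`read` internally, since formatted `<<` on an integer would emit decimal text rather than bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical header layout: 4 magic bytes followed by a 32-bit chunk count.
struct FileHeader {
    char magic[4];
    std::uint32_t chunk_count;
};

// Write a 32-bit value as four little-endian bytes, independent of host byte order.
inline void write_u32_le(std::ostream& os, std::uint32_t v) {
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((v >> (8 * i)) & 0xFFu);
    os.write(bytes, 4);
}

// Read four little-endian bytes back into a 32-bit value.
inline std::uint32_t read_u32_le(std::istream& is) {
    unsigned char bytes[4] = {0, 0, 0, 0};
    is.read(reinterpret_cast<char*>(bytes), 4);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    return v;
}

std::ostream& operator<<(std::ostream& os, const FileHeader& h) {
    os.write(h.magic, 4);
    write_u32_le(os, h.chunk_count);
    return os;
}

std::istream& operator>>(std::istream& is, FileHeader& h) {
    is.read(h.magic, 4);
    h.chunk_count = read_u32_le(is);
    return is;
}

// Self-check: 258 is 0x00000102, so the low byte (0x02) is written first.
void demo_file_header() {
    const FileHeader h = {{'D', 'E', 'M', 'O'}, 258u};
    std::ostringstream os;
    os << h;
    const std::string bytes = os.str();
    assert(bytes.size() == 8);
    assert(bytes[4] == 2 && bytes[5] == 1 && bytes[6] == 0 && bytes[7] == 0);

    std::istringstream is(bytes);
    FileHeader back;
    is >> back;
    assert(back.magic[0] == 'D' && back.magic[3] == 'O');
    assert(back.chunk_count == 258u);
}
```

When writing to a real file, open it with `std::ios::binary` so that no newline translation is applied to the bytes.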

Is it possible to manipulate some text with a user-defined I/O manipulator?

Is there a (clean) way to manipulate some text from std::cin before inserting it into a std::string, so that the following would work:
cin >> setw(80) >> Uppercase >> mystring;
where mystring is std::string (I don't want to use any wrappers for strings).
Uppercase is a manipulator. I think it needs to act on the Chars in the buffer directly (no matter what is considered uppercase rather than lowercase now). Such a manipulator seems difficult to implement in a clean way, as user-defined manipulators, as far as I know, are used to just change or mix some pre-determined format flags easily.
(Non-extended) manipulators usually only set flags and data which the extractors afterwards read and react to. (That is what xalloc, iword, and pword are for.) What you could, obviously, do, is to write something analogous to std::get_money:
#include <boost/algorithm/string.hpp> // boost::to_upper
#include <iomanip>                    // std::setw
#include <iostream>
#include <string>
struct uppercasify {
    uppercasify(std::string &s) : ref(s) {}
    uppercasify(const uppercasify &other) : ref(other.ref) {}
    std::string &ref;
};
std::istream &operator>>(std::istream &is, uppercasify uc) { // or &&uc in C++11
    is >> uc.ref;
    boost::to_upper(uc.ref);
    return is;
}
cin >> setw(80) >> uppercasify(mystring);
Alternatively, cin >> uppercase could return not a reference to cin, but an instantiation of some (template) wrapper class uppercase_istream, with the corresponding overload for operator>>. I don't think having a manipulator modify the underlying stream buffer's contents is a good idea.
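A minimal sketch of that wrapper idea, using only the standard library, might look like the following. `UppercaseTag`, `Uppercase`, `UppercaseIstream` and `demo_uppercase` are hypothetical names, and the wrapper here is a plain class rather than a template. Extracting the tag returns the wrapper, and extracting a string from the wrapper upper-cases it and hands back the ordinary stream, so the effect lasts for one extraction only.

```cpp
#include <cassert>
#include <cctype>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical tag object; extracting it switches to the wrapper below.
struct UppercaseTag {};
const UppercaseTag Uppercase = {};

// Thin wrapper that remembers the stream and upper-cases the next string read.
class UppercaseIstream {
public:
    explicit UppercaseIstream(std::istream& is) : is_(is) {}

    // Extract a string as usual, convert it, and hand back the plain stream.
    friend std::istream& operator>>(UppercaseIstream wrapped, std::string& s) {
        if (wrapped.is_ >> s) {
            for (std::string::size_type i = 0; i < s.size(); ++i)
                s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
        }
        return wrapped.is_;
    }

private:
    std::istream& is_;
};

// "stream >> Uppercase" yields the wrapper instead of the stream itself.
inline UppercaseIstream operator>>(std::istream& is, UppercaseTag) {
    return UppercaseIstream(is);
}

void demo_uppercase() {
    std::istringstream in("hello world");
    std::string first, second;
    in >> std::setw(80) >> Uppercase >> first;  // affected by the wrapper
    in >> second;                               // plain extraction again
    assert(first == "HELLO");
    assert(second == "world");
}
```

The stream buffer's contents are never modified; only the string that was just extracted is converted, using `std::toupper` from the C locale.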
If you're desperate enough, I guess you could also imbue a hand-crafted locale resulting in uppercasing strings. I don't think I'd let anything like that go through a code review, though – it's simply just waiting to surprise and bite the next person working on the code.
You may want to check out boost iostreams. Its framework allows defining filters which can manipulate the stream. http://www.boost.org/doc/libs/1_49_0/libs/iostreams/doc/index.html

difference between cout and write in c++

I am still confused about the difference between ostream& write ( const char* s , streamsize n ) in c++ and cout in c++
The first function writes the block of data pointed by s, with a size of n characters, into the output buffer. The characters are written sequentially until n have been written.
whereas cout is an object of class ostream that represents the standard output stream. It corresponds to the cstdio stream stdout.
Can anyone clearly bring out the differences between the two?
ostream& write ( const char* s , streamsize n );
Is an Unformatted output function and what is written is not necessarily a c-string, therefore any null-character found in the array s is copied to the destination and does not end the writing process.
cout is an object of class ostream that represents the standard output stream.
It can write characters either as formatted data using for example the insertion operator ostream::operator<< or as Unformatted data using the write member function.
You are asking what is the difference between a class member function and an instance of the class? cout is an ostream and has a write() method.
As to the difference between cout << "Some string" and cout.write("Some string", 11): It does the same, << might be a tiny bit slower since write() can be optimized as it knows the length of the string in advance. On the other hand, << looks nice and can be used with many types, such as numbers. You can write cout << 5;, but not cout.write(5).
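One case where the two visibly differ is data containing a null character. The sketch below uses an in-memory `ostringstream` instead of `cout` so the result can be checked; `kData`, `formatted_size`, `unformatted_size` and `demo_write_vs_insert` are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Five bytes with a null character in the middle: 'a', 'b', '\0', 'c', 'd'.
const char kData[] = {'a', 'b', '\0', 'c', 'd'};

// Formatted insertion treats the pointer as a C string and stops at the null.
inline std::size_t formatted_size() {
    std::ostringstream os;
    os << kData;
    return os.str().size();
}

// Unformatted write copies exactly the requested number of bytes.
inline std::size_t unformatted_size() {
    std::ostringstream os;
    os.write(kData, sizeof kData);
    return os.str().size();
}

void demo_write_vs_insert() {
    assert(formatted_size() == 2);    // only "ab" was written
    assert(unformatted_size() == 5);  // all five bytes, null included
}
```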
cout is not a function. Like you said, it is an object of class ostream. And as an object of that class, it possesses the write function, which can be called like this:
cout.write(source,size);
"In binary files, to input and output data with the extraction and insertion operators (<< and >>) and functions like getline is not efficient, since we do not need to format any data, and data may not use the separation codes used by text files to separate elements (like space, newline, etc...).
File streams include two member functions specifically designed to input and output binary data sequentially: write and read. The first one (write) is a member function of ostream inherited by ofstream. And read is a member function of istream that is inherited by ifstream. Objects of class fstream have both members. Their prototypes are:
write ( memory_block, size );
read ( memory_block, size );
"
from: http://www.cplusplus.com/doc/tutorial/files/
There is no function ostream& write ( const char* s , streamsize n ). Perhaps you are referring to the member function ostream& ostream::write ( const char* s , streamsize n )?
The .write() function is called raw (or unformatted) output. It simply outputs a series of bytes into the stream.
The global variable cout is one instance of class ostream and has the .write() method. However, cout is typically used for formatted output, such as:
string username = "Poulami";
cout << "Username: '" << username << "'." << endl;
Many different types have an ostream& operator<<(ostream& stream, const UserDefinedType& data) overload, and you can add your own to enrich ostream's vocabulary.
Oh boy! A chance to smash up a question.
From your question I feel you are a Java or Python programmer and definitely not a beginner.
You don't understand that C++, unlike Java, allows programmers to implement the built-in operators as class members and as part of the general interface.
In Java you could never go
class Money
{
int operator + (int cash) { return this.cash + cash; }
void operator << () { System.out.println(cash); }
int cash;
}
public class Main_
{
public static void Main(String [] args)
{
Money cashOnHand;
System << cashOnHand;
}
}
But cpp allows this with great effect. class std::ostream implements the stream operators but also implements a regular write function which does raw binary operations.
I agree with Alok Save! A little while ago I searched for this problem and read the answer carefully.
Maybe in other words: cout is an object of class ostream, while write is just a member function it provides. So there are two ways to use cout: one is through a member function such as write, the other is through operator<<.

How to write an object to file in C++

I have an object with several text strings as members. I want to write this object to the file all at once, instead of writing each string to file. How can I do that?
You can overload operator>> and operator<< to read from and write to a stream.
Example Entry struct with some values:
struct Entry2
{
string original;
string currency;
Entry2() {}
Entry2(string& in);
Entry2(string& original, string& currency)
: original(original), currency(currency)
{}
};
istream& operator>>(istream& is, Entry2& en);
ostream& operator<<(ostream& os, const Entry2& en);
Implementation:
using namespace std;
istream& operator>>(istream& is, Entry2& en)
{
is >> en.original;
is >> en.currency;
return is;
}
ostream& operator<<(ostream& os, const Entry2& en)
{
os << en.original << " " << en.currency;
return os;
}
Then you open filestream, and for each object you call:
ifstream in(filename.c_str());
Entry2 e;
in >> e;
//if you want to use read (only safe for trivially copyable types, not for a struct holding std::string):
//in.read(reinterpret_cast<char*>(&e),sizeof(e));
in.close();
Or output:
Entry2 e;
// set values in e
ofstream out(filename.c_str());
out << e;
out.close();
Or if you want to use stream read and write then you just replace relevant code in operators implementation.
When the variables are private inside your struct/class, you need to declare the operators as friend functions.
You can implement any format/separators that you like. When your strings include spaces, use the getline() that takes a stream and a string instead of >>, because operator>> uses whitespace as the delimiter by default. It depends on your separators.
It's called serialization. There are many serialization threads on SO.
There is also a nice serialization library included in Boost.
http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html
basically you can do
myFile<<myObject
and
myFile>>myObject
with boost serialization.
If you have:
struct A {
    char a[30], b[25], c[15];
    int x;
};
then you can write it all just with write(fh, ptr, sizeof(struct A)).
Of course, this isn't portable (because we're not saving the endianness or size of "int"), but that may not be an issue for you.
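With C++ streams, the same idea could be sketched as below. `Record`, `write_raw`, `read_raw` and `demo_raw_record` are hypothetical names, and the `static_assert` is there to document the assumption that the type holds only plain data. The bytes written include any padding and use the host's `int` size and byte order, so the file is only readable by a program built with the same layout.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical record containing only fixed-size plain data.
struct Record {
    char a[30];
    char b[25];
    char c[15];
    int x;
};
static_assert(std::is_trivially_copyable<Record>::value,
              "raw byte copies require a trivially copyable type");

// Dump the object's bytes as they sit in memory (padding included).
inline void write_raw(std::ostream& os, const Record& r) {
    os.write(reinterpret_cast<const char*>(&r), sizeof r);
}

// Read the bytes back; returns false if fewer than sizeof(Record) bytes arrived.
inline bool read_raw(std::istream& is, Record& r) {
    is.read(reinterpret_cast<char*>(&r), sizeof r);
    return is.gcount() == static_cast<std::streamsize>(sizeof r);
}

void demo_raw_record() {
    Record original = Record();  // value-initialize so every byte is zero
    std::strcpy(original.a, "alpha");
    original.x = 7;

    std::ostringstream os;
    write_raw(os, original);
    assert(os.str().size() == sizeof(Record));

    std::istringstream is(os.str());
    Record copy;
    assert(read_raw(is, copy));
    assert(std::strcmp(copy.a, "alpha") == 0);
    assert(copy.x == 7);
}
```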
If you have:
struct A {
    char *a, *b, *c;
    int d;
};
then you're not looking to write the object; you're looking to serialize it. Your best bet is to look in the Boost libraries and use their serialization routines, because it's not an easy problem in languages without reflection.
There's not really a simple way, it's C++ after all, not PHP, or JavaScript.
http://www.parashift.com/c++-faq-lite/serialization.html
Boost also has some library for it: http://www.boost.org/doc/libs/release/libs/serialization ... like Tronic already mentioned :)
The better method is to write each field individually along with the string length.
As an alternative, you can create a char array (or std::vector<char>) and write all the members into the buffer, then write the buffer to the output.
The underlying thorn is that a compiler is allowed to insert padding between members in a class or structure. Using memcpy or std::copy on the whole object will result in padding bytes being written to the output.
Just remember that you need to either write the string lengths and the content or the content followed by some terminating character.
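Here is one way the length-plus-content scheme and the single-buffer idea could fit together. The 32-bit little-endian length prefix and the names `append_string`, `read_string`, `write_pair` and `demo_length_prefixed` are choices made for this sketch only. The reader does not validate the length it reads, which real code handling untrusted files would need to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Append a 32-bit little-endian length followed by the string's bytes.
inline void append_string(std::vector<char>& buf, const std::string& s) {
    assert(s.size() <= 0xFFFFFFFFu);  // must fit in the 32-bit prefix
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    for (int i = 0; i < 4; ++i)
        buf.push_back(static_cast<char>((len >> (8 * i)) & 0xFFu));
    buf.insert(buf.end(), s.begin(), s.end());
}

// Read one length-prefixed string; returns false on truncated input.
inline bool read_string(std::istream& is, std::string& s) {
    unsigned char raw[4];
    if (!is.read(reinterpret_cast<char*>(raw), 4)) return false;
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i)
        len |= static_cast<std::uint32_t>(raw[i]) << (8 * i);
    s.resize(len);
    if (len == 0) return true;
    return static_cast<bool>(is.read(&s[0], static_cast<std::streamsize>(len)));
}

// Serialize two strings into one buffer and emit it with a single write call.
inline void write_pair(std::ostream& os, const std::string& first, const std::string& second) {
    std::vector<char> buf;
    append_string(buf, first);
    append_string(buf, second);
    os.write(buf.data(), static_cast<std::streamsize>(buf.size()));
}

void demo_length_prefixed() {
    std::ostringstream os;
    write_pair(os, "Mary Ann", "Smith");
    assert(os.str().size() == 4 + 8 + 4 + 5);

    std::istringstream is(os.str());
    std::string first, second;
    assert(read_string(is, first) && read_string(is, second));
    assert(first == "Mary Ann");
    assert(second == "Smith");
}
```

Because each string carries its own length, the content may contain spaces, newlines or null characters without confusing the reader.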
Other people will suggest checking out the Boost Serialization library.
Unfortunately that is generally not quite possible. If your struct only contains plain data (no pointers or complex objects), you can store it as one chunk, but care must be taken if portability is an issue. Padding, data type size and endianness issues make this problematic.
You can use Boost.Serialization to minimize the amount of code required for proper portable and versionable serialization.
Assuming your goal is as stated, to write out the object with a single call to write() or fwrite() or whatever, you'd first need to copy the string and other object data into a single contiguous block of memory. Then you could write() that block of memory out with a single call. Or you might be able to do a vector-write by calling writev(), if that call is available on your platform.
That said, you probably won't gain much by reducing the number of write calls. Especially if you are using fwrite() or similar already, then the C library is already doing buffering for you, so the cost of multiple small calls is minimal anyway. Don't put yourself through a lot of extra pain and code complexity unless it will actually do some good...