Use C++ strings in file handling - c++

How do I use C++ strings in file handling? I created a class with C++ strings among its private data members, but reading it from a file gives an error, even though I am not manipulating the strings at that point and they were initialised with default values in the constructor. There is no problem while writing to the file. It works fine if I use C strings instead, but I don't want to. Is there a way to solve this?
class budget
{
float balance;
string due_name,loan_name; //string objects
int year,month;
float due_pay,loan_given;
public:
budget()
{
balance=0;
month=1;
due_name="NO BODY"; //default values
loan_name="SAFE";
year=0;
balance = 0;
due_pay=0;
loan_given=0;
}
.
.
.
};
void read_balance() //PROBLEM AFTER ENTERING THIS FUNCTION
{
system("cls");
budget b;
ifstream f1;
f1.open("balance.dat",ios::in|ios::binary);
while(f1.read((char*)&b,sizeof(b)))
{ b.show_data();
}
system("cls");
cout<<"No More Records To Display!!";
getch();
f1.close();
}

std::string is a non-POD data type. You cannot read or write a string object with the stream read/write functions.
basic_istream<charT,traits>& read(char_type* s, streamsize n);
Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, if !good() calls setstate(failbit) which may throw an exception, and return. Otherwise extracts characters and stores them into successive locations of an array whose first element is designated by s. Characters are extracted and stored until either of the following occurs:
- n characters are stored;
- end-of-file occurs on the input sequence (in which case the function calls setstate(failbit | eofbit), which may throw ios_base::failure (27.5.5.4)).
Returns: *this.
Nothing there says how the members of std::string are laid out. Look at, or use, boost::serialization: http://www.boost.org/doc/libs/1_50_0/libs/serialization/doc/index.html. Of course, you can also write the size of the string followed by its data, and when reading, read the size, allocate an array of that size, read the data into the array and then create the string. But using Boost is better.

While reading the string members (due_name,loan_name) of your class budget your code literally fills them byte by byte. While it makes sense for floats and ints it won't work for strings.
Strings are designed to hold an 'unlimited' amount of text, so their constructors, copy constructors, concatenation and so on must allocate the actual piece of memory that stores the text, expand it if necessary, and delete it upon destruction. Filling strings this way from disk leaves invalid pointers inside your string objects (pointers that do not point to memory containing the text); no text is actually read this way at all.
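To make that concrete, here is a small sketch. It assumes a typical standard library implementation, and the helper names (text_is_inside_object, check_long_string) are made up for illustration. It checks that a long string's characters live outside the bytes of the string object, which is why copying those bytes to and from disk cannot preserve the text.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// std::string manages its own storage, so it is not trivially copyable:
// copying its object bytes (which is what read()/write() on the whole
// class does) is not a valid way to save or restore it.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be treated as raw bytes");

// Hypothetical helper: reports whether the characters of s are stored
// inside the bytes of the std::string object itself.
bool text_is_inside_object(const std::string& s)
{
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    std::less<const char*> before;  // total order even for unrelated pointers
    return !before(s.data(), object_begin) && before(s.data(), object_end);
}

// With 1000 characters the text cannot fit in the object itself, so the
// object's bytes hold only bookkeeping, typically a pointer and sizes.
void check_long_string()
{
    const std::string text(1000, 'x');
    assert(sizeof(text) < text.size());
    assert(!text_is_inside_object(text));
}
```

Short strings may well be stored inside the object on some implementations (the small string optimisation), but that is an implementation detail and does not make the raw byte copy valid.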

The easiest way to solve this is to not use C++ strings in that class. Work out the maximum length for each of the strings you will be storing, and make a char array that is one byte longer (to allow for the 0-terminator). Now you can read and write that class as binary without worrying about serialization etc.
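A minimal sketch of that fixed-size approach follows. The struct name BudgetRecord, the helper names and the maximum length of 32 characters are assumptions for illustration, not values from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <type_traits>

// Fixed-size record: every member is stored inside the object itself.
struct BudgetRecord
{
    float balance;
    int year, month;
    float due_pay, loan_given;
    char due_name[33];   // assumed maximum of 32 characters + '\0'
    char loan_name[33];
};

// Only trivially copyable types may be moved around as raw bytes.
static_assert(std::is_trivially_copyable<BudgetRecord>::value,
              "BudgetRecord must be safe to read/write as raw bytes");

// Copy text into a fixed array, truncating if needed and always terminating.
template <std::size_t N>
void set_name(char (&dest)[N], const char* src)
{
    assert(src != nullptr);
    std::strncpy(dest, src, N - 1);
    dest[N - 1] = '\0';
}

bool write_record(std::ostream& out, const BudgetRecord& r)
{
    out.write(reinterpret_cast<const char*>(&r), sizeof r);
    return out.good();
}

// Returns false when a complete record could not be read (e.g. at end of file).
bool read_record(std::istream& in, BudgetRecord& r)
{
    in.read(reinterpret_cast<char*>(&r), sizeof r);
    return in.gcount() == static_cast<std::streamsize>(sizeof r);
}
```

The functions take any std::ostream or std::istream, so a file opened with ios::binary or a std::stringstream both work. The price is that longer names are silently truncated and every record occupies the full size on disk.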
If you don't want to do that, you cannot use iostream::read() on your class. You will need member functions that read/write to a stream. This is what serialization is about... But you don't need the complexity of boost. In basic terms, you'd do something like:
// Read with no error checking :-S
istream& budget::read( istream& s )
{
s.read( (char*)&balance, sizeof(balance) );
s.read( (char*)&year, sizeof(year) );
s.read( (char*)&month, sizeof(month) );
s.read( (char*)&due_pay, sizeof(due_pay) );
s.read( (char*)&loan_given, sizeof(loan_given) );
size_t length;
char *tempstr;
// Read due_name
s.read( (char*)&length, sizeof(length) );
tempstr = new char[length];
s.read( tempstr, length );
due_name.assign(tempstr, length);
delete [] tempstr;
// Read loan_name
s.read( (char*)&length, sizeof(length) );
tempstr = new char[length];
s.read( tempstr, length );
loan_name.assign(tempstr, length);
delete [] tempstr;
return s;
}
ostream& budget::write( ostream& s )
{
// etc...
}
Notice above that we've serialized the strings by writing a size value first, and then that many characters after.
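For completeness, here is one way the size-then-characters scheme could look for a single string, with the writing side included and basic error checking. The free functions write_string and read_string are hypothetical helpers, and the 32-bit length prefix is an assumption chosen so that the file layout does not depend on the platform's size_t.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Write a string as a 32-bit length followed by its characters.
// The length is stored in the machine's own byte order, which is fine
// as long as the file is read back on the same kind of machine.
std::ostream& write_string(std::ostream& s, const std::string& str)
{
    assert(str.size() <= UINT32_MAX);
    const std::uint32_t length = static_cast<std::uint32_t>(str.size());
    s.write(reinterpret_cast<const char*>(&length), sizeof(length));
    s.write(str.data(), length);
    return s;
}

// Read back what write_string wrote. On failure the stream is left in a
// failed state and str is unchanged.
std::istream& read_string(std::istream& s, std::string& str)
{
    std::uint32_t length = 0;
    if (!s.read(reinterpret_cast<char*>(&length), sizeof(length)))
        return s;
    std::string temp(length, '\0');          // one allocation, no new[]/delete[]
    if (length > 0 && !s.read(&temp[0], length))
        return s;
    str.swap(temp);
    return s;
}
```

budget::write would then write the numeric members with s.write() and call write_string(s, due_name) and write_string(s, loan_name), mirroring the order used when reading. A length read from an untrusted file should be sanity-checked before allocating.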

Related

Strings to binary files

My problem goes like this: I have a class called 'Register'. It has a string attribute called 'trainName' and its setter:
class Register {
private:
string trainName;
public:
string getTrainName();
};
As a matter of fact, it is longer but I want to make this simpler.
In other class, I copy several Register objects into a binary file, previously setting trainName.
Register auxRegister = Register();
auxRegister.setName("name");
for(int i = 0; i < 10; i++) {
file.write(reinterpret_cast<char*>(&auxRegister),sizeof(Register));
}
Later on, I try to retrieve the register from the binary file:
Register auxRegister = Register();
while(!file.eof()) { //I know this is not right. Which is the right way?
file.read(reinterpret_cast<char*>(&auxRegister), sizeof(Register));
}
It turns out this does not work. Register does, in fact, have more attributes (they are int) and I retrieve them OK, but that is not the case with the string.
Am I doing something wrong? Should I take something into consideration when working with binary files and strings?
Thank you very much.
The std::string class contains a pointer to a buffer where the string is stored (along with other member variables). The string buffer itself is not a part of the class. So writing out the contents of an instance of the class is not going to work, since the string will never be part of what you dump into the file, if you do it that way. You need to get a pointer to the string and write that.
Register auxRegister = Register();
auxRegister.setName("name");
// Register itself has no size() or c_str(); go through the string it holds.
const std::string name = auxRegister.getTrainName();
auto length = name.size();
for(int i = 0; i < 10; i++) {
    file.write( name.c_str(), length );
    // You'll need to multiply length by sizeof(CharType) if you
    // use a wstring instead of string
}
Later on, to read the string, you'll have to keep track of the number of bytes that were written to the file; or maybe fetch that information from the file itself, depending on the file format.
std::unique_ptr<char[]> buffer( new char[length + 1] );
file.read( buffer.get(), length ); // read() needs the raw char*
buffer[length] = '\0'; // NUL-terminate the string
Register auxRegister = Register();
auxRegister.setName( buffer.get() );
You cannot write a string this way, as it almost certainly contains pointers to some structs and other binary stuff that cannot be serialized at all.
You need to write your own serializing function and write the string length + bytes (for example), or use a complete library such as protobuf, which can solve the serialization problem for you.
edit: see praetorian's answer. much better than mine (even with lower score at time of this edit).
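On the "which is the right way?" part of the question: a common pattern is to attempt the read and use its result as the loop condition, instead of testing eof() before reading. The sketch below combines that with a length-prefixed string. TrainRecord, save, load and load_all are hypothetical names standing in for the real Register class.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the Register class: one int plus the name.
struct TrainRecord
{
    std::int32_t number = 0;
    std::string trainName;
};

bool save(std::ostream& out, const TrainRecord& r)
{
    assert(r.trainName.size() <= UINT32_MAX);
    const std::uint32_t len = static_cast<std::uint32_t>(r.trainName.size());
    out.write(reinterpret_cast<const char*>(&r.number), sizeof r.number);
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(r.trainName.data(), len);
    return out.good();
}

// Returns false at end of file or on a truncated record; r is then unchanged.
bool load(std::istream& in, TrainRecord& r)
{
    TrainRecord tmp;
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&tmp.number), sizeof tmp.number)) return false;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    tmp.trainName.resize(len);
    if (len > 0 && !in.read(&tmp.trainName[0], len)) return false;
    r = tmp;
    return true;
}

// Loop on the result of the read itself rather than on eof().
std::vector<TrainRecord> load_all(std::istream& in)
{
    std::vector<TrainRecord> records;
    TrainRecord r;
    while (load(in, r))
        records.push_back(r);
    return records;
}
```

With while(!file.eof()) the last iteration runs after a failed read and processes stale data; looping on the read result avoids that.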

Is it possible to use an std::string for read()?

Is it possible to use an std::string for read() ?
Example :
std::string data;
read(fd, data, 42);
Normally we have to use a char*, but is it possible to use a std::string directly? (I would prefer not to create a char* to store the result.)
Thanks
Well, you'll need to create a char* somehow, since that's what the
function requires. (BTW: you are talking about the Posix function
read, aren't you, and not std::istream::read?) The problem isn't
the char*, it's what the char* points to (which I suspect is what
you actually meant).
The simplest and usual solution here would be to use a local array:
char buffer[43];
int len = read(fd, buffer, 42);
if ( len < 0 ) {
// read error...
} else if ( len == 0 ) {
// eof...
} else {
std::string data(buffer, len);
}
If you want to capture directly into an std::string, however, this is
possible (although not necessarily a good idea):
std::string data;
data.resize( 42 );
int len = read( fd, &data[0], data.size() );
// error handling as above...
data.resize( len ); // If no error...
This avoids the copy, but quite frankly... The copy is insignificant
compared to the time necessary for the actual read and for the
allocation of the memory in the string. This also has the (probably
negligible) disadvantage of the resulting string having an actual buffer
of 42 bytes (rounded up to whatever), rather than just the minimum
necessary for the characters actually read.
(And since people sometimes raise the issue, with regards to the
contiguity of the memory in std::string: this was an issue ten or more
years ago. The original specifications for std::string were designed
expressly to allow non-contiguous implementations, along the lines of
the then popular rope class. In practice, no implementor found this
to be useful, and people did start assuming contiguity. At which point,
the standards committee decided to align the standard with existing
practice, and require contiguity. So... no implementation has ever not
been contiguous, and no future implementation will forego contiguity,
given the requirements in C++11.)
No, you cannot and you should not. Usually, std::string implementations internally store other information such as the size of the allocated memory and the length of the actual string. C++ documentation explicitly states that modifying values returned by c_str() or data() results in undefined behaviour.
If the read function requires a char *, then no. You could use the address of the first element of a std::vector of char as long as it's been resized first. I don't think old (pre C++11) strings are guaranteed to have contiguous memory; otherwise you could do something similar with the string.
No, but
std::string data;
cin >> data;
works just fine. If you really want the behaviour of read(2), then you need to allocate and manage your own buffer of chars.
Because read() is intended for raw data input, std::string is actually a bad choice, because std::string handles text. std::vector seems like the right choice to handle raw data.
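A sketch of the std::vector<char> variant, assuming a POSIX system (for read() and ssize_t). The function name read_some and its interface are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>   // POSIX read(), pipe(), write(), close()
#include <vector>

// Read at most max_bytes from fd into a vector sized to what arrived.
// Returns false on a read error; true with an empty vector means EOF.
bool read_some(int fd, std::size_t max_bytes, std::vector<char>& out)
{
    assert(max_bytes > 0);
    std::vector<char> buffer(max_bytes);       // the vector owns the memory
    const ssize_t len = read(fd, buffer.data(), buffer.size());
    if (len < 0)
        return false;
    buffer.resize(static_cast<std::size_t>(len));  // keep only what was read
    out.swap(buffer);
    return true;
}
```

If the bytes really are text, a string can be built from the result afterwards with std::string(out.begin(), out.end()).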
Using std::getline from the strings library (see cplusplus.com), you can read from a stream and write directly into a string object. Example (again taken from cplusplus.com, the first hit on Google for getline):
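#include <iostream>
#include <string>
using namespace std;
int main () {
    string str;
    cout << "Please enter full name: ";
    getline (cin,str); // reads the whole line, including spaces
    cout << "Thank you, " << str << ".\n";
}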
This works when reading from stdin (cin) as well as from a file (ifstream).

How to read the standard istream buffer in c++?

I have the following problem. I have to implement a class that has an attribute that is a char pointer meant to point to the object's "code", as follows:
class foo{
private:
char* cod;
...
public:
foo();
void getVal();
...
};
So on, so forth. getVal() is a method that takes the code from the standard istream and fills in all the information, including the code. The thing is, the "code" that identifies the object can't be longer than a certain number of characters. This has to be done without using customized buffers for the method getVal(), so I can't do the following:
//suppose the maximum number of characters is 50
void foo::getVal()
{
char buffer[100];
cin >> buffer;
if (strlen(buffer) > 50) // I'm not sure this would work considering how the stream
                         // of characters would be copied to buffer and how strlen
                         // works, but suppose this tells me how long the stream of
                         // characters was.
{
throw "Exception";
}
...
}
This is forbidden. I also can't use a customized istream, nor the boost library.
I thought I could find the place where istream keeps its information rather easily, but I can't find it. All I've found were mentions to other types of stream.
Can somebody tell me if this can be done or where the stream keeps its buffered information?
Thanks
Yes, using strlen would definitely work. You can write a sample program:
int main()
{
char buffer[10];
std::cout << "enter buffer:" ;
std::cin >>buffer;
if(strlen(buffer)>6)
std::cout << "size > 6";
getch();
}
For inputs longer than 6 characters it will display "size > 6".
Hmm... >> reads up to the first blank, while strlen counts up to the first null. They can be combined if you know for sure that there are no blanks in the middle of the string you're going to read and that there are fewer than 100 consecutive characters (the buffer also needs room for the terminating null). If not, you will overrun the buffer before throwing.
Also, accessing the buffer does not guarantee that the whole string is already there (the string can extend past the buffer space, requiring a partial read and a refill of the buffer...)
If blanks are separator, why not just read into an std::string, and react to its final state? All the dynamics above are already handled inside >> for std::string.
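A minimal sketch of that suggestion, assuming blanks are the separator. The function name read_code and the choice of exception types are mine, not part of the question.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Read one whitespace-delimited token and reject it if it is too long.
// max_length (50 in the question) is passed in rather than hard-coded.
std::string read_code(std::istream& in, std::size_t max_length)
{
    assert(max_length > 0);
    std::string code;                 // grows as needed, so no overrun
    if (!(in >> code))
        throw std::runtime_error("no code could be read");
    if (code.size() > max_length)
        throw std::length_error("code is too long");
    return code;
}
```

Called as read_code(std::cin, 50), this checks the length only after the whole token has been stored safely, which is the opposite order from the fixed char buffer.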
[EDIT after the comments below]
The only way to store a sequence of unknown size is to dynamically allocate the space and make it grow as required. This is, no more and no less, what string and vector do.
Whether you use them or write your own code to allocate and reallocate where more space is required doesn't change the substance.
I'm starting to think the only reason for those requirements is to test your ability to write your own string class. So... just write it:
declare a class holding a pointer, a size and a capacity; allocate some space; track how much you store; and when no space is left, allocate another, wider store, copy the old one into it, destroy the old one, and adjust the data members accordingly.
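A bare-bones sketch of such a class might look like the following. The name CharBuffer, the initial capacity of 16 and the doubling policy are arbitrary choices for illustration, and copying is disabled to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Minimal growable character buffer: a pointer, a size and a capacity.
class CharBuffer
{
    char* data_;
    std::size_t size_;       // characters stored, not counting '\0'
    std::size_t capacity_;   // bytes allocated, including room for '\0'

public:
    CharBuffer() : data_(new char[16]), size_(0), capacity_(16) { data_[0] = '\0'; }
    ~CharBuffer() { delete[] data_; }
    CharBuffer(const CharBuffer&) = delete;
    CharBuffer& operator=(const CharBuffer&) = delete;

    void push_back(char c)
    {
        if (size_ + 1 == capacity_)            // no room for c plus '\0'
        {
            const std::size_t wider = capacity_ * 2;
            char* bigger = new char[wider];    // allocate a wider store
            std::memcpy(bigger, data_, size_); // copy the old contents
            delete[] data_;                    // destroy the old store
            data_ = bigger;                    // adjust the data members
            capacity_ = wider;
        }
        data_[size_++] = c;
        data_[size_] = '\0';
    }

    char at(std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }
    const char* c_str() const { return data_; }
};
```

Reading the code then becomes a loop that calls push_back for each character taken from the stream and stops, or throws, once size() exceeds the allowed maximum.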
Accessing the file buffer directly is not the way, since you don't control how the file buffer is filled in.
An istream uses a streambuf.
I find that www.cppreference.com is a pretty good place for quick C++ references. You can go there to see how to use a streambuf or its derivative filebuf.
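As a rough idea of what working with the streambuf looks like, here is a sketch that pulls characters straight from in.rdbuf() and enforces a limit as it goes. The function name read_token is hypothetical, and collecting the result in a std::string is my assumption; whether that is allowed depends on the exact rules of the assignment.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Take characters directly from the stream's streambuf, stopping at
// whitespace or end of input. Returns false if the token would exceed
// limit; the excess characters are left unread in the stream.
// Leading whitespace is not skipped in this sketch.
bool read_token(std::istream& in, std::size_t limit, std::string& token)
{
    typedef std::streambuf::traits_type traits;
    std::streambuf* buf = in.rdbuf();
    assert(buf != nullptr);
    token.clear();
    for (;;)
    {
        const traits::int_type next = buf->sgetc();   // peek, do not consume
        if (traits::eq_int_type(next, traits::eof()))
            return true;
        const char c = traits::to_char_type(next);
        if (std::isspace(static_cast<unsigned char>(c)))
            return true;
        if (token.size() == limit)
            return false;
        token.push_back(c);
        buf->sbumpc();                                 // now consume it
    }
}
```

sgetc() and sbumpc() refill the underlying buffer as needed, so the caller never has to know how much of the input is buffered at any moment.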

Serializing struct containing char*

I'm getting an error when serializing a char* string: error C2228: left of '.serialize' must have class/struct/union. I could use a std::string and then get a const char* from it, but I require the char* string.
The error message says it all, there's no support in boost serialization to serialize pointers to primitive types.
You can do something like this in the store code:
int len = strlen(string) + 1;
ar & len;
ar & boost::serialization::make_binary_object(string, len);
and in the load code:
int len;
ar & len;
string = new char[len]; //Don't forget to deallocate the old string
ar & boost::serialization::make_binary_object(string, len);
There is no way to serialize a pointer to something in boost::serialization (and I suspect there is no actual way to do that at all). A pointer is just a memory address; these addresses are generally specific to an instance of an object and, what's really important, the address doesn't contain information about where to stop the serialization.
You can't just say to your serializer: "Hey, take something out from this pointer and serialize this something. I don't care what size it has, just do it..."
The first and best solution to your problem is wrapping your char* in std::string or your own string implementation. The second would mean writing a special serializing routine for char*, which, I suspect, will generally do the same as the first method does.
Try this:
struct Example
{
    int i;
    char c;
    char * text; // Prefer std::string to char *
    void Serialize(std::ostream& output)
    {
        output << i << "\n";
        output << c << "\n";
        // Output the length of the text member,
        // followed by the actual text.
        size_t text_length = 0;
        if (text)
        {
            text_length = strlen(text);
        }
        output << text_length << "\n";
        if (text)
        {
            output << text; // Streaming a null char* is undefined behaviour.
        }
        output << "\n";
    }
    void Input(std::istream& input)
    {
        input >> i;
        input.ignore(1000, '\n'); // Eat any characters after the integer.
        input >> c;
        input.ignore(1000, '\n');
        // Read the size of the text data.
        size_t text_length = 0;
        input >> text_length;
        input.ignore(1000, '\n');
        delete[] text; // Destroy previous contents, if any.
        text = NULL;
        if (text_length)
        {
            text = new char[text_length + 1]; // +1 for the terminating null
            input.read(text, text_length);
            text[text_length] = '\0';
        }
    }
};
Since pointers are not portable, the data must be written instead.
The text is known as a variable length field. Variable length fields are commonly output (serialized) in two data structures: length followed by data OR data followed by terminal character. Specifying the length first allows usage of block reading. With the latter data structure, the data must be read one unit at a time until the terminal character is read. Note: the latter data structure also implies that the terminal character cannot be part of the set of data items.
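The two layouts can be sketched side by side as follows. The function names are made up, the length is written as decimal text followed by a newline, and '\0' is the assumed terminal character for the second layout.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Layout 1: length followed by data. The reader fetches the data as one block.
void write_length_prefixed(std::ostream& out, const std::string& field)
{
    out << field.size() << '\n';
    out.write(field.data(), static_cast<std::streamsize>(field.size()));
}

bool read_length_prefixed(std::istream& in, std::string& field)
{
    std::size_t length = 0;
    if (!(in >> length)) return false;
    if (in.get() != '\n') return false;        // separator after the length
    std::string temp(length, '\0');
    if (length > 0 && !in.read(&temp[0], static_cast<std::streamsize>(length)))
        return false;
    field.swap(temp);
    return true;
}

// Layout 2: data followed by a terminal character ('\0' here), which
// therefore must not occur inside the data.
void write_terminated(std::ostream& out, const std::string& field)
{
    assert(field.find('\0') == std::string::npos);
    out << field << '\0';
}

bool read_terminated(std::istream& in, std::string& field)
{
    // getline examines the input character by character until it meets the terminator.
    return static_cast<bool>(std::getline(in, field, '\0'));
}
```

Note how only the first layout can hold arbitrary bytes, while the second needs no length bookkeeping.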
Some important issues to think about for serialization:
1. Use a format that is platform independent, such as ASCII text for numbers.
2. If a platform-independent format is not available or allowed, define the exact specification for numbers, including Endianness and maximum length.
3. For floating point numbers, the specification should treat the components of a floating point number as individual numbers that have to abide by the specification for a number (i.e. exponent, magnitude and mantissa).
4. Prefer fixed length records to variable length records.
5. Prefer serializing to a buffer. Users of the object can then create a buffer of one or more objects and write the buffer as one block (using one operation). Likewise for input.
6. Prefer using a database to serializing. Although this may not be possible for networking, try every effort to have a database manage the data. The database may be able to send the data over the network.
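As an illustration of point 2, here is a sketch that pins down the on-disk form of a 32-bit unsigned integer as exactly four bytes, least significant byte first. The function names and the choice of little-endian order are assumptions; the point is only that the format is fixed by the code rather than by the host machine.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Write a 32-bit unsigned value as exactly four bytes, least significant
// byte first, regardless of the host's own byte order.
void write_u32_le(std::ostream& out, std::uint32_t value)
{
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFFu);
    out.write(bytes, 4);
}

// Reassemble the value from its four bytes; returns false on short input.
bool read_u32_le(std::istream& in, std::uint32_t& value)
{
    char bytes[4];
    if (!in.read(bytes, 4))
        return false;
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
        result |= static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[i])) << (8 * i);
    value = result;
    return true;
}
```

Because the bytes are produced with shifts rather than by copying the integer's memory, a file written on one architecture reads back identically on another.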

How to use length indicator in a C++ program

I want to make a program in C++ that reads a file where each field will have a number before it that indicates how long it is.
The problem is that I read every record into an object of a class; how do I make the attributes of the class dynamic?
For example if the field is "john" it will read it in a 4 char array.
I don't want to make an array of 1000 elements as minimum memory usage is very important.
Use std::string, which will resize to be large enough to hold the contents you assign to it.
If you just want to read in word by word from the file, you can do:
vector<string> words;
ifstream fin("words.txt");
string s;
while( fin >> s ) {
words.push_back(s);
}
This will put all the words in the file into the vector words, though you will lose the whitespace.
In order to do this, you need to use dynamic allocation (either directly or indirectly).
If directly, you need new[] and delete[]:
char *buffer = new char[length + 1]; // +1 if you want a terminating NUL byte
// and later
delete[] buffer;
If you are allowed to use boost, you can simplify that a bit by using boost::shared_array<>. With a shared_array, you don't have to manually delete the memory as the array wrapper will take care of that for you:
boost::shared_array<char> buffer(new char[length + 1]);
Finally, you can do dynamic allocation indirectly via classes like std::string or std::vector<char>.
I suppose there is no whitespace between records, or you would just write file >> record in a loop.
size_t cnt;
while ( in >> cnt ) { // parse number, needs not be followed by whitespace
string data( cnt, 0 ); // perform just one malloc
in.read( &data[0], cnt ); // typically perform just one memcpy; get() would need a char* and stops early at a newline
// do something with data
}
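Putting the pieces together, a complete reader could look like the sketch below. The exact file format is an assumption: each field is written as a decimal length, a ':' separator, and then that many characters (for example "4:john5:alice"). The separator is my addition so that data beginning with a digit cannot be mistaken for part of the length; read_fields is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parse fields written as <decimal length>:<data>, e.g. "4:john5:alice".
// Each string is sized to exactly the length announced in the file.
std::vector<std::string> read_fields(std::istream& in)
{
    std::vector<std::string> fields;
    std::size_t length = 0;
    while (in >> length)
    {
        if (in.get() != ':')
            break;                                   // malformed input
        std::string data(length, '\0');              // sized exactly once
        if (length > 0 && !in.read(&data[0], static_cast<std::streamsize>(length)))
            break;                                   // truncated field
        fields.push_back(data);
    }
    return fields;
}
```

Each string allocates only what its field needs, so there is no 1000-element array per record. A length taken from an untrusted file should be checked against a sensible maximum before allocating.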