C++ ofStream: "<<" vs "put" - c++

Being new to C++, I am confused about what << and put() mean when using ofstream to write to a text file. I experimented with the following two approaches:
Approach 1:
void writeTester() {
    std::ofstream oFile("Resources/tst.txt", std::ios::out | std::ios::trunc);
    std::vector<int> v{ 1,2,3,4 };
    for (int i = 0;i < 4;i++) {
        oFile.put(v[i]);
        //casting to character pointer and writing also produced similar result
        //oFile.put(*(char*)&v[i]);
    }
    oFile.close();
}
Approach 2:
void writeTester() {
    std::ofstream oFile("Resources/tst.txt", std::ios::out | std::ios::trunc);
    std::vector<int> v{ 1,2,3,4 };
    for (int i = 0;i < 4;i++) {
        oFile << v[i];
    }
    oFile.close();
}
While Approach 2 wrote the expected result to file (1234), Approach 1 wrote some garbage value to the file.
What is the difference between the 2 styles, and when to use which one? Also, what is the correct way to use Approach 1 to have "1234" as the output written to the file?

Simply put, the << operator is overloaded for many types, so writing an integer value (as in your case) converts it to its string representation before it is written to the file.
Using put, however, writes a single character to the stream. Your 4-byte int is converted to a single char, so the values 1 to 4 end up in the file as the raw bytes 0x01 to 0x04 (unprintable control characters) rather than as the digits '1' to '4'. That is the garbage you are seeing.
Here's the documentation for ofstream::put.
And here's the documentation for ofstream.
For completeness, and to spark your interest, here's an ASCII table, where you can look up the values you were writing to the file. With put(v[i]) the int is converted by value, so byte order plays no role. Byte order (endianness) only matters for the commented-out cast *(char*)&v[i], which reads the first byte of the int in memory: on little-endian machines that is the low-order byte (again the values 1 to 4), while on big-endian machines with a 4-byte int it is the high-order byte, which is 0 (NUL) here.
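As a rough sketch of the difference, the following writes the same vector three ways. It uses an in-memory std::ostringstream instead of a file so the result is easy to inspect, and the function names are made up for illustration. The digit variant of put assumes every value is in the range 0 to 9, which holds for the vector in the question but not in general.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Formatted output: each int is converted to its decimal text.
std::string write_with_insertion(const std::vector<int>& v) {
    std::ostringstream out;
    for (int x : v) out << x;
    return out.str();
}

// put() with the raw value: each int becomes one char with that numeric value.
std::string write_with_put_raw(const std::vector<int>& v) {
    std::ostringstream out;
    for (int x : v) out.put(static_cast<char>(x));
    return out.str();
}

// put() with a digit character: only meaningful for values 0..9 (assumption).
std::string write_with_put_digit(const std::vector<int>& v) {
    std::ostringstream out;
    for (int x : v) out.put(static_cast<char>('0' + x));
    return out.str();
}
```

For multi-digit numbers, << (or std::to_string followed by write) is the simpler choice; put is best kept for writing individual characters.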

Related

C++ binary files I/O, data lost when writing

I am learning C++ with the "Programming: Principles and Practice Using C++" book from Bjarne Stroustrup. I am currently studying chapter 11 and I found an example on how to read and write binary files of integers (section 11.3.2). I played around with the example and used a .txt file (input.txt) with a sentence which I read and wrote to another file (output.txt) (text_to_binary fnc) and then read and wrote back to the original file (input.txt) (binary_to_text fnc).
#include<fstream>
#include<iostream>
using namespace std;
void text_to_binary(ifstream &ifs, ofstream &ofs)
{
    for (int x; ifs.read(as_bytes(x), sizeof(char));)
    {
        ofs << x << '\n';
    }
    ofs.close();
    ifs.close();
}
void binary_to_text(ifstream &ifs, ofstream &ofs)
{
    for (int x; ifs >> x;)
    {
        ofs.write(as_bytes(x), sizeof(char));
    }
    ifs.close();
    ofs.close();
}
int main()
{
    string iname = "./chapter_11/input.txt";
    string oname = "./chapter_11/output.txt";
    ifstream ifs{iname, ios_base::binary};
    ofstream ofs{oname, ios_base::binary};
    text_to_binary(ifs, ofs);
    ifstream ifs2{oname, ios_base::binary};
    ofstream ofs2{iname, ios_base::binary};
    binary_to_text(ifs2, ofs2);
    return 0;
}
I figured out that I have to use sizeof(char) rather than sizeof(int) in the .read and .write calls. If I use sizeof(int), some chars of the .txt file go missing when I write them back to text. Funnily enough, chars only go missing if
x%4 != 0 (x = number of chars in the .txt file)
example with sizeof(int):
input.txt:
hello this is an amazing test. 1234 is a number everything else doesn't matter..asd
(text_to_binary fnc) results in:
output.txt:
1819043176
1752440943
1763734377
1851859059
1634558240
1735289210
1936028704
824192628
540291890
1629516649
1836412448
544367970
1919252069
1768453241
1696622446
543519596
1936027492
544483182
1953784173
774795877
(binary_to_text fnc) results back in:
input.txt:
hello this is an amazing test. 1234 is a number everything else doesn't matter..
asd went missing.
Now to my question: why does this happen? Is it because ints are saved as 4 bytes?
Bonus question: Out of interest, is there a simpler/more efficient way of doing this?
edit: updated the question with the results to make it hopefully more clear
When you attempt a partial read, read tries to go beyond the end of the file, and both the eof and fail flags are set for the stream. The stream then evaluates to false in the loop condition, so the loop ends.
You need to check gcount of the stream after the loop to see whether any bytes were actually read into the variable x.
But note that a partial read only writes to part of the variable x, leaving the rest indeterminate. Which part of the value those bytes represent depends on the system endianness, and using the variable with its indeterminate bits leads to undefined behavior.
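A minimal sketch of the gcount check, assuming the data is read into a char buffer first so that a short final chunk never leaves an int half-written. The ChunkedRead struct and the read_int_chunks name are hypothetical and not from the book.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct ChunkedRead {
    std::vector<int> whole;  // values built from complete sizeof(int)-byte chunks
    std::string tail;        // leftover bytes from a short final chunk
};

ChunkedRead read_int_chunks(std::istream& in) {
    ChunkedRead result;
    char buffer[sizeof(int)];
    while (in.read(buffer, sizeof buffer)) {
        int x;
        std::memcpy(&x, buffer, sizeof x);  // copy a complete chunk into an int
        result.whole.push_back(x);
    }
    // The failed read may still have extracted some bytes; gcount() says how many.
    result.tail.assign(buffer, static_cast<std::size_t>(in.gcount()));
    return result;
}
```

With the input from the question, the three bytes "asd" would end up in tail instead of being dropped, and the caller can decide how to store them.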

How can I write an array of structures to a binary file and read it again?

I'm writing an array of structures (factor is a structure) to a binary file like this:
factor factors[100];
ofstream fa("Desktop:\\fa.dat", ios::out | ios::binary);
fa.write(reinterpret_cast<const char*>(&factors),sizeof(factors));
fa.close();
I run the program and save 5 records in it. In another file, I want to read the structures, so I wrote this:
int i=0;
ifstream a("Desktop:\\fa.dat", ios::in | ios::binary);
factor undelivered_Factors[100];
while(a && !a.eof()){
    a.read(reinterpret_cast<char*>(&undelivered_Factors),sizeof(undelivered_Factors));
    cout<<undelivered_Factors[i].ID<<"\t"<<undelivered_Factors[i].total_price<<endl;
    i++;
}
a.close();
But after reading and printing the saved factors, it only reads and shows the first 2 of them in the array. Why? What should I do?
The second parameter of ofstream::write and ifstream::read is the size of the memory block in bytes (chars, in C/C++ terms), so the write is right: you are writing the entire array at once. In the reading code, however, you have mixed up per-element and whole-array processing. Each loop iteration reads the whole array, prints one element, and then tries to read another 100 records, which I presume are not in the file. Also, eof() only becomes true after a read has been attempted and has hit the end of the file. Merely being positioned at the end of the file does not set it, so the loop runs a second time after the first, successful read. That is why you get two records printed.
You read the complete array in a single call, so no loop is needed around the read. Read once, then print the elements in a separate loop:
if(a)
{
    a.read(reinterpret_cast<char*>(&undelivered_Factors),sizeof(undelivered_Factors));
}
for(int i=0; i<100; ++i)
{
    cout<<undelivered_Factors[i].ID<<"\t"<<undelivered_Factors[i].total_price<<endl;
}
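An alternative sketch, not what either answer prescribes: write only the records that are in use, and let gcount tell the reader how many complete records arrived. The factor layout below is a guess (the question only shows the members ID and total_price), and write_factors and read_factors are hypothetical helper names.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical layout; the real struct in the question is not shown.
struct factor {
    int ID;
    double total_price;
};

// Writes only the records that are actually in use, in one call.
void write_factors(std::ostream& out, const std::vector<factor>& records) {
    out.write(reinterpret_cast<const char*>(records.data()),
              static_cast<std::streamsize>(records.size() * sizeof(factor)));
}

// Reads up to max_records records in one call and keeps the complete ones.
std::vector<factor> read_factors(std::istream& in, std::size_t max_records) {
    std::vector<factor> records(max_records);
    in.read(reinterpret_cast<char*>(records.data()),
            static_cast<std::streamsize>(max_records * sizeof(factor)));
    // gcount() is the number of bytes the last read actually extracted.
    std::size_t complete = static_cast<std::size_t>(in.gcount()) / sizeof(factor);
    records.resize(complete);
    return records;
}
```

With real files, both streams would be opened with ios::binary as in the question. Writing raw structs like this only round-trips reliably between programs built with the same compiler settings on the same platform.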

Binary file not holding data properly

I'm currently trying to replace a text-based file in my application with a binary one. I'm just doing some early tests, so the code isn't exactly safe, but I'm having problems with the data.
When trying to read out the data, it gets about halfway before it starts coming back with incorrect results.
I'm creating the file in C++ and my client application is C#. I think the problem is in my C++ (which I haven't used very much).
The problem at the moment is that I have a vector of a struct called DoubleVector3, which consists of 3 doubles:
struct DoubleVector3 {
double x, y, z;
DoubleVector3(std::string line);
};
I'm currently writing the variables individually to the file:
void ObjElement::WriteToFile(std::string file) {
    std::ofstream fileStream;
    fileStream.open(file); //, ios::out | ios::binary);
    // ^^problem was this line. it should be
    // fileStream.open(file, std::ios_base::out | std::ios_base::binary);
    fileStream << this->name << '\0';
    fileStream << this->materialName << '\0';
    int size = this->vertices.size();
    fileStream.write((char*)&size,sizeof(size));
    //i have another int written here
    for (int i=0; i<this->vertices.size(); i++) {
        fileStream.write((char*)&this->vertices[i].x, 8);
        fileStream.write((char*)&this->vertices[i].y, 8);
        fileStream.write((char*)&this->vertices[i].z, 8);
    }
    fileStream.close();
}
When I read the file in C#, the first 6 sets of 3 doubles are all correct, but then I start getting 0s and minus infinities.
Am I doing anything obviously wrong in my WriteToFile code?
I have the file uploaded on mega if anyone needs to look at it
https://mega.co.nz/#!XEpHTSYR!87ihtCfnGXJJNn13iE6GIpeRhlhbabQHFfN88kr_BAk
(I'm writing the name and material first, then the number of vertices, before the actual list of vertices.)
Small side question - Should I delimit these doubles or just add them in one after the other?
To store binary data in a stream, you must add std::ios_base::binary to the stream's flags when opening it. Without this, the stream is opened in text mode and line-ending conversions can happen.
On Windows, line-ending conversions mean inserting a byte 0x0D (ASCII for carriage-return) before each 0x0A byte (ASCII for line-feed). Needless to say, this corrupts binary data.
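A minimal sketch of a binary round trip for a list of doubles, assuming the reader runs on a machine with the same double representation as the writer. The helper names write_doubles and read_doubles are made up for illustration. Both streams are opened with std::ios::binary so that no line-ending conversion can touch the bytes.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Writes the raw bytes of each double; returns false if opening or writing failed.
bool write_doubles(const std::string& path, const std::vector<double>& values) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(double)));
    out.close();  // flush now so a write error shows up in the stream state
    return !out.fail();
}

// Reads doubles until fewer than sizeof(double) bytes remain.
std::vector<double> read_doubles(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::vector<double> values;
    double d;
    while (in.read(reinterpret_cast<char*>(&d), sizeof d)) values.push_back(d);
    return values;
}
```

Using sizeof(double) instead of the literal 8 keeps the write and read sizes in agreement if the code is ever built for a platform where they differ.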

Converting between text files and binary files in C++

To convert an ordinary text file into binary and then convert that binary file back to a text file, so that the first text file equals the last text file, I have written the code below.
But the bintex text file and the final text file aren't equal. I don't know which part of the code is incorrect.
Input sample ("bintex") contains this: 1983 1362
The result ("final") contains this: 959788084
which of course are not equal.
#include <iostream>
#include <fstream>
using namespace std;
int main() try
{
    string name1 = "bintex", name2 = "texbin", name3 = "final";
    ifstream ifs1(name1.c_str());
    if(!ifs1) error("Can't open file for reading.");
    vector<int>v1, v2;
    int i;
    while(ifs1.read(as_bytes(i), sizeof(int)));
    v1.push_back(i);
    ifs1.close();
    ofstream ofs1(name2.c_str(), ios::binary);
    if(!ofs1) error("Can't open file for writting.");
    for(int i=0; i<v1.size(); i++)
        ofs1 << v1[i];
    ofs1.close();
    ifstream ifs2(name2.c_str(), ios::binary);
    if(!ifs2) error("Can't open file for reading.");
    while(ifs2.read(as_bytes(i), sizeof(int)));
    v2.push_back(i);
    ifs2.close();
    ofstream ofs2(name3.c_str());
    if(!ofs2) error("Can't open file for writting.");
    for(int i=0; i<v2.size(); i++)
        ofs2 << v2[i];
    ofs2.close();
    keep_window_open();
    return 0;
}
//********************************
catch(exception& e)
{
    cerr << e.what() << endl;
    keep_window_open();
    return 0;
}
What is this?
while(ifs1.read(as_bytes(i), sizeof(int)));
It looks like a loop that reads all input and throws it away. The line afterward suggests that you should be using braces instead of a semicolon there, and doing the write in the block.
Your read and write operations aren't symmetric.
ifs1.read(as_bytes(i), sizeof(int))
grabs 4 bytes and dumps their values into the char* it's passed.
ofs1 << v1[i];
outputs the integer in v1[i] as text. Those are very, very different formats.
If you used >> to read you would have a lot more success.
To expound, the first read might look like this {'1','9','8','3'}, which, punned to an int, gives a large meaningless number (859322673 on a little-endian machine with a 4-byte int), much like the 959788084 you are seeing. Your second read would be {' ','1','3','6'}, likely not what you'd hoped for either.
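A small sketch of the symmetric, text-based approach: read with >> and write with <<, so both sides agree on the format. The function names are hypothetical, and string streams stand in for the files to keep the example self-contained.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parses whitespace-separated decimal integers until extraction fails.
std::vector<int> read_ints_as_text(std::istream& in) {
    std::vector<int> values;
    for (int x; in >> x;) values.push_back(x);
    return values;
}

// Writes the integers as decimal text, separated by single spaces.
void write_ints_as_text(std::ostream& out, const std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out << ' ';  // a separator is needed so the numbers can be read back
        out << values[i];
    }
}
```

Note the separator: the loops in the question write the numbers back to back with <<, which would make "1983" and "1362" read back as the single number 19831362.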
It's not clear (to me, at least), what you are trying to do.
When you say that the original file contains 1983 1362, what do
you really mean? That it contains two four byte integers, in
some unspecified format, whose values are 1983 and 1362? If so,
the problem is probably due to your machine not using the same
format. You cannot, in general, just read bytes (using
istream::read) and expect them to mean anything in your
machine's internal format. You have to read the bytes into
a buffer, and unformat them, according to the format with which
they were written.
Of course, opening a stream in binary mode doesn't mean that
the actual data are in some binary format; it just affects
things like how (or more strictly speaking, whether) line
endings are encoded, and how end of file is recognized.
(Strictly speaking, a binary file is not divided into lines. It
is just a sequence of bytes. Of course, some of those bytes
might have values that you, in your program, interpret as
newline characters.) If your file actually contains nine bytes
with characters corresponding to "1983 1362", then you'll have
to parse them as a text format, even if the file is written in
binary. You can do this by reading the entire file into
a string, and using std::istringstream; or, on most common
systems (but not necessarily on all exotics) by using >> to
read, just as you would with a text file.
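To illustrate what "unformatting" bytes from a buffer can look like, here is a sketch that assumes one particular format: a 32-bit unsigned value stored least significant byte first. That format is an assumption chosen for the example, not something stated in the question, and the function names are made up.

```cpp
#include <array>
#include <cstdint>

// Encodes a 32-bit value as 4 bytes, least significant byte first.
std::array<unsigned char, 4> encode_le32(std::uint32_t value) {
    return {{
        static_cast<unsigned char>(value & 0xFF),
        static_cast<unsigned char>((value >> 8) & 0xFF),
        static_cast<unsigned char>((value >> 16) & 0xFF),
        static_cast<unsigned char>((value >> 24) & 0xFF),
    }};
}

// Rebuilds the value from the same byte order, independent of the host's endianness.
std::uint32_t decode_le32(const std::array<unsigned char, 4>& bytes) {
    return static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Because the shifts operate on values rather than on memory, the same bytes decode to the same number on any machine. Decoding the characters '1','9','8','3' this way yields 859322673, which shows why text bytes must not be treated as a binary integer.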
EDIT:
Just a simple reminder: you don't show the code for as_bytes,
but I'm willing to guess that there's a reinterpret_cast in
it. And any time you have to use a reinterpret cast, you can be
very sure that what you're doing isn't portable, and if it's
supposed to be portable, you're doing it wrong.

Reading file byte by byte with ifstream::get

I wrote this binary reader after a tutorial on the internet. (I'm trying to find the link...)
The code reads the file byte by byte and the first 4 bytes are together the magic word. (Let's say MAGI!) My code looks like this:
std::ifstream in(fileName, std::ios::in | std::ios::binary);
char *magic = new char[4];
while( !in.eof() ){
    // read the first 4 bytes
    for (int i=0; i<4; i++){
        in.get(magic[i]);
    }
    // compare it with the magic word "MAGI"
    if (strcmp(magic, "MAGI") != 0){
        std::cerr << "Something is wrong with the magic word: "
                  << magic << ", couldn't read the file further! "
                  << std::endl;
        exit(1);
    }
    // read the rest ...
}
Now here comes the problem, when I open my file, I get this error output:
Something is wrong with the magic word: MAGI?, couldn't read the file further!
So there is always one (mostly random) character after the word MAGI, like the character ? in this example.
I do think that it has something to do with how a string in C++ is stored and compared with each other. Am I right and how can I avoid this?
PS: this implementation is included in another program and works totally fine ... weird.
strcmp assumes that both strings are nul-terminated (end with a nul-character). When you want to compare strings which are not terminated, like in this case, you need to use strncmp and tell it how many characters to compare (4 in this case).
if (strncmp(magic, "MAGI", 4) != 0){
When you try to use strcmp to compare not null-terminated char arrays, it can't tell how long the arrays are (you can't tell the length of an array in C/C++ just by looking at the array itself - you need to know the length it was allocated with. The standard library is not exempt from this limitation). So it reads any data which happens to be stored in memory after the char array until it hits a 0-byte.
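A minimal sketch of a length-bounded check. It uses std::memcmp rather than strncmp (either works for a fixed 4-byte comparison), a stack array instead of new char[4], and tests the result of read instead of looping on eof(). The function name has_magic is made up for illustration.

```cpp
#include <cstring>
#include <istream>
#include <sstream>

// Returns true if the next 4 bytes of the stream are exactly "MAGI".
bool has_magic(std::istream& in) {
    char magic[4];
    if (!in.read(magic, sizeof magic)) return false;  // fewer than 4 bytes available
    // Compare exactly 4 bytes; no terminating nul is needed or read.
    return std::memcmp(magic, "MAGI", sizeof magic) == 0;
}
```

To print the magic word in an error message, pass the length explicitly, for example with std::cerr.write(magic, 4), since << on a char* also expects a nul terminator.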
By the way: note the comment on your question by Lightness Races in Orbit. It is unrelated to the issue you are having now, but it hints at a different bug which might cause you some problems later on.