C++ ifstream will read some values then stop - c++

I am trying to write a program that reads 940 4-byte long values of binary data [hex] from a bin file, and output the values to console. I have ifstream::read, cout and seekg operations in a loop.
It will work for the first 10 or so iterations, and then in one iteration skip the read and write operations, preform the seekg operation, and continue on reading and writing. Also the last 200 lines or so are coming out the same value.
It will work properly for 12 iterations, then it will start outputting the wrong numbers. At this point it goes from address 0x230 to 0x28B when it should be at 0x260. It looks like read and cout are not called in this particular iteration.
The last correct value reads 3f4fc938. The next value should be 3ef646c1.
Does anyone know why this would fail? Any help is appreciated.
This is the program:
int main(int argc, char* argv[]) {
fstream in;
uint32_t buffer;
in.open(argv[1]);
in.seekg(0x6500,in.beg);
for(int i = 0; i < 940; i++) {
in.read(reinterpret_cast<char*> (&buffer),4);
cout << hex << buffer << endl;
in.seekg(0x2c,in.cur);
}
}

You have opened your file in text mode. Text mode means that operations on the file will interpret a Byte sequence that matches the platform-specific representation of a newline as a single '\n' character. If you're on Windows, for example, newlines are represented as the Byte sequence 0D 0A. So on Windows, whatever you do in your file will work well up to the point where your file happens to have a Byte with value 13 followed by a Byte with value 10. Once you reach that point, that 13 followed by 10 will be interpreted as a single character. Essentially, text mode will just swallow any Byte with value 13 if it happens to appear right before a Byte with value 10. Your application will never see the 13 and anything beyond the point where the 13 appeared will end up "shifted" by one Byte. On other platforms, other newline representations are common. If you wanna work with binary data, you will generally want to open your file in binary mode, for example
fstream in(argv[1], std::ios::binary);
or
in.open(argv[1], std::ios::binary);

Related

Read a file line by line in C++

I wrote the following C++ program to read a text file line by line and print out the content of the file line by line. I entered the name of the text file as the only command line argument into the command line.
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
char buf[255] = {};
if (argc != 2)
{
cout << "Invalid number of files." << endl;
return 1;
}
ifstream f(argv[1], ios::in | ios::binary);
if (!f)
{
cout << "Error: Cannot open file." << endl;
return 1;
}
while (!f.eof())
{
f.get(buf,255);
cout << buf << endl;
}
f.close();
return 0;
}
However, when I ran this code in Visual Studio, the Debug Console was completely blank. What's wrong with my code?
Apart from the errors mentioned in the comments, the program has a logical error because istream& istream::get(char* s, streamsize n) does not do what you (or I, until I debugged it) thought it does. Yes, it reads to the next newline; but it leaves the newline in the input!
The next time you call get(), it will see the newline immediately and return with an empty line in the buffer, for ever and ever.
The best way to fix this is to use the appropriate function, namely istream::getline() which extracts, but does not store the newline.
The EOF issue
is worth mentioning. The canonical way to read lines (if you want to write to a character buffer) is
while (f.getline(buf, bufSz))
{
cout << buf << "\n";
}
getline() returns a reference to the stream which in turn has a conversion function to bool, which makes it usable in a boolean expression like this. The conversion is true if input could be obtained. Interestingly, it may have encountered the end of file, and f.eof() would be true; but that alone does not make the stream convert to false. As long as it could extract at least one character it will convert to true, indicating that the last input operation made input available, and the loop will work as expected.
The next read after encountering EOF would then fail because no data could be extracted: After all, the read position is still at EOF. That is considered a read failure. The condition is wrong and the loop is exited, which was exactly the intent.
The buffer size issue
is worth mentioning, as well. The standard draft says in 30.7.4.3:
Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c
(in which case the input character
is extracted but not stored);
n is less than one or n - 1 characters are stored
(in which case the function calls setstate(
failbit)).
The conditions are tested in that order, which means that if n-1 characters have been stored and the next character is a newline (the default delimiter), the input was successful (and the newline is extracted as well).
This means that if your file contains a single line 123 you can read that successfully with f.getline(buf, 4), but not a line 1234 (both may or may not be followed by a newline).
The line ending issue
Another complication here is that on Windows a file created with a typical editor will have a hidden carriage return before the newline, i.e. a line actually looks like "123\r\n" ("\r" and "\n" each being a single character with the values 13 and 10, respectively). Because you opened the file with the binary flag the program will see the carriage return; all lines will contain that "invisible" character, and the number of visible characters fitting in the buffer will be one shorter than one would assume.
The console issue ;-)
Oh, and your Console was not entirely empty; it's just that modern computers are too fast and the first line which was probably printed (it was in my case) scrolled away faster than anybody could switch windows. When I looked closely there was a cursor in the bottom left corner where the program was busy printing line after line of nothing ;-).
The conclusion
Debug your programs. It's very easy with VS.
Use getline(istream, string).
Use the return value of input functions (typically the stream)
as a boolean in a while loop: "As long as you can extract any input, use that input."
Beware of line ending issues.
Consider C I/O (printf, scanf) for anything non-trivial (I didn't discuss this in my answer but I think that's what many people do).

Binary file not holding data properly

Im currently trying to replace a text based file in my application with a binary one. Im just doing some early tests so the code isn't exactly safe but I'm having problems with the data.
When trying to read out the data it gets about half way before it starts coming back with incorrect results.
Im creating the file in c++ and my client application is c#. I think the problem is in my c++ (which I haven't used very much)
Where the problem is at the moment is I have a vector of a struct that is called DoubleVector3 which consists of 3 doubles
struct DoubleVector3 {
double x, y, z;
DoubleVector3(std::string line);
};
Im currently writing the variables individually to the file
void ObjElement::WriteToFile(std::string file) {
std::ofstream fileStream;
fileStream.open(file); //, ios::out | ios::binary);
// ^^problem was this line. it should be
// fileStream.open(file, std::ios_base::out | std::ios_base::binary);
fileStream << this->name << '\0';
fileStream << this->materialName << '\0';
int size = this->vertices.size();
fileStream.write((char*)&size,sizeof(size));
//i have another int written here
for (int i=0; i<this->vertices.size(); i++) {
fileStream.write((char*)&this->vertices[i].x, 8);
fileStream.write((char*)&this->vertices[i].y, 8);
fileStream.write((char*)&this->vertices[i].z, 8);
}
fileStream.close();
}
When I read the file in c# the first 6 sets of 3 doubles are all correct but then I start getting 0s and minus infinities
Am I doing anything obviously wrong in my WriteToFile code?
I have the file uploaded on mega if anyone needs to look at it
https://mega.co.nz/#!XEpHTSYR!87ihtCfnGXJJNn13iE6GIpeRhlhbabQHFfN88kr_BAk
(im writing the name and material in first then the number of vertices before the actual list of vertices)
Small side question - Should I delimit these doubles or just add them in one after the other?
To store binary data in a stream, you must add std::ios_base::binary to the stream's flags when opening it. Without this, the stream is opened in text mode and line-ending conversions can happen.
On Windows, line-ending conversions mean inserting a byte 0x0D (ASCII for carriage-return) before each 0x0A byte (ASCII for line-feed). Needless to say, this corrupts binary data.

getline seems to not working correctly

Please tell me what am I doing wrong here. What I want to do is this:
1.Having txt file with four numbers and each of this numbers has 15 digits:
std::ifstream file("numbers.txt",std::ios::binary);
I'm trying to read those numbers into my array:
char num[4][15];
And what I'm thinking I'm doing is: for as long as you don't reach end of files write every line (max 15 chars, ending at '\n') into num[lines]. But this somewhat doesn't work. Firstly it reads correctly only first number, rest is just "" (empty string) and secondly file.eof() doesn't seems to work correctly either. In txt file which I'm presenting below this code I reached lines equal 156. What's going on?
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(num[lines],15,'\n');
}
So the whole "routine" looks like this:
int main()
{
std::ifstream file("numbers.txt",std::ios::binary);
char numbers[4][15];
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(numbers[lines],15,'\n');// sizeof(numbers[0])
}
}
This is contents of my txt file:
111111111111111
222222222222222
333333333333333
444444444444444
P.S.
I'm using VS2010 sp1
Do not use the eof() function! The canonical way to read lines is:
while( getline( cin, line ) ) {
// do something with line
}
file.getline() extracts 14 characters, filling in num[0][0] .. num[0][13]. Then it stores a '\0' in num[0][14] and sets the failbit on file because that's what it does when the buffer is full but terminating character not reached.
Further attempts to call file.getline() do nothing because failbit is set.
Tests for !file.eof() return true because the eofbit is not set.
Edit: to give a working example, best is to use strings, of course, but to fill in your char array, you could do this:
#include <iostream>
#include <fstream>
int main()
{
std::ifstream file("numbers.txt"); // not binary!
char numbers[4][16]={}; // 16 to fit 15 chars and the '\0'
for (unsigned lines = 0;
lines < 4 && file.getline(numbers[lines], 16);
++lines)
{
std::cout << "numbers[" << lines << "] = " << numbers[lines] << '\n';
}
}
tested on Visual Studio 2010 SP1
According to ifstream doc, reading stops either after n-1 characters are read or delim sign is found : first read would take then only 14 bytes.
It reads bytes : '1' (the character) is 0x41 : your buffer would be filled with 0x41 instead of 1 as you seem to expect, last character will be 0 (end of c-string)
Side note, your code doesn't check that lines doesn't go beyond your array.
Using getline supposes you're expecting text and you open the file in binary mode : seems wrong to me.
It looks like the '\n' in the end of the first like is not being considered, and remaining in the buffer. So in the next getline() it gets read.
Try adding a file.get() after each getline().
If one file.get() does not work, try two, because under the Windows default file encoding the line ends with '\n\r\' (or '\r\n', I never know :)
Change it to the following:
#include <cstring>
int main()
{
//no need to use std::ios_base::binary since it's ASCII data
std::ifstream file("numbers.txt");
//allocate one more position in array for the NULL terminator
char numbers[4][16];
//you only have 4 lines, so don't use EOF since that will cause an extra read
//which will then cause and extra loop, causing undefined behavior
for (unsigned lines = 0; lines < 4; ++lines)
{
//copy into your buffer that also includes space for a terminating null
//placing in if-statement checks for the failbit of ifstream
if (!file.getline(numbers[lines], 16,'\n'))
{
//make sure to place a terminating NULL in empty string
//since the read failed
numbers[lines][0] = '\0';
}
}
}

Reading binary istream byte by byte

I was attempting to read a binary file byte by byte using an ifstream. I've used istream methods like get() before to read entire chunks of a binary file at once without a problem. But my current task lends itself to going byte by byte and relying on the buffering in the io-system to make it efficient. The problem is that I seemed to reach the end of the file several bytes sooner than I should. So I wrote the following test program:
#include <iostream>
#include <fstream>
int main() {
typedef unsigned char uint8;
std::ifstream source("test.dat", std::ios_base::binary);
while (source) {
std::ios::pos_type before = source.tellg();
uint8 x;
source >> x;
std::ios::pos_type after = source.tellg();
std::cout << before << ' ' << static_cast<int>(x) << ' '
<< after << std::endl;
}
return 0;
}
This dumps the contents of test.dat, one byte per line, showing the file position before and after.
Sure enough, if my file happens to have the two-byte sequence 0x0D-0x0A (which corresponds to carriage return and line feed), those bytes are skipped.
I've opened the stream in binary mode. Shouldn't that prevent it from interpreting line separators?
Do extraction operators always use text mode?
What's the right way to read byte by byte from a binary istream?
MSVC++ 2008 on Windows.
The >> extractors are for formatted input; they skip white space (by
default). For single character unformatted input, you can use
istream::get() (returns an int, either EOF if the read fails, or
a value in the range [0,UCHAR_MAX]) or istream::get(char&) (puts the
character read in the argument, returns something which converts to
bool, true if the read succeeds, and false if it fails.
there is a read() member function in which you can specify the number of bytes.
Why are you using formatted extraction, rather than .read()?
source.get()
will give you a single byte. It is unformatted input function.
operator>> is formatted input function that may imply skipping whitespace characters.
As others mentioned, you should use istream::read(). But, if you must use formatted extraction, consider std::noskipws.

Binary file I/O issues

Edit: I'm trying to convert a text file into bytes. I'm not sure if the code is turning it into bytes or not. Here is the link to the header so you can see the as_bytes function.
link
#include "std_lib_facilities.h"
int main()
{
cout << "Enter input file name.\n";
string file;
cin >> file;
ifstream in(file.c_str(), ios::binary);
int i;
vector<int> bin;
while(in.read(as_bytes(i), sizeof(int)))
bin.push_back(i);
ofstream out(file.c_str(), ios::out);
for(int i = 0; i < bin.size(); ++i)
out << bin[i];
keep_window_open();
}
Note that now the out stream just outputs the contents of the vector. It doesn't use the write function or the binary mode. This converts the file to a large line of numbers - is this what I'm looking for?
Here is an example of the second code's file conversion:
that guy likes to eat lots of pie (not sure if this was exact text)
turns to
543518319544825700191924850016351970295432362115448292821701667182186922608417526375411952522351186935715718643976841768956006
The reason your first method didn't change the file is because all files are stored in the same way. The only "difference" between text files and binary files is that text files contain only bytes that can be shown as ASCII characters, while binary files* have a much more random variety and order of bytes. So you are reading bytes in as bytes and then outputting them as bytes again!
*I'm including Unicode text files as binary, since they can have multiple bytes to denote one character point, depending on the character point and the encoding used.
The second method is also fairly simple. You are reading in the bytes, as before, and storing them in integers (which are probably 4 bytes long). Then you are just printing out the integers as if they are integers, so you are seeing a string of numbers.
As for why your first method cut off some of the bytes, you're right in that it's probably some bug in your code. I thought it was more important to explain what the ideas are in this case, rather than debug some test code.