Missing data when reading binary file in C++

vector<byte> h;
ifstream fin(file.c_str(), ios::binary);
if(!fin)
return false;
byte b;
while(fin >> b)
h.push_back(b);
The length of h is 4021, while the raw file length is 4096 bytes. But the code below gives a string of 4096 bytes. Why?
ostringstream sout;
sout << fin.rdbuf();
string s = sout.str();
UPDATES:
@user2079303 solved my problem, but is there any other way to perform the reading task? It's too easy to get it wrong.

When reading an input stream char by char with operator>> (your bytes are chars, right?), standard streams skip whitespace by default, which would account for the 75 missing bytes. You can use std::noskipws to stop skipping it.
fin >> std::noskipws >> b;
Note: ios::binary has no effect on this behaviour even though one might expect so. It only disables translating line-endings as far as I know.
If you don't want to worry about processing, you can use functions that fulfill UnformattedInputFunction. cppreference has a nice example on how to read a binary file with istream::read.
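As a rough illustration of the unformatted approach, here is a minimal sketch that loads a whole file with istream::read. The helper name read_all_bytes is made up for this example, and it assumes the file fits in memory and that the stream is seekable.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at `path` into `out`.
// Returns false if the file cannot be opened or cannot be read completely.
bool read_all_bytes(const std::string& path, std::vector<char>& out) {
    std::ifstream fin(path.c_str(), std::ios::binary);
    if (!fin) return false;

    // Find the size by seeking to the end, then rewind to the start.
    fin.seekg(0, std::ios::end);
    const std::streamoff size = fin.tellg();
    if (size < 0) return false;
    fin.seekg(0, std::ios::beg);

    out.resize(static_cast<std::size_t>(size));
    if (size > 0) {
        // read() is unformatted: bytes are copied verbatim, whitespace included.
        fin.read(out.data(), static_cast<std::streamsize>(size));
    }
    return static_cast<bool>(fin);
}
```

Calling read_all_bytes(file, h) on the 4096-byte file from the question should then leave h.size() == 4096, because nothing is skipped.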

Related

Ifstream in c++

I need some help with my code.
I need to read this information into my C++ program from another file, whose contents look like this:
Human:3137161264 46
This is what I wrote for it. It reads the word "Human" correctly, but then it gets random numbers instead of the ones written in the file:
struct TSpecie {
    string id;
    int sizeGen;
    int numCs;
};

TSpecie readFile(string file){
    TSpecie a;
    ifstream in(file);
    if (in){
        getline(in,a.id,':');
        in >> a.sizeGen;
        in >> a.numCs;
    }
    else
        cout << "File not found";
    return a;
}
Hope you can solve it and thanks for your help
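3137161264 does not fit in a 32-bit int (the maximum is 2147483647), so in >> a.sizeGen fails and sets failbit on the stream. Since C++11 the variable is then set to the largest int value; before C++11 it was left unmodified, i.e. uninitialized here. Because the stream is now in a failed state, the following in >> a.numCs reads nothing, so numCs stays uninitialized, which is where the random numbers come from.
An unsigned int sizeGen would be enough for this value (assuming a 32-bit unsigned int), but consider long long or unsigned long long too.
Edit 1: As pointed out by @nwp in the comments to your question, you can also check your stream to see whether an error has occurred:
// read something and then
if (!in) {
// error occurred during the last read
}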
Always test whether input was successful after reading from the stream:
if (std::getline(in, a.id, ':') >> a.sizeGen >> a.numCs) {
// ...
}
Most likely the input just failed; for example, the first number probably can't be read successfully. Also note that std::getline() is an unformatted input function, i.e., it won't skip leading whitespace. For example, the newline after the last number read is still in the stream, so a subsequent std::getline() would put it at the start of the next ID (since your use of std::getline() stops at a colon, the only effect is an odd-looking ID).
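Putting the two answers together, a minimal sketch could look like the following. The names Species and read_species are invented for this example, and long long is an assumption about how large sizeGen can get.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Same fields as the question's struct, but with a wider type for sizeGen
// (assumption: a 64-bit integer is large enough for the values in the file).
struct Species {
    std::string id;
    long long sizeGen = 0;
    int numCs = 0;
};

// Hypothetical helper: reads one "id:sizeGen numCs" record.
// Returns true only if all three fields were extracted successfully.
bool read_species(std::istream& in, Species& out) {
    Species tmp;
    if (std::getline(in, tmp.id, ':') >> tmp.sizeGen >> tmp.numCs) {
        out = tmp;
        return true;
    }
    return false;  // `out` is left untouched on failure
}
```

Because the function takes a std::istream&, it can be fed an std::ifstream for the real file or an std::istringstream when testing.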

C++ fstream: how to know size of string when reading?

...as someone may remember, I'm still stuck on C++ strings. Ok, I can write a string to a file using a fstream as follows
outStream.write((char *) s.c_str(), s.size());
When I want to read that string, I can do
inStream.read((char *) s.c_str(), s.size());
Everything works as expected. The problem is: if I change the length of my string after writing it to a file and before reading it again, printing that string won't bring me back my original string but a shorter/longer one. So: if I have to store many strings on a file, how can I know their size when reading it back?
Thanks a lot!
You shouldn’t be using the unformatted I/O functions (read() and write()) if you just want to write ordinary human-readable string data. Generally you only use those functions when you need to read and write compact binary data, which for a beginner is probably unnecessary. You can write ordinary lines of text instead:
std::string text = "This is some test data.";
{
std::ofstream file("data.txt");
file << text << '\n';
}
Then read them back with getline():
{
std::ifstream file("data.txt");
std::string line;
std::getline(file, line);
// line == text
}
You can also use the regular formatting operator >> to read, but when applied to string, it reads tokens (nonwhitespace characters separated by whitespace), not whole lines:
{
std::ifstream file("data.txt");
std::vector<std::string> words;
std::string word;
while (file >> word) {
words.push_back(word);
}
// words == {"This", "is", "some", "test", "data."}
}
All of the formatted I/O functions automatically handle memory management for you, so there is no need to worry about the length of your strings.
Although your writing solution is more or less acceptable, your reading solution is fundamentally flawed: it uses the internal storage of your old string as a character buffer for your new string, which is very, very bad (to put it mildly).
You should switch to a formatted way of reading and writing the streams, like this:
Writing:
outStream << s;
Reading:
inStream >> s;
This way you would not need to bother determining the lengths of your strings at all.
This code is different in that it stops at whitespace characters; you can use getline if you want to stop only at \n characters.
You can write the strings and write an additional 0 (null terminator) to the file. Then it will be easy to separate strings later. Also, you might want to read and write lines
outfile << string1 << endl;
getline(infile, string2, '\n');
If you want to use unformatted I/O, your only real options are to either use a fixed size or to prepend the size somehow so you know how many characters to read. Otherwise, when using formatted I/O, it somewhat depends on what your strings contain: if they can contain all viable characters, you would need to implement some sort of quoting mechanism. In simple cases, where strings consist e.g. of space-free sequences, you can just use formatted I/O and be sure to write a space after each string. If there is some character your strings never contain, it can serve as a quote, and it is relatively easy to process quotes:
// manipulator: consumes the opening quote, or sets failbit if it is missing
std::istream& quote(std::istream& in) {
    char c;
    if (in >> c && c != '"') {
        in.setstate(std::ios_base::failbit);
    }
    return in;
}
// writing
out << '"' << string << '"';
// reading
std::getline(in >> std::ws >> quote, string, '"');
Obviously, you might want to bundle this functionality into a class.
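For the "prepend the size" option mentioned above, a minimal sketch might look like this. The function names write_string and read_string are made up for illustration, and the format assumes strings shorter than 4 GiB that are read back on a machine with the same byte order.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes a 32-bit length followed by the raw characters.
// Assumption: reader and writer share the same byte order.
void write_string(std::ostream& out, const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), static_cast<std::streamsize>(len));
}

// Reads a string written by write_string; returns false on a short read.
// Note: a corrupt length field is not guarded against in this sketch.
bool read_string(std::istream& in, std::string& s) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    std::string tmp(len, '\0');
    if (len > 0 && !in.read(&tmp[0], static_cast<std::streamsize>(len))) return false;
    s.swap(tmp);
    return true;
}
```

The reader sizes its own buffer from the stored length, so it no longer matters what the destination string held before the call.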

Reading binary istream byte by byte

I was attempting to read a binary file byte by byte using an ifstream. I've used istream methods like get() before to read entire chunks of a binary file at once without a problem. But my current task lends itself to going byte by byte and relying on the buffering in the io-system to make it efficient. The problem is that I seemed to reach the end of the file several bytes sooner than I should. So I wrote the following test program:
#include <iostream>
#include <fstream>
int main() {
typedef unsigned char uint8;
std::ifstream source("test.dat", std::ios_base::binary);
while (source) {
std::ios::pos_type before = source.tellg();
uint8 x;
source >> x;
std::ios::pos_type after = source.tellg();
std::cout << before << ' ' << static_cast<int>(x) << ' '
<< after << std::endl;
}
return 0;
}
This dumps the contents of test.dat, one byte per line, showing the file position before and after.
Sure enough, if my file happens to have the two-byte sequence 0x0D-0x0A (which corresponds to carriage return and line feed), those bytes are skipped.
I've opened the stream in binary mode. Shouldn't that prevent it from interpreting line separators?
Do extraction operators always use text mode?
What's the right way to read byte by byte from a binary istream?
MSVC++ 2008 on Windows.
The >> extractors are for formatted input; they skip white space (by default). For single character unformatted input, you can use istream::get() (returns an int, either EOF if the read fails, or a value in the range [0,UCHAR_MAX]) or istream::get(char&) (puts the character read in the argument, returns something which converts to bool, true if the read succeeds, and false if it fails).
There is a read() member function in which you can specify the number of bytes.
Why are you using formatted extraction, rather than .read()?
source.get()
will give you a single byte. It is an unformatted input function.
operator>> is a formatted input function, which by default skips whitespace characters.
As others mentioned, you should use istream::read(). But, if you must use formatted extraction, consider std::noskipws.
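To make the difference concrete, here is a small sketch comparing the two styles on the same input. The function names are invented for this example; both take any std::istream, so they can be tried on a std::istringstream before pointing them at a file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every byte with the unformatted get(char&); nothing is skipped.
std::vector<unsigned char> read_with_get(std::istream& in) {
    std::vector<unsigned char> bytes;
    char c;
    while (in.get(c)) bytes.push_back(static_cast<unsigned char>(c));
    return bytes;
}

// Reads with the formatted operator>>; whitespace (including 0x0D and 0x0A)
// is skipped unless std::noskipws has been applied to the stream.
std::vector<unsigned char> read_with_extractor(std::istream& in) {
    std::vector<unsigned char> bytes;
    unsigned char c;
    while (in >> c) bytes.push_back(c);
    return bytes;
}
```

On the six bytes 'A', 0x0D, 0x0A, 'B', ' ', 'C', read_with_get returns all six, while read_with_extractor returns only the three letters unless the stream had std::noskipws applied first.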

using fstream to read every character including spaces and newline

I wanted to use fstream to read a txt file.
I am using inFile >> characterToConvert, but the problem is that this omits any spaces and newlines.
I am writing an encryption program so I need to include the spaces and newlines.
What would be the proper way to go about accomplishing this?
Probably the best way is to read the entire file's contents into a string, which can be done very easily using ifstream's rdbuf() method:
std::ifstream in("myfile");
std::stringstream buffer;
buffer << in.rdbuf();
std::string contents(buffer.str());
You can then use regular string manipulation now that you've got everything from the file.
While Tomek was asking about reading a text file, the same approach will work for reading binary data, though the std::ios::binary flag needs to be provided when creating the input file stream.
For encryption, you're better off opening your file in binary mode. Use something like this to put the bytes of a file into a vector:
std::ifstream ifs("foobar.txt", std::ios::binary);
ifs.seekg(0, std::ios::end);
std::ifstream::pos_type filesize = ifs.tellg();
ifs.seekg(0, std::ios::beg);
std::vector<char> bytes(filesize);
ifs.read(bytes.data(), filesize); // unlike &bytes[0], data() is also valid when the file is empty
Edit: fixed a subtle bug as per the comments.
I haven't tested this, but I believe you need to clear the "skip whitespace" flag:
inFile.unsetf(ios_base::skipws);
I use the following reference for C++ streams:
IOstream Library
std::ifstream ifs( "filename.txt" );
std::string str( ( std::istreambuf_iterator<char>( ifs ) ),
std::istreambuf_iterator<char>()
);
The following C++ code will read an entire file line by line:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main ()
{
    string line;
    ifstream myfile ("foo.txt");
    if (myfile.is_open()){
        // test getline itself; looping on !eof() processes a failed read after the last line
        while (getline (myfile,line)){
            cout << line << endl;
        }
        myfile.close();
    }
    return 0;
}
post your code and I can give you more specific help to your problem...
A lot of the benefit of the istream layer is providing basic formatting and parsing for simple types to and from a stream. For the purposes that you describe, none of this is really important and you are just interested in the file as a stream of bytes.
For this purpose you may be better off just using the basic_streambuf interface provided by a filebuf. The 'skip whitespace' behaviour is part of the istream interface functionality that you just don't need.
filebuf underlies an ifstream, but it is perfectly valid to use it directly.
std::filebuf myfile;
myfile.open( "myfile.dat", std::ios_base::in | std::ios_base::binary );
// gets next char, then moves 'get' pointer to next char in the file
int ch = myfile.sbumpc();
// get (up to) the next n chars from the stream
std::streamsize getcount = myfile.sgetn( char_array, n );
Also have a look at the functions snextc (moves the 'get' pointer forward and then returns the current char), sgetc (gets the current char but doesn't move the 'get' pointer) and sungetc (backs up the 'get' pointer by one position if possible).
When you don't need any of the insertion and extraction operators provided by an istream class and just need a basic byte interface, often the streambuf interface (filebuf, stringbuf) is more appropriate than an istream interface (ifstream, istringstream).
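As a sketch of how the streambuf interface might be used end to end, the following reads a whole file in chunks with sgetn. The helper name read_via_filebuf and the 4096-byte chunk size are arbitrary choices for this example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a file through std::filebuf directly,
// without any istream layer on top. Returns false if the file cannot be opened.
bool read_via_filebuf(const std::string& path, std::vector<char>& out) {
    std::filebuf buf;
    if (!buf.open(path.c_str(), std::ios_base::in | std::ios_base::binary)) {
        return false;
    }
    out.clear();
    char chunk[4096];
    std::streamsize got;
    // sgetn returns the number of characters actually read; 0 means end of file.
    while ((got = buf.sgetn(chunk, static_cast<std::streamsize>(sizeof chunk))) > 0) {
        out.insert(out.end(), chunk, chunk + got);
    }
    buf.close();
    return true;
}
```

There are no formatting flags involved at this level, so there is no skipws setting to forget about.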
You can call int fstream::get(), which will read a single character from the stream. You can also use istream& fstream::read(char*, streamsize), which does the same operation as get(), just over multiple characters. The given links include examples of using each method.
I also recommend reading and writing in binary mode. This allows ASCII control characters to be properly read from and written to files. Otherwise, an encrypt/decrypt operation pair might result in non-identical files. To do this, you open the filestream with the ios::binary flag. With a binary file, you want to use the read() method.
Another way is to use istreambuf_iterator. Sample code is below (note the extra parentheses around the first argument; without them the line is parsed as a function declaration):
ifstream inputFile("test.data");
string fileData((istreambuf_iterator<char>(inputFile)), istreambuf_iterator<char>());
For encryption, you should probably use read(). Encryption algorithms usually deal with fixed-size blocks. Oh, and to open in binary mode (no translation from \r\n to \n), pass ios_base::binary as the second parameter to the constructor or open() call.
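A minimal sketch of block-wise reading with read() and gcount() follows. The function name read_blocks is made up for this example; a real cipher would process each block in place rather than collect them, and padding of the final short block is left out.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits a stream into blocks of at most block_size bytes.
// The last block may be shorter; gcount() reports how many bytes arrived.
std::vector<std::string> read_blocks(std::istream& in, std::size_t block_size) {
    std::vector<std::string> blocks;
    std::string buffer(block_size, '\0');
    while (in) {
        in.read(&buffer[0], static_cast<std::streamsize>(block_size));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        blocks.push_back(buffer.substr(0, static_cast<std::size_t>(got)));
    }
    return blocks;
}
```

For a file, the stream passed in would be an std::ifstream opened with std::ios_base::binary, as described above.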
Simple:
#include <fstream>
#include <ios>
std::ifstream ifs("file");
ifs >> std::noskipws;
That's all.
ifstream ifile(path);
std::string contents((std::istreambuf_iterator<char>(ifile)), std::istreambuf_iterator<char>());
ifile.close();
I also find that the get() method of an ifstream object reads every character of the file, without needing to unset std::ios_base::skipws. Quote from C++ Primer:
Several of the unformatted operations deal with a stream one byte at a time. These operations, which are described in Table 17.19, read rather than ignore whitespace.
These operations are listed below:
is.get(), os.put(), is.putback(), is.unget() and is.peek().
Below is a minimal working example:
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::ifstream in_file("input.txt");
char s;
if (in_file.is_open()){
int count = 0;
while (in_file.get(s)){
std::cout << count << ": "<< (int)s <<'\n';
count++;
}
}
else{
std::cout << "Unable to open input.txt.\n";
}
in_file.close();
return 0;
}
The content of the input file (cat input.txt) is
ab cd
ef gh
The output of the program is:
0: 97
1: 98
2: 32
3: 99
4: 100
5: 10
6: 101
7: 102
8: 32
9: 103
10: 104
11: 32
12: 10
10 and 32 are the decimal codes of the newline and space characters (the second line of input.txt ends with a trailing space, hence the 32 at position 11). All characters, whitespace included, have been read.
As Charles Bailey correctly pointed out, you don't need fstream's services just to read bytes. So forget this iostream silliness, use fopen/fread and be done with it. C stdio is part of C++, you know ;)
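For completeness, the C stdio route could be sketched like this. The helper name read_with_stdio and the 4096-byte chunk size are arbitrary choices for this example.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads a whole file with fopen/fread.
// "rb" opens in binary mode, so no line-ending translation takes place.
bool read_with_stdio(const char* path, std::vector<unsigned char>& out) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    out.clear();
    unsigned char chunk[4096];
    std::size_t got;
    // fread returns the number of items read; 0 means end of file or error.
    while ((got = std::fread(chunk, 1, sizeof chunk, f)) > 0) {
        out.insert(out.end(), chunk, chunk + got);
    }
    const bool ok = std::ferror(f) == 0;
    std::fclose(f);
    return ok;
}
```

The trade-off is manual resource handling: every return path after a successful fopen has to reach fclose.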