getline seems not to be working correctly - C++

Please tell me what I am doing wrong here. What I want to do is this:
I have a text file with four numbers, each of which has 15 digits:
std::ifstream file("numbers.txt",std::ios::binary);
I'm trying to read those numbers into my array:
char num[4][15];
What I think I'm doing is: for as long as the end of the file hasn't been reached, write every line (max 15 chars, ending at '\n') into num[lines]. But this doesn't work. Firstly, it reads only the first number correctly; the rest are just "" (empty strings). Secondly, file.eof() doesn't seem to work correctly either. With the text file shown below the code, lines reached 156. What's going on?
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(num[lines],15,'\n');
}
So the whole "routine" looks like this:
int main()
{
std::ifstream file("numbers.txt",std::ios::binary);
char numbers[4][15];
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(numbers[lines],15,'\n');// sizeof(numbers[0])
}
}
These are the contents of my text file:
111111111111111
222222222222222
333333333333333
444444444444444
P.S.
I'm using VS2010 sp1

Do not use the eof() function! The canonical way to read lines is:
while( getline( cin, line ) ) {
// do something with line
}

file.getline() extracts 14 characters, filling in num[0][0] .. num[0][13]. Then it stores a '\0' in num[0][14] and sets the failbit on file, because that's what it does when the buffer is full but the terminating character has not been reached.
Further attempts to call file.getline() do nothing because failbit is set.
Tests for !file.eof() return true because the eofbit is not set.
Edit: to give a working example, best is to use strings, of course, but to fill in your char array, you could do this:
#include <iostream>
#include <fstream>
int main()
{
std::ifstream file("numbers.txt"); // not binary!
char numbers[4][16]={}; // 16 to fit 15 chars and the '\0'
for (unsigned lines = 0;
lines < 4 && file.getline(numbers[lines], 16);
++lines)
{
std::cout << "numbers[" << lines << "] = " << numbers[lines] << '\n';
}
}
tested on Visual Studio 2010 SP1

According to the ifstream documentation, reading stops either after n-1 characters have been read or when the delimiter is found, so the first read takes only 14 bytes.
It reads bytes: '1' (the character) is 0x31, so your buffer is filled with 0x31 rather than the number 1 you seem to expect, and the last character is 0 (the end of the C string).
Side note: your code doesn't check that lines stays within the bounds of your array.
Using getline implies you're expecting text, yet you open the file in binary mode; that seems wrong to me.

It looks like the '\n' at the end of the first line is not being consumed and remains in the buffer, so the next getline() reads it.
Try adding a file.get() after each getline().
If one file.get() does not work, try two, because under the default Windows line-ending convention a line ends with '\r\n'.

Change it to the following:
#include <fstream>
int main()
{
//no need to use std::ios_base::binary since it's ASCII data
std::ifstream file("numbers.txt");
//allocate one more position in the array for the null terminator
char numbers[4][16];
//you only have 4 lines, so don't use EOF since that will cause an extra read
//which will then cause an extra loop, causing undefined behavior
for (unsigned lines = 0; lines < 4; ++lines)
{
//copy into your buffer that also includes space for a terminating null
//placing in if-statement checks for the failbit of ifstream
if (!file.getline(numbers[lines], 16,'\n'))
{
//make sure to place a terminating NULL in empty string
//since the read failed
numbers[lines][0] = '\0';
}
}
}

Related

C++ I'm trying to stream a file, and replace the first letter of every line streamed. It doesn't seem to be working as expected

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <iomanip>
void add1(std::fstream& files)
{
char c;
int i=0;
int j=0;
int k=0;
int con=0;
string word;
while(files.get(c)&&!files.eof())
{
i++;
j++;
if(c=='\n'||(con>=1&&isspace(c)))
{
con++;
if(con>=2)
{
break;
}
else
{
cout<<j<<"\/"<<i<<endl;
files.seekp(i-j,files.beg);
files.write("h",1);
files.seekg(i);
// seekg ends the loop. I tried fstream::clear. I think it would work perfectly if seekg worked.
// Without seekg it works, but only for 3 lines; then it's off.
j=0;
word="";
}
}
else
{
con=0;
word=word+c;
}
}
}
The goal is to be able to stream the file and replace the first letter of every line in the file while streaming.
You seem to have a logical error and are making things overcomplicated.
I do not know what you want to do with your variable "word". It is not used anywhere, so I will ignore it.
Then you are playing with both the read and write pointers. That is not necessary; you only need to manipulate the write pointer.
Then, you want to "stream" something. I do not fully understand this. Maybe it means that you always want to write something to the stream, even if you do not replace anything. In my understanding, this would only make sense if you had 2 streams. But in that case it would be very simple, and no further thought would be necessary.
If we use the same stream and do not want to replace a character, then that character already exists in the file and need not be overwritten with the same character again.
So, if there is nothing to replace, then we write nothing.
Also, and this is very important, we do not perform a replacement on an empty line, because there is nothing to replace: there is no first character in an empty line.
And, most important, we cannot add characters to the same fstream. To do that, we would have to shift the rest of the file one position to the right. Therefore, 2 streams are always better; then this problem would not occur.
So, what's the logic?
Algorithm:
We always look at the previously read character. If that was a '\n' and the current character is not, then we are at the start of a new line and can replace the first character.
That is all.
It also takes into account the case where a newline is encoded with 2 characters (for example \r\n).
And it is easy to implement: about 10 lines of code.
Please see:
#include <iostream>
#include <fstream>
#include <string>
constexpr char ReplacementCharacter{ 'h' };
void replaceFirstCharacterOfLine(std::fstream& fileStream) {
// Here we store the previously read character. Conceptually, a file always starts
// with a new line. Therefore we pretend that the last read character is a newline
char previouslyReadCharacter{'\n'};
// Here we store the currently read character
char currentCharacter{};
// Get characters from the file as long as there are characters, so, until eof
while (fileStream.get(currentCharacter)) {
// Now check whether a new line has started. We ignore empty lines!
if ((previouslyReadCharacter == '\n') && (currentCharacter != '\n')) {
// So the last character was a newline and this one is different. So, we are in a new, non-empty line
// Set replacement character
currentCharacter = ReplacementCharacter;
// Go one back with the write pointer
fileStream.seekp(-1, std::ios_base::cur);
// Write (and with that increment the file pointer again)
fileStream.put(currentCharacter);
// Write to file
fileStream.flush();
}
else {
// Do not replace the first character. So nothing to be done here
}
}
// Now, set the previouslyReadCharacter to the just read currentCharacter
previouslyReadCharacter = currentCharacter;
}
}
int main() {
const std::string filename{"r:\\replace.txt"};
// Open file
std::fstream fileStream{ filename };
// Check, if file could be opened
if (fileStream)
replaceFirstCharacterOfLine(fileStream);
else
std::cerr << "\n\n*** Error: Could not open file '" << filename << "'\n\n";
return 0;
}

Unable to parse final word in text file using peek()

I'm attempting to write a lexer and parser but I'm having trouble getting the final variable in a text file due to in_file.tellg() equaling -1. My program only works if I add a space character after the variable, otherwise I get a compiler error. I want to mention that I'm able to get every other variable in the text file but the last one. I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
My program is something like this:
ifstream in(file_name);
char c;
in >> noskipws;
while(in >> c ){
if(is_letter_part_of_variable(c)) {
int start_pos = in.tellg(),
end_pos,
length;
while(is_letter_part_of_variable(c) && in.peek()!=EOF ) {
in>>c;
}
end_pos = in.tellg(); // This becomes -1 for some reason
length = end_pos - start_pos; // Should be 7
// Reset file pointer to original position to chomp word.
in.clear();
in.seekg(start_pos-1, in.beg);
// The word 'message' should go in here.
char *identifier = new char[length];
in.read(identifier, length);
identifier[length] = '\0';
}
}
example.text
message = "Hello, World"
print message
I tried removing peek()!= EOF which gives me an eternal loop. I tried !in_file.eof() and that also makes tellg() equal to -1. What can I do to fix/enhance this code?
I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
Close. peek attempts to read a character and returns EOF if there is nothing left to read. Hitting the end of the stream this way sets the stream's eofbit, and once the stream is no longer in a good state, tellg fails and returns -1.
Simple Solution
Clear the stream's error state (in.clear()) before calling tellg.
Better solution
Use std::string.
std::string identifier;
while(in>>c && is_letter_part_of_variable(c)) {
identifier += c;
}
All of the messing around with peek, seekg, tellg and the dreaded new vanish.

C++ reading a file in binary mode. Problems with END OF FILE

I am learning C++ and I have to read a file in binary mode. Here's how I do it (following the C++ reference):
unsigned values[255];
unsigned total;
ifstream in ("test.txt", ifstream::binary);
while(in.good()){
unsigned val = in.get();
if(in.good()){
values[val]++;
total++;
cout << val <<endl;
}
}
in.close();
So, I am reading the file byte by byte for as long as in.good() is true. I put a cout at the end of the while loop in order to understand what's happening, and here is the output:
marco#iceland:~/workspace/huffman$ ./main
97
97
97
97
10
98
98
10
99
99
99
99
10
100
100
10
101
101
10
221497852
marco#iceland:~/workspace/huffman$
Now, the input file "test.txt" is just:
aaaa
bb
cccc
dd
ee
So everything works perfectly till the end, where there's that 221497852. I guess it's something about the end of file, but I can't figure the problem out.
I am using gedit & g++ on a debian machine(64bit).
Any help will be appreciated.
Many thanks,
Marco
fstream::get returns an int value, not an unsigned one. This is one of the problems.
Secondly, you are reading in binary, so rather than extracting one character at a time you can read the whole file as a block with fstream::read:
// read a file into memory
#include <iostream> // std::cout
#include <fstream> // std::ifstream
int main () {
std::ifstream is ("test.txt", std::ifstream::binary);
if (is) {
// get length of file:
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
char * buffer = new char [length];
std::cout << "Reading " << length << " characters... ";
// read data as a block:
is.read (buffer,length);
if (is)
std::cout << "all characters read successfully.";
else
std::cout << "error: only " << is.gcount() << " could be read";
is.close();
// ...buffer contains the entire file...
delete[] buffer;
}
return 0;
}
This isn't the way istream::get() was designed to be used.
The classical idiom for using this function would be:
for ( int val = in.get(); val != EOF; val = in.get() ) {
// ...
}
or even more idiomatic:
char ch;
while ( in.get( ch ) ) {
// ...
}
The first loop is really inherited from C, where in.get() is
the equivalent of fgetc().
Still, as far as I can tell, the code you give should work.
It's not idiomatic, and it's not
The C++ standard is unclear what it should return if the
character value read is negative. fgetc() requires a value in
the range [0...UCHAR_MAX], and I think it safe to assume that
this is the intent here. It is, at least, what every
implementation I've used does. But this doesn't affect your
input. Depending on how the implementation interprets the
standard, the return value of in.get() must be in the range
[0...UCHAR_MAX] or [CHAR_MIN...CHAR_MAX], or it must be EOF
(typically -1). (The reason I'm fairly sure that the intent is
to require [0...UCHAR_MAX] is because otherwise, you may not
be able to distinguish end of file from a valid character.)
And if the return value is EOF (almost always
-1), failbit should be set, so in.good() would return
false. There is no case where in.get() would be allowed
to return 221497852. The only explanation I can possibly think
of for your results is that your file has some character with
bit 7 set at the end of the file, that the implementation is
returning a negative number for this (but not end of file,
because it is a character), which results in an out of bounds
index in values[val], and that this out of bounds index
somehow ends up modifying val. Or that your implementation is
broken, and is not setting failbit when it returns end of
file.
To be certain, I'd be interested in knowing what you get from
the following:
std::ifstream in( "test.txt", std::ios_base::binary );
int ch = in.get();
while ( ch != std::istream::traits_type::eof() ) {
std::cout << ch << std::endl;
ch = in.get();
}
This avoids any issues of a possibly invalid index, and any type
conversions (although the conversion int to unsigned is well
defined). Also, out of curiosity (since I can only access VC++
here), you might try replacing in as follows:
std::istringstream in( "\n\xE5" );
I would expect to get:
10
229
(Assuming 8 bit bytes and an ASCII based code set. Both of
which are almost, but not quite universal today.)
I've eventually figured this out.
The problem apparently wasn't caused by the code. The problem was gedit: it always appends a newline character at the end of the file. This also happens with other editors, such as vim. Some editors can be configured not to append anything, but in gedit this is apparently not possible. https://askubuntu.com/questions/13317/how-to-stop-gedit-gvim-vim-nano-from-adding-end-of-file-newline-char
Cheers to everyone who answered me,
Marco

Error reading and printing a text file with C++

I have a bug in my code (the code is at the end of the question). The purpose of my C++ executable is to read a file that contains numbers, copy them into a std::vector, and then just print the contents to stdout. Where is the problem? (atoi?)
I have a simple text file that contains the following numbers (each line has one number)
mini01:algorithms ios$ cat numbers.txt
1
2
3
4
5
When I execute the program I receive one more line:
mini01:algorithms ios$ ./a.out
1
2
3
4
5
0
Why do I get the 6th line in stdout?
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
void algorithm(std::vector<int>& v) {
for(int i=0; i < v.size(); i++) {
cout << v[i] << endl;
}
}
int main(int argc, char **argv) {
string line;
std::vector<int> vector1;
ifstream myfile("numbers.txt");
if ( myfile.is_open()) {
while( myfile.good() )
{
getline(myfile, line);
vector1.push_back(atoi(line.c_str()));
}
myfile.close();
}
else {
cout << "Unable to open file" << endl;
}
algorithm(vector1);
return 0;
}
You should not use while (myfile.good()), as it will loop once too many.
Instead use
while (getline(...))
The reason you can't use the flags to check for looping, is that they don't get set until after an input/output operation notices the problem (error or end-of-file).
Don't use good() as the condition of your extraction loop. It does not accurately indicate whether the next read will succeed or not. Move your call to getline into the condition:
while(getline(myfile, line))
{
vector1.push_back(atoi(line.c_str()));
}
The reason it is failing in this particular case is because text files typically have an \n at the end of the file (that is not shown by text editors). When the last line is read, this \n is extracted from the stream. Yes, that may be the very last character in the file, but getline doesn't care to look any further than the \n it has extracted. It's done. It does not set the EOF flag or do anything else to cause good() to return false.
So at the next iteration, good() is still true, the loop continues and getline attempts to extract from the file. However, now there's nothing left to extract and you just get line set to an empty string. This then gets converted to an int and pushed into the vector1, giving you the extra value.
In fact, the only robust way to check if there is a problem with extraction is to check the stream's status bits after extracting. The easiest way to do this is to make the extraction itself the condition.
You read one too many lines, since the while condition only becomes false AFTER you have had a "bad read".
Welcome to the wonderful world of C++. Before we get to the bug, I would advise you to drop the std:: qualification when defining or declaring a vector, as you already have
using namespace std;
A second piece of advice would be to use the pre-increment operator ++i instead of i++ wherever feasible.
Coming to the problem itself, the issue is an empty line being read at the end of the file. A simple way to avoid this is to check the length of line before using it.
getline(myfile, line);
if (line.size()) {
vector1.push_back(atoi(line.c_str()));
}
This would now enable your program to read a file interspersed with empty lines. To be even more foolproof, you can check the line for any non-numeric characters before using atoi on it. However, the best solution, as mentioned, is to move the read into the loop condition.

C++ getline and gcount

Suppose I have an std::istream pointing to the following contents (the line break is a '\n' character):
12345678
9
and run the following code:
std::istream & is = ...
char buff[9];
is.getline(buff, 9);
int n = is.gcount();
Now n == 8 and strcmp(buff, "12345678") == 0. The question is, how do I know that I read the entire line instead of some of the line?
If the stream instead points to the following contents:
123456789
0
and the same code is executed, I am still on the same line. How do I differentiate between these two cases?
Use std::string and the free std::getline function:
#include <istream>
#include <string>
// ...
std::istream & is = ...;
std::string line;
while (std::getline(is, line))
{
// process line
}
The key to your answer is in your question
is.getline(buff, 9);
int n = is.gcount();
Now n == 8
According to the reference for getline, it will extract up to n - 1 characters -- in your case, up to 8. This is slightly misleading, because if the nth character is the delimiter, it will also be extracted (but not copied to the buffer). More importantly, if you have NOT reached the delimiter before reaching n characters, this piece is relevant:
If the function stops reading because this size (n) is reached,
the failbit internal flag is set
So, in short, if the fail bit is set, you're still on the same line (and you'll have to clear the state in order to continue processing the istream). Sometimes, eof will also set the fail bit. So you probably want to check if the state is failbit and only failbit:
if ( is.rdstate() == std::ios::failbit ) {
std::cout << "Filled the buffer, but did NOT finish the line\n";
is.clear();
}
Read the next character. If it is a newline, you have read all the data on the line.
BTW, popular usage is to read the entire line, up to the newline, into an std::string, then process the string.