Detecting end of input using std::getline - c++

I have a code with the following snippet:
std::string input;
while(std::getline(std::cin, input))
{
//some read only processing with input
}
When I run the program code, I redirect stdin input through the file in.txt (which was created using gedit), and it contains:
ABCD
DEFG
HIJK
Each of the above lines ends with one newline in the file in.txt.
The problem I am facing is that after the while loop runs 3 times (once for each line), program control does not move forward and is stuck. Why is this happening, and what can I do to resolve the problem?
Some clarification:
I want to be able to run the program from the command line as such:
$ g++ program.cc -o out
$ ./out < in.txt
Additional Information:
I did some debugging and found that the while loop actually runs 4 times (the fourth time with input as an empty string). This causes the program to stall, because the //some read only processing with input step is unable to do its work.
So my refined question:
1) Why is the 4th loop running at all?
The rationale behind putting std::getline() in the while loop's condition must be that when getline() cannot read any more input, the condition evaluates to false and the loop ends. Instead, the while loop continues with an empty string! Why then have getline in the while loop condition at all? Isn't that bad design?
2) How do I ensure that the while doesn't run for the 4th time without using break statements?
For now I have used a break statement and string stream as follows:
std::string input;
char temp;
while(std::getline(std::cin, input))
{
std::istringstream iss(input);
if (!(iss >>temp))
{
break;
}
//some read only processing with input
}
But clearly there has to be a more elegant way.

Contrary to DeadMG's answer, I believe the problem is with the contents of your input file, not with your expectation about the behavior of the newline character.
UPDATE : Now that I've had a chance to play with gedit, I think I see what caused the problem. gedit apparently is designed to make it difficult to create a file without a newline on the last line (which is sensible behavior). If you open gedit and type three lines of input, typing Enter at the end of each line, then save the file, it will actually create a 4-line file, with the 4th line empty. The complete contents of the file, using your example, would then be "ABCD\nEFGH\nIJKL\n\n". To avoid creating that extra empty line, just don't type Enter at the end of the last line; gedit will provide the required newline character for you.
(As a special case, if you don't enter anything at all, gedit will create an empty file.)
Note this important distinction: In gedit, typing Enter creates a new line. In a text file stored on disk, a newline character (LF, '\n') denotes the end of the current line.
Text file representations vary from system to system. The most common representations for an end-of-line marker are a single ASCII LF (newline) character (Unix, Linux, and similar systems) and a sequence of two characters, CR and LF (MS Windows). I'll assume the Unix-like representation here. (UPDATE: In a comment, you said you're using Ubuntu 12.04 and gcc 4.6.3, so text files should definitely be in the Unix-style format.)
I just wrote the following program based on the code in your question:
#include <iostream>
#include <string>
int main() {
std::string input;
int line_number = 0;
while(std::getline(std::cin, input))
{
line_number ++;
std::cout << "line " << line_number
<< ", input = \"" << input << "\"\n";
}
}
and I created a 3-line text file in.txt:
ABCD
EFGH
IJHL
In the file in.txt each line is terminated by a single newline character.
Here's the output I get:
$ cat in.txt
ABCD
EFGH
IJHL
$ g++ c.cpp -o c
$ ./c < in.txt
line 1, input = "ABCD"
line 2, input = "EFGH"
line 3, input = "IJHL"
$
The final newline at the very end of the file does not start a new line; it merely marks the end of the current line. (A text file that doesn't end with a newline character might not even be valid, depending on the system.)
I can get the behavior you describe if I add a second newline character to the end of in.txt:
$ echo '' >> in.txt
$ cat in.txt
ABCD
EFGH
IJHL
$ ./c < in.txt
line 1, input = "ABCD"
line 2, input = "EFGH"
line 3, input = "IJHL"
line 4, input = ""
$
The program sees an empty line at the end of the input file because there's an empty line at the end of the input file.
If you examine the contents of in.txt, you'll find two newline (LF) characters at the very end, one to mark the end of the third line, and one to mark the end of the (empty) fourth line. (Or if it's a Windows-format text file, you'll find a CR-LF-CR-LF sequence at the very end of the file.)
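The difference between one and two trailing newlines can also be checked without any file by feeding the same bytes through a std::istringstream. This is only a minimal sketch; count_lines is a hypothetical helper introduced for illustration, not part of the program above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts how many times std::getline succeeds
// when reading the given text.
std::size_t count_lines(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::size_t n = 0;
    while (std::getline(in, line)) {
        ++n;
    }
    return n;
}
```

Under these assumptions, count_lines("ABCD\nEFGH\nIJKL\n") should report three lines, and adding a second '\n' at the end should report four.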
If your code doesn't deal properly with empty lines, then you should either ensure that it doesn't receive any empty lines on its input, or, better, modify it so it handles empty lines correctly. How should it handle empty lines? That depends on what the program is required to do, and it's probably entirely up to you. You can silently skip empty lines:
if (input != "") {
// process line
}
or you can treat an empty line as an error:
if (input == "") {
// error handling code
}
or you can treat empty lines as valid data.
In any case, you should decide exactly how you want to handle empty lines.
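As one possible shape for the "silently skip" option, here is a small sketch that collects only the non-empty lines from any input stream. The function name read_nonempty_lines is made up for this example, and using std::string::empty() instead of comparing with "" is just a stylistic choice.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every non-empty line read from the stream.
std::vector<std::string> read_nonempty_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string input;
    while (std::getline(in, input)) {
        if (input.empty()) {
            continue;  // silently skip empty lines
        }
        lines.push_back(input);
    }
    return lines;
}
```

Calling read_nonempty_lines(std::cin) would apply the same filtering to redirected standard input.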

Why is the 4th loop running at all?
Because the text input contains four lines.
The new line character means just that: "Start a new line". It does not mean "The preceding line is complete", and in this test, the difference between those two semantics is revealed. So we have
1. ABCD
2. DEFG
3. HIJK
4.
The newline character at the end of the third line begins a new line- just like it should do and exactly like its name says it will. The fact that that line is empty is why you get back an empty string. If you want to avoid it, trim the newline at the end of the third line, or, simply special-case if (input == "") break;.
The problem has nothing to do with your code, and lies in your faulty expectation of the behaviour of the newline character.

Finale:
Edit: Please read the accepted answer for the correct explanation of the problem and the solution as well.
As a note to people using std::getline() in their while loop condition, remember to check if it's an empty string inside the loop and break accordingly, like this:
std::string input;
while(std::getline(std::cin, input))
{
    if(input.empty())   // compare, don't assign: input = "" would not do this
        break;
    //some read only processing with input
}
My suggestion: Don't have std::getline() in the while loop condition at all. Rather use std::cin like this:
while(std::cin>>a>>b)
{
//loop body
}
This way no extra check for an empty string is required, because operator>> skips leading whitespace (including blank lines), and I find the code design cleaner. Keep in mind that it reads whitespace-separated tokens rather than whole lines.
The latter method removes the need to check explicitly for an empty string. (However, it is always better to check the format of the input as explicitly as possible.)

Related

getline crashing when coming across a newline

I am trying to read some information from a txt file with \n line endings.
However, whenever I come across an empty line, I get a seg fault. I just want the line to be ignored.
code:
std::ifstream config_file (config_);
string input_line;
while (std::getline(config_file, input_line))
{
if (??check for newline??)
continue;
}
I tried so far:
changing getline to these parameters:
(config_file, input_line, '\n')
and this if statement:
if (input_line.at(0) == '\n')
However I always get seg faults ^^'.
Using
if (input_line.at(0) == '\n')
to check for an empty line is wrong, since std::getline reads and discards the delimiter ('\n' in your case).
Instead, use
if (input_line.empty())
std::getline will discard the newline.
You can check for std::string::empty() to detect an empty line.
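A likely explanation for the crash (an assumption, since the full program isn't shown) is that at(0) on an empty string throws std::out_of_range, and an uncaught exception terminates the program. The sketch below shows that behavior next to the empty() check; both function names are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Returns true if calling at(0) on the string throws std::out_of_range.
bool at_zero_throws(const std::string& s) {
    try {
        (void)s.at(0);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical predicate for the read loop: true when the line should be
// skipped. empty() is safe to call on any string, including an empty one.
bool should_skip(const std::string& input_line) {
    return input_line.empty();
}
```

Inside the loop from the question, the placeholder condition could then become if (should_skip(input_line)) continue;.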

Does `cin` produce a newline automatically?

Let's consider the following code:
#include<iostream>
int main()
{
std::cout<<"First-";
std::cout <<"-Second:";
int i;
std::cin>>i;
std::cout<<"Third in a new line.";
while(1){}
}
The output when value 4 is given to i is:
First--Second:4
Third in a new line.
cout doesn't print any newline. But after I input any value(4) for i a newline is printed. There could be two possible reasons for this:
The Enter key I press after typing a numerical value for i is printed as a newline.
cin automatically generates a newline.
Although the first reason seems more plausible, I suspect the second could also be true: after Third is printed, pressing Enter does not print a new line even though the program is still running because of while(1), which suggests that the console window doesn't print a newline when the Enter key is pressed. So it seems that cin automatically prints a newline.
So, why is a newline generated after I give input to cin? Does cin automatically print a newline?
The number and newline you entered is printed by the console software. cin won't print anything in my understanding.
Try giving some input via redirect or pipe, and I guess you will see no new line printed.
For example:
$ echo 4 | ./a.out
First--Second:Third in a new line.
where $ is the prompt and echo 4 | ./a.out(Enter) is your input.
Check this out: http://ideone.com/tBj1uS
You can see there that input and output are separated.
stdin:
1
2
stdout:
First--Second:Third in a new line.
This means that the newline is produced by the Enter key and is part of the input, not the output.
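That the newline belongs to the input can also be observed from inside a program. The following sketch simulates typing "4" and Enter with a std::istringstream, reads the number with operator>>, and then looks at what is still waiting in the stream. next_char_after_int is a made-up name for this illustration.

```cpp
#include <sstream>
#include <string>

// Reads one int with operator>> and reports what is left unread.
// Returns the result of peek(): the next character code, or EOF.
int next_char_after_int(const std::string& typed) {
    std::istringstream in(typed);
    int i = 0;
    in >> i;           // consumes the digits, but not what follows them
    return in.peek();  // look at the next character without extracting it
}
```

With the simulated input "4\n", the character left in the stream is the '\n' from the Enter key: operator>> never wrote it anywhere, it simply did not consume it.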
If someone will have the same problem, I was able to solve it like this:
string ans1;
getline(cin, ans1);
ans1.erase(remove(ans1.begin(), ans1.end(), '\n'),
ans1.end());
int ans1int = atoi(ans1.c_str());
Basically, it works by deleting all the newline characters in the string and then making it an integer or whatever else you need. You will also need the <algorithm> header for it.
It's not that elegant, but hey, it works!

[c++] while(cin.get(str,size).get())... why does it loop infinitely rather than quit when I give a blank line?

it's a c++ question.
char str[10];
while(cin.get(str,10).get())
...
cin.clear();
I hoped that when I press just the Enter key, the while loop would end, because cin.get(str,size) would fail on encountering the blank line. But when I append a .get() to consume the following \n, the while loop just keeps looping when I give a blank line.
Is it that the true result of .get() overrides the false result of cin.get(str,size)?
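cin.get(str,10) extracts characters from the stream and stores them in str as a C-string until either 9 characters have been extracted or the delimiting character is encountered. It does not "get 10 and fail if it can't because the line ended."
Note that the loop condition tests the value returned by the trailing .get(), which is an int (the character read, or EOF once the stream has failed), not the stream state. That value is non-zero in practically every case, so the condition as written will basically never be false.
You will have to capture the line and test its length separately (probably not in the same expression).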
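To see what each part of the expression produces, the sketch below runs the same two calls on a std::istringstream and records the stream state after get(str, 10) and the int returned by the following get(). The GetResult struct and the probe function are hypothetical names used only here.

```cpp
#include <sstream>
#include <string>

struct GetResult {
    bool line_read_ok;  // stream state right after get(str, 10)
    int trailing_get;   // value returned by the get() that follows
};

// Performs in.get(str, 10) followed by in.get() on the given text.
GetResult probe(const std::string& typed) {
    std::istringstream in(typed);
    char str[10];
    in.get(str, 10);    // sets failbit if it stores no characters
    GetResult r;
    r.line_read_ok = !in.fail();
    r.trailing_get = in.get();  // '\n' after a normal line, EOF after a failure
    return r;
}
```

For a blank line the first call does fail, but the value the while condition actually sees is the EOF returned by the second call, which is non-zero and therefore counts as true. Testing the stream itself, for example while (cin.get(str, 10)) with a separate cin.get() inside the body, is one way to make the blank line end the loop.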

compare a string with another string from a file c++

I am trying to compare a string from getline(file,line) with a std::string s="mmee" :
if (line==s){}
but this is never executed. why?? inside the file i have:
mmee
hello
hey
How to trim the spaces or enter from string line?
Your code is correct.
Please check your input file to verify there is no leading or trailing spaces.
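If the file does turn out to contain stray whitespace (for example a trailing '\r' from Windows line endings), a small trimming helper is one way to normalize each line before comparing. This is a sketch only; trim is a hypothetical helper, not a standard library function.

```cpp
#include <string>

// Hypothetical helper: returns a copy of s without leading and trailing
// spaces, tabs, carriage returns, or newlines.
std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // the string is empty or consists only of whitespace
    }
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

The comparison from the question would then read if (trim(line) == s) {}.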
First of all, why do you think that will get executed?
If mmee hello hey is all on one line in the file, then getline puts the whole line into line, not just mmee, so the if condition fails.
Look at this link for trimming spaces.
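You can remove the spaces by using
astring.erase(std::remove(astring.begin(), astring.end(), ' '), astring.end());
(std::remove on its own only shifts the kept characters to the front; erase then drops the leftover tail.) After that you can compare the strings.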
check out the link:: http://www.cplusplus.com/forum/beginner/863/
You can compare to String : http://www.cplusplus.com/reference/string/string/compare/
std::getline() by default reads to the first new-line character. You can specify the character to which you want the function to read, like this:
getline(file, line, ' '); // note the space between the single quote chars
This would read only "mmee" part of your example file (mmee hello hey).
EDIT: It is how I read your example. People who edited the OP just assumed that the input file is mmee\nhello\nhey
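The effect of the delimiter argument can be tried on both readings of the file. The sketch below uses a std::istringstream in place of the file; first_field is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Reads from the start of text up to (but not including) the first delim.
std::string first_field(const std::string& text, char delim) {
    std::istringstream in(text);
    std::string field;
    std::getline(in, field, delim);
    return field;
}
```

Note that once a custom delimiter is given, '\n' is no longer special: if the file is really mmee\nhello\nhey, reading with ' ' as the delimiter returns the whole content, newlines included.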

input, output and \n's

So I'm trying to solve this problem that asks to look for palindromes in strings, so seems like I've got everything right, however the problem is with the output.
Here's the original output and my output:
http://pastebin.com/c6Gh8kB9
Here's what is said about the input and output of the problem:
Input format:
A file with no more than 20,000 characters. The file has one or more lines. No line is longer than 80 characters (not counting the newline at the end).
Output format:
The first line of the output should be the length of the longest palindrome found. The next line or lines should be the actual text of the palindrome (without any surrounding white space or punctuation but with all other characters) printed on a line (or more than one line if newlines are included in the palindromic text). If there are multiple palindromes of longest length, output the one that appears first.
Here's how I read the input :
string test;
string original;
while (getline(fin,test))
original += test;
And here's how I output it:
int len = answer.length();
answer = cleanUp(answer);
while (len > 0){
string s3 = answer.substr(0,80);
answer.erase(0,80);
fout << s3 << endl;
len -= 80;
}
cleanUp() is a function to remove the illegal characters from the beginning and the end. I'm guessing that the problem is with \n's and the way I read the input. How can I fix this ?
No line is longer than 80 characters (not counting the newline at the end)
does not imply that every line is 80 characters except for the last, while your output code does assume this by taking 80 characters off answer in every iteration.
You may want to keep the newlines in the string until the output phase. Alternatively, you might store newline positions in a separate std::vector. The first option complicates your palindrome search routine; the second your output code.
(If I were you, I'd also index into answer instead of taking chunks off with substr/erase; your output code is now O(n^2) while it could be O(n).)
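The first option mentioned above, keeping the newlines in the string, could look roughly like the sketch below. It reuses the variable names from the question (fin, test, original), while the function name read_keeping_newlines is invented for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the whole stream into one string, putting back the '\n' that
// std::getline discards after each line. Note: a '\n' is appended after
// the last line even if the input did not end with one.
std::string read_keeping_newlines(std::istream& fin) {
    std::string original;
    std::string test;
    while (std::getline(fin, test)) {
        original += test;
        original += '\n';
    }
    return original;
}
```

With the line breaks preserved, the output phase can print the answer as-is instead of guessing where each 80-character line ended; the palindrome search then has to decide how to treat '\n' itself.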
After rereading, it appears that I misunderstood the question. I was thinking in terms of each line representing a single word, with the intent being to test whether that "word" is palindromic.
I now think the question is really more like: "Given a sequence of up to 20,000 characters, find the longest palindromic sub-sequence. Oh, incidentally, the input is broken up into lines of no more than 80 characters."
If that's correct, I'd ignore the line-length completely. I'd read the entire file into a single buffer, then search for palindromes in that buffer.
To find the palindromes, I'd simply walk through each position in the array, and find the longest possible palindrome with that as its center point:
for (int i = 1; i < total_chars; i++)
    for (int n = 1; n < min(i, total_chars - i); n++)
        if (array[i+n] != array[i-n]) {
            // Candidate palindrome is from array[i-n+1] to array[i+n-1]
            break;
        }
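A fuller version of the same expand-around-the-center idea might look like the sketch below. It is a simplification under stated assumptions: it compares raw characters (the problem's rule about ignoring punctuation and white space is not handled), it also tries centers between two characters so that even-length palindromes are found, and longest_palindrome is a hypothetical name.

```cpp
#include <string>

// Returns the first longest palindromic substring of s, comparing
// characters exactly as they appear (no filtering of punctuation).
std::string longest_palindrome(const std::string& s) {
    const int total_chars = static_cast<int>(s.size());
    int best_start = 0;
    int best_len = 0;
    for (int center = 0; center < total_chars; ++center) {
        // width 0: odd length, centered on s[center]
        // width 1: even length, centered between s[center] and s[center + 1]
        for (int width = 0; width <= 1; ++width) {
            int lo = center;
            int hi = center + width;
            while (lo >= 0 && hi < total_chars && s[lo] == s[hi]) {
                --lo;
                ++hi;
            }
            // The loop stops one step past the palindrome on each side.
            const int len = hi - lo - 1;
            if (len > best_len) {
                best_len = len;
                best_start = lo + 1;
            }
        }
    }
    return s.substr(best_start, best_len);
}
```

The strict comparison len > best_len keeps the earliest palindrome when several share the maximum length, which matches the "output the one that appears first" requirement.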