Unexpected behaviour of getline() with ifstream - c++

To simplify, I'm trying to read the content of a CSV-file using the ifstream class and its getline() member function. Here is this CSV-file:
1,2,3
4,5,6
And the code:
#include <iostream>
#include <typeinfo>
#include <fstream>
using namespace std;
int main() {
char csvLoc[] = "/the_CSV_file_localization/";
ifstream csvFile;
csvFile.open(csvLoc, ifstream::in);
char pStock[5]; //we use a 5-char array just to get rid of unexpected
//size problems, even though each number is of size 1
int i =1; //this will be helpful for the diagnostic
while(csvFile.eof() == 0) {
csvFile.getline(pStock,5,',');
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}
return 0;
}
I'm expecting all the numbers to be read, since getline is suppose to take what is written since the last reading, and to stop when encountering ',' or '\n'.
But it appears that it reads everything well, EXCEPT '4', i.e. the first number of the second line (cf. console):
Iteration number 1
1
Iteration number 2
2
Iteration number 3
3
Iteration number 4
5
Iteration number 5
6
Thus my question: what makes this '4' after (I guess) the '\n' so specific that getline doesn't even try to take it into account ?
(Thank you !)

You are reading comma separated values so in sequence you read: 1, 2, 3\n4, 5, 6.
You then print the first character of the array each time: i.e. 1, 2, 3, 5, 6.
What were you expecting?
Incidentally, your check for eof is in the wrong place. You should check whether the getline call succeeds. In your particular case it doesn't currently make a difference because getline reads something and triggers EOF all in one action but in general it might fail without reading anything and your current loop would still process pStock as if it had been repopulated successfully.
More generally something like this would be better:
while (csvFile.getline(pStock,5,',')) {
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}

AFAIK if you use the terminator parameter, getline() reads until it finds the delimiter. Which means that in your case, it has read
3\n4
into the array pSock, but you only print the first character, so you get 3 only.

the problem with your code is that getline, when a delimiter is specified, ',' in your case, uses it and ignores the default delimiter '\n'. If you want to scan that file, you can use a tokenization function.

Related

Find out the average value of inputted integers

I managed to get this program to work. If the user types an unfixed amount of integers, the program will calculate the average value of it. But I need to end it with <Ctrl-D> in my terminal (end of file) in order for it to work. Why can I not just press enter for it to work?
I also believe that I've used an unnecessary amount of variables. Can it be narrowed down to maybe 2 variables?
#include <iomanip>
#include <iostream>
using namespace std;
int main ()
{
int digit {};
int res {};
int counter {};
cout << "Type in integers: ";
while (cin >> digit)
{
counter ++;
res += digit;
}
cout << "The mean was " << setw(1) << setprecision(1) << fixed << static_cast<double>(res) / static_cast<double>(counter) << endl;
return 0;
}
Why can I not just press enter for it to work?
Because that's not how the overloaded >> formatted extraction operator works. This operator skips over an unlimited amount of whitespace characters, including newline characters, until it reads the integer. It's simply how it works: it will read newlines and spaces, after newlines, and spaces, until it sees a digit. That's its mission in life: read and skip over spaces and newlines until it reads at least one digit. It never gets tired of reading newlines and spaces, and will keep going as long that's the case.
To handle input in the fashion you describe requires a completely different approach: using std::getline to read a single line of input into a std::string, up until the next newline character. Then, once that's done, you can check if the std::string is empty, which means that no input was entered, and then terminate; otherwise take the input in std::string and convert it to an int value (using std::stoi, std::from_chars, or a std::istringstream -- take your pick), and then proceed with the existing algorithm.
Can it be narrowed down to maybe 2 variables?
How do you expect to do that? Hard, immutable logic dictates that you must keep track of at least two discrete values: the total sum and the number of values read. But then you just ran out of variables. You have no more variables to use for storing the next read value (if there is one), using whatever approach you chose to use. So, you can't do it. Rules of logic require the use of at least three variables, possibly more depending on how fancy and robust you want your input validation to work.

How does cin works in a while loop when the inputs are given in a single line with white spaces?

Consider this small piece of code:
#include <iostream>
#include<stdio.h>
using namespace std;
int main() {
int a;
while(true){
cin>>a;
cout<<a;
}
return 0;
}
Input
1 2 3 5 7 23
Output
125723
How I thought it will run is:
First iteration
1. Reads the first input ie '1' and stops reading further, right after reading the whitespace.
2.Prints the value 1.
Second iteration
1. Again asks for new input
2. Print that in the second line
But that doesn't happen instead it reads the elements we gave after space
First iteration:
Peek at next character in the stream. It's a digit ('1'), so read it.
Peek at next character in the stream. It's not a digit (' '), so don't read it; store 1 in a and return from >>.
(Output 1.)
Second iteration:
Peek at next character in the stream. It's whitespace (' '), so read and ignore it.
Peek at next character in the stream. It's a digit ('2'), so read it.
Peek at next character in the stream. It's not a digit (' '), so don't read it; store 2 in a and return from >>.
(Output 2.)
And so on ...
The point is that >> does not care about lines. cin is one long input stream of characters (some of which may be '\n'). The only thing you can do is read more characters (and then maybe decide that you don't want to do anything with them).
cin is not necessarily connected to a keyboard. The program that started you gets to decide where cin reads from. It can be a file, a network socket, or interactive user input. In the latter case, reading from cin may block until the user types more input, but it will never cause input to just be dropped.
If you want a sane user interface, always read whole lines and process them afterwards:
std::string line;
while (std::getline(std::cin, line)) {
// do stuff with line
}

Seekg not behaving as expected

#include <string>
#include <iostream>
int main() {
std::string str;
char magic[9];
std::cin.read((char *)magic, sizeof(magic));
std::cin.seekg(0, std::ios::beg);
while (std::cin >> str) {
std::cout << str << std::endl;
}
}
my code contains implementation of seekg(0) fucntion on std::cin
it is not behaving as expected on some of the files
when run as
./a.out < filename
those files that it is not behaving as expected have property that they have number of characters(including endline characters and other white spaces) less than 9(9 is the number of characters we read from cin before seekg)
if the file contains more than 9 characters it is behaving as expected
for example:
123456789
will give output as
123456789
while file containing less than 9 characters will not give output
for example:
1234
will give no output
With a file of less than nine characters, you have already attempted to read past the end with your initial read. That means the eof (end of file) and fail flags have been set for the stream and, while seekg may reset eof, it does not reset fail (a).
You can check that by inserting:
cout << "eof/fail=" << cin.eof() << '/' << cin.fail() << '\n';
immediately before and after the seekg. For file sizes of 8, 9, and 10 respectively, you get:
eof/fail=1/1
eof/fail=0/1
eof/fail=0/0
eof/fail=0/0
12345678
eof/fail=0/0
eof/fail=0/0
123456789
You can see the first failure results in no output because the fail bit is still set. The second and third have output because it was never set (the output is the characters shown plus one newline).
To repair this, you can clear the fail bit simply by inserting the following before your seekg:
std::cin.clear();
Then running that code on the eight-character file gives:
eof/fail=1/1
eof/fail=0/0
1234567
showing that the clear has indeed cleared the fail bit.
You might also want to keep in mind that it's not a requirement for a stream to be seekable, especially if it's just coming in via standard input. You may find for certain sized files that you cannot seek back an arbitrary amount if you've read through a large chunk of the stream.
(a) For the language lawyers amongst us, Unformatted input functions (C++11 27.7.2.3/41, C++14 27.7.2.3/41 and C++17 30.7.4.3/41) all have essentially the same text on how seekg works (my emphasis):
After constructing a sentry object, if fail() != true, executes ...

End array input with a newline?

Not sure if the title is properly worded, but what I am trying to ask is how would you signify the end of input for an array using newline. Take the following code for example. Not matter how many numbers(more or less) you type during the input for score[6], it must take 6 before you can proceed. Is there a method to change it so that an array can store 6 or 100 variables, but you can decide how many variables actually contain values. The only way I can think of doing this is to somehow incorporate '\n', so that pressing enter once creates a newline and pressing enter again signifies that you don't want to set any more values. Or is something like this not possible?
#include <iostream>
using namespace std;
int main()
{
int i,score[6],max;
cout<<"Enter the scores:"<<endl;
cin>>score[0];
max = score[0];
for(i = 1;i<6;i++)
{
cin>>score[i];
if(score[i]>max)
max = score[i];
}
return 0;
}
To detect "no input was given", you will need to read the input as a input line (string), rather than using cin >> x; - no matter what the type is of x, cin >> x; will skip over "whitespace", such as newlines and spaces.
The trouble with reading the input as lines is that you then have to "parse" the input into numbers. You can use std::stringstream or similar to do this, but it's quite a bit of extra code compared to what you have now.
The typical way to solve this kind of problem, however, is to use a "sentry" value - for example, if your input is always going to be greater or equal to zero, you can use -1 as the sentry. So you enter
1 2 3 4 5 -1
This would reduce the amount of extra code is relatively small - just check if the input is -1, such as
while(cin >> score[i] && score[i] >= 0)
{
...
}
(This will also detect end-of-file, so you could end the input with CTRL-Z or CTRL-D as appropriate for your platform)

istream and cin.get()

I have a question about the difference between these two pieces of code:
char buffer5[5];
cin.get(buffer5, 5);
cout << buffer5;
cin.get(buffer5, 5);
cout << buffer5;
and
char buffer4;
while (cin.get(buffer4))
{
cout << buffer4;
}
In the first piece of code, the code gets 5 characters and puts it in buffer5. However, because you press enter, a newline character isn't put into the stream when calling get(), so the program will terminate and will not ask you for another round of 5 characters.
In the second piece of code, cin.get() waits for input to the input stream, so the loop doesn't just terminate (I think). Lets say I input "Apple" into the input stream. This will put 5 characters into the input stream, and the loop will print all characters to the output. However, unlike the first piece of code, it does not stop, even after two inputs as I can continuously keep inputting.
Why is it that I can continuously input character sequences into the terminal in the second piece of code and not the first?
First off, "pressing enter" has no special meaning to the IOStreams beyond entering a newline character (\n) into the input sequence (note, when using text streams the platform specific end of line sequences are transformed into a single newline character). When entering data on a console, the data is normally line buffered by the console and only forwarded to the program when pressing enter (typically this can be turned off but the details of this are platform specific and irrelevant to this question anyway).
With this out of the way lets turn our attention to the behavior of s.get(buffer, n) for an std::istream s and a pointer to an array of at least n characters buffer. The description of what this does is quite trivial: it calls s.get(buffer, n, s.widen('\n')). Since we are talking about std::istream and you probably haven't changed the std::locale we can assume that s.widen('\n') just returns '\n', i.e., the call is equivalent to s.get(buffer, n, '\n') where '\n' is called a delimiter and the question becomes what this function does.
Well, this function extracts up to m = 0 < n? n - 1: 0 characters, stopping when either m is reached or when the next character is identical to the delimiter which is left in the stream (you'd used std::istream::getline() if you'd wanted the delimiter to be extracted). Any extracted character is stored in the corresponding location of buffer and if 0 < n a null character is stored into location buffer[n - 1]. In case, if no character is extracted std::ios_base::failbit is set.
OK, with this we should have all ingredients to the riddle in place: When you entered at least one character but less than 5 characters the first call to get() succeeded and left the newline character as next character in the buffer. The next attempt to get() more characters immediately found the delimiter, stored no character, and indicated failure by setting std::ios_base::failbit. It is easy to verify this theory:
#include <iostream>
int main()
{
char buffer[5];
for (int count(0); std::cin; ++count) {
if (std::cin.get(buffer, 5)) {
std::cout << "get[" << count << "]='" << buffer << "'\n";
}
else {
std::cout << "get[" << count << "] failed\n";
}
}
}
If you enter no character, the first call to std::cin.get() fails. If you enter 1 to 4 characters, the first call succeeds but the second one fails. If you enter more than 4 characters, the second call also succeeds, etc. There are several ways to deal with the potentially stuck newline character:
Just use std::istream::getline() which behaves the same as std::istream::get() but also extracts the delimiter if this is why it stopped reading. This may chop one line into multiple reads, however, which may or may not be desired.
To avoid the limitation of a fixed line length, you could use std::getline() together with an std::string (i.e., std::getline(std::cin, string)).
After a successful get() you could check if the next character is a newline using std::istream::peek() and std::istream::ignore() it when necessary.
Which of these approaches meets your needs depends on what you are trying to achieve.