cin.get gets one extra character? [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why is iostream::eof inside a loop condition considered wrong?
I was reading a group of characters via cin.get() and noticed that it was picking up an extra character at the end of the input. Does anyone know how to fix this? Here's my code:
unsigned char c;
while(!cin.eof())
{
c = cin.get();
cout << (int)c << endl;
}
My issue is that the extra character it reads has the value 255. I don't want it to read this extra character, but if the user actually enters a character with value 255, rather than it being a garbage character at the end, that should be fine. Here is an example:
If I enter abc\n, I get:
97
98
99
10
255
but I want:
97
98
99
10
Any ideas on how to fix this? Thanks!

Never use cin.eof() as a loop condition. It almost always produces buggy code, as it has here.
Instead, try:
int c;
while ( (c=cin.get()) != EOF ) {
cout << c << endl;
}
Or:
char c;
while (cin.get(c)) {
cout << (int)c << endl;
}

The get() function with no arguments returns an int_type, not a char. At the end of the stream, it returns a special non-character value that indicates end of file (usually -1). By assigning the result of cin.get() directly to an unsigned char, you are inadvertently throwing away this eof information. The relevant documentation quote is:
1) reads one character and returns it if available. Otherwise, returns Traits::eof() and sets failbit and eofbit.

Related

c++ program tries to read from stream after eof [duplicate]

This question already has answers here:
Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?
(5 answers)
Why is “while( !feof(file) )” always wrong?
(5 answers)
Closed 7 years ago.
I have a file containing only one character 1 (there is no newline symbol after it)
Reading the contents with this method:
int value;
ifstream in("somefile");
while (!in.eof())
{
in >> value;
cout << "input: " << value << " , eof: " << in.eof() << "\n";
}
gives me the following output:
input: 1 , eof: 0
input: 1 , eof: 1
The questions are:
1) Why is the EOF flag not set after the first read? If the program successfully reads the number on the first try, it must somehow know that the string representation of it is over. To find that out, it has to try to read at least one byte after the character 1, and at that point it should have hit EOF. Why doesn't that happen?
2) That said, if I have a file with one value per line, I always get a duplicate of the last input. Does that mean I always have to discard it somehow, and would that be correct? Or should I, for example, add an extra check for EOF after in >> value; and only run my logic if the read succeeded?
I know that I can work around this with getline() or while (in >> value), but this is more a question of understanding what really happens there.
Thank you.

C++ primer 5th 1.4.4

I'm a beginner in C++. While reading C++ Primer (5th edition), I got a little confused by section 1.4.4.
When I run the program from 1.4.4, here is what happens on my computer:
#include <iostream>
int main()
{
// currVal is the number we're counting; we'll read new values into val
int currVal = 0, val = 0;
// read first number and ensure that we have data to process
if (std::cin >> currVal)
{
int cnt = 1; // store the count for the current value we're processing
while (std::cin >> val)
{ // read the remaining numbers
if (val == currVal) // if the values are the same
++cnt; // add 1 to cnt
else
{ // otherwise, print the count for the previous value
std::cout << currVal << " occurs " << cnt << " times" << std::endl;
currVal = val; // remember the new value
cnt = 1; // reset the counter
}
} // while loop ends here
// remember to print the count for the last value in the file
std::cout << currVal << " occurs " << cnt << " times" << std::endl;
} // outermost if statement ends here
return 0;
}
I type the numbers: 42 42 42 42 42 55 55 62 100 100 100
I press Ctrl+D.
The program continues by itself (without waiting for me to press Enter)
and outputs:
42 occurs 5 times
55 occurs 2 times
62 occurs 1 times
Then I press Ctrl+D a second time,
and the program outputs the remaining answer:
100 occurs 3 times
My question is why I have to press Ctrl+D a second time. My environment is Ubuntu + GCC. I also ran it in VS2013, where Ctrl+D only has to be entered once.
I've searched Stack Overflow, but I didn't find my answer:
Incorrect output. C++ primer 1.4.4
confused by control flow execution in C++ Primer example
C++ Primer fifth edtion book (if statement) is this not correct?
On Linux, Ctrl+D does not unconditionally mean "end-of-file" (EOF). What it actually means is "push the currently pending input to whoever is waiting to read it". If the input buffer is non-empty, hitting Ctrl+D does not create an EOF condition at the end of the buffer. Only if you hit Ctrl+D when the input buffer is empty does it produce an EOF condition. (See here for a more technical explanation: https://stackoverflow.com/a/1516177/187690)
In your case you are entering your data as a single line and then hitting Ctrl+D at the end. This pushes your input to your program and makes your program read and process the data. But it does not produce an EOF condition at the end of your input.
For this reason, once all input data has been read by the loop, your program does not see it as EOF. The loop keeps waiting on the empty input buffer for additional data. If at this point you press Ctrl+D again, it will be recognized as EOF and your program will exit the loop and print the last line.
This is why you have to hit Ctrl+D twice: the first hit works pretty much as the Enter key would, and only the second hit creates the EOF condition.
It is possible that the program you have provided was not intended to accept input all at once as you have shown.
It does not produce the output you expect because the program is still waiting for input: the result of the >> operation is still logically true (error-free), and the program is blocked at while (std::cin >> val). This is because you have not provided an EOF to the input stream after the last 100. Put another way, your first Ctrl+D delivers the line to the program, which gets it past if (std::cin >> currVal) and through the values on the line; your second Ctrl+D ends while (std::cin >> val).
See the accepted answer to this question for why the first Ctrl+D does not set eofbit on your input stream: Why do I have to type ctrl-d twice? The bottom line is that Ctrl+D does not necessarily mean EOF; it flushes the pending input to the program.
Entering the numbers one at a time would provide the output you expect.
Alternatively, you could provide: 42 42 42 42 42 55 55 62 100 100 100\n.
http://www.cplusplus.com/reference/istream/istream/operator%3E%3E/

istream and cin.get()

I have a question about the difference between these two pieces of code:
char buffer5[5];
cin.get(buffer5, 5);
cout << buffer5;
cin.get(buffer5, 5);
cout << buffer5;
and
char buffer4;
while (cin.get(buffer4))
{
cout << buffer4;
}
In the first piece of code, the code gets 5 characters and puts them in buffer5. However, because you press Enter, a newline character isn't put into the stream when calling get(), so the program terminates and does not ask for another round of 5 characters.
In the second piece of code, cin.get() waits for input on the input stream, so the loop doesn't just terminate (I think). Let's say I enter "Apple". This puts 5 characters into the input stream, and the loop prints all of them to the output. However, unlike the first piece of code, it does not stop, even after two inputs; I can keep entering input indefinitely.
Why is it that I can continuously input character sequences into the terminal in the second piece of code and not the first?
First off, "pressing enter" has no special meaning to the IOStreams beyond entering a newline character (\n) into the input sequence (note, when using text streams the platform specific end of line sequences are transformed into a single newline character). When entering data on a console, the data is normally line buffered by the console and only forwarded to the program when pressing enter (typically this can be turned off but the details of this are platform specific and irrelevant to this question anyway).
With this out of the way, let's turn our attention to the behavior of s.get(buffer, n) for an std::istream s and a pointer buffer to an array of at least n characters. The description of what this does is quite trivial: it calls s.get(buffer, n, s.widen('\n')). Since we are talking about std::istream and you probably haven't changed the std::locale, we can assume that s.widen('\n') just returns '\n', i.e., the call is equivalent to s.get(buffer, n, '\n'), where '\n' is called a delimiter, and the question becomes what this function does.
Well, this function extracts up to m = 0 < n ? n - 1 : 0 characters, stopping when m is reached, when the end of the stream is reached, or when the next character is identical to the delimiter, which is left in the stream (you'd use std::istream::getline() if you wanted the delimiter to be extracted). Each extracted character is stored in the corresponding location of buffer, and if 0 < n a null character is stored in the location right after the last extracted character (at most buffer[n - 1]). If no character is extracted, std::ios_base::failbit is set.
OK, with this we should have all ingredients to the riddle in place: When you entered at least one character but less than 5 characters the first call to get() succeeded and left the newline character as next character in the buffer. The next attempt to get() more characters immediately found the delimiter, stored no character, and indicated failure by setting std::ios_base::failbit. It is easy to verify this theory:
#include <iostream>
int main()
{
char buffer[5];
for (int count(0); std::cin; ++count) {
if (std::cin.get(buffer, 5)) {
std::cout << "get[" << count << "]='" << buffer << "'\n";
}
else {
std::cout << "get[" << count << "] failed\n";
}
}
}
If you enter no character, the first call to std::cin.get() fails. If you enter 1 to 4 characters, the first call succeeds but the second one fails. If you enter more than 4 characters, the second call also succeeds, etc. There are several ways to deal with the potentially stuck newline character:
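Just use std::istream::getline(), which behaves the same as std::istream::get() but also extracts the delimiter if that is why it stopped reading. This may chop one line into multiple reads, however, which may or may not be desired; note that getline() sets std::ios_base::failbit when the buffer fills up before the delimiter is found, so the stream has to be clear()ed before the rest of the line can be read.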
To avoid the limitation of a fixed line length, you could use std::getline() together with an std::string (i.e., std::getline(std::cin, string)).
After a successful get() you could check if the next character is a newline using std::istream::peek() and std::istream::ignore() it when necessary.
Which of these approaches meets your needs depends on what you are trying to achieve.

Unexpected behaviour of getline() with ifstream

To simplify, I'm trying to read the content of a CSV-file using the ifstream class and its getline() member function. Here is this CSV-file:
1,2,3
4,5,6
And the code:
#include <iostream>
#include <typeinfo>
#include <fstream>
using namespace std;
int main() {
char csvLoc[] = "/the_CSV_file_localization/";
ifstream csvFile;
csvFile.open(csvLoc, ifstream::in);
char pStock[5]; //we use a 5-char array just to get rid of unexpected
//size problems, even though each number is of size 1
int i =1; //this will be helpful for the diagnostic
while(csvFile.eof() == 0) {
csvFile.getline(pStock,5,',');
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}
return 0;
}
I'm expecting all the numbers to be read, since getline is supposed to take what has been written since the last read and stop when it encounters ',' or '\n'.
But it appears that it reads everything correctly EXCEPT '4', i.e. the first number of the second line (cf. console):
Iteration number 1
1
Iteration number 2
2
Iteration number 3
3
Iteration number 4
5
Iteration number 5
6
Thus my question: what makes this '4' after (I guess) the '\n' so specific that getline doesn't even try to take it into account ?
(Thank you !)
You are reading comma separated values so in sequence you read: 1, 2, 3\n4, 5, 6.
You then print the first character of the array each time: i.e. 1, 2, 3, 5, 6.
What were you expecting?
Incidentally, your check for eof is in the wrong place. You should check whether the getline call succeeds. In your particular case it doesn't currently make a difference because getline reads something and triggers EOF all in one action but in general it might fail without reading anything and your current loop would still process pStock as if it had been repopulated successfully.
More generally something like this would be better:
while (csvFile.getline(pStock,5,',')) {
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}
AFAIK if you use the delimiter parameter, getline() reads until it finds that delimiter. This means that in your case it has read
3\n4
into the array pStock, but you only print the first character, so you get only 3.
The problem with your code is that getline, when a delimiter is specified (',' in your case), uses it and no longer treats the default delimiter '\n' as one. If you want to scan that file, you can use a tokenization function.

Loop efficiency - C++

Beginners question, on loop efficiency. I've started programming in C++ (my first language) and have been using 'Principles and Practice Using C++' by Bjarne Stroustrup. I've been making my way through the earlier chapters and have just been introduced to the concept of loops.
The first exercise regarding loops asks of me the following:
The character 'b' is char('a'+1), 'c' is char('a'+2), etc. Use a loop to write out
a table of characters with their corresponding integer values:
a 97, b 98, ..., z 122
Although, I used uppercase, I created the following:
int number = 64; //integer value for the @ sign, the character before A
char letter = number;//converts integer to char value
int i = 0;
while (i<=25){
cout << ++letter << "\t" << ++number << endl;
++i;
}
Should I aim to have only 'i' present in the loop, or is that simply not possible when converting between types? I can't really think of any other way to do the above, apart from converting the character value to its integer counterpart (i.e. the opposite of the current method) or not having the conversion at all and having letter store '@'.
Following on from jk you could even use the letter itself in the loop (letter <= 'z'). I'd also use a for loop but that's just me.
for( char letter = 'a'; letter <= 'z'; ++letter )
std::cout << letter << "\t" << static_cast<int>( letter ) << std::endl;
You should aim for clarity first, but you are trying to micro-optimize instead. You could rewrite it more clearly as a for loop:
const int offsetToA = 65;
const int numberOfCharacters = 26;
for( int i = 0; i < numberOfCharacters; ++i ) {
const int characterValue = i + offsetToA;
cout << static_cast<char>( characterValue ) << characterValue << endl;
}
and you can convert between different types - that's called casting (the static_cast construct in the code above).
That's not a bad way to do it, but you can do it with only one loop variable like this:
char letter = 65;
while(letter <= 65+25){
printf("%c\t%d\n", letter, letter);
++letter;
}
There is nothing particularly inefficient about the way you are doing it, but it certainly is possible to just convert between chars and ints (a char is an integer type). This would mean you only need to store 1 counter rather than the 3 (i, letter and number) you currently have.
Also, for looping from a fixed start to a fixed end, a 'for' loop is perhaps more idiomatic (though it's possible you haven't met this yet!).
If you are concerned about the efficiency of your loop, I would urge you to try this:
Get this code compiled and running under an IDE, such as Visual Studio, and set a break point at the beginning. When you get there, switch to the disassembly view (instruction view) and start hitting the F11 (single-step) key, and keep a mental count of how many times you are hitting it.
You will see that it enters the loop, compares i against 25, and then starts doing the code for the cout line. That involves incrementing letter, and then going into the << routine for cout. It does a number of things in there, possibly going deeper into subroutines, etc., and finally comes back out, returning an object. Then it pushes "\t" as an argument and passes it to that object, and goes back in and does all the stuff it did before. Then it takes number, increments it, and passes it to the cout::<< routine that accepts an integer, calls a function to convert it to a string (which involves a loop), then does all the stuff it did before to loop that string into the output buffer and return.
Tired? You're not done yet. The endl has to be output, and when that happens, not only does it put "\n" in the buffer, but it calls the system routine to flush that buffer to the file or console where you are sending the I/O. You probably can't F11 into that, but rest assured it takes lots of cycles and doesn't return until the I/O is done.
By now, your F11-count should be in the vicinity of several thousand, more or less.
Finally, you come out and get to the ++i statement, which takes 1 or 2 instructions, and jumps back to the top of the loop to start the next iteration.
NOW, are you still worried about the efficiency of the loop?
There's an easier way to make this point, and it's just as instructive. Wrap an infinite loop around your entire code so it runs forever. While it's running, hit the "pause" button in the IDE, and look at the call stack. (This is called a "stackshot".) If you do this several times you get a good idea of how it spends time. Here's an example:
NTDLL! 7c90e514()
KERNEL32! 7c81cbfe()
KERNEL32! 7c81cc75()
KERNEL32! 7c81cc89()
MSVCRTD! 1021bed3()
MSVCRTD! 1021bd59()
MSVCRTD! 10218833()
MSVCRTD! 1023a500()
std::_Fputc() line 42 + 18 bytes
std::basic_filebuf<char,std::char_traits<char> >::overflow() line 108 + 25 bytes
std::basic_streambuf<char,std::char_traits<char> >::sputc() line 85 + 94 bytes
std::ostreambuf_iterator<char,std::char_traits<char> >::operator=() line 304 + 24 bytes
std::num_put<char,std::ostreambuf_iterator<char,std::char_traits<char> > >::_Putc() line 633 + 32 bytes
std::num_put<char,std::ostreambuf_iterator<char,std::char_traits<char> > >::_Iput() line 615 + 25 bytes
std::num_put<char,std::ostreambuf_iterator<char,std::char_traits<char> > >::do_put() line 481 + 71 bytes
std::num_put<char,std::ostreambuf_iterator<char,std::char_traits<char> > >::put() line 444 + 44 bytes
std::basic_ostream<char,std::char_traits<char> >::operator<<() line 115 + 114 bytes
main() line 43 + 96 bytes
mainCRTStartup() line 338 + 17 bytes
I did this a bunch of times, and not ONCE did it stop in the code for the outer i<=25 loop. So optimizing that loop is like someone's great metaphor: "getting a haircut to lose weight".
Since no one else mentioned it: Having a fixed amount of iterations, this is also a candidate for post-condition iteration with do..while.
char letter = 'a';
do {
std::cout << letter << "\t" << static_cast<int>( letter ) << std::endl;
} while ( ++letter <= 'z' );
However, as shown in Patrick's answer the for idiom is often shorter (in number of lines in this case).
You can promote char to int...
//characters and their corresponding integer values
#include"../../std_lib_facilities.h"
int main()
{
char a = 'a';
while(a<='z'){
cout<<a<<'\t'<<a*1<<'\n'; //a*1 => char operand promoted to integer!
++a;
}
cout<<endl;
}
Incrementing three separate variables is probably a little confusing. Here's a possibility:
for (int i = 0; i != 26; ++i)
{
int chr = 'a' + i;
std::cout << static_cast<char>(chr) << ":\t" << chr << std::endl;
}
Note that using a for loop keeps all the logic of setting up, testing and incrementing the loop variable in one place.
At this point, I wouldn't worry about micro-optimizations such as an efficient way to write a small loop like this. What you have allows a for loop to do the job nicely, but if you are more comfortable with while, you should use that. But I am not sure if that is your question.
I don't think you have understood the question properly. You are writing the code, knowing that 'A' is 65. The whole point of the exercise is to print the value of 'A' to 'Z' on your system, without knowing what value they have.
Now, to get an integer value for a character c, you can do: static_cast<int>(c). I believe that is what you're asking.
I haven't written any code because it should be more fun for you to do so.
Question for the experts: In C, I know that 'a'...'z' need not have continuous values (same for 'A'...'Z'). Is the same true for C++? I would think so, but then it seems highly unlikely that Stroustrup's book assumes that.
Thanks for the help. All I wrote down was:
int main()
{
char letter = 96;
int number = letter;
int i = 0;
while(i <26)
{
cout <<++letter <<":" <<++number <<" ";
++i;
}
}
Works great, and pretty simple to understand now.
I tried this and it worked fine:
char a = 'a';
int i = a; //represent char a as an int
while (a <= 'z') {
cout << a << '\t' << i << '\n';
++a;
++i;
}
Programming Principles and Practice using C++ (2nd Edition) | Bjarne Stroustrup
Chapter 4 - Computation (Try this #3 - Character Loop)
The character 'b' is char('a'+1), 'c' is char('a'+2), etc. Use
a loop to write out a table of characters with their corresponding integer values:
a 97 b 98 . . . z 122
This is how I solved the problem (from 10 years ago :D).
I am a freshman, by the way, and have only just started reading this book; I just want to share my solution:
#include <iostream>
using namespace std;
int main()
{
int i = 0;
while (i < 26) {
cout << char('a' + i) << '\t' << int(97 + i) << '\n';
++i;
}
}
I solved it by first analyzing the problem, which comes down to knowing the char values from 'a', which is 97, up to 'z', according to this ASCII table:
https://www.ascii-code.com/#:~:text=ASCII%20printable%20characters%20%28character%20code%2032-127%29%20Codes%2032-127,digits%2C%20punctuation%20marks%2C%20and%20a%20few%20miscellaneous%20symbols.
Now, we have a clearer understanding on how to solve the said problem.