Two cins in a row, what exactly happens with whitespaces? - c++

cin >> name;
cin >> age;
cout << name << age;
What exactly is happening here if I type a string, then some whitespace and a number? For example Something 20. Does it read Something then sees the whitespace and goes okay that's the end of this first line because a whitespace terminates the reading of the string, goes to the next input and reads 20?
But I also wanna be a bit more specific. Is it okay to say at first when I'm in the console typing Something, that's going into the standard input stream, then getting stored in the buffer and when I press that space it's like pressing enter? And that Something gets extracted and assigned to name? Then that 20 I type is like a whole new unrelated line because I pressed space earlier and so that gets extracted and assigned to age?

How they'll get extracted
The integer gets extracted via std::basic_istream::operator>>:
Extracts values from an input stream
1-4 ) Extracts an integer value potentially skipping preceding
whitespace. The value is stored to a given reference value.
This function behaves as a FormattedInputFunction. After constructing and
checking the sentry object, which may skip leading whitespace,
extracts an integer value by calling std::num_get::get().
The string gets extracted via operator>> for std::basic_string (a non-member function):
2 ) Behaves as a FormattedInputFunction. After constructing and
checking the sentry object, which may skip leading whitespace, first
clears str with str.erase(), then reads characters from is and appends
them to str as if by str.append(1, c), until one of the following
conditions becomes true:
N characters are read, where N is is.width() if is.width() > 0,
otherwise N is str.max_size()
the end-of-file condition occurs in the stream is
std::isspace(c,is.getloc()) is true for the next character c
in is (this whitespace character remains in the input stream).
And in FormattedInputFunction:
if ios_base::skipws flag is set on this input stream, extracts and
discards characters from the input stream until one of the following
becomes true:
the next available character on the input stream is not
a whitespace character, as tested by the std::ctype facet of the
locale currently imbued in this input stream. The non-whitespace
character is not extracted.
the end of the stream is reached, in which
case failbit and eofbit are set and if the stream is on for exceptions
on one of these bits, ios_base::failure is thrown.
And as stated in Basic Input/Output from cplusplus.com:
...Note that the characters introduced using the keyboard are only transmitted to the
program when the ENTER (or RETURN) key is pressed.
...
...cin extraction always considers spaces (whitespaces, tabs,
new-line...) as terminating the value being extracted, and thus
extracting a string means to always extract a single word, not a
phrase or an entire sentence.
Testing
Compiling and testing your program with leading and trailing whitespaces via MSVC-v142 compiler:
AA 123 some trailing whitespaces
Prints out:
AA123
Read also
Stackoverflow: Clarify the difference between input/output stream and input/output buffer
Learn cpp: Input and output streams

Related

disallow the line break while reading in a number in c++?

I have a problem.
I want the user to enter a number in my C++ program, but during input I want to prevent them from just pressing Enter without having typed anything, which creates a line break.
I have already solved the problem in another place where the user has to enter a character.
I have read the character with getchar, determined the position of the cursor with the ANSI escape sequences and provided the whole thing with a do while loop.
But since I want to read in a number between 0 and 250, getchar would not be suitable.
scanf and cin both wait for a valid input and cause these nasty line breaks.
I have already thought about using getchar anyway and storing the characters in a char array whose indices I then convert to the corresponding numbers which I then add up to the actual number which can then be stored again in an int variable.
But surely there is an easier alternative?
This line discards the rest of the current line (up to 256 characters), including the newline character:
cin.ignore(256, '\n');

What does the format string "%*[^\n]" in a scanf() statement instruct? How do assignment suppressor (*) and negated scanset ([^) work together?

I know about the introduction of the scanset with the [ conversion specifier which subsequent indicate characters to match or not to match with an additional interposition of the ^ symbol.
For this, in ISO/IEC 9899/1999 (C99) is stated:
The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.
So, the expression [^\n] means, that it is scanning characters until a \n character is found in the according stream, here at scanf(), stdin. \n is not taken from stdin and scanf() proceeds with the next format string if any remain, else it skips to the next C statement.
Next there is the assignment-suppression-operator *:
For this, in ISO/IEC 9899/1999 (C99) is stated:
Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result.
Meaning in the case of f.e. scanf("%*100s",a); that a sequence of 100 characters is taken from stdin unless a trailing white-space character is found but not assigned to a if a is a proper-defined char array of 101 elements (char a[101];).
But what does now the format string "%*[^\n]" in a scanf()-statement achieve?
Does \n remain in stdin?
How do assignment suppressor * and negated scanset [^ work together?
Does it mean, that:
By using * all characters matching to this format string are taken from stdin, but are sure not assigned?, and
\n isn't taken from stdin but it is used to determine the scan-operation for the according format string?
I know what each of those [^ and * do alone, but not together. The question is what is the result of the mix of those two together, incorporated with the negated scanset of \n.
I know that there is a similar question on Stack Overflow which covers the understanding of %[^\n] only, here: What does %[^\n] mean in a scanf() format string. But the answers there do not help me with my problem.
%[^\n] reads up to but not including the next \n character. In plain English, it reads a line of text. Normally, the line would be stored in a char * string variable.
char line[SIZE];
scanf("%[^\n]", line);
The * modifier suppresses that behavior. The line is simply discarded after being read and no variable is needed.
scanf("%*[^\n]");
* doesn't alter how the input is processed. In either case, everything up to but not including the next \n is read from stdin. Assuming no I/O errors, it is guaranteed that the next read from stdin will see either \n or EOF.
Which scanf() statement should I use if I want to read and thereafter discard every character in stdin including the \n character?
Add %*c to also consume the \n.
scanf("%*[^\n]%*c");
Why %*c instead of just \n? If you used \n it wouldn't just consume a single newline character, it would consume any number of spaces, tabs, and newlines. \n matches any amount of whitespace. It's better to use %*c to consume exactly 1 character.
// Incorrect
scanf("%*[^\n]\n");
See also:
How to skip a line when fscanning a text file?
Could I use fflush() instead?
No, don't. fflush(stdin) is undefined.
Isn't the negated scanset of [\n] completely redundant because scanf() terminates the scan process of the according format string at the first occurrence of a white space character by default?
With %s, yes, it will stop reading at the first whitespace character. %s only reads a single word. %[^\n], by contrast, reads an entire line. It will not stop at spaces or tabs, only newlines.
More generally, with square brackets only the exact characters listed are relevant. There is no special behavior for whitespace. Unlike %s it does not skip leading whitespace, nor does it stop processing early if it encounters whitespace.
Does \n remain in stdin?
Yes, it does.
But what does now the format string "%*[^\n]" in a scanf()-statement achieve?
It reads all characters from the input stream until it reaches a newline and discards them, without removing the newline from the input stream.
By using * all characters matching to this format string are taken from stdin, but are not assigned?
Correct.
\n isn't taken from stdin but it is used to determine the scan-operation for the according format string?
Exactly. When \n is reached, most scanfs use ungetc to push the character back to the input stream.
I know what each of those [^ and * do alone, but not together.
Putting * before [^ does exactly what [^ alone does except that it does not read the input into an argument and instead discards it.
If you want to discard the \n afterwards, use this format string:
"%*[^\n]%*c"
Since it doesn't appear to be covered, the working way to read everything before newline, and then the newline, is:
scanf("%*[^\n]%*c");
%*[^\n] reads and discards until next character is newline
%*c reads and discards just one character, which per above will be newline
You could also read the newline with %c into a variable and check whether you really got a newline, but you could also just check for EOF or an error directly and not bother with this.
%[^\n] tells scanf to read everything until a newline character ('\n') and store it in its corresponding argument.
%*[^\n] tells scanf to read everything until a newline character ('\n') and discard it instead of storing it.
Examples:
Input Hi there\n into scanf("%[^\n]", buffer); results in buffer content Hi there and leftover stdin content \n
Input Hi there\n into scanf("%*[^\n]"); results in Hi there getting scanned and discarded from the stdin and leftover stdin content \n.
Note that both %[^\n] and %*[^\n] will fail if the first character that it encounters is a \n character. Once it fails, the stdin is left untouched and the scanf returns resulting in the rest of the format string getting ignored.
If you wish to clear a line of stdin up to and including the newline character using scanf, use
scanf("%*[^\n]"); /* Read and discard everything until a newline character */
scanf("%*c"); /* Discard the newline character */

When to use a blank cin.get()?

As the title says - when should I use a blank cin.get() ?
I encountered situations when the program acted strange until I added a few blank cin.get()s between reading lines. (e.g. in a struct when reading its fields I had to enter a cin.get() between each non-blank cin.get())
So what does a blank cin.get() do and when should I use it?
Thanks.
There are two broad categories of stream input operations: formatted and unformatted. Formatted operations expect input in a particular form; they start out by skipping whitespace, then looking for text that matches what they expect. They typically are written as extractors; that's the >> that you see so often:
int i;
std::cin >> i; // formatted input operation
Here, the extractor is looking for digits, and will translate the digits that it sees into an integer value. When it sees something that isn't a digit it stops looking.
Unformatted input operations just do something, without regard to any rules about what the input should look like. basic_istream::get() is one of those: it simply reads a character or a sequence of characters. If you ask it to read a sequence it doesn't care what's in that sequence, except that the form that takes a delimiter looks for that delimiter. Other than that, it just copies text.
When you mix formatted and unformatted operations they fight with each other.
int i;
std::cin >> i;
If std::cin is reading from the console (that is, you haven't redirected it at the command line), you'll typically type in some digits followed by the "Enter" key. The extractor reads the digits, and when it hits the newline character (that's what the "Enter" key looks like on input) it stops reading, and leaves the newline character alone. That's fine, if the next operation on that stream is also a formatted extractor: it skips the newline character and any other whitespace until it hits something that isn't whitespace, and then it starts translating the text into the appropriate value.
There's a problem, though, if you use a formatted operation followed by an unformatted operation. This is a common problem when folks mix extractors (>>) with getline(): the extractor reads up to the newline, and the call to getline() reads the newline character, says "Hey, I've got an empty line", and returns an empty string.
Same thing for the version of basic_istream::get() that reads a sequence of characters: when it hits the delimiter (newline if you haven't specified something else) it stops reading. If that newline was a leftover from an immediately preceding formatted extractor, it's probably not what you're looking for.
One (really really ugly) solution is the brute force cin.ignore(256, '\n');, which discards up to 256 characters, stopping once it has consumed a newline.
A more delicate solution is to not create the problem in the first place. If you need to read lines, read lines. If you need to read lines and sometimes extract values from the text in a line, read the line, then create a std::stringstream object and extract from that.

istringstream ignores first letter

I am trying to access different words in a string using std::istringstream and I am also doing so with multiple test cases.
int t;
cin>>t;
while(t--)
{
    string arr;
    cin.ignore();
    getline(cin,arr);
    istringstream iss(arr);
    string word;
    while(iss>>word)
    {
        cout<<word;
    }
}
For the first test case, everything is perfect (i.e. it outputs the correct words). But for every subsequent test case, the first letter of the first word is left out.
Example:
Input:
4
hey there hey
hi hi hi
my name is xyz
girl eats banana
And I'm getting:
Output:
hey there hey
i hi hi
y name is xyz
irl eats banana
Can anyone please suggest me what to do and why is this error occurring?
Your problem is that formatted input, i.e., something like in >> value, conventionally skips leading whitespace before attempting to read. Unformatted input, on the other hand, doesn't skip leading whitespace. With the std::cin.ignore(); in your loop you make the assumption that std::getline(std::cin, arr) would leave the newline in the input like the input of t does. That is not so. std::getline() extracts and stores all characters up to the first newline, where it stops; the newline itself is extracted too, but not stored. So, you should remove the cin.ignore(); from the loop.
The key question becomes how to switch between formatted input and unformatted input. Since the newline upon entry of a numeric value may be preceded by arbitrary spaces which you probably also want to ignore, there are essentially two ways:
std::cin >> std::ws; skips all leading whitespace. That may include multiple newlines and spaces at the beginning of the line. Skipping those may not necessarily be desirable.
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); ignores all characters up to and including the first newline. That would allow for empty lines to follow up as well as lines starting with leading whitespace.
This line is the culprit: cin.ignore();.
When std::basic_istream::ignore is called without any arguments, it ignores exactly 1 character.
In your case, std::cin.ignore() discards the first letter of each line, except for the first test case: at that point the newline left behind by cin>>t is still in the stream, so that is the character that gets discarded. For the later test cases, getline has already consumed the newline, so ignore() removes 1 character from the first word instead.
According to the documentation of std::basic_istream::ignore:
Extracts and discards characters from the input stream until and
including delim. ignore behaves as an UnformattedInputFunction
It's worth mentioning that std::basic_istream::ignore will block and wait for user input if there is nothing to ignore in the stream.
With this in mind, let's break down what your code does:
1. The first time you call this function in your loop, it ignores the newline character that is still in the buffer from the previous cin>>t operation. Then the getline statement waits and reads a line from the user.
2. The next time around, since there is nothing in the buffer to ignore (std::getline doesn't leave the newline character in the buffer), it blocks and waits for input to ignore. So the next time the program blocks and waits for input, it is because of the ignore() function, not the getline function as you would have hoped, and the moment you provide input (i.e. your second test case), one character of it is ignored.
3. The next getline call does not block, since the buffer still holds the rest of the input after ignore() dropped its first character, so getline reads the remaining string, which is one character short.
4. The process continues from step 2 until your loop terminates.
int t;
cin>>t; // this leaves the newline character in the buffer
while(t--)
{
    string arr;
    cin.ignore();     // ignores one character from the buffer: the first time,
                      // the newline left by the previous cin input; after that,
                      // it blocks and waits for input to ignore
    getline(cin,arr); // does not block if there is something in the buffer
                      // to read
    ...
}
The solution would be to move the ignore statement out of the loop and place it right after your cin>>t statement. It's also better to write ignore(INT_MAX,'\n'); in this case, so that the whole rest of the line is discarded. You might also want to read this answer to see when and how to use ignore.

Will cin recognize \n typed in from keyboard as a newline character?

I am a beginner for C++ so I'm sorry if this question sounds stupid..
I made this little program to help me get familiar with the properties of cin:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string next;
    cout<<"Enter your input.\n";
    cin>>next;
    cout<<next;
    return 0;
}
When I typed in \n from the keyboard input, I was returned \n.
Also, when I changed the variable next from a string to a character and gave it the same input as above, I was returned only a \.
My question is: Why am I not returned a new line instead? Doesn't cin recognize \n typed in from the keyboard as a newline character? Or is it just applicable to cout?
\n is an escape sequence in C++; when it appears in a character constant or a string literal, the two character sequence is replaced by the single character representing a new line in the default basic encoding (almost always 0x0A in modern systems). C++ defines a number of such escape sequences, all starting with a \.
Input is mapped differently, and in many cases, depending on the device. When reading from the keyboard, most systems will buffer a full line, and only return characters from it when the Enter key has been pressed; what the Enter key sends to a C++ program may vary, and whether the file has been opened in text mode or binary mode can make a difference as well. In text mode, the C++ library should negotiate with the OS to ensure that the Enter key always results in the single character represented by \n. (std::cin is always opened in text mode.) Whether the keyboard driver does something special with \ or not depends on the driver, but most don't. C++ never does anything special with \ when inputting from a keyboard (and \n has no special meaning in C++ source code outside of string literals and character constants).
If you need your program to recognize \n as a new line character at input you can check this out:
https://stackoverflow.com/a/2508814/815812
What Michael says is perfectly correct.
You can try out in similar way.
Technically speaking, this depends on things outside your program, but assuming your terminal simply passes the individual bytes corresponding to the '\' and 'n' characters (which I think any sane one will), then the behavior you're seeing is expected.
"\n" is nothing more than a shortcut added to the programming language and environment to let you more easily represent the notion of the ASCII return key. It's not a character itself, just a command to tell the program to generate a non-printable character that corresponds to pressing the Enter key.
Let's say you're in Notepad or whatever and you press the Tab key. It tabs over a spot. Typing "\t" just enters the literal characters "\" and "t". Internally, whoever wrote Notepad had to say what it should do when the user pressed Tab, and he did so by using the mnemonic like
if(key == '\t') {
// tab over
}