How large can cin input be? - c++

Consider this code:
int a;
cin >> a;
The input doesn't stop when you enter for example 1 or 2; it waits until you press ENTER. How does this work? Why does cin wait for your input until you press ENTER?

Your runtime environment and your terminal control the raw keyboard input. Typically, they only send the input to the application line by line, to allow for editing. You have to speak to your terminal, in a platform-dependent way, if you want it to send you the keyboard input immediately.
(This is often referred to as "raw" mode, as opposed to the usual, line-buffered "cooked" mode. Note that the cooked mode also handles backspace and delete and cursor movement and all that.)

cin just has a buffer behind it that gets filled up with input and then gets emptied as you use the extraction operator (>>). When and how it gets filled up depends on the platform. In Unix-like systems, for example, the input terminal is in either canonical or non-canonical mode. In canonical mode, input is made available line by line. In non-canonical mode, it is available immediately. It's possible to change between these modes (check man termios).
The actual size of the standard input buffer is implementation-defined.
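As a rough illustration of the canonical/non-canonical switch mentioned above, here is a minimal POSIX-only sketch. The helper names (make_noncanonical, enter_noncanonical, restore_terminal) are made up for this example; only tcgetattr(), tcsetattr(), isatty() and the termios fields come from the real API.

```cpp
#include <cassert>
#include <termios.h>
#include <unistd.h>

// Returns a copy of `settings` adjusted for non-canonical input:
// bytes are handed to read() as soon as they arrive, not line by line.
termios make_noncanonical(termios settings) {
    settings.c_lflag &= ~static_cast<tcflag_t>(ICANON);
    settings.c_cc[VMIN] = 1;   // read() returns once at least one byte is available
    settings.c_cc[VTIME] = 0;  // no inter-byte timeout
    return settings;
}

// Hypothetical helper: switches stdin to non-canonical mode and stores the
// previous settings in `saved`. Returns false if stdin is not a terminal
// or the change could not be applied.
bool enter_noncanonical(termios& saved) {
    if (!isatty(STDIN_FILENO)) return false;
    if (tcgetattr(STDIN_FILENO, &saved) != 0) return false;
    const termios changed = make_noncanonical(saved);
    return tcsetattr(STDIN_FILENO, TCSANOW, &changed) == 0;
}

// Restores settings previously saved by enter_noncanonical().
bool restore_terminal(const termios& saved) {
    return tcsetattr(STDIN_FILENO, TCSANOW, &saved) == 0;
}

// Checks the flag arithmetic only; it does not touch the real terminal.
void check_flags() {
    termios cooked{};
    cooked.c_lflag = ICANON | ECHO;
    const termios changed = make_noncanonical(cooked);
    assert((changed.c_lflag & ICANON) == 0);  // line buffering is off
    assert((changed.c_lflag & ECHO) != 0);    // echo is left untouched
}
```

Note that this only changes what the terminal driver delivers. std::cin still has its own buffer on top, so a program that wants single keystrokes would probably read() from file descriptor 0 directly, and should call restore_terminal() before exiting.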

Related

Receiving input from user in c++

I'm trying to read input from user in this way
std::string point;
while (std::getline(std::cin, point))
{
// SOME CODE
}
I want to receive input from user until he hits CTRL+D which means EOF.
obviously in the current situation I get an error since it doesn't end when I hit CTRL+D. Any tips? Thanks!
In Windows, Ctrl+Z or F6 on an otherwise empty line is the convention for marking end-of-file for a text stream. In Unix-land, Ctrl+D is the convention for sending your typed text straightaway to the program, which for an empty line sends nothing (a zero-length input), and the program interprets that as end-of-file. Also, in Windows the interpretation is done in each program, not by Windows itself or the console subsystem or command shell, and for C and C++ it's handled by the standard library's i/o facilities.
One big difference is that in Windows the Ctrl-Z is data, which you can have in e.g. a text file, while in Unix-land the “submit now!” Ctrl-D is an action, that can't occur in data read from a file.
Another big difference is that because of that, in Windows you have to press Return (Enter, newline) after the Ctrl+Z, while in Unix-land the Ctrl-D is it, on its own.
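For what it's worth, the loop from the question is already the usual way to read until end-of-file; the platform only decides which key produces that end-of-file. A minimal sketch, where read_lines_until_eof and demo are names invented for this example and a string stream stands in for the keyboard:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects lines until getline() fails, which for interactive input happens
// when the user signals end-of-file (Ctrl+D on Unix, Ctrl+Z then Enter on Windows).
std::vector<std::string> read_lines_until_eof(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Simulates a user typing two lines and then signalling end-of-file.
void demo() {
    std::istringstream simulated("1 2\n3 4\n");
    const std::vector<std::string> lines = read_lines_until_eof(simulated);
    assert(lines.size() == 2);
    assert(lines[0] == "1 2");
    assert(simulated.eof());  // the loop ended because input ran out
}
```

With std::cin in place of the string stream, the same function returns once the end-of-file key is pressed at the start of a line.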

C++ ANSI escape codes and their interpretation when typed manually

I just noticed this accidentally, with the following code. In the following code,
char teststring[20];
cin.getline(teststring, 20);
the program stops for user input. When I pressed the up arrow (out of muscle memory, to check the bash history), it printed the ANSI escape code ^[[A (I got the details from here). When I then pressed backspace once and hit Enter, the character A was deleted, but the program printed unreadable garbage instead of ^[[. However, when I typed the same characters manually, or copy-pasted them without the last letter (to make sure they were not similar-looking ASCII symbols), it printed ^[[.
What would be the reason though the characters entered were the same?
The Unix terminal is a very intricate beast. Posix includes a pretty thorough description of its features; the below is just a quick summary.
Normally, the terminal input device operates in "canonical" mode. In that mode, the terminal driver maintains a line buffer which it fills when necessary by reading user input. If the buffer is emptied and more data is requested by the program, the driver will read an entire line of input before providing any more data to the program. So if the buffer is empty, even a getc to read a single character will cause an entire line to be read into the terminal driver's buffer before the getc returns.
As the driver reads input characters, it checks for certain special characters; anything else is added to the line buffer and echoed to the terminal device. (Input and output to a terminal device are independent; if the driver or the program didn't echo input, nothing would appear on the screen, which would usually be confusing. Programs turn echoing off in order to be able to accept passwords, for example.)
All of the special characters are configurable. There are quite a few; here are some of the more common ones:
Enter Inserts a newline character into the buffer, and terminates the input line so that the pending read will return.
Ctrl-D (EOF) The character itself is discarded, but the input is terminated and a pending read returns. If the input buffer is empty (i.e., the Ctrl-D was pressed at the beginning of a line), a zero-length buffer will be returned to the pending read, which will be interpreted as an end-of-file marker.
Bksp (ERASE) Unless the input buffer is empty, removes the last character from the input buffer and erases it from the screen.
Ctrl-C (INTR) Sends SIGINT to the process.
Ctrl-Z (SUSP) Sends SIGTSTP to the process.
Ctrl-U (KILL) Deletes the entire input buffer.
Ctrl-S (STOP) Stops output.
Ctrl-Q (START) Resumes output if it has been stopped with the STOP character.
When the Linux terminal driver is echoing characters, it will normally echo control characters (characters whose ASCII code is less than 0x20) as a caret (^) followed by the character whose code is 0x40 higher, which is usually a letter. The ESC character has the code 0x1B, so it will normally be echoed as a caret followed by the character 0x5B, which is an open square bracket. Hence, you would normally expect ESC to echo as ^[.
Many keys on the keyboard actually send more than one character, and almost all of these sequences start with ESC[. The uparrow, for example, sends the codes ESC[A, and so if you are running a naive program which doesn't handle cursor moving characters, you will see ^[[A echoed when you press the uparrow key.
The character you see is one of the ways used to show characters which don't correspond to any Unicode glyph. The box contains four hex digits, which correspond to the Unicode codepoint, in this case U+001B, which is an ESC character. I don't know why this happened, but it is most likely the result of a race condition between the various components which contribute to the terminal echo.
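The caret notation described above is simple enough to reproduce in a few lines. This is only a sketch of the rule as stated (code below 0x20 becomes ^ plus the character 0x40 higher); it assumes tab and newline are passed through unchanged, and echo_char / echo_string are names made up for the example, not anything the terminal driver exposes.

```cpp
#include <cassert>
#include <string>

// Renders one byte roughly the way a terminal that echoes control
// characters would show it. Simplification: only tab and newline are
// exempted from caret notation here.
std::string echo_char(unsigned char c) {
    if (c == '\n' || c == '\t') return std::string(1, static_cast<char>(c));
    if (c < 0x20) {
        // Control character: caret followed by the character 0x40 higher.
        return std::string("^") + static_cast<char>(c + 0x40);
    }
    return std::string(1, static_cast<char>(c));
}

// Applies echo_char() to every byte of `input`.
std::string echo_string(const std::string& input) {
    std::string out;
    for (unsigned char c : input) out += echo_char(c);
    return out;
}

// The two cases discussed in the text.
void demo() {
    assert(echo_char(0x1B) == "^[");          // ESC alone
    assert(echo_string("\x1b[A") == "^[[A");  // what the up arrow sends
}
```

This also shows why typing ^[[ by hand is not the same input: the typed version is three printable characters, while the arrow key's version starts with the single byte 0x1B.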

How can I flush stdin? (environment: Mingw compiler, running in xterm or mintty of Cygwin)

There are two ways that I know to flush stdin:
(1) bool FlushConsoleInputBuffer(_In_ HANDLE hConsoleInput);
(2) fflush (stdin);
However, in my environment:
Compiler: MinGW g++
Running in: Windows, Cygwin xterm or Cygwin mintty
Neither of them works.
What can I do?
Note: FlushConsoleInputBuffer() works if my program runs under dos prompt window. In addition, FlushConsoleInputBuffer() nicely returns false, when it runs on Cygwin xterm or mintty.
--UPDATE--
I suspect that Cygwin handles stdin separately from the native Windows stdin, which makes FlushConsoleInputBuffer() fail.
@wallyk: yes. 'flush' means dropping all unread buffered inputs.
--UPDATE-- (final answer accepted and reason)
Tony D is right. The problem is that the Cygwin terminal is a Unix-like terminal, which allows editing before the 'ENTER' key is hit. Thus any partial input must be buffered and will never be passed to stdin before the 'ENTER' key is hit, since the terminal still expects editing commands. I guess it should be possible to overcome this by setting the terminal to raw mode (not tried). Yet the editing feature will be lost in raw mode.
fflush is meant to be used with an output stream. The behavior of fflush(stdin) is undefined. See http://en.cppreference.com/w/cpp/io/c/fflush.
If you use std::cin to access stdin, you can use std::istream::ignore() to ignore the contents of the stream up to a given number of characters or a given character.
Example:
// Ignore the rest of the line.
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
Working code: http://ideone.com/Z6zLue
If you are using stdin to access the input stream, you can use the following to ignore the rest of the line.
int c;  // int, not char, so that EOF can be told apart from valid characters
while ((c = fgetc(stdin)) != '\n' && c != EOF) {}
Working code: http://ideone.com/gg0Az2
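To show where the ignore() call usually sits in practice, here is a small sketch of the common "bad line, try the next one" pattern. The function name read_int_skipping_bad_lines is invented for this example, and a string stream stands in for std::cin.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>

// Reads the next int from `in`. On malformed input it clears the error
// state, discards the rest of that line, and tries again.
// Returns false only when the stream is exhausted or unreadable.
bool read_int_skipping_bad_lines(std::istream& in, int& value) {
    while (!(in >> value)) {
        if (in.eof() || in.bad()) return false;
        in.clear();  // reset failbit so that ignore() is allowed to read
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return true;
}

// Simulates a user typing a bad line followed by a good one.
void demo() {
    std::istringstream simulated("abc\n42\n");
    int value = 0;
    const bool first = read_int_skipping_bad_lines(simulated, value);
    assert(first);
    assert(value == 42);
    const bool second = read_int_skipping_bad_lines(simulated, value);
    assert(!second);  // nothing left to read
}
```

The clear() before ignore() matters: while failbit is set, ignore() does nothing.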
Discarding exactly and only the data currently buffered/available to stdin isn't supported using only C++ Standard library features.
Most of the time, programmers just ignore (see example at the bottom of that page) the rest of a problematic line then try the next buffered line. If you're concerned there may be a lot of problematic lines - for example, that the user may have cut-and-pasted pages of nonsense that you want to discard, but then you do want to give them a chance to enter further lines, you need to use an OS-specific function to work out when a read on stdin would block. You'd then ignore lines until that would-block condition is true.
select and poll are two such operations that work on most Operating Systems, but from memory they're only defined for socket streams on Windows so of no use to you. Cygwin may or may not support them somehow; if you want to try it - you would ignore lines as long as the stdin file descriptor (which is 0) tests readable. You'll find lots of other Q&A discussing how to see if there's input available: e.g. checking data availability before calling std::getline, Check if stdin is empty, Win32 - read from stdin with timeout
Keep in mind that your terminal program is probably internally buffering what you type until you press ENTER, so at most your program can clear the earlier lines but not a line the user's partially typed (though you could use some heuristic to discard it after it's sent to your program's stdin).
UPDATE
Cruder alternatives that might be good enough in some circumstances:
save the now() time, then loop calling getline(std::cin, my_string) until either it fails (e.g. EOF on stdin) or the time between reads is greater than some threshold, say half a second; that way it's likely to consume the already-buffered but unwanted input, while ENTER for further hand-typed input is likely to arrive only after the discarding loop has terminated. You could prompt with something like std::cout << "bad input discarded - you may press ^U to clear your input buffer if it contains unwanted text...\n"; (Control-U works for many terminals, but check your own)
have a particular string like say "--reset--" that the user knows they can type to stop discarding lines and switch back to processing future lines
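The second alternative (a reset string) might look something like this. It is only a sketch: discard_until_sentinel is a hypothetical helper, "--reset--" is the example string from the list above, and a string stream plays the role of std::cin.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Discards lines from `in` until one is exactly equal to `sentinel`.
// Returns how many lines were thrown away before the sentinel,
// or -1 if the input ended without the sentinel appearing.
long discard_until_sentinel(std::istream& in, const std::string& sentinel) {
    long discarded = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line == sentinel) return discarded;
        ++discarded;
    }
    return -1;
}

// Simulates pasted junk, the reset string, then a line worth keeping.
void demo() {
    std::istringstream simulated("junk\nmore junk\n--reset--\nreal input\n");
    const long discarded = discard_until_sentinel(simulated, "--reset--");
    std::string next;
    std::getline(simulated, next);
    assert(discarded == 2);
    assert(next == "real input");
}
```

The obvious trade-off is that the user has to know the magic string, and it can never be valid input itself.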

How do I clear user input (cin) that occurred while the process was blocked?

I have a C++ program that takes input from the user on std::cin. At some points it needs to call a function that opens a GUI window with which the user can interact. While this window is open, my application is blocked. I noticed that if the user types anything into my application's window while the other window is open, then nothing happens immediately, but when control returns to my application those keystrokes are all acted upon at once. This is not desirable. I would like for all keystrokes entered while the application is blocked to be ignored; alternatively, a way to discard them all upon the application regaining control, but retaining the capability to react to keystrokes that occur after that.
There are various questions on Stack Overflow that explain how to clear a line of input, but as far as I can tell they tend to assume things like "the unwanted input only lasts until the next newline character". In this case this might not be so, because the user could press enter several times while the application is blocked. I have tried a variety of methods (getline(), get(), readsome(), ...) but they generally seem not to detect when cin is temporarily exhausted. Rather, they wait for the user to continue supplying content for cin. For example, if I use cin.ignore(n), then not only is everything typed while the GUI window was open ignored, but the program keeps waiting afterwards while the user types content until a total of n characters have been typed. That's not what I want - I want to ignore characters based on where in time they occurred, not where in the input stream they occur.
What is the idiom for "exhaust everything that's in cin right now, but then stop looking for more stuff"? I don't know what to search for to solve this.
I saw this question, which might be similar and has an answer, but the answer asks for the use of <termios.h>, which isn't available on Windows.
There is no portable way to achieve what you are trying to do. You basically need to set the input stream to non-blocking state and keep reading as long as there are any characters.
get() and getline() will just block until there is enough input to satisfy the request. readsome() only deals with the stream's internal buffer and is only useful for extracting, without blocking, what was already read into that buffer.
On POSIX systems you'd just set the O_NONBLOCK flag with fcntl() and keep read()ing from file descriptor 0 until the read returns a value <= 0 (if it is less than 0 there was an error, which includes the "nothing available right now" case; if it is 0 the input has ended). Since the OS normally buffers input on a console, you'd also need to set the terminal to non-canonical mode (using tcsetattr()). Once you are done you'd probably restore the original settings.
How to do something similar on non-POSIX systems I don't know.
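A minimal sketch of the POSIX approach described above, assuming a Unix-like system. drain_fd is a name made up for this example; it works on any file descriptor, which lets the demo use a pipe instead of a real terminal. For an actual terminal you would additionally need the non-canonical mode mentioned above, otherwise a partially typed line is still held back by the driver.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Discards whatever can be read from `fd` right now without blocking.
// Returns the number of bytes discarded, or -1 if fcntl() fails.
long drain_fd(int fd) {
    const int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;

    long discarded = 0;
    char buffer[256];
    ssize_t n;
    // read() returns 0 at end of input and -1 (typically with errno set to
    // EAGAIN) when nothing is pending; either way the loop stops.
    while ((n = read(fd, buffer, sizeof buffer)) > 0) {
        discarded += n;
    }

    fcntl(fd, F_SETFL, flags);  // restore the original blocking behaviour
    return discarded;
}

// Uses a pipe as a stand-in for stdin: five bytes are pending, then none.
void demo() {
    int fds[2];
    if (pipe(fds) != 0) return;
    if (write(fds[1], "hello", 5) == 5) {
        const long first = drain_fd(fds[0]);
        const long second = drain_fd(fds[0]);
        assert(first == 5);
        assert(second == 0);
    }
    close(fds[0]);
    close(fds[1]);
}
```

In the scenario from the question you would call drain_fd(0) right after the GUI window closes. Keep in mind that this bypasses std::cin, so anything std::cin had already pulled into its own buffer is not affected.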

The meaning of ENABLE_PROCESSED_INPUT in SetConsoleMode flags

In the Windows API, there is the SetConsoleMode function.
Among the mode values, I cannot understand the ENABLE_PROCESSED_INPUT value.
The MSDN document says
ENABLE_PROCESSED_INPUT : value (0x0001) :
CTRL+C is processed by the system and is not placed in the input buffer. If the input buffer is being read by ReadFile or ReadConsole, other control keys are processed by the system and are not returned in the ReadFile or ReadConsole buffer. If the ENABLE_LINE_INPUT mode is also enabled, backspace, carriage return, and line feed characters are handled by the system.
Does it mean that when this flag is set, CTRL+C is not placed in the input buffer (because it's handled by the system)? Or is it the other way round (CTRL+C is placed in the input buffer)? The explanation is confusing to me. Can anyone explain it?
It means that Ctrl+C will not be put in the input buffer if the ENABLE_PROCESSED_INPUT flag is set (instead, the system handles it and signals the process running in the console, which a C or C++ program typically sees as SIGINT).
The same behavior applies to the ENABLE_LINE_INPUT flag: if it is set, characters like backspace, carriage return and line feed are not put in the input buffer and are handled by the system (erasing characters from the buffer and processing end-of-lines automatically).
ENABLE_PROCESSED_INPUT : value (0x0001) : CTRL+C is processed by the system and is not placed in the input buffer.
So basically yes, nothing goes into the input buffer, because the special symbols are handled by the system.
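To make the flag handling concrete, here is a small sketch. The 0x0001 value is the one quoted from the documentation above; 0x0002 for ENABLE_LINE_INPUT and the k-prefixed names are additions for this example. The Windows-only part is guarded so the flag arithmetic can be tried anywhere, and deliver_ctrl_c_as_input is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Local copies of the documented flag values, named differently so they
// do not clash with the macros from <windows.h>.
constexpr std::uint32_t kProcessedInput = 0x0001;  // ENABLE_PROCESSED_INPUT
constexpr std::uint32_t kLineInput = 0x0002;       // ENABLE_LINE_INPUT

// Returns `mode` with processed input switched off and every other flag kept.
constexpr std::uint32_t without_processed_input(std::uint32_t mode) {
    return mode & ~kProcessedInput;
}

#ifdef _WIN32
// With the flag cleared, Ctrl+C is no longer intercepted by the system and
// shows up as ordinary input instead. Returns false if stdin is not a console.
bool deliver_ctrl_c_as_input() {
    HANDLE in = GetStdHandle(STD_INPUT_HANDLE);
    DWORD mode = 0;
    if (!GetConsoleMode(in, &mode)) return false;
    return SetConsoleMode(in, without_processed_input(mode)) != 0;
}
#endif

// Checks the bit arithmetic only; no console is touched.
void check_flags() {
    const std::uint32_t typical = kProcessedInput | kLineInput;
    assert(without_processed_input(typical) == kLineInput);
    assert(without_processed_input(kLineInput) == kLineInput);
}
```

So the flag being set is the "Ctrl+C never reaches my buffer" case, and clearing it is how a program opts in to reading Ctrl+C itself.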