Distinguishing between single and multiple characters while using get()? - c++

The get() function is used to read a single character and to read multiple characters into an array. How does the system know if it is to read a single character or multiple characters?

Based on the arguments you pass, the compiler selects the matching overload of get().
For example, if you call get(char& c), it reads one character from the stream.
If you pass a character array (pointer) and a count n, as in get(char* s, streamsize n), it reads at most n-1 characters into the array, stopping early at a newline, and null-terminates the result.
If you also specify a delimiter character (added to the parameters above), it stops at that character instead of at a newline. The delimiter itself is left in the stream.
If you pass a stream buffer, as in get(streambuf& sb), it copies characters into that buffer until it reaches the delimiter (a newline by default; you can pass a different delimiter here as well) or the end of input.
As is evident from the above, only get(char& c) and the no-argument get(), which returns the character it read, read a single character. The rest read multiple characters, and which one you get is determined by the arguments you provide in your call to get().
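To make the overloads concrete, here is a minimal sketch that calls four of them on the same stream. The names GetDemo and demo_get_overloads are made up for illustration, and a std::istringstream stands in for whatever input stream you actually use.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical holder for what each overload read.
struct GetDemo {
    char single;
    std::string upToCount;
    std::string upToDelim;
    std::string viaStreambuf;
};

GetDemo demo_get_overloads() {
    std::istringstream in("abcdef;ghi\njkl");
    GetDemo r{};

    in.get(r.single);                 // get(char&): exactly one character

    char small[4];
    in.get(small, sizeof small);      // get(char*, n): at most n-1 characters
    r.upToCount = small;

    char big[16];
    in.get(big, sizeof big, ';');     // get(char*, n, delim): stops before ';'
    r.upToDelim = big;
    in.ignore();                      // the delimiter was left in the stream; skip it

    std::ostringstream sink;
    in.get(*sink.rdbuf());            // get(streambuf&): copies up to the newline
    r.viaStreambuf = sink.str();
    return r;
}
```

The call syntax is the same member name every time; only the argument list differs, and that is all the compiler needs to pick the overload.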

Related

SAS function findc with null character argument

Quite by accident, I wrote a call to the findc() function like the one below and submitted the program.
data test;
x=findc(,'abcde');
run;
I looked at the result and nothing seemed abnormal. As I glanced over the code, I noticed that the findc() call was missing its first character argument. I was surprised that such code would work.
I checked the help documentation:
The FINDC function allows character arguments to be null. Null arguments are treated as character strings that have a length of zero. Numeric arguments cannot be null.
What is this feature designed for? Fault tolerance or something more? Thanks for any hint.
PS: I find that findw() has the same behavior, but find() does not.
I suspect that allowing the argument to be not present at all is just an artifact of allowing the strings passed to it to be of zero length.
Normally in SAS strings are fixed length. So there was no such thing as an empty string, just one that was filled with spaces. If you use the TRIM() function on a string that only has spaces the result is a string with one space.
But when they introduced the TRIMN() and other functions like FINDC() and FINDW() they started allowing arguments to functions to be empty strings (if you want to store the result into a variable it will still be fixed length). But they did not modify the behavior of the existing functions like INDEX() or FIND().
For the FINDC() function you might want this functionality when using the TRIMN() function or the strip modifier.
An example use case might be to locate the first space in a string while ignoring the spaces used to pad the fixed-length variable.
space = findc(trimn(string),' ');

Read multiple strings input using fgets()

I need to
Take a command line argument giving number of strings (say N).
Call a function to take N input lines from user (using fgets)
Store them in an array of pointers.
i.e. char *input_lines[MAX_LINES];
All newlines from the lines should be removed
How can I achieve this?
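One possible sketch of the steps above, written in C++ around std::fgets. The helper name read_lines and the 256-byte buffer are assumptions, and it stores the lines in a std::vector<std::string> rather than the char *input_lines[MAX_LINES] array from the question, so treat it as a starting point rather than a drop-in answer.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `count` lines from `in` using std::fgets
// and strips the trailing newline from each one.
std::vector<std::string> read_lines(std::FILE* in, std::size_t count) {
    std::vector<std::string> lines;
    char buffer[256];  // assumed maximum line length; longer lines get split
    while (lines.size() < count &&
           std::fgets(buffer, sizeof buffer, in) != nullptr) {
        buffer[std::strcspn(buffer, "\n")] = '\0';  // cut at the newline, if any
        lines.emplace_back(buffer);
    }
    return lines;
}
```

In main you would convert argv[1] to a number (for example with std::strtoul) and call read_lines(stdin, n).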

How does `cin>>` take three values though they are separated by space?

When looking at some code online I found
cin>>arr[0][0]>>arr[0][1]>>arr[0][2]
where I put a line of three integer values separated by space. I see that those three integers separated by space become the value of arr[0][0], arr[0][1] and arr[0][2].
It doesn't cause any trouble if there is more than one space between them.
Can anyone please explain how this works?
Most overloads of operator>> consume and discard all whitespace characters first thing. They begin parsing the actual value (say, an int) starting from the first non-whitespace character in the stream.
Reading almost any type of input from a stream skips any leading whitespace first, unless you explicitly turn that feature off (for example with std::noskipws). See the std::basic_istream documentation for more information:
Extracts an integer value potentially skipping preceding whitespace. The value is stored to a given reference value.
This function behaves as a FormattedInputFunction. After constructing and checking the sentry object, which may skip leading whitespace, extracts an integer value by calling std::num_get::get().
The same applies to other stream input functions, including the scanf family where most format specifiers will consume all whitespace characters before reading the value:
All conversion specifiers other than [, c, and n consume and discard all leading whitespace characters (determined as if by calling isspace) before attempting to parse the input. These consumed characters do not count towards the specified maximum field width.
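A small sketch of the whitespace skipping described above. The function names are invented for illustration, and the functions take any std::istream so a std::istringstream can stand in for cin.

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <sstream>

// Reads three integers with chained operator>>, as in the question.
std::array<int, 3> read_three(std::istream& in) {
    std::array<int, 3> values{};
    in >> values[0] >> values[1] >> values[2];
    return values;
}

// Counts how many of three extractions succeed once skipping is turned off.
int count_reads_without_skipws(std::istream& in) {
    in >> std::noskipws;  // leading whitespace is no longer discarded
    int value = 0;
    int successes = 0;
    for (int i = 0; i < 3; ++i) {
        if (in >> value) {
            ++successes;
        } else {
            break;  // the stream is now in a failed state
        }
    }
    return successes;
}
```

With the default settings, input such as "1   2" followed by a newline, a tab and "3" still yields 1, 2 and 3. With std::noskipws set, only the first number in "1 2 3" is read, because the second extraction hits the space and fails.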

What to use to represent a lambda character in C++

In the program, lambda (λ) theoretically represents nothing: ''. I thought of representing this programmatically as '\0', but obviously that terminates a string, which is not necessarily what lambda does. Also, I am reading from an istringstream and it has problems reading that character in.
So what character would you use?
I'm assuming you have a reason for representing Int,Char,Int as a string, rather than just defining a struct to hold the data.
As you say, \0 doesn't work as it terminates the string. But there are other invisible ASCII characters that you can use and easily escape in C++. Have a look at this list of escape codes.

Difference between putback() and unget()

I'm using a Standard iostream to get some input from a file, and I'm confused about unget() versus putback(character). It seems to me from the documentation that these functions are effectively identical, where unget() just remembers the character put in, so I'm nervous. I've always used putback(character), but character is always the last read character and I've been thinking about changing to unget(). Is putback(character) always identical to unget(), if character is always the last read character?
You can't lie with unget(). It "ungets" the last-read character. You can lie with putback(c). You can "putback" some character other than the last-read character. Sometimes putting back a character other than the last-read character can be useful.
Also, if the underlying read buffer really does have buffering capability, you can "putback" more than one character. I think unget() is limited to one character.
Edit
Nope. It looks like unget() can go as far back as putback().
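A minimal sketch of the difference, with made-up helper names. It assumes a std::stringstream, whose buffer is open for both reading and writing; with a read-only std::istringstream, putting back a character other than the last-read one would normally fail instead.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads one character, undoes the read with unget(), and reads again.
char reread_with_unget(const std::string& text) {
    std::stringstream in(text);
    char c = '\0';
    in.get(c);
    in.unget();  // always restores the character that was actually read
    in.get(c);
    return c;
}

// Reads one character, puts back `replacement`, and reads again.
char reread_with_putback(const std::string& text, char replacement) {
    std::stringstream in(text);
    char c = '\0';
    in.get(c);
    in.putback(replacement);  // may differ from the character that was read
    in.get(c);
    return c;
}
```

When the replacement equals the last-read character, the two helpers behave the same; when it differs, the next read sees the replacement.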
This is probably not the answer you expect, but I want to lay out my reasoning. The documentation states that the methods putback and unget call streambuf::sputbackc and streambuf::sungetc respectively. The definitions are as follows:
streambuf::sungetc
Moves the get pointer one character backwards, making the last character gotten by an input operation available once again for the next input operation.
During its operation, the function will call the protected virtual member function pbackfail if the get pointer gptr points to the same position as the beginning pointer eback.
The other one:
streambuf::sputbackc
The get pointer is moved back to point to the character right before its current position so the last character gotten, c, becomes available again as the character to be read at that position by the next input operation.
During its operation, the function calls the protected virtual member function pbackfail either if the character c doesn't match gptr()[-1] or if the get pointer gptr points to the same position as the beginning pointer eback.
When c does not match the character at that position, the default definition of pbackfail in streambuf will prepend c to be the character extracted at that position if possible, but derived classes may override this behavior.
The member function sungetc behaves in a similar way but without taking any parameter
As sputbackc calls pbackfail if the character doesn't match, the method has to check whether the values are equal. That additional check looks like the only overhead, but I have no idea how it is handled in practice. I can imagine that if the last character is not stored in the object then it has to be reread, so you might expect that overhead even when the characters are guaranteed to be the same.
I was a little concerned about the situation where we call unget but the last character is not available. Would putback restore the value correctly? I doubt it, but that shouldn't come up when operating on files.