I'm using a Standard iostream to get some input from a file, and I'm confused about unget() versus putback(character). It seems to me from the documentation that these functions are effectively identical, where unget() just remembers the character put in, so I'm nervous. I've always used putback(character), but character is always the last read character and I've been thinking about changing to unget(). Is putback(character) always identical to unget(), if character is always the last read character?
You can't lie with unget(). It "ungets" the last-read character. You can lie with putback(c). You can "putback" some character other than the last-read character. Sometimes putting back a character other than the last-read character can be useful.
Also, if the underlying read buffer really does have buffering capability, you can "putback" more than one character. I think ungetc() is limited to one character.
Edit
Nope. It looks like unget() can go as far back as putback().
It's not the answer you probably expect, but want to introduce my reasoning. Documentation stays that the methods putback and unget call streambuf::sputbackc and streambuf::sungetc respectively. Definitions are as follow:
streambuf::sungetc
Moves the get pointer one character backwards, making the last character gotten by an input operation available once again for the next input operation.
During its operation, the function will call the protected virtual member function pbackfail if the get pointer gptr points to the same position as the beginning pointer eback.
The other one:
streambuf::sputbackc
The get pointer is moved back to point to the character right before its current position so the last character gotten, c, becomes available again as the character to be read at that position by the next input operation.
During its operation, the function calls the protected virtual member function pbackfail either if the character c doesn't match gptr()[-1] or if the get pointer gptr points to the same position as the beginning pointer eback.
When c does not match the character at that position, the default definition of pbackfail in streambuf will prepend c to be the character extracted at that position if possible, but derived classes may override this behavior.
The member function sungetc behaves in a similar way but without taking any parameter
As sputbackc calls pbackfail if character doesn't match, it means the method has to check if the values are equal. It looks like the additional check is the only overhead, but have no idea how it is solved in practise. I can imagine that if the last character is not stored in the object then it has to be reread, so you might expect it even when the characters are guaranteed to be the same.
I was a little bit concerned about situation when we call unget, but last character is not available. Would the putback put the value correctly? I doubt, but it shouldn't be the case while operating on files.
Related
Very incidentally, I wrote a findc() function and I submitted the program.
data test;
x=findc(,'abcde');
run;
I looked at the result and nothing is unnormal. As I glanced over the code, I noticed the findc() function missed the first character argument. I was immediately amazed that such code would work.
I checked the help documentation:
The FINDC function allows character arguments to be null. Null arguments are treated as character strings that have a length of zero. Numeric arguments cannot be null.
What is this feature designed for? Fault tolerance or something more? Thanks for any hint.
PS: I find findw() has the same behavior but find() not.
I suspect that allowing the argument to be not present at all is just an artifact of allowing the strings passed to it to be of zero length.
Normally in SAS strings are fixed length. So there was no such thing as an empty string, just one that was filled with spaces. If you use the TRIM() function on a string that only has spaces the result is a string with one space.
But when they introduced the TRIMN() and other functions like FINDC() and FINDW() they started allowing arguments to functions to be empty strings (if you want to store the result into a variable it will still be fixed length). But they did not modify the behavior of the existing functions like INDEX() or FIND().
For the FINDC() function you might want this functionality when using the TRIMN() function or the strip modifier.
Example use case might be to locate the first space in a string while ignoring the spaces used to pad the fixed length variable.
space = findc(trimn(string),' ');
On implementing a basic_filebuf I stumbled over basic_filebuf::pbackfail and don't fully understand its definition.
From cplusplus.com
Moves the current input position on position back to point to the previous character and, if supported, makes c available as that next character to be read.
If the implementation does not support writing to putback positions, c shall either match the character at the putback position or be the end-of-file value (traits_type::eof()). Otherwise, the function fails. [...]
If the get pointer (gptr) is at the beginning of the character sequence before the call, the function may either fail or make additional putback positions available and succeed, depending on the library implementation.
Or from cppreference:
1) The caller is requesting that the get area is backed up by one character (pbackfail() is called with no arguments), in which case, this function re-reads the file starting one byte earlier and decrements basic_streambuf::gptr(), e.g. by calling gbump(-1).
2) The caller attempts to putback a different character from the one retrieved earlier (pbackfail() is called with the character that needs to be put back), in which case
a) First, checks if there is a putback position, and if there isn't, backs up the get area by re-reading the file starting one byte earlier.
a) Then checks what character is in the putback position. If the character held there is already equal to c, as determined by Traits::eq(to_char_type(c), gptr()[-1]), then simply decrements basic_streambuf::gptr().
b) Otherwise, if the buffer is allowed to modify its own get area, decrements basic_streambuf::gptr() and writes c to the location pointed to gptr() after adjustment.
So in essence both say that the input position is decremented (unless it is at the start of the file) and possibly a char is put back. So the following should succeed (assume prepared file according to comments, using std classes for comparison of behavior):
std::fstream f(..., in | out | binary);
f.get() // == 'a'
f.get() // == 'b'
f.sync(); // or f.seekg(0)
f.putback('b');
f.putback('a');
f.putback(); // may fail
However on libc++ the first putback fails already and checking the source code I found pbackfail to be guarded by if (__file_ && this->eback() < this->gptr()) aka "if there is an open file and there is space at the front of the current read buffer".
A flush/sync/seek clears the read buffer which explains the failing putback. When using unbuffered IO there will only be a single char space in the read buffer so (at least) the 2nd putback will fail even without the flush. Or the second get might cross a buffer "border" which means "b" will be the first char in the current buffer which also makes the second putback fail.
Question: How is putback exactly specified? It seems to be only valid immediately after a get although both cppreference and cplusplus seem to imply that the read position is decremented in any case. If they are right, is libc++ non-conforming or am I missing anything?
Question: How is putback exactly specified?
It is specified in the section istream.unformatted.
However, that refers to sputbackc, and you'll probably want to read the general stream buffer requirements as well.
How does it work when you use a while loop with the condition !fin.eof()? What if my loop contains a statement like fin.get(ch) where ch is of type char?
How exactly does the pointer move and when is it updated?
If at the beginning of the file, the pointer points to the 1st element. Does it move over to the second element in the first iteration itself?
fin.eof() increments the pointer to the next character in the input buffer. This might include a newline character or other hidden character which comes after what you are expecting.
"it reads characters from the current stream position to and including the first newline character, to the end of the stream, or until the number of characters read is equal to n – 1 [for fgets], whichever comes first." (from http://www.cplusplus.com/forum/beginner/37005/)
I've found that
while (getline(fin,sent))
usually works better
(see http://mathbits.com/MathBits/CompSci/Files/End.htm)
Now when we declare a string, the last character is the null character, right.
(Now pls see the image of the code and its output that i have attached)
As you can see in the image attached, i am getting the null character at the 7th posn!!! What is happening?
According to the book i refer to(see the other image attached), a string always has an extra character associated with it, at the end of the string, called the null character which adds to the size of the string.
But by the above code i am getting the null character at the 7th position, although according to the book, i should get it at the 6th position.
Can someone explain the output pls?
Any help is really appreciated!!
Thank You!
Do not use gets() - ever! It is entirely immaterial what gets() does as is has no place in any reasonably written code! It is certainly removed from the C++ standard and, as far as I know, also from C (I think C removed it first). gets() happily overruns the buffer provided as it doesn't even know the size of the storage provided. It was blamed as the primary reason for most hacks of systems.
In the code you linked to there is such a buffer overrun. Also not that sizeof() determines the size of a variable. It does not consider its content in any shape or form: sizeof(str) will not change unless you change the type of str. If you want to determine the size of the string in that array you'll need to use strlen(str).
If you really need to read a string into a C array using FILE* functions, you shall use fgets() which, in addition ot the pointer to the storage and the stream (e.g. stdin for the default input stream) also takes the size of the array as parameter. fgets() fails if it can't read a complete null-terminated string.
You declare a char array that can hold up to 5 chars, however, dummy\0 is 6 characters long, resulting in buffer overflow.
In the program, Lambda λ theoretically represents nothing: ''. I thought of representing this programatically as '\0', but obviously that terminates a string which is not necessarily what lambda does. Also, I am reading in from istringstream and it has problems reading that character in.
So what character would you use?
I'm assuming you have a reason for representing Int,Char,Int as a string, rather than just define a struct to hold the data.
As you say, \0 doesn't work as it terminates the string. But there are other invisible ASCII characters that you can use and easily escape in C++. Have a look at this list of escape codes.