I'm writing a little parser in C++98 (yup, I cannot use C++11).
I'm working with a std::stringstream which I pass by reference to different functions, let's call them subparsers.
In order to know which subparser to call I need to know the next word in the stringstream.
As stringstream is an istream, it has a peek function which returns the next character without moving the current read position. But as I need the next word rather than the next character, I wrote a function peekWord
(ignore the commented line for now):
std::string Parser::peekWord(std::stringstream& sstream){
    std::string myString = "EOF";
    if(!sstream.eof()){
        unsigned pos = sstream.tellg();
        sstream >> myString;
        //sstream.tellg();
        sstream.seekg(pos);
    }
    return myString;
}
which seems to work nicely.
While debugging I noticed that as soon as I call tellg() after the read position has been moved past the final word (in which case it returns -1), seekg(positionBeforeLastWord) doesn't work anymore and the position stays at -1.
Does calling tellg() at the end of a stringstream set the failbit or something like that? I would have hoped that a query function like tellg() has no side effects.
Looking forward to hearing from you guys :)
pip
tellg is specified as such:
Returns: After constructing a sentry object, if fail() != false, returns pos_type(-1) to indicate failure. Otherwise, returns rdbuf()->pubseekoff(0, cur, in).
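(istream::sentry objects prepare the stream for input. If the stream is not good() when the sentry is constructed, for example because eofbit is already set, the sentry sets failbit.)
So yes: once the stream has hit EOF, calling tellg() sets failbit as well, and the following seekg() then fails. You can detect this by checking eof() and call clear() to return to normal processing.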
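To make the advice above concrete, here is a minimal sketch of a peekWord variant that stores the position as a pos_type and calls clear() before seeking back. It is written as a free function taking std::istream&, which is my assumption and not the asker's Parser member, and it keeps the asker's "EOF" sentinel. It only uses C++98 facilities.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical free-function variant of the asker's peekWord.
// Returns the next word without consuming it, or "EOF" if there is none.
std::string peekWord(std::istream& in)
{
    // If eofbit/failbit is already set, do not call tellg():
    // it would return -1 and may set failbit on top.
    if (!in.good())
        return "EOF";

    const std::istream::pos_type pos = in.tellg();  // remember the read position

    std::string word;
    if (!(in >> word))
        word = "EOF";        // nothing but whitespace was left

    in.clear();              // drop eofbit/failbit set by the extraction
    in.seekg(pos);           // rewind to the remembered position
    return word;
}
```

Usage is the same as before: call peekWord(sstream) to decide which subparser to run, then let the subparser extract the word for real.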
I don't understand the design decisions behind the C++ getline function.
Why does it take a stream and a string by reference as arguments, only to return the same stream that was passed in? It seems more intuitive to only take the stream as an argument, then return the string that was read. Returning the same stream lets you chain the call, but would anyone really want to use getline(getline(stream, x), y)?
Additionally, why is the function not in the std namespace like the rest of the standard library?
If the function returned a string, there would be no way of indicating that the read failed, as all string values are valid values that could be returned by this (or any other) function. On the other hand, a stream has lots of error indicator flags that can be tested by the code that calls getline. So people can write code like:
while( std::getline( std::cin, somestring )) {
// do stuff with somestring
}
and it is hard to see how you could write similar code if getline returned a string.
why is the function not in the std namespace like the rest of the standard library?
It is in the std namespace - what makes you think otherwise?
Why does it take a stream and a string by reference as arguments, only to return the same stream that was passed in?
It is a common pattern in the stream library to do that. It means you can test the operation being performed as you perform it. For example:
std::string line;
while(std::getline(std::cin, line))
{
// use line here because we know the read succeeded
}
You can also make succinct parsers by "chaining" stream functions:
std::string key, value;
if(std::getline(std::getline(in, key, '='), value))
my_map[key] = value;
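As a slightly fuller sketch of the chaining idea, the function below reads "key=value" lines into a map. The function name and the one-pair-per-line input format are assumptions for illustration only.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parses lines of the form key=value.
std::map<std::string, std::string> parseKeyValues(std::istream& in)
{
    std::map<std::string, std::string> result;
    std::string key, value;
    // The inner getline reads up to '=', the outer one reads the rest of the line.
    // If either read fails, the stream converts to false and the loop ends.
    while (std::getline(std::getline(in, key, '='), value))
        result[key] = value;
    return result;
}
```

Because both calls return the stream, a single condition covers the success of both reads.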
It seems more intuitive to only take the stream as an argument, then return the string that was read.
The problem with returning a new string every call is that you are constantly allocating new memory for them instead of reusing the memory already allocated to the string you passed in or that it gained while iterating through a loop.
// Here line will not need to allocate memory every time
// through the loop. Only when it finds a longer line than
// it has capacity for:
std::string line;
while(std::getline(std::cin, line))
{
// use line here because we know the read succeeded
}
I have an instance of stringstream that I am reading from. At a certain point of getting data out of the stream, I need to read an identifier that may or may not be there. The logic is something like this:
std::string identifier;
sstr >> identifier;
if( identifier == "SomeKeyword" )
    // process the rest of the string stream using method 1
else
    // back up to before we tried to read "identifier" and process the stream using method 2
How can I achieve the above logic?
Use the stream's tellg() and seekg() methods, e.g.:
std::string identifier;
std::stringstream::pos_type pos = sstr.tellg();
sstr >> identifier;
if (identifier == "SomeKeyword")
{
    // process the rest of the string stream using method 1
}
else
{
    // back up to before we tried to read "identifier"
    sstr.seekg(pos);
    // process the stream using method 2
}
You can get the get pointer in the stream before you get the identifier and restore the position if the identifier is wrong:
std::string identifier;
std::stringstream::pos_type pos = sstr.tellg();
sstr >> identifier;
if( identifier == "SomeKeyword") {
    // process the rest of the string stream using method 1
} else {
    sstr.clear();     // reset eofbit/failbit so that seekg can succeed
    sstr.seekg(pos);  // pos is an absolute position, so no seek direction is needed
    // process the stream using method 2
}
The page on tellg at cplusplus.com has a very nice example. The purpose of calling clear() is to ensure that seekg works even if the previous read reached end-of-file. Before C++11 this call is required. Since C++11, seekg clears eofbit itself, so clear() is only still needed if the extraction may also have set failbit (for example when no word could be read at all). Thanks to @metal for pointing this out.
You can directly inspect a stringstream's contents. That may be a clearer approach than extracting and rolling back, because you have no guarantee about the stringstream's state after extraction. For example, if your string only contained one word, extracting it would have set the ios_base::eofbit flag.
You could inspect the stringstream's contents like this (note that str() returns the whole buffer, not just the unread part, so the comparison at offset 0 is only right if nothing has been extracted yet):
if(sstr.str().compare(0, identifier.length(), identifier) == 0) {
    sstr.ignore(identifier.length());
    // process the rest of the string stream using method 1
} else {
    // process the stream using method 2
}
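One risk this takes on: if you were depending upon the stringstream's extraction operator to skip leading white-space, you'll need to discard it yourself before doing the compare. This can be done before your if-block with sstr >> std::ws; (skipws only sets a formatting flag and does not consume anything). Since str() still contains that white-space, the comparison then has to start at the current read position rather than at 0.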
While I do consider this method safer, it should be noted that if you are dependent upon leading white-space being in sstr for "method 2" then you should use one of the other answers (but you should also reconsider your use of stringstream since all the extraction operators first eat white-space.)
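Here is a minimal sketch of the inspection approach that starts the comparison at the current read position instead of at offset 0. The helper name consumeKeyword is hypothetical, and like the snippet above it only checks for a prefix, so "SomeKeywordX" would also match.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: if the unread part of `in` starts with `keyword`
// (after optional leading white-space), consume it and return true.
// On a mismatch only the leading white-space has been consumed.
bool consumeKeyword(std::stringstream& in, const std::string& keyword)
{
    in >> std::ws;                                   // discard leading white-space

    const std::stringstream::pos_type pos = in.tellg();
    if (pos == std::stringstream::pos_type(-1))      // stream already failed
        return false;

    // str() returns the whole buffer, so compare from the read position.
    const std::string buffer = in.str();
    const std::string::size_type start =
        static_cast<std::string::size_type>(std::streamoff(pos));
    if (buffer.compare(start, keyword.length(), keyword) != 0)
        return false;

    in.ignore(static_cast<std::streamsize>(keyword.length()));
    return true;
}
```

With this, the if-block becomes if (consumeKeyword(sstr, "SomeKeyword")) { method 1 } else { method 2 }.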
I'm finding that gcount on an ifstream object after a call to
getline(istream &, string &)
returns 0.
Is this supposed to be the case?
Yes, gcount() is supposed to return the number of characters extracted by the last unformatted input operation performed on the object.
getline() is listed among the functions that update gcount(), but that is the member function getline() of the stream, not the free function std::getline() that reads into a std::string.
In case of doubt, the reference for std::getline says it in black and white: it behaves as an UnformattedInputFunction, except that input.gcount() is not affected.
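A small sketch of the difference, assuming a fixed 64-character buffer for the member version (the helper names are made up for illustration). If you need the character count after std::getline, take it from the string instead of from gcount().

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line with the member getline()
// and reports what gcount() says afterwards.
std::streamsize countWithMemberGetline(std::istream& in)
{
    char buffer[64];                     // assumed large enough for this sketch
    in.getline(buffer, sizeof buffer);
    return in.gcount();                  // extracted characters, delimiter included
}

// Hypothetical helper: reads one line with std::getline()
// and takes the count from the string itself.
std::string::size_type countWithStringGetline(std::istream& in)
{
    std::string line;
    std::getline(in, line);              // leaves gcount() untouched
    return line.size();                  // the delimiter is not stored
}
```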
seekg uses ios as the second argument, and ios can be set to end or beg or some other values as shown here: http://www.cplusplus.com/reference/iostream/ios/
I just want the pointer to move to the next character, how is that to be accomplished through ifstream?
EDIT
Well, the problem is that I want a function in ifstream similar to fseek, which moves the pointer without reading anything.
ifstream fin(...);
// ...
fin.get(); // <--- move one character
// or
fin.ignore(); // <--- move one character
Yes. It's called seekg(), as you seem to know already.
std::ifstream is("plop.txt" );
// Do Stuff
is.seekg (1, std::ios::cur); // Move 1 character forward from the current position.
Note that this moves the read position in the same way as:
is.get();
// or
is.ignore();
Read the docs for seekg and use ios_base::cur as indicated there.
I guess peek() and unget() could be useful too.
Use peek() to look at the next character without extracting it, so that a following getline still sees it.
Use unget() to put the last character back into the buffer in case you have already extracted it with get().
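The options mentioned in these answers can be put side by side in a short sketch. The wrapper names are hypothetical; the calls inside them are the standard istream members.

```cpp
#include <istream>
#include <sstream>

// Advance the read position by one character without reading it.
void skipWithSeek(std::istream& in)   { in.seekg(1, std::ios_base::cur); }

// Extract and discard one character.
void skipWithIgnore(std::istream& in) { in.ignore(); }

// Look at the next character without consuming it.
int lookAhead(std::istream& in)       { return in.peek(); }

// Put the most recently extracted character back.
void putBack(std::istream& in)        { in.unget(); }
```

One design note: ignore() and get() set eofbit when they run into the end of the stream, while seekg() just repositions, so the error handling differs slightly between them.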
What does the istream::getline method return?
I am asking because I have seen that to loop through a file, it should be done like this:
while ( file.getline( char*, int ) )
{
// handle input
}
What is being returned?
It returns a stream so that we can chain the operation.
But when you use an object in a boolean context, the compiler looks for a conversion operator that can convert it into a type usable in that context.
C++11
In this case the stream has explicit operator bool() const. When called it checks the error flags. If either failbit or badbit is set it returns false; otherwise it returns true.
C++03
In this case the stream has operator void*() const. As this yields a pointer, it can be used in a boolean context. When called it checks the error flags. If either failbit or badbit is set it returns NULL, which converts to false; otherwise it returns a non-null pointer (typically a pointer to the stream itself, though you should not rely on that).
Usage
So you can use a stream in any context that would require a boolean test:
if (stream >> x)
{
}
while(stream)
{
/* do Stuff */
}
Note: It is a bad idea to test the stream on the outside and then read from or write to it inside the body of the conditional/loop statement. This is because the act of reading may make the stream bad. It is usually better to do the read as part of the test.
while(std::getline(stream, line))
{
// The read worked and line is valid.
}
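To illustrate the note above, here is a small sketch comparing the two loop shapes. Both function names are made up. On input with two lines, the test-first version produces a third, bogus entry, because the stream is still good after the second line and only the third read fails.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Tests the stream first and reads inside the body (the pattern to avoid).
std::vector<std::string> readLinesTestFirst(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (in)
    {
        std::getline(in, line);   // may fail, but the result is used anyway
        lines.push_back(line);
    }
    return lines;
}

// Does the read as part of the test.
std::vector<std::string> readLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);    // only reached when the read succeeded
    return lines;
}
```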
See the reference. The istream returned from getline is converted to bool to check whether the operation succeeded. That conversion makes if(mystream.getline(a,b)) shorthand for if(!mystream.getline(a,b).fail()).
It returns the stream itself. The stream can convert (through void*) to bool indicating its state. In this example, your while loop will terminate when the stream's conversion to bool goes "false", which happens when your stream enters an error state. In your code, it's most likely to occur when there was an attempt to read past the end of the file. In short, it'll read as much as there is, and then stop.
The function returns a reference to the stream object itself, which can be used either to chain further read operations:
myStream.getline(...).getline(...);
or, because streams are implicitly convertible to void *s, in a loop or condition:
while (myStream.getline(...)) {
...
}
You can read more about this on the cplusplus.com website:
http://cplusplus.com/reference/iostream/istream/getline/
Everyone has told you what it is; now let me tell you to use the free-function version:
std::string line;
while(getline(file, line)) // assuming file is an instance of istream
{
//
}
Why this version? It should become immediately apparent - you pass in a std::string rather than some fixed size character buffer!
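A short sketch of what goes wrong with a fixed-size buffer: if the line does not fit, the member getline() sets failbit, and a loop like the one in the question stops early. The 4-character buffer and the function names are assumptions chosen to make the effect visible.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical check: does the first line of `text` fit into a 4-char buffer?
bool firstLineFitsInBuffer(const std::string& text)
{
    std::istringstream in(text);
    char buffer[4];                      // room for 3 characters plus '\0'
    in.getline(buffer, sizeof buffer);
    return !in.fail();                   // failbit is set when the line was too long
}

// The std::string version has no such limit.
std::string firstLine(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}
```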