What are the guidelines regarding parsing with iostreams? - c++

I found myself writing a lot of parsing code lately (mostly custom formats, but it isn't really relevant).
To enhance reusability, I chose to base my parsing functions on i/o streams so that I can use them with things like boost::lexical_cast<>.
I however realized I have never read anywhere anything about how to do that properly.
To illustrate my question, let's consider I have three classes Foo, Bar and FooBar:
A Foo is represented by data in the following format: string(<number>, <number>).
A Bar is represented by data in the following format: string[<number>].
A FooBar is kind-of a variant type that can hold either a Foo or a Bar.
Now let's say I wrote an operator>>() for my Foo type:
istream& operator>>(istream& is, Foo& foo)
{
char c1, c2, c3;
is >> foo.m_string >> c1 >> foo.m_x >> c2 >> std::ws >> foo.m_y >> c3;
if ((c1 != '(') || (c2 != ',') || (c3 != ')'))
{
is.setstate(std::ios_base::failbit);
}
return is;
}
The parsing goes fine for valid data. But if the data is invalid:
foo might be partially modified;
Some data in the input stream was read and is thus no longer available to further calls to is.
Also, I wrote another operator>>() for my FooBar type:
istream& operator>>(istream& is, FooBar& foobar)
{
Foo foo;
if (is >> foo)
{
foobar = foo;
}
else
{
is.clear();
Bar bar;
if (is >> bar)
{
foobar = bar;
}
}
return is;
}
But obviously it doesn't work because if is >> foo fails, some data has already been read and is no longer available for the call to is >> bar.
So here are my questions:
Where is my mistake here ?
Should one write his calls to operator>> to leave the initial data still available after a failure ? If so, how can I do that efficiently ?
If not, is there a way to "store" (and restore) the complete status of an input stream: state and data ?
What differences are there between failbit and badbit ? When should we use one or the other ?
Is there any online reference (or a book) that explains deeply how to deal with iostreams ? not just the basic stuff: the complete error handling.
Thank you very much.

Personally, I think these are reasonable questions and I remember very well that I struggled with them myself. So here we go:
Where is my mistake here ?
I wouldn't call it a mistake but you probably want to make sure you don't have to back off from what you have read. That is, I would implement three versions of the input functions. Depending on how complex the decoding of a specific type is, I might not even share the code because it might be just a small piece anyway. If it is more than a line or two, I probably would share the code. That is, in your example I would have an extractor for FooBar which essentially reads the Foo or the Bar members and initializes objects correspondingly. Alternatively, I would read the leading part and then call a shared implementation extracting the common data.
Let's do this exercise because there are a few things which may be a complication. From your description of the format it isn't clear to me if the "string" and what follows the string are delimited e.g. by a whitespace (space, tab, etc.). If not, you can't just read a std::string: the default behavior for them is to read until the next whitespace. There are ways to tweak the stream into considering characters as whitespace (using std::ctype<char>) but I'll just assume that there is space. In this case, the extractor for Foo could look like this (note, all code is entirely untested):
std::istream& read_data(std::istream& is, Foo& foo, std::string& s) {
Foo tmp(s);
if (is >> get_char<'('> >> tmp.m_x >> get_char<','> >> tmp.m_y >> get_char<')'>)
std::swap(tmp, foo);
return is;
}
std::istream& operator>>(std::istream& is, Foo& foo)
{
std::string s;
return read_data(is >> s, foo, s);
}
The idea is that read_data() reads the part of a Foo which is different from Bar when reading a FooBar. A similar approach would be used for Bar but I omit this. The more interesting bit is the use of this funny get_char() function template. This is something called a manipulator and is just a function taking a stream reference as argument and returning a stream reference. Since we have different characters we want to read and compare against, I made it a template but you can have one function per character as well. I'm just too lazy to type it out:
template <char Expect>
std::istream& get_char(std::istream& in) {
char c;
if (in >> c && c != Expect) { // compare against the template argument
in.setstate(std::ios_base::failbit);
}
return in;
}
What looks a bit weird about my code is that there are few checks if things worked. That is because the stream would just set std::ios_base::failbit when reading a member failed and I don't really have to bother myself. The only case where there is actually special logic added is in get_char() to deal with expecting a specific character. Similarly there is no skipping of whitespace characters (i.e. use of std::ws) going on: all the input functions are formatted input functions and these skip whitespace by default (you can turn this off by using e.g. in >> std::noskipws) but then lots of things won't work.
With a similar implementation for reading a Bar, reading a FooBar would look something like this:
std::istream& operator>> (std::istream& in, FooBar& foobar) {
std::string s;
if (in >> s) {
switch ((in >> std::ws).peek()) {
case '(': { Foo foo; read_data(in, foo, s); foobar = foo; break; }
case '[': { Bar bar; read_data(in, bar, s); foobar = bar; break; }
default: in.setstate(std::ios_base::failbit);
}
}
return in;
}
This code uses an unformatted input function, peek(), which just looks at the next character. It either returns the next character or it returns std::char_traits<char>::eof() if it fails. So, if there is either an opening parenthesis or an opening bracket we have read_data() take over. Otherwise we always fail. That solves the immediate problem. On to distributing information...
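Putting the pieces above together, here is a minimal compilable sketch. The layouts of Foo, Bar and FooBar are assumptions (the question never shows them), and the manipulator is called expect here only to keep it apart from the get_char() above. Each extractor parses into a temporary and commits it only after the whole parse succeeded, so a failed read leaves the target untouched.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical layouts: the real classes are not shown in the question.
struct Foo { std::string name; int x = 0; int y = 0; };
struct Bar { std::string name; int n = 0; };
struct FooBar {
    enum class Kind { none, foo, bar } kind = Kind::none;
    Foo foo;
    Bar bar;
};

// Manipulator: extract one non-whitespace character, fail unless it is Expect.
template <char Expect>
std::istream& expect(std::istream& in) {
    char c;
    if (in >> c && c != Expect) {
        in.setstate(std::ios_base::failbit);
    }
    return in;
}

// Read the part of a Foo that follows the leading string.
std::istream& read_data(std::istream& in, Foo& foo, const std::string& s) {
    Foo tmp;
    tmp.name = s;
    if (in >> expect<'('> >> tmp.x >> expect<','> >> tmp.y >> expect<')'>) {
        std::swap(tmp, foo);  // commit only after everything was read
    }
    return in;
}

// Read the part of a Bar that follows the leading string.
std::istream& read_data(std::istream& in, Bar& bar, const std::string& s) {
    Bar tmp;
    tmp.name = s;
    if (in >> expect<'['> >> tmp.n >> expect<']'>) {
        std::swap(tmp, bar);
    }
    return in;
}

// Read the shared prefix, then decide on the next character without consuming it.
std::istream& operator>>(std::istream& in, FooBar& fb) {
    std::string s;
    if (in >> s) {
        FooBar tmp;
        switch ((in >> std::ws).peek()) {
        case '(':
            if (read_data(in, tmp.foo, s)) tmp.kind = FooBar::Kind::foo;
            break;
        case '[':
            if (read_data(in, tmp.bar, s)) tmp.kind = FooBar::Kind::bar;
            break;
        default:
            in.setstate(std::ios_base::failbit);
        }
        if (in) std::swap(tmp, fb);
    }
    return in;
}
```

As in the answer, this assumes whitespace between the leading string and the opening bracket, e.g. "alpha (1, 2)" or "beta [7]".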
Should one write his calls to operator>> to leave the initial data still available after a failure ?
The general answer is: no. If you failed to read, something went wrong and you give up. This might mean that you need to work harder to avoid failing, though. If you really need to back off from the position you were at to parse your data, you might want to read data first into a std::string using std::getline() and then analyze this string. Use of std::getline() assumes that there is a distinct character to stop at. The default is newline (hence the name) but you can use other characters as well:
std::getline(in, str, '!');
This would stop at the next exclamation mark and store all characters up to it in str. It would also extract the termination character but it wouldn't store it. This makes it interesting sometimes when you read the last line of a file which may not have a newline: std::getline() succeeds if it can read at least one character. If you need to know if the last character in a file is a newline, you can test if the stream reached the end of the file:
if (std::getline(in, str) && in.eof()) { std::cout << "file not ending in newline\n"; }
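To make the delimiter and end-of-file behaviour concrete, here is a small sketch. The function name split_fields and its out-parameter are made up for illustration. It splits the input on a delimiter and reports whether the last field was ended by end-of-file instead of by the delimiter.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split the input on `delim`. `ended_at_eof` is set to true when the last
// field was terminated by end-of-file rather than by the delimiter.
std::vector<std::string> split_fields(std::istream& in, char delim, bool& ended_at_eof) {
    std::vector<std::string> fields;
    std::string field;
    ended_at_eof = false;
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
        ended_at_eof = in.eof();  // eofbit is set only if getline ran into the end
    }
    return fields;
}
```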
If so, how can I do that efficiently ?
Streams are by their very nature single pass: you receive each character just once and if you skip over one you consume it. Thus, you typically want to structure your data in a way such that you don't have to backtrack. That said, this isn't always possible and most streams actually have a buffer under the hood to which characters can be returned. Since streams can be implemented by a user there is no guarantee that characters can be returned. Even for the standard streams there isn't really a guarantee.
If you want to return a character, you have to put back exactly the character you extracted:
char c;
if (in >> c && c != 'a')
in.putback(c);
if (in >> c && c != 'b')
in.unget();
The latter function has slightly better performance because it doesn't have to check that the character is indeed the one which was extracted. It also has less chances to fail. Theoretically, you can put back as many characters as you want but most streams won't support more than a few in all cases: if there is a buffer, the standard library takes care of "ungetting" all characters until the start of the buffer is reached. If another character is returned, it calls the virtual function std::streambuf::pbackfail() which may or may not make more buffer space available. In the stream buffers I have implemented it will typically just fail, i.e. I typically don't override this function.
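A one-character lookahead built on unget() might look like the sketch below. The helper name try_consume is hypothetical. Note that whitespace skipped by the formatted read is not restored, only the one character that was extracted.

```cpp
#include <istream>
#include <sstream>

// Consume the next non-whitespace character only if it equals `expected`.
// On a mismatch the character is handed back to the stream with unget().
bool try_consume(std::istream& in, char expected) {
    char c;
    if (!(in >> c)) {
        return false;  // nothing to read; the stream is now in a failed state
    }
    if (c == expected) {
        return true;
    }
    in.unget();        // may set badbit if the stream buffer cannot back up
    return false;
}
```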
If not, is there a way to "store" (and restore) the complete status of an input stream: state and data ?
If you mean to entirely restore the state you were at, including the characters, the answer is: sure there is. ...but no easy way. For example, you could implement a filtering stream buffer and put back characters as described above to restore the sequence to be read (or support seeking or explicitly setting a mark in the stream). For some streams you can use seeking but not all streams support this. For example, std::cin typically doesn't support seeking.
Restoring the characters is only half the story, though. The other stuff you want to restore are the state flags and any formatting data. In fact, if the stream went into a failed or even bad state you need to clear the state flags before the stream will do most operations (although I think the formatting stuff can be reset anyway):
std::istream fmt(0); // doesn't have a default constructor: create an invalid stream
fmt.copyfmt(in); // save the current format settings
// use in
in.copyfmt(fmt); // restore the original format settings
The function copyfmt() copies all fields associated with the stream which are related to formatting. These are:
the locale
the fmtflags
the information storage iword() and pword()
the stream's events
the exceptions
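the fill character and the tied stream (note that the stream's state, i.e. rdstate(), and its stream buffer are not copied)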
If you don't know about most of them don't worry: most stuff you probably won't care about. Well, until you need it but by then you have hopefully acquired some documentation and read about it (or asked and got a good response).
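The save and restore steps can be wrapped in a small RAII helper so the restore cannot be forgotten. This is only a sketch: the class name FormatGuard is made up, and it assumes the guarded stream does not use an exception mask (copyfmt() also copies the mask, and that step can throw if the receiving stream is already in a matching error state).

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: remembers a stream's formatting on construction
// and restores it on destruction.
class FormatGuard {
public:
    explicit FormatGuard(std::ios& stream) : stream_(stream), saved_(nullptr) {
        saved_.copyfmt(stream_);          // save the current format settings
    }
    ~FormatGuard() { stream_.copyfmt(saved_); }  // restore them
    FormatGuard(const FormatGuard&) = delete;
    FormatGuard& operator=(const FormatGuard&) = delete;
private:
    std::ios& stream_;
    std::ios saved_;  // has no stream buffer; used only as storage for the settings
};

// Example use: print a value in hex inside the guarded scope, in decimal after it.
std::string hex_then_decimal(int value) {
    std::ostringstream out;
    {
        FormatGuard guard(out);
        out << std::hex << std::showbase << value;
    }
    out << ' ' << value;  // std::hex and std::showbase are gone again
    return out.str();
}
```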
What differences are there between failbit and badbit ? When should we use one or the other ?
Finally a short and simple one:
failbit is set when formatting errors are detected, e.g. a number is expected but the character 'T' is found.
badbit is set when something goes wrong in the stream's infrastructure. For example, when the stream buffer isn't set (as in the stream fmt above) the stream has std::ios_base::badbit set. The other reason is if an exception is thrown (and caught by way of the exceptions() mask; by default all exceptions are caught).
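The two cases just described can be observed directly. The function names in this sketch are made up for illustration. The first triggers a formatting error, the second creates a stream without a stream buffer.

```cpp
#include <istream>
#include <sstream>

// Formatting error: 'T' cannot start a number, so the extraction sets failbit.
std::ios_base::iostate state_after_bad_number() {
    std::istringstream in("T");
    int n = 0;
    in >> n;
    return in.rdstate();
}

// Infrastructure error: a stream constructed without a stream buffer has badbit set.
std::ios_base::iostate state_without_buffer() {
    std::istream in(nullptr);
    return in.rdstate();
}
```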
Is there any online reference (or a book) that explains deeply how to deal with iostreams ? not just the basic stuff: the complete error handling.
Ah, yes, glad you asked. You probably want to get Nicolai Josuttis's "The C++ Standard Library". I know that this book describes all the details because I contributed to writing it. If you really want to know everything about IOStreams and locales you want Angelika Langer & Klaus Kreft's "IOStreams and Locales". In case you wonder where I got the information from originally: this was Steve Teale's "IOStreams". I don't know if this book is still in print and it is lacking a lot of the stuff which was introduced during standardization. Since I implemented my own version of IOStreams (and locales) I know about the extensions as well, though.

So here are my questions:
Q: Where is my mistake here ?
I would not call your technique a mistake. It is absolutely fine.
When you read data from a stream you normally already know the objects coming off that stream (if the objects have multiple interpretations, then that also needs to be encoded into the stream, or you need to be able to roll back the stream).
Q: Should one write his calls to operator>> to leave the initial data still available after a failure?
Failure state should be there only if something really bad went wrong.
In your case if you are expecting a foobar (that has two representations) you have a choice:
Mark the type of object that is coming in the stream with some prefix data.
In the foobar parsing section use tellg() and seekg() to restore the stream position.
Try:
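std::streampos point = is.tellg();
if (is >> foo)
{
    foobar = foo;
}
else
{
    is.clear();      // clear failbit first: seekg() does nothing on a failed stream
    is.seekg(point); // go back to where the Foo attempt started (needs a seekable stream)
    Bar bar;
    if (is >> bar)
    {
        foobar = bar;
    }
}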
Q: If so, how can I do that efficiently ?
I prefer method 1 where you know the type on the stream.
Method two can be used when this is unknowable.
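For method two, here is a self-contained sketch using a made-up type (IntOrWord is not from the question): try to read an integer and, if that fails, rewind and read a word instead. The state is cleared before seeking because seekg() has no effect on a failed stream, and the whole approach only works on seekable streams such as string or file streams.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant type: either an integer or a word.
struct IntOrWord {
    bool is_int = false;
    int number = 0;
    std::string word;
};

std::istream& read_int_or_word(std::istream& in, IntOrWord& out) {
    in >> std::ws;                                    // skip leading whitespace
    const std::istream::pos_type start = in.tellg();  // remember where we started
    IntOrWord tmp;
    if (in >> tmp.number) {
        tmp.is_int = true;
    } else {
        in.clear();       // clear failbit first, otherwise seekg() is a no-op
        in.seekg(start);  // restore characters the failed attempt consumed
        in >> tmp.word;
    }
    if (in) out = tmp;    // commit only on success
    return in;
}
```

With the input "-hello" the integer attempt consumes the '-' before it fails, so without the rewind the word would come out as "hello".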
Q: If not, is there a way to "store" (and restore) the complete status of an input stream: state and data ?
Yes but it requires two calls: see
std::ios_base::iostate state = stream.rdstate();
std::istream holder(0); // no default constructor: create a stream without a buffer
holder.copyfmt(stream);
Q: What differences are there between failbit and badbit ?
From the documentation to the call fail():
failbit: is generally set by an input operation when the error was related to the internal logic of the operation itself, so other operations on the stream may be possible.
badbit: is generally set when the error involves the loss of integrity of the stream, which is likely to persist even if a different operation is performed on the stream. badbit can be checked independently by calling member function bad.
Q: When should we use one or the other ?
You should be setting failbit.
This means that your operation failed. If you know how it failed then you can reset and try again.
badbit is when you accidentally mash internal members of the stream or do something so bad that the stream object itself is completely forked.

When you serialize your FooBar you should have a flag indicating which one it is, which will be the "header" for your write/read.
When you read it back, you read the flag then read in the appropriate datatype.
And yes, it is safest to read first into a temporary object then move the data. You can sometimes optimise this with a swap() function.

Related

How does this one stream command read in an entire file in c++?

Given this:
auto f = std::ifstream{file};
if (f) {
std::stringstream stream;
stream << f.rdbuf();
return stream.str();
}
return std::string{};
I don't see why it works.
I don't know what type f is, because it says auto, but apparently you can check that for non-zero-ness.
But when the file is large, like 2 gig, the delay in running happens in
this line:
stream << f.rdbuf();
The documentation says rdbuf() gets you the pointer to the ifstream's internal buffer. So in order for it to read the entire file, the buffer would have to size the file, and load it all in one shot. But by the time the stream << happens, rdbuf() has to already be set, or it won't be able to return a pointer.
I would expect the constructor to do that in this case, but it's obviously lazy loaded, because reading in the entire file on construction would be bad for every other use case, and the delay is in the stream << command.
Any thoughts? All other stack overflow references to reading in a file to a string always loop in some way.
If there's some buffer involved, which obviously there is, how big can it get? What if it is 1 byte, it would surely be slow.
Adorable c++ is very opaque, bad for programmers who have to know what's going on under the covers.
It's a function of how operator<< is defined on ostreams when the argument is a streambuf. As long as the streambuf isn't a null pointer, it extracts characters from the input sequence controlled by the streambuf and inserts them into *this until one of the following conditions are met (see operator<< overload note #9):
end-of-file occurs on the input sequence;
inserting in the output sequence fails (in which case the character to be inserted is not extracted);
an exception occurs (in which case the exception is caught).
Basically, the ostream (which stringstream inherits from) knows how to exercise a streambuf to pull all the data from the file it's associated with. It's an idiomatic, but as you note, not intuitive, way to slurp a whole file. The streambuf isn't actually buffering all the data here (as you note, reading the whole file into the buffer would be bad in the general case), it's just that it has the necessary connections to adjust the buffered window as an ostream asks for more (and more, and more) data.
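To show there is nothing magic going on, here is the idiom next to a hand-written loop that does roughly the same thing through the public streambuf interface. This is a sketch (the function names are made up), and real implementations of operator<< for a streambuf may copy in larger chunks.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// The idiom from the question, generalised to any input stream.
std::string slurp(std::istream& in) {
    std::ostringstream out;
    out << in.rdbuf();  // pulls characters from in's streambuf until end-of-file
    return out.str();
}

// Roughly the same by hand: the streambuf refills its own (small) internal
// buffer whenever sgetc()/snextc() run past the characters it currently holds.
std::string slurp_by_hand(std::istream& in) {
    using traits = std::char_traits<char>;
    std::string result;
    std::streambuf* buf = in.rdbuf();
    for (traits::int_type c = buf->sgetc();       // look at the current character
         !traits::eq_int_type(c, traits::eof());
         c = buf->snextc()) {                     // advance, then look again
        result.push_back(traits::to_char_type(c));
    }
    return result;
}
```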
if (f) works because ifstream has an overload for operator bool that is implicitly invoked when the "truthiness" of the ifstream is tested that tells you if the file is in a failure state.
To answer your first question first:
f is of the type that's assigned to it, an std::ifstream, but that's a rather silly way to write it. One would usually write std::ifstream f {...}. A stream has an overloaded operator bool () which gives you !fail().
As for the second question: What .rdbuf() returns is a streambuf object. This object doesn't contain the whole file contents when it is returned. Instead, it provides an interface to access data, and this interface is used by the stringstream stream.
auto f = std::ifstream{file};
Type of f is std::ifstream.
stream << f.rdbuf();
std::ifstream maintains a buffer which you can get by f.rdbuf(), and it does not load the entire file in one shot. The loading happens when the above statement runs: stringstream extracts data from that buffer, and ifstream refills the buffer as it runs out of data.
You can manually set the buffer size by calling pubsetbuf() on that stream buffer (f.rdbuf()->pubsetbuf(...)), ideally before any I/O happens; the exact effect is implementation-defined.

Overloading >> using istream

So I am trying to overload the >> operator, but in this case I am getting a null terminated string in. How do I make sure the user only inputs as many characters as fit in my dynamically allocated char[] named data? I know there could be a way where I make a temp char[] with a size very big and use a for loop to copy them in, but I want to make it without making a very big char[]. I have this code for now but I know it doesn't work because of the length allowed in my class being passed in.
std::istream & operator>>(std::istream & is, String346 & objIn) {
using std::istream;
is >> objIn.data;
return is;
}
The C++ language contains no provision to technically bar the user to "only input enough characters" for your char array. There may be some operating system-specific resources available to you, such as limiting the maximum number of characters in a text entry field, but that's outside the scope of C++.
When reading from a std::istream, your code must be prepared to handle and deal with input that does not fit your criteria. Throw an exception, exit the program after printing an error message, or read up to the maximum number of characters you can accept and ignore the extra -- in whatever manner makes sense to you. It's entirely up to you.
std::istream::get() has an overload that allows you to limit size of the input. You still need to deal with the remaining input one way or another though.
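A sketch of the "read up to the maximum and ignore the extra" option using that get() overload. The function name read_bounded_line is made up, and it assumes the stream is good on entry and capacity is at least 1.

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Read at most capacity-1 characters of the current line into buf (always
// null-terminated), then discard the rest of that line.
// Returns the number of characters stored.
std::streamsize read_bounded_line(std::istream& in, char* buf, std::streamsize capacity) {
    in.get(buf, capacity, '\n');  // stops before '\n' or after capacity-1 characters
    const std::streamsize stored = in.gcount();
    if (in.fail() && !in.bad() && !in.eof()) {
        in.clear();               // empty line: get() sets failbit when it stores nothing
    }
    if (in.good()) {
        // Throw away whatever did not fit, including the newline itself.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return stored;
}
```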

Overloading operator>> to a char buffer in C++ - can I tell the stream length?

I'm on a custom C++ crash course. I've known the basics for many years, but I'm currently trying to refresh my memory and learn more. To that end, as my second task (after writing a stack class based on linked lists), I'm writing my own string class.
It's gone pretty smoothly until now; I want to overload operator>> so that I can do stuff like cin >> my_string;.
The problem is that I don't know how to read the istream properly (or perhaps the problem is that I don't know streams...). I tried a while (!stream.eof()) loop that .read()s 128 bytes at a time, but as one might expect, it stops only on EOF. I want it to read to a newline, like you get with cin >> to a std::string.
My string class has an alloc(size_t new_size) function that (re)allocates memory, and an append(const char *) function that does that part, but I obviously need to know the amount of memory to allocate before I can write to the buffer.
Any advice on how to implement this? I tried getting the istream length with seekg() and tellg(), to no avail (it returns -1), and as I said looping until EOF (doesn't stop reading at a newline) reading one chunk at a time.
To read characters from the stream until the end of line use a loop.
char c;
while(istr.get(c) && c != '\n')
{
// Append 'c' to the end of your string.
}
// If you want to put the '\n' back onto the stream
// use istr.unget() here
// But I think its safe to say that dropping the '\n' is fine.
If you run out of room reallocate your buffer with a bigger size.
Copy the data across and continue. No need to be fancy for a learning project.
You can use std::cin.getline(buffer, buffer_size);
then you will need to check for bad, eof and fail flags:
std::cin.bad(), std::cin.eof(), std::cin.fail()
unless bad or eof were set, fail flag being set usually indicates buffer overflow, so you should reallocate your buffer and continue reading into the new buffer after calling std::cin.clear()
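The clear-and-continue loop described above could look like this sketch. The function name is made up, and std::string stands in for your own class's append() so that the example is self-contained.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Read one line of any length through a small fixed buffer.
// Returns false when nothing could be read (end of input or a broken stream).
bool read_line_chunked(std::istream& in, std::string& line) {
    line.clear();
    char buf[16];                     // deliberately tiny to force several rounds
    for (;;) {
        buf[0] = '\0';
        in.getline(buf, sizeof buf);  // stores at most 15 characters plus '\0'
        line += buf;                  // your class would call append(buf) here
        if (in.bad()) return false;   // the stream itself is broken
        if (!in.fail()) return true;  // hit '\n', or EOF after reading something
        if (in.eof()) return false;   // nothing was left to read
        in.clear();                   // buffer was full: clear failbit and go on
    }
}
```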
A side note: In the STL the operator>> of an istream is overloaded to provide this kind of functionality or (as for char*) provided as global functions. Maybe it would be wiser to provide a custom overload instead of overloading the operator in your class.
Check Jerry Coffin's answer to this question.
The first method he used is very simple (just a helper class) and allow you to write your input in a std::vector<std::string> where each element of the vector represents a line of the original input.
That really makes things easy when it comes to processing afterwards!

How do I set EOF on an istream without reading formatted input?

I'm doing a read in on a file character by character using istream::get(). How do I end this function with something to check if there's nothing left to read in formatted in the file (eg. only whitespace) and set the corresponding flags (EOF, bad, etc)?
Construct an istream::sentry on the stream. This will have a few side effects, the one we care about being:
If its skipws format flag is set, and the constructor is not passed true as second argument (noskipws), all leading whitespace characters (locale-specific) are extracted and discarded. If this operation exhausts the source of characters, the function sets both the failbit and eofbit internal state flags
You can strip any amount of leading (or trailing, as it were) whitespace from a stream at any time by reading to std::ws. For instance, if we were reading a file from STDIN, we would do:
std::cin >> std::ws;
Credit to this comment on another version of this question, asked four years later.
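As a sketch, the std::ws approach can be wrapped in a helper (the name only_whitespace_left is made up). The early return is there because, on standard libraries that follow the current specification, applying std::ws to a stream that already has eofbit set will set failbit as well.

```cpp
#include <istream>
#include <sstream>

// Skip any remaining whitespace and report whether the input is exhausted.
// After a true result the stream has eofbit set.
bool only_whitespace_left(std::istream& in) {
    if (!in.good()) {
        return in.eof();  // already at the end, or in an error state
    }
    in >> std::ws;        // sets eofbit (not failbit) if it runs into the end
    return in.eof();
}
```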
How do I end this function with something to check if there's nothing left to read in formatted in the file (eg. only whitespace)?
Whitespace characters are characters in the stream. You cannot assume that the stream will do intelligent processing for you. Until and unless, you write your own filtering stream.
By default, all of the formatted extraction operations (overloads of operator>>()) skip over whitespace before extracting an item -- are you sure you want to part ways with this approach?
If yes, then you could probably achieve what you want by deriving a new class, my_istream, from istream, and overriding each operator>>() to call the following method at the end:
void skip_whitespace() {
char ch;
ios_base::fmtflags old_flags = flags(ios_base::skipws);
*this >> ch; // Skips over whitespace to read a character
flags(old_flags);
if (*this) { // I.e. not at end of file and no errors occurred
unget();
}
}
It's quite a bit of work. I'm leaving out a few details here (such as the fact that a more general solution would be to override the class template basic_istream<CharT, Traits>).
istream is not going to help a lot - it functions as designed. However, it delegates the actual reading to streambufs. If your streambuf wrapper trims trailing whitespace, an istream reading from that streambuf won't notice it.

What is the best way to do input validation in C++ with cin?

My brother recently started learning C++. He told me a problem he encountered while trying to validate input in a simple program. He had a text menu where the user entered an integer choice, if they entered an invalid choice, they would be asked to enter it again (do while loop). However, if the user entered a string instead of an int, the code would break.
I read various questions on stackoverflow and told him to rewrite his code along the lines of:
#include <iostream>
#include <limits>
using namespace std;
int main()
{
int a;
do
{
cout<<"\nEnter a number:";
cin>>a;
if(cin.fail())
{
//Clear the fail state.
cin.clear();
//Ignore the rest of the wrong user input, till the end of the line.
cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
}while(true);
return 0;
}
While this worked ok, I also tried a few other ideas:
1. Using a try catch block. It didn't work. I think this is because an exception is not raised due to bad input.
2. I tried if(! cin){//Do Something} which didn't work either. I haven't yet figured this one out.
3. Thirdly, I tried inputting a fixed length string and then parsing it. I would use atoi(). Is this standards compliant and portable? Should I write my own parsing function?
4. If write a class that uses cin, but dynamically does this kind of error detection, perhaps by determining the type of the input variable at runtime, would it have too much overhead? Is it even possible?
I would like to know what is the best way to do this kind of checking, what are the best practices?
I would like to add that while I am not new to writing C++ code, I am new to writing good standards compliant code. I am trying to unlearn bad practices and learn the right ones. I would be much obliged if answerers give a detailed explanation.
EDIT: I see that litb has answered one of my previous edits. I'll post that code here for reference.
#include <iostream>
#include <limits>
using namespace std;
int main()
{
int a;
bool inputCompletionFlag = true;
do
{
cout<<"\nEnter a number:";
cin>>a;
if(cin.fail())
{
//Clear the fail state.
cin.clear();
//Ignore the rest of the wrong user input, till the end of the line.
cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
else
{
inputCompletionFlag = false;
}
}while(!inputCompletionFlag);
return 0;
}
This code fails on input like "1asdsdf". I didn't know how to fix it but litb has posted a great answer. :)
Here is code you could use to make sure you also reject things like
42crap
Where non-number characters follow the number. If you read the whole line and then parse it and execute actions appropriately it will possibly require you to change the way your program works. If your program read your number from different places until now, you then have to put one central place that parses one line of input, and decides on the action. But maybe that's a good thing too - so you could increase the readability of the code that way by having things separated: Input - Processing - Output
Anyway, here is how you can reject the number-non-number of above. Read a line into a string, then parse it with a stringstream:
std::string getline() {
std::string str;
std::getline(std::cin, str);
return str;
}
int choice;
std::istringstream iss(getline());
iss >> choice;
if(!iss.eof()) iss >> std::ws; // std::ws may set failbit if eofbit is already set
if(iss.fail() || !iss.eof()) {
// handle failure
}
It eats all trailing whitespace. When it hits the end-of-file of the stringstream while reading the integer or trailing whitespace, then it sets the eof-bit, and we check that. If it failed to read any integer in the first place, then the fail or bad bit will have been set.
Earlier versions of this answer used std::cin directly - but std::ws won't work well together with std::cin connected to a terminal (it will block instead waiting for the user to input something), so we use a stringstream for reading the integer.
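The same check packaged as a reusable function might look like the sketch below (the name parse_int_line is made up). It only applies std::ws when the number did not already run up to the end of the line, because on some standard libraries std::ws sets failbit when eofbit is already set.

```cpp
#include <sstream>
#include <string>

// Accept a line only if it holds exactly one integer, optionally surrounded
// by whitespace. `value` is left unchanged when the line is rejected.
bool parse_int_line(const std::string& line, int& value) {
    std::istringstream iss(line);
    int parsed = 0;
    if (!(iss >> parsed)) {
        return false;    // no number at all, or out of range
    }
    if (!iss.eof()) {
        iss >> std::ws;  // eat trailing whitespace
    }
    if (!iss.eof()) {
        return false;    // something other than whitespace follows the number
    }
    value = parsed;
    return true;
}
```

With this, "42" and "  7  " are accepted while "42crap", "4 2" and an empty line are rejected.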
Answering some of your questions:
Question: 1. Using a try catch block. It didn't work. I think this is because an exception is not raised due to bad input.
Answer: Well, you can tell the stream to throw exceptions when you read something. You use the istream::exceptions function, which you tell for which kind of error you want to have an exception thrown:
iss.exceptions(ios_base::failbit);
I have never used it. If you do that on std::cin, you will have to remember to restore the flags for other readers that rely on it not throwing. I find it way easier to just use the functions fail and bad to ask for the state of the stream.
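For completeness, a sketch of the exception-based variant on a local stringstream, which avoids the problem of having to restore the mask on std::cin. The function name is made up.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Parse an int from `text`, letting the stream report errors by throwing.
// `value` is left unchanged when parsing fails.
bool parse_int_throwing(const std::string& text, int& value) {
    std::istringstream iss(text);
    // From now on, setting failbit or badbit throws std::ios_base::failure.
    iss.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        int parsed = 0;
        iss >> parsed;  // throws if no integer can be extracted
        value = parsed;
        return true;
    } catch (const std::ios_base::failure&) {
        return false;
    }
}
```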
Question: 2. I tried if(!cin){ //Do Something } which didn't work either. I haven't yet figured this one out.
Answer: That could come from the fact that you gave it something like "42crap". For the stream, that is completely valid input when doing an extraction into an integer.
Question: 3. Thirdly, I tried inputting a fixed length string and then parsing it. I would use atoi(). Is this standards compliant and portable? Should I write my own parsing function?
Answer: atoi is Standard Compliant. But it's not good when you want to check for errors: unlike other functions, it does no error checking. If you have a string and want to check whether it contains a number, then do it like in the initial code above.
There are C-like functions that can read directly from a C-string. They exist to allow interaction with old, legacy code and writing fast performing code. One should avoid them in programs because they work rather low-level and require using raw naked pointers. By their very nature, they can't be enhanced to work with user defined types either. Specifically, this talks about the function "strtol" (string-to-long) which is basically atoi with error checking and capability to work with other bases (hex for example).
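If you do end up with strtol, the error checking looks roughly like this sketch (the wrapper name parse_long is made up). It rejects empty input, trailing junk and out-of-range values.

```cpp
#include <cerrno>
#include <cstdlib>

// Convert `text` to long in the given base. `value` is left unchanged on failure.
bool parse_long(const char* text, long& value, int base = 10) {
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(text, &end, base);
    if (end == text) return false;      // no digits were consumed
    if (errno == ERANGE) return false;  // value does not fit into a long
    if (*end != '\0') return false;     // trailing characters such as "42crap"
    value = parsed;
    return true;
}
```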
Question: 4. If I write a class that uses cin, but dynamically do this kind of error detection, perhaps by determining the type of the input variable at runtime, will it have too much overhead? Is it even possible?
Answer: Generally, you don't need to care too much about overhead here (if you mean runtime-overhead). But it depends specifically on where you use that class. That question will be very important if you are writing a high performance system that processes input and needs to have high throughput. But if you need to read input from a terminal or a file, you already see what this comes down to: Waiting for the user to input something takes really so long, you don't need to watch runtime costs at this point anymore on this scale.
If you mean code overhead - well it depends on how the code is implemented. You would need to scan the string that you read - whether it contains a number or not, or whether it is some arbitrary string. Depending on what you want to scan (maybe you have a "date" input, or a "time" input format too; look into boost.date_time for that), your code can become arbitrarily complex. For simple things like classifying between number or not, I think you can get away with a small amount of code.
This is what I do with C but it's probably applicable for C++ as well.
Input everything as a string.
Then, and only then, parse the string into what you need. It's sometimes better to code your own than try to bend someone else's to your will.
In order to get the exceptions with iostreams you need to set the proper exception flag for the stream.
And I would use std::getline to get the whole line of input and then handle it accordingly: use lexical_cast, regular expressions (for example Boost Regex or Boost Xpressive), parse it with Boost Spirit, or just use some kind of appropriate logic.
What I would do is twofold: First, try to validate the input, and extract the data, using a regular expression, if the input is somewhat not trivial. It can be very helpful also even if the input is just a series of numbers.
Then, I like to use boost::lexical_cast, which can raise a bad_lexical_cast exception if the input cannot be converted.
In your example:
std::string in_str;
cin >> in_str;
// optionally, test if it conforms to a regular expression, in case the input is complex
// Convert to int? this will throw bad_lexical_cast if cannot be converted.
int my_int = boost::lexical_cast<int>(in_str);
Forget about using formatted input (the >> operator) directly in real code. You will always need to read raw text with std::getline or similar and then use your own input parsing routines (which may use the >> operator) to parse the input.
How about a combination of the various approaches:
Snag the input from std::cin using std::getline(std::cin, strObj) where strObj is a std::string object.
Use boost::lexical_cast to perform a lexical translation from strObj to either a signed or unsigned integer of largest width (e.g., unsigned long long or something similar)
Use boost::numeric_cast to cast the integer down to the expected range.
You could just fetch the input with std::getline and then call boost::lexical_cast to the appropriately narrow integer type as well depending on where you want to catch the error. The three step approach has the benefit of accepting any integer data and then catch narrowing errors separately.
I agree with Pax, the simplest way to do this is to read everything as string, then use TryParse to verify the input. If it is in the right format, then proceed, otherwise just notify the user and use continue on the loop.
One thing that hasn't been mentioned yet is that it is usually important that you test to see if the cin >> operation worked before using the variable that supposedly got something from the stream.
This example is similar to yours, but makes that test.
#include <iostream>
#include <limits>
using namespace std;
int main()
{
while (true)
{
cout << "Enter a number: " << flush;
int n;
if (cin >> n)
{
// do something with n
cout << "Got " << n << endl;
}
else
{
cout << "Error! Ignoring..." << endl;
cin.clear();
cin.ignore(numeric_limits<streamsize>::max(), '\n');
}
}
return 0;
}
This will use the usual operator >> semantics; it will skip whitespace first, then try to read as many digits as it can and then stop. So "42crap" will give you the 42 then skip over the "crap". If that isn't what you want, then I agree with the previous answers, you should read it into a string and then validate it (perhaps using a regular expression - but that may be overkill for a simple numeric sequence).