Input validation for integer - c++

I am having trouble with some basic input validation. I have done a lot of searching but can't seem to find an answer that works with my code. I am trying to validate integer input. I am able to test whether a char was entered or not, but I need another check to test whether the number is actually an int and not a double or float.
do
{
cout << "\nHow many numbers would you like to enter?" << endl;
cin >> size;
if (size < 1 || cin.fail())
{
cout << "\nInvalid input" << endl;
cin.clear();
cin.ignore(numeric_limits<streamsize>::max(), '\n');
}
}while(size < 1 || cin.fail());

Too late. By the time operator>> does its job, the horse has left the barn already. You have to figure it out before you actually parse the value.
If you tell operator>> to extract a value into an integer, operator>> will first skip any whitespace, then parse an optional sign, then as many digits as it sees. And that's what you will get.
If the last digit is followed by a ".", operator>> doesn't care. It stops, because you told it to parse an integer, and calls it a day.
Similarly, if you tell operator>> to extract a value into one of the floating point types (float, double, long double), operator>> will parse anything that looks like a floating point number, and put it into the floating point type instance. Since an integer looks like a perfectly valid floating point number, either an integer or a floating point number will be parsed and stored equally well.
You have no indication of what was actually parsed, except the value itself.
So, there are two ways of doing this. One way is to always parse the value into a floating point number, then check if the parsed number is a whole number, or contains any fractional part, then proceed accordingly.
Or, you could parse this yourself. Use get() to parse one character at a time. Ignore any leading whitespace, then collect everything that follows: an optional minus sign, zero or more digits, an optional period followed by zero or more digits, and an optional "e" followed by an exponent.
If you collected a period or an "e", as part of this process, you know that you just parsed a floating point number, otherwise it's going to be an integer. Then, take everything that's been collected, and then parse it, with std::istringstream's assistance, into your selected value.
(Technically, "e" does not necessarily indicate a floating point value; for example 1e3 will happily live inside an int, it's up to you to decide how you wish to handle this case).
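The first approach (parse as floating point, then check for a fractional part) could be sketched roughly as follows. The helper name parse_whole_number and the 2^53 range guard are assumptions made for this illustration, not something from the question; treat it as a starting point rather than a finished validator.

```cpp
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: returns true and stores the value in `result` if
// `text` holds exactly one number with no fractional part.
bool parse_whole_number(const std::string& text, long long& result)
{
    std::istringstream in(text);
    double value = 0.0;

    // Reject input that does not start with a number at all.
    if (!(in >> value)) {
        return false;
    }

    // Skip trailing whitespace; anything else left over (as in "12abc")
    // means the token was not a plain number.
    in >> std::ws;
    if (!in.eof()) {
        return false;
    }

    // Reject values with a fractional part, such as 3.5.
    if (std::floor(value) != value) {
        return false;
    }

    // Assumed guard: beyond 2^53 a double cannot represent every integer,
    // so such values are refused instead of being converted.
    if (std::fabs(value) > 9007199254740992.0) {
        return false;
    }

    result = static_cast<long long>(value);
    return true;
}
```

Note that this accepts "3.0" and "1e3" as whole numbers, which is the judgment call mentioned above. If those should be rejected, the character-by-character approach is the one to use.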

I think you could read the input first then do the validation afterward.
If you can use a C++11 compiler, and the application only accepts non-negative numbers without a sign symbol such as "+" or "-", you could try something like the following code:
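#include <iostream>
#include <regex>
#include <string>

int main()
{
    // Matches one or more decimal digits and nothing else.
    const std::regex integers(R"(\d+)");
    std::string input;
    // Read one whitespace-delimited token at a time; a std::string avoids
    // overflowing a fixed-size buffer. The loop ends at end of input.
    while (std::cin >> input)
    {
        if (std::regex_match(input, integers))
        {
            std::cout << input << " is an integer\n";
        }
        else
        {
            std::cout << input << " is not an integer\n";
        }
    }
    return 0;
}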

Related

C++ Numerical Input Validation

I'm used to Python and I'm now learning C++ which is a bit more complicated to me. How can I modify the input so that it fits the description in the comment at the top of my code? I tried to include if (input[0]!='.' &&...) but it just returns 0. I want it to be included as part of the number. Same with the characters after the first character of the input.
I also don't know how I can separate numbers with more than three digits (starting from the end of the number obviously) with a comma (so 1000000 should be returned as 1,000,000).
/*
* The first character can be a number, +, -, or a decimal point
* All other characters can be numeric, a comma or a decimal point
* Any commas must be in their proper location (ie, separating hundreds from thousands, from millions, etc)
* No commas after the decimal point
* Only one decimal point in the number
*
*/
#include <iostream>
#include <cmath>
#include <climits>
#include <string>
int ReadInt(std::string prompt);
int ReadInt(std::string prompt)
{
std::string input;
std::string convert;
bool isValid=true;
do {
isValid=true;
std::cout << prompt;
std::cin >> input;
if (input[0]!='.' && input[0]!='+' && input[0]!='-' && isdigit(input[0]) == 0) {
std::cout << "Error! Input was not an integer.\n";
isValid=false;
}
else {
convert=input.substr(0,1);
}
long len=input.length();
for (long index=1; index < len && isValid==true; index++) {
if (input[index]==',') {
;
}
else if (isdigit(input[index]) == 0){
std::cout << "Error! Input was not an integer.\n";
isValid=false;
}
else if (input[index] == '.') {
;
}
else {
convert += input.substr(index,1);
}
}
} while (isValid==false);
int returnValue=atoi(convert.c_str());
return returnValue;
}
int main()
{
int x=ReadInt("Enter a value: ");
std::cout << "Value entered was " << x << std::endl;
return 0;
}
Writing parsing code is tricky. When parsing, it is easy to make a mess of the control flow. I suggest separating the I/O from the validation code: make a separate function bool IsValidInt(const std::string& s) that returns whether s is a valid input, and do the prompts outside in a calling function.
It helps to think through systematically for every character what constitutes a valid input. If you are familiar with regexes, like cigien suggested, that may be a good way of organizing your approach even if you end up writing the parsing code by hand instead of using a regex library.
Here are the requirements that you posted:
* The first character can be a number, +, -, or a decimal point
* All other characters can be numeric, a comma or a decimal point
* Any commas must be in their proper location (ie, separating hundreds
from thousands, from millions, etc)
* No commas after the decimal point
* Only one decimal point in the number
That's a lot of logic, but it's doable. It sounds like you are working on this as an exercise to master C++ basics, so I won't post any code. Instead, here's an outline for how I would approach this:
Test that the first character is either 0-9, +, -, or decimal point. If it's not, return invalid.
Search the string for whether it has a decimal point. If it does, remember its position.
Loop over the remaining characters, in reverse starting from the last character.
Separate from the loop index, make a counter that says what the current digit place is (... -1 for tenths, 0 for ones, 1 for tens, 2 for hundreds, ...). If the string has a decimal point, use that vs. the string length to determine the digit place of the last character.
If a character is a comma, check that it is in a valid location compared to the current digit place. If not, return invalid.
Otherwise, if the character is a decimal point, it must be the one identified earlier. If not, that means there are multiple decimal points, so return invalid.
Otherwise, the character must be a digit 0-9, and the digit place counter should be incremented. If the character is not a digit, return invalid.
Finally, if the loop makes it all the way through without hitting an error, return that the string is valid.
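For readers who want to compare against a worked version afterwards, here is one possible sketch of that outline. The name IsValidNumber and the choice to count digits per comma group (instead of tracking an absolute digit place) are assumptions of this sketch; it also accepts forms like "5." and ".5", which the stated rules do not rule out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical validator following the outline above.
bool IsValidNumber(const std::string& s)
{
    if (s.empty()) {
        return false;
    }
    // An optional sign may appear only as the first character.
    const std::size_t start = (s[0] == '+' || s[0] == '-') ? 1 : 0;

    // Remember where the first decimal point is, if there is one.
    const std::size_t dot = s.find('.');
    const std::size_t intEnd = (dot == std::string::npos) ? s.size() : dot;

    std::size_t digits = 0;

    // After the decimal point only digits are allowed, which rejects both
    // commas there and any second decimal point.
    for (std::size_t i = intEnd + 1; i < s.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) {
            return false;
        }
        ++digits;
    }

    // Walk the integer part right to left, counting digits per group.
    std::size_t group = 0;   // digits seen since the last comma
    bool sawComma = false;
    for (std::size_t i = intEnd; i > start; --i) {
        const char c = s[i - 1];
        if (c == ',') {
            // A comma must sit directly left of a full group of three.
            if (group != 3) {
                return false;
            }
            sawComma = true;
            group = 0;
        } else if (std::isdigit(static_cast<unsigned char>(c))) {
            ++group;
            ++digits;
        } else {
            return false;
        }
    }

    // When commas are used, the leftmost group must hold 1 to 3 digits.
    if (sawComma && (group < 1 || group > 3)) {
        return false;
    }
    // A lone sign or a lone decimal point is not a number.
    return digits > 0;
}
```

Keeping the function free of any prompting or printing means the calling code can loop on it until the user supplies a valid string.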

Why doesn't this function print all the char array as it takes it?

I was trying to convert a char array to an integer, and the atoi function works properly except when I put a zero at the first index: the zero isn't printed.
#include <cstdlib>
#include <iostream>
using namespace std;
int main()
{
char arr[]= "0150234";
int num;
num=atoi(arr);
cout << num;
return 0;
}
I expect the output of 0150234 but the actual output is 150234
atoi converts the string to an integer, and an integer value carries no leading zeros, so the 0 is dropped. By default an int is never printed with a leading 0.
000001 will always be represented as 1.
I hope this clears your doubt.
Binary number representations (such as int) do not store leading 0s because there is an infinite number of them. Rather they store a fixed number of bits which may have some leading 0 bits.
You can still print the leading 0s if necessary (std::setw and std::setfill are declared in <iomanip>):
std::cout << std::setw(4) << std::setfill('0') << 1 << '\n';
Output:
0001
You're confusing two ideas:
Numbers: These are abstract things. They're quantities. Your computer stores the number in a manner that you should not care about (though it's probably binary).
Representations: These are ways we display numbers to humans, like "150234", or "0x24ADA", or "one hundred and fifty thousand, two hundred and thirty four". You pick a representation when you convert to a string. When streaming to std::cout a representation is picked for you by default, but you can choose your own representation using I/O manipulators, as Maxim shows.
The variable num is a number, not a representation of a number. It does not contain the information «display this as "0150234"». That's what arr provides, because it is a string, containing a representation of a number. So, if that leading zero in the original representation is important to you, when you print num, you have to reproduce that representation yourself.
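As a small illustration of "reproducing the representation yourself", the sketch below formats a number with zero padding up to a chosen width. The helper name zero_padded is made up for this example; the width would typically be the length of the original string.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats `value` in decimal, padded with '0' to at
// least `width` characters. std::internal places the padding between a
// minus sign and the digits, so -5 becomes "-005" and not "00-5".
std::string zero_padded(int value, int width)
{
    std::ostringstream out;
    out << std::internal << std::setfill('0') << std::setw(width) << value;
    return out.str();
}
```

With the array from the question, zero_padded(std::atoi(arr), 7) gives back "0150234", because the seven-character width comes from the original text and not from the int.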
By the way…
Usually, in the programming world, and particularly in C-like source code:
When we see a string like "150234" we assume that it is the decimal (base-10) representation of a number;
When we see a string like "0x24ADA" (with a leading 0x) we assume that it is the hexadecimal (base-16) representation of a number;
When we see a string like "0150234" (with a leading 0) we assume that it is the octal (base-8) representation of a number.
So, if you do add a leading zero, you may confuse your users.
FYI the conventional base-8 representation of your number is "0445332".
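To see the prefix conventions in action, here is a short sketch using std::strtol with base 0, which picks the base from the prefix, and the std::oct manipulator for output. The function names parse_auto_base and to_octal are invented for this illustration.

```cpp
#include <cstdlib>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: base 0 tells strtol to infer the base from the
// prefix ("0x" means hexadecimal, a leading "0" means octal, else decimal).
long parse_auto_base(const std::string& text)
{
    return std::strtol(text.c_str(), nullptr, 0);
}

// Hypothetical helper: formats a non-negative value in octal; showbase
// adds the conventional leading 0.
std::string to_octal(long value)
{
    std::ostringstream out;
    out << std::showbase << std::oct << value;
    return out.str();
}
```

Here parse_auto_base("0150234") yields 53404, not 150234, which is exactly the confusion a leading zero can cause, while to_octal(150234) yields "0445332".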

Explain how cin works when a decimal value is passed to an integer variable

int a, b;
cin >> a >> b;
cout << a << endl << b;
input1: 3.5 5.5
input2: 3 5.5
check this code
The behaviour of your code is undefined up to and including C++03. The stream halts on the '.'. From C++11 onwards b is set to 0; prior to that it was not modified. Currently you read its value in the fail case, which is careless.
A good fix is to always write something like
if (cin >> a >> b){
// yes, all good
} else {
// a parsing error occurred
}
On the true branch, values are guaranteed to have been written to a and b.
It reads:
spaces/tabs/newlines (just consumes that if any)
digits till something different (the dot in your case) and parse them as the number
So a becomes 3.
Then, when it tries to read the second number it is still at the '.', but a dot is neither whitespace nor a digit, so it does not consume any character, assigns 0 to b and sets the failbit.
Thanks to @tobi303 for the specs link:
(until C++11) If extraction fails (e.g. if a letter was entered where a digit is expected), value is left unmodified and failbit is set.
(since C++11) If extraction fails, zero is written to value and failbit is set.
The input is not a decimal value; it's a text string, and the code will translate that text string into an integer value. So, what's the integer value that the string "3.5" represents? It's 3, just as if the input had been "3 5": the code that translates the text reads as much of the text as makes sense, then stops. What happens after that depends on what caused the translation to stop. If it hit a whitespace character, all is well. If it hit something else (in this case, a .), you're in trouble.
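Both inputs from the question can be tried without typing them in by reading from a std::istringstream instead of cin. The struct TwoInts and the function read_two_ints below are names made up for this sketch; the value reported for b after a failure assumes a C++11 or later standard library.

```cpp
#include <sstream>
#include <string>

// Hypothetical result type: the two values plus whether both reads worked.
struct TwoInts {
    int a;
    int b;
    bool ok;
};

// Hypothetical helper: performs the same `>> a >> b` as the question, but
// on a string, and records the stream state afterwards.
TwoInts read_two_ints(const std::string& text)
{
    std::istringstream in(text);
    TwoInts r = {-1, -1, false};
    // The stream converts to false once failbit (or badbit) is set.
    r.ok = static_cast<bool>(in >> r.a >> r.b);
    return r;
}
```

For "3.5 5.5" this gives a = 3 and ok = false, with b set to 0 by the failed extraction. For "3 5.5" both reads succeed (a = 3, b = 5) and the ".5" is simply left unread in the stream.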

In C++, regarding cin and inputs, is this a bug or am I doing this wrong? [duplicate]

I have a problem about cin.
int main(void)
{
int a;
float b;
cin >> a >> b;
}
When I give a floating number (such as 3.14) as input, neither a nor b get the complete value (3.14): the output is a=3, b=0.14.
I know that cin will split the input by space, tab or Return, but 'dot' will not, right?
And why will the following code work?
int main(void)
{
int i=0;
int k=0;
float j=0;
cin >> i >> k >> j; // i =3, j=k=0
}
And one more problem, what benefit will compiler do this for us?
Thanks!
You've declared a to be of type int, in which case what do you expect it to do with the "."?
failbit The input obtained could not be interpreted as an element of
the appropriate type. Notice that some eofbit cases will also set
failbit.
The other snippet you mention works fine, so what is the issue you are asking about? You've read into 3 variables; what did you expect here? 0 is valid for float or int.
cin >> a >> b;
Given an input of 3.14, the first parse stops on the dot (period) because a dot doesn't fit the syntax for an integer. The second parse picks up with .14, which parses just fine.
cin >> i >> k >> j;
This is problematic with an input of 3.14. The first parse once again stops on the dot. The second parse can't restart with the dot, so it marks the input stream as failed.
Always check status when doing I/O.
The formatted input functions work quite simply:
They skip leading whitespace if any.
They try to read a format matching the type given.
If reading the value fails because the data doesn't match the required format, they set std::ios_base::failbit. Before C++11, a failed read left the target variable unchanged; since C++11 the standard input operators for arithmetic types write zero to it on a parse failure (user-defined input operators may behave differently).
The first value you try to read is an int. Reading an int means that an optional leading sign is read, followed by a sequence of digits (where, depending on your settings and the value given, the stream may read octal or hexadecimal numbers rather than decimal ones). That is, the int receives the value 3 and reading stops right in front of the '.'.
Depending on what you read next, the next read fails or doesn't:
In the first code you try to read a floating point value, which starts with an optional sign, followed by an optional integral part, followed by an optional decimal point, followed by an optional fractional part, followed by an optional exponent. At least one digit is required in either the integral or the fractional part. In your example, there is only a decimal point followed by a fractional part.
In the second code you try to read an integer; a '.' is found, which isn't a valid part of an int, and reading fails.
After attempting to read a value, you should always check whether the read operation was successful and report potential errors:
if (in >> value) {
std::cout << "successfully read '" << value << "'\n";
}
else {
std::cerr << "failed to read a value from input\n";
}
Note, that after a failed read you may need to clean up as well, e.g., using
in.clear();
in.ignore();
This first clears the error flags (without this, the stream would ignore any further attempts to read the data) and then it ignores the next character.
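Putting the check and the cleanup together, a retry loop might look like the sketch below. The helper name read_int is hypothetical, and discarding the rest of the line (not just one character) is a choice made for this sketch, on the assumption that input arrives line by line.

```cpp
#include <istream>
#include <limits>
#include <sstream>   // only for the std::istringstream in the usage note

// Hypothetical helper: keeps trying until an int is extracted. Returns
// false if the input ends or the stream breaks before that happens.
bool read_int(std::istream& in, int& value)
{
    while (!(in >> value)) {
        // Nothing more can be read: give up instead of looping forever.
        if (in.eof() || in.bad()) {
            return false;
        }
        // Clear failbit so the stream accepts further reads, then drop
        // the rest of the offending line.
        in.clear();
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return true;
}
```

Called on a std::istringstream holding "abc\n42\n", the first call skips the "abc" line and stores 42; a second call returns false because the input is exhausted. The same function works with std::cin.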
cin is reading input "3.14" and you ask to put it into an integer and then a float.
So cin starts reading, finds "3", then finds "." which is not part of an integer. It stores 3 into a and continues. ".14" is a valid float, so it puts that into b.
In the second snippet you ask it to read int, int, float. The second integer is not matched and cin stops, so it only appears to be working. Actually, it failed.
The compiler can't warn you that you're doing something which will not work, because the input is not known at compile time.
