Test if string represents "yyyy-mm-dd" - c++

I am working on a program that takes two command line arguments. Both arguments should be dates of the form yyyy-mm-dd. Since other folks will be using this program and it will be requesting from mysql, I want to make sure that the command line arguments are valid. My original thought was to loop over each element of the incoming string and perform some kind of test on it. The '-' would be easy to check but I'm not so sure how to handle the digits, and to distinguish them between ints and chars. Also, I need the first date to be "less than or equal to" the second but I'm pretty sure I can handle that.

If you can use the Boost library, you could simply do it like this:
string input("2015-11-12");
string format("%Y-%m-%d");
date parsedDate = parser.parse_date(input, format, svp);
You can read more about this here.
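If you would rather avoid Boost, you can try strptime. Note that it is a POSIX function rather than standard C++, so it is not available everywhere (MSVC, for example, does not provide it). It returns a pointer to the first character it did not consume, so also check that this is the end of the string; otherwise trailing garbage such as the extra 3 below is accepted:
struct tm tm;
std::string s("2015-11-123");
const char *end = strptime(s.c_str(), "%Y-%m-%d", &tm);
if (end && *end == '\0')
    std::cout << "Valid date" << std::endl;
else
    std::cout << "Invalid date" << std::endl;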
Additionally you can do a simple check to see if the date is valid, and is not for example 2351-20-35. A simple solution would be:
bool isleapyear(unsigned short year){
return (!(year%4) && (year%100)) || !(year%400);
}
//1 valid, 0 invalid
bool valid_date(unsigned short year,unsigned short month,unsigned short day){
unsigned short monthlen[]={31,28,31,30,31,30,31,31,30,31,30,31};
if (!year || !month || !day || month>12)
return 0;
if (isleapyear(year) && month==2)
monthlen[1]++;
if (day>monthlen[month-1])
return 0;
return 1;
}
Source: http://www.cplusplus.com/forum/general/3094/
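For the exact "yyyy-mm-dd" shape asked about in the question, the character-by-character check and the calendar check above can be combined without any library beyond the standard one. The following is only a sketch: the names is_leap_year, is_valid_iso_date and is_valid_range are made up for illustration, and year 0000 is rejected simply because the valid_date function above rejects it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Gregorian leap-year rule, the same logic as isleapyear above.
bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// True only if s has the exact shape "yyyy-mm-dd" and names a real calendar date.
bool is_valid_iso_date(const std::string& s) {
    if (s.size() != 10 || s[4] != '-' || s[7] != '-') return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i == 4 || i == 7) continue;  // the two dashes were checked above
        // Cast first: std::isdigit requires a value in the unsigned char range.
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    // Every remaining character is a digit, so std::stoi cannot throw here.
    const int year  = std::stoi(s.substr(0, 4));
    const int month = std::stoi(s.substr(5, 2));
    const int day   = std::stoi(s.substr(8, 2));
    if (year == 0 || month < 1 || month > 12 || day < 1) return false;
    static const int month_len[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int limit = month_len[month - 1];
    if (month == 2 && is_leap_year(year)) ++limit;
    return day <= limit;
}

// Zero-padded ISO dates sort chronologically when compared as plain strings.
bool is_valid_range(const std::string& first, const std::string& second) {
    return is_valid_iso_date(first) && is_valid_iso_date(second) && first <= second;
}
```

Because both dates are zero-padded and ordered year, month, day, the "less than or equal to" requirement reduces to an ordinary string comparison once both arguments have been validated.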

Related

bulletproof use of from_chars()

I have some literal strings which I want to convert to integer and even double. The base is 16, 10, 8, and 2.
At this time, I wonder about the behavior of std::from_chars(): I try to convert, and the error code in the returned from_chars_result reports success even when the input is not a valid number, as shown here:
#include <iostream>
#include <string_view>
#include <charconv>
using namespace std::literals::string_view_literals;
int main()
{
auto const buf = "01234567890ABCDEFG.FFp1024"sv;
double d;
auto const out = std::from_chars(buf.begin(), buf.end(), d, std::chars_format::hex);
if(out.ec != std::errc{} || out.ptr != buf.end())
{
std::cerr << buf << '\n'
<< std::string(std::distance(buf.begin(), out.ptr), ' ') << "^- here\n";
auto const ec = std::make_error_code(out.ec);
std::cerr << "err: " << ec.message() << '\n';
return 1;
}
std::cout << d << '\n';
}
gives:
01234567890ABCDEFG.FFp1024
                 ^- here
err: Success
For convenience also at coliru.
In my use case, I'll check the character set before but, I'm not sure about the checks to make it bulletproof. Is this behavior expected (maybe my English isn't sufficient, or I didn't read carefully enough)? I've never seen such checks on iterators on blogs etc.
The other question is related to different base like 2 and 8. Base of 10 and 16 seems to be supported - what would be the way for the other two bases?
Addendum/Edit:
Bulletproof here means that I can have nasty things in the string. The obvious thing for me is that 'G' is not a hex character. But I would have expected an appropriate error code in some way! The comparison out.ptr != buf.end() I've never seen in blogs (or I didn't read the right ones :)
If I enter a crazy long hex float, at least a numerical result out of range comes up.
By bulletproof I also mean that I can find such impossible strings by length, for example, so that I can save myself the call to from_chars() - for float/doubles and integers (here I would 'strlen' compare digits10 from std::numeric_limits).
The from_chars utility is designed to convert the first number it finds in the string and to return a pointer to the point where it stopped. This allows you to parse strings like "42 centimeters" by first converting the number and then parsing the rest of the string yourself for what comes after it.
Regarding "The comparison out.ptr != buf.end() I've never seen in blogs": if you know that the entire string should be a number, then checking that the pointer in the result points to the end of the string is the normal way to ensure that from_chars read the entire string. Stopping in the middle of the input is not an error as far as from_chars is concerned, which is why ec reports success in your example.
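As for bases 2 and 8: the integer overloads of std::from_chars take a base argument from 2 to 36, whereas the floating-point overloads only take a std::chars_format. A minimal sketch of a whole-string integer conversion follows; parse_whole_integer is a hypothetical helper name, and std::optional is just one possible way to report failure.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses all of text as an integer in the given base (2..36).
// Returns std::nullopt on a syntax error, on overflow, or on trailing characters.
std::optional<long> parse_whole_integer(std::string_view text, int base) {
    assert(base >= 2 && base <= 36);  // precondition of std::from_chars
    long value = 0;
    // Use data() rather than begin(): from_chars wants const char* pointers.
    const char* first = text.data();
    const char* last = first + text.size();
    const auto result = std::from_chars(first, last, value, base);
    if (result.ec != std::errc{}) return std::nullopt;  // invalid_argument or result_out_of_range
    if (result.ptr != last) return std::nullopt;        // stopped early, e.g. at a 'G'
    return value;
}
```

Note that from_chars accepts a leading minus sign but neither a leading plus sign nor a "0x" prefix, so inputs in those forms are rejected by this sketch as well.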

How do I check argv[1] doesn't have any alphabetical characters?

This has been driving my entire C++ class nuts, none of us has been able to find a solid solution to this problem.
We are passing information to our program through the Terminal, via argv[1]. We would call our program ./main 3 and the program will run 3 times.
The problem comes when we are validating the input. We are trying to cover all of our bases, and for most of them we are good: an alphabetical character entered, a negative number, 0, etc. But what keeps passing through is an int followed by a string, for example ./main 3e or ./main 1.3. I've tried this (Ashwin's answer caught my eye), but it doesn't seem to work, or at least I can't implement it in my code.
This is my code now:
int main(int argc, char * argv[]){
if (!argv[1]) exit(0);
int x = atoi(argv[1]);
if (!x or x <= 0) exit(0);
// I would like to add another exit(0); for when the input mixes numbers and letters or doubles.
for (int i = 0; i < x; i++){
// rest of the main func.
}
}
Despite the title, it sounds like what you really want to do is check whether every single character in the input argument is a digit. You can achieve this by iterating over it, checking that every element is a digit using std::isdigit.
Here's a sketch using the std::all_of algorithm:
size_t len = strlen(argv[1]);
bool ok = std::all_of(argv[1], argv[1] + len,
[](unsigned char c) { return std::isdigit(c); } );
You can add an extra check for the first element being '0' if needed.
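Wrapped into a complete function with its headers, the same idea might look like the sketch below. The name is_all_digits is made up for illustration, and the explicit empty-string check is an addition, since std::all_of returns true for an empty range.

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// True if s is a non-empty string consisting only of the digits 0-9.
bool is_all_digits(const char* s) {
    if (s == nullptr || *s == '\0') return false;  // all_of on an empty range would say true
    return std::all_of(s, s + std::strlen(s),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}
```

With this check, both ./main 3e and ./main 1.3 are rejected before any conversion takes place. It does not guard against values too large for an int, so the conversion itself still needs its own range check.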
If you want to convert a string to a number and verify that the entire string was numeric, you can use strtol instead of atoi. As an additional bonus, strtol correctly checks for overflow and gives you the option of specifying whether or not you want hexadecimal/octal conversions.
Here's a simple implementation, with all the errors noted (printing error messages from a function like this is not a good idea; I just did it for compactness). A better option might be to return an error enum instead of the bool, but this function returns a std::pair<bool, int>: either (false, <undefined>) or (true, value):
std::pair<bool, int> safe_get_int(const char* s) {
char* endptr;
bool ok = false;
errno = 0; /* So we can check ERANGE later */
long val = strtol(s, &endptr, 10); /* Don't allow hex or octal. */
if (endptr == s) /* Includes the case where s is just whitespace */
std::cout << "You must specify some value." << '\n';
else if (*endptr != '\0')
std::cout << "Argument must be an integer: " << s << '\n';
else if (val < 0)
std::cout << "Argument must not be negative: " << s << '\n';
else if (errno == ERANGE || val > std::numeric_limits<int>::max())
std::cout << "Argument is too large: " << s << '\n';
else
ok = true;
return std::make_pair(ok, ok ? int(val) : 0);
}
In general, philosophical terms, when you have an API like strtol (or, for that matter, fopen) which will check for errors and deny the request if an error occurs, it is better programming style to "try and then check the error return", than "attempt to predict an error and only try if it looks ok". The second strategy, "check before use", is plagued with bugs, including security vulnerabilities (not in this case, of course, but see TOCTOU for a discussion). It also doesn't really help you, because you will have to check for error returns anyway, in case your predictor was insufficiently precise.
Of course, you need to be confident that the API in question does not have undefined behaviour on bad input, so read the official documentation. In this case, atoi does have UB on bad input, but strtol does not. (atoi: "If the value cannot be represented, the behavior is undefined."; contrast with strtol)
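The error-enum variant mentioned earlier could be sketched as follows. IntParseError and parse_non_negative_int are hypothetical names chosen for this example; the checks mirror the ones in safe_get_int, but the caller decides what to print.

```cpp
#include <cerrno>
#include <cstdlib>
#include <limits>

// Hypothetical error categories, one per failure case of safe_get_int.
enum class IntParseError { None, NoNumber, NotAnInteger, Negative, TooLarge };

// On success stores the value in out and returns IntParseError::None.
// On failure out is left untouched.
IntParseError parse_non_negative_int(const char* s, int& out) {
    char* endptr = nullptr;
    errno = 0;  // so that ERANGE can be detected afterwards
    const long val = std::strtol(s, &endptr, 10);  // base 10 only: no hex or octal
    if (endptr == s) return IntParseError::NoNumber;         // empty, whitespace, or no digits
    if (*endptr != '\0') return IntParseError::NotAnInteger; // trailing characters, e.g. "3e"
    if (val < 0) return IntParseError::Negative;
    if (errno == ERANGE || val > std::numeric_limits<int>::max())
        return IntParseError::TooLarge;
    out = static_cast<int>(val);
    return IntParseError::None;
}
```

In main, the caller would pass argv[1] (after checking argc) and exit with a suitable message for anything other than IntParseError::None.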

Is there a way to check input data type using only basic concepts?

I'm being challenged to find ways to perform tasks that usually require the use of headers (besides iostream and iomanip) or greater-than-basic C++ knowledge. How can I check the data type of user input using only logical operators, basic arithmetic (+, -, *, /, %), if statements, and while loops?
Obviously the input variable has a declared data type in the first place, but this problem is covering the possibility of the user inputting the wrong data type.
I've tried several methods including the if (!(cin >> var1)) trick, but nothing works correctly. Is this possible at all?
Example
int main() {
int var1, var2;
cin >> var1;
cin >> var2;
cout << var1 << " - " << var2 << " = " << (var1-var2);
return 0;
}
It's possible to input asdf and 5.25 here, so how do I detect that the inputs aren't integers as expected, using only the means I stated earlier?
I understand this problem is vague in many ways, mostly because the restrictions are extremely specific and listing everything I'm allowed to use would be a pain. I guess part of the problem as mentioned in the comments is figuring out how to distinguish between data types in the first place.
You can do that using simple operations, although it takes a little care. For example, the following function checks whether the input is a decimal integer with an optional sign. You can extend the idea to floating-point numbers by allowing a single period between the digits.
bool isNumber(const char *inp){
    int i = 0;
    // Skip an optional leading sign
    if (inp[0] == '+' || inp[0] == '-') i = 1;
    // A sign alone (or an empty string) is not a number
    if (!inp[i]) return false;
    for (; inp[i]; i++){
        if (!(inp[i] >= '0' && inp[i] <= '9'))
            return false;
    }
    return true;
}
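The floating-point extension suggested above might be sketched like this, still using only comparisons, if statements and a while loop, and no headers. The name isDecimal is made up for this example, and accepting forms such as "5." and ".5" is a design assumption rather than something the question specifies.

```cpp
// Accepts an optional sign followed by digits with at most one '.',
// and requires at least one digit overall.
bool isDecimal(const char *inp) {
    int i = 0;
    if (inp[0] == '+' || inp[0] == '-') i = 1;  // skip an optional sign
    int digits = 0;   // how many digits were seen
    int periods = 0;  // how many '.' characters were seen (0 or 1)
    while (inp[i]) {
        if (inp[i] >= '0' && inp[i] <= '9') {
            digits = digits + 1;
        } else if (inp[i] == '.' && periods == 0) {
            periods = 1;
        } else {
            return false;  // a letter, a second period, or anything else
        }
        i = i + 1;
    }
    return digits > 0;
}
```

An input for which this returns true and that contains no period is an integer; one with a period, such as 5.25, is a real number; everything else, such as asdf, is rejected.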
General checking after reading is done like this:
stream >> variable;
if (stream.fail()) {
// not successful
}
This can be done on any std::istream. It works for the standard types (any numeric type, char, string, etc.), with extraction stopping at whitespace. If your variable could not be read, fail returns true. (Testing good instead also reports a failure when the read succeeded but reached the end of the input.) You can customize extraction for your own classes, including control over when the stream is put into the failed state:
istream & operator>>(istream & stream, YourClass & c)
{
// Read the data from stream into c
return stream;
}
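To make the skeleton above concrete, here is a sketch for an invented type. Point, its "x,y" text format and the read_point helper are all assumptions made for this example; the part that matters is the setstate call, which is how a custom extractor reports a failure to the check performed after reading.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, read from text of the form "x,y".
struct Point {
    int x = 0;
    int y = 0;
};

std::istream& operator>>(std::istream& stream, Point& p) {
    int x = 0, y = 0;
    char separator = '\0';
    if (stream >> x >> separator >> y) {
        if (separator == ',') {
            p.x = x;
            p.y = y;
        } else {
            // Wrong separator: mark the stream as failed so the caller's check sees it.
            stream.setstate(std::ios::failbit);
        }
    }
    return stream;
}

// Hypothetical helper: true if text could be read as a Point.
bool read_point(const std::string& text, Point& p) {
    std::istringstream in(text);
    return static_cast<bool>(in >> p);  // false once failbit is set
}
```

The target object is only modified after the whole input has been validated, so a failed read leaves it unchanged.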
For your specific problem: Suppose you read the characters 42. There is no way of distinguishing between reading it as
- an int
- a double
as both would be perfectly fine. You have to specify the input format more precisely.
The standard library is not magic - you just have to parse the data read from the user, similarly to what the standard library does.
First read the input from the user:
std::string s;
cin >> s;
(you may use getline instead if you want to read a whole line)
Then you can go on parsing it; we'll try to distinguish between integer ( *[+-]?[0-9]+ *), real number ( *[+-]?[0-9]+(\.[0-9]*)?([Ee][+-]?[0-9]+)? *), string ( *"[^"]*" *) and anything else ("bad").
enum TokenType {
Integer,
Real,
String,
Bad
};
The basic building block is a routine that "eats" consecutive digits; this will help us with the [0-9]* and [0-9]+ parts.
void eatdigits(const char *&rp) {
while(*rp>='0' && *rp<='9') rp++;
}
Also, a routine that skips whitespace can be handy:
void skipws(const char *&rp) {
while(*rp==' ') rp++;
// feel free to skip also tabs and whatever
}
Then we can attack the real problem
TokenType categorize(const char *rp) {
first, we want to skip the whitespace
skipws(rp);
then, we'll try to match the easiest stuff: the string
if(*rp=='"') {
// Skip the opening quote, then the string content
rp++;
while(*rp && *rp!='"') rp++;
// If the string stopped with anything different than " we
// have a parse error
if(!*rp) return Bad;
// Otherwise, skip the closing quote and the trailing whitespace
rp++;
skipws(rp);
// And check if we got at the end
return *rp?Bad:String;
}
Then, on to numbers, notice that the real and integer definitions start in the same way; we have a common branch:
// If there's a + or -, it's fine, skip it
if(*rp=='+' || *rp=='-') rp++;
const char *before=rp;
// Skip the digits
eatdigits(rp);
// If we didn't manage to find any digit, it's not a valid number
if(rp==before) return Bad;
// If it ends here or after whitespace, it's an integer
if(!*rp) return Integer;
before = rp;
skipws(rp);
if(before!=rp) return *rp?Bad:Integer;
If we notice that there's still stuff, we tackle the real number:
// Maybe something after the decimal dot?
if(*rp=='.') {
rp++;
eatdigits(rp);
}
// Exponent
if(*rp=='E' || *rp=='e') {
rp++;
if(*rp=='+' || *rp=='-') rp++;
before=rp;
eatdigits(rp);
if(before==rp) return Bad;
}
skipws(rp);
return *rp?Bad:Real;
}
You can easily invoke this routine after reading the input.
(notice that here the string thing is just for fun, cin does not have any special processing for double-quotes delimited strings).

Cannot get ispunct or isspace to work, but isupper works fine. Can you help me?

This code only outputs the number of capital letters. It always outputs numMarks and numSpaces as 0. I've also tried sentence.c_str() with the same results. I cannot understand what's happening.
cout << "Please enter a sentence using grammatically correct formatting." << endl;
string sentence = GetLine();
int numSpaces = 0;
int numMarks = 0;
int numCaps = 0;
char words[sentence.length()];
for(int i = 0; i < sentence.length(); ++i)
{
words[i] = sentence[i];
if(isspace(words[i]) == true)
{
numSpaces++;
}
else if(ispunct(words[i]) == true)
{
numMarks++;
}
else if(isupper(words[i]) == true)
{
numCaps++;
}
}
cout << "\nNumber of spaces: " << numSpaces;
cout << "\nNumber of punctuation marks: " << numMarks;
cout << "\nNumber of capital letters: " << numCaps;
Edit: Fixed the problem. My compiler is weird. All I had to do was remove == true, and it worked perfectly. Thanks for the information though. Now I know for the future.
The functions isspace, ispunct, isupper that you are using have return type int. They return 0 if it is not a match, and non-zero if it is a match. They don't necessarily return 1, so testing == true may fail even though the check succeeded.
Change your code to be:
if ( isspace(words[i]) ) // no == true
and it should start working properly (so long as you don't type any extended characters - see below).
Further info: there are two different isupper functions in C++ (and the same for the other two functions). They are:
#include <cctype>
int isupper(int ch);
and
#include <locale>
template< class charT >
bool isupper( charT ch, const locale& loc );
You are currently using the first one, which is a legacy function coming from C. However you are using it incorrectly by passing a char; the argument must be in the range of unsigned char. Related question.
So to fix your code properly, choose one of the following two options (including the right header):
if ( isupper( static_cast<unsigned char>(words[i]) ) )
or
if ( isupper( words[i], locale() ) )
Other things: char words[sentence.length()]; is illegal in Standard C++; array dimensions must be known at compile-time. Your compiler is implementing an extension.
However this is redundant, you could just write sentence[i] and not use words at all.
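Putting these points together, the counting loop might be sketched as follows. SentenceCounts and count_characters are names invented for this example; the loop uses the <cctype> functions with the unsigned char cast, no comparison against true, and no intermediate array.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type holding the three tallies from the question.
struct SentenceCounts {
    int spaces = 0;
    int marks = 0;
    int caps = 0;
};

SentenceCounts count_characters(const std::string& sentence) {
    SentenceCounts counts;
    for (char raw : sentence) {
        // The <cctype> functions require a value in the unsigned char range.
        const unsigned char c = static_cast<unsigned char>(raw);
        // Test the int result directly: any non-zero value means "yes".
        if (std::isspace(c)) {
            ++counts.spaces;
        } else if (std::ispunct(c)) {
            ++counts.marks;
        } else if (std::isupper(c)) {
            ++counts.caps;
        }
    }
    return counts;
}
```

For the input "Hello, World! OK" in the default "C" locale this yields 2 spaces, 2 punctuation marks and 4 capital letters.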
Please change your code to
char c;
...
c = sentence[i];
if(isspace(c))
{
++numSpaces;
}
...
isspace returns zero if the character is not whitespace, but you cannot assume that it always returns 1 if it is. From http://www.cplusplus.com/reference/cctype/isspace/, it says, "A value different from zero (i.e., true) if indeed c is a white-space character. Zero (i.e., false) otherwise."
But if you test it with true, true is converted to 1 and the test fails because for example, on my machine, it returns 8 for a space.
Two things to consider.
First I would use " else if(ispunct(words[i]) !=0 )", instead of comparing the return of the function against true. This is because the functions return an integer, and the value returned might not be equal to whatever true is defined as on your platform or compiler.
My second suggestion is to check your locale. In unix you can use the "locale" command. In windows you can ask google how to check your locale, for instance for windows 7.
https://www.java.com/en/download/help/locale.xml
If your locale is a "wide character" locale, you might need to use iswpunct (wint_t wc) instead of ispunct(int c).
I hope this helps

C++ check if a date is valid

is there any function to check if a given date is valid or not?
I don't want to write anything from scratch.
e.g. 32/10/2012 is not valid
and 10/10/2010 is valid
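If your string is always in that format, the easiest thing to do would be to split the string into its three components, populate a tm structure and pass it to mktime(). Note that mktime() does not reject out-of-range fields; it normalizes them (32/10/2012 becomes 01/11/2012), so the date is valid only if the day, month and year in the structure are unchanged after the call. A return value of -1 means the time could not be represented at all.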
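A minimal sketch of that approach, assuming the dd/mm/yyyy layout from the question. The name is_valid_dmy is hypothetical, and sscanf is just one way to do the split. The comparison after mktime() is what catches dates such as 32/10/2012, which mktime() rolls over into the next month instead of refusing. The sketch only works for dates that time_t can represent on the platform.

```cpp
#include <cstdio>
#include <ctime>

// True if text is "d/m/y" and names a real calendar date.
bool is_valid_dmy(const char* text) {
    int day = 0, month = 0, year = 0;
    char extra = '\0';
    // The trailing %c catches junk after the year: sscanf then returns 4 instead of 3.
    if (std::sscanf(text, "%d/%d/%d%c", &day, &month, &year, &extra) != 3) return false;

    std::tm t = {};
    t.tm_mday = day;
    t.tm_mon = month - 1;      // tm_mon counts from 0
    t.tm_year = year - 1900;   // tm_year counts from 1900
    t.tm_hour = 12;            // midday, so a DST shift cannot change the date
    t.tm_isdst = -1;           // let the library work out DST

    if (std::mktime(&t) == static_cast<std::time_t>(-1)) return false;

    // mktime normalizes out-of-range fields, so any change means the input was invalid.
    return t.tm_mday == day && t.tm_mon == month - 1 && t.tm_year == year - 1900;
}
```

This is lenient about zero padding, so 1/1/2010 is accepted as well as 01/01/2010; a stricter format check would have to be added separately if that matters.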
You could also use Boost.Date_Time to parse it:
string inp("10/10/2010");
string format("%d/%m/%Y");
date d;
d = parser.parse_date(inp, format, svp);
The boost date time class should be able to handle what you require.
See http://www.boost.org/doc/libs/release/doc/html/date_time.html
If the format of the date is constant and you don't want to use Boost, you can always use strptime, a POSIX function declared in <time.h> (it is not part of standard C++, so it is not available on every platform):
const char date1[] = "32/10/2012";
const char date2[] = "10/10/2012";
struct tm tm;
if (!strptime(date1, "%d/%m/%Y", &tm)) std::cout << "date1 isn't valid\n";
if (!strptime(date2, "%d/%m/%Y", &tm)) std::cout << "date2 isn't valid\n";