Parse string to be number or number and percent symbol - c++

In C++, preferably without the Boost library, how can I make sure that the std::string str contains either a number or a number followed by a '%' sign? If it matches neither of these two cases, an error should be issued.

#include <iostream>
#include <string>
#include <algorithm>
#include <ctype.h>
bool is_a_bad_char(char c) {
return !(isdigit(c) || (c=='%'));
}
int main() {
std::string str = "123123%4141219";
if (std::find_if(str.begin(), str.end(), is_a_bad_char) != str.end()) {
std::cout << "error" << std::endl;
return 1;
}
return 0;
}

The easiest solution is probably to convert the string (using strtol
or strtod, depending on what type of number you expect), then look at
the following character. Something like:
(EDITED to correct error handling):
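#include <cerrno>
#include <cstdlib>
#include <string>
bool
isNumberOrPercent( std::string const& value )
{
    char* end; // strtod takes a char**, so this cannot be char const*
    errno = 0;
    strtod( value.c_str(), &end );
    return errno == 0
        && end != value.c_str() // at least one character was converted
        && static_cast<size_t>( (*end == '%' ? end + 1 : end) - value.c_str() ) == value.size();
}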

find_first_not_of with all the digits and %
If the above returns npos, then check the last character is %.
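A minimal sketch of that idea, assuming only unsigned decimal integers count as "a number". The helper name is_number_or_percent is made up for illustration. It searches for the first non-digit, then requires that anything left over is exactly one trailing '%':

```cpp
#include <string>

// Hypothetical helper: true for "123" or "123%", false otherwise.
// Assumption: only unsigned decimal integers are treated as numbers.
bool is_number_or_percent(const std::string& s) {
    const std::string digits = "0123456789";
    // Index of the first character that is not a digit (npos if all are digits).
    const std::string::size_type pos = s.find_first_not_of(digits);
    if (pos == 0) return false;                       // no leading digits, e.g. "%" or "abc"
    if (pos == std::string::npos) return !s.empty();  // digits only; reject ""
    // Exactly one '%' is allowed, and it must be the last character.
    return s[pos] == '%' && pos + 1 == s.size();
}
```

With this check the question's sample input "123123%4141219" is rejected, because the '%' is not the final character.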

Not very C++-ish, but something like this would do it:
#include <cstdlib>
#include <cstring>
#include <string>
bool checkformat(const std::string &s) {
    const char *begin = s.c_str();
    char *end;
    double val = std::strtod(begin, &end);
    if (end == begin) return false; // nothing was converted
    if (*end == '%') ++end; // allow a single trailing '%'
    return (static_cast<std::size_t>(end - begin) == s.size());
}
Be aware that strtod skips initial whitespace, so if you don't want to accept a string with initial whitespace then you'd need to separately reject that. It also accepts "NAN", "INF", "INFINITY" (all case-insensitive), each of those preceded by + or -, and, in the case of "NAN", optionally followed by some implementation-defined characters to indicate which NaN value it represents. Arguably "INF" is a number, but by definition "NAN" isn't, so you'd want to return false if val != val and possibly also check for infinities.
[Edit: I think I've fixed the issues James raises below, except that " " and " %" are still in dispute. And then he added overflow to the mix. Between his answer and mine, you should get the idea -- first decide how you want to treat each edge case, then code it.]
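As one possible way to settle those edge cases, here is a sketch of a stricter variant. The name checkformat_strict and the chosen policy (no leading whitespace, no NaN, no infinities, overflow treated as an error) are assumptions, not something either answer prescribes:

```cpp
#include <cctype>
#include <cmath>
#include <cstdlib>
#include <string>

// Hypothetical stricter variant: rejects leading whitespace, NaN and
// infinities, and allows a single optional trailing '%'.
bool checkformat_strict(const std::string& s) {
    if (s.empty() || std::isspace(static_cast<unsigned char>(s[0]))) return false;
    const char* begin = s.c_str();
    char* end = nullptr;
    const double val = std::strtod(begin, &end);
    if (end == begin) return false;         // nothing was converted
    if (!std::isfinite(val)) return false;  // NaN, INF, or overflow to HUGE_VAL
    if (*end == '%') ++end;                 // at most one '%'
    return static_cast<std::string::size_type>(end - begin) == s.size();
}
```

If you decide that "INF" should count as a number after all, replace the std::isfinite test with a NaN-only test such as std::isnan.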

Related

Extracting integers from strings in C++ with arbitrary structure

This seems like a question that should be easy to search for, but any answers out there seem to be drowned out by a sea of questions asking the more common problem of converting a string to an integer.
My question is: what's an easy way to extract integers from std::strings that might look like "abcd451efg" or "hel.lo42-world!" or "hide num134rs here?" I see that I can use isdigit to manually parse the strings myself, but I'm wondering if there is a more standard way in the vein of atoi or stoi, etc.
The outputs above would be 451, 42, and 134. We can also assume there is only one integer in a string (although a general solution wouldn't hurt). So we don't have to worry about strings like "abc123def456".
Java has an easy solution in the form of
Integer.parseInt(str.replaceAll("[\\D]", ""));
does C++ have something as straightforward?
You can use
string::find_first_of("0123456789") to get the position of the first digit, then string::find_last_of("0123456789") to get the position of the last digit, and finally use an atoi on the substring defined by the two positions. I cannot think of anything simpler (without regex).
BTW, this works only when you have a single number inside the string.
Here is an example:
#include <iostream>
#include <string>
#include <cstdlib>
using namespace std;
int main()
{
string s = "testing;lasfkj358kdfj-?gt";
size_t begin = s.find_first_of("0123456789");
size_t end = s.find_last_of("0123456789");
string num = s.substr(begin, end - begin + 1);
int result = atoi(num.c_str());
cout << result << endl;
}
If you have more than 1 number, you can combine string::find_first_of with string::find_first_not_of to get the beginning and the end of each number inside the string.
This code is the general solution:
#include <iostream>
#include <string>
#include <cstdlib>
using namespace std;
int main()
{
string s = "testing;lasfkj358kd46fj-?gt"; // 2 numbers, 358 and 46
size_t begin = 0, end = 0;
while(end != std::string::npos)
{
begin = s.find_first_of("0123456789", end);
if(begin == std::string::npos) // no more digits: stop, otherwise the loop never ends
break;
end = s.find_first_not_of("0123456789", begin);
string num = s.substr(begin, end - begin);
int number = atoi(num.c_str());
cout << number << endl;
}
}
atoi can extract numbers from strings even if there are trailing non-digits
int getnum(const char* str)
{
for(; *str != '\0'; ++str)
{
if(*str >= '0' && *str <= '9')
return atoi(str);
}
return YOURFAILURENUMBER;
}
Here's one way
#include <algorithm>
#include <iostream>
#include <locale>
#include <string>
int main(int, char* argv[])
{
std::string input(argv[1]);
input.erase(
std::remove_if(input.begin(), input.end(),
[](char c) { return !isdigit(c, std::locale()); }),
input.end()
);
std::cout << std::stoll(input) << '\n';
}
You could also use the <functional> library to create a predicate
auto notdigit = not1(
std::function<bool(char)>(
bind(std::isdigit<char>, std::placeholders::_1, std::locale())
)
);
input.erase(
std::remove_if(input.begin(), input.end(), notdigit),
input.end()
);
It's worth pointing out that so far the other two answers hard-code the digit check; using the locale version of isdigit guarantees that your program will recognize digits according to the current global locale.

How to validate that there are only digits in a string?

I'm new to C++. I'm working on a project where I need to read mostly integers from the user through the console. In order to avoid someone entering non-digit characters I thought about reading the input as a string, checking there are only digits in it, and then converting it to an integer. I created a function since I need to check for integers several times:
bool isanInt(int *y){
string z;
int x;
getline(cin,z);
for (int n=0; n < z.length(); n++) {
if(!((z[n] >= '0' && z[n] <= '9') || z[n] == ' ') ){
cout << "That is not a valid input!" << endl;
return false;
}
}
istringstream convert(z); //converting the string to integer
convert >> x;
*y = x;
return true;
}
When I need the user to input an integer I'll call this function. But for some reason when I make a call to this function the program doesn't wait for an input; it jumps immediately to the for-loop, processing an empty string. Any thoughts? Thanks for your help.
There are many ways to test a string for only numeric characters. One is
bool is_digits(const std::string &str) {
return str.find_first_not_of("0123456789") == std::string::npos;
}
This would work:
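#include <algorithm> // for std::all_of
#include <cctype> // for std::isdigit
#include <string>
bool all_digits(const std::string& s)
{
    // take unsigned char: passing a negative char to std::isdigit is undefined
    return std::all_of(s.begin(),
                       s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}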
You can convert the string inside a try/catch block, so that a failed conversion raises an exception and you can write whatever you want to the console.
For example:
try
{
int myNum = std::stoi(myString); // throws if myString does not start with a number
}
catch (const std::logic_error&) // std::invalid_argument or std::out_of_range
{
std::cerr << "Please insert only numbers "<< '\n';
}
Character classification is a job typically delegated to the ctype facets of a locale. You're going to need a function that takes into account all 10 digits as well as the thousands separator and the radix point:
bool is_numeric_string(const std::string& str, std::locale loc = std::locale())
{
using ctype = std::ctype<char>;
using numpunct = std::numpunct<char>;
using traits_type = std::string::traits_type;
auto& ct_f = std::use_facet<ctype>(loc);
auto& np_f = std::use_facet<numpunct>(loc);
return std::all_of(str.begin(), str.end(), [&str, &ct_f, &np_f] (char c)
{
return ct_f.is(std::ctype_base::digit, c) || traits_type::eq(c, np_f.thousands_sep())
|| traits_type::eq(c, np_f.decimal_point());
});
}
Note that extra effort can go into making sure the thousands separator is not the first character.
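Try another approach such as cin.getline(str, sizeof(str)), where str is a char array. I think your problem may be caused by other code that runs before this function is called (for example, an earlier cin >> x leaves a newline in the input buffer, which getline then reads as an empty line). Examine the other parts of your code carefully. Setting breakpoints is recommended too.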
Always use off-the-shelf functions instead of writing your own.
I recommend std::regex.
Enjoy.
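The answer does not show how std::regex would be used, so the following is only a sketch under the assumption that "only digits" means one or more ASCII digits. The helper name is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when s consists of one or more ASCII digits.
bool only_digits_regex(const std::string& s) {
    static const std::regex digits("[0-9]+");
    // regex_match succeeds only if the pattern matches the whole string.
    return std::regex_match(s, digits);
}
```

Note that std::regex_match is used rather than std::regex_search, because a search would also accept strings such as "12a" that merely contain digits.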

How to check if a string is all lowercase and alphanumerics?

Is there a method that checks for these cases? Or do I need to parse each letter in the string, and check if it's lower case (letter) and is a number/letter?
You can use islower(), isalnum() to check for those conditions for each character. There is no string-level function to do this, so you'll have to write your own.
Assuming that the "C" locale is acceptable (or swap in a different set of characters for criteria), use find_first_not_of()
#include <string>
bool testString(const std::string& str)
{
std::string criteria("abcdefghijklmnopqrstuvwxyz0123456789");
return (std::string::npos == str.find_first_not_of(criteria));
}
It's not very well known, but a locale actually does have functions to determine characteristics of entire strings at a time. Specifically, the ctype facet of a locale has a scan_is and a scan_not that scan for the first character that fits a specified mask (alpha, numeric, alphanumeric, lower, upper, punctuation, space, hex digit, etc.), or the first that doesn't fit it, respectively. Other than that, they work a bit like std::find_if, returning whatever you passed as the "end" to signal failure, otherwise returning a pointer to the first item in the string that doesn't fit what you asked for.
Here's a quick sample:
#include <locale>
#include <string>
#include <iostream>
#include <iomanip>
int main() {
std::string inputs[] = {
"alllower",
"1234",
"lower132",
"including a space"
};
// We'll use the "classic" (C) locale, but this works with any
std::locale loc(std::locale::classic());
// A mask specifying the characters to search for:
std::ctype_base::mask m = std::ctype_base::lower | std::ctype_base::digit;
for (int i=0; i<4; i++) {
char const *pos;
char const *b = inputs[i].data();
char const *e = b + inputs[i].size(); // avoids dereferencing end()
std::cout << "Input: " << std::setw(20) << inputs[i] << ":\t";
// finally, call the actual function:
if ((pos=std::use_facet<std::ctype<char> >(loc).scan_not(m, b, e)) == e)
std::cout << "All characters match mask\n";
else
std::cout << "First non-matching character = \"" << *pos << "\"\n";
}
return 0;
}
I suspect most people will prefer to use std::find_if though -- using it is nearly the same, but can be generalized to many more situations quite easily. Even though this has much narrower applicability, it's not really a lot easier to use (though I suppose if you're scanning large chunks of text, it might well be at least a little faster).
You could use tolower and strcmp to check whether the original string equals its lowercased copy, and check for digits separately, character by character.
(Or) do both checks per character, as below.
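#include <algorithm>
#include <cctype>
#include <string>
static inline bool is_not_alphanum_lower(char c)
{
    // digits are never lowercase, so test for "digit or lowercase letter"
    unsigned char uc = static_cast<unsigned char>(c);
    return !(std::isdigit(uc) || std::islower(uc));
}
bool string_is_valid(const std::string &str)
{
    return std::find_if(str.begin(), str.end(), is_not_alphanum_lower) == str.end();
}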
I used some info from:
Determine if a string contains only alphanumeric characters (or a space)
Just use std::all_of
bool lowerAlnum = std::all_of(str.cbegin(), str.cend(), [](const char c){
return isdigit(c) || islower(c);
});
If you don't care about locale (i.e. the input is pure 7-bit ASCII) then the condition can be optimized into
[](const char c){ return ('0' <= c && c <= '9') || ('a' <= c && c <= 'z'); }
If your strings contain ASCII-encoded text and you like to write your own functions (like I do) then you can use this:
bool is_lower_alphanumeric(const string& txt)
{
for(char c : txt)
{
if (!((c >= '0' and c <= '9') or (c >= 'a' and c <= 'z'))) return false;
}
return true;
}

string analysis

If a string may include several unnecessary characters, such as #, #, $, %, how can I find and delete them?
I know this requires a loop, but I do not know how to represent something such as #, #, $, %.
If you can give me a code example, I would really appreciate it.
The usual standard C++ approach would be the erase/remove idiom:
#include <string>
#include <algorithm>
#include <iostream>
struct OneOf {
std::string chars;
OneOf(const std::string& s) : chars(s) {}
bool operator()(char c) const {
return chars.find_first_of(c) != std::string::npos;
}
};
int main()
{
std::string s = "string with #, #, $, %";
s.erase(std::remove_if(s.begin(), s.end(), OneOf("##$%")), s.end());
std::cout << s << '\n';
}
and yes, boost offers some neat ways to write it shorter, for example using boost::erase_all_regex
#include <string>
#include <iostream>
#include <boost/algorithm/string/regex.hpp>
int main()
{
std::string s = "string with #, #, $, %";
erase_all_regex(s, boost::regex("[##$%]"));
std::cout << s << '\n';
}
If you want to get fancy, there is Boost.Regex; otherwise you can use the STL replace function in combination with the strchr function.
And if you, for some reason, have to do it yourself C-style, something like this would work:
char* oldstr = ... something something dark side ...
size_t oldstrlen = strlen(oldstr); // length without the terminating null
char* newstr = new char[oldstrlen + 1]; // allocate memory for the new nicer string, plus the terminator
char* p = newstr; // get a pointer to the beginning of the new string
for ( size_t i=0; i<oldstrlen; i++ ) // iterate over the original string
if (oldstr[i] != '#' && oldstr[i] != '#' && etc....) // check that the current character is not a bad one
*p++ = oldstr[i]; // append it to the new string
*p = 0; // dont forget the null-termination
I think for this I'd use std::remove_copy_if:
#include <string>
#include <algorithm>
#include <iterator> // for std::back_inserter
#include <iostream>
struct bad_char {
bool operator()(char ch) {
return ch == '#' || ch == '#' || ch == '$' || ch == '%';
}
};
int main() {
std::string in("This#is#a$string%with#extra#stuff$to%ignore");
std::string out;
std::remove_copy_if(in.begin(), in.end(), std::back_inserter(out), bad_char());
std::cout << out << "\n";
return 0;
}
Result:
Thisisastringwithextrastufftoignore
Since the data containing these unwanted characters will normally come from a file of some sort, it's also worth considering getting rid of them as you read the data from the file instead of reading the unwanted data into a string, and then filtering it out. To do this, you could create a facet that classifies the unwanted characters as white space:
struct filter: std::ctype<char>
{
filter(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::mask());
rc['#'] = std::ctype_base::space;
rc['#'] = std::ctype_base::space;
rc['$'] = std::ctype_base::space;
rc['%'] = std::ctype_base::space;
return &rc[0];
}
};
To use this, you imbue the input stream with a locale using this facet, and then read normally. For the moment I'll use an istringstream, though you'd normally use something like an istream or ifstream:
int main() {
std::istringstream in("This#is#a$string%with#extra#stuff$to%ignore");
in.imbue(std::locale(std::locale(), new filter));
std::copy(std::istream_iterator<char>(in),
std::istream_iterator<char>(),
std::ostream_iterator<char>(std::cout));
return 0;
}
Is this C or C++? (You've tagged it both ways.)
In pure C, you pretty much have to loop through character by character and delete the unwanted ones. For example:
char *buf;
int len = strlen(buf);
int i, j;
for (i = 0; i < len; i++)
{
if (buf[i] == '#' || buf[i] == '#' || buf[i] == '$' /* etc */)
{
for (j = i; j < len; j++)
{
buf[j] = buf[j+1];
}
i --;
}
}
This isn't very efficient - it checks each character in turn and shuffles them all up if there's one you don't want. You have to decrement the index afterwards to make sure you check the new next character.
General algorithm:
Build a string that contains the characters you want purged: "##$%"
Iterate character by character over the subject string.
Search if each character is found in the purge set.
If a character matches, discard it.
If a character doesn't match, append it to a result string.
Depending on the string library you are using, there are functions/methods that implement one or more of the above steps, such as strchr() or find() to determine if a character is in a string.
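The steps above can be sketched with std::string as follows. The function name purge_chars is made up for illustration, and std::string::find plays the role of the "is this character in the purge set" search:

```cpp
#include <string>

// Hypothetical helper following the general algorithm: copy every character
// of subject that does not occur in purge_set into a result string.
std::string purge_chars(const std::string& subject, const std::string& purge_set) {
    std::string result;
    result.reserve(subject.size());
    for (char c : subject) {
        // find returns npos when c is not in the purge set, so we keep c.
        if (purge_set.find(c) == std::string::npos) {
            result += c;
        }
    }
    return result;
}
```

For example, purge_chars("This#is#a$string%", "#$%") yields "Thisisastring".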
Use a character literal, i.e. a is written 'a'. You haven't said whether you're using C++ strings (in which case you can use the find and replace methods) or C strings, in which case you'd use something like this (this is by no means the best way, but it's a simple way):
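void RemoveChar(char* szString, char c)
{
    while(*szString != '\0')
    {
        if(*szString == c)
            // the regions overlap, so memmove is required; do not advance,
            // because the next character has just moved into this position
            memmove(szString,szString+1,strlen(szString+1)+1);
        else
            szString++;
    }
}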
You can use a loop and call find_last_of (http://www.cplusplus.com/reference/string/string/find_last_of/) repeatedly to find the last character that you want to replace, replace it with blank, and then continue working backwards in the string.
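A sketch of that backwards loop, assuming that "replace it with blank" means removing the character rather than substituting a space. The helper name is hypothetical:

```cpp
#include <string>

// Hypothetical helper: repeatedly finds the last unwanted character and
// erases it, working from the back of the string towards the front.
void erase_from_back(std::string& s, const std::string& unwanted) {
    std::string::size_type pos = s.find_last_of(unwanted);
    while (pos != std::string::npos) {
        s.erase(pos, 1);
        if (pos == 0) break;  // nothing left in front of this position
        // Continue searching at or before the previous character.
        pos = s.find_last_of(unwanted, pos - 1);
    }
}
```

Working backwards means each erase only shifts characters that have already been examined, so the remaining search positions stay valid.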
Something like this would do :
bool is_bad(char c)
{
if( c == '#' || c == '#' || c == '$' || c == '%' )
return true;
else
return false;
}
int main(int argc, char **argv)
{
string str = "a #test ##string";
str.erase(std::remove_if(str.begin(), str.end(), is_bad), str.end() );
}
If your compiler supports lambdas (or if you can use boost), it can be made even shorter. Example using boost::lambda :
string str = "a #test ##string";
str.erase(std::remove_if(str.begin(), str.end(), (_1 == '#' || _1 == '#' || _1 == '$' || _1 == '%')), str.end() );
(yay two lines!)
A character is represented in C/C++ by single quotes, e.g. '#', '#', etc. (except for a few that need to be escaped).
To search for a character in a string, use strchr(). Here is a link to a sample code:
http://www.cplusplus.com/reference/clibrary/cstring/strchr/

How do I check if a C++ string is an int?

When I use getline, I would input a bunch of strings or numbers, but I only want the while loop to output the "word" if it is not a number.
So is there any way to check if "word" is a number or not? I know I could use atoi() for
C-strings but how about for strings of the string class?
int main () {
stringstream ss (stringstream::in | stringstream::out);
string word;
string str;
getline(cin,str);
ss<<str;
while(ss>>word)
{
//if( )
cout<<word<<endl;
}
}
Another version...
Use strtol, wrapping it inside a simple function to hide its complexity :
inline bool isInteger(const std::string & s)
{
if(s.empty() || ((!isdigit(s[0])) && (s[0] != '-') && (s[0] != '+'))) return false;
char * p;
strtol(s.c_str(), &p, 10);
return (*p == 0);
}
Why strtol ?
As much as I love C++, sometimes the C API is the best answer as far as I am concerned:
using exceptions is overkill for a test that is authorized to fail
the temporary stream object creation by the lexical cast is overkill and over-inefficient when the C standard library has a little known dedicated function that does the job.
How does it work ?
strtol seems quite raw at first glance, so an explanation will make the code simpler to read :
strtol will parse the string, stopping at the first character that cannot be considered part of an integer. If you provide p (as I did above), it sets p right at this first non-integer character.
My reasoning is that if p is not set to the end of the string (the 0 character), then there is a non-integer character in the string s, meaning s is not a correct integer.
The first tests are there to eliminate corner cases (leading spaces, empty string, etc.).
This function should be, of course, customized to your needs (are leading spaces an error? etc.).
Sources :
See the description of strtol at: http://en.cppreference.com/w/cpp/string/byte/strtol.
See, too, the description of strtol's sister functions (strtod, strtoul, etc.).
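The isInteger function above only validates. If you also want the value, and want out-of-range input to count as a failure, a sketch along the same lines could look like this. The name parse_int and the decision to reject leading whitespace are assumptions:

```cpp
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses s as a base-10 int. Returns false on an empty
// string, leading whitespace, trailing characters, or an out-of-range value.
bool parse_int(const std::string& s, int& out) {
    if (s.empty() || std::isspace(static_cast<unsigned char>(s[0]))) return false;
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(s.c_str(), &end, 10);
    if (end == s.c_str() || *end != '\0') return false;  // no digits, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;  // overflow
    out = static_cast<int>(value);
    return true;
}
```

The errno check matters because strtol reports overflow by returning LONG_MAX or LONG_MIN and setting errno to ERANGE, not by leaving the end pointer short.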
The accepted answer will give a false positive if the input is a number plus text, because "stol" will convert the first digits and ignore the rest.
I like the following version the most, since it's a nice one-liner that doesn't need to define a function and you can just copy and paste wherever you need it.
#include <string>
...
std::string s;
bool has_only_digits = (s.find_first_not_of( "0123456789" ) == std::string::npos);
EDIT: if you like this implementation but you do want to use it as a function, then this should do:
bool has_only_digits(const std::string& s){
return s.find_first_not_of( "0123456789" ) == std::string::npos;
}
You might try boost::lexical_cast. It throws a bad_lexical_cast exception if it fails.
In your case:
int number;
try
{
number = boost::lexical_cast<int>(word);
}
catch(boost::bad_lexical_cast& e)
{
std::cout << word << " isn't a number" << std::endl;
}
If you're just checking if word is a number, that's not too hard:
#include <ctype.h>
...
string word;
bool isNumber = true;
for(string::const_iterator k = word.begin(); k != word.end(); ++k)
isNumber = isNumber && isdigit(static_cast<unsigned char>(*k)); // C++ has no &&= operator
Optimize as desired.
Use the all-powerful C stdio/string functions:
int dummy_int;
int scan_value = std::sscanf( some_string.c_str(), "%d", &dummy_int);
if (scan_value != 1) {
// does not start with integer (sscanf returns 0 on a mismatch, EOF on empty input)
} else {
// starts with integer
}
You can use boost::lexical_cast, as suggested, but if you have any prior knowledge about the strings (i.e. that if a string contains an integer literal it won't have any leading space, or that integers are never written with exponents), then rolling your own function should be both more efficient, and not particularly difficult.
Ok, the way I see it you have 3 options.
1: If you simply wish to check whether the number is an integer, and don't care about converting it, but simply wish to keep it as a string and don't care about potential overflows, checking whether it matches a regex for an integer would be ideal here.
2: You can use boost::lexical_cast and then catch a potential boost::bad_lexical_cast exception to see if the conversion failed. This would work well if you can use boost and if failing the conversion is an exceptional condition.
3: Roll your own function similar to lexical_cast that checks the conversion and returns true/false depending on whether it's successful or not. This would work in case options 1 and 2 don't fit your requirements.
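As an illustration of option 3, here is a sketch of a stream-based conversion that reports success with a bool instead of an exception. The name try_convert is made up, and rejecting leading whitespace and trailing characters is a policy choice, not a requirement:

```cpp
#include <sstream>
#include <string>

// Hypothetical lexical_cast-like helper: converts text to T and returns
// true only if the whole string was consumed by the conversion.
template <typename T>
bool try_convert(const std::string& text, T& out) {
    std::istringstream in(text);
    in >> std::noskipws;            // do not silently skip whitespace
    if (!(in >> out)) return false; // conversion failed (or overflowed)
    char extra;
    return !(in >> extra);          // any leftover character is an error
}
```

Usage would look like int n; if (try_convert(word, n)) { ... }, which avoids the try/catch needed with boost::lexical_cast.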
Here is another solution.
try
{
(void) std::stoi(myString); //cast to void to ignore the return value
//Success! myString contained an integer
}
catch (const std::logic_error &e)
{
//Failure! myString did not contain an integer
}
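Be aware that std::stoi stops at the first character it cannot use, so "123abc" would count as a success in the snippet above. A sketch that uses stoi's optional position argument to require that the whole string was consumed (the helper name is hypothetical):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: true only when std::stoi consumes the entire string.
bool is_int_stoi(const std::string& s) {
    try {
        std::size_t consumed = 0;
        (void)std::stoi(s, &consumed);  // consumed = number of characters processed
        return consumed == s.size();
    } catch (const std::logic_error&) {
        // std::invalid_argument (no number) or std::out_of_range (too large)
        return false;
    }
}
```

Like strtol, std::stoi skips leading whitespace, so " 12" is still accepted by this version. Reject it separately if that matters.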
Since C++11 you can make use of std::all_of and ::isdigit (the example below also uses std::string_view and [[maybe_unused]], which require C++17):
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string_view>
int main([[maybe_unused]] int argc, [[maybe_unused]] char *argv[])
{
auto isInt = [](std::string_view str) -> bool {
return std::all_of(str.cbegin(), str.cend(), ::isdigit);
};
for(auto &test : {"abc", "123abc", "123.0", "+123", "-123", "123"}) {
std::cout << "Is '" << test << "' numeric? "
<< (isInt(test) ? "true" : "false") << std::endl;
}
return 0;
}
template <typename T>
const T to(const string& sval)
{
T val;
stringstream ss;
ss << sval;
ss >> val;
if(ss.fail())
throw runtime_error((string)typeid(T).name() + " type wanted: " + sval);
return val;
}
And then you can use it like that:
double d = to<double>("4.3");
or
int i = to<int>("4123");
I have modified paercebal's method to meet my needs:
typedef std::string String;
bool isInt(const String& s, int base){
if(s.empty() || std::isspace(s[0])) return false ;
char * p ;
strtol(s.c_str(), &p, base) ;
return (*p == 0) ;
}
bool isPositiveInt(const String& s, int base){
if(s.empty() || std::isspace(s[0]) || s[0]=='-') return false ;
char * p ;
strtol(s.c_str(), &p, base) ;
return (*p == 0) ;
}
bool isNegativeInt(const String& s, int base){
if(s.empty() || std::isspace(s[0]) || s[0]!='-') return false ;
char * p ;
strtol(s.c_str(), &p, base) ;
return (*p == 0) ;
}
Note:
You can check for various bases (binary, oct, hex and others)
Make sure you don't pass 1, a negative value, or a value greater than 36 as the base.
If you pass 0 as the base, it will auto-detect the base, i.e. a string starting with 0x will be treated as hex and a string starting with 0 will be treated as octal. The characters are case-insensitive.
Any white space in string will make it return false.