Regex.IsMatch for only letters and numbers [duplicate]

How can I validate a string using Regular Expressions to only allow alphanumeric characters in it?
(I don't want to allow for any spaces either).

In .NET 4.0 you can use LINQ:
if (yourText.All(char.IsLetterOrDigit))
{
//just letters and digits.
}
yourText.All stops and returns false the first time char.IsLetterOrDigit returns false, since the contract of All can no longer be fulfilled.
Note: this answer does not strictly check for alphanumerics (typically A-Z, a-z and 0-9). It also allows local characters such as åäö.
Update 2018-01-29
The method-group syntax above only works when you call a single method that takes a single argument of the correct type (in this case char).
To combine multiple conditions, you need to write a lambda like this:
if (yourText.All(x => char.IsLetterOrDigit(x) || char.IsWhiteSpace(x)))
{
}

Use the following expression:
^[a-zA-Z0-9]*$
ie:
using System.Text.RegularExpressions;
Regex r = new Regex("^[a-zA-Z0-9]*$");
if (r.IsMatch(SomeString)) {
...
}

You could do it easily with an extension function rather than a regex ...
public static bool IsAlphaNum(this string str)
{
if (string.IsNullOrEmpty(str))
return false;
for (int i = 0; i < str.Length; i++)
{
if (!(char.IsLetter(str[i])) && (!(char.IsNumber(str[i]))))
return false;
}
return true;
}
Per comment :) ...
public static bool IsAlphaNum(this string str)
{
if (string.IsNullOrEmpty(str))
return false;
return (str.ToCharArray().All(c => Char.IsLetter(c) || Char.IsNumber(c)));
}

While I think the regex-based solution is probably the way I'd go, I'd be tempted to encapsulate this in a type.
public class AlphaNumericString
{
public AlphaNumericString(string s)
{
Regex r = new Regex("^[a-zA-Z0-9]*$");
if (r.IsMatch(s))
{
value = s;
}
else
{
throw new ArgumentException("Only alphanumeric characters may be used");
}
}
private string value;
static public implicit operator string(AlphaNumericString s)
{
return s.value;
}
}
Now, when you need a validated string, you can have the method signature require an AlphaNumericString, and know that if you get one, it is valid (apart from nulls). If someone attempts to pass in a non-validated string, it will generate a compiler error.
You can get fancier and implement all of the equality operators, or an explicit cast to AlphaNumericString from plain ol' string, if you care.

I needed to check for A-Z, a-z, 0-9; without a regex (even though the OP asks for regex).
Blending various answers and comments here, and discussion from https://stackoverflow.com/a/9975693/292060, this tests for letter or digit, avoiding other language letters, and avoiding other numbers such as fraction characters.
if (!String.IsNullOrEmpty(testString)
&& testString.All(c => Char.IsLetterOrDigit(c) && (c < 128)))
{
// Alphanumeric.
}

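^\w+$ will allow a-zA-Z0-9_ (in .NET, \w also matches other Unicode letters and digits unless RegexOptions.ECMAScript is specified).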
Use ^[a-zA-Z0-9]+$ to disallow underscore.
Note that both of these require the string not to be empty. Using * instead of + allows empty strings.

If you want a non-regex ASCII check for A-Z, a-z and 0-9, you cannot use char.IsLetterOrDigit() on its own, as it also accepts other Unicode characters.
What you can do is check the character code ranges.
48 -> 57 are numerics
65 -> 90 are capital letters
97 -> 122 are lower case letters
The following is a bit more verbose, but it's for ease of understanding rather than for code golf.
public static bool IsAsciiAlphaNumeric(this string str)
{
if (string.IsNullOrEmpty(str))
{
return false;
}
for (int i = 0; i < str.Length; i++)
{
if (str[i] < 48) // Numeric are 48 -> 57
{
return false;
}
if (str[i] > 57 && str[i] < 65) // Capitals are 65 -> 90
{
return false;
}
if (str[i] > 90 && str[i] < 97) // Lowers are 97 -> 122
{
return false;
}
if (str[i] > 122)
{
return false;
}
}
return true;
}

To check that the string contains both letters and digits, you can rewrite @jgauffin's answer as follows using .NET 4.0 and LINQ:
if(!string.IsNullOrWhiteSpace(yourText) &&
yourText.Any(char.IsLetter) && yourText.Any(char.IsDigit))
{
// do something here
}

Based on cletus's answer, you can create a new extension method.
public static class StringExtensions
{
public static bool IsAlphaNumeric(this string str)
{
if (string.IsNullOrEmpty(str))
return false;
Regex r = new Regex("^[a-zA-Z0-9]*$");
return r.IsMatch(str);
}
}

While there are many ways to skin this cat, I prefer to wrap such code into reusable extension methods that make it trivial to do going forward. When using extension methods, you can also avoid RegEx as it is slower than a direct character check. I like using the extensions in the Extensions.cs NuGet package. It makes this check as simple as:
Add the https://www.nuget.org/packages/Extensions.cs package to your project.
Add "using Extensions;" to the top of your code.
"smith23".IsAlphaNumeric() will return True, whereas "smith 23".IsAlphaNumeric(false) will return False. By default the .IsAlphaNumeric() method ignores spaces, but this can be overridden as shown above. If you want to allow spaces, so that "smith 23".IsAlphaNumeric() returns True, simply leave the argument at its default.
Every other check in the rest of the code is simply MyString.IsAlphaNumeric().

12 years and 7 months later, if anyone comes across this article nowadays.
Compiled RegEx actually has the best performance in .NET 5 and .NET 6
Please look at the following link where I compare several different answers given on this question. Mainly comparing Compiled RegEx, For-Loops, and Linq Predicates: https://dotnetfiddle.net/WOPQRT
Notes:
As stated, this method is only faster in .NET 5 and .NET 6.
.NET Core 3.1 and below show RegEx being the slowest.
Regardless of the version of .NET, the For-Loop method is consistently faster than the Linq Predicate.

I advise not depending on ready-made, built-in code in the .NET Framework; try to come up with your own solution. This is what I do (note that it checks that the string contains at least one letter and at least one number):
public bool isAlphaNumeric(string N)
{
bool YesNumeric = false;
bool YesAlpha = false;
bool BothStatus = false;
for (int i = 0; i < N.Length; i++)
{
if (char.IsLetter(N[i]) )
YesAlpha=true;
if (char.IsNumber(N[i]))
YesNumeric = true;
}
if (YesAlpha==true && YesNumeric==true)
{
BothStatus = true;
}
else
{
BothStatus = false;
}
return BothStatus;
}

Related

How to check if String contains only operators and numbers in as3?

How to check if String contains only operators and numbers.
The string may contain 0-9 and +, -, ., /, *, X, =
For example: 28+30-22*5 = should return true when checked. If the string contains any other character, it should return false.
Can we use a regexp for this?
This is totally primitive and straightforward, but it should do the trick:
// A collection of valid characters.
const VALID:String = "0123456789+-*/.=X ";
function check(sample:String):Boolean
{
for (var i:int = 0; i < sample.length; i++)
{
// Let's iterate the given String, char by char.
var aChar:String = sample.charAt(i);
// The .indexOf(...) method returns -1 if there's no match.
if (VALID.indexOf(aChar) < 0)
{
return false;
}
}
// If we got as far as here, it means
// there's no invalid characters in the sample.
return true;
}
trace(check("28+30-22*5 =")); // true
trace(check("a = 100 * 3 / 10")); // false
Of course you can do it the RegExp way, but it will probably be the same logic, just less readable, more difficult to handle, and not measurably faster.

Is there a way to replace string ">" with > in an 'if' condition?

I came across the below use case, but I could not find a proper solution.
Is there a way to replace string "<" or ">" with condition < or > in an if condition?
Example:
string condition = "<";
if (10 condition 8) // Here I want to replace condition with <
{
// Some code
}
I don't want to do it like:
if ("<" == condition)
{
if (10 < 8)
{
}
}
else if (">" == condition)
{
if (10 > 8)
{
}
}
My condition will change at run time. I am looking for a simpler way than the above, if one exists.
Use case: The user will give some query like below:
input: 10 > 9 => output: true
input: 10 < 7 => output: false
Basically I need to parse this query, as I have these 3 words (10, >, 9) as strings, and somehow I want to convert string ">" or "<" to actual symbol > or <.
You can map the string to a standard library comparator functor such as std::less via a std::map or a std::unordered_map.
You can't create a new operator in C++ (Can I create a new operator in C++ and how?). I can see where you are coming from with this idea, but the language just doesn't support that. You can, however, create a function that takes two operands and a string "argument" and returns the appropriate value.
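bool CustomCompare(int operand1, int operand2, const std::string& op)
{
    if (op == "<")
    {
        return operand1 < operand2;
    }
    if (op == ">")
    {
        return operand1 > operand2;
    }
    if (op == "_")
    {
        // Placeholder for any custom operation you want to support.
        return DoTheHokeyPokeyAndTurnTheOperandsAround(operand1, operand2);
    }
    // Every path must return or throw; std::invalid_argument needs <stdexcept>.
    throw std::invalid_argument("unknown operator: " + op);
}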
std::function<bool(int,int)> comparator = std::less<int>();
if(comparator(10, 8))
{
//some code
}
See Also:
http://en.cppreference.com/w/cpp/utility/functional/function
http://en.cppreference.com/w/cpp/utility/functional/less
#include <functional>
#include <map>
#include <string>
int main()
{
using operation = std::function<bool(int,int)>;
std::map<std::string, operation> mp =
{{"<", std::less<int>()},
{">", std::greater<int>()}};
int x = 5;
int y = 10;
std::string op = "<";
bool answer = mp.at(op)(x, y); // at() throws std::out_of_range for an unknown operator
}
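For the use case in the question (input such as 10 > 9), a minimal sketch might parse the three tokens and look the operator up in such a map. The function name evaluate and the choice to throw std::invalid_argument on bad input are assumptions made for illustration, not something the answers above prescribe. It also assumes the tokens are separated by whitespace.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: evaluates a query of the form "<int> <op> <int>".
bool evaluate(const std::string& query)
{
    using operation = std::function<bool(int, int)>;
    static const std::map<std::string, operation> ops = {
        {"<", std::less<int>()},
        {">", std::greater<int>()},
    };

    // Split the query into its three whitespace-separated tokens.
    std::istringstream in(query);
    int lhs = 0, rhs = 0;
    std::string op;
    if (!(in >> lhs >> op >> rhs))
        throw std::invalid_argument("malformed query: " + query);

    // find() does not insert a default entry, unlike operator[].
    auto it = ops.find(op);
    if (it == ops.end())
        throw std::invalid_argument("unknown operator: " + op);
    return it->second(lhs, rhs);
}
```

With this sketch, evaluate("10 > 9") yields true and evaluate("10 < 7") yields false. Supporting more operators is a matter of adding entries to the map.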
If you are a C++ Ninja, and you are very stubborn to get it working just the way you wish, there is a way, but it is advanced and complicated.
I mostly write POSIX code for VxWorks, but I guess it can be done for any other.
Let's say you have: string myExpression = "mySize > minSize";
Rebuild the string as C code (save it to a file and compile it with ccppc, gcc, gpp, or whatever toolchain you have).
Link it with your code, at least to get ELF relocations for mySize and minSize (I think it can be done using app-load, if you customize your ld command).
Load the code using ld.
Jump to the address you loaded your code to.
All that said, I would not recommend you to do that:
Very complicated.
Not the most stable, and very bug/error prone.
Can lead to major security vulnerabilities, so proper input sanitization is required.
The only pro I can see, is that this method supports everything C has to offer out of the box! BitWise, +-/*^!, even functions as pow(), and such.
A little bit better is:
To compile a function as:
`"bool comparer_AB(int a, int b) { return a " + operator + " b; }"`
and then call comparer_AB(a, b).

Searching a function with the cctype library to find the number of characters that are digits in a range

Trying to solve one of the questions I was given by an instructor and I'm having trouble understanding how to call this properly.
I'm given a function that is linked to a test driver and my goal is to use the cstring library to find any numbers in the range of 0-9 in a randomly generated string object with this function.
int countDigits(char * const line) {return 0;}
So far this is what I have:
int countDigits(char * const line)
{
int i, index;
index = -1;
found = false;
i = 0;
while (i < *line && !found)
{
if (*line > 0 && *line < 9)
index++;
}
return 0;
}
My code is not great, and at the moment it only results in an infinite loop and failure. Any help would be very much appreciated.
Well, there are several problems with your function:
you want it to return the number of digits, but it returns 0 in every case
found is never set to anything other than false, so it can never stop the while loop
the comparison i < *line does not make much sense to me; I guess you want to check for the end of the line. You probably want to look for the null terminator '\0' (and again, i is never set to anything other than 0, which is why the loop never ends)
if you want to compare single characters, compare against the character codes of the digits ('0' to '9'), because the digits 0-9 are not equal to the codes 0-9
I hope that is a start for improving your function.
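There's a ready-made algorithm for this called count_if (taking the character as unsigned char avoids undefined behavior in isdigit for negative char values):
std::count_if(begin, end, [](unsigned char c){ return std::isdigit(c) != 0; });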
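Applied to the function from the question, that could look roughly like the sketch below. It assumes line is a null-terminated C string, and it changes the parameter type to const char* (an assumption on my part, so that string literals can be passed); adapt it to whatever signature your test driver expects.

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Counts the decimal digits ('0' to '9') in a null-terminated C string.
int countDigits(const char* line)
{
    // The end of the range is the position of the terminating '\0'.
    const char* end = line + std::strlen(line);
    return static_cast<int>(std::count_if(line, end, [](unsigned char c) {
        // isdigit comes from <cctype>; it returns non-zero for digits.
        return std::isdigit(c) != 0;
    }));
}
```

For example, countDigits("a1b22c") returns 3 and countDigits("") returns 0.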

Match a structure against set of patterns

I need to match a structure against a set of patterns and take some action for each match.
Patterns should support wildcards, and I need to determine which patterns match the incoming structure. Example set:
action=new_user email=*
action=del_user email=*
action=* email=*#gmail.com
action=new_user email=*#hotmail.com
These patterns can be added or removed in real time. There can be thousands of connections, each with its own pattern, and I need to notify each connection whose pattern matches a structure I have received. The patterns are not full regexes; I just need to match a string with the wildcard * (which simply matches any number of characters).
When the server receives a message (let's call it message A) with the structure action=new_user email=testuser#gmail.com, I need to find out that patterns 1 and 3 match this message, and then perform the action for each matching pattern (send structure A to the corresponding connection).
How can this be done in the most efficient way? I can iterate over the patterns and check them one by one, but I'm looking for a more efficient and thread-safe way to do this. It is probably possible to group the patterns to reduce checking. Any suggestions on how this can be done?
UPD: Please note that I want to match multiple patterns (thousands) against a fixed "string" (actually a struct), not vice versa. In other words, I want to find which patterns fit a given structure A.
Convert the patterns to regular expressions, and match them using RE2, which is written in C++ and is one of the fastest.
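As a rough illustration of the conversion step, here is a sketch that uses std::regex from the standard library as a stand-in for RE2 (so it says nothing about RE2's own API or speed). The helper names wildcard_to_regex and wildcard_match are hypothetical. Every * becomes .* and regex metacharacters in the pattern are escaped.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: turns a '*' wildcard pattern into a std::regex.
std::regex wildcard_to_regex(const std::string& pattern)
{
    // Characters that have a special meaning in ECMAScript regex syntax.
    static const std::string special = R"(\^$.|?+()[]{})";
    std::string out;
    for (char c : pattern) {
        if (c == '*') {
            out += ".*";            // wildcard: any number of characters
        } else {
            if (special.find(c) != std::string::npos)
                out += '\\';        // escape regex metacharacters
            out += c;
        }
    }
    return std::regex(out);
}

// Hypothetical helper: true if the whole text matches the wildcard pattern.
bool wildcard_match(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, wildcard_to_regex(pattern));
}
```

In a real server you would presumably compile each pattern once, when it is added, and keep the compiled regex next to its connection instead of rebuilding it for every message as wildcard_match does here.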
Actually, if I understood correctly, the fourth pattern is redundant, since the first pattern is more general and includes every string matched by the fourth. That leaves only 3 patterns, which can be easily checked by this function:
bool matches(const char* name, const char* email)
{
return strstr(name, "new_user") || strstr(name, "del_user") || strstr(email, "#gmail.com");
}
And if you prefer to parse whole string, not just match the values of action and email, then the following function should do the trick:
bool matches2(const char* str)
{
bool match = strstr(str, "action=new_user ") || strstr(str, "action=del_user ");
if (!match)
{
const char* emailPtr = strstr(str, "email=");
if (emailPtr)
{
match = strstr(emailPtr, "#gmail.com");
}
}
return match;
}
Note that the strings you pass as arguments must be null-terminated (end with \0), because strstr relies on the terminator to find the end of each string.
This strglobmatch supports * and ? only.
#include <string.h> /* memcmp, strchr, strlen */
char* strfixstr(char *s1, char *needle, int needle_len) {
int l1;
if (!needle_len) return (char *) s1;
if (needle_len==1) return strchr(s1, needle[0]); /* strchr is the standard equivalent of the legacy index() */
l1 = strlen(s1);
while (l1 >= needle_len) {
l1--;
if (0==memcmp(s1,needle,needle_len)) return (char *) s1;
s1++;
}
return 0;
}
int strglobmatch(char *str, char *glob) {
/* Test: strglobmatch("almamxyz","?lmam*??") */
int min;
while (glob[0]!='\0') {
if (glob[0]!='*') {
if ((glob[0]=='?') ? (str[0]=='\0') : (str[0]!=glob[0])) return 0;
glob++; str++;
} else { /* a greedy search is adequate here */
min=0;
while (glob[0]=='*' || glob[0]=='?') min+= *glob++=='?';
while (min--!=0) if (*str++=='\0') return 0;
min=0; while (glob[0]!='*' && glob[0]!='?' && glob[0]!='\0') { glob++; min++; }
if (min==0) return 1; /* glob ends with star */
if (!(str=strfixstr(str, glob-min, min))) return 0;
str+=min;
}
}
return str[0]=='\0';
}
If all you want is wildcard matching, then you might try this algorithm. The idea is to check that all the substrings between wildcards appear in the string in order.
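patterns = ["*#gmail.com", "akalenuk#*", "a*a#*", "ak*#gmail.*", "ak*#hotmail.*", "*#*.ua"]
string = "akalenuk#gmail.com"
preprocessed_patterns = [p.split('*') for p in patterns]
def match(s, pp):
    # pp is a pattern split on '*'; without any '*' the match must be exact.
    if len(pp) == 1:
        return s == pp[0]
    first, last = pp[0], pp[-1]
    # The first piece must be a prefix and the last piece a suffix, without overlapping.
    if len(s) < len(first) + len(last) or not s.startswith(first) or not s.endswith(last):
        return False
    i, end = len(first), len(s) - len(last)
    # The middle pieces must appear in order between the prefix and the suffix.
    for w in pp[1:-1]:
        wi = s.find(w, i, end)
        if wi == -1:
            return False
        i = wi + len(w)
    return True
print([match(string, pp) for pp in preprocessed_patterns])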
But it might still be best to use a regexp in case you need something more than a wildcard in the future.

Is using std::regex for simple regexes good practice?

For example, my situation:
I'm getting an input of "0", "1", "true" or "false" (in any letter case).
Which is preferred in terms of performance, readability, and general best practice:
bool func(string param)
{
string lowerCase = param;
to_lower(lowerCase);
if (lowerCase == "0" || lowerCase == "false")
{
return false;
}
if (lowerCase == "1" || lowerCase == "true")
{
return true;
}
throw ....
}
or:
bool func(string param)
{
string lowerCase = param;
to_lower(lowerCase);
regex rxTrue ("1|true");
regex rxFalse ("0|false");
if (regex_match(lowerCase, rxTrue))
{
return true;
}
if (regex_match(lowerCase, rxFalse))
{
return false;
}
throw ....
}
The second is somewhat clearer, and easier to extend (e.g. accepting
"yes" and "no", or prefixes, with "1|t(?:rue)?" and
"0|f(?:alse)?"). With regard to performance, the second can (and
should) be made significantly faster by declaring the regex static
(and const, while you're at it), e.g.:
static regex const rxTrue ( "1|true" , regex_constants::icase );
static regex const rxFalse( "0|false", regex_constants::icase );
Note too that by specifying case insensitivity, you'll not have to
convert the input to lower case.
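Put together, the function could look something like the sketch below. The name parse_bool and the std::invalid_argument exception are assumptions for illustration, since the original code leaves the throw unspecified.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical name: maps "1"/"true" to true and "0"/"false" to false,
// ignoring letter case.
bool parse_bool(const std::string& param)
{
    // Static locals are constructed once, on the first call, so the
    // regexes are not recompiled every time the function runs.
    static const std::regex rxTrue("1|true", std::regex_constants::icase);
    static const std::regex rxFalse("0|false", std::regex_constants::icase);

    if (std::regex_match(param, rxTrue))
        return true;
    if (std::regex_match(param, rxFalse))
        return false;
    // Assumed error handling; the question does not say what is thrown.
    throw std::invalid_argument("not a boolean: " + param);
}
```

Since regex_match has to match the entire input, something like "true!" is rejected and does not count as true.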
It's just a hunch, but probably the first one is going to be faster (no regex-compiling involved). Also, the second version depends on your compiler supporting the C++11 <regex> implementation, so depending on the environments you need to support, the second option is ruled out automatically.