Regular expression for valid and invalid items together - C++

In the items below, I want to detect only the valid ones with a regular expression.
A space in the word means invalid, a # sign means invalid, and a word starting with a number is invalid.
Invalid : M_123 ASD
Invalid : M_123#ASD
Invalid : 1_M# ADD
Valid : M_125ASD
Valid : M_125$ASD
I am trying the following:
[A-Za-z0-9_$]
It is not working properly. I need to handle both the valid and the invalid sets for a word.
Can I do this match with a regular expression?

Your regex [A-Za-z0-9_$] is a character class that matches a single character: an ASCII letter, a digit, _ or $. With std::regex_match, it only matches a whole string that consists of exactly one such character, because that function requires the pattern to match the entire input. With std::regex_search, a string like ([_]) would pass, since the search is not anchored and accepts partial matches.
To match 0 or more chars, add the * quantifier after the character class. To match one or more chars, add the + quantifier instead. However, you have an additional restriction: a digit cannot appear at the start.
It seems you may use
^[A-Za-z][A-Za-z0-9_$]*$
See the regex demo at regex101.com.
Details:
^ - start of string
[A-Za-z] - an ASCII letter (exactly one occurrence)
[A-Za-z0-9_$]* - 0+ ASCII letters, digits, _ or $
$ - end of string anchor.
Note that with regex_match, you may omit ^ and $ anchors.
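The difference between the two functions can be sketched in C++ as follows. This is a minimal illustration that assumes std::regex with its default ECMAScript grammar; the helper names (is_valid_identifier, class_matches_whole, class_found_anywhere) are hypothetical and do not come from the question.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true when the whole word starts with an ASCII letter
// and continues with letters, digits, '_' or '$' only.
bool is_valid_identifier(const std::string& word) {
    // regex_match must consume the entire string, so ^ and $ are omitted here.
    static const std::regex pattern(R"([A-Za-z][A-Za-z0-9_$]*)");
    return std::regex_match(word, pattern);
}

// The bare character class from the question, used with regex_match:
// the whole string has to be exactly one allowed character.
bool class_matches_whole(const std::string& s) {
    static const std::regex one_char(R"([A-Za-z0-9_$])");
    return std::regex_match(s, one_char);
}

// The same class used with regex_search: one allowed character anywhere
// in the string is enough.
bool class_found_anywhere(const std::string& s) {
    static const std::regex one_char(R"([A-Za-z0-9_$])");
    return std::regex_search(s, one_char);
}
```

Calling is_valid_identifier on the five items from the question accepts M_125ASD and M_125$ASD and rejects the other three.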

So the requirements are:
cannot start with a number (I am assuming it must start with a letter)
cannot contain a space or #
all other characters are valid
You can try this regex: ^[a-zA-Z]((?![\# ]).)+?$
^[a-zA-Z] checks for a letter at the start of the line
((?![\# ]).)+?$ checks that there is no # or space in the remaining part of the line.
EDIT
As per Wiktor's comment the regex can be simplified to ^[a-zA-Z][^# ]+$.
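A minimal C++ sketch of the simplified pattern, assuming std::regex with the default ECMAScript grammar. The function name is_valid_loose is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: the first character is an ASCII letter, and the rest
// (one or more characters) is anything except '#' and space.
bool is_valid_loose(const std::string& word) {
    static const std::regex pattern(R"(^[a-zA-Z][^# ]+$)");
    return std::regex_match(word, pattern);
}
```

Two consequences of this form are worth noting. Because of the + quantifier, a single-letter word such as M is rejected (using * instead would accept it), and because the second class is a negated one, characters such as - are accepted as well.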

Related

Regex for only alphanumeric(only numeric should not be allowed)

I have tried
^[a-zA-Z0-9]*[a-zA-Z][a-zA-Z0-9]*$
and
^(?![0-9]*$)[a-zA-Z0-9]+$
but both also allow numbers-only input.
You may try this:
^(?=.*[a-zA-Z])[a-zA-Z0-9]+$
Explanation:
^ start of string.
(?=.*[a-zA-Z]) positive lookahead that ensures at least one letter a-zA-Z exists somewhere in the string; if so:
[a-zA-Z0-9]+ matches one or more alphanumeric characters.
$ end of string.
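As a rough C++ illustration of the lookahead approach (assuming std::regex in its default ECMAScript mode, which supports lookahead; the name is_alnum_with_letter is hypothetical):

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: the string is purely alphanumeric and contains
// at least one ASCII letter somewhere.
bool is_alnum_with_letter(const std::string& s) {
    // (?=.*[a-zA-Z]) looks ahead for a letter without consuming input;
    // [a-zA-Z0-9]+ then has to cover the whole string.
    static const std::regex pattern(R"(^(?=.*[a-zA-Z])[a-zA-Z0-9]+$)");
    return std::regex_match(s, pattern);
}
```

With this sketch, abc123 and 123abc are accepted, while 123, the empty string and ab_1 are rejected.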

How to make regex that can take at most one asterisk in character class?

I want to create a regular expression that matches a string that starts with an optional minus sign - and ends with a minus sign. The part in between must begin with a letter (upper or lower case), which can be followed by any combination of letters and numbers and may contain at most one asterisk (*).
So far I have come up with this:
[-]?[a-zA-Z]+[a-zA-Z0-9(*{0,1})]*[-]
Some examples of what I am trying to achieve.
"-yyy-" // valid
"-u8r*y75-" // valid
"-u8r**y75-" // invalid
Code
^-?[a-z](?!(?:.*\*){2})[a-z\d*]*-$
Alternatively, you can use the following regex to achieve the same results without using a negative lookahead.
^-?[a-z][a-z\d]*(?:\*[a-z\d]*)?-$
Results
Input
** VALID **
-yyy-
-u8r*y75-
** INVALID **
-u8r**y75-
Output
-yyy-
-u8r*y75-
Explanation
^ Assert position at the start of the line
-? Match zero or one of the hyphen character -
[a-z] Match a single ASCII alpha character between a and z. Note that the i modifier is turned on, thus this will also match uppercase variations of the same letters
(?!(?:.*\*){2}) Negative lookahead ensuring what follows doesn't match
(?:.*\*){2} Match any characters followed by an asterisk *, twice (that is, two asterisks anywhere ahead)
[a-z\d*]* Match any ASCII letter between a and z, or a digit, or the asterisk symbol * literally, any number of times
- Match this character literally
$ Assert position at the end of the line
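The lookahead-free variant can be tried in C++ roughly like this. It is a sketch that assumes std::regex with the ECMAScript grammar; instead of relying on the i modifier it spells out both letter cases in the classes, and the function name has_at_most_one_asterisk is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: optional leading '-', a letter, then letters/digits
// with at most one '*' among them, and a mandatory trailing '-'.
bool has_at_most_one_asterisk(const std::string& s) {
    // (?:\*[A-Za-z\d]*)? allows a single asterisk followed by more
    // letters or digits; a second asterisk has nowhere to match.
    static const std::regex pattern(
        R"(^-?[A-Za-z][A-Za-z\d]*(?:\*[A-Za-z\d]*)?-$)");
    return std::regex_match(s, pattern);
}
```

For the three examples from the question this accepts -yyy- and -u8r*y75- and rejects -u8r**y75-.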
Try this one:
-(((\w|\d)*)(\*?)((\w|\d)*))-
You can try it here:
https://regex101.com/
(-)?(\w)+(\*(?!\*)|\w+)(-)
I used grouping to make it clearer. I changed [a-zA-Z0-9] to \w, which is nearly the same (\w also matches the underscore _).
(\*(?!\*)|\w+)
This is the important change. Explained in words:
It matches either a star \* that is not followed by another star ((?!\*) is a negative lookahead, which inspects the text that comes next), or one or more \w characters ([a-zA-Z0-9_]).
Use this site to test: https://regexr.com/
They have a pretty good explanation in the left menu under "Reference".

Match 3 and 4 delimiters and between them; not less not more

I have a command-line program whose first argument (argv[1]) is a regex pattern.
./program 's/one-or-more/anything/gi/digit-digit'
So I need a regex to check whether the user's input is well formed. This would be easy to solve, but I use the C++ standard library and std::regex_match, and this function behaves as if begin and end assertions (^ and $) were put around the pattern, so the non-greedy quantifier is effectively ignored.
Let me clarify. If I want to match /anything/ then I can use /.*?/, but std::regex_match treats this pattern as ^/.*?/$, and therefore if the user enters /anything/anything/anything/, std::regex_match still returns true even though the input is not correct. std::regex_match only returns true or false, and the expected input from the user can only be text that follows the pattern. Since the input varies, I cannot list all possibilities here, but I will give some examples.
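To make the behaviour described above concrete, here is a minimal sketch. It assumes std::regex with the default ECMAScript grammar, and the helper names whole_input_matches and first_partial_match are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole input matches /.*?/ . Laziness cannot shorten the
// match here, because regex_match only succeeds if the entire string is
// consumed.
bool whole_input_matches(const std::string& input) {
    static const std::regex pattern(R"(/.*?/)");
    return std::regex_match(input, pattern);
}

// Returns the first partial match found by regex_search (or an empty string).
// Here the lazy quantifier does stop at the first closing delimiter.
std::string first_partial_match(const std::string& input) {
    static const std::regex pattern(R"(/.*?/)");
    std::smatch m;
    return std::regex_search(input, m, pattern) ? m.str() : std::string();
}
```

So the lazy quantifier is still honoured by regex_search, but with regex_match it is forced to expand until the final delimiter, which is why counting delimiters has to be done in the pattern itself.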
Should match:
/.//
s/.//
/.//g
/.//i
/././gi
/one-or-more/anything/
/one-or-more/anything/g/3
/one-or-more/anything/i
/one-or-more/anything/gi/99
s/one-or-more/anything/g/4
s/one-or-more/anything/i
s/one-or-more/anything/gi/54
and anything else that looks like this pattern
Rules:
delimiters are /|##
the letter s at the beginning and g, i and 2 digits at the end are optional
the std::regex_match function returns true if the entire target character sequence can be matched, otherwise it returns false
between the first and second delimiter there can be one or more characters (+)
between the second and third delimiter there can be zero or more characters (*)
between the third and fourth delimiter there can be g or i
at least 3 delimiters must be present, as in /.//, so /./ should not match
ECMAScript (ECMA-262) syntax is allowed for the pattern
NOTE
You may want to see my question about std::regex_match:
std::regex_match and lazy quantifier with strange behavior
I do not need any C++ code, I just need a pattern.
Do not try d?([/|##]).+?\1.*?\1[gi]?[gi]?\1?d?\d?\d?. It fails.
My attempt so far: ^(?!s?([/|##]).+?\1.*?\1.*?\1)s?([/|##]).+?\2.*?\2[gi]?[gi]?\d?\d?$
If you are willing to try, you should put ^ and $ around your pattern
If you need more details, please leave a comment and I will update the question.
Thanks.
You could use this regular expression:
^s?([/|##])((?!\1).)+\1((?!\1).)*\1((gi?|ig)(\1\d\d?)?|i)?$
See regex101.com
Note how this also rejects these cases:
///anything/
/./anything/gg
/./anything/ii
/./anything/i/12
How it works:
Some explanation of the parts that are different:
((?!\1).): this matches any character that is not the delimiter. This way you can keep track of the exact number of delimiters used. It also prevents the first character after the first delimiter from being that delimiter again, which should not be allowed.
(gi?|ig): matches any of the valid modifier combinations, except a sole i, which is treated separately. So this also excludes gg and ii as valid character sequences.
(\1\d\d?)?: optionally allows an extra delimiter (after a g modifier, see previous) to be added with one or two digits following it.
(...|i)?: for the case where no g modifier is present, only an i or nothing: then no digits are allowed to follow.
This is a tricky one, but I took the challenge - here is what I have ended up with:
^s?([\/|##])(?:(?!\1).)+\1(?:(?!\1).)*\1(?:i|(?:gi?|ig)(\1\d{1,2})?)?$
Pattern breakdown:
^ matches start of string
s? matches an optional 's' character
([\/|##]) matches the delimiter character and captures it as group 1
(?:(?!\1).)+ matches anything other than the delimiter character one or more times (uses negative lookahead to make sure that the character isn't the delimiter matched in group 1)
\1 matches the delimiter character captured in group 1
(?:(?!\1).)* matches anything other than the delimiter character zero or more times
\1 matches the delimiter character captured in group 1
(?: starts a new group
i matches the i character
| or
(?:gi?|ig) matches either g, gi, or ig
(\1\d{1,2})? followed by an optional extra delimiter and 0-9 once or twice
)? closes group and makes it optional
$ matches end of string
I have used non-capturing groups where possible; these are groups that start with ?:
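Although the question only asks for a pattern, a short C++ sketch shows how it could be checked with std::regex_match. It assumes the default ECMAScript grammar (which supports backreferences and negative lookahead), writes the delimiter class as [/|#] since repeating # inside a class is redundant, and uses the hypothetical function name is_valid_command.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical validator for arguments such as s/one-or-more/anything/gi/54.
bool is_valid_command(const std::string& arg) {
    // Group 1 captures the delimiter; (?:(?!\1).) matches any single
    // character that is not that delimiter, so delimiters can be counted.
    static const std::regex pattern(
        R"(^s?([/|#])(?:(?!\1).)+\1(?:(?!\1).)*\1(?:i|(?:gi?|ig)(\1\d{1,2})?)?$)");
    return std::regex_match(arg, pattern);
}
```

In this sketch, inputs such as /.//, /././gi and /one-or-more/anything/gi/99 are accepted, while /./, ///anything/, /./anything/gg and /./anything/i/12 are rejected.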

How do I check a whole existing regular expression for a digit?

I have written a regular expression as follows:
"^[\+]{0,1}([\#]|[\*]|[\d]){1,15}$"
In summary this matches an optional '+' sign followed by up to 15 characters which might be '#', '*' or a digit.
However, this means that '+#' will match and this is not a valid result as I always need at least one number.
Typical valid matches might be:
+1234
445678999
+#7897897
+345764756#775
So, given that I've crafted a valid RegEx for these to match, I guess the elegant solution is to use this regex and add some special criterion to globally check for a digit in the result OR somehow disallow anything which doesn't have at least one digit in.
How do I check for that digit?
This solution requires at least one digit in the string, using a lookahead (the (?=...) section):
^(?=.*\d)\+?[#*\d]{1,15}$
Legend
^ # Start of the string (or line with m/multiline flag)
(?=.*\d) # Lookahead that checks for at least one digit in the match
\+? # An optional literal plus '+'
[#*\d]{1,15} # one to fifteen of literal '#' or '*' or digit (\d is [0-9])
$ # End of the string (line with m/multiline flag)
NOTE: this pattern also rejects combinations such as +*, + or #*.
Try this regex (my first idea initially):
^(?=.*[0-9])[+]?([#*\d]{1,15})$
You can replace [0-9] with \d.
DEMO:
https://regex101.com/r/bM9oE6/3
I'd use
^(?=.*\d)\+?[#*\d]{1,15}$
Explanation:
^ : beginning of line
(?= : lookahead
.*\d : at least one digit
)
\+? : optional +
[#*\d]{1,15} : 1 to 15 characters in class [#*\d]
$ : end of line
matched:
+1234
445678999
+#7897897
+345764756#775
###456
not matched:
+#*
+*
#*
+#
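The matched and not-matched lists above can be reproduced with a small C++ sketch. The question does not name a language, so C++ with std::regex (default ECMAScript grammar) is an assumption here, and is_valid_number is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: optional '+', then 1 to 15 characters out of
// '#', '*' and digits, with at least one digit somewhere in the string.
bool is_valid_number(const std::string& s) {
    // (?=.*\d) is the lookahead that enforces the "at least one digit" rule.
    static const std::regex pattern(R"(^(?=.*\d)\+?[#*\d]{1,15}$)");
    return std::regex_match(s, pattern);
}
```

Note that the 15-character limit applies to the part after the optional +, so a string of 16 digits is rejected.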
This should work in your case:
^(\+{0,1}[\d#]{1,15})$
Demo:
https://regex101.com/r/fU1eC2/1
Edit:
If you need # after + in string use ^[+#]?([\d#]{1,15})(?<!#)$
matches "+#7897897"
If you don't, use ^[+#]*([\d#]{1,15})(?<!#)$
matches "+#7897897"

regular expression for matching

This is for a normal register name, which can be 1 to n characters from a-zA-Z and -, like
larry-cai, larrycai, larry-c-cai, l,
but - can't be the first or the last character, like
-larry, larry-
My thinking is something like
^[a-zA-Z]+[a-zA-Z-]*[a-zA-Z]+$
but with my regex the length has to be at least 2.
It should be simple, but I don't know how to do it.
It would be nice if you could write it and check that it passes http://tools.netshiftmedia.com/regexlibrary/
You didn't specify which regex engine you're using. One way would be (if your engine supports lookaround):
^(?!-)[A-Za-z-]+(?<!-)$
Explanation:
^ # Start of string
(?!-) # Assert that the first character isn't a dash
[A-Za-z-]+ # Match one or more "allowed" characters
(?<!-) # Assert that the previous character isn't a dash...
$ # ...at the end of the string.
If lookbehind is not available (for example in JavaScript):
^(?!-)[A-Za-z-]*[A-Za-z]$
Explanation:
^ # Start of string
(?!-) # Assert that the first character isn't a dash
[A-Za-z-]* # Match zero or more "allowed" characters
[A-Za-z] # Match exactly one "allowed" character except dash
$ # End of string
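The lookbehind-free form is also the one that carries over to C++, since the ECMAScript grammar used by std::regex offers lookahead but no lookbehind. A minimal sketch under that assumption, with the hypothetical function name is_valid_name:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: letters and dashes only, with a dash allowed
// neither at the start nor at the end.
bool is_valid_name(const std::string& name) {
    // (?!-) rejects a leading dash; the final [A-Za-z] rejects a trailing one.
    static const std::regex pattern(R"(^(?!-)[A-Za-z-]*[A-Za-z]$)");
    return std::regex_match(name, pattern);
}
```

This accepts larry-cai, larrycai, larry-c-cai and l, and rejects -larry and larry-. Consecutive dashes in the middle, as in a--b, are still accepted by this pattern.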
This should do it:
^[a-zA-Z]+(-[a-zA-Z]+)*$
With this there need to be one or more alphabetic characters at the beginning (^[a-zA-Z]+). And if a - follows, it needs to be followed by at least one alphabetic character (-[a-zA-Z]+). That pattern can be repeated an arbitrary number of times until the end of the string is reached.
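A short C++ sketch of this segment-based pattern, assuming std::regex with the default ECMAScript grammar; the name is_valid_name_strict is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: one or more letter runs joined by single dashes.
bool is_valid_name_strict(const std::string& name) {
    // Every '-' must be followed by at least one letter, so the name can
    // neither start nor end with a dash, nor contain two dashes in a row.
    static const std::regex pattern(R"(^[a-zA-Z]+(-[a-zA-Z]+)*$)");
    return std::regex_match(name, pattern);
}
```

Besides rejecting -larry and larry-, this form also rejects doubled dashes such as larry--cai, which may or may not be desired since the question does not say.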
A simple answer would be:
^(([a-zA-Z])|([a-zA-Z][a-zA-Z-]*[a-zA-Z]))$
This matches either a string with length 1 and characters a-zA-Z or it matches an improved version of your original expression which is fine for strings with length greater than 1.
Credit for the improvement goes to Tim and ridgerunner (see comments).
Try this:
^[a-zA-Z]+([-]*[a-zA-Z])*$
Not sure which lazy group takes precedence..
^[a-zA-Z][a-zA-Z-]*?[a-zA-Z]?$
maybe this?
^[^-]\S*[^-]$|^[^-]{1}$