fixed number of characters in a regex match - regex

Is there a way to match a fixed number of characters in a fixed length string via regex?
For example, I want to match all strings of length 5 that contain exactly 3 letters and 2 exclamation marks (!). The exclamation marks can be anywhere in the string.
Example matches: abc!!, a!b!c, !!abc, a!!bc
I tried to match using lookahead but I wasn't able to limit the length. The following was the regex I used.
(?=\w*!\w*!\w*)[\w!]{5}
This matches a!!!b and a!!!! as well which I don't want.

You can do this using a lookahead-based regular expression.
^(?=(?:\w*!){2}\w*$)[\w!]{5}$
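A quick way to check this pattern against the strings from the question. This is only a minimal sketch: the question does not name a language, so Python's re module and the name PATTERN are assumptions made for illustration.

```python
import re

# Lookahead: the whole string holds exactly two '!' with only \w around them.
# Main pattern: exactly five characters drawn from \w and '!'.
# Note: \w also accepts digits and underscore, not just letters.
PATTERN = re.compile(r"^(?=(?:\w*!){2}\w*$)[\w!]{5}$")

wanted = ["abc!!", "a!b!c", "!!abc", "a!!bc"]
unwanted = ["a!!!b", "a!!!!"]

for s in wanted + unwanted:
    print(s, bool(PATTERN.match(s)))
```

The four wanted strings should report True and the two unwanted ones False, because the lookahead rejects any string with a third exclamation mark.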

Probably easiest to just specify all possibilities.
^(?:\w\w\w!!|\w\w!\w!|\w\w!!\w|\w!\w\w!|\w!\w!\w|\w!!\w\w|!\w\w\w!|!\w\w!\w|!\w!\w\w|!!\w\w\w)$
Regex doesn't work well with combinations/permutations.
If the number of combinations is too large, do it in parts: a first regex gathers potential matches, and a second check (and any further ones) validates them.
[\w!]{5}
match.count('!') == 2
match.count('\w') == 3
(that isn't valid code -- just a concept)
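The concept above could be made runnable roughly like this. It is a sketch in Python (an assumption, since no language was given), and the helper name is_three_word_two_bang is hypothetical.

```python
import re


def is_three_word_two_bang(candidate: str) -> bool:
    """Hypothetical helper: two-step check instead of one large regex."""
    # Step 1: coarse filter, exactly five characters that are word chars or '!'.
    if not re.fullmatch(r"[\w!]{5}", candidate):
        return False
    # Step 2: count each kind of character separately.
    bangs = candidate.count("!")
    word_chars = len(re.findall(r"\w", candidate))
    return bangs == 2 and word_chars == 3


print(is_three_word_two_bang("a!b!c"))
print(is_three_word_two_bang("a!!!b"))
```

Since step 1 already fixes the length and the alphabet, counting the exclamation marks alone would be enough; both counts are kept here only to mirror the concept.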

Related

Regex not select word with character at the end

I have a simple question.
I need a regular expression to match a hexadecimal number that has no colon at the end.
For example:
0x85af6b9d: 0x00256f8a ;some more interesting code
// don't match 0x85af6b9d: at all, but match 0x00256f8a
My expression for a hexadecimal number is 0[xX][0-9A-Fa-f]{1,8}
A version with (?!:) does not work, because it will just match 0x85af6b9 (the {1,8} quantifier backtracks by one digit).
Using a $ also isn't possible, because there can be more than one number on a line.
Thanks!
Here is one way to do so:
0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])
We use a negative lookahead to match all hexadecimal numbers without : at the end. Because of {1,8}, it is also necessary to ensure that the entire hexadecimal number is correctly matched. We therefore reuse the character set ([0-9A-Fa-f]) to ensure that the number does not continue.
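To see the lookahead at work on the line from the question, here is a small sketch in Python (the language and the names HEX_NO_COLON, line and matches are assumptions for illustration):

```python
import re

# After up to 8 hex digits, the next character must be neither another
# hex digit (so the number cannot be cut short) nor a colon.
HEX_NO_COLON = re.compile(r"0[xX][0-9A-Fa-f]{1,8}(?![0-9A-Fa-f:])")

line = "0x85af6b9d: 0x00256f8a ;some more interesting code"
matches = HEX_NO_COLON.findall(line)
print(matches)
```

Only 0x00256f8a is returned: every shorter prefix of 0x85af6b9d is followed by another hex digit, and the full number is followed by the colon.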

Regex that matches the following character

I need a regex that matches the following patterns:
1. mrrtjjsf8907m5q29ui
2. 0?userid=y1arx6uxb1nidmz3tguv
3. bryj9itvwjmbyv3wg8ef
I am trying to pass these values to another variable using col=?([a-zA-Z0-9]{1,20})|([a-zA-Z0-9]{1,20})
It takes the right values for the first and third inputs, but for the second one it takes the value 0
when it should take y1arx6uxb1nidmz3tguv
I think you need to use this regex instead:
[a-zA-Z](?=[A-Za-z]*\d)[a-zA-Z0-9]{1,19}
This requires the match to start with a letter and to contain at least one digit after the leading letters, so neither 0 nor userid is considered a match.
Alternative:
The above does not cover the case where a valid sequence starts with a digit instead of a letter. In that case you may use the following regex, which handles both situations:
(?:[a-zA-Z](?=[A-Za-z]*\d)|\d(?=\d*[A-Za-z]))[a-zA-Z0-9]{1,19}
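Both patterns can be tried on the three inputs from the question. This is a minimal sketch in Python (an assumption, since the question does not name a language); LETTER_FIRST, EITHER_FIRST and first_match are hypothetical names.

```python
import re

LETTER_FIRST = re.compile(r"[a-zA-Z](?=[A-Za-z]*\d)[a-zA-Z0-9]{1,19}")
EITHER_FIRST = re.compile(
    r"(?:[a-zA-Z](?=[A-Za-z]*\d)|\d(?=\d*[A-Za-z]))[a-zA-Z0-9]{1,19}"
)

inputs = [
    "mrrtjjsf8907m5q29ui",
    "0?userid=y1arx6uxb1nidmz3tguv",
    "bryj9itvwjmbyv3wg8ef",
]


def first_match(pattern, text):
    """Return the first match of pattern in text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


for text in inputs:
    print(first_match(LETTER_FIRST, text), first_match(EITHER_FIRST, text))
```

For the second input both patterns skip 0 and userid, since neither contains a letter and a digit in the same unbroken run, and return y1arx6uxb1nidmz3tguv.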
It appears you do not want to match anything before an equals sign (=). You can use the end-of-string anchor $ so that the match is taken from the run of alphanumeric characters at the end of the string, which cannot include the = or anything before it.
([a-zA-Z0-9]{1,20})$
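A short sketch of the anchored version, assuming Python and the hypothetical names TAIL and tail_value:

```python
import re

# Capture up to 20 alphanumeric characters that sit at the very end.
TAIL = re.compile(r"([a-zA-Z0-9]{1,20})$")


def tail_value(text):
    """Return the trailing alphanumeric run (at most 20 chars), or None."""
    m = TAIL.search(text)
    return m.group(1) if m else None


print(tail_value("0?userid=y1arx6uxb1nidmz3tguv"))
print(tail_value("mrrtjjsf8907m5q29ui"))
```

One caveat of this approach: if the trailing run is longer than 20 characters, only its last 20 characters are captured.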

Regular expression (regex): each character can appear at most as many as given

So far I have this regex ^(?!.*?(a|c|e|g|i).*?\1)[acegi]+$ which matches any word made up of the characters "acegi", where each character can occur only once.
Now I'm trying to match any word that consists of the given characters, where each character can repeat at most as many times as it appears in the given set.
Example for set of given characters "acegii"
Valid matches: "acegii" "ace" "a" "i" "ai" "gii" "ici" "iic" "aicige" etc.
Invalid matches: "acegiii" "iacegii" "iii" "aa" "cc" etc.
Thanks for any help!
Note: the characters set in the regex should be easily replaceable if possible.
Preferred regex flavors: POSIX, Ruby
You can use something similar to what you have, but with a second negative lookahead for the i:
^(?!.*?([aceg]).*?\1)(?!.*?i.*?i.*?i)[acegi]+$
Basically, use one negative lookahead for each distinct maximum number of appearances.
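The pattern can be checked against the examples from the question. A minimal sketch, assuming Python's re module (the question asked for POSIX or Ruby, so the language and the name PER_CHAR_LIMIT are assumptions for illustration):

```python
import re

# First lookahead: none of a, c, e, g may appear twice.
# Second lookahead: i may not appear three times.
PER_CHAR_LIMIT = re.compile(r"^(?!.*?([aceg]).*?\1)(?!.*?i.*?i.*?i)[acegi]+$")

valid = ["acegii", "ace", "a", "i", "ai", "gii", "ici", "iic", "aicige"]
invalid = ["acegiii", "iacegii", "iii", "aa", "cc"]

for word in valid + invalid:
    print(word, bool(PER_CHAR_LIMIT.match(word)))
```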
Quantify your lookahead:
/^(?!.*?([acegi])(?:.*?\1){N})[acegi]+$/
Replace that N with the maximum number of appearances allowed for each character: {1} allows at most one of each character, {2} allows one or two occurrences, {3} allows up to three, and so on. Note that the same limit applies to every character in the set.
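To make the character set and N easy to swap, the pattern can be built from parameters. This is only a sketch in Python (an assumption); build_limit_regex is a hypothetical helper, and it assumes chars contains plain letters that need no escaping inside a character class.

```python
import re


def build_limit_regex(chars: str, n: int):
    """Hypothetical helper: each character of chars may occur at most n times."""
    # {{ and }} are literal braces in an f-string, so {{{n}}} becomes e.g. {2}.
    return re.compile(rf"^(?!.*?([{chars}])(?:.*?\1){{{n}}})[{chars}]+$")


at_most_two = build_limit_regex("acegi", 2)
at_most_one = build_limit_regex("acegi", 1)

print(bool(at_most_two.match("aicige")), bool(at_most_two.match("iii")))
print(bool(at_most_one.match("ace")), bool(at_most_one.match("aa")))
```

With N set to 2, a word such as aa is accepted as well, which is why this form does not express the mixed limits of the "acegii" example by itself.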
Keep in mind, though, that you are dangerously close to the path of catastrophic backtracking, which could well crash your script.
You may want to use string operations instead. In summary:
Match string against /^[acegi]+$/
Count the number of occurrences of each character (i.e. iterate through the string)
Get the maximum number of occurrences (could be a simple max() call if done right)
If that max is higher than your allowed limit, trigger failure.
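The steps above can be sketched with plain string operations. This variant is an assumption on my part rather than the answer's exact recipe: it is written in Python, within_limits is a hypothetical name, and instead of one global maximum it derives a per-character limit from the question's example set "acegii".

```python
from collections import Counter


def within_limits(word: str, allowed: str = "acegii") -> bool:
    """Hypothetical helper: word uses each character at most as often as allowed does."""
    if not word:
        return False
    limits = Counter(allowed)  # for "acegii": a, c, e, g once and i twice
    counts = Counter(word)
    # A character missing from allowed has a limit of 0, so it fails here too.
    return all(counts[ch] <= limits[ch] for ch in counts)


valid = ["acegii", "ace", "a", "i", "ai", "gii", "ici", "iic", "aicige"]
invalid = ["acegiii", "iacegii", "iii", "aa", "cc"]

for word in valid + invalid:
    print(word, within_limits(word))
```

This runs in linear time and has no backtracking at all, and changing the allowed set only means passing a different string.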

How to select any number of digits with regex?

I'm trying to write a regex that will extract the numbers after directory/ in the following URL:
http://www.website.com/directory/9892639512/alphanum3r1c/some-more-text-here/0892735235
If I know the number of digits, then this regex I wrote works:
directory\/([0-9]{7})\/
However, if I remove the number of digits to match {7}, then the regex stops working.
Live demo: http://regex101.com/r/wX3eI2
I've been trying different things, but can't seem to get the regex to work without explicitly setting the number of characters to match.
How can I get this working?
Change regex to:
directory\/([0-9]+)\/
{7} means exactly 7 of the preceding token (in this case, digits). + means one or more of the preceding token (again, digits).
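Applied to the URL from the question, this could look as follows. A minimal sketch assuming Python; the variable names are illustrative, and the slash needs no escaping in a Python pattern.

```python
import re

url = ("http://www.website.com/directory/9892639512/"
       "alphanum3r1c/some-more-text-here/0892735235")

# + accepts any number of digits (at least one) between the two slashes.
m = re.search(r"directory/([0-9]+)/", url)
number = m.group(1) if m else None
print(number)
```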

Regular Expression for matching a phone number

I need a regular expression to match phone numbers. I just want to know if the number is probably a phone number and it could be any phone format, US or international. So I developed a strategy to determine if it matches.
I want it to accept the following characters: 0-9 as well as ,.()- and optionally start with a + (for international numbers). The string should not match if it has any other characters.
I tried this:
/\+?[0-9\/\.\(\)\-]/
But it matches phone numbers that have + in the middle of the number. And it matches numbers that contain alpha chars (I don't want that).
Lastly, I want to set the minimum length to 9 characters.
Any thoughts?
Thanks for any help, I'm obviously not too swift on RegEx stuff :)
Well, you're pretty close. Try this:
^\+?[0-9\/.()-]{9,}$
Without the start and end anchors you allow partial matching, so it can match +123 from the string :-)+123.
If you want a minimum of 9 digits, rather than any characters (so ---.../// isn't valid), you can use:
^\+?[\/.()-]*([0-9][\/.()-]*){9,}$
or, using a lookahead: before matching the string against [0-9/.()-]*, the regex engine checks for (\D*\d){9}, which is a sequence of 9 digits, each digit possibly preceded by other characters (which are validated afterwards).
^\+?(?=(\D*\d){9})[0-9\/.()-]*$
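The three patterns can be compared side by side. This is a sketch under assumptions: Python's re module is used (the question's delimiters suggest another language), the names LOOSE, NINE_DIGITS and LOOKAHEAD are hypothetical, and the sample numbers are made up for illustration.

```python
import re

LOOSE = re.compile(r"^\+?[0-9\/.()-]{9,}$")                     # 9+ allowed chars
NINE_DIGITS = re.compile(r"^\+?[\/.()-]*([0-9][\/.()-]*){9,}$")  # 9+ digits
LOOKAHEAD = re.compile(r"^\+?(?=(\D*\d){9})[0-9\/.()-]*$")       # same, via lookahead

samples = ["+1(555)123-4567", "---...///", ":-)+123", "12+3456789"]

for s in samples:
    print(s, bool(LOOSE.match(s)), bool(NINE_DIGITS.match(s)), bool(LOOKAHEAD.match(s)))
```

The string ---...///  shows the difference: it passes the first pattern because it has nine allowed characters, but fails the other two because it has no digits.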
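The reason it matches alpha characters is that the pattern is not anchored, so it can match just a part of the string; the period is not the cause, since inside a character class it is literal. I don't know what editor you are using for this; this is what I would use in Vim: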
^+\?[()\-\.]\?\([0-9][\.()\-]\?\)\{3,\}$
jQuery has a plugin for US phone validation. You can also see the regular expression in its source code.