This question already has answers here:
How to determine if a number is a prime with regex?
(4 answers)
Closed 3 years ago.
Given an input string with an arbitrary number of 'x' characters (x, xx, xxxxx, xxxxxxxxxxxxx and so on), how can one write a regex that matches the input string only if it has a prime number of 'x' characters? A string of length 1 should not be matched.
For example:
Match these:
xx
xxx
xxxxx
xxxxxxx
But not these:
x
xxxx
xxxxxxxxx
This is one solution I found - ^(?!(xx+)\1+$) (here, as an answer to this problem). However, I would like to know why it works. Please share any alternate solutions as well.
I'm using the PCRE engine.
I realize that one would typically not use regexes for this sort of thing. I am simply curious about how it could be done.
^(?!(xx+)\1+$)
works by performing a negative lookahead at the start of the line. It will reject any line that consists of
a number of 2 or more xes
followed by the same number of xes, possibly multiple times
In other words, the regex works by matching any number of xes that cannot be divided into smaller, equally sized groups of size >= 2.
To exclude the case of just a single x, you could use ^(?!(xx+)\1+$|x$).
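A quick way to convince yourself is to run the pattern against strings of increasing length. The following is a minimal sketch using Python's re module rather than PCRE (both support the lookahead and backreference used here); is_prime is a helper written only for this comparison. The loop starts at length 1 because the empty string also passes the lookahead.

```python
import re

# Pattern from the answer above; the "|x$" branch rejects a lone "x".
PRIME_LENGTH = re.compile(r'^(?!(xx+)\1+$|x$)')

def is_prime(n):
    """Reference check by trial division (helper for this example only)."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# Lengths 1..30 whose string of x's passes the negative lookahead.
matched_lengths = [n for n in range(1, 31) if PRIME_LENGTH.match('x' * n)]
expected_lengths = [n for n in range(1, 31) if is_prime(n)]

print(matched_lengths)
print(matched_lengths == expected_lengths)
```

The backtracking engine effectively tries every group size of 2 or more as a candidate divisor, so this gets slow for long inputs.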
I don't think regex is the right tool for this. Why do you need to do this?
If you can't make any assumptions about the length of the strings, you need to check somehow whether the length is a prime number (which is computationally expensive).
If you know the maximum length, you could precalculate the prime numbers and then check the length against them, but it would still be needlessly complex to do this using regex.
So the only way I know of to do this is to use ^(x{2}|x{3}|x{5})$, which, as you can tell, will quickly become cumbersome.
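For comparison, here is a rough sketch of the precalculation idea without any regex. MAX_LENGTH, primes_up_to and has_prime_length are names made up for this example, and the limit of 100 is an arbitrary assumption.

```python
MAX_LENGTH = 100  # assumed upper bound on the input length (hypothetical)

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns the set of primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            for multiple in range(n * n, limit + 1, n):
                sieve[multiple] = False
    return {n for n, is_prime in enumerate(sieve) if is_prime}

PRIME_LENGTHS = primes_up_to(MAX_LENGTH)

def has_prime_length(text):
    """True if text consists only of 'x' and its length is a precomputed prime."""
    return set(text) == {'x'} and len(text) in PRIME_LENGTHS

print(has_prime_length('xxxxx'))  # length 5
print(has_prime_length('xxxx'))   # length 4
```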
This question already has an answer here:
Restricting character length in a regular expression
(1 answer)
Closed 3 years ago.
I need help with a regex that should match a fixed length pattern.
For example, the following regex allows for at most 1 ( and 1 ) in the matched pattern:
([^)(]*\(?[^)(]*\)?[^)(]*)
However, I cannot (and do not want to) use this solution because of the *: the text I have to scan through is very large, and using it seems to really affect performance.
I thus want to impose a match length limit, e.g. using {10,100} for example.
In other words, the regex should only match if
there is between 0 and 1 set of parentheses inside the string
the total length of the match is bounded, i.e. not infinite (no *!)
This seems to be a solution to my problem; however, I cannot get it to work and I have trouble understanding it.
I tried to use the accepted answer and created this:
^(?=[^()]{5,10}$)[^()]*(?:[()][^()]*){0,2}$
which does not seem to really work as expected: https://regex101.com/r/XUiJZz/1
Also, please do not mark this question as a duplicate of another question if the answers there make use of the Kleene star operator; that won't help me.
Edit:
I know this is a possible solution, but I'm wondering if there is a better way to do it:
([^)(]{0,100}\(?[^)(]{0,100}\)?[^)(]{0,100})
I thus want to impose a match length limit, e.g. using {10,100}
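You may want to add anchors and a lookahead assertion to your regex:
^(?=.{10,100}$)[^)(]*(?:\(?[^)(]*\))?[^)(]*$
(?=.{10,100}$) is a lookahead condition that asserts that the length of the string is between 10 and 100. The $ inside the lookahead is what enforces the upper bound; without it, the lookahead only checks that at least 10 characters are present.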
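To see how the length lookahead behaves, here is a small sketch in Python's re module. It compares a lookahead without $ against one with $ inside it; the constant names and sample strings are made up for this example.

```python
import re

# Shared body: at most one parenthesised part, anchored at the end.
BODY = r'[^)(]*(?:\(?[^)(]*\))?[^)(]*$'

# Lookahead without "$": only guarantees a minimum of 10 characters.
MIN_ONLY = re.compile(r'^(?=.{10,100})' + BODY)
# Lookahead with "$": the whole string must be 10 to 100 characters long.
BOUNDED = re.compile(r'^(?=.{10,100}$)' + BODY)

too_long = 'a' * 101

print(bool(BOUNDED.match('abc (def) ghi')))      # one pair, 13 chars
print(bool(BOUNDED.match('short')))              # under 10 chars
print(bool(BOUNDED.match(too_long)))             # over 100 chars
print(bool(MIN_ONLY.match(too_long)))            # upper bound not enforced
print(bool(BOUNDED.match('a(b)c(d)e padding')))  # two pairs
```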
Hey I'm supposed to develop a regular expression for a binary string that has no consecutive 0s and no consecutive 1s. However this question is proving quite tricky. I'm not quite sure how to approach it as is.
If anyone could help that'd be great! This is new to me.
You're basically looking for alternating digits, the string:
...01010101010101...
but one that doesn't go infinitely in either direction.
That would be an optional 0 followed by any number of 10 sets followed by an optional 1:
^0?(10)*1?$
The (10)* (group) gives you as many of the alternating digits as you need and the optional edge characters allow you to start/stop with a half-group.
Keep in mind that also allows an empty string which may not be what you want, though you could argue that's still a binary string with no consecutive identical digits. If you need it to have a length of at least one, you can do that with a more complicated "or" regex like:
^((0(10)*1?)|(1(01)*0?))$
which makes the first digit (either 1 or 0) non-optional and adjusts the following sequences accordingly for the two cases.
But a simpler solution may be better if it's allowed - just ensure it has a length greater than zero before doing the regex check.
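If you want to check the patterns, a brute-force comparison over all short binary strings is easy to write. This is only a sketch in Python; has_no_repeats is a helper made up for the comparison. The non-empty variant wraps both alternatives in one group so that ^ and $ apply to each of them.

```python
import re
from itertools import product

ALTERNATING = re.compile(r'^0?(10)*1?$')                       # also accepts ''
ALTERNATING_NONEMPTY = re.compile(r'^(?:0(10)*1?|1(01)*0?)$')  # at least 1 digit

def has_no_repeats(bits):
    """Reference check: no two identical digits next to each other."""
    return '00' not in bits and '11' not in bits

# Every binary string of length 0..8.
all_strings = [''.join(p) for n in range(9) for p in product('01', repeat=n)]

mismatches = [s for s in all_strings
              if bool(ALTERNATING.match(s)) != has_no_repeats(s)]
mismatches_nonempty = [s for s in all_strings if s
                       and bool(ALTERNATING_NONEMPTY.match(s)) != has_no_repeats(s)]

print(mismatches, mismatches_nonempty)
```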
I'm trying to figure out a regex for this rule:
"Must contain minimum 3 and maximum 5
numeric chars. The same character can
be repeated max. 5 times! Also, the length should be minimum 10 chars."
Do you have any idea?
I started with this:
^\d{3,5}$
but this restricts the match to 3 to 5 digits one after another, and what I need is the possibility of having them interspersed with letters as well (min. 3 and max. 5 occurrences).
Can you give a helping hand please?
It's possible in regex, but the need for a backreference is going to make it very slow.
^(?=(?:\D*\d){3,5}\D*$)(?!.*(.)(?:.*\1){4}).{10,}
Description:
(?=(?:\D*\d){3,5}\D*$): Ensure there are 3 to 5 numerals
(?!.*(.)(?:.*\1){4}): Ensure no character appears 5 or more times (use {5} instead of {4} if five occurrences should still be allowed)
.{10,}: Ensure the matched string's length is at least 10.
An easier way is to use a Dictionary<char, int> and tally the characters.
All of these conditions (except length) are extremely unsuited to regular expressions. Unsuited as in: it will take an exponential-size expression. Use normal programming methods instead to count letters, numbers and repetition. Unless this is homework for Regular Expressions 500, there is no point whatsoever in using a regex.
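The tallying approach could look roughly like this in Python, with collections.Counter standing in for the Dictionary<char, int>. The function name is_valid is made up, and the default max_repeats=5 assumes that five occurrences of the same character are still allowed; adjust it if the rule is meant differently.

```python
from collections import Counter

def is_valid(text, max_repeats=5):
    """Check length, digit count and per-character repetition without regex."""
    digit_count = sum('0' <= ch <= '9' for ch in text)
    # Highest number of occurrences of any single character (0 for '').
    most_repeated = max(Counter(text).values(), default=0)
    return (len(text) >= 10
            and 3 <= digit_count <= 5
            and most_repeated <= max_repeats)

print(is_valid('abc1def2gh3'))   # 11 chars, 3 digits
print(is_valid('a1b2c3'))        # too short
print(is_valid('aaaaaa123bcd'))  # 'a' appears 6 times
```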
I have the following regex that almost does the job but does not exclude zero. How can I do that?
^(\d|\d{1,9}|1\d{1,9}|20\d{8}|213\d{7}|2146\d{6}|21473\d{5}|214747\d{4}|2147482\d{3}|21474835\d{2}|214748364[0-7])$
Also can anybody explain a bit how this works?
Regular expressions are not the right tool for this job. A much better solution is to extract the integer from your string (you can use a regex for this, just \d+), then convert that to an integer, then check the integer against your desired range.
An important corollary is to never blindly use a regular expression (or any code, really) that you don't understand yourself. What would you do if you used the regular expression above, then a requirement came in to modify the acceptable range?
As Greg said, regexes are not the right tool for the job here. But if you insist on knowing how the regex you pasted works:
The most important thing to remember is that 2**31 - 1 = 2147483647 (a number with 10 digits). In essence, the regex says:
The number can have 1-9 digits, OR
It can be 1 with any 9 digits after it, OR
20 with any 8 digits after it, OR
213 with any 7 digits after it, OR
... I'm sure you see where it's going
It restricts the numbers to at most 2147483647.
P.S. given such a number as a string s, in Python, you can just pose this condition:
1 <= int(s) <= 2**31 - 1
In addition to the other answers, your regex doesn't work (besides allowing 0): it incorrectly excludes numbers like 2100000000, 2147483639, and most of the numbers between those two. The solution is to replace most of the nnnn prefixes with nnn[0-n] (along with other fixes), but the real solution is to not use regular expressions.
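The points above are easy to check by running the pasted regex next to a plain integer comparison. A small sketch in Python follows; in_range and INT32_MAX are names made up for this example, and the sample values are the ones mentioned in the answers plus the two boundary cases.

```python
import re

# Pattern pasted in the question, unchanged.
PASTED = re.compile(
    r'^(\d|\d{1,9}|1\d{1,9}|20\d{8}|213\d{7}|2146\d{6}|21473\d{5}'
    r'|214747\d{4}|2147482\d{3}|21474835\d{2}|214748364[0-7])$'
)

INT32_MAX = 2**31 - 1

def in_range(s):
    """True if s is a run of ASCII digits whose value is in 1..INT32_MAX."""
    return re.fullmatch(r'[0-9]+', s) is not None and 1 <= int(s) <= INT32_MAX

samples = ['0', '1', '2100000000', '2147483639', '2147483647', '2147483648']
for s in samples:
    # Columns: input, accepted by the pasted regex, accepted by the range check.
    print(s, bool(PASTED.match(s)), in_range(s))
```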
I'm new to StackOverflow, so please let me know if there is a better way to ask the following question.
I need to create a regular expression that detects whether a field in the database is numeric and, if so, whether it falls within a valid range (e.g. 1-50). I've tried [1-50], which works except for the instances where a single-digit number is preceded by a 0 (e.g. 06). 06 should still be considered a valid number, since I can later convert that to a number.
I really appreciate your help! I'm trying to learn more about regular expressions, and have been learning all I can from: www.regular-expressions.info. If you guys have recommendations of other sites to bone up on this stuff I would appreciate it!
Try this
^((0?[1-9])|([1-4][0-9])|(50))$
The idea of this regex is to break the problem down into cases
0?[1-9] takes care of the single-digit case, allowing for an optional preceding 0
[1-4][0-9] takes care of all numbers from 10 to 49
50 takes care of 50
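As a sanity check, the three cases can be tested against every number from 0 to 99. This is a sketch in Python; the alternatives are wrapped in a single group so that ^ and $ apply to all three of them, and the constant name is made up.

```python
import re

# One alternative per case: 1-9 (optionally 01-09), 10-49, and 50.
RANGE_1_TO_50 = re.compile(r'^(?:0?[1-9]|[1-4][0-9]|50)$')

accepted = [n for n in range(100) if RANGE_1_TO_50.match(str(n))]
zero_padded = [n for n in range(1, 10) if RANGE_1_TO_50.match('0%d' % n)]

print(accepted == list(range(1, 51)))
print(zero_padded)
```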
Regular expressions work on characters (in this case digits), not numbers. You need to have a separate pattern for each number of digits in your pattern, and combine them with | (the OR operator) like the other answers have suggested. However, consider just checking if the text is numeric with a regular expression (like [0-9]+) and then converting to an integer and checking the integer is within range.
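To illustrate the "characters, not numbers" point: [1-50] is a character class containing the range 1-5 plus the single character 0, so it matches exactly one digit. A tiny Python sketch (variable names are made up):

```python
import re

CHAR_CLASS = re.compile(r'[1-50]')

# Which single digits does the class accept?
accepted_digits = [d for d in '0123456789' if CHAR_CLASS.fullmatch(d)]
print(accepted_digits)

# A two-character string is never a full match for a single character class.
print(CHAR_CLASS.fullmatch('25'))
```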
You can't easily do range checking with regular expressions. You can -- with some work -- develop a pattern that recognizes a numeric range, but it's usually quite complex, and difficult to modify for a slightly different range.
You're better off breaking this into two parts.
Recognize the number pattern (^\d+$).
Check the range of that number in an application program.
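The two steps might look something like this in application code. It is a sketch in Python; in_valid_range and its default bounds are made up for this example, and re.ASCII is used so that \d only accepts the digits 0-9.

```python
import re

def in_valid_range(field, low=1, high=50):
    """Step 1: the field must be all digits. Step 2: check the numeric range."""
    if re.fullmatch(r'\d+', field, flags=re.ASCII) is None:
        return False
    return low <= int(field) <= high

for field in ['7', '06', '50', '51', '0', 'abc']:
    print(field, in_valid_range(field))
```

Changing the range later only means changing two numbers, not the pattern.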
^0?[1-50]{1,2}$