RegEx Numeric Check in Range? - regex

I'm new to StackOverflow, so please let me know if there is a better way to ask the following question.
I need to create a regular expression that detects whether a field in the database is numeric and, if it is numeric, whether it falls within a valid range (e.g. 1-50). I've tried [1-50], which works except for the instances where a single-digit number is preceded by a 0 (e.g. 06). 06 should still be considered a valid number, since I can later convert that to a number.
I really appreciate your help! I'm trying to learn more about regular expressions, and have been learning all I can from: www.regular-expressions.info. If you guys have recommendations of other sites to bone up on this stuff I would appreciate it!

Try this:
^(0?[1-9]|[1-4][0-9]|50)$
The idea of this regex is to break the problem down into cases:
0?[1-9] takes care of the single-digit case, allowing for an optional preceding 0
[1-4][0-9] takes care of all numbers from 10 to 49
50 takes care of 50
The alternatives sit inside a single group so that ^ and $ apply to every case, not just the first and the last.
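A quick way to check a pattern like this is to run it over every value it should accept plus a few it should reject. The sketch below is only an illustration in Python; the names RANGE_1_TO_50 and in_range are made up for the example, and the pattern is written with one group around all the cases.

```python
import re

# One group around the alternatives, so every case must span the whole string.
RANGE_1_TO_50 = re.compile(r"^(0?[1-9]|[1-4][0-9]|50)$")

def in_range(text: str) -> bool:
    """True if text is 1..50, with an optional leading zero on single digits."""
    return RANGE_1_TO_50.fullmatch(text) is not None

candidates = ["06", "6", "10", "49", "50", "0", "00", "51", "150", "5a"]
accepted = [s for s in candidates if in_range(s)]
print(accepted)
```

Only 06, 6, 10, 49 and 50 from the candidate list should come back.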

Regular expressions work on characters (in this case digits), not numbers. You need to have a separate pattern for each number of digits in your pattern, and combine them with | (the OR operator) like the other answers have suggested. However, consider just checking if the text is numeric with a regular expression (like [0-9]+) and then converting to an integer and checking the integer is within range.

You can't easily do range checking with regular expressions. You can -- with some work -- develop a pattern that recognizes a numeric range, but it's usually quite complex, and difficult to modify for a slightly different range.
You're better off breaking this into two parts.
Recognize the number pattern (^\d+$).
Check the range of that number in an application program.
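The two parts could look roughly like this in Python. This is a minimal sketch, not a drop-in solution: is_valid and its default bounds are hypothetical names chosen for the example.

```python
import re

def is_valid(text: str, low: int = 1, high: int = 50) -> bool:
    # Part 1: the regex only decides whether the field consists of digits.
    if re.fullmatch(r"[0-9]+", text) is None:
        return False
    # Part 2: a plain integer comparison checks the range; int() ignores leading zeros.
    return low <= int(text) <= high

for field in ["06", "50", "51", "abc", ""]:
    print(repr(field), is_valid(field))
```

Changing the range then means changing two numbers rather than rewriting a pattern.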

^0?[1-50]{1,2}$

Related

Regex to check if input (alphanumeric) contains a number smaller than X

This will probably be easy for regex magicians, however I can't seem to figure out a way with my limited knowledge.
I need a regex that would check if an alphanumeric string contains a number smaller than a number (16539065 in my case).
For example the following should be matched:
alpha16000000beta
foo300bar
And the following should not be matched:
foo16539066bar
Help please.
EDIT: I'm aware that it's inefficient, however I'm doing it in a cPanel Account Level filter, which only accepts regex. Unless I figure out a way for it to trigger a script instead, this would definitely need to be done with regex. :(
Your best option for this kind of operation is to use a capture group to get the number and then use whatever language you are using to do the comparison. If you absolutely have to use a regex to do this, it will be extremely inefficient. To do so, you will need to combine a lot of similar expressions:
\d{1,7} will find any number with 1 to 7 digits, which will always be less than 16539065
1653906[0-4] will catch the largest accepted values (16539060 to 16539064)
165390[0-5]\d will catch the next range of acceptable values
1653[0-8]\d{3} will continue down the acceptable range
Repeat the above for each remaining digit until you reach 1[0-5]\d{6}
Once you have all of those expressions, they can be combined using the 'or' operator (|). Each alternative also has to be bounded so that it cannot match only part of a longer run of digits; otherwise \d{1,7} would match inside 16539066. Keep in mind that using regular expressions in this manner is considered bad practice and creates hard-to-read code.
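Writing the alternatives by hand is error-prone, so one option is to generate them. The following Python sketch builds the alternation for any positive limit; less_than_pattern is a hypothetical helper, it assumes numbers are not padded with extra leading zeros beyond the limit's length, and it uses lookarounds as the digit boundary, which the target regex engine may or may not support.

```python
import re

def less_than_pattern(limit: int) -> str:
    """Build a regex matching a run of digits whose value is below limit (limit > 0)."""
    digits = str(limit)
    n = len(digits)
    branches = []
    if n > 1:
        branches.append(r"\d{1,%d}" % (n - 1))  # any shorter number is smaller
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 0:
            continue  # no smaller digit exists at this position
        smaller = "0" if d == 1 else "[0-%d]" % (d - 1)
        rest = n - i - 1
        tail = r"\d{%d}" % rest if rest else ""
        # Same prefix as the limit, one smaller digit, then anything.
        branches.append(digits[:i] + smaller + tail)
    # The lookarounds stop a branch from matching part of a longer run of digits.
    return r"(?<!\d)(?:" + "|".join(branches) + r")(?!\d)"

SMALLER = re.compile(less_than_pattern(16539065))
for text in ["alpha16000000beta", "foo300bar", "foo16539066bar"]:
    print(text, SMALLER.search(text) is not None)
```

Because search is used, a string with several numbers matches as soon as any one of them is below the limit.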
Bad Karma might kill me, but here is a working solution for your cases (letters then numbers then letters). It will not work for e.g. ab12cd34de.
There is not really a way to shorten this, just the long way. I'm using a negative lookahead to check that the number is not greater than or equal to 16539065.
^\D*(?!0*(?:\d{9}|[2-9]\d{7}|1[7-9]\d{6}|16[6-9]\d{5}|165[4-9]\d{4}|16539[1-9]\d{2}|165390[7-9]\d|1653906[5-9]))\d+\D*$
It checks for the general format ^\D*\d+\D*$ and then breaks 16539065 down into its parts.
Here's a little demo to play around: https://regex101.com/r/aV6yQ9/1
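If a scripting language is available for testing, the lookahead approach can be cross-checked against a plain integer comparison. The Python sketch below uses a variant of the pattern in which the 8-digit branch is written as [2-9]\d{7}, so that every 8-digit number starting with 2 or above is rejected; that variant, and the names LOOKAHEAD and smaller_by_comparison, are assumptions made for this example.

```python
import re

LIMIT = 16539065

# Variant of the lookahead pattern; [2-9]\d{7} covers 8-digit numbers starting above 1.
LOOKAHEAD = re.compile(
    r"^\D*(?!0*(?:\d{9}|[2-9]\d{7}|1[7-9]\d{6}|16[6-9]\d{5}|165[4-9]\d{4}"
    r"|16539[1-9]\d{2}|165390[7-9]\d|1653906[5-9]))\d+\D*$"
)

def smaller_by_comparison(text: str) -> bool:
    """Reference check: capture the single run of digits and compare it as an integer."""
    m = re.fullmatch(r"\D*(\d+)\D*", text)
    return m is not None and int(m.group(1)) < LIMIT

for text in ["alpha16000000beta", "foo300bar", "foo16539066bar", "foo30000000bar"]:
    print(text, LOOKAHEAD.match(text) is not None, smaller_by_comparison(text))
```

The two columns printed for each sample should agree.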

how do i write the regex for a limited range repetition

I am creating a form in which users can enter multiple values, each within a limited range.
I am having trouble repeating the range, as shown below. Do I have to validate the comma as well, and can I have the full regular expression for this?
Users may enter up to 64 values, each in the range 0-1000.
The input can look as follows:
1000,0,100,123,10,23,56,654,981
And here's my current regular expression for the range:
(^(?:[0-9]|[1-9][0-9]|[1-9][0-9]{2}|1000)$)
Short answer:
^((1000|\d{1,3})(,|$)){1,64}$
(Assuming you don't mind leading zeros. If you do, then change \d{1,3} to the more complex ([1-9][0-9]|[1-9])?[0-9].)
Long version:
We want to match (numbers in the range 0-1000) (repeated 1-64 times).
The first part can be done with (1000|\d{1,3}) (with the caveat noted above about leading zeros).
For the second part, we use a handy trick to do the comma-separation aspect: we say that each number must either be followed by a comma or the end of the string.
Note that there is a small weakness to this approach: it accepts a trailing comma, e.g. 1,2,3, matches. If that's not acceptable, you can require a comma after every number except the last, at the cost of a longer pattern:
^((1000|\d{1,3}),){0,63}(1000|\d{1,3})$
Note that I use an explicit {0,63}. Some regex flavors (Python's re, for example) also accept the short form {,63}, but others, such as JavaScript, treat it as literal text, so the explicit form is the portable one.
Also note that regex might not be the best solution for this - it might be better to just split the input string on commas and then iterate through the pieces, validating that each one is a number from 0-1000 and that there are 64 or fewer pieces.
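The split-and-iterate alternative might look like this in Python. It is a minimal sketch; valid_list and its parameter names are invented for the example, and it accepts leading zeros (0007 counts as 7).

```python
def valid_list(text: str, low: int = 0, high: int = 1000, max_items: int = 64) -> bool:
    parts = text.split(",")
    if not 1 <= len(parts) <= max_items:
        return False
    for part in parts:
        # isascii() and isdigit() together reject empty pieces, signs, spaces and non-ASCII digits.
        if not (part.isascii() and part.isdigit()):
            return False
        if not low <= int(part) <= high:
            return False
    return True

print(valid_list("1000,0,100,123,10,23,56,654,981"))
print(valid_list("1,2,3,"))
print(valid_list("1001"))
```

A trailing comma produces an empty last piece, so it is rejected without any extra pattern work.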

Regular Expression (consecutive 1s and 0s)

Hey I'm supposed to develop a regular expression for a binary string that has no consecutive 0s and no consecutive 1s. However this question is proving quite tricky. I'm not quite sure how to approach it as is.
If anyone could help that'd be great! This is new to me.
You're basically looking for alternating digits, the string:
...01010101010101...
but one that doesn't go infinitely in either direction.
That would be an optional 0 followed by any number of 10 sets followed by an optional 1:
^0?(10)*1?$
The (10)* (group) gives you as many of the alternating digits as you need and the optional edge characters allow you to start/stop with a half-group.
Keep in mind that also allows an empty string which may not be what you want, though you could argue that's still a binary string with no consecutive identical digits. If you need it to have a length of at least one, you can do that with a more complicated "or" regex like:
^(0(10)*1?|1(01)*0?)$
which makes the first digit (either 1 or 0) non-optional and adjusts the following sequences accordingly for the two cases. Both alternatives sit inside one group so that ^ and $ apply to each of them.
But a simpler solution may be better if it's allowed - just ensure it has a length greater than zero before doing the regex check.
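Patterns over a two-letter alphabet are easy to check exhaustively. The Python sketch below compares both ideas against a direct test for 00 or 11 on every binary string up to a given length; it uses fullmatch so the anchors are implicit, and the names ALTERNATING, ALTERNATING_NONEMPTY and disagreements are made up for the example.

```python
import re
from itertools import product

ALTERNATING = re.compile(r"0?(10)*1?")                  # also accepts the empty string
ALTERNATING_NONEMPTY = re.compile(r"0(10)*1?|1(01)*0?")  # requires at least one digit

def has_no_repeats(bits: str) -> bool:
    """Direct definition: no two identical digits next to each other."""
    return "00" not in bits and "11" not in bits

def disagreements(max_len: int) -> list:
    """Binary strings on which either regex disagrees with the direct definition."""
    bad = []
    for length in range(max_len + 1):
        for combo in product("01", repeat=length):
            bits = "".join(combo)
            expected = has_no_repeats(bits)
            if (ALTERNATING.fullmatch(bits) is not None) != expected:
                bad.append(bits)
            if (ALTERNATING_NONEMPTY.fullmatch(bits) is not None) != (expected and bits != ""):
                bad.append(bits)
    return bad

print(disagreements(10))
```

An empty list means the patterns and the direct definition agree on every string tried.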

regex - At most two pair of consecutives

I'm taking a computation course which also teaches about regular expressions. There is a difficult question that I cannot answer.
Find a regular expression for the language that accepts words that contain at most two pairs of consecutive 0's. The alphabet consists of 0 and 1.
First, I made an NFA of the language, but I cannot convert it to a GNFA (which could later be converted to a regex). How can I find this regular expression, with or without converting it to a GNFA?
(Since this is a homework problem, I'm assuming that you just want enough help to get started, and not a full worked solution?)
Your mileage may vary, but I don't really recommend trying to convert an NFA into a regular expression. The two are theoretically equivalent, and either can be converted into the other algorithmically, but in my opinion, it's not the most intuitive way to construct either one.
Instead, one approach is to start by enumerating various possibilities:
No pairs of consecutive zeroes at all; that is, every zero, except at the end of the string, must be followed by a one. So, the string consists of a mixed sequence of 1 and 01, optionally followed by 0:
(1|01)*(0|ε)
Exactly one pair of consecutive zeroes, at the end of the string. This is very similar to the previous:
(1|01)*00
Exactly one pair of consecutive zeroes, not at the end of the string — and, therefore, necessarily followed by a one. This is also very similar to the first one:
(1|01)*001(1|01)*(0|ε)
To continue that approach, you would then extend the above to support two pairs of consecutive zeroes; and lastly, you would merge all of these into a single regular expression.
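Before extending the cases, it can help to confirm that the three above cover exactly the words with at most one pair. The Python sketch below does that by brute force; it writes (0|ε) as 0?, and it assumes that pairs may overlap, so that 000 counts as two pairs. The names count_pairs and mismatches are hypothetical.

```python
import re
from itertools import product

NO_PAIR = r"(1|01)*0?"
ONE_PAIR_AT_END = r"(1|01)*00"
ONE_PAIR_INSIDE = r"(1|01)*001(1|01)*0?"
AT_MOST_ONE = re.compile("|".join([NO_PAIR, ONE_PAIR_AT_END, ONE_PAIR_INSIDE]))

def count_pairs(word: str) -> int:
    """Count positions where two 0s are adjacent (assumption: 000 counts as two pairs)."""
    return sum(1 for i in range(len(word) - 1) if word[i:i + 2] == "00")

def mismatches(max_len: int) -> list:
    """Words on which the combined regex disagrees with the pair counter."""
    bad = []
    for length in range(max_len + 1):
        for combo in product("01", repeat=length):
            word = "".join(combo)
            if (AT_MOST_ONE.fullmatch(word) is not None) != (count_pairs(word) <= 1):
                bad.append(word)
    return bad

print(mismatches(10))
```

The same harness can be reused for the two-pair cases by adding them to the alternation and changing the comparison to count_pairs(word) <= 2.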
(0+1)*00(0+1)*00(0+1)* + (0+1)*000(0+1)*
contains at most two pair of consecutive 0's
(1|01)*(00|ε)(1|10)*(00|ε)(1|10)*

Random string that matches a regexp [duplicate]

How would you go about creating a random alpha-numeric string that matches a certain regular expression?
This is specifically for creating initial passwords that fulfill regular password requirements.
Welp, just musing, but the general question of generating random inputs that match a regex sounds doable to me for a sufficiently relaxed definition of random and a sufficiently tight definition of regex. I'm thinking of the classical formal definition, which allows only ()|* and alphabet characters.
Regular expressions can be mapped to formal machines called finite automata. Such a machine is a directed graph with a particular node called the final state, a node called the initial state, and a letter from the alphabet on each edge. A word is accepted by the regex if it's possible to start at the initial state and traverse one edge labeled with each character through the graph and end at the final state.
One could build the graph, then start at the final state and traverse random edges backwards, keeping track of the path. In a standard construction, every node in the graph is reachable from the initial state, so you do not need to worry about making irrecoverable mistakes and needing to backtrack. If you reach the initial state, stop, and read off the path going forward. That's your match for the regex.
There's no particular guarantee about when or if you'll reach the initial state, though. One would have to figure out in what sense the generated strings are 'random', and in what sense you are hoping for a random element from the language in the first place.
Maybe that's a starting point for thinking about the problem, though!
Now that I've written that out, it seems to me that it might be simpler to repeatedly resolve choices to simplify the regex pattern until you're left with a simple string. Find the first non-alphabet character in the pattern. If it's a *, replicate the preceding item some number of times and remove the *. If it's a |, choose which of the OR'd items to preserve and remove the rest. For a left paren, do the same, but looking at the character following the matching right paren. This is probably easier if you parse the regex into a tree representation first that makes the paren grouping structure easier to work with.
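The tree idea can be sketched in a few dozen lines of Python for the classical syntax only (literals, parentheses, | and *). Everything here is an illustration under that assumption: parse, generate and max_repeat are hypothetical names, and the repetition count for * is simply capped rather than drawn from any principled distribution.

```python
import random

def parse(pattern: str):
    """Parse a pattern built only from literals, ( ), | and * into a small tree."""
    pos = 0

    def parse_alt():
        nonlocal pos
        options = [parse_cat()]
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            options.append(parse_cat())
        return ("alt", options)

    def parse_cat():
        items = []
        while pos < len(pattern) and pattern[pos] not in "|)":
            items.append(parse_star())
        return ("cat", items)

    def parse_star():
        nonlocal pos
        if pattern[pos] == "*":
            raise ValueError("nothing to repeat at position %d" % pos)
        if pattern[pos] == "(":
            pos += 1
            node = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
        else:
            node = ("lit", pattern[pos])
            pos += 1
        while pos < len(pattern) and pattern[pos] == "*":
            node = ("star", node)
            pos += 1
        return node

    tree = parse_alt()
    if pos != len(pattern):
        raise ValueError("unexpected character at position %d" % pos)
    return tree

def generate(node, rng, max_repeat=3):
    """Resolve every choice in the tree at random and return the resulting string."""
    kind, value = node
    if kind == "lit":
        return value
    if kind == "cat":
        return "".join(generate(item, rng, max_repeat) for item in value)
    if kind == "alt":
        return generate(rng.choice(value), rng, max_repeat)
    # star: repeat the inner node between 0 and max_repeat times
    return "".join(generate(value, rng, max_repeat) for _ in range(rng.randint(0, max_repeat)))

rng = random.Random()
tree = parse("(a|bc)*d(e|f)")
samples = [generate(tree, rng) for _ in range(5)]
print(samples)
```

Every string produced this way matches the original pattern by construction, but the distribution over matches depends entirely on how the choices are weighted.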
To the person who worried that deciding if a regex actually matches anything is equivalent to the halting problem: Nope, regular languages are quite well behaved. You can tell if any two regexes describe the same set of accepted strings. You basically make the machine above, then follow an algorithm to produce a canonical minimal equivalent machine. Do that for two regexes, then check if the resulting minimal machines are equivalent, which is straightforward.
String::Random in Perl will generate a random string from a subset of regular expressions:
#!/usr/bin/perl
use strict;
use warnings;
use String::Random qw/random_regex/;
print random_regex('[A-Za-z]{3}[0-9][A-Z]{2}[!@#$%^&*]'), "\n";
If you have a specific problem, you probably have a specific regular expression in mind. I would take that regular expression, work out what it means in simple human terms, and work from there.
I suspect it's possible to create a general regex random match generator, but it's likely to be much more work than just handling a specific case - even if that case changes a few times a year.
(Actually, it may not be possible to generate random matches in the most general sense - I have a vague memory that the problem of "does any string match this regex" is the halting problem in disguise. With a very cut-down regex language you may have more luck though.)
I have written Parsley, which consists of a Lexer and a Generator.
The Lexer converts a regular expression-like string into a sequence of tokens.
The Generator uses these tokens to produce a defined number of codes.
$generator = new \Gajus\Parsley\Generator();
/**
* Generate a set of random codes based on Parsley pattern.
* Codes are guaranteed to be unique within the set.
*
* @param string $pattern Parsley pattern.
* @param int $amount Number of codes to generate.
* @param int $safeguard Number of additional codes generated in case there are duplicates that need to be replaced.
* @return array
*/
$codes = $generator->generateFromPattern('FOO[A-Z]{10}[0-9]{2}', 100);
The above example will generate an array containing 100 codes, each prefixed with "FOO", followed by 10 characters from "ABCDEFGHKMNOPRSTUVWXYZ23456789" haystack and 2 numbers from "0123456789" haystack.
This PHP library looks promising: ReverseRegex
Like all of these, it only handles a subset of regular expressions but it can do fairly complex stuff like UK Postcodes:
([A-PR-UWYZ]([0-9]([0-9]|[A-HJKSTUW])?|[A-HK-Y][0-9]([0-9]|[ABEHMNPRVWXY])?) ?[0-9][ABD-HJLNP-UW-Z]{2}|GIR0AA)
Outputs
D43WF
B6 6SB
MP445FR
P9 7EX
N9 2DH
GQ28 4UL
NH1 2SL
KY2 9LS
TE4Y 0AP
You'd need to write a string generator that can parse regular expressions and generate random members of character ranges for random lengths, etc.
Much easier would be to write a random password generator with certain rules (starts with a lower case letter, has at least one punctuation, capital letter and number, at least 6 characters, etc) and then write your regex so that any passwords created with said rules are valid.
Presuming you have both a minimum length and 3-of-4* (or similar) requirement, I'd just be inclined to use a decent password generator.
I've built a couple in the past (both web-based and command-line), and have never had to skip more than one generated string to pass the 3-of-4 rule.
* 3-of-4: must have at least three of the following characteristics: lowercase, uppercase, number, symbol
It is possible (for example, Haskell regexp module has a test suite which automatically generates strings that ought to match certain regexes).
However, for a simple task at hand you might be better off taking a simple password generator and filtering its output with your regexp.
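A generate-and-filter loop of that kind could be sketched as follows in Python. The policy regex, the alphabet and the name random_password are all hypothetical stand-ins; substitute whatever expression your password rules actually use.

```python
import re
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*"
# Hypothetical policy: at least 8 characters with a lowercase letter, an uppercase letter and a digit.
POLICY = re.compile(r"(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{8,}")

def random_password(length: int = 12, max_tries: int = 1000) -> str:
    """Draw random candidates and return the first one the policy regex accepts."""
    for _ in range(max_tries):
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if POLICY.fullmatch(candidate):
            return candidate
    raise RuntimeError("no candidate satisfied the policy")

print(random_password())
```

With a reasonable alphabet most candidates pass on the first or second try, so the retry limit is only a safety net.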