Regex Match Roman Numerals from 0-39 Only - regex

I am trying to write a regex that will match Roman numerals from 0 to 39 only. There are plenty of examples which match much larger Roman numerals, but I cannot figure out how to match this specific subset.

Got it. Try this:
/^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$/
Update:
Zero doesn't exist in Roman numerals. Therefore feel free to tack on your own implementation for zero.

I'm not sure how to represent 0 using Roman numerals. I assume that it has separate token N (see Wikipedia).
Assuming the regex tries to match the whole string (like in Java) and you have lookahead, you can use this regex:
(?.)(X{0,3}(IX|IV|V?I{0,3})|N)
Explanation:
(?.): ensure at least one character
X{0,3}: define the tens (0, 10, 20, 30)
(...): define the final digit
IX: 9
IV: 4
V?I{0,3}: 0-3, 5-8 (0 not as whole number, require at least one X)
N: 0 (as whole number)
If you represent 0 as empty string, the regex is simpler:
X{0,3}(IX|IV|V?I{0,3})
since the lookahead and N in the previous regex is just to prevent empty string.

Assuming you know you have valid Roman numerals and want to fetch only the ones <= 39, that is easy:
^[XVI]*$
See it in action
If that is not the case, it's a little bit trickier, but you can still take advantage of the fact that all the numbers that can be represented only with X, V and I are 1..39:
^X{0,3}(?:V?I{0,3}|I[VX])$
See it in action
X{0,3} covers 10, 20, 30
X{0,3}V?I{0,3} covers all but the ones that end with 4 or 9 (14, 29, etc)
X{0,3}I[VX] exactly the ones ending with 4 or 9
Note: these will also match an empty string, which is my interpretation of a Roman zero. If that is not the case, you can replace the * with + for the first regex and add a positive lookahead at the start of the regex for the second ((?=.)).
Note 2: If they are not on separate lines (or in separate strings), you can replace ^ and $ with word boundaries (\b).

Related

Optimization of Regular Expression to match numbers bigger or equal to 50

I want to check if a number is 50 or more using a regular expression. This in itself is no problem but the number field has another regex checking the format of the entered number.
The number will be in the continental format: 123.456,78 (a dot between groups of three digits and always a comma with 2 digits at the end)
Examples:
100.000,00
50.000,00
50,00
34,34
etc.
I want to capture numbers which are 50 or more. So from the four examples above the first three should be matched.
I've come up with this rather complicated one and am wondering if there is an easier way to do this.
^(\d{1,3}[.]|[5-9][0-9]|\d{3}|[.]\d{1,3})*[,]\d{2}$
EDIT
I want to match continental numbers here. The numbers have this format due to internal regulations and specify a price.
Example: 1000 EUR would be written as 1.000,00 EUR
50000 as 50.000,00 and so on.
It's a matter of taste, obviously, but using a negative lookahead gives a simple solution.
^(?!([1-4]?\d),)[1-9](\d{1,2})?(\.\d{3})*,\d{2}\b
In words: starting from a boundary ignore all numbers that start with 1 digit OR 2 digits (the first being a 1,2,3 or 4), followed by a comma.
Check on regex101.com
Try:
EDIT ^(.{3,}|[5-9]\d),\d{2}$
It checks if:
there 3 chars or more before the ,
there are 2 numbers before the , and the first is between 5 and 9
and then a , and 2 numbers
Donno if it answer your question as it'll return true for:
aa50,00
1sdf,54
But this assumes that your original string is a number in the format you expect (as it was not a requirement in your question).
EDIT 3
The regex below tests if the number is valid referring to the continental format and if it's equal or greater than 50. See tests here.
Regex: ^((([1-9]\d{0,2}\.)(\d{3}\.){0,}\d{3})|([1-9]\d{2})|([5-9]\d)),\d{2}$
Explanation (d is a number):
([1-9]\d{0,2}\.): either d., dd. or ddd. one time with the first d between 1 and 9.
(\d{3}\.){0,}: ddd. zero or x time
\d{3}: ddd 3 digit
These 3 parts combined match any numbers equals or greater than 1000 like: 1.000, 22.002 or 100.000.000.
([1-9]\d{2}): any number between 100 and 999.
([5-9]\d)): a number between 5 and 9 followed by a number. Matches anything between 50 and 99.
So it's either the one of the parts above or this one.
Then ,\d{2}$ matches the comma and the two last digits.
I have named all inner groups, for better understanding what part of number is matched by each group. After you understand how it works, change all ?P<..> to ?:.
This one is for any dec number in the continental format.
^(?P<common_int>(?P<int>(?P<int_start>[1-9]\d{1,2}|[1-9]\d|[1-9])(?P<int_end>\.\d{3})*|0)(?!,)|(?P<dec_int_having_frac>(?P<dec_int>(?P<dec_int_start>[1-9]\d{1,2}|[1-9]\d|[1-9])(?P<dec_int_end>\.\d{3})*,)|0,|,)(?=\d))(?P<frac_from_comma>(?<=,)(?P<frac>(?P<frac_start>\d{3}\.)*(?P<frac_end>\d{1,3})))?$
test
This one is for the same with the limit number>=50
^(?P<common_int>(?P<int>(?P<int_start>[1-9]\d{1,2}|[1-9]\d|[1-9])(?P<int_end>\.\d{3})+|(?P<int_short>[1-9]\d{2}|[5-9]\d))(?!,)|(?P<dec_int_having_frac>(?P<dec_int>(?P<dec_int_start>[1-9]\d{1,2}|[1-9]\d|[1-9])(?P<dec_int_end>\.\d{3})+,)|(?P<dec_short_int>[1-9]\d{2}|[5-9]\d),)(?=\d))(?P<frac_from_comma>(?<=,)(?P<frac>(?P<frac_start>\d{3}\.)*(?P<frac_end>\d{1,3})))?$
tests
If you always have the integer part under 999.999 and fractal part always 2 digits, it will be a bit more simple:
^(?P<dec_int_having_frac>(?P<dec_int>(?P<dec_int_start>[1-9]\d{1,2}|[1-9]\d|[1-9])(?P<dec_int_end>\.\d{3})?,)|(?P<dec_short_int>[1-9]\d{2}|[5-9]\d),)(?=\d)(?P<frac_from_comma>(?<=,)(?P<frac>(?P<frac_end>\d{1,2})))?$
test
If you can guarantee that the number is correctly formed -- that is, that the regex isn't expected to detect that 5,0.1 is invalid, then there are a limited number of passing cases:
ends with \d{3}
ends with [5-9]\d
contains \d{3},
contains [5-9]\d,
It's not actually necessary to do anything with \.
The easiest regex is to code for each of these individually:
(\d{3}$|[5-9]\d$|\d{3},|[5-9]\d)
You could make it more compact and efficient by merging some of the cases:
(\d{3}[$,]|[5-9]\d[$,])
If you need to also validate the format, you will need extra complexity. I would advise against attempting to do both in a single regex.
However unless you have a very good reason for having to do this with a regex, I recommend against it. Parse the string into an integer, and compare it with 50.

Regex, allow characters and digits, but allow up to 7 digits only

I would very much appreciate a bit of help with the following regex riddle.
I need regex statement that would validate against the following rules:
The input can contain letters, special characters and digits.
The input can't start with "0",
The input Can have up to 7 digits
Examples of valid input:
aa1234aa2.(less than 7 digits)
asd234566 (less than 7 digits)
Examples of invalid input:
0asdfd92 (starts with 0)
asd12312311 (more than 7 digits)
What I have tried so far:
^\D[0-9]{0,7}$,
validates against d0000000, but the input may be d0d0dddd1234d
The part can't start with 0 can be removed from the requirement if it complicates a lot. The most important is to have "Can have up to 7 digits" part.
Regards,
Oleg
This is what you need!
Attempt 1: ^[1-9]\d{0,6}$
Attempt 2: ^[^0][\d\w]{0,6}$
Attempt 3: ^[^0].{0,6}$
Attempt 4: ^([\D]*\d){0,7}[\D]*$
Attempt 5: ^([\D]*[1-9]){0,7}[\D]*$|^[^0]\d{0,6}$
Attempt 6: ^([\D]*[1-9]){1,7}[\D]*$|^[^0]\d{1,6}$ <- this should work
Example here
If I understand the requirements correctly, this will work:
^(?=[^0])(\D*\d){0,7}\D*$
That will allow any string that does not start with a zero and has 7 or fewer digits. Any other characters are allowed in any quantity.
Explanation
The first part (?=[^0]) is an assertion that checks to make sure the string does not start with zero. The rest matches any number of non-digits followed by a digit, up to 7 times. Then any number of non-digits before the end of the string.
Assuming Perl (it looks like Perl regular expressions):
Check for leading zero: if (subst($pass, 0, 1) eq '0') { fail }
Check for no more than seven digits: if (($pass =~ tr /0-9/0-9/) > 7) { fail }
I'm generally against trying to cram everything into a single regular expression, especially when there are other tools available to do the job. In this case, the tr will not be executed if there is a leading zero, and a leading zero is easy to spot in the beginning of a string.
Doing it this way, it's easy to add further restrictions independently of the others. For example, "there may be more than 7 digits if they are all separated by other types of characters" (a regex for this one, probably).
You can use this regex:
^[^0](?:\D*\d){1,7}\D*$
RegEx Demo
This will perform following validations:
Must start with non-zero
Has 1 to 7 digits after first char
Verbose, but does the trick.
(^[1-9][^\d]*([\d]?[^\d]*){0,6}$|^[^\d]+([\d]?[^\d]*){0,7}$)
I found it easier to split the RegEx into two cases: when the string starts with a digit, and when it doesn't.
^((?:\D+(?:\d?\D*){0,7})|(?:[1-9]\D*(?:\d?\D*){0,6}))$
You can test it here

RegEx which accepts only two decimal places

Hi I am working on RegEx. Correct response should NOT allow for number to the tenths only, as in RESPONSE = "925.0", nor should it allow for trailing zeros after the hundredths place as in RESPONSE = "925.000". Only correct responses: 925, 0925, 0925., 925., 925.00, 00925
I worked on it and finally came up with this
"^-?(0)*(\d*(\.(00))?\d+.|(\d){1,3}(,(\d){3})*(\.(00))?)$"
It works for three digit numbers but if i want it for 38400.00 it doesn't allow it
I am not quite certain whether the decimal places can be any digit or if they have to be zero. If the former, then this should do the trick:
^-?\d{1,3}(,?\d{3})*(\.(\d{2})?)?$
If the latter, then this:
^-?\d{1,3}(,?\d{3})*(\.(00)?)?$
The entire match starting with the decimal point is optional, and the two decimal places in that match are optional as well.
UPDATE I just realized that it appears you need to accept commas in the response as well - I assume for thousands, millions, etc.
UPDATE #2 per OP's comment
^-?(\d+|\d{1,3}(,\d{3})*)(\.(00)?)?$
UPDATE #3 Added link to regex101 for explanation of this regular expression.
Have a try with:
^-?\d{1,3}(?:,?\d{3})*(?:\.(?:00)?)?$
I think your problem is that you're trying to match it in chunks of three, with commas separating, but 38400.00 doesn't have commas.
Try this:
^-?\d+(\.?(\d{2})?)$
The - indicates the character, -. With the ? after, it says that it may or may not apply. This allows negative numbers, so if you only want positive numbers matched, delete the first two characters.
\d represents every digit. The + after says that there can be as many as you want, as long as there's at least one.
Then there's a \., which is just a dot in the number. The ? does the same as before.. Since you seem to allow trailing periods, I assumed you wanted it to be considered separately from the following digits.
The () encloses the next group, which is the period (\.) followed by two characters that match \d -- two digits -- and which may be repeated 0 or 1 times, as dictated by the ?. This allows people to either have no digits after the period or two, but nothing else.
The ^ at the beginning specifies it has to be at the beginning of the line, and the $ at the end specifies it has to end at the end of the line. Remember to enable the multiline (m) flag so it works properly.
Disclaimer: I've not done much regex work before, so I could well be totally off. If it doesn't work, let me know.
Couldn't you do this without the ?'s
^[0-9,]+(\.){0,1}(\d{2}){0,1}$
improved: ^\d+[0-9,]*(\.){0,1}(\d{2}){0,1}$
Edit:
Broken down a bit as requested
Old one:
[0-9,]+
1 or more digits/commas (would have accepted ',' as true) so improved version:
\d+
for starts with 1 or more digits
[0-9,]*
0 or more digits/commas
followed by
(\.){0,1}
0 or 1 decimal
Followed by
(\d{2}){0,1}
0 or 1 of (exactly 2 digits)

Matching numbers greater than 40

I'm trying to match numbers greater than 40. The good point is that all of them have 2 decimal places, so all of them are like: 3.25, 5.89, 999.75 and they don't use any leading zeros (except on the decimal part that always have 2 digits)...
At first I tried the following code but then I realized this wouldn't match numbers like 100, 1000... even if they are greater than 40.
[4-9][0-9]\.
I don't have to match the decimal part, so don't worry about matching that, just help me to find how to match numbers greater than 40 (up to 9999 would be fine).
Thanks for your help.
This should do the job:
([4-9][0-9]|\d{3,})\.
Check it here:
http://www.regexr.com/3a5v9
Don't use regular expressions for number comparison. If, for example, you're using Javascript:
var aNumber = parseFloat("50");
if (aNumber > 40) {
// yay!
}
If your regex flavour can use negative lookbehind to match the numbers from 41 to 9999 without decimal:
\b(?:[1-9][0-9]{2,3}|[5-9][0-9]|4[1-9])(?<!\.\d{1,2})\b
(40\.(?!0[^\d]|00)\d{1,2}|(((4[1-9](?!\d)|[5-9][0-9])(?![\d])|\d*[1-9]\d{2,})(\.\d{1,2})?))
This prevents false positives from leading 0s.
This worked for me.
It tries to match 40 followed by 1 or two decimals that are not 00.
It then tries to match 4 followed by 1-9, decimal optional.
If it can't match that it matches 5-9 followed by 0-9, decimal optional.
It then triese to match any digit, any number of times, followed by 1-9, followed by 1 or 2 digits, decimal optional.
If you want to require the decimal, just remove the last question mark.
This will do it:
([4-9][0-9]+|\d{3,})
This it will get all the numbers of two digits having the first one greater than 4 or any number with three digits.
As an example http://www.regexr.com/3a5v0
You can use brackets to indicate a minimum and, if desired, maximum number of characters to match. So,
([4-9][0-9]|[1-9][0-9]{2,})\.
matches 4-9 followed by one or more digits. Presumably there's a boundary of some sort at the beginning of this, but it sounds like you have that part worked out. This uses an OR to allow for two possible groups of first digits.
(Most of the other answer are perfect for me -- This is paranoia and a bad idea :)
for use with grep -Po or Perl we could use:
'\b(\d{3,}|[4-9]\d)\.\d\d'
but this would get 40.00 (not greater than 40)
'\b(\d{3,}|[5-9]\d|4[1-9])\.\d\d|\b40\.\d?[1-9]\d?'
Corresponding to:
DDD.DD
| [5-9]D.DD
| 4[1-9].DD
| 40.D[1-9]
| 40.[1-9]D
In flex(1) you have this code to parse strings and get numbers greater than 40:
pru.l:
%option noyywrap
%%
\+?(0*[4-9][0-9]|0*[1-9][0-9][0-9][0-9]*)(\.[0-9]*)? { printf("Greater than 40: %s\n", yytext); }
\-?[0-9]*(\.[0-9]*)? { printf("Lesser than 40: %s\n", yytext); }
\n |
. ;
%%
int main()
{ yylex(); }
Install flex and compile this file it with
make pru
Then run it as:
pru <filein >fileout
or just
pru
This code constructs a deterministic finite automaton from the regular expressions listed and prints the commands listed on the right when recognizes a value greater than 40. It allows a leading optional sign and leading zeros, and an optional fractional part composed of any number of digits. And it does this with only one asignment and one decision for each character read. You have access to the automaton state table generated by flex (it writes C code for you)
the regex that recognizes numbers greater than 40 (with decimals and leading sign and zeros) is:
\+?(0*[4-9][0-9]|0*[1-9][0-9][0-9][0-9]*)(\.[0-9]*)?
and can be abreviated as:
\+?(0*[4-9][0-9]|0*[1-9][0-9]{3,})(\.[0-9]*)?
explanation:
\+? matches an optional plus sign.
(...|...) two options:
0* optional arbitrary number of leadin zeros.
[4-9][0-9] the numbers 40 to 99
[1-9][0-9]{3,} the numbers 100 and up.
(.[0-9]*)? optional decimal point followed by an arbitrary number of digits.

Regular Expression - Two Digit Range (23-79)?

I have been reading the regex questions on this site but my issue seems to be a bit different. I need to match a 2 digit number, such as 23 through 75. I am doing this on an HP-UX Unix system. I found examples of 3 - 44 but or any digit number, nothing that is fixed in length, which is a bit surprising, but perhaps I am not understand the variable length example answer.
Since you're not indicating whether this is in addition to any other characters (or in the middle of a larger string), I've included the logic here to indicate what you would need to match the number portion of a string. This should get you there. We're creating a range for the second numbers we're looking for only allowing those characters. Then we're comparing it to the other ranges as an or:
(2[3456789]|[3456][0-9]|7[012345])
As oded noted you can do this as well since sub ranges are also accepted (depends on the implementation of REGEX in the application you're using):
(2[3-9]|[3-6][0-9]|7[0-5])
Based on the title you would change the last 5 to a 9 to go from 75-79:
(2[3-9]|[3-6][0-9]|7[0-9])
If you are trying to match these numbers specifically as a string (from start to end) then you would use the modifiers ^ and $ to indicate the beginning and end of the string.
There is an excellent technical reference of Regex ranges here:
http://www.regular-expressions.info/numericranges.html
If you're using something like grep and trying to match lines that contain the number with other content then you might do something like this for ranges thru 79:
grep "[^0-9]?(2[3-9]|[3-6][0-9]|7[0-9])[^0-9]?" folder
This tool is exactly what you need: Regex_For_Range
From 29 to 79: \b(2[3-9]|[3-7][0-9])\b
From 29 to 75: \b(29|[3-6][0-9]|7[0-5])\b
And just for fun, from 192 to 1742: \b(19[2-9]|[2-9][0-9]{2}|1[0-6][0-9]{2}|17[0-3][0-9]|174[0-2])\b :)
If I want 2 digit range 0-63
/^[0-9]|[0-5][0-9]|6[0-3]$/
[0-9] will allow single digit from 0 to 9
[0-5][0-9] will allow from 00 to 59
6[0-3] will allow from 60 till 63
This way you can take Regular Expression for any Two Digit Range
You have two classes of numbers you want to match:
the digit 2, followed by one of the digits between 3 and 9
one of the digits between 3 and 7, followed by any digit
Edit: Well, that's the title's range (23-79). Within your question (23-75), you have three:
the digit 2, followed by one of the digits between 3 and 9
one of the digits between 3 and 6, followed by any digit
the digit 7, followed by one of the digits between 0 and 5
Just to add to this, here is a solution for generating the string from the accepted answer in javascript. You can click "Run Code Snippet" to enter your own bounds and get your own string.
function regexRangeString(lower,upper){
let current=lower;
let nextRange=function(){
let currentString=String(current);
let len=currentString.length;
let string="";
let newUpper;
for(let digit=0;digit<len;digit++){
let index=len-digit-1;
let lower=Number(currentString[index]);
let thisString="";
for(let u=9;u>=lower;u--){
let us=currentString.substring(0,index)+u+currentString.substring(index+1,len);
if(Number(us)<=upper){
if(lower==u){
thisString=lower;
}
else{
thisString=`[${lower}-${u}]`;
}
currentString=currentString.substring(0,index)+u+currentString.substring(index+1,len);
break;
}
}
if(thisString!="[0-9]"){
string=currentString.substring(0,index)+thisString+string;
break;
}
else{
string=thisString+string
}
}
current=Number(currentString)+1;
return string
}
let string=""
while(current<upper){
string+="|"+nextRange(current);
}
string="("+string.slice(1)+")";
return string
}
let lower=prompt("Enter Lower Bound")
let upper=prompt("Enter Upper Bound")
alert(regexRangeString(lower,upper))
For example:
regexRangeString(72,189)
generates the following output string:
(7[2-9]|[8-9][0-9]|1[0-8][0-9])
This should do it:
/^([2][3-9]|[3-6][0-9]|[7][0-5])$/
^ and $ will make it strict that it will match only 2 numbers, so in case that you have i.e 234 it won't work.