regular expression about numbers - regex

Is this regular expression valid if I want to match numbers only up to 31?
[^0-9>31]+ Or will it also return alphabetic characters, so that I must somehow exclude them too?

Your regex accepts one or more characters, each of which is not one of the following
0 1 2 3 4 5 6 7 8 9 >
What you want is:
/^(?:[0-9]|[12][0-9]|3[01])$/
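To make the difference concrete, here is a minimal Python sketch (Python is my choice, the question does not name a language). It shows that the negated class from the question accepts letters and rejects digits, while the alternation accepts only 0 through 31. The helper name is_up_to_31 is made up for illustration.

```python
import re

# Pattern from the question: a negated class, so it matches runs of
# characters that are NOT a digit 0-9 and NOT '>'.
ORIGINAL = re.compile(r"[^0-9>31]+")

# Alternation from the answer: 0-9, then 10-29, then 30-31.
UP_TO_31 = re.compile(r"(?:[0-9]|[12][0-9]|3[01])")


def is_up_to_31(text):
    """Return True if text is a decimal number from 0 to 31 (hypothetical helper)."""
    # fullmatch plays the role of the ^ and $ anchors.
    return UP_TO_31.fullmatch(text) is not None


if __name__ == "__main__":
    print(bool(ORIGINAL.fullmatch("abc")))  # letters pass the original pattern
    print(bool(ORIGINAL.fullmatch("12")))   # digits do not
    print(is_up_to_31("31"), is_up_to_31("32"))
```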

Regular expressions are not the sonic screwdriver of text, able to magically do everything you could possibly want. There is nothing in regular expressions that will check the value of a number.
What you need to do is two steps, written here in Perl.
$ok = ($s =~ /^\d{1,2}$/) && ($s <= 31);
That checks $s for the start of the string (^), one or two digits (\d{1,2}), and then the end of the string ($). If that matches, it also checks that the numeric value of $s is at most 31.
Yes, you can use a complex regex like this from Ray Toal's answer:
/^(?:[0-9]|[12][0-9]|3[01])$/
but that is far less readable.
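For comparison, the same two-step idea might look like this in Python. This is only a sketch: the function name is_day_number is hypothetical, and I am assuming "up to 31" means 31 itself is allowed.

```python
import re


def is_day_number(s):
    """Two-step check: shape first, then numeric value (hypothetical helper)."""
    # Step 1: the string must be one or two ASCII digits and nothing else.
    if re.fullmatch(r"[0-9]{1,2}", s) is None:
        return False
    # Step 2: compare the value as a number (assumption: 31 is included).
    return int(s) <= 31
```

Note that, unlike the single-regex version, this accepts a leading zero such as "07", because the value check does not care how the number was written.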

Related

Regex, allow characters and digits, but allow up to 7 digits only

I would very much appreciate a bit of help with the following regex riddle.
I need regex statement that would validate against the following rules:
The input can contain letters, special characters and digits.
The input can't start with "0".
The input can have up to 7 digits.
Examples of valid input:
aa1234aa2.(less than 7 digits)
asd234566 (less than 7 digits)
Examples of invalid input:
0asdfd92 (starts with 0)
asd12312311 (more than 7 digits)
What I have tried so far:
^\D[0-9]{0,7}$,
validates against d0000000, but the input may be d0d0dddd1234d
The "can't start with 0" part can be dropped from the requirements if it complicates things a lot. The most important part is "can have up to 7 digits".
Regards,
Oleg
This is what you need!
Attempt 1: ^[1-9]\d{0,6}$
Attempt 2: ^[^0][\d\w]{0,6}$
Attempt 3: ^[^0].{0,6}$
Attempt 4: ^([\D]*\d){0,7}[\D]*$
Attempt 5: ^([\D]*[1-9]){0,7}[\D]*$|^[^0]\d{0,6}$
Attempt 6: ^([\D]*[1-9]){1,7}[\D]*$|^[^0]\d{1,6}$ <- this should work
If I understand the requirements correctly, this will work:
^(?=[^0])(\D*\d){0,7}\D*$
That will allow any string that does not start with a zero and has 7 or fewer digits. Any other characters are allowed in any quantity.
Explanation
The first part (?=[^0]) is an assertion that checks to make sure the string does not start with zero. The rest matches any number of non-digits followed by a digit, up to 7 times. Then any number of non-digits before the end of the string.
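A quick way to check this pattern against the examples from the question, sketched in Python (the helper name is_valid is my own, and the last sample is the d0d0dddd1234d string the asker mentioned):

```python
import re

# Pattern from the answer: no leading zero, at most seven digits overall.
PATTERN = re.compile(r"^(?=[^0])(\D*\d){0,7}\D*$")


def is_valid(s):
    """Return True if s passes the pattern (hypothetical helper)."""
    return PATTERN.match(s) is not None


if __name__ == "__main__":
    samples = ["aa1234aa2.", "asd234566", "0asdfd92", "asd12312311", "d0d0dddd1234d"]
    for s in samples:
        print(s, is_valid(s))
```

One thing to be aware of: because the lookahead needs a character to look at, the empty string is rejected.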
Assuming Perl (it looks like Perl regular expressions):
Check for leading zero: if (substr($pass, 0, 1) eq '0') { fail }
Check for no more than seven digits: if (($pass =~ tr /0-9/0-9/) > 7) { fail }
I'm generally against trying to cram everything into a single regular expression, especially when there are other tools available to do the job. In this case, the tr will not be executed if there is a leading zero, and a leading zero is easy to spot in the beginning of a string.
Doing it this way, it's easy to add further restrictions independently of the others. For example, "there may be more than 7 digits if they are all separated by other types of characters" (a regex for this one, probably).
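The same separate-checks approach, sketched in Python rather than Perl (the function name check_input is hypothetical). Each rule is its own statement, so a further restriction can be added without touching the others.

```python
def check_input(s):
    """Apply the two rules independently (hypothetical helper)."""
    # Rule 1: a leading zero is easy to spot without any regex.
    if s.startswith("0"):
        return False
    # Rule 2: count ASCII digits; more than seven fails.
    digit_count = sum(1 for ch in s if ch in "0123456789")
    return digit_count <= 7
```

As written, an empty string passes both rules; add an explicit length check if that should be rejected.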
You can use this regex:
^[^0](?:\D*\d){1,7}\D*$
This will perform following validations:
Must start with non-zero
Has 1 to 7 digits after first char
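One thing worth checking with this pattern: [^0] consumes the first character whatever it is, so if that character is a digit it is not counted by the {1,7} group. A small Python sketch with my own test strings (not from the answer) shows the effect:

```python
import re

# Pattern exactly as given in the answer above.
PATTERN = re.compile(r"^[^0](?:\D*\d){1,7}\D*$")


def matches(s):
    """Return True if s passes the pattern (hypothetical helper)."""
    return PATTERN.match(s) is not None


if __name__ == "__main__":
    print(matches("a1234567"))   # letter plus seven digits
    print(matches("12345678"))   # eight digits: first one is eaten by [^0]
    print(matches("abc"))        # no digit after the first character
```

So an all-digit string of eight digits passes, and a string with no digits at all fails. Whether either matters depends on how strictly "up to 7 digits" is meant.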
Verbose, but does the trick.
(^[1-9][^\d]*([\d]?[^\d]*){0,6}$|^[^\d]+([\d]?[^\d]*){0,7}$)
I found it easier to split the RegEx into two cases: when the string starts with a digit, and when it doesn't.
^((?:\D+(?:\d?\D*){0,7})|(?:[1-9]\D*(?:\d?\D*){0,6}))$

Regular Expression to Remove HTML Number Codes

I would like to remove special HTML characters using regular Expressions.
&#8482; (™) is the trade mark symbol - that's okay to stay.
But if the length of numbers between &# and ; is greater than 4 digits, it needs to be removed.
For example: &#128527; (😏) is a smiley face - needs to be filtered out.
This line of code is not working $article =~ s/&#\d{4,};//;
Use the global flag to replace all instances of a pattern, rather than just the first.
If you want to replace instances with greater than 4 digits, then quantify with a minimum of 5.
$article =~ s/&#\d{5,};//g;
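If you want to try the same substitution outside Perl, here is a rough Python equivalent. The function name strip_long_entities and the sample string are made up. re.sub replaces every occurrence by default, so there is no separate global flag.

```python
import re


def strip_long_entities(article):
    """Remove numeric character references with five or more digits."""
    # &#, then at least five digits, then ;
    return re.sub(r"&#\d{5,};", "", article)


if __name__ == "__main__":
    sample = "Acme&#8482; rocks &#128527;!"
    print(strip_long_entities(sample))
```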

perl regular expression digit match

I'm using the regex match below to verify that a date is in YYYY_MM_DD format. But the check prints the error message if I have a value of 2012_07_7. The day and month parts should be exactly 2 digits according to the regex pattern. Not sure why it's not working.
if ($cmdParams{RunId} !~ m/^\d{4}_\d{2}_\d{2}$/)
{
    print "Not a valid date in the format YYYY_MM_DD";
}
Your regex specifies exactly 2 digits for the day component, if you want to allow either 1 or 2 digits you should use {1,2} rather than {2}
If you look at your data, 2012_07_7, you can see that the day part does not have two digits.
Your pattern dictates that the last numeric chunk must be two digits, whereas you are providing one. So if you want your pattern to match this text, try something like:
if ($cmdParams{RunId} !~ m/^\d{4}_\d{2}_\d\d?$/)
My solution: ^\d{4}_(?:1[0-2]|0?[1-9])_(?:3[01]|[1-2]\d|0?[1-9])$
This pattern matches 2000_12_01, 2001_1_1, or 2001_02_1.
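To compare the strict pattern from the question with the more lenient one, here is a small Python sketch. The constant and function names are my own; the patterns are the ones quoted in this thread, with fullmatch standing in for the ^ and $ anchors.

```python
import re

# Strict: exactly four, two and two digits.
STRICT = re.compile(r"\d{4}_\d{2}_\d{2}")
# Lenient: month 1-12 and day 1-31, each with an optional leading zero.
LENIENT = re.compile(r"\d{4}_(?:1[0-2]|0?[1-9])_(?:3[01]|[12]\d|0?[1-9])")


def is_strict_date(s):
    return STRICT.fullmatch(s) is not None


def is_lenient_date(s):
    return LENIENT.fullmatch(s) is not None


if __name__ == "__main__":
    for s in ["2012_07_07", "2012_07_7", "2001_1_1", "2012_13_01"]:
        print(s, is_strict_date(s), is_lenient_date(s))
```

Neither pattern knows about month lengths, so something like 2012_02_31 still passes both. A real date parser is needed for that.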

How do I write a Regular Expression to match any three digit number value?

I'm working with some pretty funky HTML markup that I inherited, and I need to remove the following attributes from about 72 td elements.
sdval="285"
I know I can do this with find/replace in my code editor, except since the value of each attribute is different by 5 degree increments, I can't match them all without a Regular Expression. (FYI I'm using Esspress and it does support RegExes in its Find/Replace tool)
Only trouble is, I really can't figure out how to write a RegEx for this value. I understand the concept of RegExes, but really don't know how to use them.
So how would I write the following with a Regular Expression in place of the digits so that it would match any three digit value?
sdval="285"
/sdval="\d{3}"/
EDIT:
To answer your comment, \d in regular expressions means match any digit, and the {n} construct means repeat the previous item n times.
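If the editor's Find/Replace is not an option, the same pattern can be applied from a script. A minimal Python sketch, assuming the attribute is always preceded by whitespace inside the tag (the function name strip_sdval and the sample markup are made up):

```python
import re


def strip_sdval(html):
    """Remove sdval="NNN" attributes with exactly three digits."""
    # \s+ also removes the whitespace in front of the attribute.
    return re.sub(r'\s+sdval="\d{3}"', "", html)


if __name__ == "__main__":
    print(strip_sdval('<td sdval="285" class="x">285</td>'))
```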
Easiest, most portable: [0-9][0-9][0-9]
More "modern": \d{3}
This should do (ignores leading zeros):
[1-9][0-9]{0,2}
import re
data = "719"
data1 = "79"
# This expression matches a run of one, two or three digits
expression = r'\d{1,3}'
print(re.search(expression, data).group())
# This expression matches only three consecutive digits; "79" has two, so search() returns None
expression1 = r'\d{3}'
print(re.search(expression1, data1))
Output :
719
None
It sounds like you're trying to do a find / replace in Visual Studio of a 3 digit number (references to Express and Find/Replace tool). If that's the case the regex to find a 3 digit number in Visual Studio is the following
<:d:d:d>
Breakdown
The < and > establish a word boundary to make sure we don't match a number subset.
Each :d entry matches a single digit.

Regex - Find numbers between 2000 and 3000

I have a need to search all numbers with 4 digits between 2000 and 3000.
There may be letters before and after.
I thought I could use [2000-3000]{4}, but it doesn't work. Why?
thank you.
How about
^(?:2\d{3}|3000)$
Or as Amarghosh & Bart K. & jleedev pointed out, to match multiple instances
\b(?:2[0-9]{3}|3000)\b
If you need to match a3000 or 3000a but not 13000, you would need lookahead and lookbehind like
(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])
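Here is how the lookaround version behaves on a few made-up inputs, sketched in Python (the function name find_years is hypothetical):

```python
import re

# A number 2000-3000 that is not glued to another digit on either side.
PATTERN = re.compile(r"(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])")


def find_years(text):
    """Return every standalone 2000-3000 number found in text."""
    return PATTERN.findall(text)


if __name__ == "__main__":
    print(find_years("a3000 3000a 13000 x2500y 1999"))
    # -> ['3000', '3000', '2500']
```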
Regular expressions are rarely suitable for checking ranges since for ranges like 27 through 9076 inclusive, they become incredibly ugly. It can be done but you're really better off just doing a regex to check for numerics, something like:
^[0-9]+$
which should work on just about every regex engine, and then check the range manually.
In toto:
import re

def isBetween2kAnd3k(s):
    # Step 1: digits only
    if not re.fullmatch(r"[0-9]+", s):
        return False
    # Step 2: check the range numerically
    i = int(s)
    if i < 2000 or i > 3000:
        return False
    return True
What your particular regex [2000-3000]{4} is checking for is exactly four occurrences of any of the following characters: 2,0,0,0-3,0,0,0 - in other words, exactly four digits drawn from 0-3.
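This is easy to confirm by experiment. A small Python sketch (the helper name matches_class is my own) that tries the pattern on a few four-digit strings:

```python
import re

# The pattern from the question, unchanged.
CLASS = re.compile(r"[2000-3000]{4}")


def matches_class(s):
    """Return True if the whole string matches the question's pattern."""
    return CLASS.fullmatch(s) is not None


if __name__ == "__main__":
    for s in ["3210", "1230", "3000", "2500"]:
        print(s, matches_class(s))
```

2500 is rejected because 5 is outside 0-3, while 3210 and 1230 are accepted even though they are nowhere near the 2000 to 3000 range.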
With letters before and after, you will need to modify the regex and check the correct substring, something like:
import re

def isBetween2kAnd3kWithLetters(s):
    # Optional letters, exactly four digits (captured), optional letters
    m = re.fullmatch(r"[A-Za-z]*([0-9]{4})[A-Za-z]*", s)
    if not m:
        return False
    i = int(m.group(1))
    if i < 2000 or i > 3000:
        return False
    return True
As an aside, a regex for checking the range 27 through 9076 inclusive would be something like this hideous monstrosity:
^(?:2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])$
I think that's substantially less readable than using ^[1-9][0-9]+$ then checking if it's between 27 and 9076 with an if statement?
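Range regexes like this are easy to get subtly wrong, so it is worth brute-forcing them against the plain numeric check. A Python sketch, under two assumptions of mine: the alternatives are wrapped in a group so the anchors apply to all of them, and the two-digit branch is written as [3-9][0-9] to cover 30 through 99. The function names are hypothetical.

```python
import re

# Assumed form of the range pattern: grouped, with [3-9][0-9] for 30-99.
RANGE = re.compile(
    r"^(?:2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])$"
)


def in_range_regex(s):
    """Range check done entirely by the regex."""
    return RANGE.match(s) is not None


def in_range_numeric(s):
    """Shape check by a simple regex, range check by comparison."""
    if re.fullmatch(r"[1-9][0-9]+", s) is None:
        return False
    return 27 <= int(s) <= 9076


if __name__ == "__main__":
    # Report any number where the two approaches disagree.
    mismatches = [n for n in range(10001)
                  if in_range_regex(str(n)) != in_range_numeric(str(n))]
    print(mismatches)
```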
Hum tricky one. The dash - only applies to the character immediately before and after so what your regex is actually matching is exactly 4 characters between 0 and 3 inclusive (ie, 0, 1, 2 and 3). eg, 3210, 1230, 3333, etc... Try the expression below.
(2[0-9]{3})|(3000)
Here's explanation why and ways to detect ranges: http://www.regular-expressions.info/numericranges.html
Correct regex will be \b(2\d{3}|3000)\b. That means: match character '2' then exactly three digits (this will match any from 2000 to 2999) or just match '3000'. There are some good tutorials on regular expressions:
http://gnosis.cx/publish/programming/regular_expressions.html
http://immike.net/blog/2007/04/06/the-absolute-bare-minimum-every-programmer-should-know-about-regular-expressions/
http://www.regular-expressions.info/
Why don't you check for greater or less than? It's simpler than a regex.
num >= 2000 and num <= 3000