Need regex expression with multiple conditions - regex

I need a regex with the following conditions:
It should accept a maximum of 5 digits, then up to 3 decimal places
it can be negative
it can be zero
it can be only numbers (up to 5 digits)
it can be null
I have tried the following, but it does not fulfil all the conditions:
@"^([\-\+]?)\d{0,5}(.[0-9]{1,3})?)$"
E.g. the values it can hold range from -99999.999 to 99999.999

Use this regex:
^[-+]?\d{0,5}(\.[0-9]{1,3})?$
I only made two changes here. First, you don't need to escape any characters inside a character class normally, except for opening and closing brackets, or possibly backslash itself. Hence, we can use [-+] to capture an initial plus or minus. Second, you need to escape the dot in your regex, to tell the engine that you want to match a literal dot.
However, I would probably phrase this regex as follows:
^[-+]?\d{1,5}(\.[0-9]{1,3})?$
This will match one to five digits, optionally followed by a decimal point and one to three digits.
Note that we want to capture things like:
0.123
But not
.123
i.e. we don't want to capture a leading decimal point should it not be prefixed by at least one number.
Demo here:
Regex101
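As a quick check of the second pattern, here is a minimal Python sketch. The sample inputs and the helper name `is_valid` are chosen here for illustration and are not from the original answer; `re.fullmatch` plays the role of the `^` and `$` anchors.

```python
import re

# Pattern from the answer above: optional sign, 1-5 digits,
# then optionally a dot followed by 1-3 digits.
NUMBER = re.compile(r"[-+]?\d{1,5}(\.[0-9]{1,3})?")

def is_valid(text):
    """Return True if the whole string matches the pattern."""
    return NUMBER.fullmatch(text) is not None

# Illustrative samples (assumed, not taken from the original post).
samples = ["0", "-99999.999", "99999.999", "0.123", ".123", "123456", "1.2345", ""]
for s in samples:
    print(repr(s), is_valid(s))
```

With `\d{1,5}` the empty string and `.123` are rejected; switching back to `\d{0,5}` would accept both.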

I assume you're doing this in C# given the notation. Here's a little code you can use to test your expression, with two corrections:
You have to escape the dot, otherwise it means "any character". So, \. instead of .
There was an extraneous close parenthesis that prevented the expression from compiling
C#:
var expr = @"^([\-\+]?)\d{0,5}(\.[0-9]{1,3})?$";
var re = new Regex(expr);
string[] samples = {
"",
"0",
"1.1",
"1.12",
"1.123",
"12.3",
"12.34",
"12.345",
"123.4",
"12345.123",
".1",
".1234"
};
foreach (var s in samples) {
    Console.WriteLine("Testing [{0}]: {1}", s, re.IsMatch(s) ? "PASS" : "FAIL");
}
Results:
Testing []: PASS
Testing [0]: PASS
Testing [1.1]: PASS
Testing [1.12]: PASS
Testing [1.123]: PASS
Testing [12.3]: PASS
Testing [12.34]: PASS
Testing [12.345]: PASS
Testing [123.4]: PASS
Testing [12345.123]: PASS
Testing [.1]: PASS
Testing [.1234]: FAIL

It should accept maximum of 5 digits
[0-9]{1,5}
then upto 3 decimal places
[0-9]{1,5}(\.[0-9]{1,3})?
it can be negative
[-]?[0-9]{1,5}(\.[0-9]{1,3})?
it can be zero
Already covered.
it can be only numbers (max. upto 5 digit place)
Already covered. 'Up to 5 digit place' contradicts your first rule, which allows 5.3.
it can be null
Not covered. I strongly suggest you remove this requirement. Even if you mean 'empty', as I sincerely hope you do, you should detect that case separately and beforehand, as you will certainly have to handle it differently.
Your regular expression contains ^ and $. I don't know why. There is nothing about start of line or end of line in the rules you specified. It also allows a leading +, which again isn't specified in your rules.
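The suggestion to detect the empty case before consulting the regex could look roughly like this in Python. The function name `classify` and its three return labels are hypothetical, picked only for this sketch.

```python
import re

# Optional minus, 1-5 digits, optional fraction of 1-3 digits.
# No leading "+" is allowed, following the rules as stated.
AMOUNT = re.compile(r"-?[0-9]{1,5}(\.[0-9]{1,3})?")

def classify(text):
    """Return 'empty', 'valid' or 'invalid' for a user-supplied value."""
    if text is None or text == "":
        return "empty"  # handled before the regex is ever consulted
    if AMOUNT.fullmatch(text):
        return "valid"
    return "invalid"
```

Keeping the empty case out of the pattern lets the caller treat "no value" differently from "bad value".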

Related

Regex for string representation of a method call

I have a string that follows a specific pattern like so
operator(field,value)
and I'd like to use regex to extract out all three of operator, field and value. I'm struggling to come up with the syntax for how to capture these. In this case value can be alphanumeric as well, for example
"contains(name, Joe)"
or "lt(quantity, 2.5)"
Use something like this to capture groups, you may want to limit the characters accepted with [], note the use of ` and the use of \ escaping for () within the regexp:
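package main

import ("fmt"; "regexp")

// Sample inputs, reconstructed from the output shown below.
var tests = []string{"contains(field, value)", "contains(name, Joe)", "lt(quantity, 2.5)", "plus(no,44)"}

func main() {
	re := regexp.MustCompile(`(.+)\((.+),\s?(.+)\)`)
	for _, t := range tests {
		fmt.Println("result", re.FindStringSubmatch(t))
	}
}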
https://play.golang.org/p/43YLTafgQt
output:
result [contains(field, value) contains field value]
result [contains(name, Joe) contains name Joe]
result [lt(quantity, 2.5) lt quantity 2.5]
result [plus(no,44) plus no 44]
Depending on how strict you want to be you could use [a-z]+ or similar instead of .+ to match only certain characters but if you are not worried about bogus values this would probably be fine.
I don't know golang, but I do know regex's, so I'll do what I can here.
You probably want a group each for the "operator", "field", and "value". I'm going to assume for now that each of these can be represented as any combination of alphabetic, numeric, or underscore characters, with length of at least one character. In regex, we have a shortcut for that: \w represents a single alpha-numeric or underscore character, and the + modifier means "one or more". So \w+ means one or more such character in a row. If you want a more complex definition of what these fields can be named, I'll let you specify that in your question.
You say that you want to support "operator(field,value)". I'll start without whitespace anywhere, because it's simpler and you can easily remove all whitespace yourself before running the regex. We'll later add some whitespace support to the regex if you want it, but it'll make life difficult.
To do this, we want three groups, "1(2,3)" where 1 is the operator name, 2 is the field name, and 3 is the value name. Each of these, as given above, will be \w+ in our regex. We'll want to match the open and close parentheses as well as the comma, but we'll throw them away because they're really just delimiters. The parentheses will need to be escaped in the regex, since regex's have a special meaning for parentheses. The result looks like:
(\w+)\((\w+),(\w+)\)
\ 1 /  \ 2 / \ 3 /
Where the second line shows you where the groups are each defined.
If you want to support some whitespace, you'll need to add \s* in all such locations. This gets hairy, but you can do it as such:
(\w+)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)
\ 1 /        \ 2 /       \ 3 /
You give an example of wanting to support floating point values, and I presume other kinds of values too. You can accomplish this using the "or" pipe, |. For example, group 3, instead of just being \w+, could be defined as
[a-zA-Z_]\w*|\d+\.?|\d*\.\d+
This string will support alphanumeric+underscore strings where the first character must be alphabetic or underscore, OR integers, OR floating point (defined as an integer string with a period at the beginning, middle, or end). Clearly, this can go on and on to support more complex string values, but you get the idea.
So the final regex might look like:
(\w+)\s*\(\s*(\w+)\s*,\s*([a-zA-Z_]\w*|\d+\.?|\d*\.\d+)\s*\)
Sorry for not giving any golang help, I hope someone else can edit my answer and fill in that major gap.
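For readers who want to try the whitespace-tolerant pattern, here is a small sketch in Python rather than Go. The helper name `parse_call` is made up for this example, and the value group uses `[a-zA-Z_]\w*` as in the definition of group 3 given earlier.

```python
import re

# operator ( field , value ) with optional whitespace around the delimiters.
# The value is an identifier, an integer (optionally ending in "."), or a float.
CALL = re.compile(
    r"(\w+)\s*\(\s*(\w+)\s*,\s*([a-zA-Z_]\w*|\d+\.?|\d*\.\d+)\s*\)"
)

def parse_call(text):
    """Return (operator, field, value), or None if the text does not match."""
    m = CALL.fullmatch(text)
    return m.groups() if m else None

for sample in ["contains(name, Joe)", "lt(quantity, 2.5)", "plus(no,44)"]:
    print(sample, "->", parse_call(sample))
```

Because the whole string has to match, the engine backtracks out of the integer alternative when it sees `2.5` and falls through to the float alternative.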

Using RegEx how do I remove the trailing zeros from a decimal number

I'm needing to write some regex that takes a number and removes any trailing zeros after a decimal point. The language is Actionscript 3. So I would like to write:
var result:String = theStringOfTheNumber.replace( [ the regex ], "" );
So for example:
3.04000 would be 3.04
0.456000 would be 0.456 etc
I've spent some time looking at various regex websites and I'm finding this harder to resolve than I initially thought.
Regex:
^(\d+\.\d*?[1-9])0+$
OR
(\.\d*?[1-9])0+$
Replacement string:
$1
DEMO
Code:
var result:String = theStringOfTheNumber.replace(/(\.\d*?[1-9])0+$/g, "$1" );
What worked best for me was
^([\d,]+)$|^([\d,]+)\.0*$|^([\d,]+\.[0-9]*?)0*$
For example,
s.replace(/^([\d,]+)$|^([\d,]+)\.0*$|^([\d,]+\.[0-9]*?)0*$/, "$1$2$3");
This changes
1.10000 => 1.1
1.100100 => 1.1001
1.000 => 1
1 => 1
What about stripping the trailing zeros before a \b boundary if there's at least one digit after the .
(\.\d+?)0+\b
And replace with what was captured in the first capture group.
$1
See test at regexr.com
(?=.*?\.)(.*?[1-9])(?!.*?\.)(?=0*$)|^.*$
Try this. Grab the capture. See demo.
http://regex101.com/r/xE6aD0/11
Other answers didn't consider numbers without fraction (like 1.000000 ) or used a lookbehind function (sadly, not supported by implementation I'm using). So I modified existing answers.
Match using ^-?\d+(\.\d*[1-9])? - Demo (see matches). This will not work with numbers in text (like sentences).
Replace(with \1 or $1) using (^-?\d+\.\d*[1-9])(0+$)|(\.0+$) - Demo (see substitution). This one will work with numbers in text (like sentences) if you remove the ^ and $.
Both demos with examples.
Side note: Replace the \. with decimal separator you use (, - no need for slash) if you have to, but I would advise against supporting multiple separator formats within such regex (like (\.|,)). Internal formats normally use one specific separator like . in 1.135644131 (no need to check for other potential separators), while external tend to use both (one for decimals and one for thousands, like 1.123,541,921), which would make your regex unreliable.
Update: I added -? to both regexes to add support for negative numbers, which is not in demo.
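A minimal Python sketch of the replace variant above, for anyone who wants to experiment with it. The function name `strip_trailing_zeros` is an assumption of this example, and it relies on Python 3.5+ substituting an empty string for a group that did not take part in the match.

```python
import re

# First alternative keeps everything up to the last non-zero decimal (group 1)
# and drops the zeros after it; the second removes a fraction that is all zeros.
TRAILING = re.compile(r"(^-?\d+\.\d*[1-9])(0+$)|(\.0+$)")

def strip_trailing_zeros(number):
    # When the second alternative matches, group 1 is unset and is
    # replaced by an empty string (Python 3.5 and later).
    return TRAILING.sub(r"\1", number)

for n in ["3.04000", "0.456000", "1.000", "12300", "-2.500"]:
    print(n, "->", strip_trailing_zeros(n))
```

Integers such as 12300 are left alone because neither alternative can match without a decimal point.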
If your regular expressions engine doesn't support "lookaround" feature then you can use this simple approach:
fn:replace("12300400", "([^0])0*$", "$1")
Result will be: 123004
I know I am kind of late but I think this can be solved in a far more simple way.
Either I miss something or the other repliers overcomplicate it, but I think there is a far more straightforward yet resilient solution RE:
([0-9]*[.]?([0-9]*[1-9]|[0]?))[0]*
By backreferencing the first group (\1) you can get the number without trailing zeros.
It also works with .XXXXX... and ...XXXXX. type number strings. For example, it will convert .45600 to .456 and 123. to 123. as well.
More importantly, it leaves integer number strings intact (numbers without decimal point). For example, it will convert 12300 to 12300.
Note that if there is a decimal point followed only by zeroes, it will leave exactly one trailing zero. For example, for 42.0000 you get 42.0.
If you want to eliminate the leading zeroes too, then use this RE (just put a [0]* at the start of the former):
[0]*([0-9]*[.]?([0-9]*[1-9]|[0]?))[0]*
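The behaviour described for the first of these two expressions can be checked with a short Python sketch. The helper name `trim` is hypothetical, and the sketch assumes the input string contains nothing but the number, so `re.fullmatch` is used.

```python
import re

# Group 1 is the number without its trailing zeros; the final [0]* soaks up
# whatever zeros are left over.
KEEP = re.compile(r"([0-9]*[.]?([0-9]*[1-9]|[0]?))[0]*")

def trim(number):
    """Return group 1 if the whole string is a number, else the input unchanged."""
    m = KEEP.fullmatch(number)
    return m.group(1) if m else number

for n in [".45600", "123.", "12300", "42.0000", "3.04000"]:
    print(n, "->", trim(n))
```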
I tested few answers from the top:
^(\d+\.\d*?[1-9])0+$
(\.\d*?[1-9])0+$
(\.\d+?)0+\b
None of them work when everything after the "." is zero, as in 45.000 or 450.000.
A modified version that handles that case: (\.\d*?[1-9]|)\.?0+$
Note that it also strips trailing zeros from integers that have no decimal point (450 becomes 45).
The replacement must be '$1', like:
preg_replace('/(\.\d*?[1-9]|)\.?0+$/', '$1', $value);
try this
^(?!0*(\.0+)?$)(\d+|\d*\.\d+)$
And read this
http://www.regular-expressions.info/numericranges.html it might be helpful.
I know it's not what the original question is looking for, but anyone who is looking to format money and would only like to remove two consecutive trailing zeros, like so:
£30.00 => £30
£30.10 => £30.10 (and not £30.1)
30.00€ => 30€
30.10€ => 30.10€
Then you should be able to use the following regular expression which will identify two trailing zeros not followed by any other digit or exist at the end of a string.
([^\d]00)(?=[^\d]|$)
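One way to apply this pattern is to replace each match with an empty string. The Python sketch below does that; the function name `drop_zero_cents` is invented here, and replacing with an empty string is this example's reading of how the expression is meant to be used.

```python
import re

# A non-digit followed by "00", as long as no further digit comes after it.
DOUBLE_ZERO = re.compile(r"([^\d]00)(?=[^\d]|$)")

def drop_zero_cents(amount):
    """Remove a '.00'-style suffix but leave values such as '.10' untouched."""
    return DOUBLE_ZERO.sub("", amount)

for a in ["£30.00", "£30.10", "30.00€", "30.10€"]:
    print(a, "->", drop_zero_cents(a))
```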
I'm a bit late to the party, but here's my solution:
(((?<=(\.|,)\d*?[1-9])0+$)|(\.|,)0+$)
My regular expression will only match the trailing 0s, making it easy to do a .replaceAll(..) type function.
Breaking it down, part one: ((?<=(\.|,)\d*?[1-9])0+$)
(?<=(\.|,): A positive look behind. Decimal must contain a . or a , (commas are used as a decimal point in some countries). But as its a look behind, it is not included in the matched text, but still must be present.
\d*?: Matches any number of digits lazily
[1-9]: Matches a single non-zero character (this will be the last digit before trailing 0s)
0+$: Matches 1 or more 0s that occur between the last non-zero digit and the line end.
This works great for everything except the case where trailing 0s begin immediately, like in 1.0 or 5.000. The second part fixes this (\.|,)0+$:
(\.|,): Matches a . or a , that will be included in matched text.
0+$ matches 1 or more 0s between the decimal point and the line end.
Examples:
1.0 becomes 1
5.0000 becomes 5
5.02394900022000 becomes 5.02394900022
Is it really necessary to use regex? Why not just check the last digits in your numbers? I am not familiar with Actionscript 3, but in python I would do something like this:
decinums = ['1.100', '0.0', '1.1', '10']
for d in decinums:
    if '.' in d:  # only touch numbers that have a decimal point
        while d.endswith('0'):
            d = d[:-1]
        if d.endswith('.'):
            d = d[:-1]
    print(d)
The result will be:
1.1
0
1.1
10

Regular expression for password (at least 2 digits and one special character and minimum length 8)

I have been searching for a regular expression which accepts at least two digits and one special character, with a minimum password length of 8. So far I have done the following: [0-9a-zA-Z!@#$%0-9]*[!@#$%0-9]+[0-9a-zA-Z!@#$%0-9]*
Something like this should do the trick.
^(?=(.*\d){2})(?=.*[a-zA-Z])(?=.*[!@#$%])[0-9a-zA-Z!@#$%]{8,}
(?=(.*\d){2}) - uses lookahead (?=) and says the password must contain at least 2 digits
(?=.*[a-zA-Z]) - uses lookahead and says the password must contain an alpha
(?=.*[!@#$%]) - uses lookahead and says the password must contain 1 or more special characters which are defined
[0-9a-zA-Z!@#$%] - dictates the allowed characters
{8,} - says the password must be at least 8 characters long
It might need a little tweaking e.g. specifying exactly which special characters you need but it should do the trick.
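To try the lookahead pattern, here is a rough Python sketch. It assumes the special characters are exactly `!@#$%`, and the helper name `is_acceptable` is made up for this example. Since the pattern has no trailing `$`, the sketch uses `re.fullmatch` so that disallowed characters after the first eight are rejected as well.

```python
import re

# Three lookaheads (two digits, one letter, one special character), followed
# by the set of allowed characters repeated at least 8 times.
PASSWORD = re.compile(
    r"^(?=(.*\d){2})(?=.*[a-zA-Z])(?=.*[!@#$%])[0-9a-zA-Z!@#$%]{8,}"
)

def is_acceptable(password):
    """Return True if the entire password satisfies every rule."""
    return PASSWORD.fullmatch(password) is not None

for p in ["abc12!de", "abcdefg1!", "ab12!", "abcd1234", "1234!@#$"]:
    print(p, is_acceptable(p))
```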
There is no reason, whatsoever, to implement all rules in a single regex.
Consider doing it like thus:
Pattern[] pwdrules = new Pattern[] {
    Pattern.compile("........"),                  // at least 8 chars
    Pattern.compile("\\d.*\\d"),                  // 2 digits
    Pattern.compile("[-!\"§$%&/()=?+*~#'_:.,;]")  // 1 special char
};
String password = ......;
boolean passed = true;
for (Pattern p : pwdrules) {
    Matcher m = p.matcher(password);
    if (m.find()) continue;
    System.err.println("Rule " + p + " violated.");
    passed = false;
}
if (passed) { .. ok case.. }
else { .. not ok case ... }
This has the added benefit that password rules can be added, removed or changed without effort. They can even reside in some resource file.
In addition, it is just more readable.
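The same one-pattern-per-rule idea, sketched in Python. The rule descriptions, the function name `violated_rules`, and the choice of `!@#$%` as the special characters are assumptions of this example.

```python
import re

# Each rule is a (description, pattern) pair; a password satisfies a rule
# when the pattern is found anywhere in it.
RULES = [
    ("at least 8 characters", re.compile(r".{8}")),
    ("at least 2 digits", re.compile(r"\d.*\d")),
    ("at least 1 special character", re.compile(r"[!@#$%]")),
]

def violated_rules(password):
    """Return the descriptions of all rules the password fails."""
    return [name for name, pattern in RULES if not pattern.search(password)]

print(violated_rules("abc12!de"))
print(violated_rules("abcdefgh"))
```

Returning the list of failed rules makes it straightforward to tell the user exactly what to fix.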
Try this one:
^(?=.*\d{2,})(?=.*[$-/:-?{-~!"^_`\[\]]{1,})(?=.*\w).{8,}$
Here's briefly how it works:
(?=.*\d{2,}) this part expects at least 2 digits
(?=.*[$-/:-?{-~!"^_`\[\]]{1,}) these are special characters, at least 1
(?=.*\w) and the rest are any letters (equals to [A-Za-z0-9_])
.{8,}$ this one says at least 8 characters, including all previous rules.
Below is map for current regexp (made with help of Regexper)
UPD
Regexp should look like this ^(?=(.*\d){2,})(?=.*[$-\/:-?{-~!"^_'\[\]]{1,})(?=.*\w).{8,}$
Check out comments for more details.
Try this regex. It uses lookahead to verify that there are at least two digits and one of the special characters you listed.
^(?=.*?[0-9].*?[0-9])(?=.*[!@#$%])[0-9a-zA-Z!@#$%0-9]{8,}$
EXPLANATION
^ #Match start of line.
(?=.*?[0-9].*?[0-9]) #Look ahead and see if you can find at least two digits. Expression will fail if not.
(?=.*[!@#$%]) #Look ahead and see if you can find at least one of the characters in the brackets []. Expression will fail if not.
[0-9a-zA-Z!@#$%0-9]{8,} #Match at least 8 of the characters inside the brackets [] to be successful.
$ # Match end of line.
Regular expressions define a structure on the string you're trying to match. Unless you define a spatial structure on your regex (e.g. at least two digits followed by a special char, followed by ...) you cannot use a regex to validate your string.
Try this: ^.*(?=.{8,15})(?=.*\d)(?=.*\d)[a-zA-Z0-9!@#$%]+$
Please read below link for making password regular expression policy:-
Regex expression for password rules

Regex for Regex validation decimal[19,3]

I want to validate a decimal number (decimal[19,3]). I used this
@"[\d]{1,16}|[\d]{1,16}[\.]\d{1,3}"
but it didn't work.
Below are valid values:
1234567890123456.123
1234567890123456.12
1234567890123456.1
1234567890123456
1234567
0.0
.1
Simplification:
The \d doesn't have to be in []. Use [] only when you want to check whether a character is one of multiple characters or character classes.
. doesn't need to be escaped inside [] - [\.] appears to just allow ., but allowing \ to appear in the string in the place of the . may be a language dependent possibility (?). Or you can just take it out of the [] and keep it escaped.
So we get to:
\d{1,16}|\d{1,16}\.\d{1,3}
(which can be shortened using the optional / "once or not at all" quantifier (?)
to \d{1,16}(\.\d{1,3})?)
Corrections:
You probably want to make the second \d{1,16} optional, or equivalently simply make it \d{0,16}, so something like .1 is allowed:
\d{1,16}|\d{0,16}\.\d{1,3}
If something like 1. should also be allowed, you'll need to add an optional . to the first part:
\d{1,16}\.?|\d{0,16}\.\d{1,3}
Edit: I was under the impression [\d] matches \ or d, but it actually matches the character class \d (corrected above).
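A short Python sketch for checking the last pattern against the valid values from the question. The function name `is_decimal_19_3` is hypothetical; `re.fullmatch` is used so the whole string has to match.

```python
import re

# Up to 16 integer digits with an optional trailing dot, or up to 16 integer
# digits (possibly none) followed by a dot and 1-3 decimals.
PATTERN = re.compile(r"\d{1,16}\.?|\d{0,16}\.\d{1,3}")

def is_decimal_19_3(text):
    """Return True if the entire string matches the pattern."""
    return PATTERN.fullmatch(text) is not None

valid = ["1234567890123456.123", "1234567890123456.12", "1234567890123456.1",
         "1234567890123456", "1234567", "0.0", ".1"]
for v in valid:
    print(v, is_decimal_19_3(v))
```

Anchoring matters here: without it the first alternative wins, so `PATTERN.match("1.5")` matches only `1.` and silently drops the decimals.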
This would match your 3 scenarios
^(\d{1,16}|(\d{0,16}\.)?\d{1,3})$
first part: a 1 to 16 digit number
second: a 0 to 16 digit number with 1 to 3 decimals
third: nothing before a dot and then 1 to 3 decimals
The ^ and $ are anchors that match the start of line and end of line, so if you need to search for numbers inside lines of text, you should remove them.
Testdata:
Usage in C#
string resultString = null;
try {
resultString = Regex.Match(subjectString, @"\d{1,16}\.?|\d{0,16}\.\d{1,3}").Value;
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
Slight optimization
A bit more complicated regex, but a bit more correct would be to have the ?: notation in the "inner" group, if you are not using it, to make that a non-capture group, like this:
^(\d{1,16}|(?:\d{0,16}\.)?\d{1,3})$
Following Regex will help you out -
@"^(\d{1,16}(\.\d{1,3})?|\.\d{1,3})$"
Try something like that
(\d{0,16}\.\d{0,3})|(\d{0,16})
It work with all your examples.
edit. new version ;)
You can try:
^\d{0,16}(?:\.|$)(?:\d{0,3}|)$
match 0 to 16 digits
then match a dot or end of string
and then match up to 3 more digits

Python: RE only captures first and last match

I'm trying to make a Regular Expression that captures the following:
- XX or XX:XX, up to 6 repetitions (XX:XX:XX:XX:XX:XX), where X is a hexadecimal number.
In other words, I'm trying to capture MAC addresses that can range from 1 to 6 bytes.
regex = re.compile("^([0-9a-fA-F]{2})(?:(?:\:([0-9a-fA-F]{2})){0,5})$")
The problem is that if I enter for example "11:22:33", it only captures the first match and the last, which results in ["11", "33"].
The question: is there any way to make the {0,5} repetition capture all repetitions, and not only the last one?
Thanks!
Not in Python, no. But you can first check the correct format with your regex, and then simply split the string at ':':
result = s.split(':')
Also note that you should always write regular expressions as raw strings (otherwise you get problems with escaping). And your outer non-capturing group does nothing.
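Put together, the validate-then-split approach might look like the sketch below. The function name `mac_bytes` is hypothetical, and the pattern is a simplified raw-string version of the one in the question, with no capture groups since the split does the extraction.

```python
import re

# 1 to 6 two-digit hex groups separated by colons, written as a raw string.
MAC_PREFIX = re.compile(r"[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){0,5}")

def mac_bytes(s):
    """Return the list of hex byte strings, or None if the format is wrong."""
    if MAC_PREFIX.fullmatch(s) is None:
        return None
    return s.split(":")

print(mac_bytes("11:22:33"))
print(mac_bytes("11:22:33:44:55:66:77"))
```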
Technically there is a way to do it with regex only, but the regex is quite horrible:
r"^([0-9a-fA-F]{2})(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?(?::([0-9a-fA-F]{2}))?$"
But here you would always get six captures, just that some might be empty.