regex in python/ansible - regex

I am a newbie to regex. I have an example string: account-device-v2-2-3-63-21900
and I am using this regular expression: [1-9]-[0-9]-[0-9]*
I am getting 2-2-3 as output,
but my intention is to match/extract the pattern 2-3-63.
That is, I want the hyphen-separated digits after v2 (or v1 etc.); I don't need the last digit part (21900 or any other number).
Any suggestions please?

You want to get 1 or more digits except 0, a dash, 1 or more digits, a dash, and 1 or more digits from account-device-v2-2-3-63-21900 or account-device-v1-2-3-63-21900?
Use v[12]-([1-9]+?-[0-9]+?-[0-9]+?)- and get first group.
Demo: https://regex101.com/r/hMLGsK/1

The pattern [1-9]-[0-9]-[0-9]* matches 2-2-3 because it does not account for the v and the version digit, so the part starting at the 2 of v2 is the first part it can match.
Note that [0-9]* matches zero or more digits, so a string like 2-2- would also match.
Using a capture group to get the value:
\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)
\bv[1-9][0-9]*- Match v1 or also possibly v20 etc..
( Capture group 1
[1-9][0-9]* Match a number whose first digit is 1-9
-[0-9]+-[0-9]+ 2 parts, each matching - and 1 or more digits 0-9
) Close group 1
Regex demo
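As a quick check in Python (a minimal sketch using the standard re module; the variable names are only illustrative), the original pattern and the two suggested patterns behave as follows on the sample string:

```python
import re

s = "account-device-v2-2-3-63-21900"

# Original attempt: the first place it can match starts at the "2" of "v2".
original = re.search(r"[1-9]-[0-9]-[0-9]*", s).group(0)

# Anchor on the version prefix and capture the three numeric parts after it.
lazy = re.search(r"v[12]-([1-9]+?-[0-9]+?-[0-9]+?)-", s).group(1)
strict = re.search(r"\bv[1-9][0-9]*-([1-9][0-9]*-[0-9]+-[0-9]+)", s).group(1)

print(original)  # 2-2-3
print(lazy)      # 2-3-63
print(strict)    # 2-3-63
```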

Related

Removing trailing zeros using REPLACE regex

Remove trailing zeros to a number with 4 decimals
Sample expected output:
1.7500 -> 1.75
1.1010 -> 1.101
1.0000 -> 1
I am new with REGEX so I just tried this one first but not working:
REPLACE ALL OCCURRENCES OF REGEX '^\.[0]\d{0,3}' IN lv_rate WITH space.
Need help for the right regex to use. Thanks!
EDIT: SHIFT lv_rate RIGHT DELETING TRAILING '0' is not an option.
Try replacing on the following regex pattern:
\.?0+$
Use empty string as the replacement. This will match an optional decimal point, followed by trailing zeroes until the end of the string. See the demo below to see this pattern working.
Demo
This answer assumes that all inputs would always have a decimal component. If not, then we would need to add additional logic.
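The question is about ABAP, but the pattern itself can be tried out quickly elsewhere. A minimal Python sketch (the function name is made up for illustration) that also shows the caveat about inputs without a decimal part:

```python
import re

def strip_trailing_zeros(text):
    # Remove an optional "." plus the run of zeros at the very end.
    return re.sub(r"\.?0+$", "", text)

for value in ("1.7500", "1.1010", "1.0000"):
    print(value, "->", strip_trailing_zeros(value))

# Caveat: without a decimal part, significant zeros are removed as well.
print(strip_trailing_zeros("100"))  # 1
```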
If you want to remove trailing zeros to a number with 4 decimals, one option is to use a capturing group and use group 1 in the replacement.
^(\d+(?=\.\d{4}$)(?:\.\d*[1-9])?)\.?0+$
In parts
^ Start of string
( Capture group 1
\d+ Match 1+ digits
(?=\.\d{4}$) Assert what is on the right is a . and 4 digits
(?:\.\d*[1-9])? Optionally match a . and digits up to the last digit 1-9
) Close group 1
\.?0+ Match an optional . and 1 or more times a zero
$ End of string
Regex demo
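The same idea as a small Python sketch (assuming Python's re module rather than ABAP; the names are illustrative). Because of the lookahead, values that do not have exactly four decimals are left untouched:

```python
import re

# Only touch numbers that have exactly four decimals; keep group 1.
FOUR_DECIMALS = re.compile(r"^(\d+(?=\.\d{4}$)(?:\.\d*[1-9])?)\.?0+$")

def trim_rate(text):
    # Replace the whole match with the part captured in group 1.
    return FOUR_DECIMALS.sub(r"\1", text)

for value in ("1.7500", "1.1010", "1.0000", "100", "1.50"):
    print(value, "->", trim_rate(value))
```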

REGEX Capturing differing sets of repeating groups

this is a two-part question, but I feel the answers will be related.
I have this regex pattern:
(\d+)(aa|bb) which I use to capture this string: 1bb2aa3aa4bb5bb6aa7bb8cc9cc
See demo: example 1
The way it captures the random series of aa and bb (both preceded by a digit) is exactly what I want, and is good as far as it goes.
So we get this match on regex101:
Match 1
Full match 0-3 `1bb`
Group 1. 0-1 `1`
Group 2. 1-3 `bb`
Match 2
Full match 3-6 `2aa`
Group 1. 3-4 `2`
Group 2. 4-6 `aa`
Match 3
Full match 6-9 `3aa`
Group 1. 6-7 `3`
Group 2. 7-9 `aa`
Match 4
Full match 9-12 `4bb`
Group 1. 9-10 `4`
Group 2. 10-12 `bb`
Match 5
Full match 12-15 `5bb`
Group 1. 12-13 `5`
Group 2. 13-15 `bb`
Match 6
Full match 15-18 `6aa`
Group 1. 15-16 `6`
Group 2. 16-18 `aa`
Match 7
Full match 18-21 `7bb`
Group 1. 18-19 `7`
Group 2. 19-21 `bb`
As expected, the 8cc9cc bit at the end is ignored. I would like to capture this as well, in the same way I have captured the first repeating groups, in the same expression. So in the final output, I'd get something like this added to the end of the output. This should work for any number of matches on either side. This text is just one example.
Match 8
Full match 21-24 `8cc`
Group 1. 21-22 `8`
Group 2. 22-24 `cc`
Match 9
Full match 24-27 `9cc`
Group 1. 24-25 `9`
Group 2. 25-27 `cc`
Also, I'd like to do similar but flipping the 'or' group to the end i.e. this:
1cc2cc3cc4cc5cc6cc7ccb8aa9bb
My current regex pattern (\d+)(cc) only matches the repeating 'cc' groups.
See demo: example 2
I would like a similar full capture, with any amount of permissible entries of each group.
Any thoughts?
You may use
(?:\G(?!^)(?(?=\d+(?:aa|bb))(?<!\dcc))|(?=(?:\d+(?:aa|bb))+(?:\d+cc)+))(\d+)(aa|bb|cc)
See the regex demo
The regex will only match a string that meets the pattern in the (?=(?:\d+(?:aa|bb))+(?:\d+cc)+) lookahead, and will then consecutively match and capture digits followed by aa, bb or cc. Digits + aa or bb are only matched if they are not immediately preceded by digits + cc.
Details
(?:\G(?!^)(?(?=\d+(?:aa|bb))(?<!\dcc))|(?=(?:\d+(?:aa|bb))+(?:\d+cc)+)) - either of the two alternatives:
\G(?!^) - end of the previous successful match
(?(?=\d+(?:aa|bb))(?<!\dcc)) - a conditional (if-then) construct: if there are 1+ digits and aa or bb immediately to the right of the current location ((?=\d+(?:aa|bb))), then only continue matching if there is no digit followed with cc immediately to the left of the current location ((?<!\dcc))
| - or
^ - start of string
(?=(?:\d+(?:aa|bb))+(?:\d+cc)+) - a positive lookahead that, immediately to the right of the current location, searches for the following (and returns true if it finds the patterns, or false if it does not):
(?:\d+(?:aa|bb))+ - one or more occurrences of 1+ digits followed with aa or bb
(?:\d+cc)+ - one or more occurrences of 1+ digits followed with cc
(\d+) - Group 1: one or more digits
(aa|bb|cc) - aa, bb or cc.
For the second pattern, swap cc and (?:aa|bb) in the lookarounds:
(?:\G(?!^)(?(?=\d+cc)(?<!\d(?:aa|bb)))|(?=(?:\d+cc)+(?:\d+(?:aa|bb))+))(\d+)(aa|bb|cc)
I'm no expert with perl, so I'll give a bit of pseudo code here. Feel free to suggest an edit.
You can start by matching any number of xaa or xbb combos, followed by one or more xcc combos, using this pattern: ^(?:\d+(?:aa|bb))+(?:\d+cc)+$
Once you have that, you can use this pattern to capture the appropriate groups: (\d+)(aa|bb|cc)
Demo 1
Demo 2
Something like:
if(ismatch("^(?:\d+(?:aa|bb))+(?:\d+cc)+$", inputString))
{
match = match("(\d+)(aa|bb|cc)", inputString);
}
From here you can extract the information using the groups.
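The two-step idea can be sketched in Python as below. This is only an illustration, not the answerer's code: the function and constant names are made up, and the validation pattern uses \d+cc so that multi-digit numbers before cc are accepted too. Python's re module is used because the single-pattern answer relies on \G and a lookahead conditional, which re does not provide.

```python
import re

# Step 1: the whole string must be digit+aa/bb items followed by digit+cc items.
VALID = re.compile(r"(?:\d+(?:aa|bb))+(?:\d+cc)+")
# Step 2: pull out each (digits, suffix) pair.
ITEM = re.compile(r"(\d+)(aa|bb|cc)")

def capture_pairs(text):
    """Return a list of (digits, suffix) tuples, or [] if the layout is wrong."""
    if VALID.fullmatch(text) is None:
        return []
    return ITEM.findall(text)

print(capture_pairs("1bb2aa3aa4bb5bb6aa7bb8cc9cc"))
print(capture_pairs("1cc2aa"))  # [] because cc comes before aa
```

For the flipped case (cc items first, then aa or bb), the same sketch should work with the two halves of the validation pattern swapped.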

How to write that the pattern should be repeated?

I have a line of pattern:
double1, +double2,-double3.
For single double value pattern is :
[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)
How to make it for triple value?
Such as:
1.1, 0, -0
0, -123, 33
Not valid for:
""
1,123
123,123,123,123
You can use a slightly simpler pattern:
^(?:(?:^[+-]?|, ?[+-]?)\d+(?:\.\d+)?){3}$
Matches only triple occurrences as you specified in your edit.
You can try it here.
As correctly pointed out by The Fourth Bird in his comments below, if you wish to match entries such as .9, where no digits precede the full stop you can use:
^(?:(?:^[+-]?|, ?[+-]?)(?:\d+(?:\.\d+)?|\.\d+)){3}$
You can check this pattern here.
The double part ([.][0-9]*)? is optional which will match 0 or 1 times.
To match it three times, you could match a double using [-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+), which will match an optional + or - followed by an alternation that matches either one or more digits followed by an optional part consisting of a dot and one or more digits, or a dot followed by one or more digits.
Repeat that pattern 2 times using a quantifier {2} preceded by a comma and zero or more times a whitespace character \s*.
Add anchors to assert the start ^ and the end $ of the string and you could make use of a non capturing group (?: if you only want to check if it is a match and not refer to the groups anymore.
^[-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)(?:,\s*[-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)){2}$
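To check the final pattern against the samples from the question, here is a minimal Python sketch (the names NUMBER, TRIPLE and is_triple are illustrative only). It builds the pattern from the single-number part so the repetition is easier to see:

```python
import re

NUMBER = r"[-+]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)"
# One number, then exactly two more, each preceded by a comma and optional whitespace.
TRIPLE = re.compile(rf"^{NUMBER}(?:,\s*{NUMBER}){{2}}$")

def is_triple(line):
    return TRIPLE.match(line) is not None

for line in ("1.1, 0, -0", "0, -123, 33", "", "1,123", "123,123,123,123"):
    print(repr(line), is_triple(line))
```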

Regular expression to get positive integer and -1

Below is the text I hope to match:
00000001,00000002,00000003
It works fine with ((([-1-9]+),)+)?[-1-9]+.
But it didn't match -1. The expression must not match with -2 or anything else except -1.
You may use
^(?:0*[1-9][0-9]*|-1)(?:,(?:0*[1-9][0-9]*|-1))*$
See the regex demo.
Pattern details:
^ - start of string
(?:0*[1-9][0-9]*|-1) - a non-capturing group matching...
0*[1-9][0-9]* - zero or more 0 chars, followed with a non-zero digit, followed with any 0 or more digits
| - or
-1 - a -1 substring
(?:,(?:0*[1-9][0-9]*|-1))* - a non-capturing group quantified with * (0 or more) quantifier matching 0 or more repetitions of:
, - a comma
(?:0*[1-9][0-9]*|-1) - same subpattern as in the beginning (-1 or a non-zero number with no fractions)
$ - end of string.
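A short Python sketch of this pattern (names are illustrative; the item subpattern is factored out only to avoid repeating it) shows which inputs are accepted:

```python
import re

ITEM = r"(?:0*[1-9][0-9]*|-1)"
# One item, then any number of ",item" repetitions, anchored at both ends.
LIST = re.compile(rf"^{ITEM}(?:,{ITEM})*$")

def is_valid_list(text):
    return LIST.match(text) is not None

for text in ("00000001,00000002,00000003", "-1", "1,-1,3", "-2", "0", "1,-2"):
    print(text, is_valid_list(text))
```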
[-1-9]+ doesn't match what you're expecting it to match. It matches for example: "-31-23", which is obviously not a number.
A simple regex like:
(?:^-1)$|^[0-9]+
will match "-1", or any positive integer (including 0001, 00000002, etc...).
Also, depending on the language you're using, it would be simpler to use the language's features to decide if the number is "-1" or any other positive number.
As you state that ((([-1-9]+),)+)?[-1-9]+ works fine for capturing positive integers, and looking at the title of the question, you might use this regex, which uses an alternation to capture -1 or only positive integers (including 0 or 00000) from a string, where the integers may be preceded by zeroes.
The positive integers (and -1) will be captured in group 1.
-[02-9][0-9]*|0*(-?[0-9]+)
Details
- Match a hyphen literally
[02-9][0-9]* Match a 0 or digits 2-9 followed by zero or more times a digit. Note that the - is not part of the character class or else --- would also match.
| Or
0* Match zero or more times a zero
(-?[0-9]+) Capture in group 1 an optional hyphen followed by one or more times a digit

Regular expression of two digit number where two digits are not same

I am trying to write a regular expression that will match a two digit number where the two digits are not same.
I have used the following expression:
^([0-9])(?!\1)$
However, neither of the strings "11" and "12" matches. I thought "12" would match. Can anyone please tell me where I am going wrong?
You need to allow matching 2 digits. Your regex ^([0-9])(?!\1)$ only allows a one-digit string. Note that a lookahead does not consume characters; it only checks for the presence or absence of something after the current position.
Use
^(\d)(?!\1)\d$
           ^^
See demo
Explanation of the pattern:
^ - start of string
(\d) - match and capture into Group #1 a digit
(?!\1) - make sure the next character is not the same digit as in Group 1
\d - one digit
$ - end of string.
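A minimal Python sketch of the corrected pattern (the names are illustrative). The last line shows why the original attempt failed on both "11" and "12": with no second digit to consume, it can only fit a single character.

```python
import re

TWO_DIFFERENT_DIGITS = re.compile(r"^(\d)(?!\1)\d$")

def has_two_different_digits(text):
    return TWO_DIFFERENT_DIGITS.match(text) is not None

for text in ("12", "11", "1", "123"):
    print(text, has_two_different_digits(text))

# The original attempt only fits a one-character string.
print(bool(re.match(r"^([0-9])(?!\1)$", "1")))  # True
```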