I'm trying to use regex to pull just the patch version out of some semvers in the form v1.2.3
I've got some regex which can match the v1.2. part however I'm struggling to get the other part, the 3 (which I actually want back)
I'm using ^v\d+\.\d+\. to select the first part.
I'm trying to use a negative lookahead with this to then select everything after it with (?!(v\d+\.\d+\.)).* but this just seems to return everything after the v rather than everything after the group
Any pointers would be really appreciated, thanks!
In this special case:
'(?<=^v\d\.\d\.)[[:alnum:]]+'
The regular expression matches as follows:
Node
Explanation
(?<=
look behind to see if there is:
^
start of string
v
v
\d
digits (0-9)
\.
.
\d
digits (0-9)
\.
.
)
end of look-behind
[[:alnum:]]+
any character of: letters and digits (1 or more times (matching the most amount possible))
A more generic solution that works with any number of digits
'^v\d+\.\d+\.\K[[:alnum:]]+'
The regular expression matches as follows:
Node
Explanation
^
start of string
v
v
\d+
digits (0-9) (1 or more times (matching the most amount possible))
\.
.
\d+
digits (0-9) (1 or more times (matching the most amount possible))
\.
.
\K
resets the start of the match (what is Kept), a shorter alternative to using a look-behind assertion
[[:alnum:]]+
any character of: letters and digits (1 or more times (matching the most amount possible))
Check man tr | grep -FA1 '[:' for POSIX character classes like [[:alnum:]]
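For engines that lack \K and POSIX bracket classes, a capture group gives the same result. Python's re module is one such engine, so the following is only a sketch under that assumption; the name patch_version is hypothetical and [0-9A-Za-z]+ stands in for [[:alnum:]]+ (ASCII only).

```python
import re

# Python's `re` supports neither \K nor [[:alnum:]], so the patch part is
# captured in group 1 instead of being returned as the whole match.
PATCH_RE = re.compile(r'^v\d+\.\d+\.([0-9A-Za-z]+)')

def patch_version(tag):
    """Return the third component of a tag such as 'v1.2.3', or None."""
    m = PATCH_RE.match(tag)
    return m.group(1) if m else None

if __name__ == "__main__":
    for tag in ("v1.2.3", "v10.20.30", "1.2.3"):
        print(tag, "->", patch_version(tag))
```

Tags without the leading v, such as 1.2.3, give None here because the pattern is anchored on v.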
I would like to change all numeric constants of the form XX.XXX to XX.XXXf avoiding constants already in the form XX.XXXf, where XX represents decimal numbers.
For example 10.04 would be changed to 10.04f and 5.08f would be unchanged.
The idea is to search for ([0-9]+\.[0-9]+)([^f]) and replace with \1f\2
But... it doesn't really work... and I don't understand why. The first pattern ([0-9]+\.[0-9]+) works, but if I add the second pattern ([^f]) 10.005f still matches.
On the other hand, if I modify the first pattern to test only one digit after the dot ([0-9]+\.[0-9])([^f]), it works fine, but I would like it to work with several digits after the dot as well.
In fact, I understand that the last digit is seen as a character different from "f", and that's why it (10.005f) matches.
How to make it work regardless of the number of digits?
Thank you
Use
re.sub(r'\d+\.\d+\b(?!f)', r'\g<0>f', text)
EXPLANATION
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
f 'f'
--------------------------------------------------------------------------------
) end of look-ahead
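A small check of the substitution, using sample values taken from the question. The wrapper name add_float_suffix is hypothetical; the pattern and replacement are the ones given in this answer.

```python
import re

# \b after the digits prevents a match that stops in the middle of a number,
# which is what made 10.005f match with the ([^f]) approach.
FLOAT_RE = re.compile(r'\d+\.\d+\b(?!f)')

def add_float_suffix(text):
    """Append 'f' to decimal constants that do not already end in 'f'."""
    return FLOAT_RE.sub(r'\g<0>f', text)

if __name__ == "__main__":
    print(add_float_suffix("x = 10.04; y = 5.08f; z = 10.005f;"))
```

Because the engine cannot backtrack to a shorter run of digits without violating \b, constants that already carry the suffix are left alone.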
I am trying to extract the last section of the following string :
"/subscriptions/5522233222-d762-666e-555a-e6666666666/resourcegroups/rg-sql-Belguim-01/providers/Microsoft.Compute/snapshots/vm-sql-image-v3.3-pre-sysprep-Oct-2021-BG"
I want to capture:
"snapshots/vm-sql-image-v3.3-pre-sysprep-Oct-2021-BG"
I tried below with no luck:
(\w*?\/\w*?)$
How to pull this off using regex?
Use
[^\/]+\/[^\/]+$
EXPLANATION
--------------------------------------------------------------------------------
[^\/]+ any character except: '\/' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
\/ '/'
--------------------------------------------------------------------------------
[^\/]+ any character except: '\/' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
Your issues
(\w*?/\w*?)$ only works when the last two segments are plain word characters or empty (tested), e.g.
it matched hello/world/subscriptions123/snap_shots, capturing subscriptions123/snap_shots
it matched /1/2//, capturing the last 2 empty segments
What was OK:
the capture group
/ to match the last path separator before the end ($)
\w*? intended to match a path segment of any length
What to improve:
*? is too permissive; use + as the quantifier to require at least one character (rather than * for any number or ? for zero or one)
\w is the word-character class and does not match hyphens or dots (OK for snapshots, not for the given last segment)
Quick fix
(\w+/[\w\.-]+)$ (tested)
added dot \. and hyphen - to character-set containing \w
Simple but solid
(snapshots/[^\/]+)$ (tested)
the second-to-last path segment is assumed to be the fixed constant snapshots
[^\/] matches any character except (^) a slash in the last segment
Note: the slash doesn't need to be escaped as \/, as it is in Ryszard's answer
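To compare the patterns discussed in these answers, here is a short Python sketch run against the path from the question. The helper name last_two_segments is hypothetical, and slashes are left unescaped because Python does not need them escaped.

```python
import re

path = ("/subscriptions/5522233222-d762-666e-555a-e6666666666/resourcegroups/"
        "rg-sql-Belguim-01/providers/Microsoft.Compute/snapshots/"
        "vm-sql-image-v3.3-pre-sysprep-Oct-2021-BG")

def last_two_segments(pattern, text):
    """Return the whole match of `pattern` in `text`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# The pattern from the question: \w cannot cross the hyphens and dots.
original = last_two_segments(r'(\w*?/\w*?)$', path)
# Quick fix: allow dots and hyphens in the last segment.
quick_fix = last_two_segments(r'(\w+/[\w.-]+)$', path)
# Negated character class: anything except a slash in both segments.
negated = last_two_segments(r'[^/]+/[^/]+$', path)

if __name__ == "__main__":
    print(original)
    print(quick_fix)
    print(negated)
```

The negated class is the least dependent on which characters happen to appear in a segment, which is why it is often the more robust choice for paths.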
I have these two sentences
TAGGING ODP:-7.160792, 113.496069
TAGGING pel:-7.160792, 113.496069
I want to match the -7.160792 part only if the full sentence contains "odp" in it.
I tried the following (?(?=odp)-\d+.\d+) but it doesn't work, and I don't know why.
Any help is appreciated.
(?(?=odp)-\d+\.\d+) won't work because (?=odp) is a positive lookahead that imposes a constraint on the pattern on the right, -\d+\.\d+. Namely, it requires odp string to occur exactly at the same location where - and a number are expected.
Use
(?<=ODP:)-\d+\.\d+
ODP:(-\d+\.\d+)
If lookbehinds are supported, the first variant returns the number as the whole match.
Otherwise, use the second variant, which captures the number in group 1.
And if odp can appear anywhere, even after the number:
(?i)^(?=.*odp).*(-\d+\.\d+)
This will capture the value into a group.
EXPLANATION
--------------------------------------------------------------------------------
(?i) set flags for this block (case-
insensitive) (with ^ and $ matching
normally) (with . not matching \n)
(matching whitespace and # normally)
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
odp 'odp'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
- '-'
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \1
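A minimal Python sketch of the lookahead variant, using the two lines from the question plus one made-up line where odp follows the number. The function name odp_latitude is hypothetical.

```python
import re

# (?=.*odp) only checks that "odp" occurs somewhere on the line;
# the number itself is captured in group 1.
ODP_RE = re.compile(r'(?i)^(?=.*odp).*(-\d+\.\d+)')

def odp_latitude(line):
    """Return the negative decimal if the line mentions odp, else None."""
    m = ODP_RE.search(line)
    return m.group(1) if m else None

if __name__ == "__main__":
    for line in ("TAGGING ODP:-7.160792, 113.496069",
                 "TAGGING pel:-7.160792, 113.496069",
                 "-7.160792 tagged by odp"):
        print(repr(line), "->", odp_latitude(line))
```

Since the second .* is greedy, a line with several negative decimals would yield the last one; that is worth keeping in mind if both coordinates can be negative.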
You can use the regex, (?i)(?<=odp:)[^,]*.
Explanation:
(?i): Case-insensitive flag
(?<=odp:): Positive lookbehind for odp:
[^,]*: Anything but ,
👉 If you want the match to be restricted to numbers only, you can use the regex, (?i)(?<=odp:)(?:-\d+\.\d+)
Explanation:
(?i): Case-insensitive flag
(?<=odp:): Positive lookbehind for odp:
(?:: Start non capturing group
-: Literal, -
\d+: 1+ digit(s)
\.\d+: . followed by 1+ digit(s)
): End non capturing group
👉 If the sign can be either + or -, you can use the regex, (?i)(?<=odp:)(?:[+-]\d+\.\d+)
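A quick check of the signed lookbehind variant in Python, with the dot escaped so that it matches only a literal decimal point. The function name signed_after_odp and the second sample line are made up for illustration.

```python
import re

# Fixed-width lookbehind: the match starts right after "odp:" in any case.
SIGNED_RE = re.compile(r'(?i)(?<=odp:)[+-]\d+\.\d+')

def signed_after_odp(line):
    """Return the signed decimal that directly follows 'odp:', or None."""
    m = SIGNED_RE.search(line)
    return m.group(0) if m else None

if __name__ == "__main__":
    print(signed_after_odp("TAGGING ODP:-7.160792, 113.496069"))
    print(signed_after_odp("TAGGING odp:+7.5, 113.0"))
    print(signed_after_odp("TAGGING pel:-7.160792, 113.496069"))
```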
The pattern (?(?=odp)\-\d+\.\d+) is using a conditional (? stating in the if clause:
If what is directly to the right from the current position is odp,
then match -\d+\.\d+
That cannot match, because the text at a single position cannot start with both odp and -.
What you also could do is match odp followed by any char other than a digit using \D* and capture the digit part in a group.
\bodp\b\D*(-\d+\.\d+)\b
The pattern matches:
\bodp\b match odp between word boundaries to prevent a partial match
\D* Optionally match any char other than a digit
(-\d+\.\d+) Capture - and 1+ digits with a decimal part in group 1
\b A word boundary
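The sample lines spell ODP in upper case, so this pattern needs a case-insensitive flag to match them. A hedged Python sketch, assuming re.IGNORECASE is acceptable; the function name number_after_odp and the ODPX line are made up for illustration.

```python
import re

# \bodp\b rejects partial words such as "ODPX"; \D* skips the ":" separator.
ODP_RE = re.compile(r'\bodp\b\D*(-\d+\.\d+)\b', re.IGNORECASE)

def number_after_odp(line):
    """Return the first negative decimal after the word odp, or None."""
    m = ODP_RE.search(line)
    return m.group(1) if m else None

if __name__ == "__main__":
    print(number_after_odp("TAGGING ODP:-7.160792, 113.496069"))
    print(number_after_odp("TAGGING pel:-7.160792, 113.496069"))
    print(number_after_odp("TAGGING ODPX:-7.160792, 113.496069"))
```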
(?<=ODP:)(-\d+\.\d+)
You can try using a positive lookbehind.
This should work for the input you've provided.
I am trying to create a regex in ruby that matches against strings with 10 characters which are not special characters i.e. would match with \w.
So far I have come up with this:
/\w{10,}/
but the issue is that it will only count a consecutive sequence of word characters. I want to match any string which counts up to have at least 10 "word" characters. Is this possible? I am fairly new to regex as a whole so any help would be appreciated.
If I understood correctly, this should work:
/(?:\w[^\w]*){9,}\w/
Explanation:
We start with a single
\w
We want to capture all the other characters until another \w, hence:
\w[^\w]*
[^<list of chars>] matches any character other than listed in the brackets, so [^\w] means any character that is not a word character. * denotes 0 or more. The above will match "a-- ", "b" and "c!" in "a-- bc!" string.
Since we need 10 \w, we will match 9 (or more) groups like that, followed by a single \w
(\w[^\w]*){9,}\w
We don't really care about captures here (Ruby keeps only the last repetition of a repeated group anyway), so we make the group non-capturing:
(?:\w[^\w]*){9,}\w
Alternatively we could just use simpler regex:
(?:\w[^\w]*){10,}
But it will also cover characters after the last word character in a string - not sure if this is required here.
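The same pattern can be tried outside Ruby. The sketch below uses Python, with re.ASCII so that \w stays limited to a-z, A-Z, 0-9 and _ (an assumption made to mirror Ruby's default); the helper names are hypothetical.

```python
import re

# Nine "word char plus trailing non-word chars" groups, then a tenth word char.
TEN_WORD_CHARS = re.compile(r'(?:\w[^\w]*){9,}\w', re.ASCII)

def has_ten_word_chars(s):
    """True if s contains at least 10 word characters, consecutive or not."""
    return TEN_WORD_CHARS.search(s) is not None

def count_word_chars(s):
    """Count word characters directly, as a cross-check for the regex."""
    return len(re.findall(r'\w', s, re.ASCII))

if __name__ == "__main__":
    for s in ("ab-cd ef!gh ij", "ab-cd ef!gh i", "abcdefghij"):
        print(repr(s), count_word_chars(s), has_ten_word_chars(s))
```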
Match anywhere in the string:
/\w(?:\W*\w){9,19}/
/(?:\W*\w){10,20}/
Validate that a string contains 10 to 20 word characters:
/\A(?:\W*\w){10,20}\W*\z/
Prefer non-capturing groups, particularly when extracting found matches.
Watch out for ^ and $, which mark the start and end of a line (not of the whole string) in Ruby's regex.
EXPLANATION
--------------------------------------------------------------------------------
\A the beginning of the string
--------------------------------------------------------------------------------
(?: group, but do not capture (between 10 and
20 times (matching the most amount
possible)):
--------------------------------------------------------------------------------
\W* non-word characters (all but a-z, A-Z, 0-
9, _) (0 or more times (matching the
most amount possible))
--------------------------------------------------------------------------------
\w word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
){10,20} end of grouping
--------------------------------------------------------------------------------
\W* non-word characters (all but a-z, A-Z, 0-
9, _) (0 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
\z the end of the string
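A rough Python equivalent of the validation pattern, for experimenting with inputs. It uses fullmatch in place of the \A and \z anchors so that the sketch does not depend on which anchors a given Python version supports; the function name is hypothetical and re.ASCII is an assumption made to keep \w limited to a-z, A-Z, 0-9 and _.

```python
import re

# Each repetition consumes optional non-word chars and exactly one word char.
BETWEEN_10_AND_20 = re.compile(r'(?:\W*\w){10,20}\W*', re.ASCII)

def has_10_to_20_word_chars(s):
    """True if the whole string contains between 10 and 20 word characters."""
    return BETWEEN_10_AND_20.fullmatch(s) is not None

if __name__ == "__main__":
    for s in ("a-b-c-d-e-f-g-h-i-j", "a-b-c", "a" * 21, "a" * 20 + "!!"):
        print(repr(s), has_10_to_20_word_chars(s))
```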
I have a filename in a Unix path starting with two digits. How can I extract the name without the extension?
/this/is/my/path/to/the/file/01filename.ext should be filename
I currently have [^/]+(?=\.ext$) so I get 01filename, but how do I get rid of the first two digits?
You can add a look-behind in front of what you already have, looking for two digits:
(?<=\d\d)[^/]+(?=\.ext$)
This only works if you have exactly two digits! Unfortunately, in most regex engines it is not possible to use quantifiers like * or + in lookbehinds.
(?<=\d\d) - checks for two digits before the match
[^/]+ - matches 1 or more characters, except /
(?=\.ext$) - checks for .ext after the match
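Python's re module is one of the engines with the fixed-width restriction, so it can illustrate both points. In this sketch the dot in .ext is escaped, and the helper names base_name and compiles are hypothetical.

```python
import re

# Fixed-width lookbehind (exactly two digits) plus a lookahead for ".ext".
NAME_RE = re.compile(r'(?<=\d\d)[^/]+(?=\.ext$)')

def base_name(path):
    """Return the file name without its two-digit prefix and .ext suffix."""
    m = NAME_RE.search(path)
    return m.group(0) if m else None

def compiles(pattern):
    """Report whether Python's re engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

if __name__ == "__main__":
    print(base_name("/this/is/my/path/to/the/file/01filename.ext"))
    # A quantifier inside the lookbehind is rejected by this engine.
    print(compiles(r'(?<=\d+)[^/]+(?=\.ext$)'))
```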
Try this one :
/\d\d(.*?)\.\w{3}$
Explanation :
/\d\d : slash followed by two digits
(.*?) : the capture
\.\w{3} : a dot followed by three word characters
$ : end of string
It works for me on Expresso
Consider the following Regex...
(?<=\d{2})[^/]+(?=\.ext$)
Good Luck!
A more general regex:
(?:^|\/)[\d]+([^.]+)\.[\w.]+$
Explanation:
(?: group, but do not capture:
^ the beginning of the string
| OR
\/ '/'
) end of grouping
[\d]+ any character of: digits (0-9) (1 or more
times (matching the most amount possible))
( group and capture to \1:
[^.]+ any character except: '.' (1 or more
times (matching the most amount
possible))
) end of \1
\. '.'
[\w\.]+ any character of: word characters (a-z, A-
Z, 0-9, _), '.' (1 or more times
(matching the most amount possible))
$ before an optional \n, and the end of the
string
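A short Python sketch of this more general pattern. The function name strip_digits_and_ext and the sample names other than the one from the question are made up for illustration.

```python
import re

# Leading digits are skipped, the name up to the first dot is captured,
# and everything after that dot is treated as the extension.
GENERAL_RE = re.compile(r'(?:^|/)\d+([^.]+)\.[\w.]+$')

def strip_digits_and_ext(path):
    """Return the file name without leading digits and extension, or None."""
    m = GENERAL_RE.search(path)
    return m.group(1) if m else None

if __name__ == "__main__":
    for p in ("/this/is/my/path/to/the/file/01filename.ext",
              "123report.tar.gz",
              "noDigits.ext"):
        print(p, "->", strip_digits_and_ext(p))
```

Note that [^.]+ also accepts slashes, so a directory name that starts with digits could be picked up instead of the file name; using [^./]+ would be a stricter variant if that matters.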