Regex Capture Parts of Line


I have been struggling to capture a part of an snmp response.
Text
IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ
Regex
(?P<ifDescr>(?<=ifDescr.\d = ).*)
Current Capture
1/1/g1, Office to DMZ
How can I capture each of these parts on its own?
1/1/g1
Office to DMZ
EDIT
1/1/g1
This should match the digit and forward slashes for the port notation in the snmp response.
(?P<ifDescr>(?<=ifDescr.\d = )\d\/\d\/g\d)
Link to regexr
Office to DMZ
This should start the match past the port notation and capture remaining description.
(?P<ifDescr>(?<=ifDescr.\d = \d\/\d\/g\d, ).*)
Link to regexr

You could just use the answer I gave you yesterday and split the first return group, 1/1/g10, by '/' and get the third part.
1/1/g10
split by '/' gives
1
1
g10 <- third part
Why use a more complicated regex when you can use simple code to accomplish the task?
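The split approach described above can be sketched in Python as follows. This is only an illustration: the helper name port_parts is hypothetical, and it assumes the port notation has already been pulled out of the SNMP line.

```python
def port_parts(port):
    """Split a port notation such as '1/1/g10' on '/' (hypothetical helper)."""
    return port.split("/")


parts = port_parts("1/1/g10")
print(parts)     # ['1', '1', 'g10']
print(parts[2])  # g10  (the third part)
```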

With your shown samples, please try the following regex; it needs a PCRE-compatible engine.
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:\d+\/){2}g(?:\d+)
Here is Online demo of above regex
OR with a little variation use following:
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:(?:\d+\/){2}g\d+)
Explanation: a detailed breakdown of the above.
(?<=IF-MIB::ifDescr) ##Lookbehind: everything that follows must be preceded by IF-MIB::ifDescr.
\.\d+\s=\s ##Match a literal dot, one or more digits, a single whitespace character, =, and another single whitespace character.
\K ##\K is a Perl/PCRE feature (not GNU specific): it resets the start of the reported match, so everything matched up to this point is left out of the result.
(?:\d+\/){2}g(?:\d+) ##Match two repetitions of one or more digits followed by /, then g and one or more digits. The groups are non-capturing.
Without PCRE flavor: to get the value in the 1st capture group, try the following. The OP confirmed in the comments that it works.
(?<=IF-MIB::ifDescr)\.\d+\s=\s((\d+\/){2}g\d+)
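As a rough sketch of the capture-group variant, here is how it could be used from Python, whose built-in re module does not support \K. The function name port_from_line is an assumption made for this example.

```python
import re

# Capture-group variant from the answer above; the port notation is read
# from group 1 because Python's built-in re module has no \K.
PATTERN = re.compile(r"(?<=IF-MIB::ifDescr)\.\d+\s=\s((\d+\/){2}g\d+)")


def port_from_line(line):
    """Return the port notation (group 1), or None if the line does not match."""
    m = PATTERN.search(line)
    return m.group(1) if m else None


print(port_from_line("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))  # 1/1/g1
```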

Here are my attempts.
const string pattern = ".* = (.*), (.*)";
var r = Regex.Match(s, pattern);
const string pattern2 = ".* = ([0-9a-zA-Z\\/]*), (.*)";
var r2 = Regex.Match(s, pattern2);

To capture the value 1/1/g1 in the named capture group ifDescr, you can match the surrounding text directly instead of using lookarounds.
(Note that the dot is escaped as \. to match it literally.)
ifDescr\.\d+ = (?P<ifDescr>\d+\/\d+\/g\d+),
The pattern matches:
ifDescr\.\d+ = Match ifDescr. and 1+ digits followed by =
(?P<ifDescr> Named group ifDescr
\d+\/\d+\/g\d+ Match 1+ digits / 1+ digits /g and 1+ digits
), Close group and match the trailing comma
Regex demo
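A minimal Python sketch of the named-group approach above. The second group, named description here, is not part of the answer; it is an assumed extension that also captures the text after the comma, since the question asks for both parts.

```python
import re

# The answer's pattern plus an assumed extra group (description) for the
# text that follows the comma and space.
PATTERN = re.compile(
    r"ifDescr\.\d+ = (?P<ifDescr>\d+\/\d+\/g\d+), (?P<description>.*)"
)


def parse_line(line):
    """Return (port, description), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("ifDescr"), m.group("description")


print(parse_line("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
# ('1/1/g1', 'Office to DMZ')
```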

Do the following:
ifDescr\.\d+\s=\s((?:\d\/){2}g\d+)
The first capture group contains the intended result. Note that \d+ accepts one or more digits, so you don't need the OR operator you used.
Demo
Alternatively, it looks like the number after g is always the same as the number after ifDescr.. If that is the case, do this:
ifDescr\.(\d+)\s=\s((?:\d\/){2}g\1)
This basically captures the number in a group, then reuses it to match using backreference (note the usage of \1). The intended result in this case is available in the second capturing group.
Demo
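The backreference variant can be tried out in Python like this. It is a sketch only; the function name port_with_backref is made up for the example.

```python
import re

# Group 1 captures the interface index; \1 requires the same digits after "g".
PATTERN = re.compile(r"ifDescr\.(\d+)\s=\s((?:\d\/){2}g\1)")


def port_with_backref(line):
    """Return the port notation (group 2), or None if the indexes differ."""
    m = PATTERN.search(line)
    return m.group(2) if m else None


print(port_with_backref("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))  # 1/1/g1
print(port_with_backref("IF-MIB::ifDescr.2 = 1/1/g1, Office to DMZ"))  # None
```

One thing to watch: nothing is required after \1, so a line such as ifDescr.1 = 1/1/g10 still matches and yields 1/1/g1. Adding the trailing comma to the pattern would be one way to rule that out.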

I think this is what you are looking for:
= (.+), (.+)
It looks for "= ", then captures everything up to the last ", " (the first group is greedy), and then everything after it. It returns
1/1/g1
Office to DMZ
as requested.
See it working on regex101.com.
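A short Python sketch of this pattern. The lazy variant (.+?) is not from the answer; it is shown as an assumption for the case where the description itself contains a comma, because the greedy first group would otherwise run up to the last ", ".

```python
import re

GREEDY = re.compile(r"= (.+), (.+)")   # pattern from the answer
LAZY = re.compile(r"= (.+?), (.+)")    # assumed variant: split at the first ", "


def split_greedy(line):
    m = GREEDY.search(line)
    return m.groups() if m else None


def split_lazy(line):
    m = LAZY.search(line)
    return m.groups() if m else None


print(split_greedy("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
# ('1/1/g1', 'Office to DMZ')
print(split_greedy("IF-MIB::ifDescr.1 = 1/1/g1, Office, DMZ"))
# ('1/1/g1, Office', 'DMZ')
print(split_lazy("IF-MIB::ifDescr.1 = 1/1/g1, Office, DMZ"))
# ('1/1/g1', 'Office, DMZ')
```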

Related

replaceAll regex to remove last - from the output

I was able to achieve some of the output but not the right one. I am using replace all regex and below is the sample code.
final String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
System.out.println(label.replaceAll(
"([^-]+)-([^-]+)-(.+)-([^-]+)-([^-]+)", "$3"));
I want this output:
abc-nyd-request-xyxpt
but getting:
abc-nyd-request-xyxpt-
Here is the code: https://ideone.com/UKnepg

You may use this .replaceFirst solution:
String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
label.replaceFirst("(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$", "$1");
//=> "abc-nyd-request-xyxpt"
RegEx Demo
RegEx Details:
(?:[^-]*-){2}: Match 2 repetitions of zero or more non-hyphen characters followed by a hyphen
(.+?): Match 1+ of any characters and capture in group #1
(?:--1)?: Match optional --1
-: Match a -
[^-]+: Match a non-hyphenated string
$: End
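For readers who want to try the same pattern outside Java, here is a rough Python equivalent. It assumes re.sub with count=1 as the counterpart of replaceFirst, and the function name trim_label is hypothetical.

```python
import re

# Same pattern as the replaceFirst call above; \1 is Python's spelling of $1.
PATTERN = re.compile(r"(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$")


def trim_label(label):
    """Keep only group 1: drop the first two segments and the trailing suffix."""
    return PATTERN.sub(r"\1", label, count=1)


print(trim_label("abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"))
# abc-nyd-request-xyxpt
```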

The following works for your example case
([^-]+)-([^-]+)-(.+[^-])-+([^-]+)-([^-]+)
https://regex101.com/r/VNtryN/1
The [^-] at the end of group 3 keeps the capture from ending in a hyphen, and -+ allows one or more hyphens to follow it, which is what lets the pattern match the double --.

With your shown samples and attempts, please try the following regex. It creates 1 capturing group which can be used in the replacement. Do the replacement with $1 in your function.
^(?:.*?-){2}([^-]*(?:-[^-]*){3})--.*
Here is the Online demo for above regex.
Explanation: a detailed breakdown of the above regex.
^(?:.*?-){2} ##From the start of the value, a non-capturing group with a lazy match up to the nearest -, repeated 2 times.
([^-]*(?:-[^-]*){3}) ##The 1st and only capturing group: a run of non-hyphen characters, followed by 3 repetitions of - and a run of non-hyphen characters.
--.* ##Match -- and everything after it to the end of the value.

Regex to get 2nd or 3rd level domain with path in it

I created this regex: [^.]*\.[^.]{2,3}(?:\.[^.]{2,3})?$
Given: https://this.is.my.nice.service.co.uk
My regex will return: service.co.uk
But it will not work for: https://this.is.my.nice.service.co.uk/sample
I would like my regex to always return the 2nd or 3rd level domain regardless if there's a path or not.
So given https://this.is.my.nice.service.co.uk and https://this.is.my.nice.service.co.uk/sample the result should be: service.co.uk.
How can I achieve that?
Demo: https://regex101.com/r/kygHUa/1

For your example strings, you can exclude the dot and forward slash from the matched characters, and assert that the match is followed by an optional / plus optional non-whitespace chars up to the end of the string.
Then take the first match, in case the pattern can match multiple times in the string.
[^./\s/]*(?:\.[^\s./]{2,3}){1,2}(?=(?:\/\S*)?$)
See a regex 101 demo.
const regex = /[^./\s/]*(?:\.[^\s./]{2,3}){1,2}(?=(?:\/\S*)?$)/;
[
"https://this.is.my.nice.service.co.uk",
"https://service.co.uk",
"https://this.is.my.nice.service.co.uk/sample",
"https://service.com",
"https://this.is.my.nice.service.co.uk/"
].forEach(s => {
const m = s.match(regex);
if (m) {
console.log(m[0]);
}
});
If all the parts start with https:// you could make the pattern a bit more specific, starting with the protocol and optional non greedy repetitions of the allowed characters followed by a dot.
Then get the capture group 1 value.
https?:\/\/(?:[^./\s/]*\.)*?([^./\s/]*(?:\.[^\s./]{2,3}){1,2}(?=(?:\/\S*)?$))
Regex demo
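The protocol-specific pattern above has no accompanying snippet, so here is a hedged Python sketch of it. The function name registered_domain is an assumption, and the sample URLs are the ones used earlier in this answer.

```python
import re

# Protocol-anchored pattern from above; the domain is read from group 1.
PATTERN = re.compile(
    r"https?:\/\/(?:[^./\s/]*\.)*?"
    r"([^./\s/]*(?:\.[^\s./]{2,3}){1,2}(?=(?:\/\S*)?$))"
)


def registered_domain(url):
    """Return the 2nd or 3rd level domain, or None if nothing matches."""
    m = PATTERN.search(url)
    return m.group(1) if m else None


for url in (
    "https://this.is.my.nice.service.co.uk",
    "https://this.is.my.nice.service.co.uk/sample",
    "https://service.com",
):
    print(registered_domain(url))
```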

Regex to get value from <key, value> by asserting conditions on the value

I have a regex which takes the value from the given key as below
Regex: .*key="([^"]*)".*
Input value: key="abcd-qwer-qaa-xyz-vwxc"
Output: abcd-qwer-qaa-xyz-vwxc
But on top of this, I need to validate that the value starts with abcd- and contains -xyz somewhere after that.
Thus, the inputs and outputs have to be as follows:
I tried the pattern below, which does not work as expected:
.*key="([^"]*)"?(/Babcd|-xyz).*
The key-value pair is part of a larger string, as below:
object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}
I think that matching the key takes the value, and that's why I used .*key="([^"]*)".*
Note:
It's a dashboard. You can refer to this link and search for Regex: /"([^"]+)"/. This regex is applied to the query result, which is the string I referred to. It works with the regex .*key="([^"]*)".* above. I'm trying to alter that regex group itself. I hope this helps.
Can anyone guide me or suggest something, please? That would be helpful. Thanks!

Looks like you could do with:
\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"
See the demo online
\bkey=" - A word-boundary and literally match 'key="'
( - Open 1st capture group.
abcd - Literally match 'abcd'.
(?=.*-xyz\b) - Positive lookahead for zero or more characters (but newline) followed by literally '-xyz' and a word-boundary.
(?: - Open non-capturing group.
-[a-z]+ - Match an hyphen followed by at least a single lowercase letter.
){4} - Close non-capture group and match it 4 times.
) - Close 1st capture group.
" - Match a literal double quote.
I'm not 100% sure you only want to allow lowercase letters, so you can adjust that part if need be. The whole pattern validates the input value, and you can use capture group one to grab your key.
Update after the question was edited with new information:
Prometheus uses the RE2 engine in all regular expressions. Therefore the above suggestion won't work, because RE2 does not support lookarounds. A less restrictive but possible answer for OP could be:
\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"
See the online demo
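To check the lookaround-free pattern against the sample string from the question, a small Python sketch follows. Python's re module is a different engine from RE2, so this only illustrates the matching logic; the function name extract_key is hypothetical.

```python
import re

# Lookaround-free pattern from above: abcd first, -xyz as a whole segment.
KEY_PATTERN = re.compile(r'\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"')


def extract_key(text):
    """Return the validated key value (group 1), or None."""
    m = KEY_PATTERN.search(text)
    return m.group(1) if m else None


sample = (
    'object{one="ab-vwxc",two="value1",'
    'key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}'
)
print(extract_key(sample))                        # abcd-eest-wd-xyz-bnn
print(extract_key('object{key="abcd-eest-wd"}'))  # None
```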

Will this work?
Pattern
\bkey="(abcd-[^"]*\bxyz\b[^"]*)"
Demo

You could use the following regular expression to verify the string has the desired format and to match the portion of the string that is of interest.
(?<=\bkey=")(?=.*-xyz(?=-|$))abcd(?:-[a-z]+)+(?=")
Start your engine!
Note there are no capture groups.
The regex engine performs the following operations.
(?<=\bkey=") : positive lookbehind asserts the current
position in the string is preceded by 'key="'
(?= : begin positive lookahead
.*-xyz : match 0+ characters, then '-xyz'
(?=-|$) : positive lookahead asserts the current position is
: followed by '-' or is at the end of the string
) : end positive lookahead
abcd : match 'abcd'
(?: : begin non-capture group
-[a-z]+ : match '-' followed by 1+ characters in the class
)+ : end non-capture group and execute it 1+ times
(?=") : positive lookahead asserts the current position is
: followed by '"'

Regex to extract static text and number using only regular expression

I am completely new to regular expressions.
I tried to write a regular expression to get some static text and the phone number from the text below:
"password":"password123:cityaddress:mailaddress:9233321110:gender:45"
I wrote the following to extract "password":9233321110
(([\"]password[\"][\s]*:{1}[\s]*))(\d{10})?
Regex link for demo:
https://regex101.com/r/2vNpMU/2
The correct regex should give "password":9233321110 as the full match in the regex tool.
I am not using any programming language here; this is for network packet capture at the F5 level.
Please help me with the regex.

I would use /^([^:]+)(?::[^:]+){3}:([^:]+)/ for this.
Explained (more detailed explanation at regex101):
^ matches from the start of the string
(…) is the first capture group. This will collect that initial "password"
[^:]+ matches one or more non-colon characters
(?:…) is a non-capturing group (it collects nothing for later)
:[^:]+ matches a colon and then 1+ non-colons
{3} instructs us to match the previous item (the non-capturing group) 3 times
: matches a literal colon
([^:]+) captures a match of 1+ non-colons, which will get us 9233321110 in this example
The first capture group is typically stored as $1 or the first item of the returned array. (In Javascript, the zeroth item is the full match and item index 1 is the first capture group.) The second capture group is $2, etc.
To always match the "password" key, hard-code it: /^("password")(?::[^:]+){3}:([^:]+)/
Here's a live snippet demonstrating it:
const x = `"password":"password123:cityaddress:mailaddress:9233321110:gender:45"`;
const match = x.match(/^([^:]+)(?::[^:]+){3}:([^:]+)/);
if (match) console.log(match[1] + ":" + match[2]); // "password":9233321110
else console.log("no match");

Regex - optional capture group after wildcard

Say I have the following list:
No 1 And Your Bird Can Sing (4)
No 2 Baby, You're a Rich Man (5)
No 3 Blue Jay Way S
No 4 Everybody's Got Something to Hide Except Me and My Monkey (1)
And I want to extract the number, the title and the number of weeks in parentheses, if it exists.
Works, but the last group is not optional (regstorm):
No (?<no>\d{1,3}) (?<title>.*?) \((?<weeks>\d)\)
Last group optional, only matches number (regstorm):
No (?<no>\d{1,3}) (?<title>.*?)( \((?<weeks>\d)\))?
Combining one pattern with week capture with a pattern without week capture works, but there gotta be a better way:
(No (?<no>\d{1,3}) (?<title>.*) \((?<weeks>\d)\))|(No (?<no>\d{1,3}) (?<title>.*))
I use C# and javascript but I guess this is a general regex question.

Your regex is almost there!
First and most importantly, you should add a $ at the end. This makes the lazy (?<title>.*?) keep expanding until the rest of the pattern, including the end of the string, can match. Currently, (?<title>.*?) matches an empty string and then stops, because the rest of the regex already matches at that point. Why does the rest of the regex match? Because the optional group can match the empty string. By adding the $, you make the rest of the regex "harder" to match.
Secondly, you forgot to match an open parenthesis \(.
This is how your regex should look:
No (?<no>\d{1,3}) (?<title>.*?)( \((?<weeks>\d)\))?$
Demo
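The effect of the trailing $ can be seen in a small Python sketch. Python spells named groups (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly; the function name parse_entry is an assumption for this example.

```python
import re

# Python named-group syntax is (?P<name>...); otherwise the same pattern.
WITHOUT_ANCHOR = re.compile(r"No (?P<no>\d{1,3}) (?P<title>.*?)( \((?P<weeks>\d)\))?")
WITH_ANCHOR = re.compile(r"No (?P<no>\d{1,3}) (?P<title>.*?)( \((?P<weeks>\d)\))?$")


def parse_entry(pattern, line):
    """Return (no, title, weeks); weeks is None when the group is absent."""
    m = pattern.search(line)
    if m is None:
        return None
    return m.group("no"), m.group("title"), m.group("weeks")


line = "No 1 And Your Bird Can Sing (4)"
print(parse_entry(WITHOUT_ANCHOR, line))  # ('1', '', None)
print(parse_entry(WITH_ANCHOR, line))     # ('1', 'And Your Bird Can Sing', '4')
print(parse_entry(WITH_ANCHOR, "No 3 Blue Jay Way S"))
# ('3', 'Blue Jay Way S', None)
```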

You may use this regex with an optional last part:
^No (?<no>\d{1,3}) (?<title>.*?\S)(?: \((?<weeks>\d)\))?$
RegEx Demo

Another option could be for the title to match either not ( or when it does encounter a ( it should not be followed by a digit and a closing parenthesis.
^No (?<no>\d{1,3}) (?<title>(?:[^(\r\n]+|\((?!\d\)))+)(?:\((?<weeks>\d)\))?
In parts
^No
(?<no>\d{1,3}) Group no and space
(?<title>
(?: Non capturing group
[^(\r\n]+ Match any char except ( or newline
| Or
\((?!\d\)) Match ( if not directly followed by a digit and )
)+ Close group and repeat 1+ times
) Close group title
(?: Non capturing group
\((?<weeks>\d)\) Group weeks between parenthesis
)? Close group and make it optional
Regex demo
If you don't want to trim the last space of the title you could exclude it from matching before the weeks.
Regex demo
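A hedged Python sketch of this alternative, again with (?P<name>...) in place of (?<name>...). Trimming the title with rstrip() and the sample line "No 5 Song (Live) (3)" are assumptions added for illustration; the sample shows a title that itself contains parentheses.

```python
import re

# Title may contain "(" as long as it is not directly followed by digit + ")".
PATTERN = re.compile(
    r"^No (?P<no>\d{1,3}) "
    r"(?P<title>(?:[^(\r\n]+|\((?!\d\)))+)"
    r"(?:\((?P<weeks>\d)\))?"
)


def parse_entry(line):
    """Return (no, title, weeks); the title's trailing space is stripped."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("no"), m.group("title").rstrip(), m.group("weeks")


print(parse_entry("No 1 And Your Bird Can Sing (4)"))
# ('1', 'And Your Bird Can Sing', '4')
print(parse_entry("No 5 Song (Live) (3)"))
# ('5', 'Song (Live)', '3')
print(parse_entry("No 3 Blue Jay Way S"))
# ('3', 'Blue Jay Way S', None)
```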

We Keep Coding
