I need a regex that returns the value between the first and fifth forward slash, as highlighted in bold below:
dataCapture/22E6F953EA6D445C8FB20E9D29A977D7/6.20.0-3c1e4b0c459eb93e43eb64fed7447a41fb4d4029/uuid_2b896c17-eb5c-4fd1-ae44-78dcda6c8ee9/36/3D1C3A58A039103375D320E524500A74
So far I've only been able to come up with a regex that returns only the first segment after dataCapture/:
\/dataCapture\/(.+?)\/
How do I extend the above to include data up to the fifth forward slash?
It might not be the cleanest, but it gets the job done:
const regex = /dataCapture\/([a-zA-Z0-9]+\/[a-zA-Z0-9\.\-]+\/[a-zA-Z0-9\.\-\_]+\/[0-9]+)\/.*/;
const value = "dataCapture/22E6F953EA6D445C8FB20E9D29A977D7/6.20.0-3c1e4b0c459eb93e43eb64fed7447a41fb4d4029/uuid_2b896c17-eb5c-4fd1-ae44-78dcda6c8ee9/36/3D1C3A58A039103375D320E524500A74";
console.log(value.match(regex)[1]); // => 22E6F953EA6D445C8FB20E9D29A977D7/6.20.0-3c1e4b0c459eb93e43eb64fed7447a41fb4d4029/uuid_2b896c17-eb5c-4fd1-ae44-78dcda6c8ee9/36
To solve this, you can use the following pattern:
^\/dataCapture\/(.+?)\/(.+?)\/(.+?)\/(.+?)\/
You can test this regex in this site.
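As a rough illustration, the four capture groups can be joined back together in JavaScript. This is only a sketch: the helper name firstFourSegments is made up for the example, and the input is assumed to begin with a leading slash (/dataCapture/...), because the pattern above anchors on ^\/. The sample in the question has no leading slash and would not match as written.

```javascript
// Hypothetical helper: returns the four segments after /dataCapture/, joined by "/".
// Assumes the input starts with a leading slash, as the pattern requires.
function firstFourSegments(path) {
  const m = path.match(/^\/dataCapture\/(.+?)\/(.+?)\/(.+?)\/(.+?)\//);
  return m ? m.slice(1).join("/") : null;
}

console.log(firstFourSegments("/dataCapture/a/b/c/d/e")); // => a/b/c/d
console.log(firstFourSegments("dataCapture/a/b/c/d/e"));  // => null (no leading slash)
```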
I am not familiar with JMeter, but I understand it uses a slight variant of Perl5's regex engine, so I expect matching the following regular expression will extract the string of interest.
(?<=^dataCapture\/)(?:[^\/]*\/){3}[^\/]*(?=\/)
demo
The regex engine performs the following operations.
(?<= : begin positive lookbehind
^ : match beginning of string
dataCapture\/ : match 'dataCapture/'
) : end positive lookbehind
(?:[^\/]*\/) : match 0+ chars other than '/', followed by '/', in
a non-capture group
{3} : execute the non-capture group 3 times
[^\/]* : match 0+ chars other than '/'
(?=\/) : positive lookahead asserts that the next char is '/'
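The steps above can be tried out in JavaScript, which also supports lookbehind in current engines. This is a minimal sketch, not a JMeter configuration; the function name extractSegments is an assumption made for the example.

```javascript
// Hypothetical helper: returns the text between the first and fifth slash,
// using the lookbehind/lookahead pattern with no capture groups.
function extractSegments(path) {
  const re = /(?<=^dataCapture\/)(?:[^\/]*\/){3}[^\/]*(?=\/)/;
  const m = path.match(re);
  return m ? m[0] : null;
}

const sample =
  "dataCapture/22E6F953EA6D445C8FB20E9D29A977D7/" +
  "6.20.0-3c1e4b0c459eb93e43eb64fed7447a41fb4d4029/" +
  "uuid_2b896c17-eb5c-4fd1-ae44-78dcda6c8ee9/36/" +
  "3D1C3A58A039103375D320E524500A74";

console.log(extractSegments(sample));
```

Because the whole match is the value of interest, there is no group index to pick out afterwards.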
Related
I'm struggling with the following combination of characters that I'm trying to parse:
I have two types of text:
1. AF-B-W23F4-USLAMC-X99-JLK
2. LS-V-A23DF-SDLL--X22-LSM
I want to get the last two groups of characters, which are separated by a dash.
From 1. that is X99-JLK and from 2. it is X22-LSM
I managed to handle 2. with the following regex: '--(.*-.*)'
How can I parse sample 1., and is there a way to handle both at once, with something like an OR operator?
Thanks for any help!
The pattern --(.*-.*) that you tried matches the second example because it contains -- and it matches the first occurrence.
Then it matches until the end of the string and backtracks to find another hyphen.
As .* can match any character (also -) and there are no anchors or boundaries set, this is a very broad match.
If there have to be 2 dashes, you can match the first one, and use a capture group for the part with the second one using a negated character class [^-]
The character class can also match a newline. If you don't want to match a newline you can use [^-\r\n], or, to exclude whitespace altogether, [^-\s] (as there is none in the example data).
-([^-]+-[^-]+)$
Explanation
- Match -
( Capture group 1
[^-]+-[^-]+ Match the second dash between chars other than -
) Close group 1
$ End of string
See a regex demo
For example using Javascript:
const regex = /-([^-]+-[^-]+)$/;
[
"AF-B-W23F4-USLAMC-X99-JLK",
"LS-V-A23DF-SDLL--X22-LSM"
].forEach(s => {
const m = s.match(regex);
if (m) {
console.log(m[1]);
}
})
You can try lookahead to match the last pair before the new line. JavaScript example:
const str = `
AF-B-W23F4-USLAMC-X99-JLK
LS-V-A23DF-SDLL--X22-LSM
`;
const re = /[^-]*-[^-]*(?=\n)/g;
console.log(str.match(re));
I have a regex which takes the value from the given key as below
Regex .*key="([^"]*)".* InputValue key="abcd-qwer-qaa-xyz-vwxc"
output abcd-qwer-qaa-xyz-vwxc
But on top of this, I need to validate that the value starts with abcd- and contains -xyz somewhere after that.
Thus, the inputs and outputs have to be as follows:
I tried the pattern below, which is not working as expected:
.*key="([^"]*)"?(/Babcd|-xyz).*
The key value pair is part of the large string as below:
object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}
I think that by matching the key it takes the value, and that's why I used .*key="([^"]*)".*
Note:
It's a dashboard. You can refer to this link and search for Regex: /"([^"]+)"/. This regex is applied to the query result, which is the string I referred to. It works with the regex .*key="([^"]*)".* above. I'm trying to alter that regex group itself. Hope this helps.
Can anyone guide me or suggest something, please? That would be helpful. Thanks!
Looks like you could do with:
\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"
See the demo online
\bkey=" - A word-boundary and literally match 'key="'
( - Open 1st capture group.
abcd - Literally match 'abcd'.
(?=.*-xyz\b) - Positive lookahead for zero or more characters (other than newline) followed by a literal '-xyz' and a word-boundary.
(?: - Open non-capturing group.
-[a-z]+ - Match a hyphen followed by at least one lowercase letter.
){4} - Close non-capture group and match it 4 times.
) - Close 1st capture group.
" - Match a literal double quote.
I'm not 100% sure you only want to allow lowercase letters, so you can adjust that part if need be. The whole pattern validates the input value, while capture group one grabs your key.
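For illustration, here is how the lookahead pattern behaves in JavaScript. It is a sketch only; the helper name extractKey and the negative test inputs are made up for the example.

```javascript
// Hypothetical helper: returns capture group 1 (the validated value) or null.
const keyRegex = /\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"/;

function extractKey(s) {
  const m = s.match(keyRegex);
  return m ? m[1] : null;
}

console.log(extractKey('key="abcd-qwer-qaa-xyz-vwxc"')); // => abcd-qwer-qaa-xyz-vwxc
console.log(extractKey('key="abcd-qwer-qaa-zzz-vwxc"')); // => null (no -xyz)
console.log(extractKey('key="wxyz-qwer-qaa-xyz-vwxc"')); // => null (does not start with abcd)
```

One thing to keep in mind: the lookahead uses .*, so it is not limited to the quoted value. In a longer string, a -xyz that appears in a later field would also satisfy it.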
Update after edited question with new information:
Prometheus uses the RE2 engine for all regular expressions. Therefore the above suggestion won't work, because RE2 does not support lookarounds. A less restrictive but workable answer for OP could be:
\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"
See the online demo
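The lookaround-free pattern can be checked quickly in JavaScript. JavaScript's engine is not RE2, so this only demonstrates the matching logic, not Prometheus itself; the helper name extractKeyNoLookaround is an assumption for the example.

```javascript
// Hypothetical helper: same idea as above, but without lookarounds,
// so the pattern stays within what RE2 accepts.
const re2StyleRegex = /\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"/;

function extractKeyNoLookaround(s) {
  const m = s.match(re2StyleRegex);
  return m ? m[1] : null;
}

const record =
  'object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}';

console.log(extractKeyNoLookaround(record));                   // => abcd-eest-wd-xyz-bnn
console.log(extractKeyNoLookaround('key="abcd-eest-wd-bnn"')); // => null
```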
Will this work?
Pattern
\bkey="(abcd-[^"]*\bxyz\b[^"]*)"
Demo
You could use the following regular expression to verify the string has the desired format and to match the portion of the string that is of interest.
(?<=\bkey=")(?=.*-xyz(?=-|$))abcd(?:-[a-z]+)+(?=")
Start your engine!
Note there are no capture groups.
The regex engine performs the following operations.
(?<=\bkey=") : positive lookbehind asserts the current
position in the string is preceded by 'key="'
(?= : begin positive lookahead
.*-xyz : match 0+ characters, then '-xyz'
(?=-|$) : positive lookahead asserts the current position is
: followed by '-' or is at the end of the string
) : end positive lookahead
abcd : match 'abcd'
(?: : begin non-capture group
-[a-z]+ : match '-' followed by 1+ characters in the class
)+ : end non-capture group and execute it 1+ times
(?=") : positive lookahead asserts the current position is
: followed by '"'
How do I craft a regular expression with a group that includes text with an open parenthesis not preceded by a space, but does not include an open parenthesis preceded by a space (and everything after that)?
Some examples:
Matching: "Yasmani Grandal (1B 1.84)"
Would return: "Yasmani Grandal"
Matching: "J.T. Realmuto"
Would return: "J.T. Realmuto"
Matching: "WillD. Smith(LAD)"
Would return: "WillD. Smith(LAD)"
Matching: "Adley(round/1/2019) Rutschman"
Would return: "Adley(round/1/2019) Rutschman"
Attempted solutions:
(.+)(?:\s\(.*)
This regular expression returns "Yasmani Grandal" as group 1 when matching "Yasmani Grandal (1B 1.84)", but doesn't match "J.T. Realmuto" because the second (non-capturing) group is not optional.
But if I make it optional: (.+)(?:\s\(.*)?
...then group 1 when matching "Yasmani Grandal (1B 1.84)" is "Yasmani Grandal (1B 1.84)".
You may use
^(.*?)(?:\s+\(.*\))?$
See the regex demo
Details
^ - start of string
(.*?) - Capturing group 1: any 0 or more chars other than line break chars as few as possible
(?:\s+\(.*\))? - an optional non-capturing group matching 1 or 0 occurrences of
\s+ - 1+ whitespaces
\( - a ( char
.* - any 0 or more chars other than line break chars as many as possible
\) - a ) char
$ - end of string.
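As a quick check, the pattern can be run against the four examples from the question in JavaScript. This is a minimal sketch; the function name baseName is made up for the example.

```javascript
// Hypothetical helper: returns group 1, i.e. the text before an optional
// " (...)" suffix that is preceded by whitespace.
function baseName(s) {
  const m = s.match(/^(.*?)(?:\s+\(.*\))?$/);
  return m ? m[1] : null;
}

[
  "Yasmani Grandal (1B 1.84)",
  "J.T. Realmuto",
  "WillD. Smith(LAD)",
  "Adley(round/1/2019) Rutschman"
].forEach(s => console.log(baseName(s)));
// => Yasmani Grandal
// => J.T. Realmuto
// => WillD. Smith(LAD)
// => Adley(round/1/2019) Rutschman
```

The lazy (.*?) is what makes this work: it only grows as far as needed for the optional group and the end anchor to match.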
You could use the following regular expression to convert matches to empty strings. (I've escaped the leading space merely for readability.)
\ +\((?!.* \)).*
The converted string is presumably what you want, so there seems no point to saving it to a capture group. If you need to capture the part of the string that is converted to an empty string, replace .* with
(.*).
As this regex contains nothing more exotic than a negative lookahead, it should work with most regex engines.
Start your engine!
The regex engine performs the following operations.
\ + : match 1+ spaces
\( : match '('
(?!.* \)) : use a negative lookahead to assert the remainder of
the line does not contain the string ' )'
.* : match 0+ characters other than line terminators
I've assumed you want to remove all spaces preceding the left parenthesis that is preceded by at least one space. If, for example, the string were:
Yasmani Grandal (1B 1.84)
^^^^^^^^^^^^^^^
the part identified by the party hats would be converted to an empty string.
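A minimal JavaScript sketch of this replace-based approach follows. The helper name dropParenSuffix is an assumption for the example, and the space is written literally rather than escaped.

```javascript
// Hypothetical helper: removes 1+ spaces, the "(" that follows them,
// and everything after it, by replacing the match with an empty string.
function dropParenSuffix(s) {
  return s.replace(/ +\((?!.* \)).*/, "");
}

console.log(dropParenSuffix("Yasmani Grandal   (1B 1.84)"));   // => Yasmani Grandal
console.log(dropParenSuffix("WillD. Smith(LAD)"));             // => WillD. Smith(LAD)
console.log(dropParenSuffix("Adley(round/1/2019) Rutschman")); // => Adley(round/1/2019) Rutschman
```

Strings without a space before the parenthesis contain no match, so replace returns them unchanged.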
Can you try this and let me know if this works?
(.+)\s\(.*
public class HelloWorld {
    public static void main(String[] args) {
        String[] names = new String[] {"Yasmani Grandal (1B 1.84)", "J.T. Realmuto", "WillD. Smith(LAD)", "Adley(round/1/2019) Rutschman"};
        for (String in : names)
            System.out.println(in.replaceAll("(.+)\\s\\(.*", "$1"));
    }
}
Please note I wrote a minimal expression for this. You can extend it as per your additional requirements. The above code works just fine.
So I need to match the following:
1.2.
3.4.5.
5.6.7.10
((\d+)\.(\d+)\.((\d+)\.)*) works fine for the very first line, but the problem is that there can be one line or several.
\n only appears if there is more than one line.
As a string, I get it like this: "1.2.\n3.4.5.\n1.2."
So my issue is: if there is only one line, there must be no \n at the end, but if there is more than one line, every line except the very last must end with \n.
Here is the pattern I suggest:
^\d+(?:\.\d+)*\.?(?:\n\d+(?:\.\d+)*\.?)*$
Demo
Here is a brief explanation of the pattern:
^ from the start of the string
\d+ match a number
(?:\.\d+)* followed by dot, and another number, zero or more times
\.? followed by an optional trailing dot
(?:\n followed by a newline
\d+(?:\.\d+)*\.?)* and another path sequence, zero or more times
$ end of the string
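To see how the pattern treats single-line and multi-line input, here is a small JavaScript sketch. The function name isValidBlock is made up for the example; no m flag is used, so ^ and $ refer to the whole string.

```javascript
// Hypothetical helper: true when every line is a dotted number sequence
// and newlines appear only between lines, never after the last one.
function isValidBlock(text) {
  return /^\d+(?:\.\d+)*\.?(?:\n\d+(?:\.\d+)*\.?)*$/.test(text);
}

console.log(isValidBlock("1.2."));               // => true
console.log(isValidBlock("1.2.\n3.4.5.\n1.2.")); // => true
console.log(isValidBlock("1.2.\n"));             // => false (trailing newline)
console.log(isValidBlock("1.2.\nabc"));          // => false
```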
You might check if there is a newline at the end using a positive lookahead (?=.*\n):
(?=.*\n)(\d+)\.(\d+)\.((\d+)\.)*
See a regex demo
Edit
You could use an alternation to either match when on the next line there is the same pattern following, or match the pattern when not followed by a newline.
^(?:\d+\.\d+\.(?:\d+\.)*(?=.*\n\d+\.\d+\.)|\d+\.\d+\.(?:\d+\.)*(?!.*\n))
Regex demo
^ Start of string
(?: Non capturing group
\d+\.\d+\. Match 1+ digits followed by a dot, twice
(?:\d+\.)* Repeat 0+ times matching 1+ digits and a dot
(?=.*\n\d+\.\d+\.) Positive lookahead, assert that what follows contains a newline followed by the start of the pattern
| Or
\d+\.\d+\. Match 1+ digits followed by a dot, twice
(?:\d+\.)* Repeat 0+ times matching 1+ digits and a dot
(?!.*\n) Negative lookahead, assert that what follows does not contain a newline
) Close non capturing group
(\d+\.*)+\n* will match the text you provided. If you need to make sure the final line also ends with a . then (\d+\.)+\n* will work.
Most programming languages offer the m flag, which is the multiline modifier. Enabling it lets $ match at the end of each line as well as at the end of the string.
The solution below only appends the $ to your current regex and sets the m flag. This may vary depending on your programming language.
var text = "1.2.\n3.4.5.\n1.2.\n12.34.56.78.123.\nthis 1.2. shouldn't hit",
regex = /((\d+)\.(\d+)\.((\d+)\.)*)$/gm,
match;
while (match = regex.exec(text)) {
console.log(match);
}
You could simplify the regex to /(\d+\.){2,}$/gm, then split the full match based on the dot character to get all the different numbers. I've given a JavaScript example below, but getting a substring and splitting a string are pretty basic operations in most languages.
var text = "1.2.\n3.4.5.\n1.2.\n12.34.56.78.123.\nthis 1.2. shouldn't hit",
regex = /(\d+\.){2,}$/gm;
/* Slice is used to drop the dot at the end, otherwise resulting in
* an empty string on split.
*
* "1.2.3.".split(".") //=> ["1", "2", "3", ""]
* "1.2.3.".slice(0, -1) //=> "1.2.3"
* "1.2.3".split(".") //=> ["1", "2", "3"]
*/
console.log(
text.match(regex)
.map(match => match.slice(0, -1).split("."))
);
For more info about regex flags/modifiers have a look at: Regular Expression Reference: Mode Modifiers
Basically I have a string and I want to find the shortest sub-string (including the beginning) that matches the repetition of a character N times, it doesn't matter if consecutive or not. I want to use it in Javascript.
Example:
Let's say the character is '/' and we want it to match 5 repetitions.
For this string:
http://remote-computer.example.local/home/dev/proj/sdk/docs/index.html#/api
The matching string would be:
http://remote-computer.example.local/home/dev/
For this string:
////remote-computer/example/local/home
The matching string would be:
////remote-computer/
How about this regex:
^((?:[^/]*/){5})
The sub-string you want will be captured in group 1.
In JavaScript you could do:
var re = new RegExp("^((?:[^/]*/){5})"); // escaping the slashes is not mandatory
or
var re = /^((?:[^\/]*\/){5})/; // here you have to escape the slashes
Explanation:
^ : beginning of the string
( : start capture group 1
(?: : start non capture group
[^/]*/ : 0 or more any character that is not a slash, followed by a slash
){5} : the non capture group occurs 5 times
) : end of group 1
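The same idea can be generalized to any character and count. The sketch below is an assumption-laden illustration rather than part of the answer: the function name prefixUpToNth and its parameters are hypothetical, and the character is escaped before being placed in the pattern.

```javascript
// Hypothetical helper: shortest prefix of str that contains n occurrences of ch.
function prefixUpToNth(str, ch, n) {
  // Escape regex metacharacters so ch is always treated literally.
  const c = ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp(`^(?:[^${c}]*${c}){${n}}`);
  const m = str.match(re);
  return m ? m[0] : null;
}

console.log(prefixUpToNth(
  "http://remote-computer.example.local/home/dev/proj/sdk/docs/index.html#/api", "/", 5
)); // => http://remote-computer.example.local/home/dev/
console.log(prefixUpToNth("////remote-computer/example/local/home", "/", 5));
// => ////remote-computer/
```

The function returns null when the string contains fewer than n occurrences of the character.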
You can use (?:.*?/){5}. See a demo here.
This matches the exact same substrings as Toto’s regexp but is shorter:
There’s no need to use ^ here: the engine tries to match at the start of the string first, and that attempt succeeds whenever the string contains at least five slashes.
You don’t need a capture group because you want the whole match.
.*?/ matches "everything until the next /, including it", which can also be written as [^/]*/ like Toto did.