Regex to extract static text and number using only regular expression

I am completely new to regular expressions.
I tried to write a regular expression to extract some static text and a phone number from the text below:
"password":"password123:cityaddress:mailaddress:9233321110:gender:45"
I wrote the expression below to extract this: "password":9233321110
(([\"]password[\"][\s]*:{1}[\s]*))(\d{10})?
regex link for demo:
https://regex101.com/r/2vNpMU/2
The correct regexp should give "password":9233321110 as the full match in the regex tool.
I am not using any programming language here; this is for network packet capture at the F5 level.
Please help me with the regexp.

I would use /^([^:]+)(?::[^:]+){3}:([^:]+)/ for this.
Explained (more detailed explanation at regex101):
^ matches from the start of the string
(…) is the first capture group. This will collect that initial "password"
[^:]+ matches one or more non-colon characters
(?:…) is a non-capturing group (it collects nothing for later)
:[^:]+ matches a colon and then 1+ non-colons
{3} instructs us to match the previous item (the non-capturing group) 3 times
: matches a literal colon
([^:]+) captures a match of 1+ non-colons, which will get us 9233321110 in this example
The first capture group is typically stored as $1 or the first item of the returned array. (In Javascript, the zeroth item is the full match and item index 1 is the first capture group.) The second capture group is $2, etc.
To always match the "password" key, hard-code it: /^("password")(?::[^:]+){3}:([^:]+)/
Here's a live snippet demonstrating it:
const x = `"password":"password123:cityaddress:mailaddress:9233321110:gender:45"`;
const match = x.match(/^([^:]+)(?::[^:]+){3}:([^:]+)/);
if (match) console.log(match[1] + ":" + match[2]);
else console.log("no match");
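For comparison, here is a minimal Python sketch of the hard-coded "password" variant mentioned above. The function name extract_key_and_number is made up for illustration, and Python is only an assumption here, since the question itself is not tied to any language:

```python
import re

# Hard-coded key variant from the answer: the key must be "password",
# three colon-separated fields are skipped, and the fourth is captured.
PATTERN = re.compile(r'^("password")(?::[^:]+){3}:([^:]+)')

def extract_key_and_number(text):
    """Return 'key:number' or None when the text does not match."""
    m = PATTERN.match(text)
    if m is None:
        return None
    # group(1) is the key, group(2) is the fourth colon-separated field.
    return m.group(1) + ":" + m.group(2)

sample = '"password":"password123:cityaddress:mailaddress:9233321110:gender:45"'
print(extract_key_and_number(sample))
```

Note that this does not check that the captured field is a 10-digit number; replacing the last ([^:]+) with (\d{10}) would add that check.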

Related

Pattern to match everything except a string of 5 digits

I only have access to a function that can match a pattern and replace it with some text:
Syntax
regexReplace('text', 'pattern', 'new text')
And I need to return only the 5 digit string from text in the following format:
CRITICAL - 192.111.6.4: rta nan, lost 100%
Created Time Tue, 5 Jul 8:45
Integration Name CheckMK Integration
Node 192.111.6.4
Metric Name POS1
Metric Value DOWN
Resource 54871
Alert Tags 54871, POS1
So from this text, I want to replace everything with "" except the "54871".
I have come up with the following:
regexReplace("{{ticket.description}}", "\w*[^\d\W]\w*", "")
This almost works, but it doesn't match the symbols. How can I change it to match any word that includes a letter or a symbol?
As you can see, the pattern I have is very close. I just need to include special characters as well as letters, whereas currently it only handles letters.
You can match the whole string but capture the 5-digit number into a capturing group and replace with the backreference to the captured group:
regexReplace("{{ticket.description}}", "^(?:[\w\W]*\s)?(\d{5})(?:\s[\w\W]*)?$", "$1")
See the regex demo.
Details:
^ - start of string
(?:[\w\W]*\s)? - an optional substring of any zero or more chars as many as possible and then a whitespace char
(\d{5}) - Group 1 ($1 contains the text captured by this group pattern): five digits
(?:\s[\w\W]*)? - an optional substring of a whitespace char and then any zero or more chars as many as possible.
$ - end of string.
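A quick way to check the pattern above outside the ticketing tool is a short Python sketch. The function name keep_five_digits is hypothetical, and Python writes the backreference as \1 rather than $1:

```python
import re

# The ticket description from the question, without a trailing newline.
DESCRIPTION = "\n".join([
    "CRITICAL - 192.111.6.4: rta nan, lost 100%",
    "Created Time Tue, 5 Jul 8:45",
    "Integration Name CheckMK Integration",
    "Node 192.111.6.4",
    "Metric Name POS1",
    "Metric Value DOWN",
    "Resource 54871",
    "Alert Tags 54871, POS1",
])

def keep_five_digits(text):
    # Match the whole string, keep only the whitespace-delimited 5-digit group.
    return re.sub(r"^(?:[\w\W]*\s)?(\d{5})(?:\s[\w\W]*)?$", r"\1", text)

print(keep_five_digits(DESCRIPTION))
```

If the text contains no 5-digit number delimited by whitespace, the pattern does not match and the text is returned unchanged.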
The easiest regex is probably:
^(.*\D)?(\d{5})(\D.*)?$
You can then replace the string with "$2" ("\2" in other languages) to only place the contents of the second capture group (\d{5}) back.
The only issue is that . doesn't match newline characters by default. Normally you can pass a flag to change . to match ALL characters. For most regex variants this is the s (single line) flag (PCRE, Java, C#, Python). Other variants use the m (multi line) flag (Ruby). Check the documentation of the regex variant you are using for verification.
However, the question suggests that you're not able to pass flags separately, in which case you could pass them as part of the regex itself.
(?s)^(.*\D)?(\d{5})(\D.*)?$
regex101 demo
(?s) - Set the s (single line) flag for the remainder of the pattern. Which enables . to match newline characters ((?m) for Ruby).
^ - Match the start of the string (\A for Ruby).
(.*\D)? - [optional] Match anything followed by a non-digit and store it in capture group 1.
(\d{5}) - Match 5 digits and store it in capture group 2.
(\D.*)? - [optional] Match a non-digit followed by anything and store it in capture group 3.
$ - Match the end of the string (\z for Ruby).
This regex will result in the last 5-digit number being stored in capture group 2. If you want to use the first 5-digit number instead, you'll have to use a lazy quantifier in (.*\D)?, so that it becomes (.*?\D)?.
(?s) is supported by most regex variants, but not all. Refer to the regex variant documentation to see if it's available for you.
An example where inline flags are not available is JavaScript. In such a scenario you need to replace . with something that matches ALL characters. In JavaScript [^] can be used. For other variants this might not work and you need to use [\s\S].
With all this out of the way, assuming a language that can use "$2" as the replacement, where you do not need to escape backslashes, and a regex variant that supports an inline (?s) flag, the answer would be:
regexReplace("{{ticket.description}}", "(?s)^(.*\D)?(\d{5})(\D.*)?$", "$2")
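To illustrate the greedy versus lazy behaviour described above, here is a small Python sketch. Python's re module accepts the inline (?s) flag and uses \2 in the replacement; the sample text with two 5-digit numbers is invented for the demonstration:

```python
import re

# Greedy prefix: capture group 2 ends up holding the LAST 5-digit number.
LAST = re.compile(r"(?s)^(.*\D)?(\d{5})(\D.*)?$")
# Lazy prefix: capture group 2 ends up holding the FIRST 5-digit number.
FIRST = re.compile(r"(?s)^(.*?\D)?(\d{5})(\D.*)?$")

# Invented sample with two different 5-digit numbers on separate lines.
text = "Resource 54871\nBackup 12345\nNode 192.111.6.4"

last_number = LAST.sub(r"\2", text)
first_number = FIRST.sub(r"\2", text)
print(last_number, first_number)
```

Because (?s) lets . cross newlines, both patterns consume the whole multi-line text and leave only the chosen number.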

replaceAll regex to remove last - from the output

I was able to achieve some of the output but not the right one. I am using replace all regex and below is the sample code.
final String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
System.out.println(label.replaceAll(
"([^-]+)-([^-]+)-(.+)-([^-]+)-([^-]+)", "$3"));
I want this output:
abc-nyd-request-xyxpt
but I am getting:
abc-nyd-request-xyxpt-
here is the code https://ideone.com/UKnepg
You may use this .replaceFirst solution:
String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
label.replaceFirst("(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$", "$1");
//=> "abc-nyd-request-xyxpt"
RegEx Demo
RegEx Details:
(?:[^-]*-){2}: Match 2 repetitions of zero or more non-hyphen characters followed by a hyphen
(.+?): Match 1+ of any characters and capture in group #1
(?:--1)?: Match optional --1
-: Match a -
[^-]+: Match a non-hyphenated string
$: End
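The same pattern can be tried in Python as a rough cross-check. This is only a sketch: middle_part is a made-up name, and re.sub with count=1 stands in for Java's replaceFirst:

```python
import re

label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"

def middle_part(value):
    # Skip two hyphen-terminated segments, lazily capture the middle,
    # then drop an optional "--1" and the final hyphen-separated segment.
    return re.sub(r"(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$", r"\1", value, count=1)

print(middle_part(label))
```

The lazy (.+?) keeps growing until the rest of the pattern can reach the end of the string, which is why the capture stops right before --1.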
The following works for your example case
([^-]+)-([^-]+)-(.+[^-])-+([^-]+)-([^-]+)
https://regex101.com/r/VNtryN/1
We don't want group 3 to capture any trailing -, so it must end in a non-hyphen character ([^-]). The -+ that follows accepts one or more hyphens, which is what lets it match the double --.
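As a sanity check, the pattern above can be run through Python's re.sub with \3 as the replacement. The helper name third_group is hypothetical:

```python
import re

PATTERN = re.compile(r"([^-]+)-([^-]+)-(.+[^-])-+([^-]+)-([^-]+)")

def third_group(value):
    # Replace the whole match with capture group 3 only.
    return PATTERN.sub(r"\3", value)

label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"
print(third_group(label))
```

The same call also handles a label with a single hyphen before the last two segments, because -+ matches one hyphen as well as two.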
With your shown samples and attempts, please try the following regex. It creates 1 capturing group, which can be used in the replacement. Use $1 as the replacement in your function.
^(?:.*?-){2}([^-]*(?:-[^-]*){3})--.*
Here is the Online demo for above regex.
Explanation: Adding detailed explanation for above regex.
^(?:.*?-){2} ##From the start of the value, a non-capturing group with a lazy match up to the nearest -, repeated 2 times.
([^-]*(?:-[^-]*){3}) ##The 1st and only capturing group: a run of non-hyphen characters, followed by 3 repetitions of a - and another run of non-hyphen characters, which gives the required output.
--.* ##Matches -- and everything after it to the end of the value.
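A minimal Python sketch of this variant, reading capture group 1 directly instead of doing a replacement (capture_before_double_dash is a made-up name):

```python
import re

PATTERN = re.compile(r"^(?:.*?-){2}([^-]*(?:-[^-]*){3})--.*")

def capture_before_double_dash(value):
    m = PATTERN.match(value)
    # Group 1 holds the four hyphen-separated pieces that precede "--".
    return m.group(1) if m else None

label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"
print(capture_before_double_dash(label))
```

Since the pattern requires a literal --, a label without a double hyphen yields no match at all.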

Regex Capture Parts of Line

I have been struggling to capture a part of an snmp response.
Text
IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ
Regex
(?P<ifDescr>(?<=ifDescr.\d = ).*)
Current Capture
1/1/g1, Office to DMZ
How can I capture each of these separately?
1/1/g1
Office to DMZ
EDIT
1/1/g1
This should match the digit and forward slashes for the port notation in the snmp response.
(?P<ifDescr>(?<=ifDescr.\d = )\d\/\d\/g\d)
Link to regexr
Office to DMZ
This should start the match past the port notation and capture remaining description.
(?P<ifDescr>(?<=ifDescr.\d = \d\/\d\/g\d, ).*)
Link to regexr
You could just use the answer I gave you yesterday and split the first return group, 1/1/g10, by '/' and get the third part.
1/1/g10
split by '/' gives
1
1
g10 <- third part
Why use a more complicated regex when you can use simple code to accomplish the task?
With your shown samples, could you please try following regex with PCRE options available.
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:\d+\/){2}g(?:\d+)
Here is Online demo of above regex
OR with a little variation use following:
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:(?:\d+\/){2}g\d+)
Explanation: Adding detailed explanation for above.
(?<=IF-MIB::ifDescr) ##using look behind to make sure all mentioned further conditions must be preceded by this expression(IF-MIB::ifDescr)
\.\d+\s=\s ##Matching a literal dot, one or more digits, a single whitespace character, = and another single whitespace character.
\K ##\K is a PCRE (Perl-compatible) feature that discards everything matched so far from the reported match, so only what follows is returned.
(?:\d+\/){2}g(?:\d+) ##Matching 2 repetitions of one or more digits followed by /, then g and one or more digits (the non-capturing groups are only for grouping).
Without the PCRE flavor: to get the value in the 1st capture group, try the following (the OP confirmed in the comments that it works).
(?<=IF-MIB::ifDescr)\.\d+\s=\s((\d+\/){2}g\d+)
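Python's built-in re module is one example of a flavor without \K, so the capture-group version is the one to use there. A small sketch, with port_notation as a hypothetical helper name:

```python
import re

LINE = "IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"

# Fixed-width lookbehind plus a capture group instead of \K.
PORT_PATTERN = re.compile(r"(?<=IF-MIB::ifDescr)\.\d+\s=\s((\d+\/){2}g\d+)")

def port_notation(line):
    m = PORT_PATTERN.search(line)
    # Group 1 is the full port notation; group 2 is only the last "digits/" repeat.
    return m.group(1) if m else None

print(port_notation(LINE))
```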
Here are my attempts.
const string pattern = ".* = (.*), (.*)";
var r = Regex.Match(s, pattern);
const string pattern2 = ".* = ([0-9a-zA-Z\\/]*), (.*)";
var r2 = Regex.Match(s, pattern2);
Using the named capture group ifDescr to capture the value 1/1/g1 you can use a match instead of lookarounds.
(Note to escape the dot \. to match it literally)
ifDescr\.\d+ = (?P<ifDescr>\d+\/\d+\/g\d+),
The pattern matches:
ifDescr\.\d+ = Match ifDescr. and 1+ digits followed by =
(?P<ifDescr> Named group ifDescr
\d+\/\d+\/g\d+ Match 1+ digits / 1+ digits /g and 1+ digits
), Close group and match the trailing comma
Regex demo
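The (?P<name>…) syntax is the one Python uses, so the pattern can be tried there directly. In this sketch the second named group, description, is my own addition to pick up the text after the comma; it is not part of the pattern above:

```python
import re

LINE = "IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"

# ifDescr group as in the answer; "description" is an added, assumed group.
PATTERN = re.compile(
    r"ifDescr\.\d+ = (?P<ifDescr>\d+\/\d+\/g\d+), (?P<description>.*)"
)

def parse_if_descr(line):
    m = PATTERN.search(line)
    return m.groupdict() if m else None

print(parse_if_descr(LINE))
```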
Do the following:
ifDescr\.\d+\s=\s((?:\d\/){2}g\d+)
The resultant capture groups contain the intended result. Note that \d+ accepts one or more digits, so you don't need the OR operator as used by you.
Demo
Alternatively, it looks like the number after g will always be the number after ifDescr.. If that is the case, do this:
ifDescr\.(\d+)\s=\s((?:\d\/){2}g\1)
This basically captures the number in a group, then reuses it to match using backreference (note the usage of \1). The intended result in this case is available in the second capturing group.
Demo
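A short Python sketch of the backreference variant. The name port_with_matching_index is invented, and the second test line is a made-up example in which the two numbers differ:

```python
import re

# \1 must repeat whatever digits followed "ifDescr.".
PATTERN = re.compile(r"ifDescr\.(\d+)\s=\s((?:\d\/){2}g\1)")

def port_with_matching_index(line):
    m = PATTERN.search(line)
    # The intended result is in the second capturing group.
    return m.group(2) if m else None

print(port_with_matching_index("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
print(port_with_matching_index("IF-MIB::ifDescr.2 = 1/1/g7, Mismatch"))
```

The second call finds nothing, which is the trade-off of this approach: it only matches when the index and the g-number really are the same.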
I think this is what you are looking for
= (.+), (.+)
It looks for "= " then captures all until a comma and then everything afterwards. It returns
1/1/g1
Office to DMZ
as requested.
See it working on regex101.com.
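For reference, a tiny Python sketch of the same two-group pattern (split_port_and_description is a hypothetical name):

```python
import re

def split_port_and_description(line):
    m = re.search(r"= (.+), (.+)", line)
    # Both groups are greedy, so the split happens at the LAST ", " in the line.
    return m.groups() if m else None

print(split_port_and_description("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
```

If a description itself contains ", ", the first group will swallow part of it; a stricter first group such as ([^,]+) would avoid that.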

Regex- to extract a string before and after string

I want to extract the strings before and after a word. The content is below.
Content:
1. http://www.example.com/myplan/mp/public/pl_be?Id=543543&timestamp=06280435435
2. http://www.example.com/course/df/public/pl_de?Id=454354&timestamp=0628031746
3. http://www.example.com/book/rg/public/pl_fo?Id=4445577&timestamp=0628031734
4. http://www.example.com/trip/tr/public/pl_ds?Id=454354&timestamp=06280314546
5. http://www.example.com/trip/tr/public/pl_ds
I want to capture the data from the above strings as shown below:
1. http://www.example.com/myplan/mp/public/?Id=543543
2. http://www.example.com/course/df/public/?Id=454354
3. http://www.example.com/book/rg/public/?Id=4445577
4. http://www.example.com/trip/tr/public/?Id=454354
5. http://www.example.com/trip/tr/public/
I have tried with (./(?![A-Za-z]{2}_[A-Za-z]{2}).(?=&)). But it won't help.
I hope somebody can help me with this.
This pattern will catch what you want in two groups. It's safer than the other examples that have been suggested so far because it allows for some variance in the URL.
(.*)\w\w_\w\w.*?(?:(?:[&?]\w+=\d+|%\w*)*?(\?Id=\d+)(?:.*))?
(.*) captures everything up until your xx_xx part (capture group 1)
\w\w_\w\w.*? matches xx_xx and everything up until the next capture section
(?:[&?]\w+=\d+|%\w*)*? allows for there to be other & % or ? properties in your URL before your ?Id= property
(\?Id=\d+) captures your Id property (capture group 2)
(?:.*) is unnecessary but it bugs me when not all of the text is highlighted on regex101 ¯\_(ツ)_/¯
the optional non-capturing group here (?:(?:[&?]\w+=\d+|%\w*)*?(\?Id=\d+)(?:.*))? allows it to match URLs that don't have ID properties.
Here's an example of how it works
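A minimal Python sketch of how the two groups could be joined back together. The function name is made up, and treating a missing second group as an empty string is my assumption about the desired output:

```python
import re

# Pattern copied from the answer above.
URL_PATTERN = re.compile(
    r"(.*)\w\w_\w\w.*?(?:(?:[&?]\w+=\d+|%\w*)*?(\?Id=\d+)(?:.*))?"
)

def strip_page_and_extra_params(url):
    m = URL_PATTERN.match(url)
    if m is None:
        return url
    # group(2) is None when the URL has no ?Id=... property.
    return m.group(1) + (m.group(2) or "")

print(strip_page_and_extra_params(
    "http://www.example.com/myplan/mp/public/pl_be?Id=543543&timestamp=06280435435"
))
print(strip_page_and_extra_params("http://www.example.com/trip/tr/public/pl_ds"))
```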
Response updated:
This pattern will do the work for you:
(.*\/)[^?]*(?:(\?[^&]*).*)?
Explanation:
(.*\/) -> Will match and capture every character up to and including the last / in the string (the .* is greedy).
[^?]* -> Will match everything that's not a ? character.
(?:(\?[^&]*).*)? -> First of all, (?: ... ) is a non-capturing group, the ? at the end of this makes this group optional, (\?[^&]*) will match and capture the ? character and every non & character next to it, the last .* will match everything after the first param in the URL.
Then you can replace the string using only the first and second capture groups.
Here is a working example in regex101
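The replacement step can be sketched in Python like this. The name keep_first_param is hypothetical, and the \1\2 replacement relies on Python 3.5+ treating an unmatched group as an empty string:

```python
import re

def keep_first_param(url):
    # \1 is everything up to the last "/", \2 is "?" plus the first parameter.
    # In Python 3.5+ an unmatched group 2 is replaced with an empty string.
    return re.sub(r"(.*\/)[^?]*(?:(\?[^&]*).*)?", r"\1\2", url)

print(keep_first_param(
    "http://www.example.com/course/df/public/pl_de?Id=454354&timestamp=0628031746"
))
print(keep_first_param("http://www.example.com/trip/tr/public/pl_ds"))
```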
Edit 2:
As emsimpson92 mentioned in the comments, the Id might not always be the first param, so you can use this pattern to match the Id param:
(.*\/)[^?]*(?:(\?).*?(Id=[^&]*).*)?
The important part here is that .*?(Id=[^&]*).* matches the Id param no matter its position.
.*? -> It matches all the characters until Id= is present. The trick here is that .* is a greedy quantifier, but when it is followed by ? it becomes a lazy one.
Here is an Example of this scenario in regex101
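A Python sketch of this second pattern, using a made-up URL in which Id is the second parameter. The function name keep_id_param is hypothetical:

```python
import re

def keep_id_param(url):
    # \1 = path up to the last "/", \2 = "?", \3 = the Id parameter wherever it sits.
    return re.sub(r"(.*\/)[^?]*(?:(\?).*?(Id=[^&]*).*)?", r"\1\2\3", url)

# Invented URL: same shape as the question's data, but with Id in second place.
print(keep_id_param(
    "http://www.example.com/book/rg/public/pl_fo?timestamp=0628031734&Id=4445577"
))
```

One caveat: Id= is not anchored to a parameter boundary, so a parameter whose name merely ends in Id would also be picked up.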

Regular Expression to Anonymize Names

I am using Notepad++ and the Find and Replace pattern with regular expressions to alter usernames such that only the first and last character of the screen name is shown, separated by exactly four asterisks (*). For example, "albobz" would become "a****z".
Usernames are listed directly after the cue "screen_name: " and I know I can find all the usernames using the regular expression:
screen_name:\s([^\s]+)
However, this expression won't store the first or last letter and I am not sure how to do it.
Here is a sample line:
February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en
Method 1
You have to work with \G meta-character. In N++ using \G is kinda tricky.
Regex to find:
(?>(screen_name:\s+\S)|\G(?!^))\S(?=\S)
Breakdown:
(?> Construct a non-capturing group (atomic)
( Beginning of first capturing group
screen_name:\s+\S Match up to first letter of name
) End of first CG
| Or
\G(?!^) Continue from previous match
) End of NCG
\S Match a non-whitespace character
(?=\S) Up to last but one character
Replace with:
\1*
Live demo
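Outside Notepad++, the same length-preserving result can be had without \G. The sketch below swaps in a different technique, a replacement function in Python (whose re module has no \G); the helper names are made up:

```python
import re

def mask_inner(match):
    prefix, name = match.group(1), match.group(2)
    if len(name) <= 2:
        # Nothing between the first and last character to hide.
        return match.group(0)
    # Keep first and last character, one asterisk per inner character.
    return prefix + name[0] + "*" * (len(name) - 2) + name[-1]

def anonymize_keep_length(line):
    return re.sub(r"(screen_name:\s+)(\S+)", mask_inner, line)

line = "February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en"
print(anonymize_keep_length(line))
```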
Method 2
The above solution substitutes each inner character with a * so the length remains intact. If you want exactly four *s regardless of the length, you would search for:
(screen_name:\s+\S)(\S*)(\S)
and replace with: \1****\3
Live demo
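The Method 2 pattern works unchanged in other engines too. A quick Python sketch, where anonymize_fixed is a hypothetical name and \1 and \3 are the backreferences as in Notepad++:

```python
import re

def anonymize_fixed(line):
    # Keep the cue plus first character (\1) and the last character (\3),
    # and put exactly four asterisks between them.
    return re.sub(r"(screen_name:\s+\S)(\S*)(\S)", r"\1****\3", line)

line = "February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en"
print(anonymize_fixed(line))
print(anonymize_fixed("screen_name: albobz"))
```

A one-character username cannot satisfy both \S at the start and (\S) at the end, so such a line is left untouched.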