How can I print only the part of a string that matches a pattern in Terraform (regex)?

If I have a value like
email = "Mark Johnson (mark#johnson.com)"
how can I print only the text inside the () so that the output becomes
mark#johnson.com
I am currently using regex, but it doesn't seem to work with my pattern:
email = regex("x{(,)}", "var.email")
How can I solve this issue? Thanks for your help.

Terraform uses the RE2 regex engine, and the regex function returns the captured substrings, rather than the whole match, when the pattern defines capturing groups. With unnamed groups the result is a list of the captures in group order; here, you need just one group.
To extract all text inside parentheses:
> regex("[(]([^()]+)[)]", "Mark Johnson (mark#johnson.com)")
The [(] matches a ( char, ([^()]+) captures one or more chars other than ( and ) into Group 1, and [)] matches a ) char.
To extract an email-like string from parentheses:
> regex("[(]([^()#[:space:]]+#[^()[:space:]]+[.][^()[:space:]]+)[)]", "Mark Johnson (mark#johnson.com)")
Here, [^()#[:space:]]+ matches 1 or more chars other than (, ), # and whitespace.
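For anyone who wants to try the two patterns outside Terraform, here is a minimal Python sketch. It assumes Python's re module as a stand-in for RE2, and because re does not support POSIX classes such as [:space:], the second pattern uses \s instead. The function names are illustrative only.

```python
import re

# Same idea as the first Terraform pattern: capture whatever sits between ( and ).
INSIDE_PARENS = re.compile(r"[(]([^()]+)[)]")

# Email-like variant; \s stands in for [:space:], which Python's re lacks.
EMAIL_IN_PARENS = re.compile(r"[(]([^()#\s]+#[^()\s]+[.][^()\s]+)[)]")


def inside_parens(text):
    """Return the first parenthesised substring, or None if there is none."""
    m = INSIDE_PARENS.search(text)
    return m.group(1) if m else None


def email_in_parens(text):
    """Return the first email-like parenthesised substring, or None."""
    m = EMAIL_IN_PARENS.search(text)
    return m.group(1) if m else None


print(inside_parens("Mark Johnson (mark#johnson.com)"))
print(email_in_parens("Mark Johnson (mark#johnson.com)"))
```

The stricter pattern rejects parenthesised text that has no # and no dot, which the first pattern would happily return.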

Related

Regex ignores negative lookahead

I've got the following string:
#index 1#n John Doe#a some University#pc 7#cn 4#hi 1#pi 0.5889
And want to extract the part between #n and the following # with regex. The result should then be:
"John Doe"
This works with the following regex:
(?<=#n\s).(?:(?!#).)*
However, if the string looks as follows:
#index 1#n #a some University#pc 7#cn 4#hi 1#pi 0.5889
The regex returns:
"#a some University"
But I need it to return an empty string. Can someone help me with this problem?
You may do that by extracting one or more chars other than # after #n and a whitespace:
(?<=#n\s)[^#]+
See the regex demo. The (?<=#n\s) positive lookbehind matches a location immediately preceded with #n and a whitespace, and [^#]+ matches one or more chars other than #.
If there can be any one or more whitespaces, you can use a capturing group. In PySpark, it will look like
from pyspark.sql.functions import col, regexp_extract
df.withColumn("result", regexp_extract(col("source"), r"#n\s+([^#]+)", 1))
See this regex demo. With #n\s+([^#]+), you match #n, one or more whitespaces, and then capture one or more non-#s into Group 1.
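To check the capturing-group pattern without a Spark session, here is a small sketch using plain Python re. The extract_name helper is hypothetical; it returns an empty string when nothing matches, which is how regexp_extract behaves as well.

```python
import re

# #n, one or more whitespace chars, then one or more non-# chars in Group 1.
PATTERN = re.compile(r"#n\s+([^#]+)")


def extract_name(source):
    """Return Group 1 of the first match, or "" when the pattern does not match."""
    m = PATTERN.search(source)
    return m.group(1) if m else ""


with_name = "#index 1#n John Doe#a some University#pc 7#cn 4#hi 1#pi 0.5889"
without_name = "#index 1#n #a some University#pc 7#cn 4#hi 1#pi 0.5889"

print(repr(extract_name(with_name)))
print(repr(extract_name(without_name)))
```

In the second string the whitespace after #n is followed directly by #, so [^#]+ has nothing to consume and the pattern fails instead of spilling into the next field.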

Regex Capture Parts of Line

I have been struggling to capture a part of an snmp response.
Text
IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ
Regex
(?P<ifDescr>(?<=ifDescr.\d = ).*)
Current Capture
1/1/g1, Office to DMZ
How can I capture each of these parts on its own?
1/1/g1
Office to DMZ
EDIT
1/1/g1
This should match the digit and forward slashes for the port notation in the snmp response.
(?P<ifDescr>(?<=ifDescr.\d = )\d\/\d\/g\d)
Office to DMZ
This should start the match past the port notation and capture remaining description.
(?P<ifDescr>(?<=ifDescr.\d = \d\/\d\/g\d, ).*)
You could just use the answer I gave you yesterday and split the first return group, 1/1/g10, by '/' and get the third part.
1/1/g10
split by '/' gives
1
1
g10 <- third part
Why use a more complicated regex when you can use simple code to accomplish the task?
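The split approach can be sketched in a few lines of Python. The third_part helper is a hypothetical name, and the input is assumed to be the port string already extracted by the earlier regex.

```python
def third_part(port):
    """Split a port string such as '1/1/g10' on '/' and return the third piece."""
    parts = port.split("/")
    # Guard against strings that do not have three slash-separated pieces.
    return parts[2] if len(parts) >= 3 else None


print(third_part("1/1/g10"))
```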
With your shown samples, please try the following regex, which needs a PCRE-compatible engine.
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:\d+\/){2}g(?:\d+)
Or, with a small variation, use the following:
(?<=IF-MIB::ifDescr)\.\d+\s=\s\K(?:(?:\d+\/){2}g\d+)
Explanation of the above regex:
(?<=IF-MIB::ifDescr) ##Lookbehind: everything that follows must be immediately preceded by IF-MIB::ifDescr.
\.\d+\s=\s ##Matches a literal dot, one or more digits, one whitespace character, =, and one more whitespace character.
\K ##\K (a Perl/PCRE feature, not available in every engine) discards everything matched so far from the reported match, so only what follows is returned.
(?:\d+\/){2}g(?:\d+) ##Matches one or more digits followed by /, twice, then g and one or more digits; both groups are non-capturing.
Without PCRE: to get the value in the first capture group, try the following (the OP confirmed in the comments that it works).
(?<=IF-MIB::ifDescr)\.\d+\s=\s((\d+\/){2}g\d+)
Here are my attempts.
const string pattern = ".* = (.*), (.*)";
var r = Regex.Match(s, pattern);
const string pattern2 = ".* = ([0-9a-zA-Z\\/]*), (.*)";
var r2 = Regex.Match(s, pattern2);
To capture the value 1/1/g1 in the named group ifDescr, you can use a plain match instead of lookarounds.
(Note that the dot is escaped as \. to match it literally.)
ifDescr\.\d+ = (?P<ifDescr>\d+\/\d+\/g\d+),
The pattern matches:
ifDescr\.\d+ = Match ifDescr. and 1+ digits followed by =
(?P<ifDescr> Named group ifDescr
\d+\/\d+\/g\d+ Match 1+ digits / 1+ digits /g and 1+ digits
), Close group and match the trailing comma
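Here is a short Python sketch of the named-group approach. The ifDescr group is taken from the pattern above; the second group, desc, is an addition for illustration so that the description after the comma is captured too. Python does not require the slashes to be escaped.

```python
import re

# ifDescr comes from the answer's pattern; desc is a hypothetical extra group.
PATTERN = re.compile(r"ifDescr\.\d+ = (?P<ifDescr>\d+/\d+/g\d+), (?P<desc>.*)")


def parse_ifdescr(line):
    """Return (port, description) from an snmp ifDescr line, or None on no match."""
    m = PATTERN.search(line)
    return (m.group("ifDescr"), m.group("desc")) if m else None


print(parse_ifdescr("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
```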
Do the following:
ifDescr\.\d+\s=\s((?:\d\/){2}g\d+)
The resulting capture group contains the intended result. Note that \d+ accepts one or more digits, so you don't need the OR operator you used.
Alternatively, it looks like the number after g will always be the same as the number after ifDescr. (with the trailing dot). If that is the case, do this:
ifDescr\.(\d+)\s=\s((?:\d\/){2}g\1)
This captures the number in a group, then reuses it for matching through a backreference (note the usage of \1). The intended result in this case is available in the second capturing group.
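The backreference variant behaves the same way in Python's re module. This is a minimal sketch with a hypothetical helper name; the second sample line is made up to show the pattern rejecting a mismatched index.

```python
import re

# Group 1 is the interface index; \1 requires the same digits after g.
PATTERN = re.compile(r"ifDescr\.(\d+)\s=\s((?:\d/){2}g\1)")


def port_for_matching_index(line):
    """Return the port (Group 2) if the index after g equals the ifDescr index."""
    m = PATTERN.search(line)
    return m.group(2) if m else None


print(port_for_matching_index("IF-MIB::ifDescr.1 = 1/1/g1, Office to DMZ"))
print(port_for_matching_index("IF-MIB::ifDescr.2 = 1/1/g1, Office to DMZ"))
```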
I think this is what you are looking for:
= (.+), (.+)
It looks for "= " then captures all until a comma and then everything afterwards. It returns
1/1/g1
Office to DMZ
as requested.
See it working on regex101.com.

Notepad++ regex to extract usernames from this list

I have this list below:
scrapeDate,username,full_name,is_private,follower_count,following_count,media_count,biography,hasProfilePic,external_url,email,contact_phone_number,address_street,category,businessJoinDate,businessCountry,businessAds,countryCode,cityName,isverified
07/05/2020 05:37 AM,maplethenorwich,Maple the Norwich,False,0,0,0,,False,,,,,,,,,,,No
07/05/2020 05:37 AM,baby_yoda_militia,Baby Yoda,False,0,0,0,,False,,,,,,,,,,,No
07/05/2020 05:37 AM,caciquegoldendoodle,CaciqueGoldenDoodle,False,0,0,0,,False,,,,,,,,,,,No
07/05/2020 05:37 AM,ja_watts,Julie Anna Watts,False,0,0,0,,False,,,,,,,,,,,No
07/05/2020 05:37 AM,lets_go_zumba_and_travel,Mrsirenetakamoto,False,0,0,0,,False,,,,,,,,,,,No
07/05/2020 05:37 AM,bunnyslash,Bunnyslash,False,0,0,0,,False,,,,,,,,,,,No
I would like to get the Usernames only as below:
maplethenorwich
baby_yoda_militia
caciquegoldendoodle
ja_watts
lets_go_zumba_and_travel
bunnyslash
I've tried ^(?:[^,\r\n]*,){3}([^,\r\n]+).* but it gets me "False".
I hope somebody can help me find the right regex to extract only the usernames.
You may try:
.*?,(.*?),.*
Explanation of the above regex:
.*? - Lazily matches any characters except a newline; here, the first field.
, - Matches , literally.
(.*?) - First capturing group; lazily matches the username, i.e. the second value in the CSV line.
,.* - Greedily matches a comma and the rest of the line. If you don't want to modify the file, skip the replacement and write the captured group to a new file or wherever you need it.
$1 - In the replacement field, use $1 to replace all the matched text with just the captured group.
The {3} quantifier repeats the non-capturing group three times, so the first three fields are skipped and group 1 captures the fourth one (is_private), which is why you get False. You want the second value, so there is no need to repeat anything.
^(?:[^,\r\n]*,){3}([^,\r\n]+).*
 ^^^          ^^^^
You can omit the quantifier and the non-capturing group, as there is nothing to repeat.
^[^,\r\n]*,([^,\r\n]+).*
^ Start of the string
[^,\r\n]*, Match 0+ times any char except a comma or newline, then match ,
( Capture group 1
[^,\r\n]+ Match 1+ times any char except a comma or newline
) Close group 1
.* Match the rest of the line
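Outside Notepad++, the same pattern can be tried with Python's re module. This is a sketch only: it assumes the re.MULTILINE flag so that ^ matches at the start of every line, and it uses two data rows from the question without the header row.

```python
import re

# Skip the first field and its comma, then capture the second field.
PATTERN = re.compile(r"^[^,\r\n]*,([^,\r\n]+).*", re.MULTILINE)

SAMPLE = "\n".join([
    "07/05/2020 05:37 AM,maplethenorwich,Maple the Norwich,False,0,0,0,,False,,,,,,,,,,,No",
    "07/05/2020 05:37 AM,baby_yoda_militia,Baby Yoda,False,0,0,0,,False,,,,,,,,,,,No",
])


def usernames(text):
    """Return the second comma-separated field of every line."""
    # With a single capturing group, findall returns the group's text per match.
    return PATTERN.findall(text)


print(usernames(SAMPLE))
```

If the header row is included, its second field (the literal word username) is returned as well, so it has to be dropped separately.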

Regex to extract static text and number using only regular expression

I am completely new to regular expressions.
I tried to write a regular expression that gets some static text and the phone number from the text below:
"password":"password123:cityaddress:mailaddress:9233321110:gender:45"
I wrote the following to extract "password":9233321110
(([\"]password[\"][\s]*:{1}[\s]*))(\d{10})?
regex link for demo:
https://regex101.com/r/2vNpMU/2
The correct regexp should give "password":9233321110 as the full match in the regex tool.
I am not using any programming language here; this is for network packet capture at the F5 level.
Please help me with the regexp.
I would use /^([^:]+)(?::[^:]+){3}:([^:]+)/ for this.
Explained (more detailed explanation at regex101):
^ matches from the start of the string
(…) is the first capture group. This will collect that initial "password"
[^:]+ matches one or more non-colon characters
(?:…) is a non-capturing group (it collects nothing for later)
:[^:]+ matches a colon and then 1+ non-colons
{3} instructs us to match the previous item (the non-capturing group) 3 times
: matches a literal colon
([^:]+) captures a match of 1+ non-colons, which will get us 9233321110 in this example
The first capture group is typically stored as $1 or the first item of the returned array. (In JavaScript, the zeroth item is the full match and item index 1 is the first capture group.) The second capture group is $2, etc.
To always match the "password" key, hard-code it: /^("password")(?::[^:]+){3}:([^:]+)/
Here's a live snippet demonstrating it:
const x = `"password":"password123:cityaddress:mailaddress:9233321110:gender:45"`;
// Group 1 is the key; group 2 is the fifth colon-separated field (the phone number).
const match = x.match(/^([^:]+)(?::[^:]+){3}:([^:]+)/);
if (match) console.log(match[1] + ":" + match[2]);
else console.log("no match");
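For comparison, here is a minimal Python sketch of the hard-coded "password" variant mentioned above. The helper name is hypothetical, and joining the two groups with a colon is just one way to build the requested output.

```python
import re

# Hard-coded key, three skipped fields, then the field to capture.
PATTERN = re.compile(r'^("password")(?::[^:]+){3}:([^:]+)')


def key_and_number(text):
    """Return key:number built from groups 1 and 2, or None on no match."""
    m = PATTERN.search(text)
    return f"{m.group(1)}:{m.group(2)}" if m else None


sample = '"password":"password123:cityaddress:mailaddress:9233321110:gender:45"'
print(key_and_number(sample))
```

Note that the result is assembled from two groups; a single regex cannot return non-adjacent pieces of the input as one contiguous full match.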

Regular Expression to Anonymize Names

I am using Notepad++ and the Find and Replace pattern with regular expressions to alter usernames such that only the first and last character of the screen name is shown, separated by exactly four asterisks (*). For example, "albobz" would become "a****z".
Usernames are listed directly after the cue "screen_name: " and I know I can find all the usernames using the regular expression:
screen_name:\s([^\s]+)
However, this expression won't store the first or last letter and I am not sure how to do it.
Here is a sample line:
February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en
Method 1
You have to work with the \G meta-character. In N++, using \G is kinda tricky.
Regex to find:
(?>(screen_name:\s+\S)|\G(?!^))\S(?=\S)
Breakdown:
(?> Construct a non-capturing group (atomic)
( Beginning of first capturing group
screen_name:\s+\S Match up to first letter of name
) End of first CG
| Or
\G(?!^) Continue from previous match
) End of NCG
\S Match a non-whitespace character
(?=\S) Up to last but one character
Replace with:
\1*
Method 2
The above solution substitutes each inner character with a *, so the length stays intact. If you want to insert exactly four *s regardless of the name's length, search for:
(screen_name:\s+\S)(\S*)(\S)
and replace with: \1****\3
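Method 2 translates directly to Python's re.sub. This sketch assumes Python rather than Notepad++, and the anonymize helper is a hypothetical name. Method 1 is not shown because Python's re module has no \G anchor.

```python
import re

# Group 1: cue plus first character; group 2: the middle; group 3: last character.
PATTERN = re.compile(r"(screen_name:\s+\S)(\S*)(\S)")


def anonymize(line):
    """Keep the first and last character of the screen name, with **** between."""
    return PATTERN.sub(r"\1****\3", line)


sample = "February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en"
print(anonymize(sample))
```

A one-character screen name is left unchanged, because the pattern needs at least two non-whitespace characters after the cue.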