Substitute one group with another group; Followup question - regex

I'll be referring to this thread:
Substitute one group with another group
What I'd like to do is put the value of P1(y) into P4(y),
with the end result: (...) <P4 x="-0,36935" y="0,26315"/>
My previous question, although similar, seems to require a completely new approach,
and unfortunately I couldn't find a reliable solution.
Example to work on:
https://regex101.com/r/iua3p0/2
<P1 x="-0,36935" y="0,26315"/><P2 (...)/><P3 (..)/><P4 x="-0,36935" y="-0,40351"/>
<P1 x="4,64065" y="0,26315"/><P2 (...)/><P3 (..)/><P4 x="4,64065" y="-0,40351"/>

To put the value of P1(y) into P4(y) on the same line, you could use:
<P1[^>]*\hy="([^"]+)"[^>]*>.*?<P4[^>]*\hy="\K[^"]+(?=[^>]*>)
The pattern matches:
<P1[^>]* Match <P1 and optional chars other than >
\hy=" Match a horizontal whitespace char (such as a space or tab) and y="
([^"]+) Capture chars other than " in group 1
"[^>]*> Match " and optional chars other than > and then match >
.*? Match as few chars as possible
<P4[^>]*\hy=" Match <P4 and optional chars other than >, then match a horizontal whitespace char and y="
\K[^"]+ Forget what has been matched so far, and then match the value you want to replace, in this case 1+ chars other than "
(?=[^>]*>) Positive lookahead to assert a > to the right
And replace with group 1 using $1
See a regex demo.
Note that the negated character classes can also match newlines. To avoid matching across lines, you can exclude newlines using [^"\r\n] and [^>\r\n]*
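Outside the editor, the same substitution can be sketched in Python. Python's re module supports neither \K nor \h, so this is an adaptation rather than the pattern above verbatim: group 1 captures everything up to the P4 y value, group 2 captures the P1 y value, and [ \t] stands in for \h. The function name copy_p1_y_to_p4 is made up for this example.

```python
import re

# Group 1: everything from <P1 ... up to and including y=" of P4.
# Group 2: the y value of P1.
# The trailing [^"\r\n]+ is the old P4 y value that gets replaced.
PATTERN = re.compile(
    r'(<P1[^>\r\n]*[ \t]y="([^"\r\n]+)"[^>\r\n]*>.*?<P4[^>\r\n]*[ \t]y=")[^"\r\n]+'
)


def copy_p1_y_to_p4(text: str) -> str:
    # Keep group 1 unchanged and write the P1 y value where the P4 y value was.
    return PATTERN.sub(r"\g<1>\g<2>", text)


sample = (
    '<P1 x="-0,36935" y="0,26315"/><P2 (...)/><P3 (..)/><P4 x="-0,36935" y="-0,40351"/>\n'
    '<P1 x="4,64065" y="0,26315"/><P2 (...)/><P3 (..)/><P4 x="4,64065" y="-0,40351"/>'
)
print(copy_p1_y_to_p4(sample))
```

Because . does not match a newline by default in Python, each line is handled on its own.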


replaceAll regex to remove last - from the output

I was able to get close to the desired output, but not exactly. I am using replaceAll with a regex; the sample code is below.
final String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
System.out.println(label.replaceAll(
"([^-]+)-([^-]+)-(.+)-([^-]+)-([^-]+)", "$3"));
I want this output:
abc-nyd-request-xyxpt
but I am getting:
abc-nyd-request-xyxpt-
Here is the code: https://ideone.com/UKnepg
You may use this .replaceFirst solution:
String label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9";
label.replaceFirst("(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$", "$1");
//=> "abc-nyd-request-xyxpt"
RegEx Demo
RegEx Details:
(?:[^-]*-){2}: Match 2 repetitions of 0+ non-hyphen characters followed by a hyphen
(.+?): Match 1+ of any characters, as few as possible, and capture in group #1
(?:--1)?: Match optional --1
-: Match a -
[^-]+: Match a non-hyphenated string
$: End
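For readers who want to try the pattern outside Java, here is a small Python sketch of the same idea. It assumes re.sub with count=1 is an acceptable stand-in for Java's replaceFirst; the variable names are only illustrative.

```python
import re

label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"

# Skip the first two hyphen-terminated parts, lazily capture the middle,
# then allow an optional "--1" and require one final "-part" at the end.
pattern = r"(?:[^-]*-){2}(.+?)(?:--1)?-[^-]+$"

# count=1 replaces only the first match, like replaceFirst.
result = re.sub(pattern, r"\1", label, count=1)
print(result)
```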
The following works for your example case
([^-]+)-([^-]+)-(.+[^-])-+([^-]+)-([^-]+)
https://regex101.com/r/VNtryN/1
The [^-] at the end of group 3 keeps the group from capturing a trailing -, while -+ allows more than one hyphen after it, which is what makes it match the double --.
With your shown samples and attempts, please try the following regex. It creates 1 capturing group that can be used in the replacement. Do the replacement with $1 in your function.
^(?:.*?-){2}([^-]*(?:-[^-]*){3})--.*
Here is the Online demo for above regex.
Explanation: Adding detailed explanation for above regex.
^(?:.*?-){2} ##From the start of the value, a non-capturing group lazily matches up to the nearest - and is repeated 2 times.
([^-]*(?:-[^-]*){3}) ##The 1st and only capturing group: 0+ non-hyphen chars, followed by 3 repetitions of a - and 0+ non-hyphen chars, which gives the required output.
--.* ##Match -- and everything after it to the end of the value.
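The two alternative patterns above (the one replacing with group 3 and the anchored one replacing with group 1) can be checked against the sample label with a short Python sketch. The variable names are assumptions for illustration, and Python writes the backreferences as \3 and \1 instead of $3 and $1.

```python
import re

label = "abcs-xyzed-abc-nyd-request-xyxpt--1-cnaq9"

# Five-group pattern: group 3 must end in a non-hyphen, and -+ absorbs
# one or more hyphens before the last two parts.
five_group_pattern = r"([^-]+)-([^-]+)-(.+[^-])-+([^-]+)-([^-]+)"

# Anchored pattern: skip two parts, capture four hyphen-separated parts,
# then match the double hyphen and the rest.
anchored_pattern = r"^(?:.*?-){2}([^-]*(?:-[^-]*){3})--.*"

from_group_3 = re.sub(five_group_pattern, r"\3", label)
from_group_1 = re.sub(anchored_pattern, r"\1", label)
print(from_group_3, from_group_1)
```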

Notepad++: replace occurrences of characters before another character

I have a file with text like this:
"Title" = "Body"
And I would like to remove both " before the =, to leave it like this:
Title = "Body"
So far I managed to select the first block of text with:
.+(=)
That selects everything up to the =, but I can't find how to replace (or delete) both " .
Any suggestions?
You could use a capture group in the replacement, and match the double quotes to be removed while asserting an equals sign at the right.
Find what:
"([^"]+)"(?=\h*=)
" Match literally
([^"]+) Capture group 1, match 1+ times any char other than "
" Match literally
(?=\h*=) Positive lookahead, assert an = sign at the right
Regex demo
Replace with:
$1
To match the whole line from start to end, you might also use 2 capture groups and use those in the replacement.
^"([^"]+)"(\h*=\h*"[^"]+")$
Regex demo
In the replacement use $1$2
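Both variants can be tried in Python as a rough check. This sketch assumes [ \t] is an acceptable substitute for \h, which Python's re module does not support, and writes the replacements as \1 and \1\2.

```python
import re

text = '"Title" = "Body"'

# Variant 1: match a quoted string only if an equals sign follows it.
lookahead_result = re.sub(r'"([^"]+)"(?=[ \t]*=)', r"\1", text)

# Variant 2: match the whole line and rebuild it from two capture groups.
two_group_result = re.sub(
    r'^"([^"]+)"([ \t]*=[ \t]*"[^"]+")$', r"\1\2", text, flags=re.MULTILINE
)
print(lookahead_result)
print(two_group_result)
```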
You can use
(?:\G(?!^)|^(?=.*=))[^"=\v]*\K"
Replace with an empty string.
Details:
(?:\G(?!^)|^(?=.*=)) - end of the previous successful match (\G(?!^)) or (|) start of a line that contains = somewhere on it (^(?=.*=))
[^"=\v]* - any zero or more chars other than ", = and vertical whitespace
\K - omit the text matched
" - a " char (matched, consumed and removed)

Regex to get value from <key, value> by asserting conditions on the value

I have a regex which takes the value from the given key as below
Regex: .*key="([^"]*)".*
Input value: key="abcd-qwer-qaa-xyz-vwxc"
Output: abcd-qwer-qaa-xyz-vwxc
On top of this, I need to validate that the value starts with abcd- and contains the pattern -xyz somewhere after that.
Thus, the inputs and outputs have to be as follows:
I tried the pattern below, which does not work as expected:
.*key="([^"]*)"?(/Babcd|-xyz).*
The key value pair is part of the large string as below:
object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}
I think that by matching the key it takes the value, and that's why I used .*key="([^"]*)".*
Note:
It's a dashboard. You can refer to this link and search for Regex: /"([^"]+)"/ This regex is applied to the query result, which is the string I referred to. It works with the regex .*key="([^"]*)".* above, and I'm trying to alter that regex group itself. Hope this helps.
Can anyone guide me on this, please? That would be helpful. Thanks!
Looks like you could do with:
\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"
See the demo online
\bkey=" - A word-boundary and literally match 'key="'
( - Open 1st capture group.
abcd - Literally match 'abcd'.
(?=.*-xyz\b) - Positive lookahead for zero or more characters (other than newline) followed by literally '-xyz' and a word-boundary.
(?: - Open non-capturing group.
-[a-z]+ - Match a hyphen followed by at least one lowercase letter.
){4} - Close non-capture group and match it 4 times.
) - Close 1st capture group.
" - Match a literal double quote.
I'm not 100% sure you'd only want to allow lowercase letters, so you can adjust that part if need be. The whole pattern validates the input value, and you can use capture group 1 to grab the value.
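As a rough check of the lookahead pattern against the larger string from the question, here is a Python sketch. The helper name extract_key_value is hypothetical, and the second input is a made-up negative case without -xyz.

```python
import re
from typing import Optional

PATTERN = re.compile(r'\bkey="(abcd(?=.*-xyz\b)(?:-[a-z]+){4})"')


def extract_key_value(text: str) -> Optional[str]:
    # Return the captured value, or None when the value fails validation.
    match = PATTERN.search(text)
    return match.group(1) if match else None


good = 'object{one="ab-vwxc",two="value1",key="abcd-eest-wd-xyz-bnn",four="obsolete Values"}'
bad = 'object{one="ab-vwxc",two="value1",key="abcd-eest-wd-foo-bnn",four="obsolete Values"}'
print(extract_key_value(good))
print(extract_key_value(bad))
```

Note that the lookahead scans the rest of the line, not just the quoted value, so a -xyz appearing later in the string would also satisfy it.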
Update after edited question with new information:
Prometheus uses the RE2 engine for all regular expressions. Therefore the above suggestion won't work, because RE2 does not support lookarounds. A less restrictive but workable pattern for OP could be:
\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"
See the online demo
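The lookaround-free pattern uses only constructs that Python's re module also supports, so its matching logic can be sanity-checked there. Python's engine is not RE2, so this sketch says nothing about RE2-specific behavior. The function name and the negative input are assumptions for illustration.

```python
import re
from typing import Optional

# abcd, then any number of -word parts, a mandatory -xyz part,
# then any number of further -word parts before the closing quote.
PATTERN = re.compile(r'\bkey="(abcd(?:-\w+)*-xyz(?:-\w+)*)"')


def extract_validated_value(text: str) -> Optional[str]:
    match = PATTERN.search(text)
    return match.group(1) if match else None


with_xyz = 'key="abcd-qwer-qaa-xyz-vwxc"'
without_xyz = 'key="abcd-qwer-qaa-vwxc"'
print(extract_validated_value(with_xyz))
print(extract_validated_value(without_xyz))
```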
Will this work?
Pattern
\bkey="(abcd-[^"]*\bxyz\b[^"]*)"
Demo
You could use the following regular expression to verify the string has the desired format and to match the portion of the string that is of interest.
(?<=\bkey=")(?=.*-xyz(?=-|$))abcd(?:-[a-z]+)+(?=")
Start your engine!
Note there are no capture groups.
The regex engine performs the following operations.
(?<=\bkey=") : positive lookbehind asserts the current
position in the string is preceded by 'key="'
(?= : begin positive lookahead
.*-xyz : match 0+ characters, then '-xyz'
(?=-|$) : positive lookahead asserts the current position is
: followed by '-' or is at the end of the string
) : end positive lookahead
abcd : match 'abcd'
(?: : begin non-capture group
-[a-z]+ : match '-' followed by 1+ lowercase letters
)+ : end non-capture group and execute it 1+ times
(?=") : positive lookahead asserts the current position is
: followed by '"'

notepad++ regex how to extract userId from this list

I have this list below:
originalscrape,scrapeDate,userId,username,full_name,is_private,follower_count,following_count,media_count,biography,hasProfilePic,external_url,email,contact_phone_number,address_street,isbusiness,Engagement %,MostRecentPostDate,AvgLikes,AvgComments,category,businessJoinDate,businessCountry,businessAds,countryCode,cityName,isverified
,07/03/2020 05:54 AM,="189389157",stronger_together_forever,stronger_together_forever 🌈🏖☀️,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No
,07/03/2020 05:54 AM,="51807820",aaronistattoo,Aaron Is.,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No
,07/03/2020 05:54 AM,="194962598",djcoley727,djcoley727,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No
,07/03/2020 05:54 AM,="4182106610",cesararce1985,Cesar Arce,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No
,07/03/2020 05:54 AM,="8957742561",minkwhiz,𝕄𝕚𝕟𝕜𝕎𝕙𝕚𝕫,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No
I would like to get the userIds only as below:
189389157
51807820
194962598
4182106610
8957742561
I've used ^(?:[^,\r\n]*,){3}([^,\r\n]+).* but it gets me the usernames; what I want is the user IDs.
I hope somebody can help me find the right regex to extract only the user IDs.
Thank you
Take advantage of the fact that the time in AM/PM format appears right before each ID, and that the ID is surrounded by " characters:
(?:AM|PM),=\"(\d+)\"
Check the demo at Regex101.
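A minimal Python sketch of this approach, using two of the rows from the question as input. re.findall returns the contents of the single capture group for every match.

```python
import re

rows = (
    ',07/03/2020 05:54 AM,="51807820",aaronistattoo,Aaron Is.,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No\n'
    ',07/03/2020 05:54 AM,="194962598",djcoley727,djcoley727,False,0,0,0,,False,,,,,No,0,Has no posts.,0,0,,,,,,,No\n'
)

# The digits between =" and " that directly follow the AM/PM marker.
user_ids = re.findall(r'(?:AM|PM),="(\d+)"', rows)
print(user_ids)
```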
You could match the =" literally and repeat the group 2 times instead of 3, then capture 1+ digits.
Note that the character class [^,\r\n] is repeated with * (0 or more times), because the first field on each data line is empty.
If you want the digits only, you could replace with group 1 using $1
^(?:[^,\r\n]*,){2}="(\d+)".*
^ Start of string
(?:[^,\r\n]*,){2} Repeat 2 times matching 0 or more times any char except a comma or a newline, then match ,
=" Match literally
(\d+) Capture group 1, match 1+ digits
".* Match " and match the rest of the line
Regex demo
If you want the match only, you could make use of \K to reset the match buffer, then match the digits and assert a double quote on the right.
^(?:[^,\r\n]*,){2}="\K\d+(?=")
Regex demo
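Python's re module has no \K, so a sketch there has to rely on the capture-group form of the pattern. The header below is shortened from the one in the question, and re.MULTILINE makes ^ match at the start of every line.

```python
import re

data = (
    "originalscrape,scrapeDate,userId,username\n"  # shortened header, for illustration
    ',07/03/2020 05:54 AM,="51807820",aaronistattoo,Aaron Is.,False\n'
    ',07/03/2020 05:54 AM,="194962598",djcoley727,djcoley727,False\n'
)

# Skip exactly two comma-terminated fields, then require =" and capture the digits.
# The header line has no =" in its third field, so it produces no match.
ids = re.findall(r'^(?:[^,\r\n]*,){2}="(\d+)"', data, flags=re.MULTILINE)
print(ids)
```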

Using regex replacement in Sublime 3

I am trying to use replace in Sublime using regular expressions but I'm stuck. I tried various combinations but don't seem to be getting there.
This is the input and my desired output:
Input: N_BBP_c_46137_n
Output : BBP
I tried combinations of:
[^BBP]+\b
\*BBP*+\g
But none of the above (and many others) seem to work.
To turn N_BBP_c_46137_n into BBP (according to the comment, you just want the entire long name such as N_BBP_* to be replaced by only BBP), you might use a capture group to keep BBP.
\bN_(BBP)_\S*
\bN_ Match N_ preceded by a word boundary
(BBP) Capture group 1, match BBP (or use [A-Z]+ to match 1+ uppercase chars)
_\S* Match _ followed by 0+ times a non whitespace char
In the replacement use the first capturing group $1
Regex demo
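The same replacement can be sketched in Python, where the group reference is written \1 instead of $1. The second input is a made-up sentence, included to show that \S* stops at whitespace.

```python
import re

pattern = r"\bN_(BBP)_\S*"

# Replace the whole name with the captured BBP.
single = re.sub(pattern, r"\1", "N_BBP_c_46137_n")
in_sentence = re.sub(pattern, r"\1", "before N_BBP_c_46137_n after")
print(single)
print(in_sentence)
```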
You may use
(N_)[^_]*(_c_\d+_n)
Replace with ${1}some new value$2.
Details
(N_) - Group 1 ($1 or ${1} if the next char is a digit): N_
[^_]* - any 0 or more chars other than _
(_c_\d+_n) - Group 2 ($2): _c_, 1 or more digits and then _n.
See the regex demo.
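In Python the unambiguous group syntax is \g<1>, which plays the same role as ${1} here. A minimal sketch, keeping the placeholder text "some new value" from the answer:

```python
import re

pattern = r"(N_)[^_]*(_c_\d+_n)"

# Keep the prefix (group 1) and suffix (group 2), replace only the middle part.
replaced = re.sub(pattern, r"\g<1>some new value\g<2>", "N_BBP_c_46137_n")
print(replaced)
```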