Select part of line in regular expression - regex

I have this string:
#1#http://test.ir:8080/srvSC.svc#1#
#2#http://test.ir:8081/srvSC.svc#2#
#3#http://test.ir:8082/srvSC.svc#3#
#4#http://test.ir:8083/srvSC.svc#4#
#5#http://test.ir:8084/srvSC.svc#5#
#6#http://test.ir:8085/srvSC.svc#6#
I want to select every #1#, #2#, ... marker, so I wrote the expression ^(^\#.\#), but it only selects the one on the first line. How can I select both the first #.# and the last #.# on each line?

You can use
^(#\d+#)(.+)\1$
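That captures the leading #N# marker in group 1, matches the characters that follow in group 2, and then uses the backreference \1 to require the same marker again at the end of the line. The string between the markers will be in the second captured group. To match every line rather than only the first, enable multiline mode so that ^ and $ anchor at each line.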
https://regex101.com/r/7Er0Ch/5
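As a minimal sketch of the same idea in Python (the language is an assumption here, since the question does not name one; the variable names are illustrative), the pattern can be applied to a shortened copy of the sample with re.MULTILINE:

```python
import re

# Shortened copy of the sample input from the question.
text = """#1#http://test.ir:8080/srvSC.svc#1#
#2#http://test.ir:8081/srvSC.svc#2#
#3#http://test.ir:8082/srvSC.svc#3#"""

# re.MULTILINE makes ^ and $ anchor at every line, not only at the
# start and end of the whole string.
pattern = re.compile(r"^(#\d+#)(.+)\1$", re.MULTILINE)

matches = list(pattern.finditer(text))
markers = [m.group(1) for m in matches]  # the #N# marker of each line
urls = [m.group(2) for m in matches]     # the text between the markers

print(markers)
print(urls)
```

Without the flag, ^ would only anchor at the very start of the text, which is consistent with the "only the first line" behaviour described in the question.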

Related

How to match strings that are entirely composed of a predefined set of substrings with regex

How to match strings that are entirely composed of a predefined set of substrings. For example, I want to see if a string is composed of only the following allowed substrings:
,
034
140
201
In the case when my string is as follows:
034,201
The string is fully composed of the 'allowed' substrings, so I want to positively match it.
However, in the following string:
034,055,201
There is an additional 055, which is not in my 'allowed' substrings set. So I want to not match that string.
What regex would be capable of doing this?
Try this one:
^(034|201|140|,)+$
Step by step:
^ beginning of a line
(034|201|140|,) capturing group listing the allowed alternatives
+ the group repeats one or more times
$ end of a line
This regex will match only your values and ensure that the line doesn't start or end with a comma. The whole match (group 0) is only produced when the line is valid; the inner groups are non-capturing.
^(?:034|140|201)(?:,(?:034|140|201))*$
^: start
(?:034|140|201): non-capturing group for your set of items (no comma)
(?:,(?:034|140|201))*: non-capturing group of a comma followed by a non-capturing group of values, 0 or more times
$: end
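To see how the two suggested patterns differ, here is a small sketch in Python (an assumption, since the question does not name a language). The names loose and strict and the extra sample strings are made up for illustration:

```python
import re

# First answer: the comma is just another allowed token.
loose = re.compile(r"^(034|201|140|,)+$")
# Second answer: values must be separated by single commas.
strict = re.compile(r"^(?:034|140|201)(?:,(?:034|140|201))*$")

samples = ["034,201", "034,055,201", ",034,", "034201"]

# Map each sample to (matches loose?, matches strict?).
results = {s: (bool(loose.match(s)), bool(strict.match(s))) for s in samples}

for sample, (is_loose, is_strict) in results.items():
    print(f"{sample!r}: loose={is_loose} strict={is_strict}")
```

Both patterns accept 034,201 and reject 034,055,201. They disagree on inputs such as ,034, or 034201, which the first pattern accepts because the comma is optional and may appear anywhere.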

How to get text that is before and after of a matched group in a regex expression

I have following regex that matches any number in the string and returns it in the group.
^.*[^0-9]([0-9]+).*$  $1
Is there a way to also get the text before and after the matched group? My end goal is to reconstruct the string while replacing only the value of the matched group.
For example, for the string /this_text_appears_before/73914774/this_text_appears_after, I want to do something like $before_text[replaced_text]$after_text to generate a final result of /this_text_appears_before/[replaced_text]/this_text_appears_after
You only need a single capture group, which should capture the first part instead of the digits:
^(.*?[^0-9])[0-9]+
In the replacement, use group 1 followed by your replacement text: \1[replaced_text]. The text after the digits is not part of the match, so it is left in place.
Example
import re

pattern = r"^(.*?[^0-9])[0-9]+"
s = "/this_text_appears_before/73914774/this_text_appears_after"
# \1 puts back the text before the digits; only the digits are replaced.
result = re.sub(pattern, r"\1[replaced_text]", s)
print(result)
Output
/this_text_appears_before/[replaced_text]/this_text_appears_after
For the example data, another option is to match up to the / that precedes the digits:
^(.*?/)[0-9]+
Or, if you want to match exactly the first 2 occurrences of the /:
^(/[^/]+/)[0-9]+
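The three patterns can be compared on the sample string with a short sketch. The final three-group pattern is not from the answer; it is an illustrative assumption showing how the before and after parts could be captured explicitly if they are needed as separate values:

```python
import re

s = "/this_text_appears_before/73914774/this_text_appears_after"

patterns = [
    r"^(.*?[^0-9])[0-9]+",  # anything up to the first run of digits
    r"^(.*?/)[0-9]+",       # up to the first "/" directly followed by digits
    r"^(/[^/]+/)[0-9]+",    # exactly one leading path segment, then digits
]
results = [re.sub(p, r"\1[replaced_text]", s) for p in patterns]

# Illustrative variant (not from the answer): capture all three parts.
m = re.match(r"^(.*?[^0-9])([0-9]+)(.*)$", s)
before, number, after = m.groups()
rebuilt = before + "[replaced_text]" + after

print(results)
print(before, number, after)
```

On this input all three substitutions produce the same string, and rebuilt matches them; the patterns only behave differently on inputs with a different path layout.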

Regular Expression to Anonymize Names

I am using Notepad++ and the Find and Replace pattern with regular expressions to alter usernames such that only the first and last character of the screen name is shown, separated by exactly four asterisks (*). For example, "albobz" would become "a****z".
Usernames are listed directly after the cue "screen_name: " and I know I can find all the usernames using the regular expression:
screen_name:\s([^\s]+)
However, this expression won't store the first or last letter and I am not sure how to do it.
Here is a sample line:
February 3, 2018 screen_name: FR33Q location: Europe verified: false lang: en
Method 1
You have to work with the \G meta-character. In N++, using \G is kinda tricky.
Regex to find:
(?>(screen_name:\s+\S)|\G(?!^))\S(?=\S)
Breakdown:
(?> Start of an atomic (non-capturing) group
( Beginning of first capturing group
screen_name:\s+\S Match up to first letter of name
) End of first CG
| Or
\G(?!^) Continue from the end of the previous match
) End of the atomic group
\S Match a non-whitespace character
(?=\S) Only if another non-whitespace character follows, so the last character is never consumed
Replace with:
\1*
Method 2
The above solution substitutes each inner character with a *, so the length remains intact. If you want exactly four *s regardless of the name's length, search for:
(screen_name:\s+\S)(\S*)(\S)
and replace with: \1****\3
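Outside Notepad++, the same two results can be sketched in Python. Python's built-in re module does not support \G, so the length-preserving variant below swaps in a replacement callback instead of the \G technique from Method 1. The function name mask_inner is made up for this example:

```python
import re

line = ("February 3, 2018 screen_name: FR33Q location: Europe "
        "verified: false lang: en")

# Group 1: cue plus first character, group 2: inner characters,
# group 3: last character of the screen name.
name_pattern = re.compile(r"(screen_name:\s+\S)(\S*)(\S)")

# Method 2: always exactly four asterisks between first and last character.
fixed = name_pattern.sub(r"\1****\3", line)

# Same result as Method 1 (one asterisk per inner character), but done
# with a callback because \G is not available in Python's re module.
def mask_inner(m):
    return m.group(1) + "*" * len(m.group(2)) + m.group(3)

same_length = name_pattern.sub(mask_inner, line)

print(fixed)
print(same_length)
```

Note that this pattern needs at least two characters in the screen name, since the first and last character are matched separately.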

Insert a character when capturing group

I want to select a group out of a given string and insert a character in position 5 of that group.
Input String: xxx123456789yyy
Expression: ^x{3}(?<serialno>\d{5}\d{4})y{3}$
Output (serialno): 123456789
Now I want the serialno group to contain an 'A' between the 5 and the 6, so that I get '12345A6789' instead of '123456789'. The character is always an 'A' and I want to do this in one Regular Expression.
Is it possible to do this with match or do I have to call match and replace?
You can't alter a string with a match, so you'll need to use preg_replace:
$output = preg_replace('/^x{3}(\d{5})(\d{4})y{3}$/', '$1A$2', $input);
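For comparison, a rough Python equivalent of the same substitution is sketched below. The answer itself uses PHP; the variable names here are illustrative:

```python
import re

input_string = "xxx123456789yyy"

# Split the serial number into two groups so the 'A' can go between them.
pattern = r"^x{3}(\d{5})(\d{4})y{3}$"

# \g<1> and \g<2> are the explicit forms of \1 and \2 in Python
# replacement strings.
serial = re.sub(pattern, r"\g<1>A\g<2>", input_string)

print(serial)
```

Because the pattern is anchored and covers the whole input, the surrounding x and y characters are dropped and only the rebuilt serial number remains.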

remove port number from list in notepad++

I have a CSV file that has the computer name in one column and the same computer name with the port number in the second column. I want to check that the names in columns 1 and 2 are the same, so I am trying to remove the :##### from the list. How do I do this?
I can't post a picture as I am too new here, but it looks like this:
ComputerName,ComputerName:18062
ComputerName2,ComputerName2:198099
Find ^((.*?),\2).*?$ and replace with \1. Use the regular expression search mode, with ". matches newline" unchecked.
^ match start of line
Outer () define group 1.
(.*?), Match any character until , is found. Result is stored to group 2.
\2 Match same string again as in group 2.
.*?$ Match remaining characters until the end of line.
Entire line is replaced with group 1.
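The same find-and-replace can be tried outside Notepad++ with a short Python sketch. The third input line is a made-up mismatch, added only to show what happens when the two names differ:

```python
import re

csv_text = (
    "ComputerName,ComputerName:18062\n"
    "ComputerName2,ComputerName2:198099\n"
    "Alpha,Beta:80"  # hypothetical mismatching row, not from the question
)

# re.MULTILINE makes ^ and $ work per line, like Notepad++ does.
# "." does not match newlines by default, which corresponds to leaving
# ". matches newline" unchecked.
pattern = re.compile(r"^((.*?),\2).*?$", re.MULTILINE)
cleaned = pattern.sub(r"\1", csv_text)

print(cleaned)
```

Rows where the second column starts with the name from the first column lose their port suffix. A row whose names differ does not match the pattern and is left unchanged, so a remaining :port suffix points to a mismatch.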