Regular expression for duplicate string - regex

Hello, I am trying to write a regular expression that finds a substring and replaces part of it. My input has this format:
Some_text_beginning_AASHISH_XX_YY_COPY_COPY_COPY_COPY
Every string contains the word AASHISH, and it may end with any number of _COPY suffixes. I want to delete all of the _COPY suffixes.
I wrote this regular expression:
(.*)_AASHISH_(.*)_COPY+
It finds all the valid strings. But when I replace the match with
$1_AASHISH_$2
it removes only the last _COPY. All the _COPY suffixes before the last one are captured in group 2.
Note that I am not using a programming language. I am using a third-party tool that only lets me search for a string and replace it, and it accepts regular expressions.
To clarify why this question differs from earlier ones: the tool I am using does not seem to support every regular expression feature. I don't know how the tool was built; I only have its UI.
Thanks in advance

Here's a regex that will capture the whole portion you want to maintain, resulting in a replacement that's just $1.
(.*_AASHISH_.*?)(?:_COPY)+
A few notes:
.*? - The ? on the end makes the repetition operator * non-greedy. It will match the minimum characters given its context.
(?:_COPY) - The ?: prefix makes this a non-capturing grouping.
+ - The repetition operator will make the entire last group (_COPY) repeat 1 or more times, not just the Y.
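As a rough check of the answer above, here is a minimal Python sketch comparing the original pattern with the suggested one. Python is only an assumption for illustration (the asker's tool is unknown), and Python writes the group reference as \1 where the tool uses $1. The names strip_copies and sample are made up for this example.

```python
import re

sample = "Some_text_beginning_AASHISH_XX_YY_COPY_COPY_COPY_COPY"

# Original attempt: the greedy second group swallows all but the last _COPY.
original_result = re.sub(r"(.*)_AASHISH_(.*)_COPY+", r"\1_AASHISH_\2", sample)

# Suggested pattern: lazy .*? stops before the first _COPY of the trailing run,
# and (?:_COPY)+ consumes the whole run without capturing it.
suggested = re.compile(r"(.*_AASHISH_.*?)(?:_COPY)+")

def strip_copies(text: str) -> str:
    """Replace the whole match with group 1 only (hypothetical helper)."""
    return suggested.sub(r"\1", text)

suggested_result = strip_copies(sample)

print(original_result)   # Some_text_beginning_AASHISH_XX_YY_COPY_COPY_COPY
print(suggested_result)  # Some_text_beginning_AASHISH_XX_YY
```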

Related

Regex to match one of any terms, some terms with spaces

I'm trying to write a RegEx that matches one of several terms, as part of a spam filter. The problem is, some of these terms contain spaces, and I'm having trouble writing a valid expression.
What I originally had (before multiple-word terms) was this:
(?i)(alzheimers|baldness|obese)
Now, I want to add, for example "blood pressure", but the following expression is chucking a barny:
(?i)(alzheimers|baldness|blood pressure|obese)
You can have whitespace characters in an either-or group; your expression works. Check it out for yourself:
https://regex101.com/r/56tz6B/1
Your expression should also match "blood pressure" without any problems.
Could you try to use \s+ instead of the space character and see if it works? Please note that this would also match any whitespace (tabs, new lines etc.).
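To illustrate the difference between the literal space and \s+, here is a small Python sketch. The spam filter's actual regex engine is not stated in the question, so Python is an assumption, and the test strings are invented for the example.

```python
import re

# Alternation with a literal space, exactly as in the question.
literal_space = re.compile(r"(?i)(alzheimers|baldness|blood pressure|obese)")

# Same alternation, but any run of whitespace is accepted between the words.
any_whitespace = re.compile(r"(?i)(alzheimers|baldness|blood\s+pressure|obese)")

message = "Lower your Blood  Pressure today"  # note the two spaces

print(literal_space.search("Cheap BLOOD PRESSURE pills").group(1))  # BLOOD PRESSURE
print(literal_space.search(message))                                # None
print(any_whitespace.search(message).group(1))                      # Blood  Pressure
```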

A regular expression that replaces a group with hard coded text

First of all, I'm not sure if this is something you can even do in regular expressions. If you can, I have no idea how to search for the way to do it.
Let's say I have text:
Click <a href="...">this link</a> for more information.
And a regular expression:
<a[^>]*>([^<]*)</a>
The application of the regular expression would yield this for group 1:
this link
Let's say I wanted to write the regular expression to instead return hard coded text for group 1
<a[^>]*>(${{replacement text}}[^<]*)</a>
(this is made up syntax by the way)
So that the application of the regular expression to the text would yield this for group 1:
replacement text
Is this possible?
Here's another example just to solidify my objective:
Examples of text:
serverNode1/appPortal
serverNode1/appPortal2
serverNode1/appPortal3
My regular expression
appPortal((?:?{{"1"}}\b)|(?:\d))
(using the same made up syntax)
The expected output for the first character group should be
1
2
3
(The point of the expression is to match the word break and replace it with "1", or otherwise use the digit character class to match a digit. The sub-groups are made non-capturing with the ?: so the outside group is still group 1.)
What is the point of this you may ask? I am using Splunk to do field extractions, and I'd like for the field to be extracted as 1, 2, or 3, like in my above example, and I can only rely on the regular expression groups to give me the fields (as in, I don't have anywhere to put code to say if group 1 == "" then change to "1").
Basically, as regular expressions are defined, this is not possible. By definition, a regular expression matches patterns in the text. To be clear, a regex engine returns matches that are always part of the original string, nothing more. Some regex flavors let you name a capturing group, but naming a group does not transform what it matched.
The behaviour you describe is easy to achieve by processing the regex match in any programming language, but it can also be achieved by combining regex substitution and parsing.
For example, s/appPortal(?!\d)/appPortal1/ will replace "appPortal" without the digit after it with "appPortal1" and then you can apply another regex to build the match you want.
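The substitute-then-match idea can be sketched in Python as follows. This is only an illustration of the two steps; it says nothing about what Splunk itself supports, and the function name portal_number is hypothetical.

```python
import re

lines = [
    "serverNode1/appPortal",
    "serverNode1/appPortal2",
    "serverNode1/appPortal3",
]

def portal_number(line):
    # Step 1: append "1" wherever appPortal is not already followed by a digit.
    normalized = re.sub(r"appPortal(?!\d)", "appPortal1", line)
    # Step 2: a plain capture group now always finds a digit.
    match = re.search(r"appPortal(\d)", normalized)
    return match.group(1) if match else None

numbers = [portal_number(line) for line in lines]
print(numbers)  # ['1', '2', '3']
```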

Regex for AND operator

I need a regex that matches both
from origin up to id= and from ; up to cases.
I applied an "OR" condition, but it satisfies only one condition. Any suggestions?
origin=eBook;id=N27F-00000-00;type=cases
Regex:
(^(.*id=)|(;type=cases.*))
You are mistaking some fundamentals of regular expressions, which I'll explain in a minute. But for now, try this:
id=(.*?);type=cases
A regular expression does not have to match the entire string; it can match just part of it, so you don't need .* on either side of the pattern (unless you want to capture that text).
Since we aren't matching the .* in the beginning, you won't need to start from the beginning of the string (^).
There is no such thing as an AND operator, because every part of a regular expression must match, in order, for the whole expression to match.
Update
The overall match will still include the surrounding id= and ;type=cases text. Since I used parentheses around the important part (N27F-00000-00), that part will be placed in a "match group". If you don't want to deal with match groups, you can use "lookarounds":
(?<=id=).*?(?=;type=cases)
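The two variants can be compared with a short Python sketch. Python is an assumption here, since the question does not name a language; the sample record is taken from the question with the identifier N27F-00000-00.

```python
import re

record = "origin=eBook;id=N27F-00000-00;type=cases"

# Capture group: the full match includes the delimiters, group 1 does not.
with_group = re.search(r"id=(.*?);type=cases", record)
print(with_group.group(0))  # id=N27F-00000-00;type=cases
print(with_group.group(1))  # N27F-00000-00

# Lookarounds: the delimiters are checked but not consumed,
# so the full match is already just the identifier.
with_lookarounds = re.search(r"(?<=id=).*?(?=;type=cases)", record)
print(with_lookarounds.group(0))  # N27F-00000-00
```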

regular expression for find replace modification

I wanted to use regular expressions in Eclipse to adapt code to a software update.
instead of
{$CFG->prefix}example1.xy
the code needs to be:
{example1}.xy
to work.
another example would be:
{$CFG->prefix}example2.foo
becomes
{example2}.foo
The constant parts are {$CFG->prefix} and the period (.).
I tried the following (I added whitespace to make reading easier):
Find: \{\$CFG-\>prefix\} ([a-z]|[0-9])* \. ([a-z]|[0-9])*
This finds the requested string, but I struggle with replacing it.
I can use \1 to refer to the result of the regex in the replacement (right?), but I am not sure how I can modify or manipulate this result.
Thanks for any help.
You can try
\{\$CFG-\>prefix\}([a-z0-9]*)\.
and replace with
{\1}.
I am not sure why you have the whitespace in your regex; I removed it.
The quantifier * should be inside your group; otherwise \1 will hold only the last matched character and not the complete word.
Since you don't want to replace the last part, you don't need to match and replace it.
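The effect of the quantifier's position can be seen in a small Python sketch. Python is used here only as a convenient stand-in; Eclipse's own find/replace dialog may differ in details such as replacement syntax.

```python
import re

text = "{$CFG->prefix}example1.xy"

# Quantifier outside the group: the group is re-captured on every repetition,
# so only the last character survives.
outside = re.search(r"\{\$CFG->prefix\}([a-z0-9])*\.", text)

# Quantifier inside the group: the group captures the whole word.
inside = re.search(r"\{\$CFG->prefix\}([a-z0-9]*)\.", text)

print(outside.group(1))  # 1
print(inside.group(1))   # example1
```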
Try the following search and replace :
Find: \{\$CFG->prefix\}([a-z0-9]*)\.([a-z0-9]*)
Replace with : {\1}.\2
Changes made to the OP's Find reg-ex
In order to get the above find-replace to work, I had to make the following changes to the OP's find expression :
Removed whitespaces.
Moved the greedy * quantifier inside the groups: i.e. ([...]*) instead of ([...])*
Simplified the character set: i.e. [a-z0-9] instead of [a-z]|[0-9]
Introduced another Group which captures the part after the period. This however is not strictly needed but may be useful in some scenarios.
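For readers without Eclipse at hand, the same find and replace can be approximated in Python. This is a sketch under the assumption that Python's re module behaves like the Eclipse dialog for this pattern; the helper name migrate is invented, and the inputs are the two examples from the question.

```python
import re

# Find expression from the answer, with two capture groups.
find = re.compile(r"\{\$CFG->prefix\}([a-z0-9]*)\.([a-z0-9]*)")

def migrate(source: str) -> str:
    # Replacement mirrors "{\1}.\2": wrap group 1 in braces, keep group 2.
    return find.sub(r"{\1}.\2", source)

print(migrate("{$CFG->prefix}example1.xy"))   # {example1}.xy
print(migrate("{$CFG->prefix}example2.foo"))  # {example2}.foo
```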

Regular expression question

I have some text like this:
dagGeneralCodes$_ctl1$_ctl0
Some text
dagGeneralCodes$_ctl2$_ctl0
Some text
dagGeneralCodes$_ctl3$_ctl0
Some text
dagGeneralCodes$_ctl4$_ctl0
Some text
I want to create a regular expression that extracts the last occurrence of dagGeneralCodes$_ctl[number]$_ctl0 from the text above.
the result should be: dagGeneralCodes$_ctl4$_ctl0
Thanks in advance
Wael
This should do it:
.*(dagGeneralCodes\$_ctl\d\$_ctl0)
The .* at the front is greedy so initially it will grab the entire input string. It will then backtrack until it finds the last occurrence of the text you want.
Alternatively you can just find all the matches and keep the last one, which is what I'd suggest.
Also, specific advice will probably depend on the language you're doing this in. In Java, for example, you will need DOTALL mode so that . matches newlines, because ordinarily it doesn't. Other languages have their own names for this mode: Ruby calls it multiline mode, while Perl and .NET call it single-line mode. JavaScript needs a slightly different workaround, and so on.
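Both suggestions can be tried in Python, which is an assumption since the question names no language. Python's flag for letting . match newlines is re.DOTALL. The sketch uses \d+ instead of \d so that multi-digit numbers would also match, which is a small deviation from the pattern above.

```python
import re

text = "\n".join([
    "dagGeneralCodes$_ctl1$_ctl0", "Some text",
    "dagGeneralCodes$_ctl2$_ctl0", "Some text",
    "dagGeneralCodes$_ctl3$_ctl0", "Some text",
    "dagGeneralCodes$_ctl4$_ctl0", "Some text",
])

target = r"dagGeneralCodes\$_ctl\d+\$_ctl0"

# Without DOTALL, .* cannot cross a newline, so the first line wins.
without_flag = re.search(r".*(" + target + r")", text).group(1)

# With DOTALL, the greedy .* runs to the end and backtracks to the last hit.
last_by_greedy = re.search(r".*(" + target + r")", text, re.DOTALL).group(1)

# Alternative: collect every match and keep the last one.
last_by_findall = re.findall(target, text)[-1]

print(without_flag)     # dagGeneralCodes$_ctl1$_ctl0
print(last_by_greedy)   # dagGeneralCodes$_ctl4$_ctl0
print(last_by_findall)  # dagGeneralCodes$_ctl4$_ctl0
```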
You can use:
[\d\D]*(dagGeneralCodes\$_ctl\d+\$_ctl0)
I'm using [\d\D] instead of . to make it match new-line as well. The * is used in a greedy way so that it will consume all but the last occurrence of dagGeneralCodes$_ctl[number]$_ctl0.
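A quick way to confirm that [\d\D] needs no special mode is the following Python sketch (Python being an assumption, and the shortened sample text being made up from the question's data).

```python
import re

text = (
    "dagGeneralCodes$_ctl1$_ctl0\nSome text\n"
    "dagGeneralCodes$_ctl4$_ctl0\nSome text\n"
)

# [\d\D] matches any character, including newlines, with no flags set.
match = re.search(r"[\d\D]*(dagGeneralCodes\$_ctl\d+\$_ctl0)", text)
last = match.group(1)
print(last)  # dagGeneralCodes$_ctl4$_ctl0
```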
I really like using this Regular Expression Cheatsheet; it's free, a single page, and printed, fits on my cube wall.