"Max Mustermann" <max.mustermann#domain.com>
max.mustermann#domain.com
Max <max.mustermann#domain.com>
I need a regular expression that matches everything outside the angle brackets (including the brackets themselves).
The match should then be removed.
After the replacement it should look like this:
"Max Mustermann" <max.mustermann#domain.com> => max.mustermann#domain.com
The easiest solution would be to search for
[^<]*<([^>]*)>.*
and replace that with \1 or $1, depending on your regex engine.
This removes everything until the first < and everything from the next > until the end of the string.
Let's just hope that there will be no brackets inside the quoted names.
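For anyone who wants to try this outside an editor, here is a minimal Python sketch of the same search-and-replace. The function name strip_display_name is made up for illustration, and the sample strings are copied from the question as posted. Note that a line without any brackets does not match the pattern at all, so it is returned unchanged.

```python
import re

# Pattern from the answer above: skip to the first '<', capture up to the next '>',
# then swallow the rest of the line.
ANGLE_ADDRESS = re.compile(r'[^<]*<([^>]*)>.*')

def strip_display_name(line: str) -> str:
    """Keep only the text inside <...>; lines without brackets stay as they are."""
    return ANGLE_ADDRESS.sub(r'\1', line)

samples = [
    '"Max Mustermann" <max.mustermann#domain.com>',
    'max.mustermann#domain.com',
    'Max <max.mustermann#domain.com>',
]
for s in samples:
    print(strip_display_name(s))
```

As the answer warns, a '<' or '>' inside the quoted display name would break this.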
This should work, but beware that it is very simplified:
(?:[^<]*<)?([^>]+).*
The email address will be in $1.
For example, in Perl use:
$email =~ s/(?:[^<]*<)?([^>]+).*/$1/;
I have two types of strings:
1: ANN=abcdefgh;blabla
2 wrong version: ANN=abcdefgh\tyxz\tyxz
2 actual version: ANN=abcdefgh
Now I want to extract the abcdefgh with a regex. The part to extract always starts right after "ANN=", but it ends at either a semicolon (;) or the FIRST occurrence of a tab.
How does the regex for this look? I tried:
(my @splitUpAnn) = $tabValues[7] =~ /ANN=(.*)[;\t]/;
But I only ever get a result for version 1 with the semicolon; it does not work for version 2.
EDIT: To be clear. I did not get back ANYTHING for the version two. The problem is NOT that the last tab is used!
EDIT2: Oops, the input data was different from what I expected. I either have a semicolon at the end or NOTHING (see "2 actual version"). Sorry for that! So what would the regex be then?
Use .*? instead of .*.
.* is greedy, so it matches up to the last tab instead of stopping at the first one.
Just use the non-greedy quantifier *? that matches the least it can:
for my $string ('ANN=abcdefgh;blabla', "ANN=abcdefgh\tyxz\tyxz") {
    (my @splitUpAnn) = $string =~ /ANN=(.*?)[;\t]/;
    print "@splitUpAnn\n";
}
If you want to get the string up to the first semicolon if present, or everything otherwise, just use
$string =~ /ANN=([^;]*)/
i.e. capture everything that's not a semicolon.
/ANN=(.*?)[;\t]/
Make your regex non greedy.
.* is greedy and will match up to the last ; or \t available.
my ($ann) = $tabValues[7] =~ /ANN=([^;\t]*)/;
The leading ^ negates the character class, so [^;\t] matches any character except ; and tab.
There are multiple suggestions to make your .* non-greedy, but using non-greediness as anything but an optimization is very fragile and error-prone.
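To see how the three patterns from this thread differ, here is a small comparison sketch. It is written in Python rather than Perl purely for illustration (these particular patterns use the same syntax in both), and the helper name extract is made up. The third input is the "2 actual version" from the question, which has no terminator at all.

```python
import re

inputs = ["ANN=abcdefgh;blabla", "ANN=abcdefgh\tyxz\tyxz", "ANN=abcdefgh"]

def extract(pattern, text):
    """Return capture group 1, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

for text in inputs:
    greedy = extract(r'ANN=(.*)[;\t]', text)     # runs to the LAST ; or tab
    lazy = extract(r'ANN=(.*?)[;\t]', text)      # stops at the FIRST ; or tab
    negated = extract(r'ANN=([^;\t]*)', text)    # needs no terminator at all
    print(repr(text), repr(greedy), repr(lazy), repr(negated))
```

Only the negated character class returns abcdefgh for all three inputs. The other two patterns require a ; or tab after the value, so they find nothing in the bare "ANN=abcdefgh" case.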
I've tested and I got a match
if ( "ANN=abcdefgh;blabla" =~ /(ANN=(.*)[;\t])/ ) {
print $1."\n" ;}
if ( "ANN=abcdefgh\tyxz\tyxz" =~ /(ANN=(.*)[;\t])/ ) {
print $1."\n" ;}
result is:
ANN=abcdefgh;
ANN=abcdefgh yxz
So:
your regex really is greedy, as described in the previous answers
Perhaps the problem lies in the way you put the values in the array, but the regexp is correct
I want to find a regular expression in Perl which matches a pattern such as this:
my $sumthing = "people say
for -->";
Here, after "say", there is a single newline character. So I need a regular expression that can match a pattern containing a newline. Please help me find one, as I'm new to Perl and regular expressions.
The possible methods I tried were these:
if (($sumthing !~ (/\n+$/)) && ($sumthing !~ (/^\n+/m)))
I hoped this would match this kind of pattern, but I am not getting the desired output.
It's not clear what you want. Do you want to match that string exactly? If so, you could use
$sumthing =~ /^people say\nfor -->\z/
or
$sumthing eq "people say\nfor -->"
Or maybe what you need to know is that . matches any character including newline when /s is used?
/people .* -->/s
The following will check for anything then new line then anything. Not sure if I totally understood your question.
if($sumthing =~ m/.*\n.*/)
Have a look at the /s modifier, which causes . to match anything, including a newline.
my $str = "people say for\nsomething...";
$str =~ m{say(.*)}s and print "'$1'\n";
This would print:
' for
something...'
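For comparison, the same idea can be sketched in Python, where the flag re.DOTALL plays the role of Perl's /s. The strings are the ones used in the question and the answers. Without the flag, . stops at the newline and the first pattern finds nothing.

```python
import re

text = "people say\nfor -->"

# Without DOTALL, '.' does not cross the newline, so there is no match.
plain = re.search(r'people .* -->', text)

# With DOTALL (the counterpart of Perl's /s), '.' also matches '\n'.
dotall = re.search(r'people .* -->', text, re.DOTALL)

print(plain is None)      # True
print(dotall is not None) # True

# Same capture as the Perl example above.
m = re.search(r'say(.*)', "people say for\nsomething...", re.DOTALL)
print(repr(m.group(1)))
```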
I have a string:
set a "ODUCTP-1-1-1-2P1"
regexp {.*?\-(.*)} $a match sub
I expect the value of sub to be 1-1-1-2P1
But I'm getting an empty string. Can anyone tell me how to use the regex properly?
The problem is that the non-greediness of the .*? is leaking over to the .* later on, which is a feature of the RE engine being used (automata-theoretic instead of stack-based).
The simplest fix is to write the regular expression differently.
Because Tcl has unanchored regular expressions (by default) and starts matches as soon as it can, a greedy match from the first - to the end of the string is perfect (with sub being assigned everything after the -). That's a very simple RE: -(.*). To use that, you do this:
regexp -- {-(.*)} $a match sub
Note the --; it's needed here because the regular expression starts with a - symbol and would otherwise be mistaken for a weird (and unsupported) option. Apart from that one niggle, it's all entirely straightforward.
$str = "ODUCTP-1-1-1-2P1";
$str =~ s/^.*?-//;
print $str;
or:
$str =~ /^.*?-(.*)$/;
print $1;
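The "greedy match from the first -" idea is not specific to Tcl or Perl. Below is a rough Python sketch of it, using the string from the question. The second half, using str.partition, is my own addition and simply shows that no regex is needed for a split at the first separator.

```python
import re

a = "ODUCTP-1-1-1-2P1"

# Unanchored search: the match starts at the first '-', and the greedy
# group takes everything after it.
m = re.search(r'-(.*)', a)
sub = m.group(1) if m else ""
print(sub)   # 1-1-1-2P1

# The same split without a regex: partition cuts at the first '-'.
head, sep, tail = a.partition('-')
print(tail)  # 1-1-1-2P1
```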
I've got strings like:
('Michael Herold','Michael Herold'),
but I need to remove the last parts so I end up with:
('Michael Herold'),
I'm still new to Regular Expressions so they confuse me. I'm using Notepad++.
find: \('([^']*)','\1'\)
Replace: ('\1')
So the actual function you use will depend on the language. Notepad++ is a text editor, not a language.
The regular expression that you will want will be ",'Michael Herold'" and you'll replace any matches with "", the empty string.
So in PHP for example, you'll have
$source = "('Michael Herold','Michael Herold')";
$pattern = "/(,'Michael Herold')+/";
$newString = preg_replace($pattern, "", $source);
Do the equivalent in whatever language you use.
I'm not sure what flavor of regular expressions Notepad++ uses, but try replacing this expression:
\('([^']*)','\1'\)
with this one:
('$1')
The \1 matches whatever was found in the first set of single quotes (Michael Herold in your example), and $1 is replaced with that same string. (Try \1 if $1 doesn't work in Notepad++.)
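If you want to check the backreference pattern outside Notepad++, here is a minimal Python sketch. The function name collapse_duplicate is hypothetical, and the second sample row with two different names is invented to show that such rows are left alone.

```python
import re

# \1 inside the pattern requires the second quoted value to equal the first.
DUPLICATE_PAIR = re.compile(r"\('([^']*)','\1'\)")

def collapse_duplicate(line: str) -> str:
    """Rewrite ('X','X') as ('X'); pairs with different values are untouched."""
    return DUPLICATE_PAIR.sub(r"('\1')", line)

print(collapse_duplicate("('Michael Herold','Michael Herold'),"))
print(collapse_duplicate("('Michael Herold','Someone Else'),"))
```

Like the original expression, this assumes the names themselves contain no single quotes.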
I have a query where I want to replace
avg(j2)
with
avg(case when j2 <> 0 then j2 else 0 end)
The above is a specific example but the pattern is the same with all the replacements. It's always a word followed by a number that needs to be replaced with the case statement that checks if the number is not 0.
I tried the following for find:
avg(\(\w\d\))
and the find works. Now, I want to do a replace so I try:
avg(case when \1 <> 0 then \1 else 0 end)
but it puts literal \1 and not the captured text from the match. I tried \\1 & $1 as well and it takes all of them literally. Can anyone tell me what the right syntax is for using the captured text for replacement? Is this supported?
Thanks,
Ashish
I am not sure whether the PL/SQL Developer IDE supports group capture. Recent versions do seem to support regex-based find and replace, but I can't find a source confirming that group capture works.
Why don't you paste the code into something like Notepad++ and try the same regex there? It should work. You could then paste the result back into your IDE and continue from there.
You can refer to the captured text in the replacement with $ followed by the group number,
for example $0 or $1. See the example below:
find: TABLE (.*\..*) IS
replace: COLUMN $1 IS
http://regexr.com/3gm6c
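If the IDE turns out not to support captured groups, one fallback is to run the replacement in a small script and paste the result back. Here is a minimal Python sketch. It assumes the column names are letters followed by digits (such as j2), which is my reading of "a word followed by a number"; the function name wrap_avg and the sample query are made up for illustration.

```python
import re

# Assumption: the argument of avg() is letters followed by digits, e.g. j2 or k10.
AVG_CALL = re.compile(r'avg\(([A-Za-z]+\d+)\)')

def wrap_avg(sql: str) -> str:
    """Replace avg(col) with avg(case when col <> 0 then col else 0 end)."""
    return AVG_CALL.sub(r'avg(case when \1 <> 0 then \1 else 0 end)', sql)

query = "select avg(j2), avg(k10) from t"  # hypothetical query
print(wrap_avg(query))
```

Note that the parentheses belonging to avg( ) are escaped here and only the column name is captured, so \1 expands to j2 rather than (j2).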