Regular expression for an email address

"Max Mustermann" <max.mustermann@domain.com>
max.mustermann@domain.com
Max <max.mustermann@domain.com>
I need a regular expression that matches everything outside the angle brackets (including the brackets themselves).
The match should then be removed.
After the replacement it should look like this:
"Max Mustermann" <max.mustermann@domain.com> => max.mustermann@domain.com

The easiest solution would be to search for
[^<]*<([^>]*)>.*
and replace that with \1 or $1, depending on your regex engine.
This removes everything until the first < and everything from the next > until the end of the string.
Let's just hope that there will be no brackets inside the quoted names.
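As a rough illustration of the search-and-replace described above, here is a minimal Python sketch (Python's re module writes the backreference as \1 in the replacement). The helper name strip_display_name and the sample addresses are made up for this example and are not part of the original answer.

```python
import re

# Pattern from the answer: everything up to the first '<', the captured
# address, then the closing '>' and whatever follows it.
ANGLE_ADDR = re.compile(r'[^<]*<([^>]*)>.*')

def strip_display_name(raw: str) -> str:
    """Return the text between angle brackets, or the input unchanged
    when there are no brackets (hypothetical helper for illustration)."""
    return ANGLE_ADDR.sub(r'\1', raw)

samples = [
    '"Max Mustermann" <max.mustermann@domain.com>',
    'max.mustermann@domain.com',
    'Max <max.mustermann@domain.com>',
]
for s in samples:
    print(strip_display_name(s))
```

An input without brackets simply does not match, so it comes back as is. A quoted name that itself contains < would still trip this up, which is the caveat the answer mentions.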

This should work, but beware that it is very simplified:
(?:[^<]*<)?([^>]+).*
The email address will be in $1.
For example, in Perl use:
$email =~ s/(?:[^<]*<)?([^>]+).*/$1/;
See RegexPlanet online demo.

Related

Can not catch substring by regex which ends with tab

I have two types of strings:
1: ANN=abcdefgh;blabla
2 wrong version: ANN=abcdefgh\tyxz\tyxz
2 actual version: ANN=abcdefgh
Now I want to extract the abcdefgh with a regex. The part to extract always starts right after "ANN=", but it ends at either a semicolon (;) or the FIRST occurrence of a tab.
What does the regex for this look like? I tried:
(my @splitUpAnn) = $tabValues[7] =~ /ANN=(.*)[;\t]/;
But I only ever get a result for version 1 with the semicolon; it does not work for version 2.
EDIT: To be clear, I did not get back ANYTHING for version 2. The problem is NOT that the last tab is used!
EDIT2: Oops, the input data was different from what I expected. There is either a semicolon at the end or NOTHING (see "2 actual version"). Sorry for that! So what would the regex be then?
Use .*? instead of .*.
.* is greedy, so it matches up to the last occurrence of a tab rather than the first.
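To make the difference concrete, here is a small Python sketch (the thread itself uses Perl; the quantifiers behave the same way in Python's re module). The variable names are only for illustration.

```python
import re

line = "ANN=abcdefgh\tyxz\tyxz"

# Greedy: .* runs to the end of the string, then backtracks only as far
# as the LAST tab.
greedy = re.search(r'ANN=(.*)[;\t]', line).group(1)

# Lazy: .*? grows one character at a time and stops at the FIRST ; or tab.
lazy = re.search(r'ANN=(.*?)[;\t]', line).group(1)

print(repr(greedy))  # 'abcdefgh\tyxz'
print(repr(lazy))    # 'abcdefgh'
```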
Just use the non-greedy quantifier *? that matches the least it can:
for my $string ('ANN=abcdefgh;blabla', "ANN=abcdefgh\tyxz\tyxz") {
    (my @splitUpAnn) = $string =~ /ANN=(.*?)[;\t]/;
    print "@splitUpAnn\n";
}
If you want to get the string up to the first semicolon if present, or everything otherwise, just use
$string =~ /ANN=([^;]*)/
i.e. capture everything that's not a semicolon.
/ANN=(.*?)[;\t]/
Make your regex non-greedy.
.* is greedy and will match up to the last ; or \t available.
my ($ann) = $tabValues[7] =~ /ANN=([^;\t]*)/;
The leading ^ negates the character class, so [^;\t] matches any character except ; and tab.
There are multiple suggestions to make your .* non-greedy, but relying on non-greediness for anything other than an optimization is very fragile and error-prone.
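A minimal Python sketch of the negated-character-class approach, assuming the three input shapes from the question (semicolon-terminated, tab-terminated, and nothing after the value). The helper name extract_ann is hypothetical.

```python
import re

def extract_ann(field: str):
    """Hypothetical helper: value after 'ANN=' up to the first ';' or tab,
    or up to the end of the string if neither is present."""
    m = re.search(r'ANN=([^;\t]*)', field)
    return m.group(1) if m else None

for field in ("ANN=abcdefgh;blabla", "ANN=abcdefgh\tyxz\tyxz", "ANN=abcdefgh"):
    print(extract_ann(field))

# For comparison, the lazy pattern insists on a terminator and finds
# nothing when the value is the last thing in the string:
print(re.search(r'ANN=(.*?)[;\t]', "ANN=abcdefgh"))
```

Because the class excludes the terminators themselves, no terminator has to be present, which is why this form also covers the "2 actual version" input.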
I tested this and got a match:
if ( "ANN=abcdefgh;blabla" =~ /(ANN=(.*)[;\t])/ ) {
    print $1 . "\n";
}
if ( "ANN=abcdefgh\tyxz\tyxz" =~ /(ANN=(.*)[;\t])/ ) {
    print $1 . "\n";
}
The result is:
ANN=abcdefgh;
ANN=abcdefgh yxz
So:
your regex is indeed greedy, as described in the previous answers
perhaps the problem lies in the way you put the values into the array, because the regexp itself does match

Regular expression which matches a specific pattern

I want to find a regular expression in Perl which matches a pattern such as this:
my $sumthing = "people say
for -->";
Here, there is a single newline character after "say". I need a regular expression that can match a pattern like this, one that includes a newline inside it. Please help me find one, as I'm new to Perl and regular expressions.
This is what I tried:
if (($sumthing !~ (/\n+$/)) && ($sumthing !~ (/^\n+/m)))
Kindly help me find an expression that matches this kind of pattern; I'm not getting the desired output.
It's not clear what you want. Do you want to match that string exactly? If so, you could use
$sumthing =~ /^people say\nfor -->\z/
or
$sumthing eq "people say\nfor -->"
Or maybe what you need to know is that . matches any character including newline when /s is used?
/people .* -->/s
The following will check for anything, then a newline, then anything. I'm not sure I fully understood your question.
if($sumthing =~ m/.*\n.*/)
Have a look at the /s modifier, which causes . to match anything, including a newline.
my $str = "people say for\nsomething...";
$str =~ m{say(.*)}s and print "'$1'\n";
This would print:
' for
something...'
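For readers more familiar with Python, the same idea can be sketched with the re.DOTALL flag, which plays the role of Perl's /s. The string is the one from the question; the variable names are only for illustration.

```python
import re

text = "people say\nfor -->"

# By default '.' stops at a newline, so the pattern cannot span both lines.
without_flag = re.search(r'people .* -->', text)
print(without_flag)            # None

# With re.DOTALL, '.' matches '\n' as well, so the whole string matches.
with_flag = re.search(r'people .* -->', text, re.DOTALL)
print(repr(with_flag.group(0)))
```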

Using regex to fetch a value

I have a string:
set a "ODUCTP-1-1-1-2P1"
regexp {.*?\-(.*)} $a match sub
I expect the value of sub to be 1-1-1-2P1
But I'm getting an empty string. Can anyone tell me how to use the regex properly?
The problem is that the non-greediness of the .*? is leaking over to the .* later on, which is a feature of the RE engine being used (automata-theoretic instead of stack-based).
The simplest fix is to write the regular expression differently.
Because Tcl has unanchored regular expressions (by default) and starts matches as soon as it can, a greedy match from the first - to the end of the string is perfect (with sub being assigned everything after the -). That's a very simple RE: -(.*). To use that, you do this:
regexp -- {-(.*)} $a match sub
Note the --; it's needed here because the regular expression starts with a - symbol and would otherwise be mistaken for a weird (and unsupported) option. Apart from that one niggle, it's all entirely straightforward.
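The "unanchored greedy match from the first -" idea can be sketched in Python as well. Note the assumption: Python's re module is a backtracking engine, so it does not reproduce the Tcl behaviour where non-greediness leaks into the later .*; this only illustrates the simplified pattern.

```python
import re

a = "ODUCTP-1-1-1-2P1"

# re.search is unanchored: it finds the leftmost '-', and the greedy (.*)
# then takes everything after it.
m = re.search(r'-(.*)', a)
sub = m.group(1) if m else ""
print(sub)

# Without a regex at all, str.partition splits at the first '-' too.
tail = a.partition('-')[2]
print(tail)
```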
$str = "ODUCTP-1-1-1-2P1";
$str =~ s/^.*?-//;
print $str;
or:
$str =~ /^.*?-(.*)$/;
print $1;

regular expression for find and replace

I've got strings like:
('Michael Herold','Michael Herold'),
but I need to remove the last parts so I end up with:
('Michael Herold'),
I'm still new to Regular Expressions so they confuse me. I'm using Notepad++.
find: \('([^']*)','\1'\)
Replace: ('\1')
The actual function you use will depend on the language; Notepad++ is a text editor, not a language.
The regular expression you want is ",'Michael Herold'", and you replace any matches with "", the empty string.
In PHP, for example, you would have:
$source = "('Michael Herold','Michael Herold')";
$pattern = "/(,'Michael Herold')+/";
$newString = preg_replace($pattern, "", $source);
Do the equivalent in whatever language you use.
I'm not sure what flavor of regular expressions Notepad++ uses, but try replacing this expression:
\('([^']*)','\1'\)
with this one:
('$1')
The \1 matches whatever was found in the first set of single quotes (Michael Herold in your example), and $1 is replaced with that same string. (Try \1 if $1 doesn't work in Notepad++.)
See it in action here.
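The backreference trick is not specific to Notepad++. As a hedged sketch, the same find/replace pair looks like this in Python, where \1 is used both inside the pattern and in the replacement; the function name dedupe_pair is made up for this example.

```python
import re

def dedupe_pair(text: str) -> str:
    # \1 inside the pattern must match the same text as the first group;
    # \1 in the replacement inserts that captured text.
    return re.sub(r"\('([^']*)','\1'\)", r"('\1')", text)

print(dedupe_pair("('Michael Herold','Michael Herold'),"))
print(dedupe_pair("('Michael','Herold'),"))  # values differ, left unchanged
```

Because the pattern only matches when both quoted values are identical, pairs with two different values are left alone.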

How do I access matched objects for replacement when using regular expression mode in PL/SQL Developer Find & Replace?

I have a query where I want to replace
avg(j2)
with
avg(case when j2 <> 0 then j2 else 0 end)
The above is a specific example but the pattern is the same with all the replacements. It's always a word followed by a number that needs to be replaced with the case statement that checks if the number is not 0.
I tried the following for find:
avg(\(\w\d\))
and the find works. Now, I want to do a replace so I try:
avg(case when \1 <> 0 then \1 else 0 end)
but it puts literal \1 and not the captured text from the match. I tried \\1 & $1 as well and it takes all of them literally. Can anyone tell me what the right syntax is for using the captured text for replacement? Is this supported?
Thanks,
Ashish
I am not sure whether the PL/SQL Developer IDE supports group capture. Recent versions do seem to support regex-based find and replace, though. I can't find a source to confirm whether group capture works.
Why don't you try pasting the code into something like Notepad++ and running the same regex there? It should work. You could then paste the result back into your IDE and continue from there.
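If an external tool is acceptable, the rewrite can also be scripted. This is a minimal Python sketch of that workaround, not the IDE's own replace syntax; the function name wrap_avg, the column k3 and the table t are hypothetical.

```python
import re

def wrap_avg(sql: str) -> str:
    # Capture the column (one word character followed by one digit, as in
    # the question) and reuse it twice in the replacement via \1.
    return re.sub(r'avg\((\w\d)\)',
                  r'avg(case when \1 <> 0 then \1 else 0 end)',
                  sql)

print(wrap_avg("select avg(j2), avg(k3) from t"))
```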
You can reference captured groups in the replacement using $ followed by the group number,
such as $0 or $1. See the example below:
find: TABLE (.*\..*) IS
replace: COLUMN $1 IS
http://regexr.com/3gm6c