PowerShell regex replace only first hit

I'm trying to use a regular expression to replace the first character after a single hit, while using PowerShell.
No matter how I try, I can't seem to make it work. Here's what I'm talking about:
Code:
$info = 'AB/F/*ZXCVBN/MTF/ ---'
$regex = [REGEX]'/*'
$regex.Replace($info,"/C",1)
$regex
Output:
/CAB/F/*ZXCVBN/MTF/ ---
I'm simply trying to replace the /F in the expression with /C, but it fails every time.
I'm using /* since I don't really know what character I will find after the first /, but that character is what I want to replace at the end of the day.
I'm pretty sure this will be pretty simple but, as you can see, I'm just not familiar enough with regular expressions.

Ok, rather than just a comment I guess I'll add an answer. You can use a negative lookbehind to make sure that there are no /'s before what you are matching, so it will only match the first one. Also, as Noah stated the * is not a wildcard, . is. This will match any / plus 1 character that does not have another / anywhere before it in the string:
"(?<!/.*)/."
So in context to your code, it would look like this:
$info = 'AB/F/*ZXCVBN/MTF/ ---'
$regex = [REGEX]"(?<!/.*)/."
$regex.Replace($info,"/C",1)
Those lines will output:
AB/C/*ZXCVBN/MTF/ ---
Edit: RegEx broken down at RegEx101: http://regex101.com/r/tI7oN1/1

$info = 'AB/F/*ZXCVBN/MTF/ ---'
$regex = [REGEX]'^([^/]*)/[a-zA-Z]'
$regex.Replace($info,'$1/C',1)   # single quotes, so PowerShell does not expand $1 as a variable
$regex
^([^/]*) - this looks for anything but slashes at the beginning, captured in a group
/[a-zA-Z] - then a slash followed by a letter
The replacement puts back whatever was matched by the first group, and adds /C
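For comparison, the same capture-and-put-back idea can be sketched in Python. This is only an illustration, not part of the PowerShell answer; the names pattern and result are made up here, and \g<1> is Python's spelling of the .NET $1 group reference.

```python
import re

info = 'AB/F/*ZXCVBN/MTF/ ---'

# ^([^/]*)  -> everything before the first slash, kept in group 1
# /[a-zA-Z] -> the first slash and the letter right after it
pattern = re.compile(r'^([^/]*)/[a-zA-Z]')

# Put group 1 back unchanged and write /C in place of the slash-plus-letter.
result = pattern.sub(r'\g<1>/C', info, count=1)

print(result)
```

Because the pattern is anchored with ^, it can match at most once, so count=1 is a safeguard rather than a requirement.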

$info = 'AB/F/*ZXCVBN/MTF/ ---'
$regex = [REGEX]'/.'
$regex.Replace($info,"/C",1)
Or simply:
$info -replace '^(.*?)/.','$1/C'

If you don't want to call Replace, you can accomplish the same thing with -split and -join (note that -split still treats '/.' as a regex pattern):
$info = 'AB/F/*ZXCVBN/MTF/ ---'
$info -split '/.',2 -join '/C'
AB/C/*ZXCVBN/MTF/ ---
The ,2 will stop the split after the first match (2 elements). Then re-join the elements with /C.
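A rough Python equivalent of the split-and-rejoin trick, offered as a sketch rather than a translation of the PowerShell operators. One difference to keep in mind: PowerShell's ,2 is the maximum number of elements, while Python's maxsplit counts splits, so the matching value here is 1.

```python
import re

info = 'AB/F/*ZXCVBN/MTF/ ---'

# Split once on "slash plus any character"; the matched text is dropped.
parts = re.split(r'/.', info, maxsplit=1)

# Re-join the two pieces with the replacement text.
result = '/C'.join(parts)

print(parts)
print(result)
```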

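You are misunderstanding * in regex. It is not a wildcard character.
The star in regex is a quantifier: it matches zero or more of the preceding item in the expression, so /* matches zero or more slashes (including the empty string at the start of the input).
The wildcard in regex is actually a period, which matches any single character.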
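The difference is easy to see by running both patterns against the string from the question. The sketch below uses Python's re module instead of .NET, on the assumption that the two engines agree on these basic constructs; the variable names are illustrative.

```python
import re

info = 'AB/F/*ZXCVBN/MTF/ ---'

# '/*' is "zero or more slashes", so the first match is the empty string
# at index 0, and the replacement is inserted in front of everything.
star_match = re.search(r'/*', info)
star_result = re.sub(r'/*', '/C', info, count=1)

# '/.' is "a slash followed by any one character", so the first match is '/F'.
dot_result = re.sub(r'/.', '/C', info, count=1)

print(repr(star_match.group(0)), star_match.start())
print(star_result)
print(dot_result)
```

The second line printed reproduces the unexpected /CAB/F/... output from the question.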

Related

perl script not extracting only path (regex)

I have this regex expression
($oldpath = $_) =~ m/^\/(.+\/)*/;
This is the input:
/cd-lib/mp3/rock/LittleFeat/Dixie_Chicken/110-lafayette_railroad.mp3
But the output is:
/cd-lib/mp3/rock/LittleFeat/Dixie_Chicken/110-lafayette_railroad.mp3
When it should be:
/cd-lib/mp3/rock/LittleFeat/Dixie_Chicken/
Thanks in advance. :)
What do you mean by "output"? $1 contains
cd-lib/mp3/rock/LittleFeat/Dixie_Chicken/
which is almost what you wanted (it just misses the leading /).
You assigned $_ to $oldpath, then matched it against a regex. A match does not change either $_ or $oldpath.
The canonical way is
my ($match) = m/^\/(.+\/)*/;
or rather (to prevent the leaning toothpick syndrome)
my ($match) = m{^/(.+/)*};
i.e. running the match in list context returns the matching capture groups, and the first one is assigned to $match.
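The same point, that matching leaves the input alone and the interesting text lives in the match result, can be illustrated in Python. This is a sketch for comparison only; whole and captured are names chosen here, with group(0) playing the role of Perl's $& and group(1) the role of $1.

```python
import re

path = '/cd-lib/mp3/rock/LittleFeat/Dixie_Chicken/110-lafayette_railroad.mp3'

m = re.match(r'^/(.+/)*', path)

whole = m.group(0)      # the full match, leading slash included
captured = m.group(1)   # capture group 1, without the leading slash

print(whole)
print(captured)
print(path)             # the input string is unchanged by matching
```

If the leading slash is wanted inside the capture, moving it into the parentheses, as in ^(/.+/), is one way to get it.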

Make a regular expression in perl to grep value work on a string with different endings

I have this code in perl where I want to extract the value of 'EUR_AF', in this case '0.39'.
Sometimes 'EUR_AF' ends with ';', sometimes it doesn't.
Alternatively, 'EUR_AF' may end with '=0' instead of '=0.39;' or '=0.39'.
How do I make the code handle that? Can't seem to find it online...I could of course wrap everything in an almost endless if-elsif-else statement, but that seems overkill.
Example text:
AVGPOST=0.9092;AN=2184;RSQ=0.5988;ERATE=0.0081;AC=144;VT=SNP;THETA=0.0045;AA=A;SNPSOURCE=LOWCOV;LDAF=0.0959;AF=0.07;ASN_AF=0.05;AMR_AF=0.10;AFR_AF=0.11;EUR_AF=0.039
Code: $INFO =~ m/\;EUR\_AF\=(.*?)(;)/
I did find that: $INFO =~ m/\;EUR\_AF\=(.*?0)/ handles the cases of EUR_AF=0, but how to handle alternative scenarios efficiently?
Extract one value:
my ($eur_af) = $s =~ /(?:^|;)EUR_AF=([^;]*)/;
my ($eur_af) = ";$s" =~ /;EUR_AF=([^;]*)/;
Extract all values:
my %rec = split(/[=;]/, $s);
my $eur_af = $rec{EUR_AF};
This regex should work for you: (?<=EUR_AF=)\d+(\.\d+)?
It means
(?<=EUR_AF=) - look for a string preceded by EUR_AF=
\d+(\.\d+)? - one or more digits, optionally followed by a decimal point and more digits
EDIT: I originally wanted the whole regex to return the correct result, not only the capture group. If you want the correct capture group edit it to (?<=EUR_AF=)(\d+(?:\.\d+)?)
I have found the answer. The code:
$INFO =~ m/(?:^|;)EUR_AF=([^;]*)/
seems to handle the cases where EUR_AF=0 and EUR_AF=0.39, ending with or without ;. The captured value in $1 will be 0 or 0.39.
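Both approaches from this thread, the anchored capture and the split into key/value pairs, can be sketched in Python. The function names eur_af and parse_info are hypothetical, and the sample strings are shortened versions of the example text.

```python
import re

def eur_af(info):
    """Return the EUR_AF value as a string, or None if the key is absent."""
    # (?:^|;) makes sure EUR_AF starts a field; [^;]* stops at the next
    # semicolon or at the end of the string, whichever comes first.
    m = re.search(r'(?:^|;)EUR_AF=([^;]*)', info)
    return m.group(1) if m else None

def parse_info(info):
    """Split a KEY=VALUE;KEY=VALUE string into a dict."""
    return dict(field.split('=', 1) for field in info.split(';') if '=' in field)

print(eur_af('AF=0.07;EUR_AF=0.39;ASN_AF=0.05'))
print(eur_af('AF=0.07;EUR_AF=0'))
print(parse_info('AF=0.07;EUR_AF=0.039')['EUR_AF'])
```

The dict version is convenient when several fields are needed from the same line; the regex is enough for a single value.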

Regular Expression - Perl

I am trying to get a substring from a string using a regular expression, but my regular expression is not working. Can anyone help me write a correct one?
Here is the pattern I am trying to write the regular expression for:
MSM8_BD_V4.3_1-1_idle-Kr_Run3.xlsx
MSM8_BD_V4.3_2-6_mp3-Kr_Run2.xlsx
MSM8_BD_V4.3_Camera_snap-7.xlsx
MSM8_BD_V4.3_Camera_snap-8.xlsx
MSM8_BD_V4.3_Radio_202.16-0.xlsx
I am trying to get the bold part of each string.
Below is the regular expression I tried:
my $line = "MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx";
my ($captured) = $line =~ /MSM8939_BD_V4\.\3\_[d]*(.+?)\w/gx;
print "$captured\n";
[d] matches nothing but the literal letter d. You want \d, without the brackets, to match a digit. However, it looks like you also want to include underscores. That would be [\d_].
Try this:
/^MSM8_BD_V4\.3_[\d_]*-?([^-]+)/
If I run this on your input (with e.g. perl -nE 'say $1 if /^MSM8_BD_V4\.3_[\d_]*-?([^-]+)/'), I get this output:
1_idle
6_mp3
Camera_snap
Camera_snap
Radio_202.16
my $line = "MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx";
for (qw(
MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx
MSM8939_BD_V4.3_2-6_mp3-Kratos_Run2.xlsx
MSM8939_BD_V4.3_Camera_snap-7.xlsx
MSM8939_BD_V4.3_Camera_snap-8.xlsx
MSM8939_BD_V4.3_Radio_202.16-0.xlsx
)) {
my ($captured) = ($_ =~ /.*[-_]([^\W_]+_[\w.]+)-/gx);
print "$captured\n";
}
Use a greedy pattern to go as far as possible, then grab the last two strings that look like what you want which are still followed by a hyphen.
As does the other answer which was just edited while I was typing, this produces:
1_idle
6_mp3
Camera_snap
Camera_snap
Radio_202.16
This one may be more general in that the beginning of the substring is not hard-coded, i.e., you could use it in other cases which did not necessarily start with MSM8_BD_V4.3.
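To compare the two patterns side by side, here is a Python sketch that runs both over the file names as listed in the question (with the MSM8 prefix). The names anchored and general are labels invented for this example.

```python
import re

names = [
    'MSM8_BD_V4.3_1-1_idle-Kr_Run3.xlsx',
    'MSM8_BD_V4.3_2-6_mp3-Kr_Run2.xlsx',
    'MSM8_BD_V4.3_Camera_snap-7.xlsx',
    'MSM8_BD_V4.3_Camera_snap-8.xlsx',
    'MSM8_BD_V4.3_Radio_202.16-0.xlsx',
]

# First answer: hard-coded prefix, optional digits/underscores and a hyphen,
# then everything up to the next hyphen.
anchored = re.compile(r'^MSM8_BD_V4\.3_[\d_]*-?([^-]+)')

# Second answer: greedy .* pushes the capture as far right as it can go
# while still being followed by a hyphen.
general = re.compile(r'.*[-_]([^\W_]+_[\w.]+)-')

anchored_results = [anchored.match(n).group(1) for n in names]
general_results = [general.match(n).group(1) for n in names]

print(anchored_results)
print(general_results)
```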

Cycle through regex and replace found instance

I currently have the following:
# Prepend "$" to variable names.
# ['"](?:[^'"]*?(?:\\")*)*["'] Matches strings within double or single quotes.
# (*SKIP)(*F) Causes the preceding pattern to fail. Tries to match the pattern on the right side of the | operator using the remaining strings.
my $temp = $entire_line;
while ($temp =~ /['"](?:[^'"]*?(?:\\")*)*["'](*SKIP)(*F)|([A-Za-z0-9_]+)/g){
my $variable_name = $1;
$entire_line =~ s/$variable_name/\$$variable_name/;
}
Given $entire_line = ((factor0 + factor1) * factor2) + factor0
I would like my output to be:
(($factor0 + $factor1) * $factor2) + $factor0
However, I'm getting:
(($$factor0 + $factor1) * $factor2) + factor0
I know this is happening because it is finding the first instance of factor0 twice. Is there a good way to prevent this from happening and replace the instance that is being found?
Also do I need to use the $temp variable?
Thanks for your help.
(\w+)
Use this. Replace with $$1.
See demo.
http://regex101.com/r/qC9cH4/17
The long regex is not finding the first factor0 twice. It's the simple regex in the substitution that does. In order to get that to work, you need to make sure it doesn't find the ones that start with a $.
$entire_line =~ s/([^\$])$variable_name/$1\$$variable_name/;
You can just use $entire_line with that solution and get rid of $temp, but it's very confusing in general. If this is production code, I suggest you add comments to the code and also to the regex by using the /x flag. Your future self will thank you later.
Check your regex here: http://regex101.com/r/vX0aJ9/1
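Another way to avoid the double hit is to do the whole job in one substitution pass, so each occurrence is rewritten exactly where it was found. The Python sketch below shows that idea under a few assumptions that are not in the question: a variable name starts with a letter or underscore, quoted strings may contain backslash escapes, and add_sigils is a hypothetical helper name.

```python
import re

DOUBLE_QUOTED = r'"(?:\\.|[^"\\])*"'
SINGLE_QUOTED = r"'(?:\\.|[^'\\])*'"
IDENTIFIER = r'\b[A-Za-z_]\w*'

# Group 1: a quoted string (left untouched). Group 2: a bare identifier.
TOKEN = re.compile(f'({DOUBLE_QUOTED}|{SINGLE_QUOTED})|({IDENTIFIER})')

def add_sigils(line):
    """Prefix every identifier outside quotes with '$', in a single pass."""
    def repl(m):
        if m.group(1) is not None:
            return m.group(1)        # quoted text: return as-is
        return '$' + m.group(2)      # identifier: add the sigil
    return TOKEN.sub(repl, line)

print(add_sigils('((factor0 + factor1) * factor2) + factor0'))
```

In Perl the same shape is available through s///g with the existing (*SKIP)(*F) alternation, which would also remove the need for $temp.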

Regular expression using powershell

Here is the scenario: given the lines below, I want to extract only the middle part between the two dots.
"scvmm.new.resources" --> after a regular expression match this should return only "new"
"sc.new1.rerces" --> after a regular expression match this should return only "new1"
My basic requirement is to extract whatever sits between the two dots; anything can come before and after them.
(.*).<required code>.(.*)
Could anyone please help me out?
You can do that without using regex. Split the string on '.' and grab the middle element:
PS> "scvmm.new.resources".Split('.')[1]
new
Or this
'scvmm.new.resources' -replace '.*\.(.*)\..*', '$1'
Like this:
([regex]::Match("scvmm.new1.resources", '(?<=\.)([^\.]*)(?=\.)' )).value
You don't actually need regular expressions for such a trivial substring extraction. Like Shay's Split('.') one can use IndexOf() for similar effect like so,
$s = "scvmm.new.resources"
$l = $s.IndexOf(".")+1
$r = $s.IndexOf(".", $l)
$s.Substring($l, $r-$l) # Prints new
$s = "sc.new1.rerces"
$l = $s.IndexOf(".")+1
$r = $s.IndexOf(".", $l)
$s.Substring($l, $r-$l) # Prints new1
This looks for the first occurrence of a dot. Then it looks for the first occurrence of a dot after the first hit. Then it extracts the characters between the two locations. This is useful in, say, scenarios in which the separation characters are not the same (though the Split() way would work in many cases too).
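The index-based approach carries over to other languages almost unchanged. As a sketch in Python, with str.find standing in for IndexOf() and a hypothetical helper name, it might look like this; returning None for inputs with fewer than two dots is an assumption added here, since the original snippet does not handle that case.

```python
def between_first_two_dots(s):
    """Return the text between the first and second dot, or None."""
    left = s.find('.') + 1        # index just past the first dot (0 if none)
    right = s.find('.', left)     # index of the next dot after that
    if left == 0 or right == -1:
        return None
    return s[left:right]

print(between_first_two_dots('scvmm.new.resources'))
print(between_first_two_dots('sc.new1.rerces'))
```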