Regular Expression (Regex) for pulling name between quotations - Powershell - regex

I have strings that look as follows:
\\.\ROOT\abc\kjasdkj\MyClass:InstanceName.name="sxs-test3"
I want a regex that pulls out only the name in quotation marks, so the result is sxs-test3
Also, I am using Windows PowerShell to do this. Can this be done in PowerShell?
Thanks

Here's another option that doesn't require regex:
$path.TrimEnd('"').Split('="',[StringSplitOptions]::RemoveEmptyEntries)[-1]

For that string this regex seems simplest to me:
$string = '\\.\ROOT\abc\kjasdkj\MyClass:InstanceName.name="sxs-test3"'
if ($string -match '"(.+?)"') {
    $Matches[1]
}
This matches the text between the first pair of double quotes, consuming as few characters as possible (lazy quantifier).

if ($subject -cmatch '(?<=")[^"]*(?=")') {
    $result = $matches[0]
}
This looks for any number of characters except quotes ([^"]*), but only if they are preceded by a quote ((?<=")) and followed by a quote ((?=")).
It does not even try to handle escaped quotes.
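
For comparison, here is a minimal sketch of the two patterns above (lazy capture group versus lookarounds) run side by side. It uses Python rather than PowerShell purely for illustration, and the variable names are my own; the patterns themselves are unchanged from the answers.

```python
import re

# Sample string from the question (raw string, so backslashes stay literal).
path = r'\\.\ROOT\abc\kjasdkj\MyClass:InstanceName.name="sxs-test3"'

# Lazy capture group: the quotes are part of the match, group 1 is the name.
lazy = re.search(r'"(.+?)"', path)

# Lookarounds: the quotes are asserted but not consumed, so group 0 is the name.
look = re.search(r'(?<=")[^"]*(?=")', path)

name_lazy = lazy.group(1) if lazy else None
name_look = look.group(0) if look else None
print(name_lazy, name_look)
```

Both variants return the same name for this input; neither handles escaped quotes inside the value.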

Related

PowerShell Regex with "." PowerShell

I want to replace part of some strings in a loop with PowerShell.
Example string:
testvm029.vmxxx
I want to remove everything from .vmxxx onward. The strings differ in length, but every one ends with .vm... so the result should be: testvm029
I tried the following script:
Foreach($String in $Strings) {
    $StringTest = $String -replace "(.vm)(.+)","$null"
}
Of course this cuts my string at the first vm and not at .vm...
Result:
test
How can I achieve my goal?
EDIT: the '.' in .vm is not necessarily the last '.' in my string, for example:
test123.vmabc.string
I want to cut it off at .vm
You need to use
$String -replace '\.vm.*'
The pattern will find the first .vm substring (with \.vm) and then match the rest of the string (with .*). The match will be replaced with an empty string (it is used by default, but you may write it explicitly, $String -replace '\.vm.*', '').
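
A minimal sketch of the same substitution, written in Python only to show the pattern at work on both sample strings from the question. The list and variable names are illustrative assumptions.

```python
import re

names = ["testvm029.vmxxx", "test123.vmabc.string"]

# \. matches a literal dot, so only ".vm" (not "tvm") starts the match;
# .* then consumes the rest of the string, and the match is replaced by "".
trimmed = [re.sub(r'\.vm.*', '', name) for name in names]
print(trimmed)
```

The escaped dot is the important part: an unescaped `.` matches any character, which is why the original attempt cut the string too early.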

perl regex remove newlines in string

I have a Perl script which runs over a database dump in a plain text file, trying to remove all instances of newlines and possibly other odd characters when I see strings between quotes:
INSERT INTO ... VALUES ( "... these are the lines I'm interested in." )
I slurp in the file:
@file = <FILE>;
and:
foreach my $line (@file) {
    $line =~ s/"[^"]*(\R)+[^"]*"//g;
    # I want to get rid of newlines in strings
    # And other odd characters I might come across
}
One character class I used instead of (\R) was:
([\r\n\t\v\f]+)
and I would try to:
$line =~ s/"[^"]+?([\r\n\t\v\f]+)[^"]*"//g;
I'm sure I'm missing something. I try to start matching with a literal double quote, scan past anything not a double quote (non-greedy, at least one match), reach the characters I want to get rid of, and keep scanning not double quote (any number of other characters not a double quote) until I reach the ending double quote.
So I wanted to replace $1 capture above with nothing.
I've tried on-line regex builders, and
/"[^"]*?([\r\n\t\f\v]+)[^"]*"/
worked with an on-line test, using a short paragraph with newlines and tabs in it, although it was in PHP pcre mode. I thought it would have worked with Perl.
Perhaps I'm not escaping some characters properly in the regex for Perl? Or the pattern is just not going to work the way I want it to, because it's wrong.
Thank you, any help appreciated.
The regex at regex101.com:
"[^"]*?([\r\n\f\t\v]+)[^"]*?"
matches for strings like this:
"This is
my\t test
string.
So there!"
I'm thoroughly puzzled now. :)
The real problem is that you will only find one group of \R's when there could be many groups between quotes. The best thing to do is to match everything between the quotes in general, use a callback (the /e modifier) as the replacement, and strip the \R's inside the callback.
Something like:
sub repl {
    my ($content) = @_;
    $content =~ s/\R+//g;
    return $content;
}
$input =~ s/"([^"]*)"/ '"' . repl($1) . '"' /ge;  # re-add the quotes consumed by the match
edit: If you're looking for only one line-break cluster, you have to exclude line breaks leading up to it. For example: [^"\r\n]+
edit2: To slurp the file into $input, do:
$/ = undef;
my $input = <$fh>;
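
The callback idea can be sketched as follows. This is a Python illustration, not the Perl from the answer: Python's `re` module has no `\R`, so `[\r\n]+` stands in for it, and the table name `t` and the function name `strip_breaks` are made up for the example. This version puts the surrounding quotes back in the replacement.

```python
import re

def strip_breaks(match):
    # Called once per quoted string; group 1 is the text between the quotes.
    # Every run of CR/LF characters inside it is removed.
    return '"' + re.sub(r'[\r\n]+', '', match.group(1)) + '"'

# Hypothetical one-statement dump with line breaks inside the quoted value.
dump = 'INSERT INTO t VALUES ( "This is\nmy\t test\r\nstring.\nSo there!" );\n'

# General match between quotes; the callback does the inner cleanup.
cleaned = re.sub(r'"([^"]*)"', strip_breaks, dump)
print(repr(cleaned))
```

Because the outer pattern matches the whole quoted string once, any number of line-break clusters inside it are handled, and line breaks outside the quotes are left alone.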

Regular Expression - Perl

I am trying to get a substring from a string using a regular expression, but I get an error because my regular expression is not working. Can anyone help me write a correct one?
Here are the strings I am trying to write the regular expression for:
MSM8_BD_V4.3_1-1_idle-Kr_Run3.xlsx
MSM8_BD_V4.3_2-6_mp3-Kr_Run2.xlsx
MSM8_BD_V4.3_Camera_snap-7.xlsx
MSM8_BD_V4.3_Camera_snap-8.xlsx
MSM8_BD_V4.3_Radio_202.16-0.xlsx
I am trying to get the bold part of the substring.
Below is the regular expression I tried:
my $line = "MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx";
my ($captured) = $line =~ /MSM8939_BD_V4\.\3\_[d]*(.+?)\w/gx;
print "$captured\n";
[d] matches nothing but the literal letter d. You want \d, without the brackets, to match a digit. However, it looks like you also want to include underscores. That would be [\d_].
Try this:
/^MSM8_BD_V4\.3_[\d_]*-?([^-]+)/
If I run this on your input (with e.g. perl -nE 'say $1 if /^MSM8_BD_V4\.3_[\d_]*-?([^-]+)/'), I get this output:
1_idle
6_mp3
Camera_snap
Camera_snap
Radio_202.16
my $line = "MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx";
for (qw(
    MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx
    MSM8939_BD_V4.3_2-6_mp3-Kratos_Run2.xlsx
    MSM8939_BD_V4.3_Camera_snap-7.xlsx
    MSM8939_BD_V4.3_Camera_snap-8.xlsx
    MSM8939_BD_V4.3_Radio_202.16-0.xlsx
)) {
    my ($captured) = ($_ =~ /.*[-_]([^\W_]+_[\w.]+)-/gx);
    print "$captured\n";
}
Use a greedy pattern to go as far as possible, then grab the last two strings that look like what you want which are still followed by a hyphen.
As does the other answer which was just edited while I was typing, this produces:
1_idle
6_mp3
Camera_snap
Camera_snap
Radio_202.16
This one may be more general in that the beginning of the substring is not hard-coded, i.e., you could use it in other cases which did not necessarily start with MSM8_BD_V4.3.
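
To make the greedy-then-backtrack behaviour concrete, here is a small sketch of the same pattern applied to the five file names. It is written in Python for illustration only; the helper name `middle_part` is an assumption of mine.

```python
import re

files = [
    "MSM8939_BD_V4.3_1-1_idle-Kratos_Run3.xlsx",
    "MSM8939_BD_V4.3_2-6_mp3-Kratos_Run2.xlsx",
    "MSM8939_BD_V4.3_Camera_snap-7.xlsx",
    "MSM8939_BD_V4.3_Camera_snap-8.xlsx",
    "MSM8939_BD_V4.3_Radio_202.16-0.xlsx",
]

# .* runs to the end of the string, then backtracks until what follows is
# "<alnum>_<word chars or dots>" directly before a hyphen.
pattern = re.compile(r'.*[-_]([^\W_]+_[\w.]+)-')

def middle_part(name):
    m = pattern.match(name)
    return m.group(1) if m else None

captured = [middle_part(f) for f in files]
print(captured)
```

Nothing in the pattern refers to the MSM8939_BD_V4.3 prefix, which is what makes it reusable for other prefixes.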

Need Regex to parse String

Using a regular expression, I want to parse a String like "DOCID = 1234567 THIS IS TEST" and remove the remaining String after the numbers.
How can I do this using the VIM editor?
In Perl:
$str =~ s/(= \d+).*$/$1/;
In php:
$str = preg_replace('/(= \d+).*$/', "$1", $str);
That will do the job:
:%s/\d\+\zs.*
Explanation:
% use the whole buffer, you can omit this if you want to change current line only
s the substitute command
\d\+ match one or more digits
\zs set the start of match here
.* everything else
you can omit the replacement string because you want to delete the match
In VIM, in command mode (press ESC), write :
:s/\([^0-9]\+[0-9]\+\).*/\1/
This will do the job.
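To replace every match on a line rather than only the first, add the g flag (with this pattern .* already consumes the rest of the line, so it makes no practical difference here):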
:s/\([^0-9]\+[0-9]\+\).*/\1/g
In Java (the replacement refers to the group as $1, not \1): string = string.replaceFirst("(= \\d+).*$", "$1");
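
The capture-and-keep approach used in the Perl, PHP and Java answers can be sketched the same way in Python. This is an illustration under the assumption that the number is always preceded by "= ", as in the sample string.

```python
import re

line = "DOCID = 1234567 THIS IS TEST"

# Group 1 keeps "= " plus the digits; .*$ matches the tail that is dropped.
trimmed = re.sub(r'(= \d+).*$', r'\1', line)
print(trimmed)
```

Note that the backreference syntax in the replacement differs between tools: `\1` in Python and Vim, `$1` in Perl, PHP and Java.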

How can I capture multiple matches from the same Perl regex?

I'm trying to parse a single string and get multiple chunks of data out from the same string with the same regex conditions. I'm parsing a single HTML doc that is static (For an undisclosed reason, I can't use an HTML parser to do the job.) I have an expression that looks like:
$string =~ /\<img\ssrc\="(.*)"/;
and I want to get the value of $1. However, in the one string, there are many img tags like this, so I need something like an array returned (@1?). Is this possible?
As in Jim's answer, use the /g modifier (in list context or in a loop).
But beware of greediness: you don't want the .* to match more than necessary (and don't escape < or =, they are not special).
while($string =~ /<img\s+src="(.*?)"/g ) {
...
}
@list = ($string =~ m/\<img\ssrc\="(.*)"/g);
The g modifier matches all occurrences in the string. List context returns all of the matches. See the m// operator in perlop.
You just need the global modifier /g at the end of the match. Then loop through until there are no matches remaining:
my @matches;
while ($string =~ /\<img\ssrc\="(.*)"/g) {
    push(@matches, $1);
}
Use the /g modifier and list context on the left, as in
@result = $string =~ /\<img\ssrc\="(.*)"/g;
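
The greediness warning is easy to see with two tags on one line. The sketch below uses Python's `re.findall`, which plays roughly the role of /g in list context; the HTML snippet is made up for the example.

```python
import re

# Hypothetical input with two img tags on the same line.
html = '<p><img src="a.png" alt="x"><img src="b.jpg"></p>'

# Greedy: .* runs to the last double quote on the line, so the first
# match swallows everything up to the end of the second src value.
greedy = re.findall(r'<img\s+src="(.*)"', html)

# Lazy: .*? stops at the first closing quote, giving one match per tag.
lazy = re.findall(r'<img\s+src="(.*?)"', html)

print(greedy)
print(lazy)
```

With the greedy pattern only one oversized match comes back; the lazy version returns one src value per tag.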